# ML4Collider

The data collected by the Large Hadron Collider (LHC) experiments are vast in both the number of collisions and the complexity of each collision. A central role of machine learning in LHC physics is to improve data reduction: compressing the relevant information contained in the low-level, high-dimensional data into a higher-level, lower-dimensional space. This project enables collaboration between particle physicists and computer scientists to aid scientific discovery by applying state-of-the-art machine learning methods to collider physics. We have several projects on the boundary of particle physics and machine learning:

**Symmetry-embedded networks:**

- Symmetry Preserving Attention Networks (SPA-Net): A novel approach to jet-parton assignment, based on tensor neural networks using a generalized attention mechanism. Building networks whose internal symmetries mirror the symmetries of the data can result in dramatic improvements in performance (arXiv:2010.09206).
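The idea of a network whose symmetries mirror those of the data can be illustrated with plain dot-product self-attention, which is equivariant under reordering of its inputs. The following is a minimal sketch under stated assumptions, not the SPA-Net architecture itself: the function names, the random weights, and the toy "jet" feature vectors are all hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head dot-product self-attention with no positional encoding."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    out = []
    for q_i in q:
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / scale for k_j in k]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        norm = sum(exps)
        out.append([sum(e / norm * v_j[c] for e, v_j in zip(exps, v))
                    for c in range(len(v[0]))])
    return out


# Hypothetical toy event: 6 jets with 4 features each, projected to 8 dimensions.
rng = random.Random(0)
n_jets, n_features, n_hidden = 6, 4, 8
jets = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_jets)]
w_q, w_k, w_v = ([[rng.gauss(0, 1) for _ in range(n_hidden)]
                  for _ in range(n_features)] for _ in range(3))

# Reorder the jets and check that the outputs are reordered in the same way.
perm = list(range(n_jets))
rng.shuffle(perm)
out = self_attention(jets, w_q, w_k, w_v)
out_perm = self_attention([jets[i] for i in perm], w_q, w_k, w_v)
max_diff = max(abs(a - b)
               for r in range(n_jets)
               for a, b in zip(out_perm[r], out[perm[r]]))
```

Here `max_diff` stays at the level of floating-point rounding, because the ordering of the jets carries no information for this layer. A network built from such layers does not have to learn that symmetry from the training data.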

**Machine Teaching, Physicists Learning:**

- A new approach to translating black-box networks into a small set of interpretable observables. As a demonstration, we apply our approach to the benchmark tasks of jet classification (arXiv:2010.11998) and electron identification (arXiv:2011.01984) in collider physics.

**Fast Machine Learning:**

- Sparse AutoRegressive Model (SARM): An alternative to Generative Adversarial Networks (GANs), which explicitly learns the sparseness of the data with a tractable likelihood, making it more stable and interpretable than other methods (arXiv:2009.14017).
- Optimal Transport based Unfolding and Simulation (OTUS): An alternative approach to fast simulation, which can be trained directly from data in an unsupervised manner, with the potential to replace full simulation rather than augment it. IML seminar slide.
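To make "tractable likelihood" concrete, the sketch below shows a toy autoregressive model over binary pixels (on or off), where the probability of a whole image is the product of per-pixel conditional probabilities. This is an illustration under stated assumptions, not the SARM model: the first-order dependence on the previous pixel and all parameter values are hypothetical.

```python
import itertools
import math

# Hypothetical toy parameters, chosen so that most pixels are empty (sparse).
P_FIRST_ON = 0.1                      # P(first pixel is non-zero)
P_ON_GIVEN_PREV = {0: 0.05, 1: 0.6}   # P(pixel is non-zero | previous pixel state)


def log_likelihood(pixels):
    """Exact log-probability of a binary pixel sequence via the chain rule."""
    total = 0.0
    prev = None
    for x in pixels:
        p_on = P_FIRST_ON if prev is None else P_ON_GIVEN_PREV[prev]
        total += math.log(p_on if x == 1 else 1.0 - p_on)
        prev = x
    return total


def total_probability(n_pixels):
    """Sum the model probability over every possible image of n_pixels pixels."""
    return sum(math.exp(log_likelihood(seq))
               for seq in itertools.product((0, 1), repeat=n_pixels))
```

Because every conditional is a normalized distribution, `total_probability(n)` comes out to 1 up to rounding for any `n`, and `log_likelihood` can be evaluated exactly for any image. A GAN, by contrast, provides samples but no explicit likelihood to evaluate.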

The new tools will be developed as open-source code, with the aim of transforming data analysis at colliders and maximizing their impact on particle physics.

### Collaborators Include:

### Related Funding

- ARO 76649-CS
- NSF NRT-1633631
- DOE DE-SC0009920
- DOE DE-SC0015971

### Presentations

- 23 Oct 2020 - "Foundations of a Fast, Data-Driven, Machine-Learned Simulator", Daniel Whiteson, 4th Inter-Experimental Machine Learning Workshop

### Publications

- Learning to Identify Electrons, J. Collado, J. Howard, T. Faucett, T. Tong, P. Baldi et al., arXiv 2011.01984 (03 Nov 2020) [1 citation].
- Mapping Machine-Learned Physics into a Human-Readable Space, T. Faucett, J. Thaler and D. Whiteson, arXiv 2010.11998 (22 Oct 2020) [8 citations].
- Permutationless Many-Jet Event Reconstruction with Symmetry Preserving Attention Networks, M. Fenton, A. Shmakov, T. Ho, S. Hsu, D. Whiteson et al., arXiv 2010.09206 (19 Oct 2020) [3 citations].
- Sparse autoregressive models for scalable generation of sparse images in particle physics, Y. Lu, J. Collado, D. Whiteson and P. Baldi, Phys.Rev.D 103 036012 (2021) (23 Sep 2020) [1 citation].