TREASURE
TREASURE (Tokenizing HEP Collider Data for AI Discovery) is a pilot initiative of the U.S. Department of Energy (DOE) High Energy Physics (HEP) program, designed to bridge the gap between experimental physics and frontier AI research. The project develops technical frameworks for converting particle physics data into forms suitable for training and evaluating large-scale AI models, including Foundation Models based on Transformer architectures.
The initiative focuses on four primary objectives:
- Multi-Level Tokenization — Converting particle physics events into discrete tokens suitable for Transformer-based machine learning systems
- AI-Readiness Protocols — Creating standards for data curation and metadata to support Foundation Model training
- Physics Benchmarking — Testing model performance on pattern recognition, Higgs physics, and new physics discovery tasks
- Collaborative Infrastructure — Developing the “American Science Cloud” (AmSC) Intelligent Data Activities to enable scalable AI research across national laboratories
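To make the tokenization objective concrete, here is a minimal sketch of one common way to turn a collider event into discrete tokens: bin each particle's kinematics and map the bin combination to an integer ID. This is an illustration only, not TREASURE's actual scheme. The feature choice (pT, eta, phi), the ranges, the bin counts, and the function names are all assumptions, and the sketch covers only a single particle-level pass, not the multi-level approach named above.

```python
import math

# Illustrative assumptions, not TREASURE's scheme: ranges and bin counts are made up.
PT_RANGE = (1.0, 1000.0)          # GeV, binned uniformly in log(pT)
ETA_RANGE = (-2.5, 2.5)
PHI_RANGE = (-math.pi, math.pi)
N_PT, N_ETA, N_PHI = 16, 10, 8

BOS, EOS = 0, 1                   # special tokens marking event start and end
OFFSET = 2                        # particle tokens start after the special tokens
VOCAB_SIZE = OFFSET + N_PT * N_ETA * N_PHI


def bin_index(value, low, high, n_bins):
    """Map value to a uniform bin over [low, high); out-of-range values are clamped."""
    frac = (value - low) / (high - low)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def tokenize_particle(pt, eta, phi):
    """Combine the three bin indices into a single integer token ID."""
    pt = max(pt, PT_RANGE[0])     # guard against log of non-positive pT
    pt_bin = bin_index(math.log(pt), math.log(PT_RANGE[0]), math.log(PT_RANGE[1]), N_PT)
    eta_bin = bin_index(eta, ETA_RANGE[0], ETA_RANGE[1], N_ETA)
    phi_bin = bin_index(phi, PHI_RANGE[0], PHI_RANGE[1], N_PHI)
    return OFFSET + (pt_bin * N_ETA + eta_bin) * N_PHI + phi_bin


def tokenize_event(particles):
    """Turn a list of (pt, eta, phi) tuples into a token sequence, ordered by descending pT."""
    ordered = sorted(particles, key=lambda p: p[0], reverse=True)
    return [BOS] + [tokenize_particle(*p) for p in ordered] + [EOS]


event = [(50.0, 0.3, 1.0), (120.0, -1.2, -2.0)]   # hypothetical two-particle event
tokens = tokenize_event(event)
```

The resulting integer sequence can be fed to a Transformer like text tokens. Coarser bins give a smaller vocabulary at the cost of kinematic resolution; that trade-off, and whether to tokenize at the hit, particle, or event level, is exactly the kind of design question a real scheme would have to settle.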
TREASURE involves researchers from Lawrence Berkeley National Laboratory, Brookhaven National Laboratory, Argonne National Laboratory, Fermilab, and SLAC National Accelerator Laboratory, with a focus on cross-experimental discovery mechanisms and open data initiatives.
Project Website
Related Funding
- DOE