LHC Olympics 2020 (LHC Olympics 2020 Anomaly Detection Challenge)

Introduced by Kasieczka et al. in The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics

These are the official datasets for the LHC Olympics 2020 Anomaly Detection Challenge. Each "black box" contains 1M events meant to be representative of actual LHC data. These events may include signal(s) and the challenge consists of finding these signals using the method of your choice. We have uploaded a total of THREE black boxes to be used for the challenge.

In addition, we include a background sample of 1M events meant to aid in the challenge. The background sample consists of QCD dijet events simulated using Pythia8 and Delphes 3.4.1. Be warned that both the physics and the detector modeling for this simulation may not exactly reflect the "data" in the black boxes. For both background and black box data, events are selected using a single fat-jet (R=1) trigger with pT threshold of 1.2 TeV.

The events are stored as pandas DataFrames saved in compressed HDF5 (.h5) format. For each event, all reconstructed particles are assumed to be massless and are recorded in detector coordinates (pT, eta, phi). More detailed information, such as particle charge, is not included. Events are zero-padded to a constant size of 700 particles, so each event occupies 700 x 3 = 2100 columns and each array has shape (Nevents=1M, 2100).
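As a rough illustration of the layout described above, the following sketch reshapes flat 2100-column rows into per-particle (pT, eta, phi) triplets and converts them to massless four-momenta. It assumes the columns are ordered as consecutive (pT, eta, phi) triplets and that padded slots have pT equal to zero; neither detail is stated explicitly here, so check both against the official example notebook. The function names are hypothetical, and a small toy array stands in for the real data, which could presumably be loaded with pandas.read_hdf and converted to a NumPy array.

```python
import numpy as np

N_PARTICLES = 700  # padded particle slots per event (from the description)
N_FEATURES = 3     # pT, eta, phi


def unpack_events(flat):
    """Reshape (n_events, 2100) rows into (n_events, 700, 3) plus a padding mask."""
    flat = np.asarray(flat, dtype=float)
    # Assumption: columns are consecutive (pT, eta, phi) triplets.
    particles = flat.reshape(flat.shape[0], N_PARTICLES, N_FEATURES)
    # Assumption: zero-padded slots are identified by pT == 0.
    mask = particles[:, :, 0] > 0
    return particles, mask


def to_four_momenta(particles):
    """Convert massless (pT, eta, phi) to (E, px, py, pz)."""
    pt = particles[..., 0]
    eta = particles[..., 1]
    phi = particles[..., 2]
    px = pt * np.cos(phi)
    py = pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    energy = pt * np.cosh(eta)  # E = |p| for a massless particle
    return np.stack([energy, px, py, pz], axis=-1)


# Toy stand-in for two events; real rows would come from the h5 files.
toy = np.zeros((2, N_PARTICLES * N_FEATURES))
toy[0, 0:3] = [100.0, 0.0, 0.0]
toy[0, 3:6] = [50.0, 1.0, np.pi]
toy[1, 0:3] = [200.0, -0.5, 1.0]

particles, mask = unpack_events(toy)
multiplicity = mask.sum(axis=1)   # number of real particles per event
p4 = to_four_momenta(particles)   # shape (2, 700, 4)
```

Padded slots map to all-zero four-vectors, so summing p4 over the particle axis gives the total event four-momentum without applying the mask.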

For more information, including a complete description of the challenge and an example Jupyter notebook illustrating how to read and process the events, see the official LHC Olympics 2020 webpage.

UPDATE: November 23, 2020

Now that the challenge is over, we have uploaded the solutions to Black Boxes 1 and 3. They are simple ASCII files (events_LHCO2020_BlackBox1.masterkey and events_LHCO2020_BlackBox3.masterkey) where each line is the truth label -- 0 for background and 1 (and 2 in the case of BB3) for signal -- of each event in the corresponding h5 files (same ordering). For more information about the solutions, please visit the LHCO2020 webpage.
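A minimal sketch of reading such a masterkey file is shown below, assuming one numeric label per line with no header. The helper name read_masterkey is hypothetical, and a small temporary file stands in for the real events_LHCO2020_BlackBox3.masterkey. Labels are parsed through float first, in case they are written as "0.0" rather than "0"; that is a defensive guess, not a documented property of the files.

```python
import os
import tempfile
from collections import Counter


def read_masterkey(path):
    """Return the truth labels as a list of ints, one per event, in file order."""
    labels = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                labels.append(int(float(line)))
    return labels


# Toy stand-in for a real masterkey file (0 = background, 1 and 2 = signal).
with tempfile.NamedTemporaryFile("w", suffix=".masterkey", delete=False) as tmp:
    tmp.write("0\n0\n1\n0\n2\n")
    demo_path = tmp.name

labels = read_masterkey(demo_path)
os.remove(demo_path)

counts = Counter(labels)  # events per truth label
# Row indices of signal events; these line up with rows of the matching h5 file.
signal_indices = [i for i, label in enumerate(labels) if label != 0]
```

Because the labels follow the same ordering as the events, signal_indices can be used directly to select rows from the corresponding event array.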

UPDATE: February 11, 2021

We have uploaded the Delphes detector cards and Pythia command files used to produce the Black Box datasets.
