ML(3)

arXiv:2107.02157

LHC physics dataset for unsupervised New Physics detection at 40 MHz

Ekaterina Govorkova, Ema Puljak, Thea Aarrestad, Maurizio Pierini, Kinga Anna Woźniak, Jennifer Ngadiuba

In particle detectors at the Large Hadron Collider, tens of terabytes of data are produced every second from proton-proton collisions occurring at a rate of 40 megahertz. This data rate is reduced to a sustainable level by a real-time event filter processing system which decides whether each collision event should be kept for further analysis or be discarded. We introduce a dataset of proton collision events which emulates a typical data stream collected by such a real-time processing system, pre-filtered by requiring the presence of at least one electron or muon. This dataset could be used to develop novel event selection strategies and assess their sensitivity to new phenomena. In particular, by publishing this dataset we intend to stimulate a community-based effort towards the design of novel algorithms for performing unsupervised New Physics detection, customized to fit the bandwidth, latency and computational resource constraints of the real-time event selection system of a typical particle detector.