Santiago Tanco: Bayesian tools for the LHC: a proof of concept in di-Higgs searches
A/1-106 - Physics seminar room (F5)
Extracting reliable physical information from collider data requires a combination of mathematical, computational and statistical tools to model observed distributions, often with the help of powerful simulations. But when simulations cannot be fully trusted, data-driven approaches become indispensable. In this talk, I will present Bayesian tools that provide a flexible, data-driven method for the unsupervised training of probabilistic models of collider observables, highlighting their advantages for parameter inference and uncertainty estimation. I will illustrate these points in the context of pp → hh → bbbb searches, where recent analyses rely on a variant of the widely used data-driven "ABCD method". Within our framework, the ABCD method can be generalized and improved, especially when its assumptions are not exactly satisfied, leading to more robust background subtraction. Our approach exploits correlations in multi-dimensional data by modelling kinematic variables and b-tagging scores at the event level. These results also show how simulations can guide the unsupervised estimation of the true data distributions. Based on arXiv:2402.08001 and ongoing research.
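For readers unfamiliar with the ABCD method mentioned above, the following is a minimal sketch of its classical counting form, not the Bayesian generalization presented in the talk. It assumes two discriminating variables that are statistically independent for the background, so that the background yield in the signal region A can be estimated from three control regions as N_A ≈ N_B · N_C / N_D. The region labels, the function name and the event counts are illustrative assumptions; labelling conventions vary between analyses.

```python
def abcd_background_estimate(n_b: float, n_c: float, n_d: float) -> float:
    """Classical ABCD estimate of the background yield in signal region A.

    Assumes the two variables that define the four regions are independent
    for the background, so that N_A / N_B = N_C / N_D.
    """
    if n_d <= 0:
        raise ValueError("control region D must contain events")
    return n_b * n_c / n_d


# Hypothetical control-region counts, chosen only for illustration.
estimate = abcd_background_estimate(n_b=200.0, n_c=300.0, n_d=600.0)
print(estimate)  # 100.0
```

If the two variables are correlated for the background, this ratio is biased, which is the situation the abstract refers to when the method's assumptions are not exactly satisfied.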