# An operational definition of quark and gluon jets

Patrick T. Komiske, Eric M. Metodiev, Jesse Thaler

September 2018

### Abstract

While “quark” and “gluon” jets are often treated as separate, well-defined objects in both theoretical and experimental contexts, no precise, practical, and hadron-level definition of jet flavor presently exists. To remedy this issue, we develop and advocate for a data-driven, operational definition of quark and gluon jets that is readily applicable at colliders. Rather than specifying a per-jet flavor label, we aggregately define quark and gluon jets at the distribution level in terms of measured hadronic cross sections. Intuitively, quark and gluon jets emerge as the two maximally separable categories within two jet samples in data. Benefiting from recent work on data-driven classifiers and topic modeling for jets, we show that the practical tools needed to implement our definition already exist for experimental applications. As an informative example, we demonstrate the power of our operational definition using Z+jet and dijet samples, illustrating that pure quark and gluon distributions and fractions can be successfully extracted in a fully well-defined manner.

### Publication

*Journal of High Energy Physics*, **11** (2018) 059

Figure 2: Log-likelihood ratios of the two mixed samples ($Z$+jet and dijet events) as determined by six different machine learning classifiers: two types of Particle Flow Networks (PFNs), an Energy Flow Network (EFN), a linear classifier based on Energy Flow Polynomials (EFPs), a convolutional neural network (CNN) operating on jet images, and a dense neural network (DNN) operating on a set of N-subjettiness values. The reducibility factors, the maxima and minima of these curves, are shown on the right. From these, the quark and gluon fraction values can be extracted using a simple formula.
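
The caption does not spell out the formula, so the following is a minimal sketch under stated assumptions rather than the authors' code. It assumes each sample is a two-component mixture, $p_A = f_A\,p_q + (1 - f_A)\,p_g$ and $p_B = f_B\,p_q + (1 - f_B)\,p_g$ with $f_A > f_B$, and that quark and gluon jets are mutually irreducible, so the reducibility factors are $\kappa_{AB} = \min p_A/p_B = (1 - f_A)/(1 - f_B)$ and $\kappa_{BA} = \min p_B/p_A = f_B/f_A$. Solving these two relations gives the quark fractions. The function names, the symbols `kappa_ab` and `kappa_ba`, and the convention that the log-likelihood ratio is $\ln(p_A/p_B)$ are all illustrative choices, not taken from this page.

```python
import math


def reducibility_from_llr_extrema(llr_min, llr_max):
    """Convert extrema of ln(p_A / p_B) into reducibility factors.

    Assumes the log-likelihood ratio is defined as sample A over sample B
    (a convention chosen for this sketch).
    """
    kappa_ab = math.exp(llr_min)   # min of p_A / p_B
    kappa_ba = math.exp(-llr_max)  # min of p_B / p_A = 1 / max of p_A / p_B
    return kappa_ab, kappa_ba


def quark_fractions(kappa_ab, kappa_ba):
    """Solve kappa_ab = (1 - f_a) / (1 - f_b) and kappa_ba = f_b / f_a."""
    if not (0.0 <= kappa_ab < 1.0 and 0.0 <= kappa_ba < 1.0):
        raise ValueError("reducibility factors must lie in [0, 1)")
    f_a = (1.0 - kappa_ab) / (1.0 - kappa_ab * kappa_ba)
    f_b = kappa_ba * f_a
    return f_a, f_b


def reducibility_from_fractions(f_a, f_b):
    """Forward map, useful for checking the inversion on made-up inputs."""
    return (1.0 - f_a) / (1.0 - f_b), f_b / f_a


# Round trip on made-up fractions (not values from the paper).
k_ab, k_ba = reducibility_from_fractions(0.8, 0.4)
f_a, f_b = quark_fractions(k_ab, k_ba)
```

The gluon fractions follow as `1 - f_a` and `1 - f_b`. The round trip at the end uses invented inputs purely to check that the inversion is consistent with the forward relations.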