### Massachusetts Institute of Technology

I am a PhD candidate in the Center for Theoretical Physics at MIT. I am interested in using the tools of quantum field theory and machine learning to study fundamental particle physics.

Much of my research focuses on physics relevant at the Large Hadron Collider, including jet physics, QCD, and new physics searches. My collaborators and I have developed several novel tools and algorithms for use at the LHC, some of which are highlighted on this page.

### Interests

• Particle Physics
• Quantum Field Theory
• Collider Physics
• Jets and QCD
• Data Science
• Machine Learning
• Optimal Transport
• Algorithms/Software

### Education

• PhD Candidate in Physics, 2016-

Massachusetts Institute of Technology

• AM in Physics, 2016

Harvard University

• AB in Physics and Mathematics, 2016

Harvard University
summa cum laude, Highest Honors
Secondary field in computer science

# Publications

My primary publications are listed below with quick descriptions. Click on a title to get more detailed information about a paper including the abstract and selected figures. Publications can be searched or filtered here. Note that authorship is alphabetical in high-energy physics.

The background image is a visualization of an Energy Flow Network used to classify quark and gluon jets. The sizes and locations of the rings highlight the singularity structure of QCD.

### The Hidden Geometry of Particle Collisions

We unify many concepts in collider physics, including infrared and collinear safety, observables, jet finding, pileup mitigation and more, using a geometric language based on the Energy Mover’s Distance. Along the way, we develop new techniques grounded in this geometry, including extensions of observables, new jet-finding algorithms, novel pileup mitigation based on Apollonius diagrams, and a concrete notion of “theory space.”

### OmniFold: A Method to Simultaneously Unfold All Observables

We develop OmniFold, an ML-based unfolding technique that can incorporate full-phase-space information, works without binning, and can avoid choosing specific observables.

### Cutting Multiparticle Correlators Down to Size

We show that a broad class of mathematical objects, multiparticle correlators, can be manipulated by “cutting” the vertices and edges of their graphical representation, leading to many identities, computational speedups, and surprising connections to string theory.

### Exploring the Space of Jets with CMS Open Data

We explore the CMS 2011A Jet Primary Dataset using standard jet substructure observables as well as the Energy Mover’s Distance. Our reprocessed datasets and analysis code are made public to facilitate future Open Data studies.

### The Machine Learning Landscape of Top Taggers

A community report on a variety of ML top taggers to which we contributed a PFN, EFN, and EFP model.

### The Metric Space of Collider Events

We develop a metric, the Energy Mover’s Distance (EMD), on the space of events that, intuitively, is the amount of “work” required to rearrange one event into another. Many techniques that require a pairwise distance between objects can now be applied to collider events, including quantifying event distortion, classification based on density estimation, and studying the space of events itself.

### Energy Flow Networks: Deep Sets for Particle Jets

We adapt and specialize the Deep Sets neural network architecture for use with collider events, since the particles in an event naturally form a variable-length, unordered set of objects. Our resulting Energy Flow Networks (EFNs) and Particle Flow Networks (PFNs) are powerful yet simple architectures for use in collider physics.
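As a rough illustration of the Deep Sets idea, an EFN can be viewed as $F\left(\sum_i z_i \Phi(\hat p_i)\right)$, where $z_i$ is a particle's energy fraction and $\hat p_i$ its angular position. The sketch below is not the EnergyFlow implementation: the functions `phi_map` and `F` are fixed, hypothetical stand-ins for trained networks, chosen only to show that such a sum is independent of particle ordering and unchanged by an exact collinear split.

```python
import math

def phi_map(y, phi):
    """Hypothetical per-particle map (stand-in for a trained network)."""
    return (math.cos(y), math.sin(phi), y * phi)

def F(latent):
    """Hypothetical event-level function (stand-in for a trained network)."""
    return math.tanh(sum(latent))

def efn_like(event):
    """Evaluate F(sum_i z_i * phi_map(p_i)) for an event of (z, y, phi) tuples."""
    latent = [0.0, 0.0, 0.0]
    for z, y, phi in event:
        for k, value in enumerate(phi_map(y, phi)):
            latent[k] += z * value
    return F(latent)

event = [(0.5, 0.1, 0.2), (0.3, -0.4, 1.0), (0.2, 0.7, -0.3)]

# Same particles in a different order.
shuffled = [event[2], event[0], event[1]]

# First particle split into two collinear particles sharing its energy.
split = [(0.25, 0.1, 0.2), (0.25, 0.1, 0.2)] + event[1:]

print(efn_like(event), efn_like(shuffled), efn_like(split))
```

The three printed values should agree up to floating-point rounding, because the energy-weighted sum treats the event as an unordered set.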

### An operational definition of quark and gluon jets

We develop a precise, practical, hadron-level definition of quark and gluon jets based on topic modeling of two mixed samples of jets. This allows for data-driven extractions of separate quark- and gluon-jet cross sections, among other things.

### Learning to classify from impure samples with high-dimensional data

We study two methods of weakly supervised training in the context of jet classification, extending them to deep neural network architectures. We find that the Classification Without Labels (CWoLa) paradigm outperforms Learning from Label Proportions (LLP).
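The reasoning behind CWoLa can be checked numerically. If two mixed samples have signal fractions $f_1 > f_2$, the likelihood ratio between the mixed samples is a monotonically increasing function of the signal-to-background likelihood ratio $L$, so a classifier that separates the mixed samples optimally also orders events by how signal-like they are. The fractions and the grid of $L$ values below are hypothetical, chosen only for illustration.

```python
def mixed_likelihood_ratio(L, f1, f2):
    """Likelihood ratio of mixed sample 1 to mixed sample 2.

    L is the signal/background likelihood ratio p_S(x) / p_B(x);
    f1 and f2 are the signal fractions of the two mixed samples.
    """
    return (f1 * L + (1.0 - f1)) / (f2 * L + (1.0 - f2))

f1, f2 = 0.8, 0.3  # hypothetical signal fractions, with f1 > f2
Ls = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0]
ratios = [mixed_likelihood_ratio(L, f1, f2) for L in Ls]

# The mixed-sample ratio increases with L, so both define the same ordering.
is_monotonic = all(a < b for a, b in zip(ratios, ratios[1:]))
print(is_monotonic)
```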

### Energy Flow Polynomials: A complete linear basis for jet substructure

We develop the Energy Flow Polynomials (EFPs), a set of IRC-safe observables that form an (over)complete basis for any IRC-safe observable. This supports the sufficiency of linear methods for tasks such as classifying different jets, and indeed we find that a linear classifier using EFPs performs surprisingly well on a variety of jet discrimination tasks.
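To make the definition concrete: each EFP corresponds to a multigraph, with one sum over particles per vertex, one energy fraction $z_i$ per vertex, and one angular distance $\theta_{ij}^\beta$ per edge. The brute-force sketch below follows that definition directly. It is not the EnergyFlow package, which uses much faster algorithms, and the function name `efp` and its arguments are assumptions made for this example. Azimuthal wrapping is ignored for simplicity.

```python
import itertools
import math

def efp(event, edges, n_vertices, beta=1.0):
    """Brute-force EFP for a multigraph with the given edges.

    event: list of (z, y, phi), with z the energy (or pT) fraction.
    edges: list of (k, l) pairs of vertex indices in range(n_vertices).
    """
    def theta(a, b):
        # Rapidity-azimuth distance raised to the power beta.
        return math.hypot(a[1] - b[1], a[2] - b[2]) ** beta

    total = 0.0
    # One sum over particles for every vertex of the graph.
    for assignment in itertools.product(event, repeat=n_vertices):
        term = 1.0
        for particle in assignment:
            term *= particle[0]
        for k, l in edges:
            term *= theta(assignment[k], assignment[l])
        total += term
    return total

# Two particles with equal energy fractions, separated by 1.0 in rapidity.
two = [(0.5, 0.0, 0.0), (0.5, 1.0, 0.0)]

line = efp(two, [(0, 1)], n_vertices=2)                      # single edge
triangle = efp(two, [(0, 1), (1, 2), (0, 2)], n_vertices=3)  # triangle graph
print(line, triangle)
```

The triangle EFP vanishes here because any assignment of three vertices to two particles places two vertices on the same particle, which makes one edge factor zero.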

### Pileup Mitigation with Machine Learning (PUMML)

We develop the PUMML framework for mitigating the contamination from extra protons colliding at the LHC using machine learning. We demonstrate that a convolutional neural network can clean up such contamination at least as well as existing methods, with improvements in robustness across a wide variety of pileup levels.

### Deep learning in color: Towards automated quark/gluon jet discrimination

We show for the first time that deep learning is quite successful at discriminating between quark and gluon jets. We use a convolutional neural network trained on jet images and observe large improvements in classification efficiency, as well as rough insensitivity to the mismodeling of quark and gluon jets by Monte Carlo simulations.

# Projects

### EnergyFlow

Python package for computing Energy Flow Polynomials, instantiating Energy/Particle Flow Networks, computing the Energy Mover’s Distance between events, and working with particle kinematics in Python.

### EnergyEnergyCorrelators

C++ library with a Python wrapper for computing $N$-point energy-energy correlators and related high-dimensional structures. Uses the Boost.Histogram library for simple, efficient, and flexible binning of distributions.

### Wasserstein

A C++ library with a Python wrapper for computing the $p$-Wasserstein distances, known as the Earth Mover’s Distance for $p=1$ and the Energy Mover’s Distance in particle physics.
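For intuition, the $p$-Wasserstein distance has a simple closed form in one special case: for two one-dimensional samples of equal size and equal weights (with $p \ge 1$), the optimal transport plan pairs the sorted points. The sketch below implements only that special case. It is not the Wasserstein library's API, and `wasserstein_1d` is a hypothetical helper name.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size, equal-weight 1D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D the optimal plan matches the k-th smallest x to the k-th smallest y.
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return cost ** (1.0 / p)

# Shifting every point by 1 costs exactly 1 unit of "work" for p = 1.
d1 = wasserstein_1d([0.0, 1.0], [1.0, 2.0], p=1)
d2 = wasserstein_1d([0.0, 0.0], [1.0, 3.0], p=2)
print(d1, d2)
```

General events in the rapidity-azimuth plane have unequal weights and live in two dimensions, so they require a full optimal transport solver rather than this shortcut.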

### MOD

The MIT Open Data project uses public collider data for scientific studies that complement the work of the experimental collaborations. For our analysis using the CMS 2011A Jet Primary Dataset and associated simulated datasets, we re-released a number of datasets in an easy-to-use format and made our entire analysis publicly available.

### EventGeneration

A C++ library for facilitating particle physics event generation with Pythia8 and jet clustering with FastJet3, including the association of the hard-process, parton-level, and hadron-level events. Includes a Python script for reading the resulting text files.

# Experience

#### Center for Theoretical Physics, Massachusetts Institute of Technology

Sep 2016 - Present Cambridge, MA

• Machine learning neural network architecture/algorithm development for high-energy particle physics datasets
• Software library development and creation of easy-to-use public datasets, including reprocessing TBs of CMS Open Data
• Studied Large Hadron Collider phenomena, jet physics, quantum field theory, quantum chromodynamics

#### 8.09/8.309 - Advanced Classical Mechanics, Massachusetts Institute of Technology

Sep 2017 - Dec 2019 Cambridge, MA

TA for classical mechanics taught by Iain Stewart in 2017, 2018, and 2019

• Taught weekly sections (2018, 2019)
• Held regular office hours
• Helped with exam review sessions
• Graded homework (2017) and exams

#### Harvard University

Sep 2015 - May 2016 Cambridge, MA

#### Harvard University

Aug 2012 - May 2016 Cambridge, MA
• summa cum laude
• Highest Honors in Physics
• Secondary field in computer science
• John Harvard Scholarship (2014 - 2015)
• Derek C. Bok Award for Distinction in Teaching (2014)
• Harvard College Scholarship (2013 - 2014)

#### Harvard University

Jun 2015 - Aug 2015 Cambridge, MA
• Computed the normal modes of an exponential block-spring system, allowing for the definition of a family of Fourier-like discrete transformations from position space to mode space (with Howard Georgi and Matthew Schwartz)
• Explored the quantum-to-classical transition through decoherence to a pointer basis (with Matthew Reece)

#### Jane Street Capital

Jan 2015, Jan 2016 New York, NY
• Studied financial markets
• Wrote bash program to analyze novel type of options trade
• Practiced trading in mock simulations

#### Harvard University Physics Department

Sep 2014 - Dec 2015 Cambridge, MA

TF for Honors Special Relativity (Physics 16, Fall 2014) taught by Howard Georgi and Quantum Mechanics I (Physics 143a, Fall 2015) taught by Matt Reece

• Taught weekly sections
• Prepared practice problems
• Organized LaTeX and Mathematica review sessions

#### Superconducting Electronics Group, Quantum Computing Collaboration, Northrop Grumman Electronic Systems

May 2014 - Aug 2014 Baltimore, MD
Wrote MATLAB program to interface with existing experimental code base to improve the fidelity of high-speed, precision microwave pulses used for qubit control via calculation of a transfer function and deconvolution methods

#### Harvard University Mathematics Department

Sep 2013 - Dec 2013 Cambridge, MA
• Ran weekly problem sessions
• Worked one-on-one with students in class and the math question center

#### Asymmetric Operations Department, Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory

May 2012 - Aug 2013 Laurel, MD
• Investigated electromagnetic properties of high-impedance Sievenpiper metamaterial structures for low-profile RF antenna applications
• Characterized material properties of magnetic nanoparticle polymers
• Catalogued dielectric properties of explosive simulant materials for transportation security purposes

#### Century High School

Aug 2008 - Jun 2012 Sykesville, MD
• National AP Scholar
• STEM Academy Award
• Math and Science Content Awards