Alex Rogozhnikov

Blog: Brilliantly wrong
Github: arogozhnikov
ResearchGate: Alex_Rogozhnikov

Education and Career

2014-2017 — research scientist at Yandex (leading search engine in Russia), worked on joint research projects with CERN (European Organization for Nuclear Research).
2015 — Ph.D. in computer science from Moscow State University
2014 — M.Sc. in machine learning from Yandex School of Data Analysis
2014 — M.Sc. in mathematical physics from Higher School of Economics (diploma with honors)
2012 — M.Sc. in computer science from Moscow State University (diploma with honors)

Previously a member of CERN experiments: LHCb, SHiP, and an associated member of the OPERA experiment at INFN.
Winner of national olympiads in physics and mathematics.

Research Experience

Applied machine learning to different problems in high energy physics at the Large Hadron Collider (and other experiments). I have developed machine learning-based approaches for particle identification, tracking, online data filtering, flavour tagging and particle shower detection.

Proposed three machine learning algorithms to address issues specific to HEP applications.

Previous research topics include optimal control, mathematical physics, numerical methods and solid state theory.

Selected publications


Talks at conferences and workshops: ACAT 2016 (Chile), Heavy Flavour Data Mining Workshop 2016 (Zurich), International Workshop on Nuclear Emulsions 2016 (Naples), IML Machine Learning workshop 2017 (Geneva). Co-authors also presented results at CHEP 2015 (Okinawa, Japan), ICML 2015 (Lille, France), ML prospects and applications 2015 (Berlin) and ACAT 2017 (upcoming, Seattle, USA).

Software Development

I am the developer of hep_ml, a package of machine learning algorithms for high energy physics with a scikit-learn-compatible interface, and was previously one of the core contributors to yandex/REP, a docker-based environment for data analysis in particle physics.

Proficient in python (and I use modern fortran to optimize critical places). I know theano and pytorch, and am familiar with keras, lasagne and tensorflow.
I use javascript from time to time, with glsl / hlsl for visualizations.
Previously I used C# and the .NET platform, and have some experience with C++, PHP and SQL. I took courses in MATLAB and assembler, and passed the CUDA certification.

In 2012, our team took 1st place (out of more than 500 teams) in the international "Accelerate Your Code" competition by Intel: we provided the fastest parallel DNA processing system, written in C++ with OpenMP.


Organized and co-organized around a dozen in-class data challenges on the Kaggle platform. Provided specific evaluation metrics for the "Flavours of Physics" challenge at Kaggle run by Yandex & CERN.

My interactive visualizations of different machine learning techniques can be found in the "Brilliantly wrong" blog.

Research projects

InfiniteBoost: building infinite ensembles with gradient descent (with T.Likhomanenko)

InfiniteBoost is a modification of gradient boosting that converges as the number of trees in the ensemble tends to infinity. In this approach, it is also possible to automatically tune the capacity (a parameter similar to the learning rate in gradient boosting). Read more
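The core idea can be sketched in a few lines of numpy (a toy regression version with decision stumps, not the exact algorithm from the paper): the ensemble is the capacity times a running average of trees, so its norm stays bounded and the prediction changes less and less as trees are added.

```python
import numpy as np

def fit_stump(x, residual):
    # exhaustive search for the best threshold of a depth-1 regression tree
    best = None
    for t in np.unique(x)[:-1]:
        left, right = residual[x <= t], residual[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return np.where(x <= t, lv, rv)   # stump predictions on the training points

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(6 * x) + rng.normal(0, 0.1, 200)

capacity = 2.0                  # plays a role similar to the learning rate
tree_sum = np.zeros_like(y)     # running sum of tree predictions
F = np.zeros_like(y)            # current ensemble prediction
for m in range(1, 301):
    tree_sum += fit_stump(x, y - F)   # each new tree fits the current residuals
    F = capacity * tree_sum / m       # ensemble = capacity * average of trees
```

Because the ensemble is an average rather than a growing sum, the contribution of each additional tree shrinks like 1/m, which is what makes the infinite-tree limit well defined.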

Particle identification at the LHCb (with D.Derkach, M.Hushchyn, T.Likhomanenko)

LHCb is one of the four major experiments at the LHC, and it is a bit different from the others: LHCb is a single-arm detector and analyzes particles within a quite limited angular range. The advantage of this scheme, however, is that LHCb collects more information to identify particles, which makes it more precise in studies of b-physics.

We prepared a major update of the particle identification system using deep networks and GBDT. An important part was preparing models whose output is independent of momentum, using an approach from "Boosting to uniformity" (see below). Read more

Finding electromagnetic showers in the OPERA

The OPERA is an experiment placed inside a mountain in Italy; it was created to confirm neutrino oscillations (Nobel prize 2015). In this project I created a system that detects electromagnetic showers in the data collected by the OPERA: among millions of base tracks, it finds a pattern of several hundred base tracks. Read more

Reweighting with Boosted Decision Trees

An important problem in many high energy physics analyses is the discrepancy between simulated data and real data. The approach used previously to reduce this effect can only handle discrepancies in 1-2 variables.

I proposed an algorithm that directly solves the reweighting problem in many dimensions and additionally addresses some issues important for LHCb analyses, such as handling negative event weights (so-called sWeights). This tool is used in LHCb analyses. Read more
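For contrast, the classic one-dimensional baseline (per-bin histogram reweighting) looks like this; the distributions and bin count below are made up for illustration, and the proposed method replaces the per-bin ratio with a gradient-boosted reweighter that works in many dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
simulated = rng.normal(0.0, 1.2, 100_000)   # MC sample with mismodelled mean/width
real      = rng.normal(0.2, 1.0, 100_000)   # "real data" we want the MC to match

# per-bin weight = (real count) / (simulated count)
edges = np.linspace(-5, 5, 51)
mc_counts, _   = np.histogram(simulated, bins=edges)
data_counts, _ = np.histogram(real, bins=edges)
ratio = np.divide(data_counts, mc_counts,
                  out=np.ones_like(data_counts, dtype=float),
                  where=mc_counts > 0)

# look up the weight of the bin each simulated event falls into
bin_index = np.clip(np.searchsorted(edges, simulated) - 1, 0, len(ratio) - 1)
weights = ratio[bin_index]

# after reweighting, the weighted MC mean moves close to the data mean
print(np.average(simulated, weights=weights), real.mean())
```

The multidimensional reweighter itself is implemented in the hep_ml package (the hep_ml.reweight module), with a similar fit / predict_weights workflow.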

Inclusive flavour tagging at the LHCb (with D.Derkach, T.Likhomanenko)

Guessing the flavour of a neutral (non-charged) meson isn't easy, but it is required to estimate some of the standard model parameters. This information can be partially reconstructed by analyzing the tracks left by other particles produced in the collision.

We came up with a simple probabilistic model which combines information from all the other tracks, and it works better than previous approaches, where a separate analysis was performed for each type of tagging particle and for each meson. Read more
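Under the simplifying assumption that tracks are independent, combining per-track tag probabilities reduces to summing their log-odds (a minimal sketch of the idea, not the full model from the paper):

```python
import numpy as np

def combine_tags(p_same_flavour):
    # each entry is one track's estimated probability of a given flavour;
    # assuming independence, total log-odds = sum of per-track log-odds
    logits = np.log(p_same_flavour) - np.log1p(-p_same_flavour)
    return 1.0 / (1.0 + np.exp(-logits.sum()))

# three weak per-track predictions, each slightly favouring the same flavour
print(combine_tags(np.array([0.55, 0.60, 0.52])))
```

Even individually weak tracks reinforce each other: the combined probability is more confident than any single input, which is why pooling all tracks beats per-particle-type analyses.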

Later I tried to improve the system by including an attention-like mechanism, which helps when the amount of training data is limited. Read more

Boosting to uniformity (main author, with A. Bukva, V. Gligorov, A. Ustyuzhanin, and M. Williams)

Various statistical dependencies may be easy to exploit to improve classification, but are undesirable to influence our decisions (simplified examples are gender and race when using ML in hiring). Simply removing these features from training may not be sufficient, since other features may still carry this information (e.g. a photo in a CV makes it easy to guess gender, and in some languages the writer's gender can be inferred from the text).

We developed a method that is capable of suppressing the dependency between the classification result and one or more selected variables using a specific loss based on the Cramer-von Mises criterion. Read more
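A CvM-style measure of this dependency can be sketched as follows (an illustrative numpy implementation with made-up data; it measures flatness rather than reproducing the training loss itself): for each bin of the protected variable, compare the local CDF of classifier predictions with the global one.

```python
import numpy as np

def cvm_flatness(predictions, protected, n_bins=10, n_points=50):
    # average squared difference between the prediction CDF inside each
    # bin of the protected variable and the global prediction CDF
    grid = np.quantile(predictions, np.linspace(0.01, 0.99, n_points))
    global_cdf = (predictions[:, None] <= grid).mean(axis=0)
    edges = np.quantile(protected, np.linspace(0, 1, n_bins + 1))
    total, weight = 0.0, 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (protected >= lo) & (protected <= hi)
        if mask.sum() == 0:
            continue
        local_cdf = (predictions[mask][:, None] <= grid).mean(axis=0)
        total += mask.mean() * np.mean((local_cdf - global_cdf) ** 2)
        weight += mask.mean()
    return total / weight

rng = np.random.default_rng(2)
protected = rng.uniform(0, 1, 20_000)
flat_preds = rng.uniform(0, 1, 20_000)                       # independent of protected
biased_preds = np.clip(protected + rng.normal(0, 0.2, 20_000), 0, 1)

print(cvm_flatness(flat_preds, protected), cvm_flatness(biased_preds, protected))
```

A value near zero means the classifier output has the same distribution in every bin of the protected variable; the method turns a measure of this kind into a differentiable penalty added to the classification loss.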

Tracking in the COMET (with E. Gillies)

The COMET is a high energy physics experiment currently under construction in Japan, aimed at finding charged lepton flavour violating (LFV) transitions. The goal was to prepare a fast system that efficiently selects candidate events for such transitions.

Using machine learning coupled with a soft modification of the Hough transform, we were able to improve the wire-level recognition quality: ROC AUC from 0.95 to 0.9993. Read more
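For reference, the classic (hard-voting) Hough transform for straight lines looks like this; the project used a soft modification of this voting idea on wire hits, and the toy geometry below is made up for illustration:

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=100, rho_max=2.0):
    # every point votes for all lines rho = x*cos(theta) + y*sin(theta)
    # passing through it; real tracks produce a peak in the accumulator
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_rho)
        acc[np.arange(n_theta)[ok], idx[ok]] += 1
    return acc, thetas

# ten hits on the vertical line x = 0.5, plus ten random noise hits
rng = np.random.default_rng(3)
track = np.column_stack([np.full(10, 0.5), np.linspace(-1, 1, 10)])
noise = rng.uniform(-1, 1, (10, 2))
acc, thetas = hough_lines(np.vstack([track, noise]))

i, j = np.unravel_index(acc.argmax(), acc.shape)
print("theta of strongest line:", thetas[i])   # near 0 for a vertical line
```

The "soft" variant replaces the hard +1 votes with smooth weights, so the accumulator becomes differentiable and can be combined with a learned per-hit classifier.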

Inclusive trigger for the LHCb (contributing author)

Millions of collisions must be analyzed each second at the LHCb experiment, an enormous amount of data (which can't even be stored), so the experiment uses online triggers that decide which collisions to keep and which to delete.

Our team developed a new trigger system based on MatrixNet (Yandex's proprietary GBDT modification). I was responsible for speeding up the model and managed to compress an already trained MatrixNet ensemble from 10'000 trees to 100 without a significant drop in quality. Read more

Optimal boundary control of oscillations in distributed systems (PhD thesis)

I was in a group led by Vladimir Il'in (Russian wiki) and investigated problems of optimal boundary control of oscillations described by the wave equation (exciting / damping particular oscillations by actively interacting with the system at its boundary). Typical approaches investigate numerical algorithms that find an approximate solution; our group developed methods to solve the problem analytically, hence precisely.
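For a single homogeneous string, the standard setting reads as follows (a generic sketch; the thesis treats composite systems with matching conditions between parts):

```latex
% wave equation on a string of length l, controlled from both ends over [0, T]
\begin{aligned}
u_{tt}(x,t) &= a^2\, u_{xx}(x,t), && 0 < x < l,\ 0 < t < T,\\
u(0,t) &= \mu(t), \quad u(l,t) = \nu(t), && \text{(boundary controls)}\\
u(x,0) &= \varphi(x), \quad u_t(x,0) = \psi(x), && \text{(initial state)}
\end{aligned}
```

The controls $\mu, \nu$ must drive the system to a prescribed terminal state $(u, u_t)\big|_{t=T}$, typically while minimizing a boundary-energy functional; the analytic methods give these optimal controls in closed form rather than approximating them numerically.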

I introduced a special notation based on operator matrices to describe the problem and provided the optimal solution of the control problem for composite rods/strings made of multiple parts (previous results covered only a very specific case with two parts and additional strong requirements).

This research was selected as the "best student paper in mathematics" by the Russian Academy of Sciences in 2012.

Computing properties of dense loop model using duality with spanning web model

This is research in solid state theory: both the dense loop model and the spanning web model are lattice models (with corresponding partition functions), and their nice duality made it possible to compute the partition function and loop density of the dense loop model. Read more (my part was the computations for web models).