MLHEP 2015 lecture slides

At the end of this August our team from Yandex organized MLHEP 2015, a summer school on Machine Learning in High Energy Physics.

The school lasted only 4 days, but even in that short time we managed to teach a lot.

The school consisted of two tracks, introductory and advanced; every day each track had 2 lectures and 2 practical seminars. In addition, each evening there was a special physics talk by invited speakers from CERN.

And that is not all: we organized an in-class Kaggle competition based on the COMET tracking problem I wrote about (part 1, part 2), so participants could try ML methods on a real-world problem.

I gave the lectures on the introductory track. This was really challenging: fitting an ML course into 4 days for people who have no experience in ML and come from different backgrounds (most of the introductory track listeners were particle physicists, but that does not help much).

One more caveat: since the schedule was completely full, we decided to give no tasks (so all the theoretical knowledge had to come from the slides).

For this reason I decided to minimize the number of concepts introduced in the course. The only non-trivial notion I used was the decision function. No $F(x)$, no $h_i(x)$, no $Q(x, y)$, no margins, no $\Theta$, no $C(Y, F)$, or other such notation.
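For readers who have not met the term, here is a minimal sketch (not taken from the slides) of what a decision function is: a rule that maps a sample to a single real-valued score, which is then thresholded to get a class. The linear form, the weights, the bias and the two sample points are all made up for illustration.

```python
import math


def decision_function(x, weights, bias):
    """Return a real-valued score for one sample (hypothetical linear model)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(x, weights, bias, threshold=0.0):
    """Turn the score into a class label: 1 (signal) above the threshold, else 0."""
    return 1 if decision_function(x, weights, bias) > threshold else 0


def probability(x, weights, bias):
    """Optionally squash the score into (0, 1) with a sigmoid, as logistic regression does."""
    return 1.0 / (1.0 + math.exp(-decision_function(x, weights, bias)))


# Illustrative parameters and samples (assumptions, not real data).
weights = [2.0, -1.0]
bias = -0.5
signal_like = [1.0, 0.5]
background_like = [0.0, 1.0]

for sample in (signal_like, background_like):
    score = decision_function(sample, weights, bias)
    print(sample, score, predict(sample, weights, bias))
```

Most classifiers in the list below can be read this way: they differ in how the score is computed, not in how it is used.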

Despite these limitations, the course covered the whole ‘starter kit’ and even more:

kNN
optimal Bayesian classifier, QDA
logistic regression
neural networks
decision trees, building, splitting criteria
estimating feature importance
overfitting
ensembles, bagging
Random Forest
comparison of multidimensional distributions
AdaBoost
Gradient Boosting, modifications for regression, classification, ranking
Boosting to uniformity (uBoost and FlatnessLoss)
Fast predictions for online trigger systems (Bonsai BDT)
reweighting, Gradient Boosted reweighting
hyperparameter optimization, Gaussian Processes
using classifiers’ output to test physical hypotheses
unsupervised ML: PCA, autoencoders

I also significantly reduced the number of formulas and added various demonstrations of how the algorithms work.

This is really a lot for an introductory 4-day course, but I think it is fine to give more during the course. The problem is that I forgot to include some important notes with conclusions; next time I’ll add them explicitly to the slides :)

Slides

MLHEP 2015: Introductory Lecture #1 from arogozhnikov

MLHEP 2015: Introductory Lecture #2 from arogozhnikov

MLHEP 2015: Introductory Lecture #3 from arogozhnikov

MLHEP 2015: Introductory Lecture #4 from arogozhnikov
