Dived deeper into methods of training neural networks.
A good, though incomplete, overview of what people have done in this area is given in this article from 2006.
Unfortunately, it doesn't mention my favourite Rprop and its modifications (IRprop+, IRprop-).
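For reference, here is a minimal sketch of the IRprop- update rule: step sizes are adapted per weight from gradient-sign agreement, and gradient magnitudes are ignored. The function name and the hyperparameter values below are just the commonly quoted defaults, not tied to any particular implementation:

```python
import numpy as np

def irprop_minus(grad_fn, w, n_steps=100,
                 eta_plus=1.2, eta_minus=0.5,
                 delta_init=0.1, delta_min=1e-6, delta_max=50.0):
    """Sketch of IRprop-: per-weight step sizes, adapted from sign agreement."""
    delta = np.full_like(w, delta_init)
    g_prev = np.zeros_like(w)
    for _ in range(n_steps):
        g = grad_fn(w)
        same = g * g_prev > 0      # sign kept: grow the step
        flip = g * g_prev < 0      # sign flipped: shrink the step
        delta[same] = np.minimum(delta[same] * eta_plus, delta_max)
        delta[flip] = np.maximum(delta[flip] * eta_minus, delta_min)
        g = np.where(flip, 0.0, g)  # IRprop-: zero the gradient after a flip
        w = w - np.sign(g) * delta
        g_prev = g
    return w

# toy run: minimize f(w) = sum((w - 3)^2), gradient 2*(w - 3)
w_opt = irprop_minus(lambda w: 2 * (w - 3.0), np.array([10.0, -5.0]))
```

Since only the sign of the gradient is used, the method is insensitive to gradient scaling, which is a big part of why Rprop-family trainers are so robust.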
While experimenting further with neural networks, I decided to improve IRprop by tracking progress not only along each axis, but also along the directions
$w_i + w_j$ and $w_i - w_j$, expecting this to speed up training.
Well, it did increase the speed at first, but the loss soon stops decreasing; close to the minimum the method starts oscillating seriously and becomes simply unstable. This method is implemented as the experimental IRprop* trainer in `hep_ml`.
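To make the idea concrete, here is a rough sketch of Rprop-style bookkeeping maintained along the pairwise directions $e_i \pm e_j$. This is an illustrative reconstruction of the idea, not the actual IRprop* code from `hep_ml`; all names and parameter values are made up:

```python
import numpy as np

def init_state(n, delta_init=0.1):
    # one (previous projected gradient, step size) pair per direction e_i + s*e_j
    return {(i, j, s): (0.0, delta_init)
            for i in range(n) for j in range(i + 1, n) for s in (+1, -1)}

def pairwise_rprop_step(w, g, state, eta_plus=1.2, eta_minus=0.5,
                        delta_min=1e-6, delta_max=50.0):
    update = np.zeros_like(w)
    for (i, j, s), (prev, delta) in state.items():
        p = g[i] + s * g[j]               # gradient projected on e_i + s*e_j
        if p * prev > 0:                  # sign kept: grow the step
            delta = min(delta * eta_plus, delta_max)
        elif p * prev < 0:                # sign flipped: shrink, skip this move
            delta = max(delta * eta_minus, delta_min)
            p = 0.0
        step = -np.sign(p) * delta
        update[i] += step
        update[j] += s * step
        state[(i, j, s)] = (p, delta)
    return w + update, state

# toy run on f(w) = sum((w - 3)^2), gradient 2*(w - 3)
w, state = np.array([10.0, -5.0]), init_state(2)
for _ in range(50):
    w, state = pairwise_rprop_step(w, 2 * (w - 3.0), state)
```

Note that the number of tracked directions grows quadratically with the number of weights, so this bookkeeping is only cheap for small networks; the instability near the minimum described above is presumably what one would observe when many of these coupled directions start flipping signs at once.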