Losses for Gradient Boosting¶
hep_ml.losses contains different loss functions to use in gradient boosting.
Apart from standard classification losses, hep_ml contains losses for uniform classification
(see BinFlatnessLossFunction, KnnFlatnessLossFunction, KnnAdaLossFunction)
and for ranking (see RankBoostLossFunction).
Interface¶
Loss functions inside hep_ml are stateful estimators and require initial fitting, which is done automatically inside gradient boosting.
All loss functions should be derived from AbstractLossFunction and implement its interface.
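For orientation, a minimal custom loss might look like the sketch below (HalfMSELoss is a hypothetical example and not part of the library; only the method names come from the interface documented on this page):
>>> import numpy
>>> from hep_ml.losses import AbstractLossFunction
>>> class HalfMSELoss(AbstractLossFunction):
...     """Illustrative loss: half of the weighted squared error."""
...     def fit(self, X, y, sample_weight):
...         # remember labels and weights; heavy preprocessing would go here
...         self.y = numpy.array(y, dtype=float)
...         self.sample_weight = numpy.array(sample_weight, dtype=float)
...         return self
...     def prepare_tree_params(self, y_pred):
...         # the residual (y - y_pred) is used as the tree target,
...         # event weights are passed to the tree separately
...         return self.y - y_pred, self.sample_weight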
Examples¶
Training gradient boosting, optimizing LogLoss and using all features:
>>> from hep_ml.gradientboosting import UGradientBoostingClassifier, LogLossFunction
>>> classifier = UGradientBoostingClassifier(loss=LogLossFunction(), n_estimators=100)
>>> classifier.fit(X, y, sample_weight=sample_weight)
Using composite loss function and subsampling:
>>> loss = CompositeLossFunction()
>>> classifier = UGradientBoostingClassifier(loss=loss, subsample=0.5)
To get uniform predictions in mass for the background (note that mass should not be present among the training features):
>>> loss = BinFlatnessLossFunction(uniform_features=['mass'], uniform_label=0, train_features=['pt', 'flight_time'])
>>> classifier = UGradientBoostingClassifier(loss=loss)
To get uniform predictions in both signal and background:
>>> loss = BinFlatnessLossFunction(uniform_features=['mass'], uniform_label=[0, 1], train_features=['pt', 'flight_time'])
>>> classifier = UGradientBoostingClassifier(loss=loss)
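After fitting, the classifier exposes the usual scikit-learn prediction interface, so (assuming the classifier above has been fitted on a suitable pandas DataFrame) probabilities can be obtained with:
>>> proba = classifier.predict_proba(X)
>>> signal_proba = proba[:, 1]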
- class hep_ml.losses.AbstractLossFunction[source]¶
Bases:
sklearn.base.BaseEstimator
This is the base class for loss functions used in hep_ml. Main differences compared to scikit-learn loss functions:
losses are stateful and may require fitting on training data before usage;
hence, when computing the gradient or hessian, one should provide predictions for all events;
losses are objects that are passed as estimators to gradient boosting (see examples);
only the two-class case is supported, and different classes may have different roles and meanings.
- compute_optimal_step(y_pred)[source]¶
Compute the optimal global step. This method is typically used to make an optimal step before fitting trees, in order to reduce variance.
- Parameters
y_pred – initial predictions, numpy.array of shape [n_samples]
- Returns
float
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- prepare_new_leaves_values(terminal_regions, leaf_values, y_pred)[source]¶
A loss function can provide better values for leaves by overriding this method.
- Parameters
terminal_regions – indices of terminal regions of each event.
leaf_values – numpy.array, current mapping of leaf indices to prediction values.
y_pred – predictions before adding new tree.
- Returns
numpy.array with new prediction values for all leaves.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree
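To illustrate how these parameters are consumed, here is a hedged sketch of one boosting iteration (the zero initial predictions, the tree settings, and X, y from the examples above are assumptions, not the library's internals):
>>> import numpy
>>> from sklearn.tree import DecisionTreeRegressor
>>> from hep_ml.losses import LogLossFunction
>>> loss = LogLossFunction()
>>> loss.fit(X, y, sample_weight=numpy.ones(len(y)))
>>> y_pred = numpy.zeros(len(y))  # predictions before the first tree
>>> tree_target, tree_weight = loss.prepare_tree_params(y_pred)
>>> tree = DecisionTreeRegressor(max_depth=3)
>>> tree.fit(X, tree_target, sample_weight=tree_weight)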
- class hep_ml.losses.AdaLossFunction(regularization=5.0)[source]¶
Bases:
hep_ml.losses.HessianLossFunction
AdaLossFunction is the same as the exponential loss function (aka exploss).
- Parameters
regularization – float, penalty for leaves with few events, corresponds roughly to the number of added events of both classes to each leaf.
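For reference, with class labels encoded as \(y_i = \pm 1\), the exponential loss has the usual AdaBoost form (given here for orientation; the implementation may differ in normalization and regularization details):
\(\text{loss} = \sum_i w_i \exp(-y_i \cdot \text{pred}_i)\)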
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- hessian(y_pred)[source]¶
Returns the diagonal of the hessian matrix.
- Parameters
y_pred – numpy.array of shape [n_samples] with events passed in the same order as in fit.
- Returns
numpy.array of shape [n_samples] with second derivatives with respect to each prediction.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree
- class hep_ml.losses.BinFlatnessLossFunction(uniform_features, uniform_label, n_bins=10, power=2.0, fl_coefficient=3.0, allow_wrong_signs=True)[source]¶
Bases:
hep_ml.losses.AbstractFlatnessLossFunction
This loss function contains separately penalty for non-flatness and for bad prediction quality. See [FL] for details.
\(\text{loss} =\text{ExpLoss} + c \times \text{FlatnessLoss}\)
FlatnessLoss is computed using binning of the uniform features.
- Parameters
uniform_features (list[str]) – names of features, along which we want to obtain uniformity of predictions
uniform_label (int|list[int]) – the label(s) of classes for which uniformity is desired
n_bins (int) – number of bins along each variable
power (float) – the loss contains the difference \(|F - F_{\text{bin}}|^p\), where p is the power
fl_coefficient (float) – multiplier for flatness_loss. Controls the tradeoff of quality vs uniformity.
allow_wrong_signs (bool) – defines whether the gradient may have a sign different from the “sign of the class” (i.e. may have a negative gradient on signal). If False, such values will be clipped to zero.
- FL
A. Rogozhnikov et al, New approaches for boosting to uniformity http://arxiv.org/abs/1410.4140
- class hep_ml.losses.CompositeLossFunction(regularization=5.0)[source]¶
Bases:
hep_ml.losses.HessianLossFunction
Composite loss function is defined as exploss for background events and logloss for signal, with proper constants.
This kind of loss function is very useful for optimizing AMS or in situations where a very clean signal is expected.
- Parameters
regularization – float, penalty for leaves with few events, corresponds roughly to the number of added events of both classes to each leaf.
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- class hep_ml.losses.KnnAdaLossFunction(uniform_features, uniform_label, knn=10, row_norm=1.0)[source]¶
Bases:
hep_ml.losses.AbstractMatrixLossFunction
Modification of AdaLoss to achieve uniformity of predictions.
\(\text{loss} = \sum_i w_i \exp(- \sum_j a_{ij} y_j \text{score}_j)\)
The matrix \(a\) is square; each row corresponds to a single event in the training dataset, and if that event belongs to a uniform class, its row contains ones at the positions of its closest neighbours. See [BU] for details.
- Parameters
uniform_features (list[str]) – the features, along which uniformity is desired
uniform_label (int|list[int]) – the label (labels) of ‘uniform classes’
knn (int) – the number of nonzero elements in a row corresponding to an event from a ‘uniform class’
- BU
A. Rogozhnikov et al, New approaches for boosting to uniformity http://arxiv.org/abs/1410.4140
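A hedged usage sketch, mirroring the examples at the top of this page (the feature names are placeholders; X is assumed to be a pandas DataFrame containing the 'mass' column):
>>> loss = KnnAdaLossFunction(uniform_features=['mass'], uniform_label=0, knn=10)
>>> classifier = UGradientBoostingClassifier(loss=loss)
>>> classifier.fit(X, y, sample_weight=sample_weight)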
- class hep_ml.losses.KnnFlatnessLossFunction(uniform_features, uniform_label, n_neighbours=100, power=2.0, fl_coefficient=3.0, max_groups=5000, allow_wrong_signs=True, random_state=42)[source]¶
Bases:
hep_ml.losses.AbstractFlatnessLossFunction
This loss function contains separately penalty for non-flatness and for bad prediction quality. See [FL] for details.
\(\text{loss} = \text{ExpLoss} + c \times \text{FlatnessLoss}\)
FlatnessLoss is computed using the nearest neighbours in the space of uniform features.
- Parameters
uniform_features (list[str]) – names of features, along which we want to obtain uniformity of predictions
uniform_label (int|list[int]) – the label(s) of classes for which uniformity is desired
n_neighbours (int) – number of neighbors used in flatness loss
power (float) – the loss contains the difference \(|F - F_{\text{bin}}|^p\), where p is the power
fl_coefficient (float) – multiplier for flatness_loss. Controls the tradeoff of quality vs uniformity.
allow_wrong_signs (bool) – defines whether the gradient may have a sign different from the “sign of the class” (i.e. may have a negative gradient on signal). If False, such values will be clipped to zero.
max_groups (int) – to limit memory consumption when the training sample is large, we randomly pick this number of points, each together with its members.
- FL
A. Rogozhnikov et al, New approaches for boosting to uniformity http://arxiv.org/abs/1410.4140
- class hep_ml.losses.LogLossFunction(regularization=5.0)[source]¶
Bases:
hep_ml.losses.HessianLossFunction
Logistic loss function (logloss), aka binomial deviance, aka cross-entropy, aka log-likelihood loss.
- Parameters
regularization – float, penalty for leaves with few events, corresponds roughly to the number of added events of both classes to each leaf.
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- hessian(y_pred)[source]¶
Returns the diagonal of the hessian matrix.
- Parameters
y_pred – numpy.array of shape [n_samples] with events passed in the same order as in fit.
- Returns
numpy.array of shape [n_samples] with second derivatives with respect to each prediction.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree
- class hep_ml.losses.MAELossFunction[source]¶
Bases:
hep_ml.losses.AbstractLossFunction
Mean absolute error loss function, used for regression. \(\text{loss} = \sum_i |y_i - \hat{y}_i|\)
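For regression, the loss is passed to the gradient boosting regressor analogously to the classification examples above (a sketch; it assumes UGradientBoostingRegressor from hep_ml.gradientboosting and a training sample X, y):
>>> from hep_ml.gradientboosting import UGradientBoostingRegressor
>>> from hep_ml.losses import MAELossFunction
>>> regressor = UGradientBoostingRegressor(loss=MAELossFunction(), n_estimators=100)
>>> regressor.fit(X, y)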
- compute_optimal_step(y_pred)[source]¶
Compute the optimal global step. This method is typically used to make an optimal step before fitting trees, in order to reduce variance.
- Parameters
y_pred – initial predictions, numpy.array of shape [n_samples]
- Returns
float
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- prepare_new_leaves_values(terminal_regions, leaf_values, y_pred)[source]¶
A loss function can provide better values for leaves by overriding this method.
- Parameters
terminal_regions – indices of terminal regions of each event.
leaf_values – numpy.array, current mapping of leaf indices to prediction values.
y_pred – predictions before adding new tree.
- Returns
numpy.array with new prediction values for all leaves.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree
- class hep_ml.losses.MSELossFunction(regularization=5.0)[source]¶
Bases:
hep_ml.losses.HessianLossFunction
Mean squared error loss function, used for regression. \(\text{loss} = \sum_i (y_i - \hat{y}_i)^2\)
- Parameters
regularization – float, penalty for leaves with few events, corresponds roughly to the number of added events of both classes to each leaf.
- compute_optimal_step(y_pred)[source]¶
Optimal step is computed using Newton-Raphson algorithm (10 iterations).
- Parameters
y_pred – predictions (usually, zeros)
- Returns
float
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- hessian(y_pred)[source]¶
Returns the diagonal of the hessian matrix.
- Parameters
y_pred – numpy.array of shape [n_samples] with events passed in the same order as in fit.
- Returns
numpy.array of shape [n_samples] with second derivatives with respect to each prediction.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree
- class hep_ml.losses.RankBoostLossFunction(request_column, penalty_power=1.0, update_iterations=1)[source]¶
Bases:
hep_ml.losses.HessianLossFunction
RankBoostLossFunction is the target of optimization in the RankBoost [RB] algorithm, which was developed for ranking and introduces penalties for the wrong order of predictions.
This implementation goes further: optimal leaf values are selected by an iterative procedure, and a matrix decomposition of the loss function is used, which is very effective when labels come from a very limited set (usually 0, 1, 2, 3, 4).
\(\text{loss} = \sum_{ij} w_{ij} \exp(\text{pred}_i - \text{pred}_j)\),
\(w_{ij} = (\alpha + \beta \, [\text{query}_i = \text{query}_j]) \, R_{\text{label}_i, \text{label}_j}\), where \(R_{ij} = 0\) if \(i \leq j\), else \(R_{ij} = (i - j)^{p}\)
- Parameters
request_column (str) – name of the column with search query ids. Higher attention is paid to pairs of samples with the same query.
penalty_power (float) – describes the dependence of the penalty on the difference between target labels.
update_iterations (int) – number of minimization steps used to find optimal values in leaves.
- RB
Freund et al. An Efficient Boosting Algorithm for Combining Preferences
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
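A hedged usage sketch (the query column name is a placeholder; X is assumed to be a pandas DataFrame that contains this column, y holds small integer relevance labels, and in practice the query id column is usually excluded from the training features; whether the classifier or the regressor wrapper fits your setup better depends on the labels, the regressor is shown here):
>>> from hep_ml.gradientboosting import UGradientBoostingRegressor
>>> loss = RankBoostLossFunction(request_column='query_id')
>>> ranker = UGradientBoostingRegressor(loss=loss, n_estimators=100)
>>> ranker.fit(X, y)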
- class hep_ml.losses.ReweightLossFunction(regularization=5.0)[source]¶
Bases:
hep_ml.losses.AbstractLossFunction
Loss function used to reweight distributions. Works inside hep_ml.reweight.GBReweighter. See [Rew] for details.
Conventions: \(y=0\) – target distribution, \(y=1\) – original distribution.
The resulting weights look like:
\(w = w_0\) for events from the target distribution,
\(w = w_0 \exp(\text{pred})\) for events from the original distribution (so predictions for the target distribution are ignored).
- Parameters
regularization (float) – roughly, the number of events added to each leaf to prevent overfitting.
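This loss is normally not constructed by hand; it is created internally by the reweighter. A hedged sketch of typical reweighter usage (original and target are assumed to be two samples, e.g. pandas DataFrames with the same columns):
>>> from hep_ml.reweight import GBReweighter
>>> reweighter = GBReweighter(n_estimators=50, max_depth=3)
>>> reweighter.fit(original, target)
>>> new_weights = reweighter.predict_weights(original)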
- fit(X, y, sample_weight)[source]¶
This method is optional; it is called before all the others. Heavy preprocessing should be done here.
- prepare_new_leaves_values(terminal_regions, leaf_values, y_pred)[source]¶
A loss function can provide better values for leaves by overriding this method.
- Parameters
terminal_regions – indices of terminal regions of each event.
leaf_values – numpy.array, current mapping of leaf indices to prediction values.
y_pred – predictions before adding new tree.
- Returns
numpy.array with new prediction values for all leaves.
- prepare_tree_params(y_pred)[source]¶
Prepares parameters for a regression tree that minimizes MSE.
- Parameters
y_pred – predictions for all events passed to the fit method, in the same order.
- Returns
tuple (tree_target, tree_weight) with target and weight to be used in decision tree