Missing Value Handling
LightGBM enables missing value handling by default. Disable it by setting use_missing=false.
LightGBM uses NA (NaN) to represent missing values by default. Change it to use zero by setting zero_as_missing=true.
When zero_as_missing=false (default), the unrecorded values in sparse matrices (and LightSVM) are treated as zeros.
When zero_as_missing=true, NA and zeros (including unrecorded values in sparse matrices and LightSVM) are treated as missing (see the example below).
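A minimal sketch (Python package, synthetic data) of where these parameters go when training; the values shown are simply the defaults spelled out explicitly, not a recommendation.

```python
import numpy as np
import lightgbm as lgb

# Synthetic data with some NaN entries, which LightGBM treats as missing by default.
rng = np.random.default_rng(0)
X = rng.random((100, 3))
X[::10, 0] = np.nan
y = rng.integers(0, 2, size=100)

params = {
    "objective": "binary",
    "use_missing": True,       # default; set to False to disable missing value handling
    "zero_as_missing": False,  # default; set to True to treat zeros (and unrecorded sparse entries) as missing
    "verbosity": -1,
}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=10)
```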
Categorical Feature Support
LightGBM offers good accuracy with integer-encoded categorical features. LightGBM applies Fisher (1958) to find the optimal split over categories as described here. This often performs better than one-hot encoding.
Use categorical_feature to specify the categorical features (see the example below). Refer to the parameter categorical_feature in Parameters.
Categorical features will be cast to int32 (integer codes will be extracted from pandas categoricals in the Python-package), so they must be encoded as non-negative integers (negative values will be treated as missing) less than Int32.MaxValue (2147483647). It is best to use a contiguous range of integers starting from zero. Floating point numbers in categorical features will be rounded towards 0.
Use min_data_per_group and cat_smooth to deal with over-fitting (when #data is small or #category is large).
For a categorical feature with high cardinality (#category is large), it often works best to treat the feature as numeric, either by simply ignoring the categorical interpretation of the integers or by embedding the categories in a low-dimensional numeric space.
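A minimal sketch (Python package, synthetic data) of passing an integer-encoded categorical column via categorical_feature; the min_data_per_group and cat_smooth values are illustrative, not tuned.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n = 500
# Column 0: categorical, encoded as a contiguous range of non-negative integers starting at 0.
# Column 1: ordinary numeric feature.
X = np.column_stack([rng.integers(0, 10, size=n), rng.random(n)]).astype(np.float64)
y = (X[:, 0] > 4).astype(int)

train_set = lgb.Dataset(X, label=y, categorical_feature=[0])
params = {
    "objective": "binary",
    "min_data_per_group": 50,  # illustrative value to curb over-fitting on small groups
    "cat_smooth": 10.0,        # illustrative smoothing for categorical splits
    "verbosity": -1,
}
booster = lgb.train(params, train_set, num_boost_round=20)
```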
LambdaRank
The label should be of type int, such that larger numbers correspond to higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect).
Use label_gain to set the gain (weight) of int labels.
Use lambdarank_truncation_level to truncate the max DCG (see the example below).
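A minimal sketch (Python package, synthetic data) of a lambdarank setup: integer relevance labels, query group sizes, and the two parameters above. The label_gain list needs one entry per possible label value; the values here are illustrative.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.random((300, 5))
y = rng.integers(0, 4, size=300)   # relevance labels 0:bad ... 3:perfect
group = [30] * 10                  # ten queries with 30 documents each

train_set = lgb.Dataset(X, label=y, group=group)
params = {
    "objective": "lambdarank",
    "label_gain": [0, 1, 3, 7],          # gain (weight) assigned to labels 0..3
    "lambdarank_truncation_level": 10,   # truncate the max DCG at the top 10 documents per query
    "verbosity": -1,
}
booster = lgb.train(params, train_set, num_boost_round=20)
```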
Cost Efficient Gradient Boosting
Cost Efficient Gradient Boosting (CEGB) makes it possible to penalise boosting based on the cost of obtaining feature values. CEGB penalises learning in the following ways:
Each time a tree is split, a penalty of cegb_penalty_split is applied.
When a feature is used for the first time, cegb_penalty_feature_coupled is applied. This penalty can be different for each feature and should be specified as one double per feature.
When a feature is used for the first time for a data row, cegb_penalty_feature_lazy is applied. Like cegb_penalty_feature_coupled, this penalty is specified as one double per feature.
Each of the penalties above is scaled by cegb_tradeoff.
Using this parameter, it is possible to change the overall strength of the CEGB penalties by changing only one parameter.
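A minimal sketch (Python package, synthetic data) of the CEGB parameters; the penalty values are illustrative. The two per-feature penalties take one value per feature (three features here), and cegb_tradeoff rescales all of the penalties at once.

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

params = {
    "objective": "binary",
    "cegb_tradeoff": 0.5,                             # scales all CEGB penalties below
    "cegb_penalty_split": 0.01,                       # charged each time a tree is split
    "cegb_penalty_feature_coupled": [1.0, 0.1, 0.1],  # charged once when a feature is first used at all
    "cegb_penalty_feature_lazy": [0.5, 0.05, 0.05],   # charged when a feature is first used for a data row
    "verbosity": -1,
}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=20)
```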
Parameters Tuning
Refer to Parameters Tuning.
Distributed Learning
Refer to the Distributed Learning Guide.
Recommendations for gcc Users (MinGW, *nix)
Refer to gcc Tips.