Advanced Topics
Missing Value Handling
LightGBM enables missing value handling by default. Disable it by setting use_missing=false.
LightGBM uses NA (NaN) to represent missing values by default. To use zero instead, set zero_as_missing=true.
When zero_as_missing=false (the default), unrecorded values in sparse matrices (and LibSVM files) are treated as zeros.
When zero_as_missing=true, NA and zeros (including unrecorded values in sparse matrices and LibSVM files) are treated as missing.
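The behaviour above can be sketched as follows. This is a minimal illustration, not a complete training script: the parameter names are LightGBM's, but the dict is assumed to be passed on to a trainer such as lightgbm.train, which is not shown here.

```python
import numpy as np

# NaN marks a missing value -- LightGBM's default encoding.
X = np.array([[1.0, np.nan],
              [0.0, 2.0]])

# Illustrative parameter dict controlling missing-value handling.
params = {
    "use_missing": True,       # default; set to False to disable missing handling
    "zero_as_missing": False,  # default; set to True to treat zeros (and
                               # unrecorded sparse entries) as missing
}
```

With zero_as_missing left at its default, the 0.0 in the second row is an ordinary value, while the NaN in the first row is treated as missing.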
Categorical Feature Support
LightGBM offers good accuracy with integer-encoded categorical features. LightGBM applies Fisher (1958) to find the optimal split over categories as described here. This often performs better than one-hot encoding.
Use categorical_feature to specify the categorical features. Refer to the parameter categorical_feature in Parameters.
Categorical features must be encoded as non-negative integers (int) less than Int32.MaxValue (2147483647). It is best to use a contiguous range of integers starting from zero.
Use min_data_per_group and cat_smooth to deal with over-fitting (when #data is small or #category is large).
For a categorical feature with high cardinality (#category is large), it often works best to treat the feature as numeric, either by simply ignoring the categorical interpretation of the integers or by embedding the categories in a low-dimensional numeric space.
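A sketch of the recommended encoding: map each category to a contiguous, zero-based integer code. The mapping below is illustrative, not a LightGBM API; the assumed usage is then to mark the column via the categorical_feature parameter, e.g. lgb.Dataset(X, label=y, categorical_feature=[1]).

```python
import numpy as np

# Integer-encode a categorical column with contiguous codes starting at zero.
colors = ["red", "green", "blue", "green"]
codes = {c: i for i, c in enumerate(dict.fromkeys(colors))}  # first-seen order
encoded = np.array([codes[c] for c in colors], dtype=np.int32)
# encoded is now [0, 1, 2, 1]: non-negative, contiguous, zero-based
```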
LambdaRank
The label should be of type int, such that larger numbers correspond to higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect).
Use label_gain to set the gain (weight) of each int label.
Use lambdarank_truncation_level to truncate the max DCG.
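For example, with relevance labels 0..3 the parameters might look like the sketch below. The dict is illustrative and assumed to be passed to lightgbm.train; the label_gain values shown match the 2^i - 1 pattern commonly used for DCG gains.

```python
# Illustrative LambdaRank parameter dict (not a complete training setup).
params = {
    "objective": "lambdarank",
    "label_gain": [0, 1, 3, 7],       # gain for int labels 0, 1, 2, 3 (2^i - 1)
    "lambdarank_truncation_level": 10, # consider only the top-10 positions' DCG
}
```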
Cost Efficient Gradient Boosting
Cost Efficient Gradient Boosting (CEGB) makes it possible to penalise boosting based on the cost of obtaining feature values. CEGB penalises learning in the following ways:
Each time a tree is split, a penalty of cegb_penalty_split is applied.
When a feature is used for the first time, cegb_penalty_feature_coupled is applied. This penalty can be different for each feature and should be specified as one double per feature.
When a feature is used for the first time for a data row, cegb_penalty_feature_lazy is applied. Like cegb_penalty_feature_coupled, this penalty is specified as one double per feature.
Each of the penalties above is scaled by cegb_tradeoff. Using this parameter, it is possible to change the overall strength of the CEGB penalties by changing only one parameter.
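The penalties above might be configured as in the sketch below, assuming a dataset with three features where feature 0 is expensive to acquire. The parameter names are LightGBM's; the values and the list-of-doubles shape are illustrative assumptions.

```python
# Illustrative CEGB parameter dict for a hypothetical 3-feature dataset.
params = {
    "cegb_tradeoff": 0.1,           # global scale applied to every CEGB penalty
    "cegb_penalty_split": 1.0,      # charged on every tree split
    # one double per feature, charged the first time the model uses the feature:
    "cegb_penalty_feature_coupled": [5.0, 1.0, 1.0],
    # one double per feature, charged per data row on first use for that row:
    "cegb_penalty_feature_lazy": [0.5, 0.1, 0.1],
}
```

Raising cegb_tradeoff strengthens all three penalties at once, steering the model toward cheaper features and fewer splits without retuning each penalty individually.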
Parameters Tuning
Refer to Parameters Tuning.
Distributed Learning
Refer to Distributed Learning Guide.
GPU Support
Refer to GPU Tutorial and GPU Targets.
Recommendations for gcc Users (MinGW, *nix)
Refer to gcc Tips.