Low-level R interface to train a LightGBM model. Unlike `lightgbm()`, this function is focused on performance (e.g. speed, memory efficiency). It is also less likely to have breaking API changes in new releases than `lightgbm()`.

    lgb.train(
      params = list(),
      data,
      nrounds = 100L,
      valids = list(),
      obj = NULL,
      eval = NULL,
      verbose = 1L,
      record = TRUE,
      eval_freq = 1L,
      init_model = NULL,
      colnames = NULL,
      categorical_feature = NULL,
      early_stopping_rounds = NULL,
      callbacks = list(),
      reset_data = FALSE,
      serializable = TRUE
    )

Argument | Description |
---|---|
params | a list of parameters. See the "Parameters" section of the documentation for a list of parameters and valid values. |
data | a `lgb.Dataset` object, used for training |
nrounds | number of training rounds |
valids | a list of `lgb.Dataset` objects, used for validation |
obj | objective function, can be a character string or a custom objective function. Examples include `regression`, `binary`, and `multiclass`. |
eval | evaluation function(s). This can be a character vector, function, or list with a mixture of strings and functions. |
verbose | verbosity for output. If <= 0 and `valids` has been provided, printing of evaluation results during training is also disabled. |
record | Boolean. TRUE will record iteration messages to `booster$record_evals`. |
eval_freq | evaluation output frequency, only effective when verbose > 0 and `valids` has been provided |
init_model | path of a model file or an `lgb.Booster` object; training will continue from this model |
colnames | feature names. If not NULL, these overwrite the names in the dataset. |
categorical_feature | categorical features. This can be either a character vector of feature names or an integer vector with the indices of the features (e.g. `c(1L, 10L)` for the first and tenth columns). |
early_stopping_rounds | int. Activates early stopping. When this parameter is non-null, training will stop if the evaluation of any metric on any validation set fails to improve for `early_stopping_rounds` consecutive boosting rounds. |
callbacks | List of callback functions that are applied at each iteration. |
reset_data | Boolean. Setting it to TRUE (not the default) transforms the booster model into a predictor model, which frees memory and releases the original datasets. |
serializable | whether to make the resulting objects serializable through functions such as `save` or `saveRDS` |
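
The `obj` and `eval` arguments both accept functions as well as strings. The sketch below is a minimal illustration, not the package's reference implementation: `logloss_obj` and `error_eval` are hypothetical names, and it assumes a LightGBM version that provides `get_field()` for reading labels from an `lgb.Dataset`. A custom objective takes `(preds, dtrain)` and returns a list with `grad` and `hess`; a custom metric returns a list with `name`, `value`, and `higher_better`.

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
dtest <- lgb.Dataset.create.valid(
    dtrain
    , agaricus.test$data
    , label = agaricus.test$label
)

# Hypothetical custom objective: binary log loss on raw scores.
# Returns the per-observation gradient and hessian.
logloss_obj <- function(preds, dtrain) {
    labels <- get_field(dtrain, "label")
    probs <- 1.0 / (1.0 + exp(-preds))
    list(grad = probs - labels, hess = probs * (1.0 - probs))
}

# Hypothetical custom metric: misclassification rate.
# With a custom objective, preds are raw scores, so apply the sigmoid first.
error_eval <- function(preds, dtrain) {
    labels <- get_field(dtrain, "label")
    probs <- 1.0 / (1.0 + exp(-preds))
    list(
        name = "error"
        , value = mean((probs > 0.5) != labels)
        , higher_better = FALSE
    )
}

model <- lgb.train(
    params = list(learning_rate = 0.1, num_leaves = 15L)
    , data = dtrain
    , nrounds = 5L
    , valids = list(test = dtest)
    , obj = logloss_obj
    , eval = error_eval
    , verbose = -1L
)
```

Because the objective here is a function, predictions passed to the metric are untransformed scores, which is why `error_eval` converts them to probabilities before thresholding.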
A trained booster model, an `lgb.Booster` object.
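
As a rough illustration of working with the returned `lgb.Booster`, the sketch below trains a small binary model, predicts on new data, and round-trips the model through `saveRDS()`. It assumes a LightGBM version in which `serializable = TRUE` (the default shown in the signature) lets the booster be restored automatically when `predict()` is called; the temporary file path is only for the example.

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

model <- lgb.train(
    params = list(objective = "binary")
    , data = dtrain
    , nrounds = 5L
    , verbose = -1L
)

# One predicted probability per row of the test matrix
preds <- predict(model, agaricus.test$data)

# Round trip through saveRDS / readRDS (relies on serializable = TRUE)
tmp <- tempfile(fileext = ".rds")
saveRDS(model, tmp)
restored <- readRDS(tmp)
preds_restored <- predict(restored, agaricus.test$data)
```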
"Early stopping" refers to stopping the training process if the model's performance on a given validation set does not improve for several consecutive iterations.
If multiple arguments are given to `eval`, their order will be preserved. If you enable early stopping by setting `early_stopping_rounds` in `params`, by default all metrics will be considered for early stopping.

If you want to only consider the first metric for early stopping, pass `first_metric_only = TRUE` in `params`. Note that if you also specify `metric` in `params`, that metric will be considered the "first" one. If you omit `metric`, a default metric will be used based on your choice for the parameter `obj` (keyword argument) or `objective` (passed into `params`).
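
The early stopping behaviour described above can be sketched as follows. This is a minimal example under stated assumptions: the metric pair (`auc`, `binary_logloss`), the learning rate, and the round counts are arbitrary choices for illustration. With `first_metric_only = TRUE`, only `auc` (the first entry of `metric`) decides when training stops, while both metrics are still recorded.

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
dtest <- lgb.Dataset.create.valid(
    dtrain
    , agaricus.test$data
    , label = agaricus.test$label
)

params <- list(
    objective = "binary"
    , metric = c("auc", "binary_logloss")  # "auc" is treated as the first metric
    , first_metric_only = TRUE             # only "auc" triggers early stopping
    , learning_rate = 0.1
)

model <- lgb.train(
    params = params
    , data = dtrain
    , nrounds = 20L
    , valids = list(test = dtest)
    , early_stopping_rounds = 3L
    , verbose = -1L
)

# Iteration with the best value of the first metric, and that value
best_iter <- model$best_iter
best_score <- model$best_score
```

Inspecting `model$record_evals$test` afterwards shows the per-iteration history of both metrics, which is useful for checking that the stopping point matches expectations.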

    library(lightgbm)

    data(agaricus.train, package = "lightgbm")
    train <- agaricus.train
    dtrain <- lgb.Dataset(train$data, label = train$label)
    data(agaricus.test, package = "lightgbm")
    test <- agaricus.test
    dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
    params <- list(
        objective = "regression"
        , metric = "l2"
        , min_data = 1L
        , learning_rate = 1.0
    )
    valids <- list(test = dtest)
    model <- lgb.train(
        params = params
        , data = dtrain
        , nrounds = 5L
        , valids = valids
        , early_stopping_rounds = 3L
    )
    #> [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000927 seconds.
    #> You can set `force_row_wise=true` to remove the overhead.
    #> And if memory is not enough, you can set `force_col_wise=true`.
    #> [LightGBM] [Info] Total Bins 232
    #> [LightGBM] [Info] Number of data points in the train set: 6513, number of used features: 116
    #> [LightGBM] [Info] Start training from score 0.482113
    #> [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
    #> [1] "[1]: test's l2:6.44165e-17"
    #> [1] "Will train until there is no improvement in 3 rounds."
    #> [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
    #> [1] "[2]: test's l2:1.97215e-31"
    #> [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
    #> [1] "[3]: test's l2:0"
    #> [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
    #> [LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
    #> [1] "[4]: test's l2:0"
    #> [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
    #> [LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
    #> [1] "[5]: test's l2:0"
    #> [1] "Did not meet early stopping, best iteration is: [3]: test's l2:0"