Construct validation data according to training data
lgb.Dataset.create.valid( dataset, data, label = NULL, weight = NULL, group = NULL, init_score = NULL, params = list() )
Argument | Description |
---|---|
dataset | `lgb.Dataset` object holding the training data |
data | a |
label | vector of labels to use as the target variable |
weight | numeric vector of sample weights |
group | used for learning-to-rank tasks. An integer vector describing how to group rows together as ordered results from the same set of candidate results to be ranked. For example, if you have a 100-document dataset with … |
init_score | initial score is the base prediction lightgbm will boost from |
params | a list of parameters. See the "Dataset Parameters" section of the documentation for a list of parameters and valid values. If this is an empty list (the default), the validation Dataset will have the same parameters as the Dataset passed to argument `dataset`. |
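
The `group` argument can be easier to follow with a concrete case. The following is a minimal sketch on synthetic data, not an example from the package documentation: the matrices, labels, and group sizes (`c(10, 20, 40, 10, 10, 10)` for training, `c(25, 25)` for validation) are illustrative assumptions. Each entry of `group` is the number of consecutive rows belonging to one query, so the entries must sum to the number of rows.

```r
library(lightgbm)

set.seed(42L)

# Hypothetical ranking data: 100 training rows with 3 features,
# relevance labels in 0..3 (all values are made up for illustration)
X_train <- matrix(rnorm(300L), nrow = 100L)
y_train <- sample(0L:3L, size = 100L, replace = TRUE)

# Six queries: rows 1-10, 11-30, 31-70, 71-80, 81-90, 91-100
train_groups <- c(10L, 20L, 40L, 10L, 10L, 10L)
stopifnot(sum(train_groups) == nrow(X_train))

dtrain <- lgb.Dataset(
    data = X_train
    , label = y_train
    , group = train_groups
    , params = list(verbose = -1L)
)

# Hypothetical validation data: 50 rows split into two queries of 25 rows
X_valid <- matrix(rnorm(150L), nrow = 50L)
y_valid <- sample(0L:3L, size = 50L, replace = TRUE)
valid_groups <- c(25L, 25L)

# The validation Dataset reuses the feature binning of dtrain
dvalid <- lgb.Dataset.create.valid(
    dataset = dtrain
    , data = X_valid
    , label = y_valid
    , group = valid_groups
)
dvalid$construct()

valid_dim <- dim(dvalid)
print(valid_dim)
```

The validation set gets its own `group` vector because its rows form different queries from the training rows; only the binning and Dataset parameters are taken from `dtrain`.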
Returns the constructed validation dataset.
    setLGBMthreads(2L)
    data.table::setDTthreads(1L)
    data(agaricus.train, package = "lightgbm")
    train <- agaricus.train
    dtrain <- lgb.Dataset(train$data, label = train$label)
    data(agaricus.test, package = "lightgbm")
    test <- agaricus.test
    dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)

    # parameters can be changed between the training data and validation set,
    # for example to account for training data in a text file with a header row
    # and validation data in a text file without it
    train_file <- tempfile(pattern = "train_", fileext = ".csv")
    write.table(
        data.frame(y = rnorm(100L), x1 = rnorm(100L), x2 = rnorm(100L))
        , file = train_file
        , sep = ","
        , col.names = TRUE
        , row.names = FALSE
        , quote = FALSE
    )

    valid_file <- tempfile(pattern = "valid_", fileext = ".csv")
    write.table(
        data.frame(y = rnorm(100L), x1 = rnorm(100L), x2 = rnorm(100L))
        , file = valid_file
        , sep = ","
        , col.names = FALSE
        , row.names = FALSE
        , quote = FALSE
    )

    dtrain <- lgb.Dataset(
        data = train_file
        , params = list(has_header = TRUE)
    )
    dtrain$construct()
    #> [LightGBM] [Info] Construct bin mappers from text data time 0.00 seconds

    dvalid <- lgb.Dataset(
        data = valid_file
        , params = list(has_header = FALSE)
    )
    dvalid$construct()
    #> [LightGBM] [Info] Construct bin mappers from text data time 0.00 seconds
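
A validation Dataset is typically passed to training so that metrics are evaluated on it at every boosting round. The sketch below shows one way this might look with the `agaricus` data used above; the objective, metric, number of rounds, and the name `valid` in the `valids` list are illustrative choices, not recommendations from this page.

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")

dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

# Validation data binned consistently with the training Dataset
dvalid <- lgb.Dataset.create.valid(
    dataset = dtrain
    , data = agaricus.test$data
    , label = agaricus.test$label
)

# Train for a few rounds and evaluate on the validation set each round
bst <- lgb.train(
    params = list(
        objective = "binary"
        , metric = "binary_logloss"
        , verbose = -1L
    )
    , data = dtrain
    , nrounds = 5L
    , valids = list(valid = dvalid)
)

# One log-loss value per boosting round, recorded under the name given in `valids`
valid_logloss <- unlist(bst$record_evals[["valid"]][["binary_logloss"]][["eval"]])
print(valid_logloss)
```

Creating `dvalid` from `dtrain` rather than with a separate `lgb.Dataset()` call keeps the feature binning of the two Datasets aligned, which is what lets the booster score the validation rows during training.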