C API

Copyright
Copyright (c) 2016 Microsoft Corporation. All rights reserved. Licensed under the MIT License. See LICENSE file in the project root for license information.
Note
To avoid type conversion on large data, most of the exposed interface supports both float32 and float64, except for the following:
  1. gradient and Hessian;
  2. current score for training and validation data.

The reason is that they are called frequently, and type conversion on them may be time-consuming.

Defines

C_API_DTYPE_FLOAT32 (0)

float32 (single precision float).

C_API_DTYPE_FLOAT64 (1)

float64 (double precision float).

C_API_DTYPE_INT32 (2)

int32.

C_API_DTYPE_INT64 (3)

int64.

C_API_DTYPE_INT8 (4)

int8.

C_API_PREDICT_CONTRIB (3)

Predict feature contributions (SHAP values).

C_API_PREDICT_LEAF_INDEX (2)

Predict leaf index.

C_API_PREDICT_NORMAL (0)

Normal prediction, with transform (if needed).

C_API_PREDICT_RAW_SCORE (1)

Predict raw score.

THREAD_LOCAL thread_local

Thread local specifier.
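
As a rough illustration of how the data type codes above can be consumed, the sketch below maps each code to the size in bytes of one element. The macro values are copied from this page; the helper name dtype_size is hypothetical and is not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the Defines section; guarded so that the real
   header, if it was included first, takes precedence. */
#ifndef C_API_DTYPE_FLOAT32
#define C_API_DTYPE_FLOAT32 (0)
#define C_API_DTYPE_FLOAT64 (1)
#define C_API_DTYPE_INT32 (2)
#define C_API_DTYPE_INT64 (3)
#define C_API_DTYPE_INT8 (4)
#endif

/* Hypothetical helper: size in bytes of one element of the given dtype,
   or 0 if the code is not one of the five listed on this page. */
size_t dtype_size(int dtype) {
  switch (dtype) {
    case C_API_DTYPE_FLOAT32: return sizeof(float);
    case C_API_DTYPE_FLOAT64: return sizeof(double);
    case C_API_DTYPE_INT32:   return sizeof(int32_t);
    case C_API_DTYPE_INT64:   return sizeof(int64_t);
    case C_API_DTYPE_INT8:    return sizeof(int8_t);
    default:                  return 0;
  }
}
```

A helper like this can be handy when walking a const void * buffer whose element type is only known through an accompanying type code.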

Typedefs

typedef void* BoosterHandle

Handle of booster.

typedef void* DatasetHandle

Handle of dataset.

Functions

static char* LastErrorMsg()

Handle of error message.

Return
Error message

LIGHTGBM_C_EXPORT int LGBM_BoosterAddValidData(BoosterHandle handle, const DatasetHandle valid_data)

Add new validation data to booster.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • valid_data: Validation dataset

LIGHTGBM_C_EXPORT int LGBM_BoosterCalcNumPredict(BoosterHandle handle, int num_row, int predict_type, int num_iteration, int64_t * out_len)

Get number of predictions.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • num_row: Number of rows
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • [out] out_len: Length of prediction

LIGHTGBM_C_EXPORT int LGBM_BoosterCreate(const DatasetHandle train_data, const char * parameters, BoosterHandle * out)

Create a new boosting learner.

Return
0 on success, -1 on failure
Parameters
  • train_data: Training dataset
  • parameters: Parameters in format ‘key1=value1 key2=value2’
  • [out] out: Handle of created booster
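
The parameters string is a space-separated list of key=value pairs. A minimal sketch of assembling such a string is shown here; append_param is a hypothetical helper, and the keys used in the usage note are only examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append "key=value" to buf, separated from any
   earlier pair by a single space. Returns 0 on success, -1 if the
   result would not fit in cap bytes (buf is left unchanged then). */
int append_param(char *buf, size_t cap, const char *key, const char *value) {
  size_t used = strlen(buf);
  assert(used < cap);
  int n = snprintf(buf + used, cap - used, "%s%s=%s",
                   used > 0 ? " " : "", key, value);
  if (n < 0 || (size_t)n >= cap - used) {
    buf[used] = '\0';  /* undo the partial write */
    return -1;
  }
  return 0;
}
```

Starting from an empty buffer, appending ("objective", "binary") and then ("num_leaves", "31") yields a string in the documented format that can be passed as parameters.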

LIGHTGBM_C_EXPORT int LGBM_BoosterCreateFromModelfile(const char * filename, int * out_num_iterations, BoosterHandle * out)

Load an existing booster from model file.

Return
0 on success, -1 on failure
Parameters
  • filename: Filename of model
  • [out] out_num_iterations: Number of iterations of this booster
  • [out] out: Handle of created booster

LIGHTGBM_C_EXPORT int LGBM_BoosterDumpModel(BoosterHandle handle, int start_iteration, int num_iteration, int64_t buffer_len, int64_t * out_len, char * out_str)

Dump model to JSON.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • start_iteration: Start index of the iteration that should be dumped
  • num_iteration: Number of iterations that should be dumped, <= 0 means dump all
  • buffer_len: String buffer length; if buffer_len < out_len, you should re-allocate the buffer
  • [out] out_len: Actual output length
  • [out] out_str: JSON format string of model, should pre-allocate memory
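
The buffer_len / out_len pair suggests a call-then-retry pattern: call once with a guessed buffer and, if buffer_len < out_len, re-allocate and call again. The sketch below shows that pattern against a stand-in function rather than the real call. Everything in it is hypothetical, and it assumes that out_len counts the terminating NUL and that nothing is written when the buffer is too small; neither detail is stated on this page.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Shape of a string-producing call reduced to the three arguments that
   matter here: (buffer_len, out_len, out_str) -> 0 on success, -1 on failure. */
typedef int (*string_out_fn)(void *ctx, int64_t buffer_len,
                             int64_t *out_len, char *out_str);

/* Hypothetical stand-in for a real call. ASSUMPTION: out_len counts the
   terminating NUL and nothing is written when the buffer is too small. */
int demo_producer(void *ctx, int64_t buffer_len, int64_t *out_len,
                  char *out_str) {
  const char *text = (const char *)ctx;
  *out_len = (int64_t)strlen(text) + 1;
  if (buffer_len >= *out_len) memcpy(out_str, text, (size_t)*out_len);
  return 0;
}

/* Call fn with a first-guess buffer; if buffer_len < out_len, re-allocate
   to out_len bytes and call again. Returns a malloc'ed string or NULL. */
char *call_with_growing_buffer(string_out_fn fn, void *ctx,
                               int64_t first_guess) {
  assert(first_guess > 0);
  int64_t buffer_len = first_guess;
  int64_t out_len = 0;
  char *buf = (char *)malloc((size_t)buffer_len);
  if (buf == NULL) return NULL;
  if (fn(ctx, buffer_len, &out_len, buf) != 0) { free(buf); return NULL; }
  if (buffer_len < out_len) {
    char *bigger = (char *)realloc(buf, (size_t)out_len);
    if (bigger == NULL) { free(buf); return NULL; }
    buf = bigger;
    buffer_len = out_len;
    if (fn(ctx, buffer_len, &out_len, buf) != 0 || buffer_len < out_len) {
      free(buf);
      return NULL;
    }
  }
  return buf;  /* caller frees */
}
```

In real code the stand-in would be replaced by a thin wrapper around the actual call, with the booster handle and iteration arguments carried in ctx.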

LIGHTGBM_C_EXPORT int LGBM_BoosterFeatureImportance(BoosterHandle handle, int num_iteration, int importance_type, double * out_results)

Get model feature importance.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • num_iteration: Number of iterations for which feature importance is calculated, <= 0 means use all
  • importance_type: Method of importance calculation:
    • 0 for split, result contains the number of times the feature is used in a model;
    • 1 for gain, result contains total gains of splits which use the feature
  • [out] out_results: Result array with feature importance

LIGHTGBM_C_EXPORT int LGBM_BoosterFree(BoosterHandle handle)

Free space for booster.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster to be freed

LIGHTGBM_C_EXPORT int LGBM_BoosterGetCurrentIteration(BoosterHandle handle, int * out_iteration)

Get index of the current boosting iteration.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_iteration: Index of the current boosting iteration

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEval(BoosterHandle handle, int data_idx, int * out_len, double * out_results)

Get evaluation for training data and validation data.

Note
  1. You should call LGBM_BoosterGetEvalNames first to get the names of evaluation metrics.
  2. You should pre-allocate memory for out_results; you can get its length with LGBM_BoosterGetEvalCounts.
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on
  • [out] out_len: Length of output result
  • [out] out_results: Array with evaluation results

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEvalCounts(BoosterHandle handle, int * out_len)

Get number of evaluation metrics.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_len: Total number of evaluation metrics

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEvalNames(BoosterHandle handle, int * out_len, char ** out_strs)

Get names of evaluation metrics.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_len: Total number of evaluation metrics
  • [out] out_strs: Names of evaluation metrics, should pre-allocate memory

LIGHTGBM_C_EXPORT int LGBM_BoosterGetFeatureNames(BoosterHandle handle, int * out_len, char ** out_strs)

Get names of features.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_len: Total number of features
  • [out] out_strs: Names of features, should pre-allocate memory

LIGHTGBM_C_EXPORT int LGBM_BoosterGetLeafValue(BoosterHandle handle, int tree_idx, int leaf_idx, double * out_val)

Get leaf value.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • tree_idx: Index of tree
  • leaf_idx: Index of leaf
  • [out] out_val: Output result from the specified leaf

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumClasses(BoosterHandle handle, int * out_len)

Get number of classes.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_len: Number of classes

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumFeature(BoosterHandle handle, int * out_len)

Get number of features.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_len: Total number of features

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumPredict(BoosterHandle handle, int data_idx, int64_t * out_len)

Get number of predictions for training data and validation data (this can be used to support customized evaluation functions).

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on
  • [out] out_len: Number of predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterGetPredict(BoosterHandle handle, int data_idx, int64_t * out_len, double * out_result)

Get prediction for training data and validation data.

Note
You should pre-allocate memory for out_result; its length is equal to num_class * num_data.
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterLoadModelFromString(const char * model_str, int * out_num_iterations, BoosterHandle * out)

Load an existing booster from string.

Return
0 on success, -1 on failure
Parameters
  • model_str: Model string
  • [out] out_num_iterations: Number of iterations of this booster
  • [out] out: Handle of created booster

LIGHTGBM_C_EXPORT int LGBM_BoosterMerge(BoosterHandle handle, BoosterHandle other_handle)

Merge model from other_handle into handle.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster, will merge another booster into this one
  • other_handle: Other handle of booster

LIGHTGBM_C_EXPORT int LGBM_BoosterNumberOfTotalModel(BoosterHandle handle, int * out_models)

Get number of weak sub-models.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_models: Number of weak sub-models

LIGHTGBM_C_EXPORT int LGBM_BoosterNumModelPerIteration(BoosterHandle handle, int * out_tree_per_iteration)

Get number of trees per iteration.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] out_tree_per_iteration: Number of trees per iteration

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSC(BoosterHandle handle, const void * col_ptr, int col_ptr_type, const int32_t * indices, const void * data, int data_type, int64_t ncol_ptr, int64_t nelem, int64_t num_row, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset in CSC format.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • col_ptr: Pointer to column headers
  • col_ptr_type: Type of col_ptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to row indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • ncol_ptr: Number of columns in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_row: Number of rows
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions
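
The three length rules in the note for out_result can be written down directly. The sketch below is only an illustration of that arithmetic: predict_result_len is a hypothetical helper, the macro values are copied from the Defines section, and in real code LGBM_BoosterCalcNumPredict (documented above) is the call that reports the prediction length.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the Defines section; guarded in case the real
   header was included first. */
#ifndef C_API_PREDICT_NORMAL
#define C_API_PREDICT_NORMAL (0)
#define C_API_PREDICT_RAW_SCORE (1)
#define C_API_PREDICT_LEAF_INDEX (2)
#define C_API_PREDICT_CONTRIB (3)
#endif

/* Hypothetical helper: number of doubles to allocate for out_result,
   following the three rules in the note. Returns -1 for an unknown
   predict_type. */
int64_t predict_result_len(int predict_type, int64_t num_class,
                           int64_t num_data, int64_t num_iteration,
                           int64_t num_feature) {
  assert(num_class > 0 && num_data >= 0);
  switch (predict_type) {
    case C_API_PREDICT_NORMAL:
    case C_API_PREDICT_RAW_SCORE:
      return num_class * num_data;
    case C_API_PREDICT_LEAF_INDEX:
      return num_class * num_data * num_iteration;
    case C_API_PREDICT_CONTRIB:
      return num_class * num_data * (num_feature + 1);
    default:
      return -1;
  }
}
```

For example, with 3 classes, 100 rows, 50 iterations and 10 features, the rules give 300 values for normal or raw score, 15000 for leaf index and 3300 for feature contributions.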

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSR(BoosterHandle handle, const void * indptr, int indptr_type, const int32_t * indices, const void * data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset in CSR format.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • indptr: Pointer to row headers
  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to column indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nindptr: Number of rows in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_col: Number of columns
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions
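
To make the CSR arguments concrete, the sketch below converts a small row-major dense matrix into the indptr, indices and data arrays described in the parameter list. dense_to_csr is a hypothetical helper, not part of the API; it produces int32 row headers and float64 values, which would correspond to indptr_type = C_API_DTYPE_INT32 and data_type = C_API_DTYPE_FLOAT64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a row-major dense matrix to CSR.
   indptr must hold nrow + 1 entries (this count is nindptr); indices and
   data must hold at least as many entries as there are nonzeros.
   Returns the number of nonzero elements (nelem). */
int64_t dense_to_csr(const double *dense, int32_t nrow, int32_t ncol,
                     int32_t *indptr, int32_t *indices, double *data) {
  assert(nrow >= 0 && ncol >= 0);
  int32_t nelem = 0;
  indptr[0] = 0;
  for (int32_t r = 0; r < nrow; ++r) {
    for (int32_t c = 0; c < ncol; ++c) {
      double v = dense[(int64_t)r * ncol + c];
      if (v != 0.0) {
        indices[nelem] = c;  /* column index of this nonzero */
        data[nelem] = v;
        ++nelem;
      }
    }
    indptr[r + 1] = nelem;   /* one past the last nonzero of row r */
  }
  return nelem;
}
```

For the 2 x 3 matrix [[1, 0, 2], [0, 0, 3]] this yields indptr = {0, 2, 3}, indices = {0, 2, 2} and data = {1, 2, 3}, so nindptr is 3 and nelem is 3.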

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSRSingleRow(BoosterHandle handle, const void * indptr, int indptr_type, const int32_t * indices, const void * data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset in CSR format. This method re-uses the internal predictor structure from previous calls and is optimized for single row invocation.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • indptr: Pointer to row headers
  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to column indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nindptr: Number of rows in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_col: Number of columns
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForFile(BoosterHandle handle, const char * data_filename, int data_has_header, int predict_type, int num_iteration, const char * parameter, const char * result_filename)

Make prediction for file.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data_filename: Filename of file with data
  • data_has_header: Whether file has header or not
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • result_filename: Filename of result file in which predictions will be written

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMat(BoosterHandle handle, const void * data, int data_type, int32_t nrow, int32_t ncol, int is_row_major, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nrow: Number of rows
  • ncol: Number of columns
  • is_row_major: 1 for row-major, 0 for column-major
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions
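
The is_row_major flag decides how element (r, c) of the nrow x ncol matrix is located in the contiguous buffer. The sketch below spells out both layouts for float64 data; dense_at is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read element (r, c) of an nrow x ncol float64
   matrix stored contiguously, honoring the is_row_major flag
   (1 for row-major, 0 for column-major). */
double dense_at(const double *data, int32_t nrow, int32_t ncol,
                int is_row_major, int32_t r, int32_t c) {
  assert(r >= 0 && r < nrow && c >= 0 && c < ncol);
  int64_t offset = is_row_major ? (int64_t)r * ncol + c   /* rows contiguous */
                                : (int64_t)c * nrow + r;  /* columns contiguous */
  return data[offset];
}
```

The matrix [[1, 2, 3], [4, 5, 6]] is stored as {1, 2, 3, 4, 5, 6} in row-major order and as {1, 4, 2, 5, 3, 6} in column-major order; dense_at returns the same element for both when the flag matches the layout.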

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMats(BoosterHandle handle, const void ** data, int data_type, int32_t nrow, int32_t ncol, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset presented as an array of pointers to rows.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nrow: Number of rows
  • ncol: Number of columns
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMatSingleRow(BoosterHandle handle, const void * data, int data_type, int ncol, int is_row_major, int predict_type, int num_iteration, const char * parameter, int64_t * out_len, double * out_result)

Make prediction for a new dataset. This method re-uses the internal predictor structure from previous calls and is optimized for single row invocation.

Note
You should pre-allocate memory for out_result:
  • for normal and raw score, its length is equal to num_class * num_data;
  • for leaf index, its length is equal to num_class * num_data * num_iteration;
  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • ncol: Number of columns
  • is_row_major: 1 for row-major, 0 for column-major
  • predict_type: What should be predicted
    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);
    • C_API_PREDICT_RAW_SCORE: raw score;
    • C_API_PREDICT_LEAF_INDEX: leaf index;
    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)
  • num_iteration: Number of iterations for prediction, <= 0 means no limit
  • parameter: Other parameters for prediction, e.g. early stopping for prediction
  • [out] out_len: Length of output result
  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterRefit(BoosterHandle handle, const int32_t * leaf_preds, int32_t nrow, int32_t ncol)

Refit the tree model using the new data (online learning).

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • leaf_preds: Pointer to predicted leaf indices
  • nrow: Number of rows of leaf_preds
  • ncol: Number of columns of leaf_preds

LIGHTGBM_C_EXPORT int LGBM_BoosterResetParameter(BoosterHandle handle, const char * parameters)

Reset config for booster.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • parameters: Parameters in format ‘key1=value1 key2=value2’

LIGHTGBM_C_EXPORT int LGBM_BoosterResetTrainingData(BoosterHandle handle, const DatasetHandle train_data)

Reset training data for booster.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • train_data: Training dataset

LIGHTGBM_C_EXPORT int LGBM_BoosterRollbackOneIter(BoosterHandle handle)

Rollback one iteration.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster

LIGHTGBM_C_EXPORT int LGBM_BoosterSaveModel(BoosterHandle handle, int start_iteration, int num_iteration, const char * filename)

Save model into file.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • start_iteration: Start index of the iteration that should be saved
  • num_iteration: Number of iterations that should be saved, <= 0 means save all
  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_BoosterSaveModelToString(BoosterHandle handle, int start_iteration, int num_iteration, int64_t buffer_len, int64_t * out_len, char * out_str)

Save model to string.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • start_iteration: Start index of the iteration that should be saved
  • num_iteration: Number of iterations that should be saved, <= 0 means save all
  • buffer_len: String buffer length; if buffer_len < out_len, you should re-allocate the buffer
  • [out] out_len: Actual output length
  • [out] out_str: String of model, should pre-allocate memory

LIGHTGBM_C_EXPORT int LGBM_BoosterSetLeafValue(BoosterHandle handle, int tree_idx, int leaf_idx, double val)

Set leaf value.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • tree_idx: Index of tree
  • leaf_idx: Index of leaf
  • val: Leaf value

LIGHTGBM_C_EXPORT int LGBM_BoosterShuffleModels(BoosterHandle handle, int start_iter, int end_iter)

Shuffle models.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • start_iter: The first iteration that will be shuffled
  • end_iter: The last iteration that will be shuffled

LIGHTGBM_C_EXPORT int LGBM_BoosterUpdateOneIter(BoosterHandle handle, int * is_finished)

Update the model for one iteration.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • [out] is_finished: 1 means the update finished because no further splits are possible, 0 otherwise

LIGHTGBM_C_EXPORT int LGBM_BoosterUpdateOneIterCustom(BoosterHandle handle, const float * grad, const float * hess, int * is_finished)

Update the model by specifying gradient and Hessian directly (this can be used to support customized loss functions).

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of booster
  • grad: The first order derivative (gradient) statistics
  • hess: The second order derivative (Hessian) statistics
  • [out] is_finished: 1 means the update finished because no further splits are possible, 0 otherwise

LIGHTGBM_C_EXPORT int LGBM_DatasetAddFeaturesFrom(DatasetHandle target, DatasetHandle source)

Add features from source to target.

Return
0 on success, -1 on failure
Parameters
  • target: The handle of the dataset to add features to
  • source: The handle of the dataset to take features from

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateByReference(const DatasetHandle reference, int64_t num_total_row, DatasetHandle * out)

Allocate the space for dataset and bucket feature bins according to reference dataset.

Return
0 on success, -1 on failure
Parameters
  • reference: Used to align bin mapper with other dataset
  • num_total_row: Number of total rows
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSC(const void * col_ptr, int col_ptr_type, const int32_t * indices, const void * data, int data_type, int64_t ncol_ptr, int64_t nelem, int64_t num_row, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Create a dataset from CSC format.

Return
0 on success, -1 on failure
Parameters
  • col_ptr: Pointer to column headers
  • col_ptr_type: Type of col_ptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to row indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • ncol_ptr: Number of columns in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_row: Number of rows
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: Created dataset
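
The CSC arguments mirror the CSR ones with the roles of rows and columns swapped. As an illustration, the sketch below builds col_ptr, indices and data from a row-major dense matrix; dense_to_csc is a hypothetical helper, and it produces int32 column headers and float64 values (col_ptr_type = C_API_DTYPE_INT32, data_type = C_API_DTYPE_FLOAT64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a row-major dense matrix to CSC.
   col_ptr must hold ncol + 1 entries (this count is ncol_ptr); indices
   and data must hold at least as many entries as there are nonzeros.
   Returns the number of nonzero elements (nelem). */
int64_t dense_to_csc(const double *dense, int32_t nrow, int32_t ncol,
                     int32_t *col_ptr, int32_t *indices, double *data) {
  assert(nrow >= 0 && ncol >= 0);
  int32_t nelem = 0;
  col_ptr[0] = 0;
  for (int32_t c = 0; c < ncol; ++c) {
    for (int32_t r = 0; r < nrow; ++r) {
      double v = dense[(int64_t)r * ncol + c];
      if (v != 0.0) {
        indices[nelem] = r;  /* row index of this nonzero */
        data[nelem] = v;
        ++nelem;
      }
    }
    col_ptr[c + 1] = nelem;  /* one past the last nonzero of column c */
  }
  return nelem;
}
```

For the 2 x 3 matrix [[1, 0, 2], [0, 0, 3]] this yields col_ptr = {0, 1, 1, 3}, indices = {0, 0, 1} and data = {1, 2, 3}; the empty middle column shows up as two equal consecutive entries in col_ptr.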

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSR(const void * indptr, int indptr_type, const int32_t * indices, const void * data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Create a dataset from CSR format.

Return
0 on success, -1 on failure
Parameters
  • indptr: Pointer to row headers
  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to column indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nindptr: Number of rows in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_col: Number of columns
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSRFunc(void * get_row_funptr, int num_rows, int64_t num_col, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Create a dataset from CSR format through callbacks.

Return
0 on success, -1 on failure
Parameters
  • get_row_funptr: Pointer to std::function<void(int idx, std::vector<std::pair<int, double>>& ret)> (called for every row and expected to clear and fill ret)
  • num_rows: Number of rows
  • num_col: Number of columns
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromFile(const char * filename, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Load dataset from file (like LightGBM CLI version does).

Return
0 on success, -1 on failure
Parameters
  • filename: The name of the file
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: A loaded dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromMat(const void * data, int data_type, int32_t nrow, int32_t ncol, int is_row_major, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Create dataset from dense matrix.

Return
0 on success, -1 on failure
Parameters
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nrow: Number of rows
  • ncol: Number of columns
  • is_row_major: 1 for row-major, 0 for column-major
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromMats(int32_t nmat, const void ** data, int data_type, int32_t * nrow, int32_t ncol, int is_row_major, const char * parameters, const DatasetHandle reference, DatasetHandle * out)

Create dataset from array of dense matrices.

Return
0 on success, -1 on failure
Parameters
  • nmat: Number of dense matrices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nrow: Number of rows in each matrix (array of length nmat)
  • ncol: Number of columns
  • is_row_major: 1 for row-major, 0 for column-major
  • parameters: Additional parameters
  • reference: Used to align bin mapper with other dataset, nullptr means it isn’t used
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromSampledColumn(double ** sample_data, int ** sample_indices, int32_t ncol, const int * num_per_col, int32_t num_sample_row, int32_t num_total_row, const char * parameters, DatasetHandle * out)

Allocate the space for dataset and bucket feature bins according to sampled data.

Return
0 on success, -1 on failure
Parameters
  • sample_data: Sampled data, grouped by the column
  • sample_indices: Indices of sampled data
  • ncol: Number of columns
  • num_per_col: Size of each sampling column
  • num_sample_row: Number of sampled rows
  • num_total_row: Number of total rows
  • parameters: Additional parameters
  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetDumpText(DatasetHandle handle, const char * filename)

Save dataset to text file, intended for debugging use only.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_DatasetFree(DatasetHandle handle)

Free space for dataset.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset to be freed

LIGHTGBM_C_EXPORT int LGBM_DatasetGetFeatureNames(DatasetHandle handle, char ** feature_names, int * num_feature_names)

Get feature names of dataset.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • [out] feature_names: Feature names, should pre-allocate memory
  • [out] num_feature_names: Number of feature names

LIGHTGBM_C_EXPORT int LGBM_DatasetGetField(DatasetHandle handle, const char * field_name, int * out_len, const void ** out_ptr, int * out_type)

Get info vector from dataset.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • field_name: Field name
  • [out] out_len: Used to set result length
  • [out] out_ptr: Pointer to the result
  • [out] out_type: Type of result pointer, can be C_API_DTYPE_INT8, C_API_DTYPE_INT32, C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

LIGHTGBM_C_EXPORT int LGBM_DatasetGetNumData(DatasetHandle handle, int * out)

Get number of data points.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • [out] out: The address to hold number of data points

LIGHTGBM_C_EXPORT int LGBM_DatasetGetNumFeature(DatasetHandle handle, int * out)

Get number of features.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • [out] out: The address to hold number of features

LIGHTGBM_C_EXPORT int LGBM_DatasetGetSubset(const DatasetHandle handle, const int32_t * used_row_indices, int32_t num_used_row_indices, const char * parameters, DatasetHandle * out)

Create a subset of a dataset.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of full dataset
  • used_row_indices: Indices used in subset
  • num_used_row_indices: Length of used_row_indices
  • parameters: Additional parameters
  • [out] out: Subset of data

LIGHTGBM_C_EXPORT int LGBM_DatasetPushRows(DatasetHandle dataset, const void * data, int data_type, int32_t nrow, int32_t ncol, int32_t start_row)

Push data to an existing dataset; if nrow + start_row == num_total_row, dataset->FinishLoad is called.

Return
0 on success, -1 on failure
Parameters
  • dataset: Handle of dataset
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nrow: Number of rows
  • ncol: Number of columns
  • start_row: Row start index

LIGHTGBM_C_EXPORT int LGBM_DatasetPushRowsByCSR(DatasetHandle dataset, const void * indptr, int indptr_type, const int32_t * indices, const void * data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int64_t start_row)

Push data to an existing dataset; if nrow + start_row == num_total_row (where nrow is nindptr - 1), dataset->FinishLoad is called.

Return
0 on success, -1 on failure
Parameters
  • dataset: Handle of dataset
  • indptr: Pointer to row headers
  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64
  • indices: Pointer to column indices
  • data: Pointer to the data space
  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
  • nindptr: Number of rows in the matrix + 1
  • nelem: Number of nonzero elements in the matrix
  • num_col: Number of columns
  • start_row: Row start index

LIGHTGBM_C_EXPORT int LGBM_DatasetSaveBinary(DatasetHandle handle, const char * filename)

Save dataset to binary file.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_DatasetSetFeatureNames(DatasetHandle handle, const char ** feature_names, int num_feature_names)

Save feature names to dataset.

Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • feature_names: Feature names
  • num_feature_names: Number of feature names

LIGHTGBM_C_EXPORT int LGBM_DatasetSetField(DatasetHandle handle, const char * field_name, const void * field_data, int num_element, int type)

Set the content of an info field from a vector.

Note
  • group only works for C_API_DTYPE_INT32;
  • label and weight only work for C_API_DTYPE_FLOAT32;
  • init_score only works for C_API_DTYPE_FLOAT64.
Return
0 on success, -1 on failure
Parameters
  • handle: Handle of dataset
  • field_name: Field name, can be label, weight, init_score, group
  • field_data: Pointer to data vector
  • num_element: Number of elements in field_data
  • type: Type of field_data pointer, can be C_API_DTYPE_INT32, C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
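
The note for this function ties each field name to exactly one data type. One way to guard against passing the wrong type is a small lookup, sketched below; required_field_dtype is a hypothetical helper, and the macro values are copied from the Defines section.

```c
#include <assert.h>
#include <string.h>

/* Values copied from the Defines section; guarded in case the real
   header was included first. */
#ifndef C_API_DTYPE_FLOAT32
#define C_API_DTYPE_FLOAT32 (0)
#define C_API_DTYPE_FLOAT64 (1)
#define C_API_DTYPE_INT32 (2)
#endif

/* Hypothetical helper: dtype code that the note requires for a given
   field name, or -1 for a name that is not listed on this page. */
int required_field_dtype(const char *field_name) {
  assert(field_name != NULL);
  if (strcmp(field_name, "group") == 0) return C_API_DTYPE_INT32;
  if (strcmp(field_name, "label") == 0 ||
      strcmp(field_name, "weight") == 0) return C_API_DTYPE_FLOAT32;
  if (strcmp(field_name, "init_score") == 0) return C_API_DTYPE_FLOAT64;
  return -1;
}
```

A caller could compare the result with the type argument it is about to pass and convert the data first if the two differ.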

LIGHTGBM_C_EXPORT int LGBM_DatasetUpdateParam(DatasetHandle handle, const char * parameters)

Update parameters for a dataset.

Parameters
  • handle: Handle of dataset
  • parameters: Parameters

LIGHTGBM_C_EXPORT const char* LGBM_GetLastError()

Get string message of the last error.

Return
Error information

LIGHTGBM_C_EXPORT int LGBM_NetworkFree()

Finalize the network.

Return
0 on success, -1 on failure

LIGHTGBM_C_EXPORT int LGBM_NetworkInit(const char * machines, int local_listen_port, int listen_time_out, int num_machines)

Initialize the network.

Return
0 on success, -1 on failure
Parameters
  • machines: List of machines in format ‘ip1:port1,ip2:port2’
  • local_listen_port: TCP listen port for local machines
  • listen_time_out: Socket time-out in minutes
  • num_machines: Total number of machines
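
The machines argument is a comma-separated list of ip:port pairs. A minimal sketch of building it from separate arrays follows; build_machines is a hypothetical helper, and the addresses and ports in the usage note are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "ip1:port1,ip2:port2,..." into buf.
   Returns 0 on success, -1 if cap bytes are not enough (buf then holds
   a truncated, still NUL-terminated string that should be discarded). */
int build_machines(char *buf, size_t cap, const char *const *ips,
                   const int *ports, int num_machines) {
  assert(cap > 0 && num_machines >= 0);
  size_t used = 0;
  buf[0] = '\0';
  for (int i = 0; i < num_machines; ++i) {
    int n = snprintf(buf + used, cap - used, "%s%s:%d",
                     i > 0 ? "," : "", ips[i], ports[i]);
    if (n < 0 || (size_t)n >= cap - used) return -1;
    used += (size_t)n;
  }
  return 0;
}
```

Given the addresses 10.0.0.1 and 10.0.0.2 with ports 12400 and 12401, the helper produces "10.0.0.1:12400,10.0.0.2:12401", and the same count is passed as num_machines.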

LIGHTGBM_C_EXPORT int LGBM_NetworkInitWithFunctions(int num_machines, int rank, void * reduce_scatter_ext_fun, void * allgather_ext_fun)

Initialize the network with external collective functions.

Return
0 on success, -1 on failure
Parameters
  • num_machines: Total number of machines
  • rank: Rank of local machine
  • reduce_scatter_ext_fun: The external reduce-scatter function
  • allgather_ext_fun: The external allgather function

void LGBM_SetLastError(const char * msg)

Set string message of the last error.

Parameters
  • msg: Error message