C API

Copyright

Copyright (c) 2016 Microsoft Corporation. All rights reserved. Licensed under the MIT License. See LICENSE file in the project root for license information.

Note

To avoid type conversion on large data, most of the exposed interfaces support both float32 and float64, except for the following:

  1. gradient and Hessian;

  2. current score for training and validation data.

The reason is that they are called frequently, and type conversion on them may be time-consuming.

Defines

C_API_DTYPE_FLOAT32 (0)

float32 (single precision float).

C_API_DTYPE_FLOAT64 (1)

float64 (double precision float).

C_API_DTYPE_INT32 (2)

int32.

C_API_DTYPE_INT64 (3)

int64.

C_API_FEATURE_IMPORTANCE_GAIN (1)

Gain type of feature importance.

C_API_FEATURE_IMPORTANCE_SPLIT (0)

Split type of feature importance.

C_API_MATRIX_TYPE_CSC (1)

CSC sparse matrix type.

C_API_MATRIX_TYPE_CSR (0)

CSR sparse matrix type.

C_API_PREDICT_CONTRIB (3)

Predict feature contributions (SHAP values).

C_API_PREDICT_LEAF_INDEX (2)

Predict leaf index.

C_API_PREDICT_NORMAL (0)

Normal prediction, with transform (if needed).

C_API_PREDICT_RAW_SCORE (1)

Predict raw score.

THREAD_LOCAL thread_local

Thread local specifier.
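
Many functions below take a const void * buffer together with one of the C_API_DTYPE_* codes. As a minimal sketch of how a caller might interpret such a pair, the helpers below map a code to an element size and read one element as a double. The names dtype_size and read_as_double are hypothetical and not part of the API; the macro values are copied from the Defines above and guarded in case the real header is already included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the Defines section; guarded so that the real
   LightGBM header takes precedence if it has been included. */
#ifndef C_API_DTYPE_FLOAT32
#define C_API_DTYPE_FLOAT32 (0)
#define C_API_DTYPE_FLOAT64 (1)
#define C_API_DTYPE_INT32 (2)
#define C_API_DTYPE_INT64 (3)
#endif

/* Hypothetical helper (not part of the API): size in bytes of one element
   for a C_API_DTYPE_* code, or 0 for an unknown code. */
static size_t dtype_size(int dtype) {
  switch (dtype) {
    case C_API_DTYPE_FLOAT32: return sizeof(float);
    case C_API_DTYPE_FLOAT64: return sizeof(double);
    case C_API_DTYPE_INT32:   return sizeof(int32_t);
    case C_API_DTYPE_INT64:   return sizeof(int64_t);
    default:                  return 0;
  }
}

/* Hypothetical helper: reads element i of a float32 or float64 buffer
   and widens it to double. */
static double read_as_double(const void *data, int data_type, size_t i) {
  assert(data != NULL);
  assert(data_type == C_API_DTYPE_FLOAT32 || data_type == C_API_DTYPE_FLOAT64);
  if (data_type == C_API_DTYPE_FLOAT32) {
    return (double)((const float *)data)[i];
  }
  return ((const double *)data)[i];
}
```

A caller holding float data would pass the buffer with C_API_DTYPE_FLOAT32 and avoid converting it to double first, which is the point of the note at the top of this page.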

Typedefs

typedef void *BoosterHandle

Handle of booster.

typedef void *DatasetHandle

Handle of dataset.

typedef void *FastConfigHandle

Handle of FastConfig.

Functions

char *LastErrorMsg()

Handle of error message.

Return

Error message

LIGHTGBM_C_EXPORT int LGBM_BoosterAddValidData(BoosterHandle handle, const DatasetHandle valid_data)

Add new validation data to booster.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • valid_data: Validation dataset

LIGHTGBM_C_EXPORT int LGBM_BoosterCalcNumPredict(BoosterHandle handle, int num_row, int predict_type, int start_iteration, int num_iteration, int64_t *out_len)

Get number of predictions.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • num_row: Number of rows

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • [out] out_len: Length of prediction

LIGHTGBM_C_EXPORT int LGBM_BoosterCreate(const DatasetHandle train_data, const char *parameters, BoosterHandle *out)

Create a new boosting learner.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • train_data: Training dataset

  • parameters: Parameters in format ‘key1=value1 key2=value2’

  • [out] out: Handle of created booster
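
The parameters argument is a single string of space-separated key=value pairs. The sketch below shows one way to assemble such a string from separate keys and values; build_params is a hypothetical helper, not part of the API, and it assumes that neither keys nor values contain spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the API): joins n key/value pairs into
   "key1=value1 key2=value2". Returns 0 on success, -1 if buf is too small
   or formatting fails. */
static int build_params(const char *const *keys, const char *const *values,
                        size_t n, char *buf, size_t buf_len) {
  size_t used = 0;
  assert(buf != NULL && buf_len > 0);
  buf[0] = '\0';
  for (size_t i = 0; i < n; ++i) {
    /* A single space separates consecutive pairs. */
    int written = snprintf(buf + used, buf_len - used, "%s%s=%s",
                           i == 0 ? "" : " ", keys[i], values[i]);
    if (written < 0 || (size_t)written >= buf_len - used) {
      return -1;  /* output was truncated or an encoding error occurred */
    }
    used += (size_t)written;
  }
  return 0;
}
```

The resulting buffer could then be passed as the parameters argument of LGBM_BoosterCreate or LGBM_BoosterResetParameter.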

LIGHTGBM_C_EXPORT int LGBM_BoosterCreateFromModelfile(const char *filename, int *out_num_iterations, BoosterHandle *out)

Load an existing booster from model file.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • filename: Filename of model

  • [out] out_num_iterations: Number of iterations of this booster

  • [out] out: Handle of created booster

LIGHTGBM_C_EXPORT int LGBM_BoosterDumpModel(BoosterHandle handle, int start_iteration, int num_iteration, int feature_importance_type, int64_t buffer_len, int64_t *out_len, char *out_str)

Dump model to JSON.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • start_iteration: Start index of the iteration that should be dumped

  • num_iteration: Number of iterations that should be dumped, <= 0 means dump all

  • feature_importance_type: Type of feature importance, can be C_API_FEATURE_IMPORTANCE_SPLIT or C_API_FEATURE_IMPORTANCE_GAIN

  • buffer_len: String buffer length. If buffer_len < out_len, you should re-allocate the buffer

  • [out] out_len: Actual output length

  • [out] out_str: JSON-format string of the model; memory should be pre-allocated
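
The buffer_len / out_len pair suggests a call-then-retry pattern: call once, compare out_len with buffer_len, and call again with a larger buffer if needed. The sketch below illustrates that pattern through a function pointer whose parameter list mirrors LGBM_BoosterDumpModel, so that it compiles without the LightGBM header. model_to_string and fake_dump are hypothetical names; fake_dump is only a stand-in that mimics the documented contract and assumes out_len counts the terminating null byte, which the text above does not state.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same parameter list as LGBM_BoosterDumpModel; the handle is spelled
   void * here (BoosterHandle is a typedef of void *). */
typedef int (*model_to_str_fn)(void *handle, int start_iteration,
                               int num_iteration, int feature_importance_type,
                               int64_t buffer_len, int64_t *out_len,
                               char *out_str);

/* Hypothetical helper: calls fn and, if the reported out_len exceeds the
   buffer, re-allocates and calls again. Returns a malloc'ed string that
   the caller must free, or NULL on failure. */
static char *model_to_string(model_to_str_fn fn, void *handle,
                             int64_t initial_len) {
  int64_t buffer_len = initial_len > 0 ? initial_len : 1;
  char *buf;
  assert(fn != NULL);
  buf = malloc((size_t)buffer_len);
  if (buf == NULL) return NULL;
  for (int attempt = 0; attempt < 4; ++attempt) {
    int64_t out_len = 0;
    /* start_iteration = 0, num_iteration = 0 (all), importance type 0 (split) */
    if (fn(handle, 0, 0, 0, buffer_len, &out_len, buf) != 0) break;
    if (out_len <= buffer_len) return buf;  /* buffer was large enough */
    char *bigger = realloc(buf, (size_t)out_len);
    if (bigger == NULL) break;
    buf = bigger;
    buffer_len = out_len;
  }
  free(buf);
  return NULL;
}

/* Stand-in used only to exercise the helper; this is NOT LightGBM code.
   Assumption: out_len includes the terminating null byte. */
static int fake_dump(void *handle, int start_iteration, int num_iteration,
                     int feature_importance_type, int64_t buffer_len,
                     int64_t *out_len, char *out_str) {
  static const char json[] = "{\"stub\": true}";
  (void)handle;
  (void)start_iteration;
  (void)num_iteration;
  (void)feature_importance_type;
  *out_len = (int64_t)sizeof json;
  if (buffer_len >= *out_len) memcpy(out_str, json, sizeof json);
  return 0;
}
```

With the real library, LGBM_BoosterDumpModel or LGBM_BoosterSaveModelToString would take the place of fake_dump, since both share this parameter list.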

LIGHTGBM_C_EXPORT int LGBM_BoosterFeatureImportance(BoosterHandle handle, int num_iteration, int importance_type, double *out_results)

Get model feature importance.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • num_iteration: Number of iterations for which feature importance is calculated, <= 0 means use all

  • importance_type: Method of importance calculation:

    • C_API_FEATURE_IMPORTANCE_SPLIT: result contains numbers of times the feature is used in a model;

    • C_API_FEATURE_IMPORTANCE_GAIN: result contains total gains of splits which use the feature

  • [out] out_results: Result array with feature importance

LIGHTGBM_C_EXPORT int LGBM_BoosterFree(BoosterHandle handle)

Free space for booster.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster to be freed

LIGHTGBM_C_EXPORT int LGBM_BoosterFreePredictSparse(void *indptr, int32_t *indices, void *data, int indptr_type, int data_type)

Method corresponding to LGBM_BoosterPredictSparseOutput to free the allocated data.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • indptr: Pointer to output row headers or column headers to be deallocated

  • indices: Pointer to sparse indices to be deallocated

  • data: Pointer to sparse data space to be deallocated

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

LIGHTGBM_C_EXPORT int LGBM_BoosterGetCurrentIteration(BoosterHandle handle, int *out_iteration)

Get index of the current boosting iteration.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_iteration: Index of the current boosting iteration

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEval(BoosterHandle handle, int data_idx, int *out_len, double *out_results)

Get evaluation for training data and validation data.

Note

  1. You should call LGBM_BoosterGetEvalNames first to get the names of evaluation datasets.

  2. You should pre-allocate memory for out_results; you can get its length with LGBM_BoosterGetEvalCounts.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on

  • [out] out_len: Length of output result

  • [out] out_results: Array with evaluation results

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEvalCounts(BoosterHandle handle, int *out_len)

Get number of evaluation datasets.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_len: Total number of evaluation datasets

LIGHTGBM_C_EXPORT int LGBM_BoosterGetEvalNames(BoosterHandle handle, const int len, int *out_len, const size_t buffer_len, size_t *out_buffer_len, char **out_strs)

Get names of evaluation datasets.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • len: Number of char* pointers stored at out_strs. If smaller than the max size, only this many strings are copied

  • [out] out_len: Total number of evaluation datasets

  • buffer_len: Size of pre-allocated strings. Content is copied up to buffer_len - 1 and null-terminated

  • [out] out_buffer_len: String sizes required to do the full string copies

  • [out] out_strs: Names of evaluation datasets; memory should be pre-allocated

LIGHTGBM_C_EXPORT int LGBM_BoosterGetFeatureNames(BoosterHandle handle, const int len, int *out_len, const size_t buffer_len, size_t *out_buffer_len, char **out_strs)

Get names of features.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • len: Number of char* pointers stored at out_strs. If smaller than the max size, only this many strings are copied

  • [out] out_len: Total number of features

  • buffer_len: Size of pre-allocated strings. Content is copied up to buffer_len - 1 and null-terminated

  • [out] out_buffer_len: String sizes required to do the full string copies

  • [out] out_strs: Names of features; memory should be pre-allocated
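
Both name getters expect out_strs to point at len pre-allocated buffers of buffer_len bytes each. A minimal sketch of allocating and releasing such an array follows; alloc_name_buffers and free_name_buffers are hypothetical helpers, not part of the API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees an array created by alloc_name_buffers. */
static void free_name_buffers(char **strs, int len) {
  if (strs == NULL) return;
  for (int i = 0; i < len; ++i) free(strs[i]);
  free(strs);
}

/* Hypothetical helper: allocates len zero-filled buffers of buffer_len
   bytes each, suitable for passing as out_strs together with the same
   len and buffer_len. Returns NULL on allocation failure. */
static char **alloc_name_buffers(int len, size_t buffer_len) {
  char **strs;
  assert(len > 0 && buffer_len > 0);
  strs = malloc((size_t)len * sizeof *strs);
  if (strs == NULL) return NULL;
  for (int i = 0; i < len; ++i) strs[i] = NULL;
  for (int i = 0; i < len; ++i) {
    strs[i] = calloc(buffer_len, 1);
    if (strs[i] == NULL) {
      free_name_buffers(strs, len);  /* free(NULL) is a no-op for the rest */
      return NULL;
    }
  }
  return strs;
}
```

After a call, comparing out_len with len and out_buffer_len with buffer_len tells the caller whether any name was dropped or truncated; if so, one option is to free the array, allocate a larger one, and call again.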

LIGHTGBM_C_EXPORT int LGBM_BoosterGetLeafValue(BoosterHandle handle, int tree_idx, int leaf_idx, double *out_val)

Get leaf value.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • tree_idx: Index of tree

  • leaf_idx: Index of leaf

  • [out] out_val: Output result from the specified leaf

LIGHTGBM_C_EXPORT int LGBM_BoosterGetLowerBoundValue(BoosterHandle handle, double *out_results)

Get model lower bound value.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_results: Result pointing to min value

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumClasses(BoosterHandle handle, int *out_len)

Get number of classes.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_len: Number of classes

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumFeature(BoosterHandle handle, int *out_len)

Get number of features.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_len: Total number of features

LIGHTGBM_C_EXPORT int LGBM_BoosterGetNumPredict(BoosterHandle handle, int data_idx, int64_t *out_len)

Get number of predictions for training data and validation data (this can be used to support customized evaluation functions).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on

  • [out] out_len: Number of predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterGetPredict(BoosterHandle handle, int data_idx, int64_t *out_len, double *out_result)

Get prediction for training data and validation data.

Note

You should pre-allocate memory for out_result; its length is equal to num_class * num_data.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data_idx: Index of data, 0: training data, 1: 1st validation data, 2: 2nd validation data and so on

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterGetUpperBoundValue(BoosterHandle handle, double *out_results)

Get model upper bound value.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_results: Result pointing to max value

LIGHTGBM_C_EXPORT int LGBM_BoosterLoadModelFromString(const char *model_str, int *out_num_iterations, BoosterHandle *out)

Load an existing booster from string.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • model_str: Model string

  • [out] out_num_iterations: Number of iterations of this booster

  • [out] out: Handle of created booster

LIGHTGBM_C_EXPORT int LGBM_BoosterMerge(BoosterHandle handle, BoosterHandle other_handle)

Merge model from other_handle into handle.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of the booster into which the other booster will be merged

  • other_handle: Other handle of booster

LIGHTGBM_C_EXPORT int LGBM_BoosterNumberOfTotalModel(BoosterHandle handle, int *out_models)

Get number of weak sub-models.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_models: Number of weak sub-models

LIGHTGBM_C_EXPORT int LGBM_BoosterNumModelPerIteration(BoosterHandle handle, int *out_tree_per_iteration)

Get number of trees per iteration.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] out_tree_per_iteration: Number of trees per iteration

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSC(BoosterHandle handle, const void *col_ptr, int col_ptr_type, const int32_t *indices, const void *data, int data_type, int64_t ncol_ptr, int64_t nelem, int64_t num_row, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset in CSC format.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • col_ptr: Pointer to column headers

  • col_ptr_type: Type of col_ptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to row indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • ncol_ptr: Number of columns in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_row: Number of rows

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions
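
The out_result sizing rules in the note for this function can be restated as a small helper. This is only a sketch: predict_len is a hypothetical name, not part of the API, and its num_iteration argument is assumed to be the actual (positive) number of iterations used, since a value <= 0 means "no limit" in the API and cannot be plugged into the formula directly. The macro values are copied from the Defines section.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the Defines section; guarded so that the real
   LightGBM header takes precedence if it has been included. */
#ifndef C_API_PREDICT_NORMAL
#define C_API_PREDICT_NORMAL (0)
#define C_API_PREDICT_RAW_SCORE (1)
#define C_API_PREDICT_LEAF_INDEX (2)
#define C_API_PREDICT_CONTRIB (3)
#endif

/* Hypothetical helper: number of doubles to pre-allocate for out_result,
   following the three formulas in the note. Returns -1 for an unknown
   predict_type. */
static int64_t predict_len(int predict_type, int64_t num_class,
                           int64_t num_data, int64_t num_iteration,
                           int64_t num_feature) {
  assert(num_class > 0 && num_data >= 0);
  switch (predict_type) {
    case C_API_PREDICT_NORMAL:
    case C_API_PREDICT_RAW_SCORE:
      return num_class * num_data;
    case C_API_PREDICT_LEAF_INDEX:
      return num_class * num_data * num_iteration;
    case C_API_PREDICT_CONTRIB:
      return num_class * num_data * (num_feature + 1);
    default:
      return -1;
  }
}
```

In practice LGBM_BoosterCalcNumPredict reports the length directly from the booster, so a helper like this is mainly useful as a cross-check or for reasoning about memory needs ahead of time.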

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSR(BoosterHandle handle, const void *indptr, int indptr_type, const int32_t *indices, const void *data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset in CSR format.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • indptr: Pointer to row headers

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nindptr: Number of rows in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_col: Number of columns

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSRSingleRow(BoosterHandle handle, const void *indptr, int indptr_type, const int32_t *indices, const void *data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset in CSR format. This method re-uses the internal predictor structure from previous calls and is optimized for single row invocation.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • indptr: Pointer to row headers

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nindptr: Number of rows in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_col: Number of columns

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSRSingleRowFast(FastConfigHandle fastConfig_handle, const void *indptr, const int indptr_type, const int32_t *indices, const void *data, const int64_t nindptr, const int64_t nelem, int64_t *out_len, double *out_result)

Faster variant of LGBM_BoosterPredictForCSRSingleRow.

Score single rows after setup with LGBM_BoosterPredictForCSRSingleRowFastInit.

By removing the setup steps from this call, extra optimizations can be made, such as initializing the config only once instead of once per call.

Note

Setting up the number of threads is only done once at LGBM_BoosterPredictForCSRSingleRowFastInit instead of at each prediction. If you use a different number of threads in other calls, you need to start the setup process over, or that number of threads will be used for these calls as well.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • fastConfig_handle: FastConfig object handle returned by LGBM_BoosterPredictForCSRSingleRowFastInit

  • indptr: Pointer to row headers

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices

  • data: Pointer to the data space

  • nindptr: Number of rows in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForCSRSingleRowFastInit(BoosterHandle handle, const int predict_type, const int start_iteration, const int num_iteration, const int data_type, const int64_t num_col, const char *parameter, FastConfigHandle *out_fastConfig)

Initialize and return a FastConfigHandle for use with LGBM_BoosterPredictForCSRSingleRowFast.

Release the FastConfig by passing its handle to LGBM_FastConfigFree when no longer needed.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Booster handle

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • num_col: Number of columns

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_fastConfig: FastConfig object with which you can call LGBM_BoosterPredictForCSRSingleRowFast

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForFile(BoosterHandle handle, const char *data_filename, int data_has_header, int predict_type, int start_iteration, int num_iteration, const char *parameter, const char *result_filename)

Make prediction for file.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data_filename: Filename of file with data

  • data_has_header: Whether file has header or not

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • result_filename: Filename of result file in which predictions will be written

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMat(BoosterHandle handle, const void *data, int data_type, int32_t nrow, int32_t ncol, int is_row_major, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nrow: Number of rows

  • ncol: Number of columns

  • is_row_major: 1 for row-major, 0 for column-major

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions
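
The is_row_major flag determines where element (row, col) sits in the flat data buffer. The sketch below assumes the conventional definitions of the two layouts, which the text above does not spell out; mat_offset is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat offset of element (row, col) in an
   nrow x ncol matrix.
   Row-major:    rows are contiguous, offset = row * ncol + col.
   Column-major: columns are contiguous, offset = col * nrow + row. */
static size_t mat_offset(int nrow, int ncol, int is_row_major,
                         int row, int col) {
  assert(row >= 0 && row < nrow && col >= 0 && col < ncol);
  return is_row_major ? (size_t)row * (size_t)ncol + (size_t)col
                      : (size_t)col * (size_t)nrow + (size_t)row;
}
```

For a 2 x 3 matrix, element (0, 1) is at offset 1 in row-major order and at offset 2 in column-major order.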

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMats(BoosterHandle handle, const void **data, int data_type, int32_t nrow, int32_t ncol, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset presented as an array of pointers to rows.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nrow: Number of rows

  • ncol: Number of columns

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions
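
Here the data argument is an array of nrow pointers, one per row. When the rows already live in one contiguous row-major buffer, the pointer array can be built as in the sketch below. make_row_pointers is a hypothetical helper, not part of the API, and it assumes float64 data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: builds an array of nrow pointers, one per row of a
   contiguous row-major float64 matrix. The caller frees the returned
   array; the rows themselves still belong to matrix. */
static const void **make_row_pointers(const double *matrix, int32_t nrow,
                                      int32_t ncol) {
  const void **rows;
  assert(matrix != NULL && nrow > 0 && ncol > 0);
  rows = malloc((size_t)nrow * sizeof *rows);
  if (rows == NULL) return NULL;
  for (int32_t r = 0; r < nrow; ++r) {
    rows[r] = matrix + (size_t)r * (size_t)ncol;  /* start of row r */
  }
  return rows;
}
```

The same interface also allows rows that are stored in separate allocations, in which case each pointer is simply set to wherever that row begins.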

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMatSingleRow(BoosterHandle handle, const void *data, int data_type, int ncol, int is_row_major, int predict_type, int start_iteration, int num_iteration, const char *parameter, int64_t *out_len, double *out_result)

Make prediction for a new dataset. This method re-uses the internal predictor structure from previous calls and is optimized for single row invocation.

Note

You should pre-allocate memory for out_result:

  • for normal and raw score, its length is equal to num_class * num_data;

  • for leaf index, its length is equal to num_class * num_data * num_iteration;

  • for feature contributions, its length is equal to num_class * num_data * (num_feature + 1).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • ncol: Number of columns

  • is_row_major: 1 for row-major, 0 for column-major

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMatSingleRowFast(FastConfigHandle fastConfig_handle, const void *data, int64_t *out_len, double *out_result)

Faster variant of LGBM_BoosterPredictForMatSingleRow.

Score a single row after setup with LGBM_BoosterPredictForMatSingleRowFastInit.

By removing the setup steps from this call, extra optimizations can be made, such as initializing the config only once instead of once per call.

Note

Setting up the number of threads is only done once at LGBM_BoosterPredictForMatSingleRowFastInit instead of at each prediction. If you use a different number of threads in other calls, you need to start the setup process over, or that number of threads will be used for these calls as well.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • fastConfig_handle: FastConfig object handle returned by LGBM_BoosterPredictForMatSingleRowFastInit

  • data: Single-row array data (row-major form is the only option)

  • [out] out_len: Length of output result

  • [out] out_result: Pointer to array with predictions

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictForMatSingleRowFastInit(BoosterHandle handle, const int predict_type, const int start_iteration, const int num_iteration, const int data_type, const int32_t ncol, const char *parameter, FastConfigHandle *out_fastConfig)

Initialize and return a FastConfigHandle for use with LGBM_BoosterPredictForMatSingleRowFast.

Release the FastConfig by passing its handle to LGBM_FastConfigFree when no longer needed.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Booster handle

  • predict_type: What should be predicted

    • C_API_PREDICT_NORMAL: normal prediction, with transform (if needed);

    • C_API_PREDICT_RAW_SCORE: raw score;

    • C_API_PREDICT_LEAF_INDEX: leaf index;

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • ncol: Number of columns

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • [out] out_fastConfig: FastConfig object with which you can call LGBM_BoosterPredictForMatSingleRowFast

LIGHTGBM_C_EXPORT int LGBM_BoosterPredictSparseOutput(BoosterHandle handle, const void *indptr, int indptr_type, const int32_t *indices, const void *data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col_or_row, int predict_type, int start_iteration, int num_iteration, const char *parameter, int matrix_type, int64_t *out_len, void **out_indptr, int32_t **out_indices, void **out_data)

Make sparse prediction for a new dataset in CSR or CSC format. Currently only used for feature contributions.

Note

The outputs are allocated by this function rather than by the caller, because their sizes can vary for each invocation, but the shape should be the same:

  • for feature contributions, the shape of the sparse matrix will be num_class * num_data * (num_feature + 1). The output indptr_type for the sparse matrix will be the same as the given input indptr_type. Call LGBM_BoosterFreePredictSparse to deallocate resources.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • indptr: Pointer to row headers for CSR or column headers for CSC

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices for CSR or row indices for CSC

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nindptr: Number of rows in the matrix + 1 for CSR or number of columns in the matrix + 1 for CSC

  • nelem: Number of nonzero elements in the matrix

  • num_col_or_row: Number of columns for CSR or number of rows for CSC

  • predict_type: What should be predicted, only feature contributions supported currently

    • C_API_PREDICT_CONTRIB: feature contributions (SHAP values)

  • start_iteration: Start index of the iteration to predict

  • num_iteration: Number of iterations for prediction, <= 0 means no limit

  • parameter: Other parameters for prediction, e.g. early stopping for prediction

  • matrix_type: Type of matrix input and output, can be C_API_MATRIX_TYPE_CSR or C_API_MATRIX_TYPE_CSC

  • [out] out_len: Length of output indices and data

  • [out] out_indptr: Pointer to output row headers for CSR or column headers for CSC

  • [out] out_indices: Pointer to sparse column indices for CSR or row indices for CSC

  • [out] out_data: Pointer to sparse data space

LIGHTGBM_C_EXPORT int LGBM_BoosterRefit(BoosterHandle handle, const int32_t *leaf_preds, int32_t nrow, int32_t ncol)

Refit the tree model using the new data (online learning).

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • leaf_preds: Pointer to predicted leaf indices

  • nrow: Number of rows of leaf_preds

  • ncol: Number of columns of leaf_preds

LIGHTGBM_C_EXPORT int LGBM_BoosterResetParameter(BoosterHandle handle, const char *parameters)

Reset config for booster.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • parameters: Parameters in format ‘key1=value1 key2=value2’

LIGHTGBM_C_EXPORT int LGBM_BoosterResetTrainingData(BoosterHandle handle, const DatasetHandle train_data)

Reset training data for booster.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • train_data: Training dataset

LIGHTGBM_C_EXPORT int LGBM_BoosterRollbackOneIter(BoosterHandle handle)

Rollback one iteration.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

LIGHTGBM_C_EXPORT int LGBM_BoosterSaveModel(BoosterHandle handle, int start_iteration, int num_iteration, int feature_importance_type, const char *filename)

Save model into file.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • handle: Handle of booster

  • start_iteration: Start index of the iteration that should be saved

  • num_iteration: Number of iterations that should be saved, counted from start_iteration; <= 0 means save all

  • feature_importance_type: Type of feature importance, can be C_API_FEATURE_IMPORTANCE_SPLIT or C_API_FEATURE_IMPORTANCE_GAIN

  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_BoosterSaveModelToString(BoosterHandle handle, int start_iteration, int num_iteration, int feature_importance_type, int64_t buffer_len, int64_t *out_len, char *out_str)

Save model to string.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of booster

  • start_iteration: Start index of the iteration that should be saved

  • num_iteration: Number of iterations that should be saved, counted from start_iteration; <= 0 means save all

  • feature_importance_type: Type of feature importance, can be C_API_FEATURE_IMPORTANCE_SPLIT or C_API_FEATURE_IMPORTANCE_GAIN

  • buffer_len: Length of the pre-allocated string buffer. If buffer_len < out_len, re-allocate the buffer with at least out_len bytes and call the function again

  • [out] out_len: Actual output length

  • [out] out_str: String of model, should pre-allocate memory
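The buffer_len / out_len contract leads to a call, check, re-allocate, call-again pattern. The sketch below shows that pattern in isolation. The save_fn type, the fake_save stand-in and the helper save_to_new_string are assumptions made so the example runs on its own; a thin wrapper supplying start_iteration, num_iteration and feature_importance_type would be needed to use it with the real function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed shape of a "save to string" call: fills out_str (up to buffer_len
 * bytes), reports the required length in out_len, returns 0 or -1. */
typedef int (*save_fn)(void *handle, int64_t buffer_len, int64_t *out_len,
                       char *out_str);

/* Hypothetical stand-in for the real call: "serializes" the C string that is
 * passed as handle. Assumption: the reported length includes the '\0'. */
int fake_save(void *handle, int64_t buffer_len, int64_t *out_len,
              char *out_str) {
  const char *model = (const char *)handle;
  int64_t need = (int64_t)strlen(model) + 1;
  *out_len = need;
  if (buffer_len >= need) memcpy(out_str, model, (size_t)need);
  return 0;
}

/* Calls save with an initial buffer; if buffer_len < out_len, re-allocates to
 * out_len and calls again. The caller frees the result. NULL on failure. */
char *save_to_new_string(save_fn save, void *handle, int64_t initial_len) {
  assert(save != NULL && initial_len > 0);
  int64_t out_len = 0;
  char *buf = malloc((size_t)initial_len);
  if (buf == NULL) return NULL;
  if (save(handle, initial_len, &out_len, buf) != 0) {
    free(buf);
    return NULL;
  }
  if (out_len > initial_len) {
    char *bigger = realloc(buf, (size_t)out_len);
    if (bigger == NULL) {
      free(buf);
      return NULL;
    }
    buf = bigger;
    if (save(handle, out_len, &out_len, buf) != 0) {
      free(buf);
      return NULL;
    }
  }
  return buf;
}
```

Starting with a generous initial_len avoids the second call in the common case, while the retry keeps the helper correct for models of any size.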

LIGHTGBM_C_EXPORT int LGBM_BoosterSetLeafValue(BoosterHandle handle, int tree_idx, int leaf_idx, double val)

Set leaf value.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of booster

  • tree_idx: Index of tree

  • leaf_idx: Index of leaf

  • val: Leaf value

LIGHTGBM_C_EXPORT int LGBM_BoosterShuffleModels(BoosterHandle handle, int start_iter, int end_iter)

Shuffle models.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of booster

  • start_iter: The first iteration that will be shuffled

  • end_iter: The last iteration that will be shuffled

LIGHTGBM_C_EXPORT int LGBM_BoosterUpdateOneIter(BoosterHandle handle, int *is_finished)

Update the model for one iteration.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of booster

  • [out] is_finished: 1 means training is finished because no further splits can be made, 0 means the iteration completed and training can continue

LIGHTGBM_C_EXPORT int LGBM_BoosterUpdateOneIterCustom(BoosterHandle handle, const float *grad, const float *hess, int *is_finished)

Update the model by specifying gradient and Hessian directly (this can be used to support customized loss functions).

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of booster

  • grad: The first order derivative (gradient) statistics

  • hess: The second order derivative (Hessian) statistics

  • [out] is_finished: 1 means training is finished because no further splits can be made, 0 means the iteration completed and training can continue
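As an illustration of what grad and hess contain, the sketch below fills both arrays for the squared-error loss L = 0.5 * (score - label)^2, whose first derivative with respect to the score is score - label and whose second derivative is 1. The function name and the choice of double for the score array are assumptions; grad and hess are float32, matching the signature above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: per-sample gradient and Hessian of the squared-error
 * loss L = 0.5 * (score - label)^2. All arrays have n elements. */
void squared_error_grad_hess(const double *score, const float *label, size_t n,
                             float *grad, float *hess) {
  assert(score != NULL && label != NULL && grad != NULL && hess != NULL);
  for (size_t i = 0; i < n; ++i) {
    grad[i] = (float)(score[i] - (double)label[i]); /* dL/dscore */
    hess[i] = 1.0f;                                 /* d2L/dscore2 */
  }
}
```

The filled arrays would then be passed as grad and hess for one boosting iteration.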

LIGHTGBM_C_EXPORT int LGBM_DatasetAddFeaturesFrom(DatasetHandle target, DatasetHandle source)

Add features from source to target.

Return

0 when succeed, -1 when failure happens

Parameters
  • target: The handle of the dataset to add features to

  • source: The handle of the dataset to take features from

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateByReference(const DatasetHandle reference, int64_t num_total_row, DatasetHandle *out)

Allocate the space for dataset and bucket feature bins according to reference dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • reference: Used to align bin mapper with other dataset

  • num_total_row: Number of total rows

  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSC(const void *col_ptr, int col_ptr_type, const int32_t *indices, const void *data, int data_type, int64_t ncol_ptr, int64_t nelem, int64_t num_row, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Create a dataset from CSC format.

Return

0 when succeed, -1 when failure happens

Parameters
  • col_ptr: Pointer to column headers

  • col_ptr_type: Type of col_ptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to row indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • ncol_ptr: Number of columns in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_row: Number of rows

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSR(const void *indptr, int indptr_type, const int32_t *indices, const void *data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Create a dataset from CSR format.

Return

0 when succeed, -1 when failure happens

Parameters
  • indptr: Pointer to row headers

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nindptr: Number of rows in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_col: Number of columns

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: Created dataset
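To make the roles of indptr, indices, data, nindptr and nelem concrete, the sketch below converts a small row-major dense matrix into CSR arrays. The helper name dense_to_csr is an assumption for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a row-major dense matrix into CSR arrays.
 * indptr must hold nrow + 1 entries; indices and data must hold at least as
 * many entries as there are nonzeros. Returns nelem, the nonzero count. */
int64_t dense_to_csr(const double *dense, int32_t nrow, int32_t ncol,
                     int32_t *indptr, int32_t *indices, double *data) {
  assert(dense != NULL && indptr != NULL && indices != NULL && data != NULL);
  assert(nrow >= 0 && ncol >= 0);
  int32_t nelem = 0;
  indptr[0] = 0;
  for (int32_t r = 0; r < nrow; ++r) {
    for (int32_t c = 0; c < ncol; ++c) {
      double v = dense[(int64_t)r * ncol + c];
      if (v != 0.0) {
        indices[nelem] = c; /* column index of this nonzero */
        data[nelem] = v;
        ++nelem;
      }
    }
    indptr[r + 1] = nelem; /* row r occupies [indptr[r], indptr[r + 1]) */
  }
  return nelem;
}
```

With arrays built this way, indptr_type would be C_API_DTYPE_INT32, data_type would be C_API_DTYPE_FLOAT64, nindptr would be nrow + 1 and num_col would be ncol.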

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromCSRFunc(void *get_row_funptr, int num_rows, int64_t num_col, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Create a dataset from CSR format through callbacks.

Return

0 when succeed, -1 when failure happens

Parameters
  • get_row_funptr: Pointer to std::function<void(int idx, std::vector<std::pair<int, double>>& ret)> (called for every row and expected to clear and fill ret)

  • num_rows: Number of rows

  • num_col: Number of columns

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromFile(const char *filename, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Load dataset from file (like LightGBM CLI version does).

Return

0 when succeed, -1 when failure happens

Parameters
  • filename: The name of the file

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: A loaded dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromMat(const void *data, int data_type, int32_t nrow, int32_t ncol, int is_row_major, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Create dataset from dense matrix.

Return

0 when succeed, -1 when failure happens

Parameters
  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nrow: Number of rows

  • ncol: Number of columns

  • is_row_major: 1 for row-major, 0 for column-major

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: Created dataset
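The is_row_major flag determines where element (row, col) sits inside the data buffer. The sketch below spells out both layouts; the helper name dense_offset is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset (in elements) of (row, col) in a dense
 * nrow x ncol buffer. is_row_major == 1: each row is contiguous;
 * is_row_major == 0: each column is contiguous. */
size_t dense_offset(int32_t row, int32_t col, int32_t nrow, int32_t ncol,
                    int is_row_major) {
  assert(row >= 0 && row < nrow && col >= 0 && col < ncol);
  return is_row_major ? (size_t)row * (size_t)ncol + (size_t)col
                      : (size_t)col * (size_t)nrow + (size_t)row;
}
```

For a 2 x 3 matrix, element (1, 0) is at offset 3 in row-major order and at offset 1 in column-major order.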

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromMats(int32_t nmat, const void **data, int data_type, int32_t *nrow, int32_t ncol, int is_row_major, const char *parameters, const DatasetHandle reference, DatasetHandle *out)

Create dataset from array of dense matrices.

Return

0 when succeed, -1 when failure happens

Parameters
  • nmat: Number of dense matrices

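  • data: Pointer to an array of nmat pointers, one per matrix, each pointing to the data space of that matrix

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nrow: Pointer to an array of nmat values, the number of rows of each matrix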

  • ncol: Number of columns

  • is_row_major: 1 for row-major, 0 for column-major

  • parameters: Additional parameters

  • reference: Used to align bin mapper with other dataset, nullptr means it is not used

  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetCreateFromSampledColumn(double **sample_data, int **sample_indices, int32_t ncol, const int *num_per_col, int32_t num_sample_row, int32_t num_total_row, const char *parameters, DatasetHandle *out)

Allocate the space for dataset and bucket feature bins according to sampled data.

Return

0 when succeed, -1 when failure happens

Parameters
  • sample_data: Sampled data, grouped by the column

  • sample_indices: Indices of sampled data

  • ncol: Number of columns

  • num_per_col: Size of each sampling column

  • num_sample_row: Number of sampled rows

  • num_total_row: Number of total rows

  • parameters: Additional parameters

  • [out] out: Created dataset

LIGHTGBM_C_EXPORT int LGBM_DatasetDumpText(DatasetHandle handle, const char *filename)

Save dataset to text file, intended for debugging use only.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_DatasetFree(DatasetHandle handle)

Free space for dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset to be freed

LIGHTGBM_C_EXPORT int LGBM_DatasetGetFeatureNames(DatasetHandle handle, const int len, int *num_feature_names, const size_t buffer_len, size_t *out_buffer_len, char **feature_names)

Get feature names of dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • len: Number of char* pointers stored at feature_names. If smaller than the number of feature names, only this many strings are copied

  • [out] num_feature_names: Number of feature names

  • buffer_len: Size of pre-allocated strings. Content is copied up to buffer_len - 1 and null-terminated

  • [out] out_buffer_len: String sizes required to do the full string copies

  • [out] feature_names: Feature names, should pre-allocate memory
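Because feature_names must be pre-allocated, the caller needs len separate string buffers of buffer_len bytes each. The sketch below shows one way to allocate and release them; both helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates len zero-filled string buffers of
 * buffer_len bytes each. Returns NULL on allocation failure. */
char **alloc_name_buffers(int len, size_t buffer_len) {
  assert(len > 0 && buffer_len > 0);
  char **names = malloc((size_t)len * sizeof *names);
  if (names == NULL) return NULL;
  for (int i = 0; i < len; ++i) {
    names[i] = calloc(buffer_len, 1);
    if (names[i] == NULL) {
      while (i > 0) free(names[--i]); /* undo the partial allocation */
      free(names);
      return NULL;
    }
  }
  return names;
}

/* Releases buffers obtained from alloc_name_buffers. */
void free_name_buffers(char **names, int len) {
  for (int i = 0; i < len; ++i) free(names[i]);
  free(names);
}
```

After the call, a num_feature_names larger than len or an out_buffer_len larger than buffer_len would indicate that the names were truncated, in which case the buffers can be re-allocated with the reported sizes and the call repeated.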

LIGHTGBM_C_EXPORT int LGBM_DatasetGetField(DatasetHandle handle, const char *field_name, int *out_len, const void **out_ptr, int *out_type)

Get info vector from dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • field_name: Field name

  • [out] out_len: Used to set result length

  • [out] out_ptr: Pointer to the result

  • [out] out_type: Type of result pointer, can be C_API_DTYPE_INT32, C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

LIGHTGBM_C_EXPORT int LGBM_DatasetGetNumData(DatasetHandle handle, int *out)

Get number of data points.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • [out] out: The address to hold number of data points

LIGHTGBM_C_EXPORT int LGBM_DatasetGetNumFeature(DatasetHandle handle, int *out)

Get number of features.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • [out] out: The address to hold number of features

LIGHTGBM_C_EXPORT int LGBM_DatasetGetSubset(const DatasetHandle handle, const int32_t *used_row_indices, int32_t num_used_row_indices, const char *parameters, DatasetHandle *out)

Create a subset of a dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of full dataset

  • used_row_indices: Indices used in subset

  • num_used_row_indices: Length of used_row_indices

  • parameters: Additional parameters

  • [out] out: Subset of data

LIGHTGBM_C_EXPORT int LGBM_DatasetPushRows(DatasetHandle dataset, const void *data, int data_type, int32_t nrow, int32_t ncol, int32_t start_row)

Push data to an existing dataset. If nrow + start_row == num_total_row, dataset->FinishLoad is called.

Return

0 when succeed, -1 when failure happens

Parameters
  • dataset: Handle of dataset

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nrow: Number of rows

  • ncol: Number of columns

  • start_row: Row start index
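Pushing a large matrix in pieces means advancing start_row by the size of each chunk until the last chunk satisfies nrow + start_row == num_total_row. The sketch below shows that loop. The push_fn type, the counting_push stand-in and push_in_chunks are assumptions so the example runs on its own; a wrapper that also supplies data_type would be needed for the real function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of a row-pushing call: returns 0 on success, -1 on failure. */
typedef int (*push_fn)(void *dataset, const double *data, int32_t nrow,
                       int32_t ncol, int32_t start_row);

/* Hypothetical stand-in for the real call: dataset points to an int32_t that
 * counts the rows received so far; this stand-in expects chunks in order. */
int counting_push(void *dataset, const double *data, int32_t nrow,
                  int32_t ncol, int32_t start_row) {
  int32_t *received = dataset;
  (void)data;
  (void)ncol;
  if (start_row != *received) return -1;
  *received += nrow;
  return 0;
}

/* Pushes a row-major matrix in chunks of at most chunk_rows rows. For the
 * last chunk, nrow + start_row == num_total_row. */
int push_in_chunks(push_fn push, void *dataset, const double *data,
                   int32_t num_total_row, int32_t ncol, int32_t chunk_rows) {
  assert(push != NULL && data != NULL && chunk_rows > 0);
  for (int32_t start_row = 0; start_row < num_total_row;
       start_row += chunk_rows) {
    int32_t nrow = num_total_row - start_row;
    if (nrow > chunk_rows) nrow = chunk_rows;
    const double *chunk = data + (int64_t)start_row * ncol;
    if (push(dataset, chunk, nrow, ncol, start_row) != 0) return -1;
  }
  return 0;
}
```

With 10 rows and chunk_rows set to 4, the calls use start_row values 0, 4 and 8, and the final call pushes the remaining 2 rows.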

LIGHTGBM_C_EXPORT int LGBM_DatasetPushRowsByCSR(DatasetHandle dataset, const void *indptr, int indptr_type, const int32_t *indices, const void *data, int data_type, int64_t nindptr, int64_t nelem, int64_t num_col, int64_t start_row)

Push data to an existing dataset. If nrow + start_row == num_total_row, where nrow = nindptr - 1, dataset->FinishLoad is called.

Return

0 when succeed, -1 when failure happens

Parameters
  • dataset: Handle of dataset

  • indptr: Pointer to row headers

  • indptr_type: Type of indptr, can be C_API_DTYPE_INT32 or C_API_DTYPE_INT64

  • indices: Pointer to column indices

  • data: Pointer to the data space

  • data_type: Type of data pointer, can be C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64

  • nindptr: Number of rows in the matrix + 1

  • nelem: Number of nonzero elements in the matrix

  • num_col: Number of columns

  • start_row: Row start index

LIGHTGBM_C_EXPORT int LGBM_DatasetSaveBinary(DatasetHandle handle, const char *filename)

Save dataset to binary file.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • filename: The name of the file

LIGHTGBM_C_EXPORT int LGBM_DatasetSetFeatureNames(DatasetHandle handle, const char **feature_names, int num_feature_names)

Save feature names to dataset.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • feature_names: Feature names

  • num_feature_names: Number of feature names

LIGHTGBM_C_EXPORT int LGBM_DatasetSetField(DatasetHandle handle, const char *field_name, const void *field_data, int num_element, int type)

Set the content of an info vector (field) of the dataset.

Note

  • group only works for C_API_DTYPE_INT32;

  • label and weight only work for C_API_DTYPE_FLOAT32;

  • init_score only works for C_API_DTYPE_FLOAT64.

Return

0 when succeed, -1 when failure happens

Parameters
  • handle: Handle of dataset

  • field_name: Field name, can be label, weight, init_score, group

  • field_data: Pointer to data vector

  • num_element: Number of elements in field_data

  • type: Type of field_data pointer, can be C_API_DTYPE_INT32, C_API_DTYPE_FLOAT32 or C_API_DTYPE_FLOAT64
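The type restrictions in the note above can be checked before the call. The sketch below maps each documented field name to the only type it accepts; the helper name required_field_dtype is an assumption, and the constants repeat the values listed under Defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values as listed under Defines. */
#define C_API_DTYPE_FLOAT32 (0)
#define C_API_DTYPE_FLOAT64 (1)
#define C_API_DTYPE_INT32 (2)

/* Hypothetical helper: returns the type required for a field name according
 * to the note, or -1 for a name the note does not cover. */
int required_field_dtype(const char *field_name) {
  assert(field_name != NULL);
  if (strcmp(field_name, "group") == 0) return C_API_DTYPE_INT32;
  if (strcmp(field_name, "label") == 0 || strcmp(field_name, "weight") == 0)
    return C_API_DTYPE_FLOAT32;
  if (strcmp(field_name, "init_score") == 0) return C_API_DTYPE_FLOAT64;
  return -1;
}
```

Comparing the returned value with the type argument before calling the setter turns a mismatch into an early, explicit check on the caller's side.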

LIGHTGBM_C_EXPORT int LGBM_DatasetUpdateParamChecking(const char *old_parameters, const char *new_parameters)

Raise errors for attempts to update dataset parameters.

Return

0 when succeed, -1 when failure happens

Parameters
  • old_parameters: Current dataset parameters

  • new_parameters: New dataset parameters

LIGHTGBM_C_EXPORT int LGBM_FastConfigFree(FastConfigHandle fastConfig)

Release FastConfig object.

Return

0 when it succeeds, -1 when failure happens

Parameters
  • fastConfig: Handle to the FastConfig object acquired with a *FastInit() method.

LIGHTGBM_C_EXPORT const char *LGBM_GetLastError()

Get string message of the last error.

Return

Error information

LIGHTGBM_C_EXPORT int LGBM_NetworkFree()

Finalize the network.

Return

0 when succeed, -1 when failure happens

LIGHTGBM_C_EXPORT int LGBM_NetworkInit(const char *machines, int local_listen_port, int listen_time_out, int num_machines)

Initialize the network.

Return

0 when succeed, -1 when failure happens

Parameters
  • machines: List of machines in format ‘ip1:port1,ip2:port2’

  • local_listen_port: TCP listen port for local machines

  • listen_time_out: Socket time-out in minutes

  • num_machines: Total number of machines
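Since machines is a comma-separated list and num_machines is passed separately, the two can be kept consistent by counting the entries in the list. The sketch below does that; the helper name count_machines is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the comma-separated entries in a machine list
 * such as "ip1:port1,ip2:port2". Returns 0 for an empty string. */
int count_machines(const char *machines) {
  assert(machines != NULL);
  if (machines[0] == '\0') return 0;
  int count = 1;
  for (const char *p = machines; *p != '\0'; ++p) {
    if (*p == ',') ++count; /* each comma starts another entry */
  }
  return count;
}
```

The result could be passed as num_machines, or compared against a configured value before the network is initialized.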

LIGHTGBM_C_EXPORT int LGBM_NetworkInitWithFunctions(int num_machines, int rank, void *reduce_scatter_ext_fun, void *allgather_ext_fun)

Initialize the network with external collective functions.

Return

0 when succeed, -1 when failure happens

Parameters
  • num_machines: Total number of machines

  • rank: Rank of local machine

  • reduce_scatter_ext_fun: The external reduce-scatter function

  • allgather_ext_fun: The external allgather function

LIGHTGBM_C_EXPORT int LGBM_RegisterLogCallback(void (*callback)(const char*))

Register a callback function for log redirecting.

Return

0 when succeed, -1 when failure happens

Parameters
  • callback: The callback function to register
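A callback for log redirecting has to match void (*)(const char*). The sketch below collects messages in a fixed-size buffer; the names log_buffer, log_used and buffer_log_callback are assumptions, and the code is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical log sink: messages are appended to a fixed-size buffer. */
char log_buffer[4096];
size_t log_used = 0;

/* Matches the expected callback signature void (*)(const char*). */
void buffer_log_callback(const char *msg) {
  assert(msg != NULL);
  size_t len = strlen(msg);
  size_t room = sizeof log_buffer - 1 - log_used;
  if (len > room) len = room; /* drop what does not fit */
  memcpy(log_buffer + log_used, msg, len);
  log_used += len;
  log_buffer[log_used] = '\0';
}
```

It would be registered with LGBM_RegisterLogCallback(buffer_log_callback), after which log messages end up in log_buffer instead of the default output.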

void LGBM_SetLastError(const char *msg)

Set string message of the last error.

Parameters
  • msg: Error message