Quick Start

This is a quick start guide for the CLI version of LightGBM.

Follow the Installation Guide to install LightGBM first.

Training Data Format

LightGBM supports input data files with CSV, TSV and LibSVM (zero-based) formats.

Files may or may not include a header row.

The label column can be specified either by index or by name.

Selected columns can be ignored during training.
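
As a rough illustration of the options above, the following Python sketch writes a small CSV file with a header row and a config file that selects the label column by name and ignores an identifier column. The file names, the column names (id, target, f1, f2) and the helper write_training_files are made up for this example. The parameter names header, label_column and ignore_column are assumed to be the ones that control this behaviour, so check them against the full parameter list before relying on them.

```python
import csv
from pathlib import Path


def write_training_files(directory):
    """Write a tiny CSV with a header row and a matching config file.

    Everything here (file names, column names, values) is illustrative.
    """
    directory = Path(directory)
    data_path = directory / "train.csv"
    conf_path = directory / "train.conf"

    rows = [
        ["id", "target", "f1", "f2"],  # header row
        [1, 0, 0.5, 1.2],
        [2, 1, 1.5, 0.3],
        [3, 0, 0.1, 2.2],
    ]
    with data_path.open("w", newline="") as f:
        # Use plain "\n" line endings instead of the csv default "\r\n".
        csv.writer(f, lineterminator="\n").writerows(rows)

    conf_lines = [
        "task = train",
        f"data = {data_path}",
        "header = true",               # the file has a header row
        "label_column = name:target",  # select the label by column name
        "ignore_column = name:id",     # skip the identifier column
    ]
    conf_path.write_text("\n".join(conf_lines) + "\n")
    return data_path, conf_path
```

Calling write_training_files(".") would produce train.csv and train.conf in the current directory; the config could then be passed to the CLI as config=train.conf.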

Categorical Feature Support

LightGBM can use categorical features directly, without one-hot encoding. An experiment on the Expo data shows a speed-up of about 8x compared with one-hot encoding.

For details on how to configure this, refer to the categorical_feature parameter.
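
The sketch below shows one way to prepare a categorical column without one-hot encoding: each distinct value is mapped to a small non-negative integer code. It assumes that categorical columns should be stored as integer codes in the data file. The helper encode_categories and the column name color are hypothetical.

```python
def encode_categories(values):
    """Map each distinct value to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)  # next unused code: 0, 1, 2, ...
        encoded.append(codes[value])
    return encoded, codes


colors = ["red", "green", "red", "blue"]
encoded, mapping = encode_categories(colors)
print(encoded)  # prints [0, 1, 0, 2]

# The encoded column would then be declared in the config, for example:
#   categorical_feature = name:color
```

A single integer column replaces the three indicator columns that one-hot encoding would need for these values.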

Weight and Query/Group Data

LightGBM also supports weighted training, which requires additional weight data. Ranking tasks require additional query data.

Weight and query data can also be specified as columns in the training data, in the same manner as the label.
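
A minimal sketch of supplying the additional data as separate files follows. It assumes the convention of side files named after the data file with a .weight suffix (one weight per row) and a .query suffix (one row count per query, in order); confirm this convention in the full documentation. The helper write_side_files is hypothetical.

```python
from pathlib import Path


def write_side_files(data_path, weights, group_sizes):
    """Write <data>.weight and <data>.query files next to the data file.

    weights:     one weight per training row
    group_sizes: number of consecutive rows belonging to each query
    """
    if sum(group_sizes) != len(weights):
        raise ValueError("group sizes must add up to the number of rows")

    weight_path = Path(str(data_path) + ".weight")
    query_path = Path(str(data_path) + ".query")

    # One value per line in both files.
    weight_path.write_text("".join(f"{w}\n" for w in weights))
    query_path.write_text("".join(f"{n}\n" for n in group_sizes))
    return weight_path, query_path
```

For a three-row train.txt in which the first two rows belong to one query, write_side_files("train.txt", [1.0, 0.5, 0.8], [2, 1]) would create train.txt.weight and train.txt.query.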

Parameters Quick Look

The parameter format is key1=value1 key2=value2 ...

Parameters can be set both in a config file and on the command line. If a parameter appears in both, LightGBM uses the value from the command line.

The most important parameters for new users are located in the Core Parameters section and at the top of the Learning Control Parameters section of the full list of LightGBM’s parameters.

Run LightGBM

lightgbm config=your_config_file other_args ...

Parameters can be set both in the config file and on the command line, and parameters on the command line take priority over those in the config file. For example, the following command uses num_trees=10 and ignores any num_trees value in the config file.

lightgbm config=train.conf num_trees=10
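
The precedence rule can be illustrated with a short Python sketch that merges key=value pairs from a config file with those from the command line. This is not LightGBM's own parsing code, only a model of the documented behaviour. The helpers parse_pairs and resolve_params, the sample config contents, and the handling of # as a comment marker are assumptions made for the example.

```python
def parse_pairs(tokens):
    """Turn 'key=value' strings into a dict, skipping anything without '='."""
    params = {}
    for token in tokens:
        if "=" not in token:
            continue
        key, value = token.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def resolve_params(config_text, cli_args):
    """Merge config-file and command-line parameters; the command line wins."""
    # Drop everything after '#' on each config line (assumed comment syntax).
    config_lines = [line.split("#", 1)[0] for line in config_text.splitlines()]
    params = parse_pairs(config_lines)
    params.update(parse_pairs(cli_args))  # later update overrides the config
    return params


# Hypothetical contents of train.conf.
config_text = "task = train\nnum_trees = 100\nlearning_rate = 0.1\n"

resolved = resolve_params(config_text, ["config=train.conf", "num_trees=10"])
print(resolved["num_trees"])  # prints 10
```

Parameters that appear only in the config file, such as learning_rate here, keep their config-file values.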