Recommendations When Using gcc

It is recommended to use -O3 -mtune=native to achieve maximum speed during LightGBM training.

On an Intel Ivy Bridge CPU with the 1M x 1K Bosch dataset, performance increases as follows:

Compilation Flag                  Performance Index
-O2 -mtune=core2                  100.00%
-O2 -mtune=native                 100.90%
-O3 -mtune=native                 102.78%
-O3 -ffast-math -mtune=native     100.64%
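
To make the comparison easier to read, the short Python sketch below restates the figures above as a gain in index points over the -O2 -mtune=core2 baseline and picks the best flag set. The function and variable names are illustrative only. The sketch assumes that a higher index means faster training, and it makes no claim about how the index itself was measured.

```python
# Performance index values copied from the figures above (baseline = 100.00).
results = {
    "-O2 -mtune=core2": 100.00,
    "-O2 -mtune=native": 100.90,
    "-O3 -mtune=native": 102.78,
    "-O3 -ffast-math -mtune=native": 100.64,
}

# Assumption: the first row is the reference that every other row is compared to.
BASELINE = "-O2 -mtune=core2"


def gains_over_baseline(index_by_flags, baseline):
    """Return the difference in index points between each flag set and the baseline."""
    base = index_by_flags[baseline]
    # Round to two decimals to hide floating-point noise from the subtraction.
    return {flags: round(value - base, 2) for flags, value in index_by_flags.items()}


gains = gains_over_baseline(results, BASELINE)
best = max(gains, key=gains.get)

# Print the flag sets from largest to smallest gain.
for flags, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{flags:<32} {gain:+.2f} points")
print("best:", best)
```

Read this way, the largest gain is under three index points, so the choice of flags matters less than the absolute numbers might suggest, and adding -ffast-math gives back most of the gain from -O3 in this measurement.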

You can find more details on the experimentation below:

The image below compares the training runtime for different compiler options against a baseline of LightGBM compiled with -O2 -mtune=core2. All three other options are faster than that baseline. The best performance was achieved with -O3 -mtune=native.

[Figure: chart of training runtime grouped by compiler options, with -O2 -mtune=core2 as the baseline. The other three options are faster, and -O3 -mtune=native is the fastest.]