lightgbm.create_tree_digraph

lightgbm.create_tree_digraph(booster, tree_index=0, show_info=None, precision=3, orientation='horizontal', example_case=None, max_category_values=10, **kwargs)

Create a digraph representation of the specified tree.

Each node in the graph represents a node in the tree.

Non-leaf nodes have labels like Column_10 <= 875.9, which means “this node splits on the feature named ‘Column_10’, with threshold 875.9”.

Leaf nodes have labels like leaf 2: 0.422, which means “this node is a leaf node, and the predicted value for records that fall into this node is 0.422”. The number (2) is an internal unique identifier and doesn’t have any special meaning.

Note

For more information about the returned Digraph object, see https://graphviz.readthedocs.io/en/stable/api.html#digraph.

Parameters:
  • booster (Booster or LGBMModel) – Booster or LGBMModel instance to be converted.

  • tree_index (int, optional (default=0)) – The index of a target tree to convert.

  • show_info (list of str, or None, optional (default=None)) –

    What information should be shown in nodes.

    • 'split_gain' : gain from adding this split to the model

    • 'internal_value' : raw predicted value that would be produced by this node if it was a leaf node

    • 'internal_count' : number of records from the training data that fall into this non-leaf node

    • 'internal_weight' : total weight (sum of Hessian) of all observations that fall into this non-leaf node

    • 'leaf_count' : number of records from the training data that fall into this leaf node

    • 'leaf_weight' : total weight (sum of Hessian) of all observations that fall into this leaf node

    • 'data_percentage' : percentage of training data that fall into this node

  • precision (int or None, optional (default=3)) – Number of decimal places used when displaying floating point values. If None, values are displayed without rounding.

  • orientation (str, optional (default='horizontal')) – Orientation of the tree. Can be 'horizontal' or 'vertical'.

  • example_case (numpy 2-D array, pandas DataFrame or None, optional (default=None)) –

    Single row with the same structure as the training data. If not None, the plot will highlight the path that sample takes through the tree.

    New in version 4.0.0.

  • max_category_values (int, optional (default=10)) –

    The maximum number of category values to display in tree nodes. If the number of thresholds is greater than this value, the thresholds are collapsed and displayed in the label tooltip instead.

    Warning

    When running in JupyterLab, consider wrapping the SVG string of the tree graph with IPython.display.HTML so that the tooltip works correctly.

    Example:

    import lightgbm as lgb
    from IPython.display import HTML

    # clf is assumed to be an already-fitted Booster or LGBMModel.
    graph = lgb.create_tree_digraph(clf, max_category_values=5)
    HTML(graph._repr_image_svg_xml())
    

    New in version 4.0.0.

  • **kwargs – Other parameters passed to the Digraph constructor. Check https://graphviz.readthedocs.io/en/stable/api.html#digraph for the full list of supported parameters.

Returns:

graph – The digraph representation of the specified tree.

Return type:

graphviz.Digraph