BoostedTreesCalculateBestFeatureSplitV2

public final class BoostedTreesCalculateBestFeatureSplitV2

Calculates gains for each feature and returns the best possible split information for each node. However, if no split is found, then no split information is returned for that node.

The split information is the best threshold (bucket id), gains and left/right node contributions per node for each feature.

It is possible that not all nodes can be split on each feature. Hence, the list of possible nodes can differ between the features. Therefore, we return `node_ids_list` for each feature, containing the list of nodes that this feature can be used to split.

In this manner, the output is the best split per feature and per node, so it needs to be combined later to produce the best split for each node (among all possible features).
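The combining step described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `BestSplitPerNode`, its `Candidate` type, and the array layout (one row per feature) are hypothetical and are not part of the TensorFlow API. For each node, it keeps the candidate with the highest gain across all features.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; none of these names come from TensorFlow.
final class BestSplitPerNode {

    /** One candidate split for a node, as proposed by a single feature. */
    static final class Candidate {
        final int featureId;
        final int threshold;
        final float gain;

        Candidate(int featureId, int threshold, float gain) {
            this.featureId = featureId;
            this.threshold = threshold;
            this.gain = gain;
        }
    }

    /**
     * Combines per-feature results into one best candidate per node.
     * Assumed layout: row f of nodeIds, gains and thresholds holds the
     * results for the feature whose id is featureIds[f]. Rows may have
     * different lengths because not every feature can split every node.
     */
    static Map<Integer, Candidate> combine(
            int[] featureIds, int[][] nodeIds, float[][] gains, int[][] thresholds) {
        Map<Integer, Candidate> best = new HashMap<>();
        for (int f = 0; f < featureIds.length; f++) {
            for (int i = 0; i < nodeIds[f].length; i++) {
                int node = nodeIds[f][i];
                Candidate current = best.get(node);
                // Keep the candidate with the strictly larger gain.
                if (current == null || gains[f][i] > current.gain) {
                    best.put(node, new Candidate(featureIds[f], thresholds[f][i], gains[f][i]));
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Integer, Candidate> best = combine(
                new int[] {10, 20},
                new int[][] {{0, 1}, {1, 2}},
                new float[][] {{0.5f, 0.2f}, {0.9f, 0.1f}},
                new int[][] {{3, 4}, {5, 6}});
        for (Map.Entry<Integer, Candidate> e : best.entrySet()) {
            System.out.println("node " + e.getKey() + " -> feature " + e.getValue().featureId);
        }
    }
}
```

Ties are resolved in favor of the feature seen first; a real pipeline may prefer a different tie-breaking rule.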

The output shapes are compatible in that the first dimension of all tensors is the same and equal to the number of possible split nodes for each feature.

Constants

String OP_NAME The name of this op, as known by the TensorFlow core engine

Public Methods

static BoostedTreesCalculateBestFeatureSplitV2
create ( Scope scope, Operand < TInt32 > nodeIdRange, Iterable< Operand < TFloat32 >> statsSummariesList, Operand < TString > splitTypes, Operand < TInt32 > candidateFeatureIds, Operand < TFloat32 > l1, Operand < TFloat32 > l2, Operand < TFloat32 > treeComplexity, Operand < TFloat32 > minNodeWeight, Long logitsDimension)
Factory method to create a class wrapping a new BoostedTreesCalculateBestFeatureSplitV2 operation.
Output < TInt32 >
featureDimensions ()
A Rank 1 tensor indicating the best feature dimension for each feature to split for certain nodes if the feature is multi-dimensional.
Output < TInt32 >
featureIds ()
A Rank 1 tensor indicating the best feature id for each node.
Output < TFloat32 >
gains ()
A Rank 1 tensor indicating the best gains for each feature to split for certain nodes.
Output < TFloat32 >
leftNodeContribs ()
A Rank 2 tensor indicating the contribution of the left nodes when branching from parent nodes (given by the tensor element in the output node_ids_list) to the left direction by the given threshold for each feature.
Output < TInt32 >
nodeIds ()
A Rank 1 tensor indicating possible split node ids for each feature.
Output < TFloat32 >
rightNodeContribs ()
A Rank 2 tensor with the same shape and conditions as left_node_contribs_list, except that the values are for the right node.
Output < TString >
splitWithDefaultDirections ()
A Rank 1 tensor indicating which direction to go if data is missing.
Output < TInt32 >
thresholds ()
A Rank 1 tensor indicating the bucket id to compare with (as a threshold) for the split in each node.

Constants

public static final String OP_NAME

The name of this op, as known by the TensorFlow core engine

Constant Value: "BoostedTreesCalculateBestFeatureSplitV2"

Public Methods

public static BoostedTreesCalculateBestFeatureSplitV2 create ( Scope scope, Operand < TInt32 > nodeIdRange, Iterable< Operand < TFloat32 >> statsSummariesList, Operand < TString > splitTypes, Operand < TInt32 > candidateFeatureIds, Operand < TFloat32 > l1, Operand < TFloat32 > l2, Operand < TFloat32 > treeComplexity, Operand < TFloat32 > minNodeWeight, Long logitsDimension)

Factory method to create a class wrapping a new BoostedTreesCalculateBestFeatureSplitV2 operation.

Parameters
scope current scope
nodeIdRange A Rank 1 tensor (shape=[2]) to specify the range [first, last) of node ids to process within `stats_summary_list`. The nodes are iterated between the two nodes specified by the tensor, as in `for node_id in range(node_id_range[0], node_id_range[1])` (note that the last index node_id_range[1] is exclusive).
statsSummariesList A list of Rank 4 tensors (shape=[max_splits, feature_dims, bucket, stats_dims]) for accumulated stats summary (gradient/hessian) per node, per dimension, per bucket for each feature. The first dimension of each tensor is the maximum number of splits, so not all of its elements will be used, only the indexes specified by node_ids.
splitTypes A Rank 1 tensor indicating whether this op should perform an inequality split or an equality split per feature.
candidateFeatureIds Rank 1 tensor with ids for each feature. This is the real id of the feature.
l1 L1 regularization factor on leaf weights, applied per instance.
l2 L2 regularization factor on leaf weights, applied per instance.
treeComplexity Adjustment to the gain, applied per leaf.
minNodeWeight Minimum average of hessians in a node required for the node to be considered for splitting.
logitsDimension The dimension of the logits, i.e., the number of classes.
Returns
  • a new instance of BoostedTreesCalculateBestFeatureSplitV2
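To make the `nodeIdRange`, `l1` and `l2` parameters more concrete, here is a minimal plain-Java sketch. It is not the op's implementation: the class `SplitGainSketch` and its methods are hypothetical, and the gain formula is the common gradient-boosting formulation for a single logit dimension (soft-thresholded gradient sum divided by the regularized hessian sum), which this page does not confirm. `treeComplexity` and `minNodeWeight` are deliberately left out because their exact use is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the TensorFlow API.
final class SplitGainSketch {

    /** Enumerates node ids in the half-open range [range[0], range[1]). */
    static List<Integer> nodeIdsInRange(int[] range) {
        List<Integer> ids = new ArrayList<>();
        for (int nodeId = range[0]; nodeId < range[1]; nodeId++) {
            ids.add(nodeId);
        }
        return ids;
    }

    /** Shrinks the gradient sum g toward zero by l1 (soft thresholding). */
    static double softThreshold(double g, double l1) {
        if (g > l1) {
            return g - l1;
        }
        if (g < -l1) {
            return g + l1;
        }
        return 0.0;
    }

    /** Assumed leaf weight: -T(g) / (h + l2), where T is the soft threshold. */
    static double leafWeight(double g, double h, double l1, double l2) {
        return -softThreshold(g, l1) / (h + l2);
    }

    /** Assumed leaf gain: T(g)^2 / (h + l2). */
    static double leafGain(double g, double h, double l1, double l2) {
        double t = softThreshold(g, l1);
        return t * t / (h + l2);
    }

    public static void main(String[] args) {
        System.out.println(nodeIdsInRange(new int[] {2, 5}));
        System.out.println(leafWeight(3.0, 1.0, 1.0, 1.0));
        System.out.println(leafGain(3.0, 1.0, 1.0, 1.0));
    }
}
```

Under these assumptions, the gain of a candidate split would be the sum of the left and right leaf gains minus the gain of the unsplit parent.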

public Output < TInt32 > featureDimensions ()

A Rank 1 tensor indicating the best feature dimension for each feature to split for certain nodes if the feature is multi-dimensional. See above for details like shapes and sizes.

public Output < TInt32 > featureIds ()

A Rank 1 tensor indicating the best feature id for each node. See above for details like shapes and sizes.

public Output < TFloat32 > gains ()

A Rank 1 tensor indicating the best gains for each feature to split for certain nodes. See above for details like shapes and sizes.

public Output < TFloat32 > leftNodeContribs ()

A Rank 2 tensor indicating the contribution of the left nodes when branching from parent nodes (given by the tensor element in the output node_ids_list) to the left direction by the given threshold for each feature. This value is added to the parent node value to produce the left node value. The second dimension has size 1 for 1-dimensional logits, but is larger for multi-class problems. See above for details like shapes and sizes.
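As a small illustration of how a contribution relates to a node value, the following plain-Java sketch adds one row of contributions to a parent value, element by element across the logits dimension. The class `NodeValueSketch` and the method `childValue` are hypothetical names chosen for this example and are not part of the TensorFlow API.

```java
// Hypothetical illustration; not part of the TensorFlow API.
final class NodeValueSketch {

    /**
     * Returns the child node value: parent value plus contribution,
     * with one entry per logit dimension.
     */
    static float[] childValue(float[] parentValue, float[] contrib) {
        if (parentValue.length != contrib.length) {
            throw new IllegalArgumentException("logit dimensions differ");
        }
        float[] child = new float[parentValue.length];
        for (int k = 0; k < child.length; k++) {
            child[k] = parentValue[k] + contrib[k];
        }
        return child;
    }

    public static void main(String[] args) {
        float[] left = childValue(new float[] {0.5f, -1.0f}, new float[] {0.25f, 0.5f});
        System.out.println(java.util.Arrays.toString(left));
    }
}
```

The same helper would apply to the right node when given a row of right node contributions.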

public Output < TInt32 > nodeIds ()

A Rank 1 tensor indicating possible split node ids for each feature. The length of the list is num_features, but each tensor has a different size, as each feature provides different possible nodes. See above for details like shapes and sizes.

public Output < TFloat32 > rightNodeContribs ()

A Rank 2 tensor with the same shape and conditions as left_node_contribs_list, except that the values are for the right node.

public Output < TString > splitWithDefaultDirections ()

A Rank 1 tensor indicating which direction to go if data is missing. See above for details like shapes and sizes. Inequality with default left returns 0, inequality with default right returns 1, equality with default right returns 2.

public Output < TInt32 > thresholds ()

A Rank 1 tensor indicating the bucket id to compare with (as a threshold) for the split in each node. See above for details like shapes and sizes.