tfdf.keras.FeatureUsage
Semantic and hyper-parameters for a single feature.
tfdf.keras.FeatureUsage(
    name: str,
    semantic: Optional[tfdf.keras.FeatureSemantic] = None,
    num_discretized_numerical_bins: Optional[int] = None,
    max_vocab_count: Optional[int] = None,
    min_vocab_frequency: Optional[int] = None,
    override_global_imputation_value: Optional[str] = None,
    monotonic: tfdf.keras.core.MonotonicConstraint = None
)
This class allows you to:
- Limit the input features of the model.
- Manually set the semantic of a feature.
- Specify feature-specific hyper-parameters.
Note that the model's "features" argument is optional. If it is not specified,
all available features will be used. See the "CoreModel" class
documentation for more details.
Usage example:
# A feature named "A". The semantic will be detected automatically. The
# global hyper-parameters of the model will be used.
feature_a = FeatureUsage(name="A")

# A feature named "B" representing a CATEGORICAL value.
# Specifying the semantic ensures the feature is correctly detected.
# In this case, the feature might be stored as an integer, and would
# otherwise have been detected as NUMERICAL.
feature_b = FeatureUsage(name="B", semantic=Semantic.CATEGORICAL)

# A feature with a specific maximum dictionary size.
feature_c = FeatureUsage(name="C",
                         semantic=Semantic.CATEGORICAL,
                         max_vocab_count=32)

model = CoreModel(features=[feature_a, feature_b, feature_c])
Attributes |
name
|
The name of the feature. Used as an identifier if the dataset is a
dictionary of tensors.
|
semantic
|
Semantic of the feature. If None, the semantic is determined
automatically. The semantic controls how a feature is interpreted by a model.
Using the wrong semantic (e.g. numerical instead of categorical) will hurt
your model. See "FeatureSemantic" and "Semantic" for the definition of the
available semantics.
|
num_discretized_numerical_bins
|
For DISCRETIZED_NUMERICAL features only.
Number of bins used to discretize DISCRETIZED_NUMERICAL features.
|
max_vocab_count
|
For CATEGORICAL and CATEGORICAL_SET features only. Maximum number
of unique categorical values stored as strings. If more categorical values
are present, the least frequent values are grouped into an
out-of-vocabulary item. Reducing the value can improve or hurt the model.
|
min_vocab_frequency
|
For CATEGORICAL and CATEGORICAL_SET features only.
Minimum number of occurrences of a categorical value. Values present fewer
than "min_vocab_frequency" times in the training dataset are treated as
"Out-of-vocabulary".
|
override_global_imputation_value
|
For CATEGORICAL and CATEGORICAL_SET
features only. If set, replaces the global imputation value used to handle
missing values. That is, at inference time, missing values will be treated
as "override_global_imputation_value". "override_global_imputation_value"
can only be used on categorical features and on columns not containing
missing values in the training dataset. If the algorithm used to handle
missing values is not "GLOBAL_IMPUTATION" (default algorithm), this value
is ignored.
|
monotonic
|
Monotonic constraint between the feature and the model output.
Use None (default) for a feature without a monotonic constraint.
Monotonic.INCREASING ensures the model output is monotonically increasing
with respect to the feature. Monotonic.DECREASING ensures the model output
is monotonically decreasing with respect to the feature. Alternatively, you
can use 0, +1 and -1 to define, respectively, a non-constrained, a
monotonically increasing, and a monotonically decreasing feature.
|
guide
|
|
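To make the interaction between "max_vocab_count" and "min_vocab_frequency" more concrete, the following pure-Python sketch mimics the behavior described in the table above. It is an illustration only, not the library's implementation: the helper names `build_vocabulary` and `encode`, the `"<OOV>"` placeholder, and the tie-breaking rule are all hypothetical, and the library may count the out-of-vocabulary item differently.

```python
from collections import Counter

# Hypothetical placeholder for the out-of-vocabulary item.
OOV = "<OOV>"


def build_vocabulary(values, max_vocab_count=None, min_vocab_frequency=1):
    """Returns the values kept in the dictionary, most frequent first."""
    counts = Counter(values)
    # Drop values seen fewer than min_vocab_frequency times.
    frequent = [(v, c) for v, c in counts.items() if c >= min_vocab_frequency]
    # Sort by decreasing count; break ties by value for determinism
    # (an assumption of this sketch).
    frequent.sort(key=lambda vc: (-vc[1], vc[0]))
    # Keep at most max_vocab_count values; the rest become out-of-vocabulary.
    if max_vocab_count is not None:
        frequent = frequent[:max_vocab_count]
    return [v for v, _ in frequent]


def encode(values, vocabulary):
    """Replaces every value outside the vocabulary with the OOV item."""
    known = set(vocabulary)
    return [v if v in known else OOV for v in values]


colors = ["red"] * 5 + ["blue"] * 3 + ["green"] * 2 + ["teal"]
vocab = build_vocabulary(colors, max_vocab_count=2, min_vocab_frequency=2)
encoded = encode(colors, vocab)
```

In this toy dataset, "teal" is dropped by the frequency threshold and "green" by the size limit, so both end up in the out-of-vocabulary item.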
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.