Represents sparse feature where ids are set by hashing.

Used in the notebooks

Used in the guide Used in the tutorials

Use this when your sparse features are in string or integer format, and you want to distribute your inputs into a finite number of buckets by hashing. output_id = Hash(input_feature_string) % bucket_size for string type input. For int type input, the value is converted to its string representation first and then hashed by the same formula.

For input dictionary features, features[key] is either Tensor or SparseTensor. If Tensor, missing values can be represented by -1 for int and '' for string, which will be dropped by this feature column.


import tensorflow as tf
keywords = tf.feature_column.categorical_column_with_hash_bucket("keywords",
columns = [keywords]
features = {'keywords': tf.constant([['Tensorflow', 'Keras', 'RNN', 'LSTM',
'CNN'], ['LSTM', '