tf.contrib.feature_column.sequence_categorical_column_with_vocabulary_file
A sequence of categorical terms whose IDs are looked up in a vocabulary file.
tf.contrib.feature_column.sequence_categorical_column_with_vocabulary_file(
key, vocabulary_file, vocabulary_size=None, num_oov_buckets=0,
default_value=None, dtype=tf.dtypes.string
)
Pass this to embedding_column or indicator_column to convert sequence categorical data into a dense representation for input to a sequence neural network, such as an RNN.
Example:
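import tensorflow as tf  # TensorFlow 1.x, where tf.contrib is available

states = tf.contrib.feature_column.sequence_categorical_column_with_vocabulary_file(
    key='states', vocabulary_file='/us/states.txt', vocabulary_size=50,
    num_oov_buckets=5)
# Embed each vocabulary ID into a 10-dimensional dense vector.
states_embedding = tf.feature_column.embedding_column(states, dimension=10)
columns = [states_embedding]
# Parse serialized tf.Example protos using a spec derived from the columns.
features = tf.io.parse_example(
    ..., features=tf.feature_column.make_parse_example_spec(columns))
# input_layer is the dense sequence input; sequence_length holds per-example lengths.
input_layer, sequence_length = tf.contrib.feature_column.sequence_input_layer(
    features, columns)
# hidden_size (the number of RNN units) must be defined by the caller.
rnn_cell = tf.compat.v1.nn.rnn_cell.BasicRNNCell(hidden_size)
# dynamic_rnn requires dtype when no initial_state is given.
outputs, state = tf.compat.v1.nn.dynamic_rnn(
    rnn_cell, inputs=input_layer, sequence_length=sequence_length,
    dtype=tf.float32)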
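To make the two values returned by sequence_input_layer in the example more concrete, the following pure-Python sketch mimics their shapes, with one-hot rows standing in for the indicator_column case. It is an illustration only and does not call TensorFlow; dense_sequence_batch and its arguments are hypothetical names, and the sketch assumes that padding steps are filled with zeros.

```python
def dense_sequence_batch(id_sequences, num_buckets):
    """Pad variable-length ID sequences to [batch, max_len, num_buckets].

    Hypothetical helper: each time step becomes a one-hot row, and steps
    past the end of a sequence are all-zero padding (an assumption).
    """
    max_len = max((len(seq) for seq in id_sequences), default=0)
    # One length per example, analogous to the sequence_length output.
    sequence_length = [len(seq) for seq in id_sequences]
    batch = []
    for seq in id_sequences:
        rows = []
        for step in range(max_len):
            row = [0.0] * num_buckets
            if step < len(seq):
                row[seq[step]] = 1.0  # one-hot encode the ID at this step
            rows.append(row)
        batch.append(rows)
    return batch, sequence_length


# Two examples: the first has two terms, the second has one.
input_layer, sequence_length = dense_sequence_batch([[2, 0], [1]], num_buckets=3)
```

The per-example lengths are what allow a downstream RNN to ignore the padded steps, which is why the example passes sequence_length to dynamic_rnn.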
Args
key | A unique string identifying the input feature.
vocabulary_file | The vocabulary file name.
vocabulary_size | The number of elements in the vocabulary. This must be no greater than the length of vocabulary_file; if it is less than the length, later values are ignored. If None, it is set to the length of vocabulary_file.
num_oov_buckets | A non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs are assigned IDs in the range [vocabulary_size, vocabulary_size+num_oov_buckets) based on a hash of the input value. A positive num_oov_buckets cannot be specified with default_value.
default_value | The integer ID value to return for out-of-vocabulary feature values; defaults to -1. This cannot be specified with a positive num_oov_buckets.
dtype | The type of features. Only string and integer types are supported.
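The interaction between vocabulary_size, num_oov_buckets, and default_value described above can be sketched in plain Python. This is a rough illustration under stated assumptions, not TensorFlow's implementation: load_vocabulary and lookup_id are hypothetical helpers, the vocabulary is assumed to hold one term per line, zlib.crc32 is only a stand-in for whatever hash the library uses, and raising ValueError for an oversized vocabulary_size is a choice made for this sketch.

```python
import zlib


def load_vocabulary(lines, vocabulary_size=None):
    """Map each term to its line index, keeping the first vocabulary_size lines."""
    terms = [line.rstrip("\n") for line in lines]
    if vocabulary_size is None:
        vocabulary_size = len(terms)  # default: use the whole file
    if vocabulary_size > len(terms):
        # The text says this "must be no greater than" the file length;
        # the exception type here is an assumption of this sketch.
        raise ValueError("vocabulary_size exceeds the number of vocabulary lines")
    # Terms after the first vocabulary_size lines are ignored.
    return {term: index for index, term in enumerate(terms[:vocabulary_size])}


def lookup_id(term, vocab, num_oov_buckets=0, default_value=None):
    """Return the ID for term, following the documented OOV rules."""
    if num_oov_buckets > 0 and default_value is not None:
        raise ValueError("num_oov_buckets and default_value are mutually exclusive")
    if term in vocab:
        return vocab[term]
    if num_oov_buckets > 0:
        # Stand-in hash: maps the term into
        # [vocabulary_size, vocabulary_size + num_oov_buckets).
        bucket = zlib.crc32(term.encode("utf-8")) % num_oov_buckets
        return len(vocab) + bucket
    return -1 if default_value is None else default_value


# 'TX' sits on the third line, so vocabulary_size=2 drops it from the vocabulary.
vocab = load_vocabulary(["CA\n", "NY\n", "TX\n"], vocabulary_size=2)
in_vocab_id = lookup_id("NY", vocab)
default_id = lookup_id("TX", vocab)
bucketed_id = lookup_id("TX", vocab, num_oov_buckets=5)
```

With no OOV buckets, the truncated term falls back to the default of -1; with five buckets, it lands somewhere in the range [2, 7).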
Returns
A _SequenceCategoricalColumn.
Raises
ValueError | vocabulary_file is missing or cannot be opened.
ValueError | vocabulary_size is missing or < 1.
ValueError | num_oov_buckets is a negative integer.
ValueError | num_oov_buckets and default_value are both specified.
ValueError | dtype is neither string nor integer.
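The conditions listed above can be mirrored in a small standalone check, which may be useful for validating configuration before building the column. This is a minimal sketch and not the library's own code: validate_vocabulary_args is a hypothetical helper, the order of the checks is an assumption, and the Python types str and int stand in for the string and integer dtypes.

```python
def validate_vocabulary_args(vocabulary_file, vocabulary_size=None,
                             num_oov_buckets=0, default_value=None, dtype=str):
    """Raise ValueError for the argument combinations listed under Raises."""
    if not vocabulary_file:
        raise ValueError("vocabulary_file is missing")
    if vocabulary_size is None:
        # The size must be inferred from the file, so it has to be readable.
        try:
            with open(vocabulary_file) as f:
                vocabulary_size = sum(1 for _ in f)
        except OSError as err:
            raise ValueError(f"vocabulary_file cannot be opened: {err}") from err
    if vocabulary_size < 1:
        raise ValueError("vocabulary_size must be at least 1")
    if num_oov_buckets < 0:
        raise ValueError("num_oov_buckets must be non-negative")
    if num_oov_buckets > 0 and default_value is not None:
        raise ValueError("num_oov_buckets and default_value are mutually exclusive")
    if dtype not in (str, int):  # stand-ins for string and integer dtypes
        raise ValueError("dtype must be string or integer")
    return vocabulary_size


# 'states.txt' is a placeholder name; it is not opened because the size is given.
size = validate_vocabulary_args("states.txt", vocabulary_size=50, num_oov_buckets=5)
```

Passing vocabulary_size explicitly skips the file read in this sketch, so the remaining checks can be exercised without a vocabulary file on disk.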
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2020-10-01 UTC.