tf.keras.preprocessing.text.one_hot

tf.keras.preprocessing.text.one_hot(
    text,
    n,
    filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
    lower=True,
    split=' '
)

Defined in tensorflow/python/keras/preprocessing/text.py.

One-hot encodes a text into a list of word indexes, for a vocabulary of size n.

This is a wrapper around the hashing_trick function that uses hash as the hashing function; uniqueness of the word-to-index mapping is not guaranteed.

Arguments:

  • text: Input text (string).
  • n: int, size of vocabulary.
  • filters: list (or concatenation) of characters to filter out, such as punctuation. Default: '!"#$%&()*+,-./:;<=>?@[\]^_`{|}~\t\n', which includes basic punctuation, tabs, and newlines.
  • lower: boolean, whether to set the text to lowercase.
  • split: string, separator for word splitting.
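
The filters, lower, and split arguments act on the text before any word is hashed. The following is a minimal sketch of that preprocessing in plain Python. It assumes the text is lowercased first, each filter character is then replaced by the separator, and empty tokens are dropped. The name tokenize_sketch is hypothetical and not part of the library.

```python
# Same default filter characters as in the signature above.
DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def tokenize_sketch(text, filters=DEFAULT_FILTERS, lower=True, split=" "):
    """Hypothetical helper: split text into words the way the arguments describe."""
    if lower:
        text = text.lower()
    # Replace every filtered character with the separator.
    table = str.maketrans({ch: split for ch in filters})
    text = text.translate(table)
    # Split on the separator and drop the empty strings left by runs of separators.
    return [word for word in text.split(split) if word]


tokens = tokenize_sketch("Hello, world! Hello")
print(tokens)
```

With the defaults, punctuation is removed and case is ignored, so "Hello," and "Hello" become the same token.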

Returns:

List of integers in [1, n]. Each integer encodes a word (uniqueness is not guaranteed).
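
To see why uniqueness is not guaranteed, consider this minimal sketch of hashing words into a fixed number of slots. It assumes each word is mapped with hash(word) % (n - 1) + 1, which keeps every value inside the documented range and never produces 0. The name one_hot_sketch is hypothetical, the sketch takes an already tokenized list of words, and it requires n >= 2.

```python
def one_hot_sketch(words, n):
    """Hypothetical helper: map each word to an integer index via built-in hash.

    Assumption: index = hash(word) % (n - 1) + 1, so 0 is never produced.
    """
    return [hash(word) % (n - 1) + 1 for word in words]


words = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
indices = one_hot_sketch(words, n=5)

# Every index stays inside the documented range.
assert all(1 <= i <= 5 for i in indices)
# Within one process, a repeated word always gets the same index.
assert indices[0] == indices[6]
# Eight distinct words cannot be spread over so few slots without collisions.
assert len(set(indices)) < len(set(words))
```

Python salts string hashes per process unless PYTHONHASHSEED is set, so the concrete indexes can differ between runs. Choosing n well above the number of distinct words reduces collisions but does not rule them out.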