tft.ngrams(tokens, ngram_range, separator, name=None)
Create a SparseTensor of n-grams.
Given a SparseTensor of tokens, returns a SparseTensor containing the
ngrams that can be constructed from each row.
separator is inserted between each pair of tokens, so " " would be an
appropriate choice if the tokens are words, while "" would be an appropriate
choice if they are characters.
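The effect of the separator can be illustrated with a small pure-Python sketch. The helper `bigrams` below is hypothetical and is not part of tft; it only joins adjacent token pairs the way the description above suggests.

```python
def bigrams(tokens, separator):
    """Join each adjacent pair of tokens with the given separator."""
    return [separator.join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


# Word tokens: a space keeps the words readable.
word_bigrams = bigrams(["Two", "was", "a", "rat"], " ")

# Character tokens: an empty separator glues the characters back together.
char_bigrams = bigrams(list("rat"), "")

print(word_bigrams)  # ['Two was', 'was a', 'a rat']
print(char_bigrams)  # ['ra', 'at']
```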
tokens is a SparseTensor with
indices = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [1, 3]]
values = ['One', 'was', 'Johnny', 'Two', 'was', 'a', 'rat']
dense_shape = [2, 4]
If we set
ngram_range = (1, 3)
separator = ' '
output is a SparseTensor with
indices = [[0, 0], [0, 1], [0, 2], ..., [1, 6], [1, 7], [1, 8]]
values = ['One', 'One was', 'One was Johnny', 'was', 'was Johnny', 'Johnny', 'Two', 'Two was', 'Two was a', 'was', 'was a', 'was a rat', 'a', 'a rat', 'rat']
dense_shape = [2, 9]
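The worked example can be reproduced without TensorFlow. The following is a minimal sketch that assumes the ordering shown in the example (for each start position, n-grams of increasing size). The helper `ngrams_for_row` is hypothetical, and the sketch is not the library's implementation.

```python
def ngrams_for_row(tokens, ngram_range, separator):
    """Return the n-grams of one row, ordered by start position, then size."""
    min_size, max_size = ngram_range  # both bounds are inclusive
    result = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(tokens):
                break  # larger sizes would run past the end of the row
            result.append(separator.join(tokens[start:end]))
    return result


# The two rows of the sparse input from the example, written as plain lists.
rows = [["One", "was", "Johnny"], ["Two", "was", "a", "rat"]]
ngram_rows = [ngrams_for_row(row, (1, 3), " ") for row in rows]

# The dense shape is (number of rows, length of the longest output row).
dense_shape = [len(ngram_rows), max(len(row) for row in ngram_rows)]

print(ngram_rows[0])
print(ngram_rows[1])
print(dense_shape)  # [2, 9]
```

The first row yields 6 n-grams and the second yields 9, which is why the second dimension of the output shape is 9.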
tokens: a two-dimensional SparseTensor of dtype tf.string containing tokens that will be used to construct ngrams.
ngram_range: A pair with the range (inclusive) of ngram sizes to return.
separator: a string that will be inserted between tokens when ngrams are constructed.
name: (Optional) A name for this operation.
Returns a SparseTensor containing all ngrams from each row of the input. Note:
if an ngram appears multiple times in the input row, it will be present the
same number of times in the output. For unique ngrams, see tft.bag_of_words.
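The note about repeated n-grams can be made concrete with a short pure-Python sketch. The helper `row_ngrams` and the order-preserving de-duplication are illustrative assumptions; they show the difference between keeping and dropping duplicates and do not describe how tft.bag_of_words is implemented.

```python
from collections import Counter


def row_ngrams(tokens, ngram_range, separator):
    """Hypothetical helper: all n-grams of one row, duplicates kept."""
    min_size, max_size = ngram_range
    return [
        separator.join(tokens[start:start + size])
        for start in range(len(tokens))
        for size in range(min_size, max_size + 1)
        if start + size <= len(tokens)
    ]


row = ["a", "b", "a", "b"]
all_ngrams = row_ngrams(row, (1, 2), " ")

# Each n-gram appears as often as it occurs in the input row.
counts = Counter(all_ngrams)

# De-duplicated view, keeping first-seen order (illustration only).
unique_ngrams = list(dict.fromkeys(all_ngrams))

print(all_ngrams)     # ['a', 'a b', 'b', 'b a', 'a', 'a b', 'b']
print(counts["a b"])  # 2
print(unique_ngrams)  # ['a', 'a b', 'b', 'b a']
```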
ValueError: if ngram_range[0] < 1 or ngram_range[1] < ngram_range[0]
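A minimal sketch of this validation, reading the condition as "the lower bound is below 1, or the upper bound is below the lower bound". The function name `check_ngram_range` and the error message are assumptions made for illustration, not the library's code.

```python
def check_ngram_range(ngram_range):
    """Validate an inclusive (min_size, max_size) pair and return it."""
    min_size, max_size = ngram_range
    if min_size < 1 or max_size < min_size:
        # Hypothetical message; the wording used by the library may differ.
        raise ValueError(f"Invalid ngram_range: {ngram_range!r}")
    return min_size, max_size


valid = check_ngram_range((1, 3))

rejected = []
for candidate in [(0, 2), (3, 2)]:
    try:
        check_ngram_range(candidate)
    except ValueError:
        rejected.append(candidate)

print(valid)     # (1, 3)
print(rejected)  # [(0, 2), (3, 2)]
```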