tft.ngrams

tft.ngrams(
    tokens,
    ngram_range,
    separator,
    name=None
)

Create a SparseTensor of n-grams.

Given a SparseTensor of tokens, returns a SparseTensor containing the ngrams that can be constructed from each row.

separator is inserted between each pair of tokens, so " " would be an appropriate choice if the tokens are words, while "" would be an appropriate choice if they are characters.

Example:

tokens is a SparseTensor with

indices = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [1, 3]] values = ['One', 'was', 'Johnny', 'Two', 'was', 'a', 'rat'] dense_shape = [2, 4]

If we set ngrams_range = (1,3) separator = ' '

output is a SparseTensor with

indices = [[0, 0], [0, 1], [0, 2], ..., [1, 6], [1, 7], [1, 8]] values = ['One', 'One was', 'One was Johnny', 'was', 'was Johnny', 'Johnny', 'Two', 'Two was', 'Two was a', 'was', 'was a', 'was a rat', 'a', 'a rat', 'rat'] dense_shape = [2, 9]

Args:

  • tokens: a two-dimensionalSparseTensor of dtype tf.string containing tokens that will be used to construct ngrams.
  • ngram_range: A pair with the range (inclusive) of ngram sizes to return.
  • separator: a string that will be inserted between tokens when ngrams are constructed.
  • name: (Optional) A name for this operation.

Returns:

A SparseTensor containing all ngrams from each row of the input.

Raises:

  • ValueError: if ngram_range[0] < 1 or ngram_range[1] < ngram_range[0]