text.ngrams

Create a tensor of n-grams from the input tensor data.

Creates a tensor of n-grams based on data. Each n-gram is formed by taking a window of width adjacent elements of data along the given axis and combining them using reduction_type. This op is intended to cover basic use cases; more complex combinations can be created using the sliding_window op.

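import tensorflow as tf
import tensorflow_text as text

# Two ragged rows of string tokens.
input_data = tf.ragged.constant([["e", "f", "g"], ["dd", "ee"]])
# Join each window of two adjacent tokens with "|".
text.ngrams(
  input_data,
  width=2,
  axis=-1,
  reduction_type=text.Reduction.STRING_JOIN,
  string_separator="|")
# <tf.RaggedTensor [[b'e|f', b'f|g'], [b'dd|ee']]>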
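The windowing behaviour described above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the library's implementation: the helper name string_join_ngrams is hypothetical, and it assumes that a row shorter than width yields no n-grams.

```python
from typing import List


def string_join_ngrams(rows: List[List[str]], width: int,
                       separator: str = " ") -> List[List[str]]:
    """Join every window of `width` adjacent strings in each row.

    Hypothetical helper that mimics STRING_JOIN along the last axis.
    """
    if width < 1:
        raise ValueError("width must be a positive integer")
    result = []
    for row in rows:
        # A row of length n has n - width + 1 windows; range() is empty
        # when that count is zero or negative, so short rows yield [].
        windows = [separator.join(row[i:i + width])
                   for i in range(len(row) - width + 1)]
        result.append(windows)
    return result


rows = [["e", "f", "g"], ["dd", "ee"]]
bigrams = string_join_ngrams(rows, width=2, separator="|")
trigrams = string_join_ngrams(rows, width=3, separator="|")
print(bigrams)   # [['e|f', 'f|g'], ['dd|ee']]
print(trigrams)  # [['e|f|g'], []]
```

With width=2 the sketch produces the same groupings as the example output; with width=3 the second row is too short to fill a window and comes back empty.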

Args

data The data to reduce.
width The width of the ngram window. If there is not sufficient data to fill out the ngram window, the resulting ngram will be empty.
axis The axis to create ngrams along. Note that for string join reductions, only axis '-1' is supported; for other reductions, any positive or negative axis can be used. Should be a constant.
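reduction_type A member of the Reduction enum. Should be a constant. Currently supports:
  Reduction.SUM: Add values in the window.
  Reduction.MEAN: Average values in the window.
  Reduction.STRING_JOIN: Join strings in the window. Note that axis must be -1 here.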

string_separator The separator string used for Reduction.STRING_JOIN. Ignored otherwise. Must be a string constant, not a Tensor.
name The op name.
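Besides STRING_JOIN, the Reduction enum in TensorFlow Text offers numeric reductions (SUM and MEAN). A rough pure-Python sketch of how a numeric reduction over a sliding window behaves along a single axis is shown here. The WindowReduction enum and numeric_ngrams function are hypothetical stand-ins for illustration, not the library's API.

```python
from enum import Enum
from typing import List


class WindowReduction(Enum):
    """Hypothetical stand-in for the numeric members of the Reduction enum."""
    SUM = "sum"
    MEAN = "mean"


def numeric_ngrams(values: List[float], width: int,
                   reduction: WindowReduction) -> List[float]:
    """Reduce every window of `width` adjacent numbers to a single value."""
    if width < 1:
        raise ValueError("width must be a positive integer")
    out = []
    for i in range(len(values) - width + 1):
        total = sum(values[i:i + width])
        # SUM keeps the window total; MEAN divides it by the window width.
        out.append(total if reduction is WindowReduction.SUM else total / width)
    return out


sums = numeric_ngrams([1.0, 2.0, 3.0, 4.0], 2, WindowReduction.SUM)
means = numeric_ngrams([1.0, 2.0, 3.0, 4.0], 2, WindowReduction.MEAN)
print(sums)   # [3.0, 5.0, 7.0]
print(means)  # [1.5, 2.5, 3.5]
```

As with the string case, an input shorter than width leaves no complete window, so the sketch returns an empty list.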

Returns

A tensor of ngrams. If the input is a tf.Tensor, the output will also be a tf.Tensor; if the input is a tf.RaggedTensor, the output will be a tf.RaggedTensor.

Raises

InvalidArgumentError If reduction_type is either None or not a Reduction, or if reduction_type is STRING_JOIN and axis is not -1.
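The argument rules above can be summarised as a small validation routine. This is only a sketch of the stated conditions: WindowReduction and check_ngram_args are hypothetical names, and ValueError is used as a stand-in for the InvalidArgumentError the op raises.

```python
from enum import Enum


class WindowReduction(Enum):
    """Hypothetical stand-in for the Reduction enum."""
    SUM = 1
    MEAN = 2
    STRING_JOIN = 3


def check_ngram_args(reduction_type, axis: int) -> None:
    """Raise ValueError for the combinations documented as invalid."""
    if not isinstance(reduction_type, WindowReduction):
        # Covers None as well as values of any other type.
        raise ValueError("reduction_type must be a reduction enum member")
    if reduction_type is WindowReduction.STRING_JOIN and axis != -1:
        raise ValueError("STRING_JOIN is only supported along axis -1")


def is_rejected(reduction_type, axis: int) -> bool:
    """Return True if the arguments fail validation."""
    try:
        check_ngram_args(reduction_type, axis)
    except ValueError:
        return True
    return False


print(is_rejected(None, -1))                         # True
print(is_rejected(WindowReduction.STRING_JOIN, 0))   # True
print(is_rejected(WindowReduction.STRING_JOIN, -1))  # False
print(is_rejected(WindowReduction.MEAN, 1))          # False
```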