tf.raw_ops.StringNGrams

Creates ngrams from ragged string data.

tf.raw_ops.StringNGrams(
    data, data_splits, separator, ngram_widths, left_pad, right_pad, pad_width,
    preserve_short_sequences, name=None
)

This op accepts a ragged tensor with 1 ragged dimension containing only strings and outputs a ragged tensor with 1 ragged dimension containing ngrams of that string, joined along the innermost axis.

Args:

  • data: A Tensor of type string. The values tensor of the ragged string tensor to make ngrams out of. Must be a 1D string tensor.
  • data_splits: A Tensor. Must be one of the following types: int32, int64. The splits tensor of the ragged string tensor to make ngrams out of.
  • separator: A string. The string to append between elements of the token. Use "" for no separator.
  • ngram_widths: A list of ints. The sizes of the ngrams to create.
  • left_pad: A string. The string to use to pad the left side of the ngram sequence. Only used if pad_width != 0.
  • right_pad: A string. The string to use to pad the right side of the ngram sequence. Only used if pad_width != 0.
  • pad_width: An int. The number of padding elements to add to each side of each sequence. Note that padding will never be greater than 'ngram_widths'-1 regardless of this value. If pad_width=-1, then add max(ngram_widths)-1 elements.
  • preserve_short_sequences: A bool.
  • name: A name for the operation (optional).

Returns:

A tuple of Tensor objects (ngrams, ngrams_splits).

  • ngrams: A Tensor of type string.
  • ngrams_splits: A Tensor. Has the same type as data_splits.