tf.strings.substr

Aliases:

  • tf.strings.substr
  • tf.substr

tf.strings.substr(
    input,
    pos,
    len,
    name=None
)

Defined in generated file: tensorflow/python/ops/gen_string_ops.py.

Return substrings from Tensor of strings.

For each string in the input Tensor, creates a substring starting at index pos with a total length of len.

If len defines a substring that would extend beyond the length of the input string, then as many characters as possible are used.

If pos is negative or specifies a character index larger than the length of any of the input strings, then an InvalidArgumentError is thrown.
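The per-string rule described above can be modeled in plain Python. This is only an illustrative sketch, not TensorFlow's implementation: substr_one is a hypothetical helper, Python's ValueError stands in for InvalidArgumentError, and accepting a pos equal to the string length is an assumption.

```python
def substr_one(s, pos, length):
    """Model of the substring rule for one byte string (hypothetical helper).

    Assumes length is non-negative.
    """
    # A negative pos, or a pos past the end of the string, is rejected.
    # ValueError stands in for TensorFlow's InvalidArgumentError here.
    # Assumption: pos == len(s) is accepted and yields an empty result.
    if pos < 0 or pos > len(s):
        raise ValueError("pos %d out of range for string %r" % (pos, s))
    # Slicing clamps at the end of the string, so a length that runs past
    # the end simply returns as many characters as are available.
    return s[pos:pos + length]


print(substr_one(b'Hello', 1, 3))    # b'ell'
print(substr_one(b'twelve', 3, 4))   # b'lve' (only three characters remain)
```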

pos and len must have the same shape, otherwise a ValueError is thrown on Op creation.

NOTE: strings.substr supports broadcasting up to two dimensions. For more about broadcasting, see the NumPy broadcasting documentation.


Examples

Using scalar pos and len:

input = [b'Hello', b'World']
position = 1
length = 3

output = [b'ell', b'orl']

Using pos and len with same shape as input:

input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen']]
position = [[1, 2, 3],
            [1, 2, 3],
            [1, 2, 3]]
length =   [[2, 3, 4],
            [4, 3, 2],
            [5, 5, 5]]

output = [[b'en', b'eve', b'lve'],
          [b'hirt', b'urt', b'te'],
          [b'ixtee', b'vente', b'hteen']]

Broadcasting pos and len onto input:

input = [[b'ten', b'eleven', b'twelve'],
         [b'thirteen', b'fourteen', b'fifteen'],
         [b'sixteen', b'seventeen', b'eighteen'],
         [b'nineteen', b'twenty', b'twentyone']]
position = [1, 2, 3]
length =   [1, 2, 3]

output = [[b'e', b'ev', b'lve'],
          [b'h', b'ur', b'tee'],
          [b'i', b've', b'hte'],
          [b'i', b'en', b'nty']]
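
One way to read this case is that the 1-D position and length lists are reused for every row of the 2-D input. The plain-Python sketch below models that reading and reproduces the output shown above. It does not call TensorFlow, and the helper name substr_rows is hypothetical.

```python
def substr_rows(rows, position, length):
    """Apply per-column pos/len to every row of a 2-D list of byte strings."""
    return [
        # zip pairs each string in the row with its column's pos and len.
        [s[p:p + n] for s, p, n in zip(row, position, length)]
        for row in rows
    ]


rows = [[b'ten', b'eleven', b'twelve'],
        [b'thirteen', b'fourteen', b'fifteen'],
        [b'sixteen', b'seventeen', b'eighteen'],
        [b'nineteen', b'twenty', b'twentyone']]

output = substr_rows(rows, [1, 2, 3], [1, 2, 3])
for row in output:
    print(row)
```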

Broadcasting input onto pos and len:

input = b'thirteen'
position = [1, 5, 7]
length =   [3, 2, 1]

output = [b'hir', b'ee', b'n']
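
In the opposite direction, a single input string is paired with every (pos, len) entry. A minimal plain-Python model of this example, again a sketch of the documented result and not a TensorFlow call:

```python
text = b'thirteen'
position = [1, 5, 7]
length = [3, 2, 1]

# The one input string is reused for each (pos, len) pair.
output = [text[p:p + n] for p, n in zip(position, length)]
print(output)  # [b'hir', b'ee', b'n']
```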

Args:

  • input: A Tensor of type string. Tensor of strings.
  • pos: A Tensor. Must be one of the following types: int32, int64. Scalar (or Tensor, as in the examples above) defining the position of the first character in each substring.
  • len: A Tensor. Must have the same type as pos. Scalar (or Tensor, as in the examples above) defining the number of characters to include in each substring.
  • name: A name for the operation (optional).

Returns:

A Tensor of type string.