tf.RaggedTensor

Represents a ragged tensor.

tf.RaggedTensor(
    values, row_splits, cached_row_lengths=None, cached_value_rowids=None,
    cached_nrows=None, internal=False, uniform_row_length=None
)

A RaggedTensor is a tensor with one or more ragged dimensions, which are dimensions whose slices may have different lengths. For example, the inner (column) dimension of rt=[[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged, since the column slices (rt[0, :], ..., rt[4, :]) have different lengths. Dimensions whose slices all have the same length are called uniform dimensions. The outermost dimension of a RaggedTensor is always uniform, since it consists of a single slice (and so there is no possibility for differing slice lengths).

The total number of dimensions in a RaggedTensor is called its rank, and the number of ragged dimensions in a RaggedTensor is called its ragged-rank. A RaggedTensor's ragged-rank is fixed at graph creation time: it can't depend on the runtime values of Tensors, and can't vary dynamically for different session runs.

Potentially Ragged Tensors

Many ops support both Tensors and RaggedTensors. The term "potentially ragged tensor" may be used to refer to a tensor that might be either a Tensor or a RaggedTensor. The ragged-rank of a Tensor is zero.

Documenting RaggedTensor Shapes

When documenting the shape of a RaggedTensor, ragged dimensions can be indicated by enclosing them in parentheses. For example, the shape of a 3-D RaggedTensor that stores the fixed-size word embedding for each word in a sentence, for each sentence in a batch, could be written as [num_sentences, (num_words), embedding_size]. The parentheses around (num_words) indicate that dimension is ragged, and that the length of each element list in that dimension may vary for each item.

Component Tensors

Internally, a RaggedTensor consists of a concatenated list of values that are partitioned into variable-length rows. In particular, each RaggedTensor consists of:

  • A values tensor, which concatenates the variable-length rows into a flattened list. For example, the values tensor for [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is [3, 1, 4, 1, 5, 9, 2, 6].

  • A row_splits vector, which indicates how those flattened values are divided into rows. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

Example:

print(tf.RaggedTensor.from_row_splits( 
      values=[3, 1, 4, 1, 5, 9, 2, 6], 
      row_splits=[0, 4, 4, 7, 8, 8])) 
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]> 
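The slicing rule for row_splits can be mimicked with plain Python lists. The following is a minimal sketch of the semantics only, not TensorFlow's implementation; rows_from_splits is a hypothetical helper name introduced here for illustration.

```python
def rows_from_splits(values, row_splits):
    """Partition a flat list into rows using row_splits (hypothetical helper)."""
    # Row i is the slice values[row_splits[i]:row_splits[i + 1]].
    return [values[start:stop]
            for start, stop in zip(row_splits[:-1], row_splits[1:])]


values = [3, 1, 4, 1, 5, 9, 2, 6]
row_splits = [0, 4, 4, 7, 8, 8]
rows = rows_from_splits(values, row_splits)
print(rows)  # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
```

Repeated entries in row_splits (here 4, 4 and 8, 8) are what produce the empty rows.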

Alternative Row-Partitioning Schemes

In addition to row_splits, ragged tensors provide support for five other row-partitioning schemes:

  • row_lengths: a vector with shape [nrows], which specifies the length of each row.

  • value_rowids and nrows: value_rowids is a vector with shape [nvals], corresponding one-to-one with values, which specifies each value's row index. In particular, the row rt[row] consists of the values rt.values[j] where value_rowids[j]==row. nrows is an integer scalar that specifies the number of rows in the RaggedTensor. (nrows is used to indicate trailing empty rows.)

  • row_starts: a vector with shape [nrows], which specifies the start offset of each row. Equivalent to row_splits[:-1].

  • row_limits: a vector with shape [nrows], which specifies the stop offset of each row. Equivalent to row_splits[1:].

  • uniform_row_length: A scalar tensor, specifying the length of every row. This row-partitioning scheme may only be used if all rows have the same length.

Example: The following ragged tensors are equivalent, and all represent the nested list [[3, 1, 4, 1], [], [5, 9, 2], [6], []].

values = [3, 1, 4, 1, 5, 9, 2, 6] 
rt1 = tf.RaggedTensor.from_row_splits(values, row_splits=[0, 4, 4, 7, 8, 8])
rt2 = tf.RaggedTensor.from_row_lengths(values, row_lengths=[4, 0, 3, 1, 0])
rt3 = tf.RaggedTensor.from_value_rowids(
    values, value_rowids=[0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
rt4 = tf.RaggedTensor.from_row_starts(values, row_starts=[0, 4, 4, 7, 8])
rt5 = tf.RaggedTensor.from_row_limits(values, row_limits=[4, 4, 7, 8, 8])
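To see why these encodings are interchangeable, each one can be converted to the same row_splits vector. This is a plain-Python sketch under the definitions given in the list above; the helper names are hypothetical and are not part of the TensorFlow API.

```python
from itertools import accumulate


def splits_from_lengths(row_lengths):
    # row_splits is the running total of row_lengths, with a leading 0.
    return [0] + list(accumulate(row_lengths))


def splits_from_value_rowids(value_rowids, nrows):
    # Count the values in each row, then reuse the row_lengths conversion.
    # nrows is needed because trailing empty rows never appear in value_rowids.
    lengths = [0] * nrows
    for row in value_rowids:
        lengths[row] += 1
    return splits_from_lengths(lengths)


def splits_from_starts(row_starts, nvals):
    # row_starts equals row_splits[:-1], so append the total value count.
    return list(row_starts) + [nvals]


def splits_from_limits(row_limits):
    # row_limits equals row_splits[1:], so prepend 0.
    return [0] + list(row_limits)


expected = [0, 4, 4, 7, 8, 8]
results = [
    splits_from_lengths([4, 0, 3, 1, 0]),
    splits_from_value_rowids([0, 0, 0, 0, 2, 2, 2, 3], nrows=5),
    splits_from_starts([0, 4, 4, 7, 8], nvals=8),
    splits_from_limits([4, 4, 7, 8, 8]),
]
print(all(r == expected for r in results))  # True
```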

Multiple Ragged Dimensions

RaggedTensors with multiple ragged dimensions can be defined by using a nested RaggedTensor for the values tensor. Each nested RaggedTensor adds a single ragged dimension.

inner_rt = tf.RaggedTensor.from_row_splits(  # =rt1 from above
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])
outer_rt = tf.RaggedTensor.from_row_splits(
    values=inner_rt, row_splits=[0, 3, 3, 5])
print(outer_rt.to_list()) 
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]] 
print(outer_rt.ragged_rank) 
2 

The factory function RaggedTensor.from_nested_row_splits may be used to construct a RaggedTensor with multiple ragged dimensions directly, by providing a list of row_splits tensors:

tf.RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])).to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]] 
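The nesting can be reproduced with plain Python lists by applying the splits from innermost to outermost. This is only an illustrative sketch of the semantics; nest_rows and rows_from_splits are hypothetical helpers, not TensorFlow functions.

```python
def rows_from_splits(values, row_splits):
    # Row i is the slice values[row_splits[i]:row_splits[i + 1]].
    return [values[a:b] for a, b in zip(row_splits[:-1], row_splits[1:])]


def nest_rows(flat_values, nested_row_splits):
    # nested_row_splits is ordered outermost to innermost, so the innermost
    # partition is applied first and the outermost last.
    result = list(flat_values)
    for splits in reversed(nested_row_splits):
        result = rows_from_splits(result, splits)
    return result


nested = nest_rows(
    [3, 1, 4, 1, 5, 9, 2, 6],
    ([0, 3, 3, 5], [0, 4, 4, 7, 8, 8]))
print(nested)  # [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
```

Each splits vector adds exactly one level of nesting, which mirrors the statement that each nested RaggedTensor adds a single ragged dimension.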

Uniform Inner Dimensions

RaggedTensors with uniform inner dimensions can be defined by using a multidimensional Tensor for values.

rt = tf.RaggedTensor.from_row_splits(values=tf.ones([5, 3], tf.int32),
                                     row_splits=[0, 2, 5])
print(rt.to_list()) 
[[[1, 1, 1], [1, 1, 1]], 
 [[1, 1, 1], [1, 1, 1], [1, 1, 1]]] 
print(rt.shape) 
(2, None, 3) 

Uniform Outer Dimensions

RaggedTensors with uniform outer dimensions can be defined by using one or more RaggedTensors with a uniform_row_length row-partitioning tensor. For example, a RaggedTensor with shape [2, 2, None] can be constructed with this method from a RaggedTensor values with shape [4, None]:

values = tf.ragged.constant([[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]) 
print(values.shape) 
(4, None) 
rt6 = tf.RaggedTensor.from_uniform_row_length(values, 2) 
print(rt6) 
<tf.RaggedTensor [[[1, 2, 3], [4]], [[5, 6], [7, 8, 9, 10]]]> 
print(rt6.shape) 
(2, 2, None) 

Note that rt6 only contains one ragged dimension (the innermost dimension). In contrast, if from_row_splits is used to construct a similar RaggedTensor, then that RaggedTensor will have two ragged dimensions:

rt7 = tf.RaggedTensor.from_row_splits(values, [0, 2, 4]) 
print(rt7.shape) 
(2, None, None) 
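The two constructions above partition the four rows of values identically; they differ only in what is known statically about the row length. A plain-Python sketch of the row_splits implied by a uniform row length follows. The helper name is hypothetical, and the sketch simplifies by requiring a positive row length.

```python
def splits_from_uniform_row_length(uniform_row_length, nvals):
    """Return the row_splits implied by one fixed row length (hypothetical helper)."""
    if uniform_row_length <= 0 or nvals % uniform_row_length != 0:
        raise ValueError(
            "nvals must be a multiple of a positive uniform_row_length")
    # Every row has the same length, so the splits are evenly spaced.
    return list(range(0, nvals + 1, uniform_row_length))


uniform_splits = splits_from_uniform_row_length(2, nvals=4)
print(uniform_splits)  # [0, 2, 4]
```

The result matches the row_splits passed to from_row_splits for rt7, which is why rt6 and rt7 hold the same data even though their static shapes differ.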

Uniform and ragged outer dimensions may be interleaved, meaning that a tensor with any combination of ragged and uniform dimensions may be created. For example, a RaggedTensor t4 with shape [3, None, 4, 8, None, 2] could be constructed as follows:

t0 = tf.zeros([1000, 2])                           # Shape:         [1000, 2]
t1 = RaggedTensor.from_row_lengths(t0, [...])      #           [160, None, 2]
t2 = RaggedTensor.from_uniform_row_length(t1, 8)   #         [20, 8, None, 2]
t3 = RaggedTensor.from_uniform_row_length(t2, 4)   #       [5, 4, 8, None, 2]
t4 = RaggedTensor.from_row_lengths(t3, [...])      # [3, None, 4, 8, None, 2]
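The shape comments above can be checked with a small shape calculator. This is a plain-Python sketch that tracks only static shapes; the helper names are hypothetical, and the row counts 160 and 3 are taken from the shape comments rather than from the elided row_lengths arguments.

```python
def shape_after_row_lengths(shape, nrows):
    # Partitioning the outer dimension into nrows variable-length rows adds
    # a ragged dimension (None) directly beneath a new outer dimension.
    return [nrows, None] + shape[1:]


def shape_after_uniform_row_length(shape, row_length):
    # Partitioning into equal rows requires the outer size to divide evenly.
    if shape[0] % row_length != 0:
        raise ValueError("outer dimension is not a multiple of row_length")
    return [shape[0] // row_length, row_length] + shape[1:]


s0 = [1000, 2]
s1 = shape_after_row_lengths(s0, 160)
s2 = shape_after_uniform_row_length(s1, 8)
s3 = shape_after_uniform_row_length(s2, 4)
s4 = shape_after_row_lengths(s3, 3)
print(s4)  # [3, None, 4, 8, None, 2]
```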

Args:

  • values: A potentially ragged tensor of any dtype and shape [nvals, ...].
  • row_splits: A 1-D integer tensor with shape [nrows+1].
  • cached_row_lengths: A 1-D integer tensor with shape [nrows].
  • cached_value_rowids: A 1-D integer tensor with shape [nvals].
  • cached_nrows: An integer scalar tensor.
  • internal: True if the constructor is being called by one of the factory methods. If false, an exception will be raised.
  • uniform_row_length: A scalar tensor.

Attributes:

  • dtype: The DType of values in this tensor.
  • flat_values: The innermost values tensor for this ragged tensor.

    Concretely, if rt.values is a Tensor, then rt.flat_values is rt.values; otherwise, rt.flat_values is rt.values.flat_values.

    Conceptually, flat_values is the tensor formed by flattening the outermost dimension and all of the ragged dimensions into a single dimension.

    rt.flat_values.shape = [nvals] + rt.shape[rt.ragged_rank + 1:] (where nvals is the number of items in the flattened dimensions).

    Example:

  rt = tf.ragged.constant([[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]) 
  print(rt.flat_values) 
    tf.Tensor([3 1 4 1 5 9 2 6], shape=(8,), dtype=int32) 
     
  • nested_row_splits: A tuple containing the row_splits for all ragged dimensions.

    rt.nested_row_splits is a tuple containing the row_splits tensors for all ragged dimensions in rt, ordered from outermost to innermost. In particular, rt.nested_row_splits = (rt.row_splits,) + value_splits where:

    • value_splits = () if rt.values is a Tensor.
    • value_splits = rt.values.nested_row_splits otherwise.

    Example:

  rt = tf.ragged.constant( 
      [[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]]) 
  for i, splits in enumerate(rt.nested_row_splits): 
    print('Splits for dimension %d: %s' % (i+1, splits.numpy())) 
    Splits for dimension 1: [0 3] 
    Splits for dimension 2: [0 3 3 5] 
    Splits for dimension 3: [0 4 4 7 8 8] 
     
  • ragged_rank: The number of ragged dimensions in this ragged tensor.

  • row_splits: The row-split indices for this ragged tensor's values.

    rt.row_splits specifies where the values for each row begin and end in rt.values. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

    Example:

  rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []]) 
  print(rt.row_splits)  # indices of row splits in rt.values 
    tf.Tensor([0 4 4 7 8 8], shape=(6,), dtype=int64) 
     
  • shape: The statically known shape of this ragged tensor.
  tf.ragged.constant([[0], [1, 2]]).shape 
    TensorShape([2, None]) 
     
  tf.ragged.constant([[[0, 1]], [[1, 2], [3, 4]]], ragged_rank=1).shape 
    TensorShape([2, None, 2]) 
     
  • uniform_row_length: The length of each row in this ragged tensor, or None if rows are ragged.
  rt1 = tf.ragged.constant([[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]) 
  print(rt1.uniform_row_length)  # rows are ragged. 
    None 
     
  rt2 = tf.RaggedTensor.from_uniform_row_length( 
      values=rt1, uniform_row_length=2) 
  print(rt2) 
    <tf.RaggedTensor [[[1, 2, 3], [4]], [[5, 6], [7, 8, 9, 10]]]> 
  print(rt2.uniform_row_length)  # rows are not ragged (all have size 2). 
    tf.Tensor(2, shape=(), dtype=int64) 
     

A RaggedTensor's rows are only considered to be uniform (i.e. non-ragged) if it can be determined statically (at graph construction time) that the rows all have the same length.

  • values: The concatenated rows for this ragged tensor.

    rt.values is a potentially ragged tensor formed by flattening the two outermost dimensions of rt into a single dimension.

    rt.values.shape = [nvals] + rt.shape[2:] (where nvals is the number of items in the outer two dimensions of rt).

    rt.values.ragged_rank = rt.ragged_rank - 1

    Example:

  rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []]) 
  print(rt.values) 
    tf.Tensor([3 1 4 1 5 9 2 6], shape=(8,), dtype=int32) 
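The relationship between values and flat_values can be illustrated with nested Python lists: values removes one level of nesting, while flat_values removes one level per ragged dimension. This is a sketch of the semantics only, with hypothetical helper names.

```python
def values_of(rows):
    # Flatten the two outermost dimensions into a single dimension.
    return [item for row in rows for item in row]


def flat_values_of(rows, ragged_rank):
    # Apply the one-level flattening once per ragged dimension.
    result = rows
    for _ in range(ragged_rank):
        result = values_of(result)
    return result


rt = [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
one_level = values_of(rt)
all_levels = flat_values_of(rt, ragged_rank=2)
print(one_level)   # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
print(all_levels)  # [3, 1, 4, 1, 5, 9, 2, 6]
```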
     

Raises:

  • TypeError: If a row partitioning tensor has an inappropriate dtype.
  • TypeError: If exactly one row partitioning argument was not specified.
  • ValueError: If a row partitioning tensor has an inappropriate shape.
  • ValueError: If multiple partitioning arguments are specified.
  • ValueError: If nrows is specified but value_rowids is not None.

Methods

__abs__

__abs__(
    x, name=None
)

Computes the absolute value of a tensor.

Given a tensor of integer or floating-point values, this operation returns a tensor of the same type, where each element contains the absolute value of the corresponding element in the input.

Given a tensor x of complex numbers, this operation returns a tensor of type float32 or float64 that is the absolute value of each element in x. For a complex number \(a + bj\), its absolute value is computed as \(\sqrt{a^2 + b^2}\). For example:

x = tf.constant([[-2.25 + 4.75j], [-3.25 + 5.75j]]) 
tf.abs(x) 
<tf.Tensor: shape=(2, 1), dtype=float64, numpy= 
array([[5.25594901], 
       [6.60492241]])> 
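The two magnitudes above can be checked by hand with the formula for the absolute value of a complex number. The following plain-Python sketch uses only the standard library; complex_abs is a hypothetical helper, not a TensorFlow function.

```python
import math


def complex_abs(z):
    # |a + bj| = sqrt(a^2 + b^2)
    return math.sqrt(z.real ** 2 + z.imag ** 2)


magnitudes = [complex_abs(z) for z in (-2.25 + 4.75j, -3.25 + 5.75j)]
print(magnitudes)
```

The results agree with the values in the example to the precision shown there, and with Python's built-in abs() for complex numbers.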

Args:

  • x: A Tensor or SparseTensor of type float16, float32, float64, int32, int64, complex64 or complex128.
  • name: A name for the operation (optional).

Returns:

A Tensor or SparseTensor of the same size, type and sparsity as x, with absolute values. Note, for complex64 or complex128 input, the returned Tensor will be of type float32 or float64, respectively.

If x is a SparseTensor, returns SparseTensor(x.indices, tf.math.abs(x.values, ...), x.dense_shape)

__add__

__add__(
    x, y, name=None
)

Returns x + y element-wise.

NOTE: math.add supports broadcasting. AddN does not. More about broadcasting here

Args:

  • x: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, uint8, int8, int16, int32, int64, complex64, complex128, string.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor. Has the same type as x.

__and__

__and__(
    x, y, name=None
)

Logical AND function.

The operation works for the following input types:

  • Two single elements of type bool
  • One tf.Tensor of type bool and one single bool, where the result will be calculated by applying logical AND with the single element to each element in the larger Tensor.
  • Two tf.Tensor objects of type bool of the same shape. In this case, the result will be the element-wise logical AND of the two input tensors.

Usage:

a = tf.constant([True]) 
b = tf.constant([False]) 
tf.math.logical_and(a, b) 
<tf.Tensor: shape=(1,), dtype=bool, numpy=array([False])> 
c = tf.constant([True]) 
x = tf.constant([False, True, True, False]) 
tf.math.logical_and(c, x) 
<tf.Tensor: shape=(4,), dtype=bool, numpy=array([False,  True,  True, False])> 
y = tf.constant([False, False, True, True]) 
z = tf.constant([False, True, False, True]) 
tf.math.logical_and(y, z) 
<tf.Tensor: shape=(4,), dtype=bool, numpy=array([False, False, False,  True])> 

Args:

  • x: A tf.Tensor of type bool.
  • y: A tf.Tensor of type bool.
  • name: A name for the operation (optional).

Returns:

A tf.Tensor of type bool with the same size as that of x or y.

__bool__

__bool__(
    _
)

Dummy method to prevent a RaggedTensor from being used as a Python bool.

__div__

__div__(
    x, y, name=None
)

Divides x / y elementwise (using Python 2 division operator semantics). (deprecated)

NOTE: Prefer using the Tensor division operator or tf.divide which obey Python 3 division operator semantics.

This function divides x and y, forcing Python 2 semantics. That is, if x and y are both integers then the result will be an integer. This is in contrast to Python 3, where division with / is always a float while division with // is always an integer.

Args:

  • x: Tensor numerator of real numeric type.
  • y: Tensor denominator of real numeric type.
  • name: A name for the operation (optional).

Returns:

x / y returns the quotient of x and y.

__floordiv__

__floordiv__(
    x, y, name=None
)

Divides x / y elementwise, rounding toward the most negative integer.

The same as tf.compat.v1.div(x,y) for integers, but uses tf.floor(tf.compat.v1.div(x,y)) for floating point arguments so that the result is always an integer (though possibly an integer represented as floating point). This op is generated by x // y floor division in Python 3 and in Python 2.7 with from __future__ import division.

x and y must have the same type, and the result will have the same type as well.

Args:

  • x: Tensor numerator of real numeric type.
  • y: Tensor denominator of real numeric type.
  • name: A name for the operation (optional).

Returns:

x / y rounded down.

Raises:

  • TypeError: If the inputs are complex.
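The Python 3 division semantics referred to in the __div__ and __floordiv__ descriptions can be seen with ordinary Python numbers. This sketch uses Python's own operators rather than TensorFlow ops, so it illustrates the rounding rule only.

```python
import math

true_div = 7 / 2               # 3.5: "/" always yields a float in Python 3
floor_div = 7 // 2             # 3: integer operands give an integer result
neg_floor_div = -7 // 2        # -4: rounds toward negative infinity, not zero
float_floor_div = -7.0 // 2.0  # -4.0: an integer value represented as a float

print(true_div, floor_div, neg_floor_div, float_floor_div)
print(math.floor(-7 / 2) == neg_floor_div)  # True
```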

__ge__

__ge__(
    x, y, name=None
)

Returns the truth value of (x >= y) element-wise.

NOTE: math.greater_equal supports broadcasting. More about broadcasting here

Example:

x = tf.constant([5, 4, 6, 7])
y = tf.constant([5, 2, 5, 10])
tf.math.greater_equal(x, y) ==> [True, True, True, False]

x = tf.constant([5, 4, 6, 7])
y = tf.constant([5])
tf.math.greater_equal(x, y) ==> [True, False, True, True]

Args:

  • x: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor of type bool.
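The broadcasting behavior in the second example, where a single-element y is compared against every element of x, can be sketched for 1-D Python lists. This is an illustration of the idea only and covers far less than full broadcasting; broadcast_compare is a hypothetical helper.

```python
import operator


def broadcast_compare(x, y, op):
    """Compare two 1-D lists element-wise, stretching a length-1 operand."""
    if len(x) == 1:
        x = x * len(y)
    if len(y) == 1:
        y = y * len(x)
    if len(x) != len(y):
        raise ValueError("shapes are not broadcast-compatible")
    return [op(a, b) for a, b in zip(x, y)]


same_shape = broadcast_compare([5, 4, 6, 7], [5, 2, 5, 10], operator.ge)
stretched = broadcast_compare([5, 4, 6, 7], [5], operator.ge)
print(same_shape)  # [True, True, True, False]
print(stretched)   # [True, False, True, True]
```

Passing operator.gt, operator.le, or operator.lt instead gives the same pattern for the other comparison operators.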

__getitem__

__getitem__(
    key
)

Returns the specified piece of this RaggedTensor.

Supports multidimensional indexing and slicing, with one restriction: indexing into a ragged inner dimension is not allowed. This case is problematic because the indicated value may exist in some rows but not others. In such cases, it's not obvious whether we should (1) report an IndexError; (2) use a default value; or (3) skip that value and return a tensor with fewer rows than we started with. Following the guiding principles of Python ("In the face of ambiguity, refuse the temptation to guess"), we simply disallow this operation.

Args:

  • self: The RaggedTensor to slice.
  • key: Indicates which piece of the RaggedTensor to return, using standard Python semantics (e.g., negative values index from the end). key may have any of the following types:

    • int constant
    • Scalar integer Tensor
    • slice containing integer constants and/or scalar integer Tensors

    • Ellipsis

    • tf.newaxis

    • tuple containing any of the above (for multidimensional indexing)

Returns:

A Tensor or RaggedTensor object. Values that include at least one ragged dimension are returned as RaggedTensor. Values that include no ragged dimensions are returned as Tensor. See the examples below for expressions that return Tensors vs RaggedTensors.

Raises:

  • ValueError: If key is out of bounds.
  • ValueError: If key is not supported.
  • TypeError: If the indices in key have an unsupported type.

Examples:

# A 2-D ragged tensor with 1 ragged dimension. 
rt = tf.ragged.constant([['a', 'b', 'c'], ['d', 'e'], ['f'], ['g']]) 
rt[0].numpy()                 # First row (1-D `Tensor`) 
array([b'a', b'b', b'c'], dtype=object) 
rt[:3].to_list()              # First three rows (2-D RaggedTensor) 
[[b'a', b'b', b'c'], [b'd', b'e'], [b'f']] 
rt[3, 0].numpy()              # 1st element of 4th row (scalar) 
b'g' 
# A 3-D ragged tensor with 2 ragged dimensions. 
rt = tf.ragged.constant([[[1, 2, 3], [4]], 
                         [[5], [], [6]], 
                         [[7]], 
                         [[8, 9], [10]]]) 
rt[1].to_list()               # Second row (2-D RaggedTensor) 
[[5], [], [6]] 
rt[3, 0].numpy()              # First element of fourth row (1-D Tensor) 
array([8, 9], dtype=int32) 
rt[:, 1:3].to_list()          # Items 1-3 of each row (3-D RaggedTensor) 
[[[4]], [[], [6]], [], [[10]]] 
rt[:, -1:].to_list()          # Last item of each row (3-D RaggedTensor) 
[[[4]], [[6]], [[7]], [[10]]] 
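The restriction on indexing into a ragged inner dimension can be illustrated with nested Python lists. Slicing each row always succeeds, because a slice may be shorter than requested or empty, whereas picking a single position fails for rows that are too short. This is a plain-Python analogy, not how RaggedTensor implements __getitem__.

```python
rt = [[[1, 2, 3], [4]],
      [[5], [], [6]],
      [[7]],
      [[8, 9], [10]]]

# Analogue of rt[:, 1:3]: slices tolerate short rows.
sliced = [row[1:3] for row in rt]
print(sliced)  # [[[4]], [[], [6]], [], [[10]]]

# Analogue of rt[:, -1:]: the last item of each row, kept as a list.
last = [row[-1:] for row in rt]
print(last)  # [[[4]], [[6]], [[7]], [[10]]]

# Analogue of rt[:, 1]: the third row has no item at position 1.
try:
    picked = [row[1] for row in rt]
except IndexError:
    picked = None
print(picked)  # None
```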

__gt__

__gt__(
    x, y, name=None
)

Returns the truth value of (x > y) element-wise.

NOTE: math.greater supports broadcasting. More about broadcasting here

Example:

x = tf.constant([5, 4, 6])
y = tf.constant([5, 2, 5])
tf.math.greater(x, y) ==> [False, True, True]

x = tf.constant([5, 4, 6])
y = tf.constant([5])
tf.math.greater(x, y) ==> [False, False, True]

Args:

  • x: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor of type bool.

__invert__

__invert__(
    x, name=None
)

Returns the truth value of NOT x element-wise.

Example:

tf.math.logical_not(tf.constant([True, False])) 
<tf.Tensor: shape=(2,), dtype=bool, numpy=array([False,  True])> 

Args:

  • x: A Tensor of type bool.
  • name: A name for the operation (optional).

Returns:

A Tensor of type bool.

__le__

__le__(
    x, y, name=None
)

Returns the truth value of (x <= y) element-wise.

NOTE: math.less_equal supports broadcasting. More about broadcasting here

Example:

x = tf.constant([5, 4, 6])
y = tf.constant([5])
tf.math.less_equal(x, y) ==> [True, True, False]

x = tf.constant([5, 4, 6])
y = tf.constant([5, 6, 6])
tf.math.less_equal(x, y) ==> [True, True, True]

Args:

  • x: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor of type bool.

__lt__

__lt__(
    x, y, name=None
)

Returns the truth value of (x < y) element-wise.

NOTE: math.less supports broadcasting. More about broadcasting here

Example:

x = tf.constant([5, 4, 6])
y = tf.constant([5])
tf.math.less(x, y) ==> [False, True, False]

x = tf.constant([5, 4, 6])
y = tf.constant([5, 6, 7])
tf.math.less(x, y) ==> [False, True, True]

Args:

  • x: A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, half, uint32, uint64.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor of type bool.

__mod__

__mod__(
    x, y, name=None
)

Returns element-wise remainder of division. When x < 0 xor y < 0 is true, this follows Python semantics in that the result here is consistent with a flooring divide. E.g. floor(x / y) * y + mod(x, y) = x.

NOTE: math.floormod supports broadcasting. More about broadcasting here

Args:

  • x: A Tensor. Must be one of the following types: int32, int64, bfloat16, half, float32, float64.
  • y: A Tensor. Must have the same type as x.
  • name: A name for the operation (optional).

Returns:

A Tensor. Has the same type as x.
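The identity floor(x / y) * y + mod(x, y) = x can be checked with Python's own % operator, which follows the same flooring convention. This sketch uses plain Python integers rather than TensorFlow ops.

```python
import math

pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]
remainders = [x % y for x, y in pairs]
print(remainders)  # [1, 2, -2, -1]

# The identity holds for every sign combination.
identity_holds = all(
    math.floor(x / y) * y + (x % y) == x for x, y in pairs)
print(identity_holds)  # True
```

When exactly one operand is negative, the remainder takes the sign of the divisor y, which is what distinguishes flooring from truncating division.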

__mul__

__mul__(
    x, y, name=None
)

Returns an element-wise x * y.

For example:

x = tf.constant(([1, 2, 3, 4])) 
tf.math.multiply(x, x) 
<tf.Tensor: shape=(4,), dtype=..., numpy=array([ 1,  4,  9, 16], dtype=int32)> 

Since tf.math.multiply will convert its arguments to Tensors, you can also pass in non-Tensor arguments:

tf.math.multiply(7,6) 
<tf.Tensor: shape=(), dtype=int32, numpy=42> 

If x.shape is not the same as y.shape, they will be broadcast to a compatible shape. (More about broadcasting here.)

For example:

x = tf.ones([1, 2]); 
y = tf.ones([2, 1]);