tf.RaggedTensor

Represents a ragged tensor.

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.RaggedTensor

tf.RaggedTensor(
    values, row_splits, cached_row_lengths=None, cached_value_rowids=None,
    cached_nrows=None, internal=False, uniform_row_length=None
)

A RaggedTensor is a tensor with one or more ragged dimensions, which are dimensions whose slices may have different lengths. For example, the inner (column) dimension of rt=[[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged, since the column slices (rt[0, :], ..., rt[4, :]) have different lengths. Dimensions whose slices all have the same length are called uniform dimensions. The outermost dimension of a RaggedTensor is always uniform, since it consists of a single slice (and so there is no possibility for differing slice lengths).

The total number of dimensions in a RaggedTensor is called its rank, and the number of ragged dimensions in a RaggedTensor is called its ragged-rank. A RaggedTensor's ragged-rank is fixed at graph creation time: it can't depend on the runtime values of Tensors, and can't vary dynamically for different session runs.
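
As a minimal sketch of the distinction between rank and ragged-rank, the snippet below inspects the example tensor from above in eager mode. It assumes TensorFlow 2.x is installed; the variable names are only illustrative.

```python
import tensorflow as tf

rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])

rank = rt.shape.rank                             # total number of dimensions
ragged_rank = rt.ragged_rank                     # number of ragged dimensions
row_lengths = rt.row_lengths().numpy().tolist()  # lengths of the column slices

print(rank, ragged_rank, row_lengths)  # 2 1 [4, 0, 3, 1, 0]
```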

Potentially Ragged Tensors

Many ops support both Tensors and RaggedTensors. The term "potentially ragged tensor" may be used to refer to a tensor that might be either a Tensor or a RaggedTensor. The ragged-rank of a Tensor is zero.
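
The following sketch shows one way code might treat a potentially ragged tensor uniformly. `ragged_rank_of` is a hypothetical helper written for this illustration, not part of TensorFlow; `tf.reduce_sum` is used as one example of an op that accepts both kinds of input.

```python
import tensorflow as tf

def ragged_rank_of(x):
    """Hypothetical helper: ragged-rank of a potentially ragged tensor."""
    return x.ragged_rank if isinstance(x, tf.RaggedTensor) else 0

dense = tf.constant([[1, 2], [3, 4]])
ragged = tf.ragged.constant([[1, 2], [3]])

# The same op call works on a Tensor and on a RaggedTensor.
dense_sums = tf.reduce_sum(dense, axis=1).numpy().tolist()
ragged_sums = tf.reduce_sum(ragged, axis=1).numpy().tolist()

print(ragged_rank_of(dense), ragged_rank_of(ragged))
print(dense_sums, ragged_sums)
```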

Documenting RaggedTensor Shapes

When documenting the shape of a RaggedTensor, ragged dimensions can be indicated by enclosing them in parentheses. For example, the shape of a 3-D RaggedTensor that stores the fixed-size word embedding for each word in a sentence, for each sentence in a batch, could be written as [num_sentences, (num_words), embedding_size]. The parentheses around (num_words) indicate that dimension is ragged, and that the length of each element list in that dimension may vary for each item.
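
A small sketch of a tensor with shape `[num_sentences, (num_words), embedding_size]` follows. The sentence lengths and the embedding size are made-up values chosen for illustration, and zeros stand in for real embeddings.

```python
import tensorflow as tf

embedding_size = 4              # assumed value for illustration
words_per_sentence = [3, 1, 2]  # assumed batch of three sentences

# One embedding vector per word, stored back to back.
embeddings = tf.zeros([sum(words_per_sentence), embedding_size])

# Partition the word embeddings into sentences of varying length.
batch = tf.RaggedTensor.from_row_lengths(embeddings, words_per_sentence)

print(batch.shape)  # (3, None, 4): the middle (num_words) dimension is ragged
```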

Component Tensors

Internally, a RaggedTensor consists of a concatenated list of values that are partitioned into variable-length rows. In particular, each RaggedTensor consists of:

A values tensor, which concatenates the variable-length rows into a flattened list. For example, the values tensor for [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is [3, 1, 4, 1, 5, 9, 2, 6].
A row_splits vector, which indicates how those flattened values are divided into rows. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

Example:

print(tf.RaggedTensor.from_row_splits(
      values=[3, 1, 4, 1, 5, 9, 2, 6],
      row_splits=[0, 4, 4, 7, 8, 8]))
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]>
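
To make the role of row_splits concrete, here is a pure-Python sketch of the slicing rule described above. `rows_from_splits` is a hypothetical helper for illustration only; it mirrors the rule `values[row_splits[i]:row_splits[i+1]]` and does not use TensorFlow.

```python
values = [3, 1, 4, 1, 5, 9, 2, 6]
row_splits = [0, 4, 4, 7, 8, 8]

def rows_from_splits(values, row_splits):
    """Hypothetical helper: row i is values[row_splits[i]:row_splits[i + 1]]."""
    return [values[row_splits[i]:row_splits[i + 1]]
            for i in range(len(row_splits) - 1)]

rows = rows_from_splits(values, row_splits)
print(rows)  # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
```

Note that two equal consecutive entries in row_splits (such as `4, 4`) produce an empty row.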

Alternative Row-Partitioning Schemes

In addition to row_splits, ragged tensors provide support for five other row-partitioning schemes:

row_lengths: a vector with shape [nrows], which specifies the length of each row.
value_rowids and nrows: value_rowids is a vector with shape [nvals], corresponding one-to-one with values, which specifies each value's row index. In particular, the row rt[row] consists of the values rt.values[j] where value_rowids[j]==row. nrows is an integer scalar that specifies the number of rows in the RaggedTensor. (nrows is used to indicate trailing empty rows.)
row_starts: a vector with shape [nrows], which specifies the start offset of each row. Equivalent to row_splits[:-1].
row_limits: a vector with shape [nrows], which specifies the stop offset of each row. Equivalent to row_splits[1:].
uniform_row_length: A scalar tensor, specifying the length of every row. This row-partitioning scheme may only be used if all rows have the same length.

Example: The following ragged tensors are equivalent, and all represent the nested list [[3, 1, 4, 1], [], [5, 9, 2], [6], []].

values = [3, 1, 4, 1, 5, 9, 2, 6]
rt1 = RaggedTensor.from_row_splits(values, row_splits=[0, 4, 4, 7, 8, 8])
rt2 = RaggedTensor.from_row_lengths(values, row_lengths=[4, 0, 3, 1, 0])
rt3 = RaggedTensor.from_value_rowids(
    values, value_rowids=[0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
rt4 = RaggedTensor.from_row_starts(values, row_starts=[0, 4, 4, 7, 8])
rt5 = RaggedTensor.from_row_limits(values, row_limits=[4, 4, 7, 8, 8])
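
The schemes can also be read back from an existing RaggedTensor. The sketch below, assuming TensorFlow 2.x in eager mode, builds the tensor from row_splits and then queries each alternative encoding through the corresponding accessor method.

```python
import tensorflow as tf

rt = tf.RaggedTensor.from_row_splits(
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])

row_lengths = rt.row_lengths().numpy().tolist()
value_rowids = rt.value_rowids().numpy().tolist()
nrows = int(rt.nrows().numpy())
row_starts = rt.row_starts().numpy().tolist()
row_limits = rt.row_limits().numpy().tolist()

print(row_lengths)   # [4, 0, 3, 1, 0]
print(value_rowids)  # [0, 0, 0, 0, 2, 2, 2, 3]
print(nrows)         # 5
print(row_starts)    # [0, 4, 4, 7, 8]
print(row_limits)    # [4, 4, 7, 8, 8]
```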

Multiple Ragged Dimensions

RaggedTensors with multiple ragged dimensions can be defined by using a nested RaggedTensor for the values tensor. Each nested RaggedTensor adds a single ragged dimension.

inner_rt = RaggedTensor.from_row_splits(  # =rt1 from above
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])
outer_rt = RaggedTensor.from_row_splits(
    values=inner_rt, row_splits=[0, 3, 3, 5])
print(outer_rt.to_list())
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
print(outer_rt.ragged_rank)
2

The factory function RaggedTensor.from_nested_row_splits may be used to construct a RaggedTensor with multiple ragged dimensions directly, by providing a list of row_splits tensors:

RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])).to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
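
As a sketch of how the nesting is stored, the snippet below rebuilds the same tensor and then inspects its components: the per-dimension row_splits, the nested `values`, and the innermost `flat_values`. It assumes TensorFlow 2.x in eager mode.

```python
import tensorflow as tf

rt = tf.RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8]))

# Row splits for each ragged dimension, outermost first.
splits = [s.numpy().tolist() for s in rt.nested_row_splits]

# rt.values is itself a RaggedTensor (one ragged dimension fewer).
inner = rt.values.to_list()

# rt.flat_values is the innermost, non-ragged values tensor.
flat = rt.flat_values.numpy().tolist()

print(splits)
print(inner)
print(flat)
```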

Uniform Inner Dimensions

RaggedTensors with uniform inner dimensions can be defined by using a multidimensional Tensor for values.

rt = RaggedTensor.from_row_splits(values=tf.ones([5, 3], tf.int32),
                                  row_splits=[0, 2, 5])
print(rt.to_list())
[[[1, 1, 1], [1, 1, 1]],
 [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
print(rt.shape)
(2, None, 3)
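
A short sketch of what the uniform inner dimension means for the components: the values tensor keeps its trailing `[3]` dimension, and only the second dimension of the result is ragged. This assumes TensorFlow 2.x in eager mode.

```python
import tensorflow as tf

rt = tf.RaggedTensor.from_row_splits(values=tf.ones([5, 3], tf.int32),
                                     row_splits=[0, 2, 5])

total_rank = rt.shape.rank                    # 3 dimensions in total
flat_shape = rt.flat_values.shape.as_list()   # the 2-D values tensor is kept as-is
row_lengths = rt.row_lengths().numpy().tolist()

print(total_rank, rt.ragged_rank, flat_shape, row_lengths)
```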

Uniform Outer Dimensions

RaggedTensors with uniform outer dimensions can be defined by using one or more RaggedTensor with a uniform_row_length row-partitioning tensor. For example, a RaggedTensor with shape [2, 2, None] can be constructed with this method from a RaggedTensor values with shape [4, None]:

values = tf.ragged.constant([[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]])
print(values.shape)
(4, None)
rt6 = tf.RaggedTensor.from_uniform_row_length(values, 2)
print(rt6)
<tf.RaggedTensor [[[1, 2, 3], [4]], [[5, 6], [7, 8, 9, 10]]]>
print(rt6.shape)
(2, 2, None)

Note that rt6 only contains one ragged dimension (the innermost dimension). In contrast, if from_row_splits is used to construct a similar RaggedTensor, then that RaggedTensor will have two ragged dimensions:

rt7 = tf.RaggedTensor.from_row_splits(values, [0, 2, 4])
print(rt7.shape)
(2, None, None)
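
The difference between the two constructions can be checked through the static shape and the `uniform_row_length` attribute, as in this sketch (assuming TensorFlow 2.x in eager mode; `uniform` and `ragged` are illustrative names for rt6 and rt7).

```python
import tensorflow as tf

values = tf.ragged.constant([[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]])

uniform = tf.RaggedTensor.from_uniform_row_length(values, 2)
ragged = tf.RaggedTensor.from_row_splits(values, [0, 2, 4])

# Both hold the same data, but only `uniform` records that every
# outer row has exactly two elements.
print(uniform.shape, ragged.shape)
print(uniform.uniform_row_length, ragged.uniform_row_length)
```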

Uniform and ragged outer dimensions may be interleaved, meaning that a tensor with any combination of ragged and uniform dimensions may be created. For example, a RaggedTensor t4 with shape [3, None, 4, 8, None, 2] could be constructed as follows:

t0 = tf.zeros([1000, 2])                           # Shape:         [1000, 2]
t1 = RaggedTensor.from_row_lengths(t0, [...])      #           [160, None, 2]
t2 = RaggedTensor.from_uniform_row_length(t1, 8)   #         [20, 8, None, 2]
t3 = RaggedTensor.from_uniform_row_length(t2, 4)   #       [5, 4, 8, None, 2]
t4 = RaggedTensor.from_row_lengths(t3, [...])      # [3, None, 4, 8, None, 2]
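
Because the row lengths in the outline above are elided, the following is a smaller, self-contained sketch of the same interleaving idea. All sizes and row lengths here are made-up values, chosen so that each step's lengths sum to the number of rows produced by the previous step.

```python
import tensorflow as tf

t0 = tf.zeros([12, 2])                                          # [12, 2]
# Ragged dimension: six rows whose lengths sum to 12.
t1 = tf.RaggedTensor.from_row_lengths(t0, [1, 2, 3, 0, 2, 4])   # [6, None, 2]
# Uniform dimension: group the six rows into blocks of three.
t2 = tf.RaggedTensor.from_uniform_row_length(t1, 3)             # [2, 3, None, 2]
# Ragged dimension: two rows whose lengths sum to 2.
t3 = tf.RaggedTensor.from_row_lengths(t2, [2, 0])               # [2, None, 3, None, 2]

print(t1.shape, t2.shape, t3.shape)
```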

Args
`values`	A potentially ragged tensor of any dtype and shape `[nvals, ...]`.
`row_splits`	A 1-D integer tensor with shape `[nrows+1]`.
`cached_row_lengths`	A 1-D integer tensor with shape `[nrows]`.
`cached_value_rowids`	A 1-D integer tensor with shape `[nvals]`.
`cached_nrows`	An integer scalar tensor.
`internal`	True if the constructor is being called by one of the factory methods. If False, an exception will be raised.
`uniform_row_length`	A scalar tensor.

Raises
`TypeError`	If a row partitioning tensor has an inappropriate dtype.
`TypeError`	If a row-partitioning argument was not specified (exactly one is required).
`ValueError`	If a row partitioning tensor has an inappropriate shape.
`ValueError`	If multiple partitioning arguments are specified.
`ValueError`	If `nrows` is specified but `value_rowids` is not.

Attributes
`dtype`	The `DType` of values in this tensor.
`flat_values`	The innermost `values` tensor for this ragged tensor. Concretely, if `rt.values` is a `Tensor`, then `rt.flat_values` is `rt.values`; otherwise, `rt.flat_values` is `rt.values.flat_values`. Conceptually, `flat_values` is the tensor formed by flattening the outermost dimension and all of the ragged dimensions into a single dimension. `rt.flat_values.shape = [nvals] + rt.shape[rt.ragged_rank + 1:]` (where `nvals` is the number of items in the flattened dimensions). Example:
    rt = tf.ragged.constant([[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]])
    print(rt.flat_values)
    tf.Tensor([3 1 4 1 5 9 2 6], shape=(8,), dtype=int32)
`nested_row_splits`	A tuple containing the `row_splits` tensors for all ragged dimensions in `rt`, ordered from outermost to innermost. In particular, `rt.nested_row_splits = (rt.row_splits,) + value_splits`, where `value_splits = ()` if `rt.values` is a `Tensor`, and `value_splits = rt.values.nested_row_splits` otherwise. Example:
    rt = tf.ragged.constant(
        [[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]])
    for i, splits in enumerate(rt.nested_row_splits):
      print('Splits for dimension %d: %s' % (i+1, splits.numpy()))
    Splits for dimension 1: [0 3]
    Splits for dimension 2: [0 3 3 5]
    Splits for dimension 3: [0 4 4 7 8 8]
`ragged_rank`	The number of ragged dimensions in this ragged tensor.
`row_splits`	The row-split indices for this ragged tensor's `values`. `rt.row_splits` specifies where the values for each row begin and end in `rt.values`. In particular, the values for row `rt[i]` are stored in the slice `rt.values[rt.row_splits[i]:rt.row_splits[i+1]]`. Example:
    rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
    print(rt.row_splits)  # indices of row splits in rt.values
    tf.Tensor([0 4 4 7 8 8], shape=(6,), dtype=int64)
`shape`	The statically known shape of this ragged tensor. Examples:
    tf.ragged.constant([[0], [1, 2]]).shape
    TensorShape([2, None])
    tf.ragged.constant([[[0, 1]], [[1, 2], [3, 4]]], ragged_rank=1).shape
    TensorShape([2, None, 2])
`uniform_row_length`	The length of each row in this ragged tensor, or None if rows are ragged. A RaggedTensor's rows are only considered to be uniform (i.e. non-ragged) if it can be determined statically (at graph construction time) that the rows all have the same length. Example:
    rt1 = tf.ragged.constant([[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]])
    print(rt1.uniform_row_length)  # rows are ragged.
    None
    rt2 = tf.RaggedTensor.from_uniform_row_length(
        values=rt1, uniform_row_length=2)
    print(rt2)
    <tf.RaggedTensor [[[1, 2, 3], [4]], [[5, 6], [7, 8, 9, 10]]]>
    print(rt2.uniform_row_length)  # rows are not ragged (all have size 2).
    tf.Tensor(2, shape=(), dtype=int64)
`values`	The concatenated rows for this ragged tensor. `rt.values` is a potentially ragged tensor formed by flattening the two outermost dimensions of `rt` into a single dimension. `rt.values.shape = [nvals] + rt.shape[2:]` (where `nvals` is the number of items in the outer two dimensions of `rt`). `rt.values.ragged_rank = rt.ragged_rank - 1`. Example:
    rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
    print(rt.values)
    tf.Tensor([3 1 4 1 5 9 2 6], shape=(8,), dtype=int32)

bounding_shape

Args
`axis`	An integer scalar or vector indicating which axes to return the bounding box for. If not specified, then the full bounding box is returned.
`name`	A name prefix for the returned tensor (optional).
`out_type`	`dtype` for the returned tensor. Defaults to `self.row_splits.dtype`.
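
A minimal sketch of how these arguments might be used with `bounding_shape`, assuming TensorFlow 2.x in eager mode. The bounding box is the smallest dense shape that could hold every row.

```python
import tensorflow as tf

rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])

full_box = rt.bounding_shape().numpy().tolist()    # all axes
widest_row = int(rt.bounding_shape(axis=1).numpy())  # a single axis
as_int32 = rt.bounding_shape(out_type=tf.int32)    # override the result dtype

print(full_box, widest_row, as_int32.dtype)
```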

from_nested_row_lengths

Args
`flat_values`	A potentially ragged tensor.
`nested_row_lengths`	A list of 1-D integer tensors. The `i`th tensor is used as the `row_lengths` for the `i`th ragged dimension.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.

from_nested_value_rowids

Args
`flat_values`	A potentially ragged tensor.
`nested_value_rowids`	A list of 1-D integer tensors. The `i`th tensor is used as the `value_rowids` for the `i`th ragged dimension.
`nested_nrows`	A list of integer scalars. The `i`th scalar is used as the `nrows` for the `i`th ragged dimension.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.
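
The two nested factory methods can describe the same tensor. The sketch below, assuming TensorFlow 2.x in eager mode, builds one tensor from nested row lengths and one from nested value row ids plus row counts, and compares them.

```python
import tensorflow as tf

flat_values = [3, 1, 4, 1, 5, 9, 2, 6]

# Outer dimension has rows of 3, 0, and 2 inner rows;
# inner dimension has rows of 4, 0, 3, 1, and 0 values.
by_lengths = tf.RaggedTensor.from_nested_row_lengths(
    flat_values,
    nested_row_lengths=([3, 0, 2], [4, 0, 3, 1, 0]))

# The same partitioning, expressed as a row index per item.
# nested_nrows is needed here because both levels end in an empty row.
by_rowids = tf.RaggedTensor.from_nested_value_rowids(
    flat_values,
    nested_value_rowids=([0, 0, 0, 2, 2], [0, 0, 0, 0, 2, 2, 2, 3]),
    nested_nrows=[3, 5])

print(by_lengths.to_list())
print(by_rowids.to_list())
```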

from_row_lengths

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_lengths`	A 1-D integer tensor with shape `[nrows]`. Must be nonnegative. `sum(row_lengths)` must be `nvals`.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.

from_row_limits

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_limits`	A 1-D integer tensor with shape `[nrows]`. Must be sorted in ascending order. If `nrows>0`, then `row_limits[-1]` must be `nvals`.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.

from_row_starts

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_starts`	A 1-D integer tensor with shape `[nrows]`. Must be nonnegative and sorted in ascending order. If `nrows>0`, then `row_starts[0]` must be zero.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.

from_sparse

Args
`st_input`	The sparse tensor to convert. Must have rank 2.
`name`	A name prefix for the returned tensors (optional).
`row_splits_dtype`	`dtype` for the returned `RaggedTensor`'s `row_splits` tensor. One of `tf.int32` or `tf.int64`.
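
A short sketch of converting a rank-2 sparse tensor, assuming TensorFlow 2.x in eager mode. The indices are chosen so that each row's values are left-aligned, and `row_splits_dtype` is set explicitly to show its effect.

```python
import tensorflow as tf

st = tf.SparseTensor(
    indices=[[0, 0], [0, 1], [0, 2], [1, 0], [3, 0]],
    values=[1, 2, 3, 4, 5],
    dense_shape=[4, 3])

rt = tf.RaggedTensor.from_sparse(st, row_splits_dtype=tf.int32)

print(rt.to_list())         # [[1, 2, 3], [4], [], [5]]
print(rt.row_splits.dtype)  # <dtype: 'int32'>
```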

from_tensor

Args
`tensor`	The `Tensor` to convert. Must have rank `ragged_rank + 1` or higher.
`lengths`	An optional set of row lengths, specified using a 1-D integer `Tensor` whose length is equal to `tensor.shape[0]` (the number of rows in `tensor`). If specified, then `output[row]` will contain `tensor[row][:lengths[row]]`. Negative lengths are treated as zero. You may optionally pass a list or tuple of lengths to this argument, which will be used as nested row lengths to construct a ragged tensor with multiple ragged dimensions.
`padding`	An optional padding value. If specified, then any row suffix consisting entirely of `padding` will be excluded from the returned RaggedTensor. `padding` is a `Tensor` with the same dtype as `tensor` and with `shape=tensor.shape[ragged_rank + 1:]`.
`ragged_rank`	Integer specifying the ragged rank for the returned `RaggedTensor`. Must be greater than zero.
`name`	A name prefix for the returned tensors (optional).
`row_splits_dtype`	`dtype` for the returned `RaggedTensor`'s `row_splits` tensor. One of `tf.int32` or `tf.int64`.
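
The sketch below contrasts the `lengths` and `padding` arguments on the same dense input, assuming TensorFlow 2.x in eager mode. Note that `padding` only strips trailing runs of the padding value, so the leading `0` in the second row is kept.

```python
import tensorflow as tf

dt = tf.constant([[5, 7, 0], [0, 3, 0], [6, 0, 0]])

unchanged = tf.RaggedTensor.from_tensor(dt).to_list()
by_lengths = tf.RaggedTensor.from_tensor(dt, lengths=[1, 0, 3]).to_list()
by_padding = tf.RaggedTensor.from_tensor(dt, padding=0).to_list()

print(unchanged)   # [[5, 7, 0], [0, 3, 0], [6, 0, 0]]
print(by_lengths)  # [[5], [], [6, 0, 0]]
print(by_padding)  # [[5, 7], [0, 3], [6]]
```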

from_value_rowids

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`value_rowids`	A 1-D integer tensor with shape `[nvals]`, which corresponds one-to-one with `values`, and specifies each value's row index. Must be nonnegative, and must be sorted in ascending order.
`nrows`	An integer scalar specifying the number of rows. This should be specified if the `RaggedTensor` may contain empty trailing rows. Must be greater than `value_rowids[-1]` (or zero if `value_rowids` is empty). Defaults to `value_rowids[-1] + 1` (or zero if `value_rowids` is empty).
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`. Note: these assertions incur a runtime cost, since they must be checked for each tensor value.
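
The effect of `nrows` can be seen by building the same values with and without it, as in this sketch (assuming TensorFlow 2.x in eager mode). Without `nrows`, the row count is inferred from the largest row id, so a trailing empty row cannot be represented.

```python
import tensorflow as tf

values = [3, 1, 4, 1, 5, 9, 2, 6]
value_rowids = [0, 0, 0, 0, 2, 2, 2, 3]

inferred = tf.RaggedTensor.from_value_rowids(values, value_rowids)
explicit = tf.RaggedTensor.from_value_rowids(values, value_rowids, nrows=5)

print(inferred.to_list())  # [[3, 1, 4, 1], [], [5, 9, 2], [6]]
print(explicit.to_list())  # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
```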

merge_dims

Args
`outer_axis`	`int`: The first dimension in the range of dimensions to merge. May be negative if `self.shape.rank` is statically known.
`inner_axis`	`int`: The last dimension in the range of dimensions to merge. May be negative if `self.shape.rank` is statically known.
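
A brief sketch of `merge_dims` on a tensor with two ragged dimensions, assuming TensorFlow 2.x in eager mode. Negative axes count from the end, as with ordinary indexing.

```python
import tensorflow as tf

rt = tf.ragged.constant([[[1, 2], [3]], [[4, 5, 6]]])

outer_merged = rt.merge_dims(0, 1).to_list()    # merge dimensions 0 and 1
inner_merged = rt.merge_dims(1, 2).to_list()    # merge dimensions 1 and 2
negative_axes = rt.merge_dims(-2, -1).to_list() # same as (1, 2) for rank 3

print(outer_merged)  # [[1, 2], [3], [4, 5, 6]]
print(inner_merged)  # [[1, 2, 3], [4, 5, 6]]
```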

row_lengths

Args
`axis`	An integer constant indicating the axis whose row lengths should be returned.
`name`	A name prefix for the returned tensor (optional).
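
As a sketch of the `axis` argument, the snippet below queries row lengths at two depths of a tensor with two ragged dimensions (assuming TensorFlow 2.x in eager mode). For `axis=2` the result is itself ragged, with one length per inner row.

```python
import tensorflow as tf

rt = tf.ragged.constant([[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]])

outer_lengths = rt.row_lengths().numpy().tolist()  # axis=1 is the default
inner_lengths = rt.row_lengths(axis=2).to_list()

print(outer_lengths)  # [3, 0, 2]
print(inner_lengths)  # [[4, 0, 3], [], [1, 0]]
```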

to_tensor

Args
`default_value`	Value to set for indices not specified in `self`. Defaults to zero. `default_value` must be broadcastable to `self.shape[self.ragged_rank + 1:]`.
`name`	A name prefix for the returned tensors (optional).
`shape`	The shape of the resulting dense tensor. In particular, `result.shape[i]` is `shape[i]` (if `shape[i]` is not None), or `self.bounding_shape(i)` (otherwise). `shape.rank` must be `None` or equal to `self.rank`.
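
A minimal sketch of `to_tensor` with each of these arguments, assuming TensorFlow 2.x in eager mode. Missing positions are filled with `default_value`, and `shape` can request a result larger than the bounding box.

```python
import tensorflow as tf

rt = tf.ragged.constant([[9, 8, 7], [], [6, 5], [4]])

padded = rt.to_tensor().numpy().tolist()                     # pad with zeros
custom = rt.to_tensor(default_value=-1).numpy().tolist()     # pad with -1
enlarged = rt.to_tensor(shape=[5, 4]).numpy().tolist()       # pad to 5 x 4

print(padded)    # [[9, 8, 7], [0, 0, 0], [6, 5, 0], [4, 0, 0]]
print(custom)    # [[9, 8, 7], [-1, -1, -1], [6, 5, -1], [4, -1, -1]]
print(enlarged)
```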

Args
`x`	A `Tensor` or `SparseTensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64` or `complex128`.
`name`	A name for the operation (optional).

Args
`x`	A `Tensor`. Must be one of the following types: `bfloat16`, `half`, `float32`, `float64`, `uint8`, `int8`, `int16`, `int32`, `int64`, `complex64`, `complex128`, `string`.
`y`	A `Tensor`. Must have the same type as `x`.
`name`	A name for the operation (optional).

Args
`x`	A `tf.Tensor` of type bool.
`y`	A `tf.Tensor` of type bool.
`name`	A name for the operation (optional).

Args
`x`	`Tensor` numerator of real numeric type.
`y`	`Tensor` denominator of real numeric type.
`name`	A name for the operation (optional).

__getitem__

Raises
`ValueError`	If `key` is out of bounds.
`ValueError`	If `key` is not supported.
`TypeError`	If the indices in `key` have an unsupported type.
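
The sketch below illustrates a few supported forms of `key` for `__getitem__`, plus one unsupported form, assuming TensorFlow 2.x in eager mode. Selecting a single column from a ragged dimension is rejected because some rows may not have that column.

```python
import tensorflow as tf

rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])

first_row = rt[0].numpy().tolist()  # integer index on the outer dimension
element = int(rt[2, 1].numpy())     # row 2, then position 1 within that row
middle = rt[1:3].to_list()          # slice of rows
prefixes = rt[:, :2].to_list()      # slice within each row

# Indexing a single position across a ragged dimension is not supported.
try:
    rt[:, 0]
    inner_index_rejected = False
except ValueError:
    inner_index_rejected = True

print(first_row, element, middle, prefixes, inner_index_rejected)
```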

Args
`x`	A `Tensor` of type `bool`.
`name`	A name for the operation (optional).

Args
`x`	A Tensor. Must be one of the following types: `bfloat16`, `half`, `float32`, `float64`, `uint8`, `int8`, `uint16`, `int16`, `int32`, `int64`, `complex64`, `complex128`.
`y`	A `Tensor`. Must have the same type as `x`.
`name`	A name for the operation (optional).

Args
`x`	A `Tensor` of type `bool`.
`y`	A `Tensor` of type `bool`.
`name`	A name for the operation (optional).
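
The operator overloads documented by the tables above apply elementwise and preserve the row partitioning. The following sketch, assuming TensorFlow 2.x in eager mode, exercises a few arithmetic, comparison, and logical operators on two ragged tensors with matching row lengths.

```python
import tensorflow as tf

a = tf.ragged.constant([[1, -2], [3]])
b = tf.ragged.constant([[10, 20], [30]])

absolute = abs(a).to_list()         # __abs__
total = (a + b).to_list()           # __add__
product = (a * b).to_list()         # __mul__
at_least_20 = b >= 20               # __ge__, broadcasting a Python scalar
both = (at_least_20 & (a > 0)).to_list()  # __and__ on boolean ragged tensors
negated = (~at_least_20).to_list()  # __invert__

print(absolute, total, product)
print(at_least_20.to_list(), both, negated)
```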

Methods

`bounding_shape`

`consumers`

`from_nested_row_lengths`

`from_nested_row_splits`

`from_nested_value_rowids`

`from_row_lengths`

`from_row_limits`

`from_row_splits`

`from_row_starts`

`from_sparse`

`from_tensor`

`from_uniform_row_length`

`from_value_rowids`

`merge_dims`

`nested_row_lengths`

`nested_value_rowids`

`nrows`

`numpy`

`row_lengths`

`row_limits`

`row_starts`

`to_list`

`to_sparse`

`to_tensor`

`value_rowids`

`with_flat_values`

`with_row_splits_dtype`

`with_values`

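`__abs__`

`__add__`

`__and__`

`__bool__`

`__div__`

`__floordiv__`

`__ge__`

`__getitem__`

`__gt__`

`__invert__`

`__le__`

`__lt__`

`__mod__`

`__mul__`

`__neg__`

`__nonzero__`

`__or__`

`__pow__`

`__radd__`

`__rand__`

`__rdiv__`

`__rfloordiv__`

`__rmod__`

`__rmul__`

`__ror__`

`__rpow__`

`__rsub__`

`__rtruediv__`

`__rxor__`

`__sub__`

`__truediv__`

`__xor__`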

Args
`x`	A `Tensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64`, or `complex128`.
`y`	A `Tensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64`, or `complex128`.
`name`	A name for the operation (optional).

Args
`x`	`Tensor` numerator of numeric type.
`y`	`Tensor` denominator of numeric type.
`name`	A name for the operation (optional).
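
As a sketch of the power and division operators on ragged inputs, the snippet below uses small float values whose results are exactly representable. It assumes TensorFlow 2.x in eager mode.

```python
import tensorflow as tf

x = tf.ragged.constant([[1.0, 2.0], [3.0]])
y = tf.ragged.constant([[2.0, 4.0], [6.0]])

squared = (x ** 2).to_list()  # elementwise power
ratio = (x / y).to_list()     # elementwise true division

print(squared)  # [[1.0, 4.0], [9.0]]
print(ratio)    # [[0.5, 0.5], [0.5]]
```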