Help protect the Great Barrier Reef with TensorFlow on Kaggle

# tf.RaggedTensor

Represents a ragged tensor.

A RaggedTensor is a tensor with one or more ragged dimensions, which are dimensions whose slices may have different lengths. For example, the inner (column) dimension of rt=[[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged, since the column slices (rt[0, :], ..., rt[4, :]) have different lengths. Dimensions whose slices all have the same length are called uniform dimensions. The outermost dimension of a RaggedTensor is always uniform, since it consists of a single slice (and so there is no possibility for differing slice lengths).

The total number of dimensions in a RaggedTensor is called its rank, and the number of ragged dimensions in a RaggedTensor is called its ragged-rank. A RaggedTensor's ragged-rank is fixed at graph creation time: it can't depend on the runtime values of Tensors, and can't vary dynamically for different session runs.

### Potentially Ragged Tensors

Many ops support both Tensors and RaggedTensors. The term "potentially ragged tensor" may be used to refer to a tensor that might be either a Tensor or a RaggedTensor. The ragged-rank of a Tensor is zero.

### Documenting RaggedTensor Shapes

When documenting the shape of a RaggedTensor, ragged dimensions can be indicated by enclosing them in parentheses. For example, the shape of a 3-D RaggedTensor that stores the fixed-size word embedding for each word in a sentence, for each sentence in a batch, could be written as [num_sentences, (num_words), embedding_size]. The parentheses around (num_words) indicate that dimension is ragged, and that the length of each element list in that dimension may vary for each item.

### Component Tensors

Internally, a RaggedTensor consists of a concatenated list of values that are partitioned into variable-length rows. In particular, each RaggedTensor consists of:

• A values tensor, which concatenates the variable-length rows into a flattened list. For example, the values tensor for [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is [3, 1, 4, 1, 5, 9, 2, 6].

• A row_splits vector, which indicates how those flattened values are divided into rows. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

#### Example:

print(tf.RaggedTensor.from_row_splits(
    values=[3, 1, 4, 1, 5, 9, 2, 6],
    row_splits=[0, 4, 4, 7, 8, 8]))
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]>


### Alternative Row-Partitioning Schemes

In addition to row_splits, ragged tensors provide support for four other row-partitioning schemes:

• row_lengths: a vector with shape [nrows], which specifies the length of each row.

• value_rowids and nrows: value_rowids is a vector with shape [nvals], corresponding one-to-one with values, which specifies each value's row index. In particular, the row rt[row] consists of the values rt.values[j] where value_rowids[j]==row. nrows is an integer scalar that specifies the number of rows in the RaggedTensor. (nrows is used to indicate trailing empty rows.)

• row_starts: a vector with shape [nrows], which specifies the start offset of each row. Equivalent to row_splits[:-1].

• row_limits: a vector with shape [nrows], which specifies the stop offset of each row. Equivalent to row_splits[1:].

Example: The following ragged tensors are equivalent, and all represent the nested list [[3, 1, 4, 1], [], [5, 9, 2], [6], []].

values = [3, 1, 4, 1, 5, 9, 2, 6]
rt1 = RaggedTensor.from_row_splits(values, row_splits=[0, 4, 4, 7, 8, 8])
rt2 = RaggedTensor.from_row_lengths(values, row_lengths=[4, 0, 3, 1, 0])
rt3 = RaggedTensor.from_value_rowids(
    values, value_rowids=[0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
rt4 = RaggedTensor.from_row_starts(values, row_starts=[0, 4, 4, 7, 8])
rt5 = RaggedTensor.from_row_limits(values, row_limits=[4, 4, 7, 8, 8])


### Multiple Ragged Dimensions

RaggedTensors with multiple ragged dimensions can be defined by using a nested RaggedTensor for the values tensor. Each nested RaggedTensor adds a single ragged dimension.

inner_rt = RaggedTensor.from_row_splits(  # =rt1 from above
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])
outer_rt = RaggedTensor.from_row_splits(
    values=inner_rt, row_splits=[0, 3, 3, 5])
print outer_rt.to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
print outer_rt.ragged_rank
2


The factory function RaggedTensor.from_nested_row_splits may be used to construct a RaggedTensor with multiple ragged dimensions directly, by providing a list of row_splits tensors:

RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])).to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]


### Uniform Inner Dimensions

RaggedTensors with uniform inner dimensions can be defined by using a multidimensional Tensor for values.

rt = RaggedTensor.from_row_splits(values=tf.ones([5, 3]),
..                                    row_splits=[0, 2, 5])
print rt.to_list()
[[[1, 1, 1], [1, 1, 1]],
 [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
print rt.shape
 (2, ?, 3)


### RaggedTensor Shape Restrictions

The shape of a RaggedTensor is currently restricted to have the following form:

• A single uniform dimension
• Followed by one or more ragged dimensions
• Followed by zero or more uniform dimensions.

This restriction follows from the fact that each nested RaggedTensor replaces the uniform outermost dimension of its values with a uniform dimension followed by a ragged dimension.

values A potentially ragged tensor of any dtype and shape [nvals, ...].
row_splits A 1-D integer tensor with shape [nrows+1].
cached_row_lengths A 1-D integer tensor with shape [nrows]
cached_value_rowids A 1-D integer tensor with shape [nvals].
cached_nrows A 1-D integer scalar tensor.
internal True if the constructor is being called by one of the factory methods. If false, an exception will be raised.

TypeError If a row partitioning tensor has an inappropriate dtype.
TypeError If exactly one row partitioning argument was not specified.
ValueError If a row partitioning tensor has an inappropriate shape.
ValueError If multiple partitioning arguments are specified.
ValueError If nrows is specified but value_rowids is not None.

dtype The DType of values in this tensor.
flat_values The innermost values tensor for this ragged tensor.

Concretely, if rt.values is a Tensor, then rt.flat_values is rt.values; otherwise, rt.flat_values is rt.values.flat_values.

Conceptually, flat_values is the tensor formed by flattening the outermost dimension and all of the ragged dimensions into a single dimension.

rt.flat_values.shape = [nvals] + rt.shape[rt.ragged_rank + 1:] (where nvals is the number of items in the flattened dimensions).

#### Example:

rt = ragged.constant([[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]])
print rt.flat_values()
tf.Tensor([3, 1, 4, 1, 5, 9, 2, 6])


nested_row_splits A tuple containing the row_splits for all ragged dimensions.

rt.nested_row_splits is a tuple containing the row_splits tensors for all ragged dimensions in rt, ordered from outermost to innermost. In particular, rt.nested_row_splits = (rt.row_splits,) + value_splits where:

• value_splits = () if rt.values is a Tensor.
• value_splits = rt.values.nested_row_splits otherwise.

#### Example:

rt = ragged.constant([[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]])
for i, splits in enumerate(rt.nested_row_splits()):
  print('Splits for dimension %d: %s' % (i+1, splits))
Splits for dimension 1: [0, 1]
Splits for dimension 2: [0, 3, 3, 5]
Splits for dimension 3: [0, 4, 4, 7, 8, 8]


ragged_rank The number of ragged dimensions in this ragged tensor.
row_splits The row-split indices for this ragged tensor's values.

rt.row_splits specifies where the values for each row begin and end in rt.values. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

#### Example:

>>> rt = ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
>>> print rt.row_splits  # indices of row splits in rt.values
tf.Tensor([0, 4, 4, 7, 8, 8])


shape The statically known shape of this ragged tensor.
values The concatenated rows for this ragged tensor.

rt.values is a potentially ragged tensor formed by flattening the two outermost dimensions of rt into a single dimension.

rt.values.shape = [nvals] + rt.shape[2:] (where nvals is the number of items in the outer two dimensions of rt).

rt.ragged_rank = self.ragged_rank - 1

#### Example:

>>> rt = ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
>>> print rt.values
tf.Tensor([3, 1, 4, 1, 5, 9, 2, 6])