Represents a ragged tensor.
tf.RaggedTensor(
values, row_splits, cached_row_lengths=None, cached_value_rowids=None,
cached_nrows=None, internal=False
)
A RaggedTensor is a tensor with one or more ragged dimensions, which are
dimensions whose slices may have different lengths. For example, the inner
(column) dimension of rt=[[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged,
since the column slices (rt[0, :], ..., rt[4, :]) have different lengths.
Dimensions whose slices all have the same length are called uniform
dimensions. The outermost dimension of a RaggedTensor is always uniform,
since it consists of a single slice (and so there is no possibility for
differing slice lengths).
The total number of dimensions in a RaggedTensor is called its rank,
and the number of ragged dimensions in a RaggedTensor is called its
ragged-rank. A RaggedTensor's ragged-rank is fixed at graph creation
time: it can't depend on the runtime values of Tensors, and can't vary
dynamically for different session runs.
Potentially Ragged Tensors
Many ops support both Tensors and RaggedTensors. The term "potentially
ragged tensor" may be used to refer to a tensor that might be either a
Tensor or a RaggedTensor. The ragged-rank of a Tensor is zero.
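As a minimal sketch of how code might handle a potentially ragged tensor, the helper below returns the ragged-rank of either kind of input. The name `ragged_rank_of` is hypothetical (it is not part of the TensorFlow API), and the example assumes TensorFlow 2.x.

```python
import tensorflow as tf


def ragged_rank_of(x):
    """Hypothetical helper: ragged-rank of a potentially ragged tensor."""
    if isinstance(x, tf.RaggedTensor):
        return x.ragged_rank
    # A plain Tensor has no ragged dimensions, so its ragged-rank is zero.
    return 0


dense = tf.constant([[1, 2], [3, 4]])
ragged = tf.ragged.constant([[1, 2], [3]])

print(ragged_rank_of(dense), ragged_rank_of(ragged))
```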
Documenting RaggedTensor Shapes
When documenting the shape of a RaggedTensor, ragged dimensions can be
indicated by enclosing them in parentheses. For example, the shape of
a 3-D RaggedTensor that stores the fixed-size word embedding for each
word in a sentence, for each sentence in a batch, could be written as
[num_sentences, (num_words), embedding_size]. The parentheses around
(num_words) indicate that dimension is ragged, and that the length
of each element list in that dimension may vary for each item.
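The shape notation above can be illustrated with a small batch of word embeddings. This is a sketch assuming TensorFlow 2.x; the sizes (two sentences of 3 and 2 words, `embedding_size = 4`) and the variable names are made up for illustration. The ragged dimension appears as `None` in the static shape.

```python
import tensorflow as tf

embedding_size = 4  # illustrative value, not from the original text

# Five word embeddings in total: 3 words in the first sentence, 2 in the second.
word_embeddings = tf.zeros([5, embedding_size])
batch = tf.RaggedTensor.from_row_lengths(word_embeddings, row_lengths=[3, 2])

# Documented shape: [num_sentences, (num_words), embedding_size]
static_shape = batch.shape.as_list()
print(static_shape)
```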
Component Tensors
Internally, a RaggedTensor consists of a concatenated list of values that
are partitioned into variable-length rows. In particular, each RaggedTensor
consists of:
- A values tensor, which concatenates the variable-length rows into a flattened list. For example, the values tensor for [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is [3, 1, 4, 1, 5, 9, 2, 6].
- A row_splits vector, which indicates how those flattened values are divided into rows. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].
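The slicing rule for `row_splits` can be sketched in plain Python, without TensorFlow. The function name `rows_from_row_splits` is hypothetical and operates on ordinary lists; it only illustrates the partitioning rule, not how the library stores data.

```python
def rows_from_row_splits(values, row_splits):
    """Rebuild nested rows from a flat list of values and a row_splits list."""
    return [
        values[row_splits[i]:row_splits[i + 1]]
        for i in range(len(row_splits) - 1)
    ]


rows = rows_from_row_splits([3, 1, 4, 1, 5, 9, 2, 6], [0, 4, 4, 7, 8, 8])
print(rows)  # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
```

Repeated entries in `row_splits` (here 4, 4 and 8, 8) produce empty rows.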
Example:
print(tf.RaggedTensor.from_row_splits(
values=[3, 1, 4, 1, 5, 9, 2, 6],
row_splits=[0, 4, 4, 7, 8, 8]))
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]>
Alternative Row-Partitioning Schemes
In addition to row_splits, ragged tensors provide support for four other
row-partitioning schemes:
- row_lengths: a vector with shape [nrows], which specifies the length of each row.
- value_rowids and nrows: value_rowids is a vector with shape [nvals], corresponding one-to-one with values, which specifies each value's row index. In particular, the row rt[row] consists of the values rt.values[j] where value_rowids[j]==row. nrows is an integer scalar that specifies the number of rows in the RaggedTensor. (nrows is used to indicate trailing empty rows.)
- row_starts: a vector with shape [nrows], which specifies the start offset of each row. Equivalent to row_splits[:-1].
- row_limits: a vector with shape [nrows], which specifies the stop offset of each row. Equivalent to row_splits[1:].
Example: The following ragged tensors are equivalent, and all represent the
nested list [[3, 1, 4, 1], [], [5, 9, 2], [6], []].
values = [3, 1, 4, 1, 5, 9, 2, 6]
rt1 = RaggedTensor.from_row_splits(values, row_splits=[0, 4, 4, 7, 8, 8])
rt2 = RaggedTensor.from_row_lengths(values, row_lengths=[4, 0, 3, 1, 0])
rt3 = RaggedTensor.from_value_rowids(
values, value_rowids=[0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
rt4 = RaggedTensor.from_row_starts(values, row_starts=[0, 4, 4, 7, 8])
rt5 = RaggedTensor.from_row_limits(values, row_limits=[4, 4, 7, 8, 8])
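To make the relationship between the schemes concrete, the following plain-Python sketch derives each alternative encoding from `row_splits`. The function name `schemes_from_row_splits` is hypothetical; it works on ordinary lists and is meant only to show how the encodings relate.

```python
def schemes_from_row_splits(row_splits):
    """Derive the other row-partitioning encodings from a row_splits list."""
    row_starts = row_splits[:-1]
    row_limits = row_splits[1:]
    row_lengths = [limit - start for start, limit in zip(row_starts, row_limits)]
    # Repeat each row index once per value in that row.
    value_rowids = [row for row, n in enumerate(row_lengths) for _ in range(n)]
    return {
        "row_lengths": row_lengths,
        "value_rowids": value_rowids,
        "nrows": len(row_splits) - 1,
        "row_starts": row_starts,
        "row_limits": row_limits,
    }


schemes = schemes_from_row_splits([0, 4, 4, 7, 8, 8])
for name, encoding in schemes.items():
    print(name, encoding)
```

Note that `value_rowids` alone cannot represent the trailing empty row, which is why `nrows` accompanies it.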
Multiple Ragged Dimensions
RaggedTensors with multiple ragged dimensions can be defined by using
a nested RaggedTensor for the values tensor. Each nested RaggedTensor
adds a single ragged dimension.
inner_rt = RaggedTensor.from_row_splits(  # =rt1 from above
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])
outer_rt = RaggedTensor.from_row_splits(
    values=inner_rt, row_splits=[0, 3, 3, 5])
print(outer_rt.to_list())
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
print(outer_rt.ragged_rank)
2
The factory function RaggedTensor.from_nested_row_splits may be used to
construct a RaggedTensor with multiple ragged dimensions directly, by
providing a list of row_splits tensors:
RaggedTensor.from_nested_row_splits(
flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])).to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
Uniform Inner Dimensions
RaggedTensors with uniform inner dimensions can be defined
by using a multidimensional Tensor for values.
rt = RaggedTensor.from_row_splits(values=tf.ones([5, 3]),
                                  row_splits=[0, 2, 5])
print(rt.to_list())
[[[1, 1, 1], [1, 1, 1]],
 [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
print(rt.shape)
(2, ?, 3)
RaggedTensor Shape Restrictions
The shape of a RaggedTensor is currently restricted to have the following form:
- A single uniform dimension
- Followed by one or more ragged dimensions
- Followed by zero or more uniform dimensions.
This restriction follows from the fact that each nested RaggedTensor
replaces the uniform outermost dimension of its values with a uniform
dimension followed by a ragged dimension.
Args | |
---|---|
values | A potentially ragged tensor of any dtype and shape [nvals, ...]. |
row_splits | A 1-D integer tensor with shape [nrows+1]. |
cached_row_lengths | A 1-D integer tensor with shape [nrows]. |
cached_value_rowids | A 1-D integer tensor with shape [nvals]. |
cached_nrows | An integer scalar tensor. |
internal | True if the constructor is being called by one of the factory methods. If false, an exception will be raised. |
Raises | |
---|---|
TypeError | If a row partitioning tensor has an inappropriate dtype. |
TypeError | If exactly one row partitioning argument was not specified. |
ValueError | If a row partitioning tensor has an inappropriate shape. |
ValueError | If multiple partitioning arguments are specified. |
ValueError | If nrows is specified but value_rowids is not None. |
Attributes | |
---|---|
dtype | The DType of values in this tensor. |
flat_values | The innermost values tensor for this ragged tensor. Concretely, if rt.values is a Tensor, then rt.flat_values is rt.values; otherwise, rt.flat_values is rt.values.flat_values. Conceptually, flat_values is the tensor formed by flattening the outermost dimension and all of the ragged dimensions into a single dimension. |
nested_row_splits | A tuple containing the row_splits for all ragged dimensions. |
ragged_rank | The number of ragged dimensions in this ragged tensor. |
row_splits | The row-split indices for this ragged tensor's values. |
shape | The statically known shape of this ragged tensor. |
values | The concatenated rows for this ragged tensor. |
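The attributes above can be inspected on a small tensor with two ragged dimensions. This is a sketch assuming TensorFlow 2.x with eager execution (needed for `.numpy()` and `.to_list()`); the variable names are illustrative.

```python
import tensorflow as tf

rt = tf.RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8]))

ragged_rank = rt.ragged_rank
static_shape = rt.shape.as_list()
# row_splits describes only the outermost ragged dimension.
outer_splits = rt.row_splits.numpy().tolist()
# nested_row_splits holds one row_splits tensor per ragged dimension.
all_splits = [splits.numpy().tolist() for splits in rt.nested_row_splits]
# values is itself a RaggedTensor here; flat_values is the innermost Tensor.
values_as_list = rt.values.to_list()
flat = rt.flat_values.numpy().tolist()

print(ragged_rank, static_shape)
print(outer_splits, all_splits)
print(values_as_list, flat)
```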
Methods
bounding_shape
bounding_shape(
axis=None, name=None, out_type=None
)
Returns the tight bounding box shape for this RaggedTensor.
Args | |
---|---|
axis | An integer scalar or vector indicating which axes to return the bounding box for. If not specified, then the full bounding box is returned. |
name | A name prefix for the returned tensor (optional). |
out_type | dtype for the returned tensor. Defaults to self.row_splits.dtype. |
Returns | |
---|---|
An integer Tensor (dtype=self.row_splits.dtype). If axis is not specified, then output is a vector with output.shape=[self.shape.ndims]. If axis is a scalar, then the output is a scalar. If axis is a vector, then output is a vector, where output[i] is the bounding size for dimension axis[i]. |
Example:
>>> rt = ragged.constant([[1, 2, 3, 4], [5], [], [6, 7, 8, 9], [10]])
>>> rt.bounding_shape()
[5, 4]
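The effect of the axis argument can be sketched as follows, assuming TensorFlow 2.x with eager execution. The variable names are illustrative.

```python
import tensorflow as tf

rt = tf.ragged.constant([[1, 2, 3, 4], [5], [], [6, 7, 8, 9], [10]])

# No axis: a vector with one entry per dimension.
full_box = rt.bounding_shape().numpy().tolist()
# Scalar axis: a scalar, here the length of the longest row.
longest_row = int(rt.bounding_shape(axis=1).numpy())
# Vector axis: output[i] is the bounding size for dimension axis[i].
reordered = rt.bounding_shape(axis=[1, 0]).numpy().tolist()

print(full_box, longest_row, reordered)
```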
consumers
consumers()
from_nested_row_lengths
@classmethod
from_nested_row_lengths(
flat_values, nested_row_lengths, name=None, validate=True
)
Creates a RaggedTensor from a nested list of row_lengths tensors.
Equivalent to:
result = flat_values
for row_lengths in reversed(nested_row_lengths):
    result = from_row_lengths(result, row_lengths)
Args | |
---|---|
flat_values | A potentially ragged tensor. |
nested_row_lengths | A list of 1-D integer tensors. The i-th tensor is used as the row_lengths for the i-th ragged dimension. |
name | A name prefix for the RaggedTensor (optional). |
validate | If true, then use assertions to check that the arguments form a valid RaggedTensor. |
Returns | |
---|---|
A RaggedTensor (or flat_values if nested_row_lengths is empty). |
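As a usage sketch (assuming TensorFlow 2.x with eager execution), the nested list [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]] used earlier on this page can be built from row lengths, listing the outermost ragged dimension first:

```python
import tensorflow as tf

rt = tf.RaggedTensor.from_nested_row_lengths(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    # Outer dimension: 3 rows holding 3, 0 and 2 inner rows.
    # Inner dimension: 5 rows holding 4, 0, 3, 1 and 0 values.
    nested_row_lengths=([3, 0, 2], [4, 0, 3, 1, 0]))

nested_list = rt.to_list()
print(nested_list)
```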
from_nested_row_splits
@classmethod
from_nested_row_splits(
flat_values, nested_row_splits, name=None, validate=True
)
Creates a RaggedTensor from a nested list of row_splits tensors.
Equivalent to:
result = flat_values
for row_splits in reversed(nested_row_splits):
    result = from_row_splits(result, row_splits)
Args | |
---|---|
flat_values | A potentially ragged tensor. |
nested_row_splits | A list of 1-D integer tensors. The i-th tensor is used as the row_splits for the i-th ragged dimension. |
name | A name prefix for the RaggedTensor (optional). |
validate | If true, then use assertions to check that the arguments form a valid RaggedTensor. |
Returns | |
---|---|
A RaggedTensor (or flat_values if nested_row_splits is empty). |
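The "equivalent to" loop for nested row splits can be traced in plain Python to see why the splits are applied innermost first. Both functions below are hypothetical list-based stand-ins for the TensorFlow factory methods, not the library implementation.

```python
def from_row_splits(values, row_splits):
    """List-based stand-in: partition values into rows using row_splits."""
    return [values[start:limit]
            for start, limit in zip(row_splits[:-1], row_splits[1:])]


def from_nested_row_splits(flat_values, nested_row_splits):
    """Apply each row_splits in turn, starting from the innermost dimension."""
    result = flat_values
    for row_splits in reversed(nested_row_splits):
        result = from_row_splits(result, row_splits)
    return result


nested = from_nested_row_splits(
    [3, 1, 4, 1, 5, 9, 2, 6],
    ([0, 3, 3, 5], [0, 4, 4, 7, 8, 8]))
print(nested)  # [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
```

The last entry of nested_row_splits partitions the flat values into inner rows, and each earlier entry then groups the rows produced by the one after it.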
from_nested_value_rowids
@classmethod
from_nested_value_rowids(
flat_values, nested_value_rowids, nested_nrows=None, name=None, validate=True
)
Creates a RaggedTensor from a nested list of value_rowids tensors.
Equivalent to:
result = flat_values
for (rowids, nrows) in reversed(zip(nested_value_rowids, nested_n