View source on GitHub |
Builds a sliding window for data
with a specified width.
text.sliding_window(
data, width, axis=-1, name=None
)
Returns a tensor constructed from data
, where each element in
dimension axis
is a slice of data
starting at the corresponding
position, with the given width and step size. I.e.:
result.shape.ndims = data.shape.ndims + 1
result[i1..iaxis, a] = data[i1..iaxis, a:a+width]
(where0 <= a < data[i1...iaxis].shape[0] - (width - 1)
).
Note that each result row (along dimension axis
) has width - 1
fewer items
than the corresponding data
row. If a data
row has fewer than width
items, then the corresponding result
row will be empty. If you wish for
the result
rows to be the same size as the data
rows, you can use
pad_along_dimension
to add width - 1
padding elements before calling
this op.
Examples:
Sliding window (width=3) across a sequence of tokens:
# input: <string>[sequence_length]
input = tf.constant(["one", "two", "three", "four", "five", "six"])
# output: <string>[sequence_length-2, 3]
sliding_window(data=input, width=3, axis=0)
<tf.Tensor: shape=(4, 3), dtype=string, numpy=
array([[b'one', b'two', b'three'],
[b'two', b'three', b'four'],
[b'three', b'four', b'five'],
[b'four', b'five', b'six']], dtype=object)>
Sliding window (width=2) across the inner dimension of a ragged matrix containing a batch of token sequences:
# input: <string>[num_sentences, (num_words)]
input = tf.ragged.constant(
[['Up', 'high', 'in', 'the', 'air'],
['Down', 'under', 'water'],
['Away', 'to', 'outer', 'space']])
# output: <string>[num_sentences, (num_word-1), 2]
sliding_window(input, width=2, axis=-1)
<tf.RaggedTensor [[[b'Up', b'high'], [b'high', b'in'], [b'in', b'the'],
[b'the', b'air']], [[b'Down', b'under'],
[b'under', b'water']],
[[b'Away', b'to'], [b'to', b'outer'],
[b'outer', b'space']]]>
Sliding window across the second dimension of a 3-D tensor containing batches of sequences of embedding vectors:
# input: <int32>[num_sequences, sequence_length, embedding_size]
input = tf.constant([
[[1, 1, 1], [2, 2, 1], [3, 3, 1], [4, 4, 1], [5, 5, 1]],
[[1, 1, 2], [2, 2, 2], [3, 3, 2], [4, 4, 2], [5, 5, 2]]])
# output: <int32>[num_sequences, sequence_length-1, 2, embedding_size]
sliding_window(data=input, width=2, axis=1)
<tf.Tensor: shape=(2, 4, 2, 3), dtype=int32, numpy=
array([[[[1, 1, 1],
[2, 2, 1]],
[[2, 2, 1],
[3, 3, 1]],
[[3, 3, 1],
[4, 4, 1]],
[[4, 4, 1],
[5, 5, 1]]],
[[[1, 1, 2],
[2, 2, 2]],
[[2, 2, 2],
[3, 3, 2]],
[[3, 3, 2],
[4, 4, 2]],
[[4, 4, 2],
[5, 5, 2]]]], dtype=int32)>
Returns | |
---|---|
A K+1 dimensional tensor with the same dtype as data , where:
|