Return an alignment from a set of source spans to a set of target spans.
text.span_alignment(
    source_start,
    source_limit,
    target_start,
    target_limit,
    contains=False,
    contained_by=False,
    partial_overlap=False,
    multivalent_result=False,
    name=None
)
The source and target spans are specified using B+1 dimensional tensors, with B>=0 batch dimensions followed by a final dimension that lists the span offsets for each span in the batch:
- The i-th source span in batch b1...bB starts at source_start[b1...bB, i] (inclusive), and extends to just before source_limit[b1...bB, i] (exclusive).
- The j-th target span in batch b1...bB starts at target_start[b1...bB, j] (inclusive), and extends to just before target_limit[b1...bB, j] (exclusive).
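As a rough illustration of the half-open span convention, the following pure-Python sketch lists the positions covered by individual spans when there is one batch dimension (B=1). The helper `span_positions` and the offsets are hypothetical, and nested lists stand in for tensors.

```python
# Hypothetical offsets with B=1: two batch items, each holding two spans.
source_start = [[0, 10], [3, 8]]
source_limit = [[5, 15], [6, 12]]


def span_positions(start, limit, b, i):
    """Positions covered by the i-th span of batch item b.

    The start offset is inclusive and the limit offset is exclusive.
    """
    return list(range(start[b][i], limit[b][i]))


print(span_positions(source_start, source_limit, 0, 0))  # [0, 1, 2, 3, 4]
print(span_positions(source_start, source_limit, 1, 1))  # [8, 9, 10, 11]
```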
result[b1...bB, i] contains the index (or indices) of the target span that overlaps with the i-th source span in batch b1...bB. The multivalent_result parameter indicates whether the result should contain a single span that aligns with the source span, or all spans that align with the source span.
If multivalent_result is false (the default), then result[b1...bB, i]=j indicates that the j-th target span overlaps with the i-th source span in batch b1...bB. If no target spans overlap with the i-th source span, then result[b1...bB, i]=-1.

If multivalent_result is true, then result[b1...bB, i, n]=j indicates that the j-th target span is the n-th span that overlaps with the i-th source span in batch b1...bB.
For a definition of span overlap, see the docstring for span_overlaps().
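The two result encodings can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper `encode_alignment` and its input (a list giving, for each source span, the indices of the overlapping target spans) are hypothetical. When several target spans overlap and a single index is requested, this sketch takes the first one, which is an assumption rather than documented behavior.

```python
def encode_alignment(overlapping_targets, multivalent_result=False):
    """Encode per-source lists of overlapping target indices.

    overlapping_targets[i] lists every target index j that overlaps
    source span i (hypothetical input for this sketch).
    """
    if multivalent_result:
        # One variable-length list of target indices per source span.
        return [list(js) for js in overlapping_targets]
    # One index per source span, or -1 when nothing overlaps.
    # Picking the first index is an arbitrary choice made by this sketch.
    return [js[0] if js else -1 for js in overlapping_targets]


overlapping = [[0], [], [2, 3]]
print(encode_alignment(overlapping))                           # [0, -1, 2]
print(encode_alignment(overlapping, multivalent_result=True))  # [[0], [], [2, 3]]
```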
Examples:
Given the following source and target spans (with no batch dimensions):
#         0    5    10   15   20   25   30   35   40   45   50   55   60
#         |====|====|====|====|====|====|====|====|====|====|====|====|
# Source: [-0-]     [-1-] [2] [3]    [4][-5-][-6-][-7-][-8-][-9-]
# Target: [-0-][-1-]     [-2-][-3-][-4-] [5] [6]    [7]  [-8-][-9-][10]
#         |====|====|====|====|====|====|====|====|====|====|====|====|
from tensorflow_text import span_alignment
source_starts = [0, 10, 16, 20, 27, 30, 35, 40, 45, 50]
source_limits = [5, 15, 19, 23, 30, 35, 40, 45, 50, 55]
target_starts = [0, 5, 15, 20, 25, 31, 35, 42, 47, 52, 57]
target_limits = [5, 10, 20, 25, 30, 34, 38, 45, 52, 57, 61]
span_alignment(source_starts, source_limits, target_starts, target_limits)
<tf.Tensor: shape=(10,), dtype=int64,
    numpy=array([ 0, -1, -1, -1, -1, -1, -1, -1, -1, -1])>
span_alignment(source_starts, source_limits, target_starts, target_limits,
               multivalent_result=True)
<tf.RaggedTensor [[0], [], [], [], [], [], [], [], [], []]>
span_alignment(source_starts, source_limits, target_starts, target_limits,
               contains=True)
<tf.Tensor: shape=(10,), dtype=int64,
    numpy=array([ 0, -1, -1, -1, -1,  5,  6,  7, -1, -1])>
span_alignment(source_starts, source_limits, target_starts, target_limits,
               partial_overlap=True, multivalent_result=True)
<tf.RaggedTensor [[0], [], [2], [3], [4], [5], [6], [7], [8], [8, 9]]>
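The outputs above can be reproduced with a small pure-Python sketch for the unbatched case. The overlap rules in it are an assumption inferred from these example outputs (identical spans always align, and each flag adds one more way to align); the authoritative definition is the span_overlaps() docstring. The helper name `overlapping_targets` is hypothetical, and the sketch returns the multivalent form only.

```python
def overlapping_targets(source_start, source_limit, target_start, target_limit,
                        contains=False, contained_by=False,
                        partial_overlap=False):
    """For each source span, list the indices of target spans that align."""
    result = []
    for s0, s1 in zip(source_start, source_limit):
        matches = []
        for j, (t0, t1) in enumerate(zip(target_start, target_limit)):
            # Assumed rule: identical spans always align.
            hit = s0 == t0 and s1 == t1
            if contains:
                # Source span contains the target span.
                hit = hit or (s0 <= t0 and t1 <= s1)
            if contained_by:
                # Source span is contained by the target span.
                hit = hit or (t0 <= s0 and s1 <= t1)
            if partial_overlap:
                # Spans share at least one position (assumed from the outputs).
                hit = hit or (s0 < t1 and t0 < s1)
            if hit:
                matches.append(j)
        result.append(matches)
    return result


source_starts = [0, 10, 16, 20, 27, 30, 35, 40, 45, 50]
source_limits = [5, 15, 19, 23, 30, 35, 40, 45, 50, 55]
target_starts = [0, 5, 15, 20, 25, 31, 35, 42, 47, 52, 57]
target_limits = [5, 10, 20, 25, 30, 34, 38, 45, 52, 57, 61]

spans = (source_starts, source_limits, target_starts, target_limits)
print(overlapping_targets(*spans))
# [[0], [], [], [], [], [], [], [], [], []]
print(overlapping_targets(*spans, contains=True))
# [[0], [], [], [], [], [5], [6], [7], [], []]
print(overlapping_targets(*spans, partial_overlap=True))
# [[0], [], [2], [3], [4], [5], [6], [7], [8], [8, 9]]
```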
Returns | |
---|---|
An int64 tensor with values in the range -1 <= result < target_size, where target_size is the length of the target_start and target_limit input tensors. If multivalent_result=False, then the returned tensor has shape [source_size], where source_size is the length of the source_start and source_limit input tensors. If multivalent_result=True, then the returned tensor has shape [source_size, (num_aligned_target_spans)], where the last dimension is ragged, as in the tf.RaggedTensor outputs in the examples above. |
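One possible way to consume the single-valued result is to carry per-target information over to the source spans. The sketch below is an illustration in plain Python: the label list `target_labels` is hypothetical, and the alignment values are copied from the contains=True example. The -1 entries need an explicit check, because a negative index in Python would silently select an element from the end of the list.

```python
# Alignment copied from the contains=True example (one entry per source span).
alignment = [0, -1, -1, -1, -1, 5, 6, 7, -1, -1]

# Hypothetical labels, one per target span (11 target spans in the example).
target_labels = ["T%d" % j for j in range(11)]

# Map each source span to its aligned target's label, or None if unaligned.
source_labels = [target_labels[j] if j >= 0 else None for j in alignment]

print(source_labels)
# ['T0', None, None, None, None, 'T5', 'T6', 'T7', None, None]
```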