View source on GitHub
Return an alignment from a set of source spans to a set of target spans.
text.span_alignment(
source_start,
source_limit,
target_start,
target_limit,
contains=False,
contained_by=False,
partial_overlap=False,
multivalent_result=False,
name=None
)
The source and target spans are specified using B+1 dimensional tensors,
with B>=0 batch dimensions followed by a final dimension that lists the
span offsets for each span in the batch:
- The ith source span in batch b1...bB starts at source_start[b1...bB, i] (inclusive), and extends to just before source_limit[b1...bB, i] (exclusive).
- The jth target span in batch b1...bB starts at target_start[b1...bB, j] (inclusive), and extends to just before target_limit[b1...bB, j] (exclusive).
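The half-open (inclusive start, exclusive limit) convention can be illustrated with a small plain-Python sketch. It does not call the library; the strings, the offsets, and the helper `span_text` are made up for illustration, and a single batch dimension (B=1) is assumed.

```python
# Illustrative data only: two batch entries with character-offset spans.
texts = ["hello world", "span alignment"]

# Nested lists of shape [2, num_spans]: start is inclusive, limit is exclusive.
source_start = [[0, 6], [0, 5]]
source_limit = [[5, 11], [4, 14]]


def span_text(b, i):
    """Return the text covered by the i-th source span in batch entry b."""
    return texts[b][source_start[b][i]:source_limit[b][i]]


for b in range(len(texts)):
    for i in range(len(source_start[b])):
        print(b, i, repr(span_text(b, i)))
```

Because the limit is exclusive, the length of a span is simply its limit minus its start, which matches Python slice semantics.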
result[b1...bB, i] contains the index (or indices) of the target span that
overlaps with the ith source span in batch b1...bB. The
multivalent_result parameter indicates whether the result should contain
a single span that aligns with the source span, or all spans that align with
the source span.
- If multivalent_result is false (the default), then result[b1...bB, i]=j indicates that the jth target span overlaps with the ith source span in batch b1...bB. If no target spans overlap with the ith source span, then result[b1...bB, i]=-1.
- If multivalent_result is true, then result[b1...bB, i, n]=j indicates that the jth target span is the nth span that overlaps with the ith source span in batch b1...bB.
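The two result layouts described above can be sketched in plain Python, starting from a boolean overlap matrix with no batch dimensions. The function `encode_alignment` is a hypothetical helper, not part of the library. The text above does not say which target is reported when several overlap and multivalent_result is false, so keeping the first match is an assumption of this sketch.

```python
def encode_alignment(overlaps, multivalent_result=False):
    """Encode an overlap matrix as an alignment result.

    overlaps[i][j] is True if source span i overlaps target span j.
    """
    result = []
    for row in overlaps:
        matches = [j for j, hit in enumerate(row) if hit]
        if multivalent_result:
            # Every overlapping target index; the list may be empty.
            result.append(matches)
        else:
            # Assumption: keep the first match, or -1 when nothing overlaps.
            result.append(matches[0] if matches else -1)
    return result


overlaps = [
    [True, False, False],   # source 0 overlaps target 0
    [False, False, False],  # source 1 overlaps nothing
    [False, True, True],    # source 2 overlaps targets 1 and 2
]
print(encode_alignment(overlaps))                           # [0, -1, 1]
print(encode_alignment(overlaps, multivalent_result=True))  # [[0], [], [1, 2]]
```

The multivalent form has rows of different lengths, which is why the library returns a ragged tensor in that case.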
For a definition of span overlap, see the docstring for span_overlaps().
Examples:
Given the following source and target spans (with no batch dimensions):
#         0    5    10   15   20   25   30   35   40   45   50   55   60
#         |====|====|====|====|====|====|====|====|====|====|====|====|
# Source: [-0-]     [-1-] [2] [3]    [4][-5-][-6-][-7-][-8-][-9-]
# Target: [-0-][-1-]     [-2-][-3-][-4-] [5] [6]    [7]  [-8-][-9-][10]
#         |====|====|====|====|====|====|====|====|====|====|====|====|
source_starts = [0, 10, 16, 20, 27, 30, 35, 40, 45, 50]
source_limits = [5, 15, 19, 23, 30, 35, 40, 45, 50, 55]
target_starts = [0, 5, 15, 20, 25, 31, 35, 42, 47, 52, 57]
target_limits = [5, 10, 20, 25, 30, 34, 38, 45, 52, 57, 61]

span_alignment(source_starts, source_limits, target_starts, target_limits)
<tf.Tensor: shape=(10,), dtype=int64,
    numpy=array([ 0, -1, -1, -1, -1, -1, -1, -1, -1, -1])>

span_alignment(source_starts, source_limits, target_starts, target_limits,
               multivalent_result=True)
<tf.RaggedTensor [[0], [], [], [], [], [], [], [], [], []]>

span_alignment(source_starts, source_limits, target_starts, target_limits,
               contains=True)
<tf.Tensor: shape=(10,), dtype=int64,
    numpy=array([ 0, -1, -1, -1, -1, 5, 6, 7, -1, -1])>

span_alignment(source_starts, source_limits, target_starts, target_limits,
               partial_overlap=True, multivalent_result=True)
<tf.RaggedTensor [[0], [], [2], [3], [4], [5], [6], [7], [8], [8, 9]]>
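As a cross-check on the examples above, the following plain-Python sketch recomputes the alignments without TensorFlow. The overlap rules in `spans_overlap` are an assumption paraphrased from the span_overlaps() documentation (identical spans always match; the three flags add containment and non-empty intersection). Both `spans_overlap` and `align` are hypothetical helpers, not library functions, and this is not how the library implements the op.

```python
def spans_overlap(s_start, s_limit, t_start, t_limit,
                  contains=False, contained_by=False, partial_overlap=False):
    """Assumed overlap rules for half-open spans [start, limit)."""
    if s_start == t_start and s_limit == t_limit:
        return True  # identical spans always align
    if contains and s_start <= t_start and t_limit <= s_limit:
        return True  # source contains target
    if contained_by and t_start <= s_start and s_limit <= t_limit:
        return True  # source is contained by target
    if partial_overlap and max(s_start, t_start) < min(s_limit, t_limit):
        return True  # spans share at least one position
    return False


def align(source_starts, source_limits, target_starts, target_limits, **flags):
    """Return, for each source span, the list of overlapping target indices."""
    targets = list(zip(target_starts, target_limits))
    return [
        [j for j, (ts, tl) in enumerate(targets)
         if spans_overlap(ss, sl, ts, tl, **flags)]
        for ss, sl in zip(source_starts, source_limits)
    ]


source_starts = [0, 10, 16, 20, 27, 30, 35, 40, 45, 50]
source_limits = [5, 15, 19, 23, 30, 35, 40, 45, 50, 55]
target_starts = [0, 5, 15, 20, 25, 31, 35, 42, 47, 52, 57]
target_limits = [5, 10, 20, 25, 30, 34, 38, 45, 52, 57, 61]

exact = align(source_starts, source_limits, target_starts, target_limits)
partial = align(source_starts, source_limits, target_starts, target_limits,
                partial_overlap=True)
print(exact)    # [[0], [], [], [], [], [], [], [], [], []]
print(partial)  # [[0], [], [2], [3], [4], [5], [6], [7], [8], [8, 9]]
```

Touching spans such as source [10, 15) and target [15, 20) share no position under the half-open convention, which is why source span 1 stays unaligned even with partial_overlap=True.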
Returns | |
|---|---|
An int64 tensor with values in the range: -1 <= result < target_size.
If multivalent_result=False, then the returned tensor has shape
[source_size], where source_size is the length of the source_start
and source_limit input tensors. If multivalent_result=True, then the
returned tensor has shape [source_size, (num_aligned_target_spans)].
|