Computes an LCS-based similarity score between the hypotheses and references.
text.metrics.rouge_l(
hypotheses, references, alpha=None
)
The Rouge-L metric is a score from 0 to 1 indicating how similar two sequences are, based on the length of their longest common subsequence (LCS). In particular, Rouge-L is the weighted harmonic mean (or F-measure) of the LCS precision (the fraction of the hypothesis sequence covered by the LCS) and the LCS recall (the fraction of the reference sequence covered by the LCS).
Source: https://www.microsoft.com/en-us/research/publication/rouge-a-package-for-automatic-evaluation-of-summaries/
This method returns the F-measure, Precision, and Recall for each (hypothesis, reference) pair.
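To make the definitions of LCS precision and recall concrete, here is a minimal pure-Python sketch for a single (hypothesis, reference) pair. The helper names `lcs_length` and `lcs_precision_recall` are illustrative only and are not part of the library; the library operates on batched ragged tensors rather than Python lists.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] is the LCS length of the tokens of `a` seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_precision_recall(hypothesis, reference):
    """Return (precision, recall); 0.0 is used when a sequence is empty."""
    lcs = lcs_length(hypothesis, reference)
    precision = lcs / len(hypothesis) if hypothesis else 0.0
    recall = lcs / len(reference) if reference else 0.0
    return precision, recall


p, r = lcs_precision_recall(["a", "b"], ["b"])
print(p, r)  # 0.5 1.0
```

Here the LCS is `["b"]`, which covers half of the hypothesis and all of the reference.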
Alpha is used as a weight for the harmonic mean of precision and recall. A value of 0 means recall is more important and 1 means precision is more important. Leaving alpha unset implies alpha=.5, which is the default in the official ROUGE-1.5.5.pl script. Setting alpha to a negative number triggers a compatibility mode with the tensor2tensor implementation of ROUGE-L.
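The effect of alpha can be illustrated with a small sketch. It assumes the weighting follows the ROUGE-1.5.5 formula F = P * R / ((1 - alpha) * P + alpha * R) for alpha in [0, 1]; the function name `weighted_f_measure` is hypothetical, and the negative-alpha tensor2tensor compatibility mode is not modeled here.

```python
def weighted_f_measure(precision, recall, alpha=0.5):
    """Weighted harmonic mean of precision and recall, for alpha in [0, 1].

    Assumed formula (ROUGE-1.5.5 style): P * R / ((1 - alpha) * P + alpha * R).
    """
    denominator = (1.0 - alpha) * precision + alpha * recall
    if denominator == 0.0:
        # Avoid division by zero when there is no overlap to weight.
        return 0.0
    return precision * recall / denominator


# With precision 0.5 and recall 1.0:
print(weighted_f_measure(0.5, 1.0, alpha=1))  # 0.5 (equals precision)
print(weighted_f_measure(0.5, 1.0, alpha=0))  # 1.0 (equals recall)
```

Under this formula, the default alpha of 0.5 reduces to the familiar balanced F1 score, 2PR / (P + R).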
import tensorflow as tf
import tensorflow_text as text

hypotheses = tf.ragged.constant([["a", "b"]])
references = tf.ragged.constant([["b"]])
# alpha=1 weights the F-measure entirely toward precision.
f, p, r = text.metrics.rouge_l(hypotheses, references, alpha=1)
print("f: %s, p: %s, r: %s" % (f, p, r))
# Output:
# f: tf.Tensor([0.5], shape=(1,), dtype=float32),
# p: tf.Tensor([0.5], shape=(1,), dtype=float32),
# r: tf.Tensor([1.], shape=(1,), dtype=float32)
Returns | |
---|---|
an (f_measure, p_measure, r_measure) tuple, where each element is a vector of floats with shape [N]. The i-th float in each vector contains the similarity measure of hypotheses[i] and references[i]. |