- Description:
The Answer Equivalence Dataset contains human ratings of predictions from several models on the SQuAD dataset. The ratings establish whether a predicted answer is 'equivalent' to the gold answer, taking both the question and the context into account. More specifically, by 'equivalent' we mean that the predicted answer contains at least the same information as the gold answer and does not add superfluous information. The dataset contains annotations for:
  * predictions from BiDAF on SQuAD dev
  * predictions from XLNet on SQuAD dev
  * predictions from Luke on SQuAD dev
  * predictions from Albert on SQuAD training, dev and test examples
Homepage: https://github.com/google-research-datasets/answer-equivalence-dataset
Source code:
tfds.datasets.answer_equivalence.Builder
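A minimal loading sketch, assuming `tensorflow_datasets` is installed and that the dataset is registered under the name `answer_equivalence` (as the builder path above suggests):

```python
import tensorflow_datasets as tfds

# Load the 'train' split together with the DatasetInfo object.
# The first call downloads and prepares the data (~45.86 MiB).
ds, info = tfds.load('answer_equivalence', split='train', with_info=True)

print(info.description)
print(info.splits)
```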
- Versions:
1.0.0 (default): Initial release.
Download size:
45.86 MiB
Dataset size:
47.24 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'ae_dev' | 4,446 |
'ae_test' | 9,724 |
'dev_bidaf' | 7,522 |
'dev_luke' | 4,590 |
'dev_xlnet' | 7,932 |
'train' | 9,090 |
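Any of the splits above can be requested by name. A short sketch (the split name 'dev_bidaf' and its size are taken from the table above):

```python
import tensorflow_datasets as tfds

# Load only the BiDAF dev-set annotations.
ds = tfds.load('answer_equivalence', split='dev_bidaf')

# Cardinality should match the table above (7,522 examples).
print(ds.cardinality().numpy())
```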
- Feature structure:
FeaturesDict({
'candidate': Text(shape=(), dtype=string),
'context': Text(shape=(), dtype=string),
'gold_index': int32,
'qid': Text(shape=(), dtype=string),
'question': Text(shape=(), dtype=string),
'question_1': ClassLabel(shape=(), dtype=int64, num_classes=3),
'question_2': ClassLabel(shape=(), dtype=int64, num_classes=3),
'question_3': ClassLabel(shape=(), dtype=int64, num_classes=3),
'question_4': ClassLabel(shape=(), dtype=int64, num_classes=3),
'reference': Text(shape=(), dtype=string),
'score': float32,
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
 | FeaturesDict | | | |
candidate | Text | | string | |
context | Text | | string | |
gold_index | Tensor | | int32 | |
qid | Text | | string | |
question | Text | | string | |
question_1 | ClassLabel | | int64 | |
question_2 | ClassLabel | | int64 | |
question_3 | ClassLabel | | int64 | |
question_4 | ClassLabel | | int64 | |
reference | Text | | string | |
score | Tensor | | float32 | |
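To see how these features come out in practice, a hedged inspection sketch: the field names come from the feature structure above, and the `int2str` lookup is the standard TFDS ClassLabel API, but the actual label strings behind the 3-way ratings are not documented here.

```python
import tensorflow_datasets as tfds

ds, info = tfds.load('answer_equivalence', split='ae_dev', with_info=True)

# Take one example and convert tensors to plain NumPy/Python values.
for ex in tfds.as_numpy(ds.take(1)):
    print('question: ', ex['question'].decode('utf-8'))
    print('candidate:', ex['candidate'].decode('utf-8'))
    print('reference:', ex['reference'].decode('utf-8'))
    print('score:    ', ex['score'])
    # Map each 3-way ClassLabel rating index back to its string name.
    for key in ('question_1', 'question_2', 'question_3', 'question_4'):
        print(key, '->', info.features[key].int2str(int(ex[key])))
```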
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
- Citation:
@article{bulian-etal-2022-tomayto,
title={Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation},
author={Jannis Bulian and Christian Buck and Wojciech Gajewski and Benjamin Boerschinger and Tal Schuster},
year={2022},
eprint={2202.07654},
archivePrefix={arXiv},
primaryClass={cs.CL}
}