- Description:
QASC is a question-answering dataset with a focus on sentence composition. It consists of 9,980 8-way multiple-choice questions about grade school science (8,134 train, 926 dev, 920 test), and comes with a corpus of 17M sentences.
Homepage: https://allenai.org/data/qasc
Source code: tfds.datasets.qasc.Builder
Versions:
0.1.0 (default): No release notes.
Download size: 1.54 MiB
Dataset size: 6.61 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'test' | 920 |
'train' | 8,134 |
'validation' | 926 |
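The split sizes above can be checked at load time. The following is a minimal sketch, assuming `tensorflow_datasets` is installed and that the dataset is registered under the name `qasc` (inferred from the builder path). The helper `load_qasc_splits` and the constant names are hypothetical and not part of TFDS.

```python
# Split sizes as listed in the Splits table.
EXPECTED_SPLIT_SIZES = {"train": 8134, "validation": 926, "test": 920}

# The three splits together should account for the 9,980 questions
# mentioned in the description.
TOTAL_QUESTIONS = sum(EXPECTED_SPLIT_SIZES.values())


def load_qasc_splits(data_dir=None):
    """Load each QASC split and verify its size (hypothetical helper)."""
    # Imported lazily so the constants above are usable without TFDS installed.
    import tensorflow_datasets as tfds

    splits = {}
    for name, expected in EXPECTED_SPLIT_SIZES.items():
        ds, info = tfds.load("qasc", split=name, data_dir=data_dir, with_info=True)
        actual = info.splits[name].num_examples
        if actual != expected:
            raise ValueError(f"{name}: expected {expected} examples, found {actual}")
        splits[name] = ds
    return splits
```

Calling `load_qasc_splits()` downloads the data on first use, so it needs network access or a populated `data_dir`.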
- Feature structure:
FeaturesDict({
'answerKey': Text(shape=(), dtype=string),
'choices': Sequence({
'label': Text(shape=(), dtype=string),
'text': Text(shape=(), dtype=string),
}),
'combinedfact': Text(shape=(), dtype=string),
'fact1': Text(shape=(), dtype=string),
'fact2': Text(shape=(), dtype=string),
'formatted_question': Text(shape=(), dtype=string),
'id': Text(shape=(), dtype=string),
'question': Text(shape=(), dtype=string),
})
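Since `choices` is a sequence of `label`/`text` pairs and `answerKey` holds only a label, the answer text has to be looked up. The sketch below assumes examples arrive as nested dictionaries in which `choices` is a dictionary of parallel lists, with string values possibly encoded as bytes (as is typical after `tfds.as_numpy`). The helper names and the sample record are made up for illustration; the record is shortened to three choices and is not taken from the dataset.

```python
def _to_str(value):
    """Decode bytes to str; pass other values through str()."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def answer_text(example):
    """Return the text of the choice whose label matches answerKey."""
    key = _to_str(example["answerKey"])
    labels = [_to_str(label) for label in example["choices"]["label"]]
    texts = [_to_str(text) for text in example["choices"]["text"]]
    # list.index raises ValueError if the key is not among the labels.
    return texts[labels.index(key)]


# Made-up record in the assumed layout (not real QASC data).
sample = {
    "answerKey": b"B",
    "choices": {
        "label": [b"A", b"B", b"C"],
        "text": [b"sand", b"sunlight", b"ice"],
    },
    "question": b"What do plants need to make food?",
}

resolved = answer_text(sample)
```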
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
 | FeaturesDict | | | |
answerKey | Text | | string | |
choices | Sequence | | | |
choices/label | Text | | string | |
choices/text | Text | | string | |
combinedfact | Text | | string | |
fact1 | Text | | string | |
fact2 | Text | | string | |
formatted_question | Text | | string | |
id | Text | | string | |
question | Text | | string | |
Supervised keys (See as_supervised doc): None
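With no supervised keys declared, an `(input, label)` pair has to be assembled by hand from the feature dictionary. This is one possible sketch, assuming the same nested-dictionary layout as the feature structure, using `formatted_question` as the input and the position of `answerKey` among the choice labels as an integer label. The function name and the sample record are hypothetical, and other pairings are equally valid.

```python
def to_input_and_label(example):
    """Map one example to (formatted_question, index of the correct choice)."""

    def decode(value):
        # Values may be bytes when examples come from NumPy iteration.
        return value.decode("utf-8") if isinstance(value, bytes) else str(value)

    labels = [decode(label) for label in example["choices"]["label"]]
    label_index = labels.index(decode(example["answerKey"]))
    return decode(example["formatted_question"]), label_index


# Made-up record for illustration only (not real QASC data).
record = {
    "answerKey": "C",
    "choices": {"label": ["A", "B", "C"], "text": ["one", "two", "three"]},
    "formatted_question": "Which option is third? (A) one (B) two (C) three",
}

pair = to_input_and_label(record)
```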
Figure (tfds.show_examples): Not supported.
- Citation:
@article{allenai:qasc,
author = {Tushar Khot and Peter Clark and Michal Guerquin and Peter Jansen and Ashish Sabharwal},
title = {QASC: A Dataset for Question Answering via Sentence Composition},
journal = {arXiv:1910.11473v2},
year = {2020},
}