cosmos_qa

  • Description:

Cosmos QA is a large-scale dataset of 35.6K problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. It focuses on reading between the lines over a diverse collection of people's everyday narratives, asking questions concerning on the likely causes or effects of events that require reasoning beyond the exact text spans in the context.

Split Examples
'test' 6,963
'train' 25,262
'validation' 2,985
  • Features:
FeaturesDict({
    'answer0': Text(shape=(), dtype=tf.string),
    'answer1': Text(shape=(), dtype=tf.string),
    'answer2': Text(shape=(), dtype=tf.string),
    'answer3': Text(shape=(), dtype=tf.string),
    'context': Text(shape=(), dtype=tf.string),
    'id': Text(shape=(), dtype=tf.string),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=4),
    'question': Text(shape=(), dtype=tf.string),
})
@inproceedings{huang-etal-2019-cosmos,
    title = "Cosmos {QA}: Machine Reading Comprehension with Contextual Commonsense Reasoning",
    author = "Huang, Lifu  and
      Le Bras, Ronan  and
      Bhagavatula, Chandra  and
      Choi, Yejin",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    year = "2019",
    url = "https://www.aclweb.org/anthology/D19-1243"
}