• Description:

The HellaSwag dataset is a benchmark for commonsense natural language inference (NLI). Each example pairs a context with several candidate endings that could complete it; the task is to pick the correct ending.

Split                           Examples
'test'                          10,003
'test_ind_activitynet'           1,870
'test_ind_wikihow'               3,132
'test_ood_activitynet'           1,651
'test_ood_wikihow'               3,350
'train'                         39,905
'train_activitynet'             14,740
'train_wikihow'                 25,165
'validation'                    10,042
'validation_ind_activitynet'     1,809
'validation_ind_wikihow'         3,192
'validation_ood_activitynet'     1,434
'validation_ood_wikihow'         3,607
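The ActivityNet and WikiHow sub-splits partition each parent split, so their counts should add up exactly. A small sanity-check sketch (the `splits` dict below just transcribes the table above):

```python
# Example counts transcribed from the split table above.
splits = {
    'test': 10_003,
    'test_ind_activitynet': 1_870,
    'test_ind_wikihow': 3_132,
    'test_ood_activitynet': 1_651,
    'test_ood_wikihow': 3_350,
    'train': 39_905,
    'train_activitynet': 14_740,
    'train_wikihow': 25_165,
    'validation': 10_042,
    'validation_ind_activitynet': 1_809,
    'validation_ind_wikihow': 3_192,
    'validation_ood_activitynet': 1_434,
    'validation_ood_wikihow': 3_607,
}

# Each parent split should equal the sum of its sub-splits.
assert splits['train_activitynet'] + splits['train_wikihow'] == splits['train']
assert sum(v for k, v in splits.items()
           if k.startswith('validation_')) == splits['validation']
assert sum(v for k, v in splits.items()
           if k.startswith('test_')) == splits['test']
print('split counts are internally consistent')
```

This also documents the layout: `train` has only source sub-splits, while `validation` and `test` are further divided into in-domain (`ind`) and out-of-domain (`ood`) portions.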
  • Feature structure:
    FeaturesDict({
        'activity_label': Text(shape=(), dtype=string),
        'context': Text(shape=(), dtype=string),
        'endings': Sequence(Text(shape=(), dtype=string)),
        'label': int32,
        'source_id': Text(shape=(), dtype=string),
        'split_type': Text(shape=(), dtype=string),
    })
  • Feature documentation:
Feature          Class            Shape    Dtype    Description
activity_label   Text                      string
context          Text                      string
endings          Sequence(Text)   (None,)  string
label            Tensor                    int32
source_id        Text                      string
split_type       Text                      string
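To make the feature structure concrete, here is a minimal sketch using an entirely invented record that mirrors the fields above (real examples come from the dataset itself; the values here are illustrative only):

```python
# Hypothetical record mirroring the documented feature structure.
# All field values are made up for illustration.
example = {
    'activity_label': 'Making tea',
    'context': 'A person fills a kettle and places it on the stove.',
    'endings': [
        'The kettle is thrown into the garden.',
        'The person waits for the water to boil, then pours it over a tea bag.',
        'The stove turns into a bicycle.',
        'The person paints the kettle blue and leaves the room.',
    ],
    'label': 1,                  # int32 index of the correct ending
    'source_id': 'example~0000',  # invented identifier
    'split_type': 'indomain',
}

def correct_ending(ex):
    """Select the ending indicated by the integer label."""
    return ex['endings'][ex['label']]

print(correct_ending(example))
```

The `label` field is simply an index into `endings`, which is why it is stored as a scalar int32 rather than a string.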
  • Citation:
    @inproceedings{zellers2019hellaswag,
        title={HellaSwag: Can a Machine Really Finish Your Sentence?},
        author={Zellers, Rowan and Holtzman, Ari and Bisk, Yonatan and Farhadi, Ali and Choi, Yejin},
        booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
        year={2019}
    }