xnli

  • Description:

XNLI is a subset of a few thousand examples from MNLI that has been translated into 14 different languages (some of them lower-resource). As with MNLI, the goal is to predict textual entailment: does sentence A imply sentence B, contradict it, or neither? It is a classification task: given two sentences, predict one of three labels.

  • Splits:

Split          Examples
'test'         5,010
'validation'   2,490

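The split sizes above can be checked when loading the data. The following is a minimal sketch, assuming the dataset is registered in TensorFlow Datasets under the name `xnli` and that `tensorflow_datasets` is installed; the helper name `load_xnli_split` and the constant `EXPECTED_SPLIT_SIZES` are hypothetical and exist only for this illustration. Note that only 'test' and 'validation' splits are listed, so requesting a training split is rejected.

```python
# Split sizes copied from the table above.
EXPECTED_SPLIT_SIZES = {"test": 5010, "validation": 2490}


def load_xnli_split(split="validation"):
    """Load one XNLI split and compare its size with the documented count.

    Hypothetical helper: assumes the dataset name "xnli" in TFDS.
    """
    if split not in EXPECTED_SPLIT_SIZES:
        raise ValueError(
            f"Unknown split {split!r}; expected one of {sorted(EXPECTED_SPLIT_SIZES)}"
        )

    # Imported lazily so the size table can be used without TFDS installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load("xnli", split=split, with_info=True)
    actual = info.splits[split].num_examples
    if actual != EXPECTED_SPLIT_SIZES[split]:
        raise RuntimeError(
            f"Split {split!r} has {actual} examples, "
            f"expected {EXPECTED_SPLIT_SIZES[split]}"
        )
    return ds
```

Calling `load_xnli_split("test")` would download the data on first use, so it is best run where network access and disk space are available.
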
  • Feature structure:
FeaturesDict({
    'hypothesis': TranslationVariableLanguages({
        'language': Text(shape=(), dtype=string),
        'translation': Text(shape=(), dtype=string),
    }),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'premise': Translation({
        'ar': Text(shape=(), dtype=string),
        'bg': Text(shape=(), dtype=string),
        'de': Text(shape=(), dtype=string),
        'el': Text(shape=(), dtype=string),
        'en': Text(shape=(), dtype=string),
        'es': Text(shape=(), dtype=string),
        'fr': Text(shape=(), dtype=string),
        'hi': Text(shape=(), dtype=string),
        'ru': Text(shape=(), dtype=string),
        'sw': Text(shape=(), dtype=string),
        'th': Text(shape=(), dtype=string),
        'tr': Text(shape=(), dtype=string),
        'ur': Text(shape=(), dtype=string),
        'vi': Text(shape=(), dtype=string),
        'zh': Text(shape=(), dtype=string),
    }),
})
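The two text features are shaped differently: 'premise' maps each language code to one string, while 'hypothesis' holds parallel 'language' and 'translation' sequences. The sketch below shows one way to turn a single example into per-language (premise, hypothesis) pairs. It runs on a made-up toy example rather than real data, the helper names are hypothetical, and the label order in `LABEL_NAMES` is an assumption that should be verified against the dataset's `ClassLabel` names before use.

```python
# Assumed label order (not stated in this page); verify against the
# dataset's ClassLabel names before relying on it.
LABEL_NAMES = ["entailment", "neutral", "contradiction"]


def _to_str(value):
    """Decode bytes (as returned by NumPy iteration) to str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def iter_language_pairs(example):
    """Yield (language, premise, hypothesis, label_name) tuples.

    Only languages present in both 'premise' and 'hypothesis' are yielded.
    """
    hypothesis = example["hypothesis"]
    # Pair up the parallel 'language' and 'translation' sequences.
    hyp_by_lang = {
        _to_str(lang): _to_str(text)
        for lang, text in zip(hypothesis["language"], hypothesis["translation"])
    }
    label_name = LABEL_NAMES[int(example["label"])]
    for lang in sorted(example["premise"]):
        if lang in hyp_by_lang:
            yield lang, _to_str(example["premise"][lang]), hyp_by_lang[lang], label_name


# Made-up toy example in the shape of the feature structure above.
toy_example = {
    "premise": {
        "en": b"A man is playing a guitar.",
        "fr": b"Un homme joue de la guitare.",
        "de": b"Ein Mann spielt Gitarre.",
    },
    "hypothesis": {
        "language": [b"en", b"fr"],
        "translation": [b"A person is making music.", b"Une personne fait de la musique."],
    },
    "label": 0,
}

pairs = list(iter_language_pairs(toy_example))
for lang, premise, hypothesis, label_name in pairs:
    print(lang, "|", premise, "|", hypothesis, "|", label_name)
```

Keying the hypotheses by language code, rather than relying on position, keeps the pairing correct even if the sequence order differs from the order of the 'premise' keys.
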
  • Feature documentation:
Feature                  Class                          Shape   Dtype    Description
                         FeaturesDict
hypothesis               TranslationVariableLanguages
hypothesis/language      Text                                   string
hypothesis/translation   Text                                   string
label                    ClassLabel                             int64
premise                  Translation
premise/ar               Text                                   string
premise/bg               Text                                   string
premise/de               Text                                   string
premise/el               Text                                   string
premise/en               Text                                   string
premise/es               Text                                   string
premise/fr               Text                                   string
premise/hi               Text                                   string
premise/ru               Text                                   string
premise/sw               Text                                   string
premise/th               Text                                   string
premise/tr               Text                                   string
premise/ur               Text                                   string
premise/vi               Text                                   string
premise/zh               Text                                   string
  • Citation:
@InProceedings{conneau2018xnli,
  author = "Conneau, Alexis
                 and Rinott, Ruty
                 and Lample, Guillaume
                 and Williams, Adina
                 and Bowman, Samuel R.
                 and Schwenk, Holger
                 and Stoyanov, Veselin",
  title = "XNLI: Evaluating Cross-lingual Sentence Representations",
  booktitle = "Proceedings of the 2018 Conference on Empirical Methods
               in Natural Language Processing",
  year = "2018",
  publisher = "Association for Computational Linguistics",
  location = "Brussels, Belgium",
}