

  • Description:

RACE is a large-scale reading comprehension dataset with nearly 28,000 passages and nearly 100,000 questions. The dataset is collected from English examinations in China designed for middle school and high school students. It can serve as training and test data for machine comprehension.

  • Feature structure:

FeaturesDict({
    'answers': Sequence(Text(shape=(), dtype=tf.string)),
    'article': Text(shape=(), dtype=tf.string),
    'example_id': Text(shape=(), dtype=tf.string),
    'options': Sequence(Sequence(Text(shape=(), dtype=tf.string))),
    'questions': Sequence(Text(shape=(), dtype=tf.string)),
})

  • Citation:

    title={RACE: Large-scale ReAding Comprehension Dataset From Examinations},
    author={Lai, Guokun and Xie, Qizhe and Liu, Hanxiao and Yang, Yiming and Hovy, Eduard},
    journal={arXiv preprint arXiv:1704.04683},
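
The feature listing above implies that each example holds one article together with parallel sequences of questions, option lists, and answers. The following is a minimal sketch of how such a record could be flattened into one row per question. The helper name `iter_questions` and the sample record are hypothetical, and the code assumes each answer is an option letter such as "A" through "D". Records read through TFDS usually arrive as bytes and would need decoding first.

```python
from typing import Iterator, List, Tuple


def iter_questions(example: dict) -> Iterator[Tuple[str, str, List[str], int]]:
    """Yield (article, question, options, answer_index) for each question."""
    article = example["article"]
    # 'questions', 'options', and 'answers' are parallel sequences.
    for question, options, answer in zip(
        example["questions"], example["options"], example["answers"]
    ):
        # Assumption: answers are option letters, so "A" maps to index 0.
        yield article, question, list(options), ord(answer) - ord("A")


# Hypothetical record that mirrors the feature structure; not real RACE data.
sample = {
    "example_id": "demo-0",
    "article": "Tom walks to school every day.",
    "questions": ["How does Tom get to school?"],
    "options": [["By bus", "On foot", "By bike", "By car"]],
    "answers": ["B"],
}

rows = list(iter_questions(sample))
```

Flattening in this way keeps the article text attached to every question, which is convenient for models that score one question at a time.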

race/high (default config)

  • Dataset size: 52.39 MiB

  • Splits:

Split      Examples
'dev'      1,021
'test'     1,045
'train'    18,728


race/middle

  • Dataset size: 12.51 MiB

  • Splits:

Split      Examples
'dev'      368
'test'     362
'train'    6,409
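
As a quick sanity check, the split counts in the two tables above can be tallied per config and overall. This is a small sketch: the counts are copied from the tables, and the label `race/middle` for the second table is an assumption based on the dataset's middle school and high school subsets.

```python
# Example counts copied from the split tables above.
# Assumption: the second table belongs to a config named "race/middle".
SPLITS = {
    "race/high": {"train": 18_728, "dev": 1_021, "test": 1_045},
    "race/middle": {"train": 6_409, "dev": 368, "test": 362},
}

# Total examples per config (each example is one article with its questions).
totals = {name: sum(counts.values()) for name, counts in SPLITS.items()}

# Total examples across both configs.
grand_total = sum(totals.values())

for name, total in totals.items():
    print(f"{name}: {total:,} examples")
print(f"all configs: {grand_total:,} examples")
```

The combined total is 27,933 examples, which matches the figure of nearly 28,000 passages given in the description.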