The dataset contains table-question pairs and their respective answers. The questions require multi-step reasoning and various data operations such as comparison, aggregation, and arithmetic computation. The tables were randomly selected from Wikipedia tables with at least 8 rows and 5 columns.
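To make the three operation types concrete, here is a minimal sketch on a toy table. The table, its column names, and the questions are invented for illustration and are not taken from the dataset; real tables are larger (at least 8 rows and 5 columns).

```python
# Hypothetical toy table, invented for illustration only.
rows = [
    {"team": "A", "wins": 10, "losses": 4},
    {"team": "B", "wins": 7, "losses": 7},
    {"team": "C", "wins": 12, "losses": 2},
]

# Comparison: "Which team has the most wins?"
most_wins = max(rows, key=lambda r: r["wins"])["team"]

# Aggregation: "How many teams have at least 10 wins?"
teams_with_10_wins = sum(1 for r in rows if r["wins"] >= 10)

# Arithmetic: "How many more wins does team C have than team B?"
by_team = {r["team"]: r for r in rows}
win_gap = by_team["C"]["wins"] - by_team["B"]["wins"]

print(most_wins, teams_with_10_wins, win_gap)
```

A single question in the dataset may combine several such steps, which is what makes the reasoning multi-step.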

(As per the documentation usage notes)

  • Splits:
    Split              Examples
    'split-1-dev'      2,810
    'split-1-train'    11,321
    'split-2-dev'      2,838
    'split-2-train'    11,312
    'split-3-dev'      2,838
    'split-3-train'    11,311
    'test'             4,344
    'train'            14,149
  • Feature structure:
    'input_text': FeaturesDict({
        'context': string,
        'table': Sequence({
            'column_header': string,
            'content': string,
            'row_number': int16,
        }),
    }),
    'target_text': string,
  • Feature documentation:
    Feature                           Class           Shape    Dtype     Description
    input_text                        FeaturesDict
    input_text/context                Tensor                   string
    input_text/table                  Sequence
    input_text/table/column_header    Tensor                   string
    input_text/table/content          Tensor                   string
    input_text/table/row_number       Tensor                   int16
    target_text                       Tensor                   string
