• Description:

The dataset contains table-question pairs together with their answers. The questions require multi-step reasoning and various data operations such as comparison, aggregation, and arithmetic computation. The tables were randomly selected from Wikipedia tables with at least 8 rows and 5 columns.

(As per the documentation usage notes)

  • Splits:
    Split              Examples
    'split-1-dev'      2,810
    'split-1-train'    11,321
    'split-2-dev'      2,838
    'split-2-train'    11,312
    'split-3-dev'      2,838
    'split-3-train'    11,311
    'test'             4,344
    'train'            14,149
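As a quick sanity check on the split sizes, the sketch below computes the share of examples that land in the dev part of each numbered split. It assumes, based only on the naming, that `split-k-train` and `split-k-dev` are paired partitions; the `SPLIT_SIZES` dictionary and the `dev_fraction` helper are hypothetical names introduced for illustration.

```python
# Example counts copied from the split table above.
SPLIT_SIZES = {
    "split-1-dev": 2810,
    "split-1-train": 11321,
    "split-2-dev": 2838,
    "split-2-train": 11312,
    "split-3-dev": 2838,
    "split-3-train": 11311,
    "test": 4344,
    "train": 14149,
}


def dev_fraction(k, sizes=SPLIT_SIZES):
    """Return the share of examples in numbered split k that are in its dev part."""
    dev = sizes[f"split-{k}-dev"]
    train = sizes[f"split-{k}-train"]
    return dev / (dev + train)


# Assumption: split-k-train and split-k-dev belong together as one partition.
for k in (1, 2, 3):
    print(f"split-{k}: dev fraction = {dev_fraction(k):.3f}")
```

Under that assumption, each numbered split holds out roughly one fifth of its examples for development.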
  • Feature structure:
    FeaturesDict({
        'input_text': FeaturesDict({
            'context': string,
            'table': Sequence({
                'column_header': string,
                'content': string,
                'row_number': int16,
            }),
        }),
        'target_text': string,
    })
  • Feature documentation:
    Feature                           Class           Shape    Dtype     Description
    input_text                        FeaturesDict
    input_text/context                Tensor                   string
    input_text/table                  Sequence
    input_text/table/column_header    Tensor                   string
    input_text/table/content          Tensor                   string
    input_text/table/row_number       Tensor                   int16
    target_text                       Tensor                   string
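The `table` feature stores one entry per cell rather than one entry per row, so a consumer usually needs to regroup the cells by `row_number`. The following is a minimal sketch of that regrouping, assuming the sequence has already been decoded into three parallel Python lists of equal length. The helper name `cells_to_rows` and the sample values are hypothetical, and real decoded strings may arrive as bytes rather than `str`.

```python
from collections import defaultdict


def cells_to_rows(table):
    """Group parallel per-cell lists into one {column_header: content} dict per row."""
    rows = defaultdict(dict)
    for header, content, row in zip(
        table["column_header"], table["content"], table["row_number"]
    ):
        rows[int(row)][header] = content
    # Return rows ordered by their row number.
    return [rows[r] for r in sorted(rows)]


# Hypothetical decoded `input_text/table` value (illustrative data only).
example_table = {
    "column_header": ["Year", "City", "Year", "City"],
    "content": ["1896", "Athens", "1900", "Paris"],
    "row_number": [0, 0, 1, 1],
}

rows = cells_to_rows(example_table)
for row in rows:
    print(row)
```

Keying each row by column header keeps the sketch short, but it would silently overwrite cells if a table had two columns with the same header; a list of `(header, content)` pairs per row avoids that at the cost of slightly clumsier lookups.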
  • Citation:
    title = "Compositional Semantic Parsing on Semi-Structured Tables",
    author = "Pasupat, Panupong  and
      Liang, Percy",
    booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = jul,
    year = "2015",
    address = "Beijing, China",
    publisher = "Association for Computational Linguistics",
    url = "",
    doi = "10.3115/v1/P15-1142",
    pages = "1470--1480",