cfq

  • Description:

The CFQ dataset (and its splits) for measuring compositional generalization.

See arxiv.org/abs/1912.09713.pdf for background.

Example usage: data = tfds.load('cfq/mcd1')

  • Config description: The CFQ dataset (and its splits) for measuring compositional generalization.

See arxiv.org/abs/1912.09713.pdf for background.

Example usage: data = tfds.load('cfq/mcd1')

  • Homepage: https://github.com/google-research/google-research/tree/master/cfq

  • Source code: tfds.text.cfq.CFQ
  • Versions:
      • 1.0.1 (default): No release notes.
  • Download size: 255.20 MiB
  • Auto-cached (documentation): Yes
  • Features:
FeaturesDict({
    'query': Text(shape=(), dtype=tf.string),
    'question': Text(shape=(), dtype=tf.string),
})
  • Citation:
@inproceedings{Keysers2020,
  title={Measuring Compositional Generalization: A Comprehensive Method on
         Realistic Data},
  author={Daniel Keysers and Nathanael Sch\"{a}rli and Nathan Scales and
          Hylke Buisman and Daniel Furrer and Sergii Kashubin and
          Nikola Momchev and Danila Sinopalnikov and Lukasz Stafiniak and
          Tibor Tihon and Dmitry Tsarkov and Xiao Wang and Marc van Zee and
          Olivier Bousquet},
  booktitle={ICLR},
  year={2020},
  url={https://arxiv.org/abs/1912.09713.pdf},
}
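
The usage line and feature spec above can be put together in a short loading sketch. It assumes the `tensorflow_datasets` package is installed and that the data can be downloaded on first use. The helper names (`cfq_name`, `load_cfq`, `decode_example`) are hypothetical and are not part of TFDS. The config names are the ones listed on this page.

```python
# Config names as listed on this page.
CONFIGS = (
    "mcd1", "mcd2", "mcd3",
    "question_complexity_split", "question_pattern_split",
    "query_complexity_split", "query_pattern_split",
    "random_split",
)


def cfq_name(config: str = "mcd1") -> str:
    """Build the TFDS name string for a CFQ config, e.g. 'cfq/mcd1'."""
    if config not in CONFIGS:
        raise ValueError(f"unknown CFQ config: {config!r}")
    return f"cfq/{config}"


def load_cfq(config: str = "mcd1", split: str = "train"):
    """Load one split of one config as a tf.data.Dataset (hypothetical helper)."""
    # Imported lazily so the pure-Python helpers work without TensorFlow.
    import tensorflow_datasets as tfds
    return tfds.load(cfq_name(config), split=split)


def decode_example(example: dict) -> tuple:
    """Decode the two tf.string features, which arrive as bytes via as_numpy."""
    question = example["question"].decode("utf-8")
    query = example["query"].decode("utf-8")
    return question, query


if __name__ == "__main__":
    import tensorflow_datasets as tfds

    train = load_cfq("mcd1", "train")
    # Print the first two question/query pairs.
    for example in tfds.as_numpy(train.take(2)):
        question, query = decode_example(example)
        print(question)
        print(query)
```

Passing `split=` returns a single `tf.data.Dataset`. Without it, `tfds.load` returns a dictionary keyed by split name, as in the one-line usage example above.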

cfq/mcd1 (default config)

  • Dataset size: 44.15 MiB
  • Splits:
Split      Examples
'test'     11,968
'train'    95,743

cfq/mcd2

  • Dataset size: 45.94 MiB
  • Splits:
Split      Examples
'test'     11,968
'train'    95,743

cfq/mcd3

  • Dataset size: 44.82 MiB
  • Splits:
Split      Examples
'test'     11,968
'train'    95,743

cfq/question_complexity_split

  • Dataset size: 46.98 MiB
  • Splits:
Split      Examples
'test'     10,340
'train'    98,999

cfq/question_pattern_split

  • Dataset size: 47.53 MiB
  • Splits:
Split      Examples
'test'     11,909
'train'    95,654

cfq/query_complexity_split

  • Dataset size: 47.13 MiB
  • Splits:
Split      Examples
'test'     9,512
'train'    100,654

cfq/query_pattern_split

  • Dataset size: 47.21 MiB
  • Splits:
Split      Examples
'test'     12,589
'train'    94,600

cfq/random_split

  • Dataset size: 47.58 MiB
  • Splits:
Split      Examples
'test'     11,967
'train'    95,744
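
To compare the configs, the split sizes listed above can be collected in one place and the held-out share computed for each. This is only an arithmetic sketch over the numbers on this page. `SPLIT_SIZES` and `held_out_fraction` are illustrative names and do not come from TFDS.

```python
# Example counts copied from the split tables above: config -> (train, test).
SPLIT_SIZES = {
    "mcd1": (95_743, 11_968),
    "mcd2": (95_743, 11_968),
    "mcd3": (95_743, 11_968),
    "question_complexity_split": (98_999, 10_340),
    "question_pattern_split": (95_654, 11_909),
    "query_complexity_split": (100_654, 9_512),
    "query_pattern_split": (94_600, 12_589),
    "random_split": (95_744, 11_967),
}


def held_out_fraction(config: str) -> float:
    """Share of a config's listed examples that fall in its 'test' split."""
    train, test = SPLIT_SIZES[config]
    return test / (train + test)


if __name__ == "__main__":
    for config, (train, test) in SPLIT_SIZES.items():
        total = train + test
        print(f"cfq/{config:<26} total={total:>7,} test={held_out_fraction(config):.1%}")
```

The totals differ between configs, so these percentages describe only the 'train' and 'test' splits listed here.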