
grounded_scan

  • Description:

Grounded SCAN (gSCAN) is a synthetic dataset for evaluating compositional generalization in situated language understanding. gSCAN pairs natural language instructions with action sequences, and requires the agent to interpret instructions within the context of a grid-based visual navigation environment.

More information can be found at: https://github.com/LauraRuis/groundedSCAN

  • Features:

FeaturesDict({
    'command': Sequence(Text(shape=(), dtype=tf.string)),
    'manner': Text(shape=(), dtype=tf.string),
    'meaning': Sequence(Text(shape=(), dtype=tf.string)),
    'referred_target': Text(shape=(), dtype=tf.string),
    'situation': FeaturesDict({
        'agent_direction': tf.int32,
        'agent_position': FeaturesDict({
            'column': tf.int32,
            'row': tf.int32,
        }),
        'direction_to_target': Text(shape=(), dtype=tf.string),
        'distance_to_target': tf.int32,
        'grid_size': tf.int32,
        'placed_objects': Sequence({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=tf.string),
                'shape': Text(shape=(), dtype=tf.string),
                'size': tf.int32,
            }),
            'position': FeaturesDict({
                'column': tf.int32,
                'row': tf.int32,
            }),
            'vector': Text(shape=(), dtype=tf.string),
        }),
        'target_object': FeaturesDict({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=tf.string),
                'shape': Text(shape=(), dtype=tf.string),
                'size': tf.int32,
            }),
            'position': FeaturesDict({
                'column': tf.int32,
                'row': tf.int32,
            }),
            'vector': Text(shape=(), dtype=tf.string),
        }),
    }),
    'target_commands': Sequence(Text(shape=(), dtype=tf.string)),
    'verb_in_command': Text(shape=(), dtype=tf.string),
})
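
As a minimal sketch of how the nested features above might be read, the snippet below loads one example and flattens a few fields. It assumes the `tensorflow_datasets` package is installed and that the dataset can be downloaded; the helper names `decode`, `describe_example`, and `first_dev_example` are illustrative and not part of TFDS.

```python
def decode(value):
    """Turn a bytes scalar (how tf.string appears in NumPy form) into str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def describe_example(example):
    """Summarize one example given as nested dicts of NumPy/Python values."""
    situation = example["situation"]
    agent = situation["agent_position"]
    target = situation["target_object"]["object"]
    return {
        # 'command' is a sequence of word tokens; join them into a sentence.
        "command": " ".join(decode(tok) for tok in example["command"]),
        # 'target_commands' is the action sequence the agent should produce.
        "num_actions": len(example["target_commands"]),
        "agent_cell": (int(agent["row"]), int(agent["column"])),
        "target": (
            decode(target["color"]),
            decode(target["shape"]),
            int(target["size"]),
        ),
        "grid_size": int(situation["grid_size"]),
    }


def first_dev_example():
    """Load one 'dev' example (assumes tensorflow_datasets and a download)."""
    import tensorflow_datasets as tfds  # imported lazily: heavy dependency

    ds = tfds.load("grounded_scan/compositional_splits", split="dev")
    return next(iter(tfds.as_numpy(ds.take(1))))
```

Calling `describe_example(first_dev_example())` would return a small dict with the joined command, the number of target actions, the agent cell, the target object's attributes, and the grid size.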
  • Citation:

@article{DBLP:journals/corr/abs-2003-05161,
  author    = {Laura Ruis and
               Jacob Andreas and
               Marco Baroni and
               Diane Bouchacourt and
               Brenden M. Lake},
  title     = {A Benchmark for Systematic Generalization in Grounded Language Understanding},
  journal   = {CoRR},
  volume    = {abs/2003.05161},
  year      = {2020},
  url       = {https://arxiv.org/abs/2003.05161},
  eprinttype = {arXiv},
  eprint    = {2003.05161},
  timestamp = {Tue, 17 Mar 2020 14:18:27 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2003-05161.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

grounded_scan/compositional_splits (default config)

  • Config description: Examples for compositional generalization.

  • Download size: 82.10 MiB

  • Dataset size: 1004.27 MiB

  • Splits:

Split              Examples
'adverb_1'         112,880
'adverb_2'         38,582
'contextual'       11,460
'dev'              3,716
'situational_1'    88,642
'situational_2'    16,808
'test'             19,282
'train'            367,933
'visual'           37,436
'visual_easier'    18,718

grounded_scan/target_length_split

  • Config description: Examples for generalizing to larger target lengths.

  • Download size: 53.41 MiB

  • Dataset size: 550.15 MiB

  • Splits:

Split               Examples
'dev'               1,821
'target_lengths'    198,588
'test'              37,784
'train'             180,301
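
The configs and splits listed above can be addressed by name when loading. The sketch below copies the example counts from the split tables into a lookup and validates a config/split pair before building the arguments one might pass to `tfds.load`. `SPLIT_SIZES`, `load_args`, and `total_examples` are hypothetical helpers for illustration, not part of TFDS.

```python
# Example counts copied from the split tables on this page.
SPLIT_SIZES = {
    "compositional_splits": {
        "adverb_1": 112_880,
        "adverb_2": 38_582,
        "contextual": 11_460,
        "dev": 3_716,
        "situational_1": 88_642,
        "situational_2": 16_808,
        "test": 19_282,
        "train": 367_933,
        "visual": 37_436,
        "visual_easier": 18_718,
    },
    "target_length_split": {
        "dev": 1_821,
        "target_lengths": 198_588,
        "test": 37_784,
        "train": 180_301,
    },
}


def load_args(config, split):
    """Return (name, split) for tfds.load after checking both are listed."""
    if config not in SPLIT_SIZES:
        raise KeyError(f"unknown config: {config!r}")
    if split not in SPLIT_SIZES[config]:
        raise KeyError(f"config {config!r} has no split {split!r}")
    return f"grounded_scan/{config}", split


def total_examples(config):
    """Sum the example counts over every split of one config."""
    return sum(SPLIT_SIZES[config].values())
```

With `tensorflow_datasets` installed, `name, split = load_args("target_length_split", "target_lengths")` followed by `tfds.load(name, split=split)` would request the held-out split with longer target sequences.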