References:
abstract_narrative_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/abstract_narrative_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
3000 |
'train' |
2400 |
'validation' |
600 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
anachronisms
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/anachronisms')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
230 |
'train' |
184 |
'validation' |
46 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
analogical_similarity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/analogical_similarity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
323 |
'train' |
259 |
'validation' |
64 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
analytic_entailment
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/analytic_entailment')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
70 |
'train' |
54 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
arithmetic
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/arithmetic')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
15023 |
'train' |
12019 |
'validation' |
3004 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ascii_word_recognition
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/ascii_word_recognition')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
5000 |
'train' |
4000 |
'validation' |
1000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
authorship_verification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/authorship_verification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
880 |
'train' |
704 |
'validation' |
176 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
auto_categorization
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/auto_categorization')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
328 |
'train' |
263 |
'validation' |
65 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
auto_debugging
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/auto_debugging')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
34 |
'train' |
18 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
bbq_lite_json
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/bbq_lite_json')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
16076 |
'train' |
12866 |
'validation' |
3210 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
bridging_anaphora_resolution_barqa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/bridging_anaphora_resolution_barqa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
648 |
'train' |
519 |
'validation' |
129 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
causal_judgment
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/causal_judgment')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
190 |
'train' |
152 |
'validation' |
38 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cause_and_effect
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cause_and_effect')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
153 |
'train' |
123 |
'validation' |
30 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
checkmate_in_one
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/checkmate_in_one')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
3498 |
'train' |
2799 |
'validation' |
699 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
chess_state_tracking
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/chess_state_tracking')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
6000 |
'train' |
4800 |
'validation' |
1200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
chinese_remainder_theorem
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/chinese_remainder_theorem')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
500 |
'train' |
400 |
'validation' |
100 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cifar10_classification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cifar10_classification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
20000 |
'train' |
16000 |
'validation' |
4000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
code_line_description
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/code_line_description')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
60 |
'train' |
44 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
codenames
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/codenames')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
85 |
'train' |
68 |
'validation' |
17 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
color
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/color')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
4000 |
'train' |
3200 |
'validation' |
800 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
common_morpheme
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/common_morpheme')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
50 |
'train' |
34 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
conceptual_combinations
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/conceptual_combinations')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
103 |
'train' |
84 |
'validation' |
19 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
conlang_translation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/conlang_translation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
164 |
'train' |
132 |
'validation' |
32 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
contextual_parametric_knowledge_conflicts
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/contextual_parametric_knowledge_conflicts')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
17528 |
'train' |
14023 |
'validation' |
3505 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
crash_blossom
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/crash_blossom')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
38 |
'train' |
22 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
crass_ai
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/crass_ai')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
44 |
'train' |
28 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cryobiology_spanish
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cryobiology_spanish')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
146 |
'train' |
117 |
'validation' |
29 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cryptonite
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cryptonite')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
26157 |
'train' |
20926 |
'validation' |
5231 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cs_algorithms
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cs_algorithms')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1320 |
'train' |
1056 |
'validation' |
264 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
dark_humor_detection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/dark_humor_detection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
80 |
'train' |
64 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
date_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/date_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
369 |
'train' |
296 |
'validation' |
73 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
disambiguation_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/disambiguation_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
258 |
'train' |
207 |
'validation' |
51 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
discourse_marker_prediction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/discourse_marker_prediction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
857 |
'train' |
686 |
'validation' |
171 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
disfl_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/disfl_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
8000 |
'train' |
6400 |
'validation' |
1600 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
dyck_languages
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/dyck_languages')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1000 |
'train' |
800 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
elementary_math_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/elementary_math_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
38160 |
'train' |
30531 |
'validation' |
7629 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
emoji_movie
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/emoji_movie')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
100 |
'train' |
80 |
'validation' |
20 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
emojis_emotion_prediction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/emojis_emotion_prediction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
131 |
'train' |
105 |
'validation' |
26 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
empirical_judgments
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/empirical_judgments')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
99 |
'train' |
80 |
'validation' |
19 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
english_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/english_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
34 |
'train' |
18 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
english_russian_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/english_russian_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
80 |
'train' |
64 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
entailed_polarity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/entailed_polarity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
148 |
'train' |
119 |
'validation' |
29 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
entailed_polarity_hindi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/entailed_polarity_hindi')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
138 |
'train' |
111 |
'validation' |
27 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
epistemic_reasoning
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/epistemic_reasoning')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2000 |
'train' |
1600 |
'validation' |
400 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
evaluating_information_essentiality
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/evaluating_information_essentiality')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
68 |
'train' |
52 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
fact_checker
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/fact_checker')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
7154 |
'train' |
5724 |
'validation' |
1430 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
fantasy_reasoning
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/fantasy_reasoning')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
201 |
'train' |
161 |
'validation' |
40 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
few_shot_nlg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/few_shot_nlg')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
153 |
'train' |
123 |
'validation' |
30 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
figure_of_speech_detection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/figure_of_speech_detection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
59 |
'train' |
43 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
formal_fallacies_syllogisms_negation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/formal_fallacies_syllogisms_negation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
14200 |
'train' |
11360 |
'validation' |
2840 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
gem
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/gem')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
14802 |
'train' |
11845 |
'validation' |
2957 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
gender_inclusive_sentences_german
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/gender_inclusive_sentences_german')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
200 |
'train' |
160 |
'validation' |
40 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
general_knowledge
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/general_knowledge')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
70 |
'train' |
54 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
geometric_shapes
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/geometric_shapes')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
359 |
'train' |
288 |
'validation' |
71 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
goal_step_wikihow
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/goal_step_wikihow')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
7053 |
'train' |
5643 |
'validation' |
1410 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
gre_reading_comprehension
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/gre_reading_comprehension')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
31 |
'train' |
15 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
hhh_alignment
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/hhh_alignment')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
221 |
'train' |
179 |
'validation' |
42 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
hindi_question_answering
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/hindi_question_answering')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
6610 |
'train' |
5288 |
'validation' |
1322 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
hindu_knowledge
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/hindu_knowledge')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
175 |
'train' |
140 |
'validation' |
35 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
hinglish_toxicity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/hinglish_toxicity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
200 |
'train' |
160 |
'validation' |
40 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
human_organs_senses
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/human_organs_senses')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
42 |
'train' |
26 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
hyperbaton
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/hyperbaton')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
50000 |
'train' |
40000 |
'validation' |
10000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
identify_math_theorems
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/identify_math_theorems')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
53 |
'train' |
37 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
identify_odd_metaphor
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/identify_odd_metaphor')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
47 |
'train' |
31 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
implicatures
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/implicatures')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
492 |
'train' |
394 |
'validation' |
98 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
implicit_relations
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/implicit_relations')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
85 |
'train' |
68 |
'validation' |
17 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
intent_recognition
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/intent_recognition')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
693 |
'train' |
555 |
'validation' |
138 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
international_phonetic_alphabet_nli
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/international_phonetic_alphabet_nli')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
126 |
'train' |
101 |
'validation' |
25 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
international_phonetic_alphabet_transliterate
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/international_phonetic_alphabet_transliterate')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1003 |
'train' |
803 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
intersect_geometry
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/intersect_geometry')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
249999 |
'train' |
200000 |
'validation' |
49999 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
irony_identification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/irony_identification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
99 |
'train' |
80 |
'validation' |
19 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
kanji_ascii
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/kanji_ascii')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1092 |
'train' |
875 |
'validation' |
217 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
kannada
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/kannada')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
316 |
'train' |
253 |
'validation' |
63 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
key_value_maps
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/key_value_maps')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
101 |
'train' |
80 |
'validation' |
21 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
known_unknowns
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/known_unknowns')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
46 |
'train' |
30 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
language_games
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/language_games')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2128 |
'train' |
1704 |
'validation' |
424 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
language_identification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/language_identification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
10000 |
'train' |
8000 |
'validation' |
2000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
linguistic_mappings
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/linguistic_mappings')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
15527 |
'train' |
12426 |
'validation' |
3101 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
linguistics_puzzles
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/linguistics_puzzles')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2000 |
'train' |
1600 |
'validation' |
400 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
list_functions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/list_functions')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
10750 |
'train' |
8700 |
'validation' |
2050 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
logic_grid_puzzle
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/logic_grid_puzzle')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1000 |
'train' |
800 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
logical_args
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/logical_args')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
32 |
'train' |
16 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
logical_deduction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/logical_deduction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1500 |
'train' |
1200 |
'validation' |
300 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
logical_fallacy_detection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/logical_fallacy_detection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2800 |
'train' |
2240 |
'validation' |
560 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
logical_sequence
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/logical_sequence')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
39 |
'train' |
23 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
mathematical_induction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/mathematical_induction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
69 |
'train' |
53 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
matrixshapes
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/matrixshapes')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
4462 |
'train' |
3570 |
'validation' |
892 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
metaphor_boolean
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/metaphor_boolean')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
680 |
'train' |
544 |
'validation' |
136 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
metaphor_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/metaphor_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
234 |
'train' |
188 |
'validation' |
46 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
minute_mysteries_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/minute_mysteries_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
477 |
'train' |
383 |
'validation' |
94 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
misconceptions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/misconceptions')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
219 |
'train' |
176 |
'validation' |
43 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
misconceptions_russian
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/misconceptions_russian')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
49 |
'train' |
33 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
mnist_ascii
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/mnist_ascii')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
69984 |
'train' |
55988 |
'validation' |
13996 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
modified_arithmetic
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/modified_arithmetic')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
6000 |
'train' |
4800 |
'validation' |
1200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
moral_permissibility
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/moral_permissibility')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
342 |
'train' |
274 |
'validation' |
68 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
movie_dialog_same_or_different
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/movie_dialog_same_or_different')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
50000 |
'train' |
40000 |
'validation' |
10000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
movie_recommendation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/movie_recommendation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
500 |
'train' |
400 |
'validation' |
100 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
mult_data_wrangling
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/mult_data_wrangling')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
7854 |
'train' |
6380 |
'validation' |
1474 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
multiemo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/multiemo')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1437281 |
'train' |
1149873 |
'validation' |
287408 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
natural_instructions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/natural_instructions')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
193250 |
'train' |
154615 |
'validation' |
38635 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
navigate
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/navigate')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1000 |
'train' |
800 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
nonsense_words_grammar
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/nonsense_words_grammar')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
50 |
'train' |
34 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
novel_concepts
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/novel_concepts')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
32 |
'train' |
16 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
object_counting
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/object_counting')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1000 |
'train' |
800 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
odd_one_out
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/odd_one_out')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
86 |
'train' |
69 |
'validation' |
17 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
operators
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/operators')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
210 |
'train' |
168 |
'validation' |
42 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
paragraph_segmentation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/paragraph_segmentation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
9000 |
'train' |
7200 |
'validation' |
1800 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
parsinlu_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/parsinlu_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1050 |
'train' |
840 |
'validation' |
210 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
parsinlu_reading_comprehension
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/parsinlu_reading_comprehension')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
518 |
'train' |
415 |
'validation' |
103 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
penguins_in_a_table
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/penguins_in_a_table')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
149 |
'train' |
120 |
'validation' |
29 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
periodic_elements
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/periodic_elements')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
654 |
'train' |
524 |
'validation' |
130 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
persian_idioms
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/persian_idioms')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
66 |
'train' |
50 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
phrase_relatedness
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/phrase_relatedness')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
100 |
'train' |
80 |
'validation' |
20 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
physical_intuition
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/physical_intuition')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
81 |
'train' |
65 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
physics
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/physics')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
229 |
'train' |
184 |
'validation' |
45 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
physics_questions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/physics_questions')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
54 |
'train' |
38 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
play_dialog_same_or_different
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/play_dialog_same_or_different')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
3264 |
'train' |
2612 |
'validation' |
652 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
polish_sequence_labeling
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/polish_sequence_labeling')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
12812 |
'train' |
10250 |
'validation' |
2562 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
presuppositions_as_nli
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/presuppositions_as_nli')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
735 |
'train' |
588 |
'validation' |
147 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
qa_wikidata
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/qa_wikidata')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
20321 |
'train' |
16257 |
'validation' |
4064 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
question_selection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/question_selection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1582 |
'train' |
1266 |
'validation' |
316 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
real_or_fake_text
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/real_or_fake_text')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
15088 |
'train' |
12072 |
'validation' |
3016 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
reasoning_about_colored_objects
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/reasoning_about_colored_objects')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2000 |
'train' |
1600 |
'validation' |
400 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
repeat_copy_logic
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/repeat_copy_logic')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
32 |
'train' |
16 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
rephrase
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/rephrase')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
78 |
'train' |
62 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
riddle_sense
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/riddle_sense')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
49 |
'train' |
33 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ruin_names
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/ruin_names')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
448 |
'train' |
359 |
'validation' |
89 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
salient_translation_error_detection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/salient_translation_error_detection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
998 |
'train' |
799 |
'validation' |
199 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
scientific_press_release
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/scientific_press_release')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
50 |
'train' |
34 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
semantic_parsing_in_context_sparc
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/semantic_parsing_in_context_sparc')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1155 |
'train' |
924 |
'validation' |
231 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
semantic_parsing_spider
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/semantic_parsing_spider')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1034 |
'train' |
828 |
'validation' |
206 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
sentence_ambiguity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/sentence_ambiguity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
60 |
'train' |
44 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
similarities_abstraction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/similarities_abstraction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
76 |
'train' |
60 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simp_turing_concept
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simp_turing_concept')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
6390 |
'train' |
5112 |
'validation' |
1278 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_arithmetic_json
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_arithmetic_json')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
30 |
'train' |
14 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_arithmetic_json_multiple_choice
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_arithmetic_json_multiple_choice')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
8 |
'train' |
0 |
'validation' |
0 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_arithmetic_json_subtasks
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_arithmetic_json_subtasks')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
30 |
'train' |
15 |
'validation' |
15 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_arithmetic_multiple_targets_json
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_arithmetic_multiple_targets_json')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
10 |
'train' |
0 |
'validation' |
0 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_ethical_questions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_ethical_questions')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
115 |
'train' |
92 |
'validation' |
23 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
simple_text_editing
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/simple_text_editing')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
47 |
'train' |
31 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
snarks
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/snarks')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
181 |
'train' |
145 |
'validation' |
36 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
social_iqa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/social_iqa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1935 |
'train' |
1548 |
'validation' |
387 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
social_support
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/social_support')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
897 |
'train' |
718 |
'validation' |
179 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
sports_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/sports_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
986 |
'train' |
789 |
'validation' |
197 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
strange_stories
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/strange_stories')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
174 |
'train' |
140 |
'validation' |
34 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
strategyqa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/strategyqa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2289 |
'train' |
1832 |
'validation' |
457 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
sufficient_information
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/sufficient_information')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
39 |
'train' |
23 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
suicide_risk
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/suicide_risk')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
40 |
'train' |
24 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
swahili_english_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/swahili_english_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
153 |
'train' |
123 |
'validation' |
30 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
swedish_to_german_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/swedish_to_german_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
72 |
'train' |
56 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
symbol_interpretation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/symbol_interpretation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
990 |
'train' |
795 |
'validation' |
195 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
temporal_sequences
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/temporal_sequences')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1000 |
'train' |
800 |
'validation' |
200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
tense
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/tense')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
286 |
'train' |
229 |
'validation' |
57 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
timedial
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/timedial')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2550 |
'train' |
2040 |
'validation' |
510 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
topical_chat
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/topical_chat')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
22295 |
'train' |
17836 |
'validation' |
4459 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
tracking_shuffled_objects
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/tracking_shuffled_objects')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
3750 |
'train' |
3000 |
'validation' |
750 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
understanding_fables
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/understanding_fables')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
189 |
'train' |
152 |
'validation' |
37 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
undo_permutation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/undo_permutation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
300 |
'train' |
240 |
'validation' |
60 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
unit_conversion
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/unit_conversion')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
23936 |
'train' |
19151 |
'validation' |
4785 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
unit_interpretation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/unit_interpretation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
100 |
'train' |
80 |
'validation' |
20 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
unnatural_in_context_learning
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/unnatural_in_context_learning')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
73420 |
'train' |
58736 |
'validation' |
14684 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
vitaminc_fact_verification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/vitaminc_fact_verification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
54668 |
'train' |
43735 |
'validation' |
10933 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
what_is_the_tao
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/what_is_the_tao')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
36 |
'train' |
20 |
'validation' |
16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
which_wiki_edit
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/which_wiki_edit')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
571 |
'train' |
457 |
'validation' |
114 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
winowhy
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/winowhy')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
2862 |
'train' |
2290 |
'validation' |
572 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
word_sorting
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/word_sorting')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
1900 |
'train' |
1520 |
'validation' |
380 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
word_unscrambling
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/word_unscrambling')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' |
8917 |
'train' |
7134 |
'validation' |
1783 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}