groove

Description:

The Groove MIDI Dataset (GMD) is composed of 13.6 hours of aligned MIDI and (synthesized) audio of human-performed, tempo-aligned expressive drumming captured on a Roland TD-11 V-Drum electronic drum kit.

Additional Documentation: Explore on Papers With Code
Homepage: https://g.co/magenta/groove-dataset
Source code: tfds.datasets.groove.Builder
Versions:
- 2.0.1 (default): No release notes.
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@inproceedings{groove2019,
    Author = {Jon Gillick and Adam Roberts and Jesse Engel and Douglas Eck and David Bamman},
    Title = {Learning to Groove with Inverse Sequence Transformations},
    Booktitle = {International Conference on Machine Learning (ICML)},
    Year = {2019},
}
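
The configs documented on this page can be selected by passing a `groove/<config>` name to `tfds.load`. The following is a minimal sketch, not an official recipe: `KNOWN_CONFIGS`, `groove_name`, and `load_groove` are hypothetical helpers introduced here for illustration, and the config list contains only the names that appear in this section.

```python
# Config names as listed in this section; the dataset may define others.
KNOWN_CONFIGS = (
    "full-midionly",
    "full-16000hz",
    "2bar-midionly",
    "2bar-16000hz",
    "4bar-midionly",
)


def groove_name(config: str) -> str:
    """Return the 'dataset/config' string for a config (hypothetical helper)."""
    if config not in KNOWN_CONFIGS:
        raise ValueError(f"unknown groove config: {config!r}")
    return f"groove/{config}"


def load_groove(config: str = "full-midionly", split: str = "train"):
    """Load one split of the chosen config.

    TFDS is imported lazily so that groove_name() works without it installed.
    Calling this function downloads and prepares the data on first use.
    """
    import tensorflow_datasets as tfds  # needs the tensorflow-datasets package

    return tfds.load(groove_name(config), split=split)


# Example usage (commented out because it triggers a download):
# ds = load_groove("2bar-midionly", split="validation")
# for example in ds.take(1):
#     print(example["id"], example["bpm"])
```

The MIDI-only configs have a download size of 3.11 MiB, so they are the cheaper choice for a first experiment; the audio configs download 4.76 GiB.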

groove/full-midionly (default config)

Config description: Groove dataset without audio, unsplit.
Download size: 3.11 MiB
Dataset size: 5.22 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	129
`'train'`	897
`'validation'`	124

Feature structure:

FeaturesDict({
    'bpm': int32,
    'drummer': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': string,
    'midi': string,
    'style': FeaturesDict({
        'primary': ClassLabel(shape=(), dtype=int64, num_classes=18),
        'secondary': string,
    }),
    'time_signature': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'type': ClassLabel(shape=(), dtype=int64, num_classes=2),
})
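
The `midi` feature is a `string` tensor. Assuming it holds the raw bytes of a Standard MIDI File (the page does not state this explicitly), the header chunk can be inspected with the standard library alone. The sketch below uses a hand-built header rather than real dataset content, and `parse_midi_header` is a hypothetical helper.

```python
import struct


def parse_midi_header(midi_bytes: bytes) -> dict:
    """Parse the 14-byte header chunk of a Standard MIDI File."""
    if len(midi_bytes) < 14:
        raise ValueError("too short to contain a MIDI header")
    # Big-endian: 4-byte chunk id, 4-byte length, then three 16-bit fields.
    chunk_id, length, fmt, ntracks, division = struct.unpack(
        ">4sIHHH", midi_bytes[:14]
    )
    if chunk_id != b"MThd" or length != 6:
        raise ValueError("not a Standard MIDI File header")
    # When the top bit of `division` is clear, it is ticks per quarter note.
    return {"format": fmt, "tracks": ntracks, "division": division}


# Synthetic header for illustration only (format 0, 1 track, 480 ticks per
# quarter note); it is not taken from the dataset.
demo = b"MThd" + struct.pack(">IHHH", 6, 0, 1, 480)
print(parse_midi_header(demo))  # {'format': 0, 'tracks': 1, 'division': 480}
```

With a loaded example, the bytes would typically come from something like `example["midi"].numpy()`; a dedicated MIDI library is a better fit for reading the note events themselves.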

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
bpm	Tensor	int32
drummer	ClassLabel	int64
id	Tensor	string
midi	Tensor	string
style	FeaturesDict
style/primary	ClassLabel	int64
style/secondary	Tensor	string
time_signature	ClassLabel	int64
type	ClassLabel	int64

groove/full-16000hz

Config description: Groove dataset with audio, unsplit.
Download size: 4.76 GiB
Dataset size: 2.33 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	124
`'train'`	846
`'validation'`	120
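
The audio config lists fewer examples per split than `groove/full-midionly`. The short calculation below only restates the two split tables from this page and takes their difference; it does not explain why the counts differ, which the page leaves unstated.

```python
# Split sizes copied from the tables on this page.
MIDI_ONLY = {"test": 129, "train": 897, "validation": 124}   # full-midionly
WITH_AUDIO = {"test": 124, "train": 846, "validation": 120}  # full-16000hz

# Examples present in the MIDI-only config but absent from the audio config.
missing = {split: MIDI_ONLY[split] - WITH_AUDIO[split] for split in MIDI_ONLY}

print(missing)                # {'test': 5, 'train': 51, 'validation': 4}
print(sum(missing.values()))  # 60
```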

Feature structure:

FeaturesDict({
    'audio': Audio(shape=(None,), dtype=float32),
    'bpm': int32,
    'drummer': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': string,
    'midi': string,
    'style': FeaturesDict({
        'primary': ClassLabel(shape=(), dtype=int64, num_classes=18),
        'secondary': string,
    }),
    'time_signature': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'type': ClassLabel(shape=(), dtype=int64, num_classes=2),
})
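
The `audio` feature is a one-dimensional `float32` array of variable length. Assuming the `16000hz` suffix in the config name is the sample rate in Hz (an inference from the name, not something stated here), the duration of a clip follows from its sample count. `audio_duration_seconds` is a hypothetical helper.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: inferred from the "16000hz" config name


def audio_duration_seconds(num_samples: int,
                           sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return the clip duration in seconds for a mono sample count."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_samples / sample_rate_hz


# A clip of 64,000 samples at 16 kHz lasts four seconds.
print(audio_duration_seconds(64_000))  # 4.0
```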

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
audio	Audio	(None,)	float32
bpm	Tensor		int32
drummer	ClassLabel		int64
id	Tensor		string
midi	Tensor		string
style	FeaturesDict
style/primary	ClassLabel		int64
style/secondary	Tensor		string
time_signature	ClassLabel		int64
type	ClassLabel		int64

groove/2bar-midionly

Config description: Groove dataset without audio, split into 2-bar chunks.
Download size: 3.11 MiB
Dataset size: 19.59 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	2,204
`'train'`	18,163
`'validation'`	2,252

Feature structure:

FeaturesDict({
    'bpm': int32,
    'drummer': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': string,
    'midi': string,
    'style': FeaturesDict({
        'primary': ClassLabel(shape=(), dtype=int64, num_classes=18),
        'secondary': string,
    }),
    'time_signature': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'type': ClassLabel(shape=(), dtype=int64, num_classes=2),
})
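
For the chunked configs, the nominal length of a chunk can be estimated from `bpm` and `time_signature`. The sketch below assumes that `bpm` counts quarter notes per minute and that the time signature is available as a numerator and denominator; neither convention is confirmed on this page, and `chunk_duration_seconds` is a hypothetical helper.

```python
def chunk_duration_seconds(bars: int, bpm: float,
                           beats_per_bar: int = 4, beat_unit: int = 4) -> float:
    """Nominal duration of `bars` bars, assuming bpm = quarter notes/minute."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    # Convert the bar length to quarter notes: a 6/8 bar is 3 quarter notes.
    quarter_notes = bars * beats_per_bar * 4 / beat_unit
    return quarter_notes * 60.0 / bpm


print(chunk_duration_seconds(bars=2, bpm=120))                    # 4.0
print(chunk_duration_seconds(2, 120, beats_per_bar=6, beat_unit=8))  # 3.0
```

Because the performances are expressive, the actual span of notes in a chunk may not match this nominal value exactly.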

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
bpm	Tensor	int32
drummer	ClassLabel	int64
id	Tensor	string
midi	Tensor	string
style	FeaturesDict
style/primary	ClassLabel	int64
style/secondary	Tensor	string
time_signature	ClassLabel	int64
type	ClassLabel	int64

groove/2bar-16000hz

Config description: Groove dataset with audio, split into 2-bar chunks.
Download size: 4.76 GiB
Dataset size: 4.61 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	1,873
`'train'`	14,390
`'validation'`	2,034

Feature structure:

FeaturesDict({
    'audio': Audio(shape=(None,), dtype=float32),
    'bpm': int32,
    'drummer': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': string,
    'midi': string,
    'style': FeaturesDict({
        'primary': ClassLabel(shape=(), dtype=int64, num_classes=18),
        'secondary': string,
    }),
    'time_signature': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'type': ClassLabel(shape=(), dtype=int64, num_classes=2),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
audio	Audio	(None,)	float32
bpm	Tensor		int32
drummer	ClassLabel		int64
id	Tensor		string
midi	Tensor		string
style	FeaturesDict
style/primary	ClassLabel		int64
style/secondary	Tensor		string
time_signature	ClassLabel		int64
type	ClassLabel		int64

groove/4bar-midionly

Config description: Groove dataset without audio, split into 4-bar chunks.
Download size: 3.11 MiB
Dataset size: 27.32 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	2,033
`'train'`	17,261
`'validation'`	2,121

Feature structure:

FeaturesDict({
    'bpm': int32,
    'drummer': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': string,
    'midi': string,
    'style': FeaturesDict({
        'primary': ClassLabel(shape=(), dtype=int64, num_classes=18),
        'secondary': string,
    }),
    'time_signature': ClassLabel(shape=(), dtype=int64, num_classes=5),
    'type': ClassLabel(shape=(), dtype=int64, num_classes=2),
})

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
bpm	Tensor	int32
drummer	ClassLabel	int64
id	Tensor	string
midi	Tensor	string
style	FeaturesDict
style/primary	ClassLabel	int64
style/secondary	Tensor	string
time_signature	ClassLabel	int64
type	ClassLabel	int64
