crema_d

Description:

CREMA-D is an audio-visual data set for emotion recognition. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were collected. This release contains only the audio stream from the original audio-visual recording. The samples are split between train, validation, and test so that all samples from a given speaker belong to exactly one split.
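
The speaker-disjoint property described above can be checked mechanically. The following is a minimal sketch in plain Python, not part of TFDS: the helper name `speakers_in_multiple_splits` and the speaker ids are made up for illustration, and the input is assumed to be `(speaker_id, split)` pairs collected from the dataset.

```python
from collections import defaultdict


def speakers_in_multiple_splits(records):
    """Return {speaker_id: sorted split names} for speakers seen in more than one split.

    `records` is an iterable of (speaker_id, split_name) pairs. An empty
    result means the splits are speaker-disjoint.
    """
    seen = defaultdict(set)
    for speaker_id, split in records:
        seen[speaker_id].add(split)
    return {s: sorted(v) for s, v in seen.items() if len(v) > 1}


# Made-up speaker ids, for illustration only.
clean = [("s01", "train"), ("s01", "train"), ("s02", "test"), ("s03", "validation")]
leaky = clean + [("s02", "train")]

print(speakers_in_multiple_splits(clean))  # {}
print(speakers_in_multiple_splits(leaky))  # {'s02': ['test', 'train']}
```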

Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/CheyneyComputerScience/CREMA-D
Source code: tfds.audio.CremaD
Versions:
- 1.0.0 (default): No release notes.
Download size: 579.25 MiB
Dataset size: 1.65 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	1,556
`'train'`	5,144
`'validation'`	738
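
As a quick worked example, the split proportions follow directly from the counts in the table above. The sketch below uses only those three numbers. Note that they sum to 7,438 examples, which is four fewer than the 7,442 clips quoted in the description.

```python
# Example counts copied from the split table above.
split_sizes = {"test": 1556, "train": 5144, "validation": 738}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

print(total)  # 7438
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
# test: 20.9%
# train: 69.2%
# validation: 9.9%
```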

Feature structure:

FeaturesDict({
    'audio': Audio(shape=(None,), dtype=int64),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'speaker_id': string,
})
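
A minimal sketch of reading examples with this feature structure might look like the following. The helper names `summarize_example` and `iter_summaries` are hypothetical, not part of TFDS. `tensorflow_datasets` is imported lazily inside `iter_summaries`, and the first call to `tfds.load` is expected to download and prepare the data (about 579 MiB according to the metadata above). The stand-in example at the bottom uses made-up values so the helper can be tried without downloading anything.

```python
def summarize_example(example):
    """Summarize one example dict with 'audio', 'label' and 'speaker_id' keys."""
    speaker = example["speaker_id"]
    if isinstance(speaker, bytes):  # tfds.as_numpy yields string features as bytes
        speaker = speaker.decode("utf-8")
    return {
        "num_samples": len(example["audio"]),  # audio is a 1-D sequence of samples
        "label": int(example["label"]),        # integer class index in [0, 6)
        "speaker_id": speaker,
    }


def iter_summaries(split="train", limit=3):
    """Yield summaries for the first `limit` examples of a split (hypothetical helper)."""
    import tensorflow_datasets as tfds  # lazy import: only needed when loading real data

    ds = tfds.load("crema_d", split=split)
    for example in tfds.as_numpy(ds.take(limit)):
        yield summarize_example(example)


# Stand-in example with made-up values, shaped like the FeaturesDict above.
fake = {"audio": [0, 12, -7, 3], "label": 2, "speaker_id": b"1001"}
print(summarize_example(fake))
# {'num_samples': 4, 'label': 2, 'speaker_id': '1001'}
```

To map the integer label back to a class name, `tfds.load(..., with_info=True)` returns a `DatasetInfo` object whose `info.features['label'].int2str(index)` performs the lookup.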

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
audio	Audio	(None,)	int64
label	ClassLabel		int64
speaker_id	Tensor		string

Supervised keys (See as_supervised doc): ('audio', 'label')
Figure (tfds.show_examples): Not supported.
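
With the supervised keys listed above, loading with `as_supervised=True` yields `(audio, label)` tuples instead of feature dicts. The sketch below mirrors that mapping on a plain dict so it can be tried without downloading the data. `to_supervised` and `load_supervised` are hypothetical helper names, and the sample values are made up.

```python
SUPERVISED_KEYS = ("audio", "label")  # as listed in the dataset metadata


def to_supervised(example, keys=SUPERVISED_KEYS):
    """Map a feature dict to an (input, target) tuple, as as_supervised=True does."""
    input_key, target_key = keys
    return example[input_key], example[target_key]


def load_supervised(split="train"):
    """Load a split as (audio, label) pairs (hypothetical helper; downloads on first use)."""
    import tensorflow_datasets as tfds  # lazy import: only needed when loading real data

    return tfds.load("crema_d", split=split, as_supervised=True)


# Stand-in example with made-up values.
fake = {"audio": [0, 12, -7, 3], "label": 2, "speaker_id": b"1001"}
audio, label = to_supervised(fake)
print(len(audio), label)  # 4 2
```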

Citation:

@article{cao2014crema,
  title={{CREMA-D}: Crowd-sourced emotional multimodal actors dataset},
  author={Cao, Houwei and Cooper, David G and Keutmann, Michael K and Gur, Ruben C and Nenkova, Ani and Verma, Ragini},
  journal={IEEE transactions on affective computing},
  volume={5},
  number={4},
  pages={377--390},
  year={2014},
  publisher={IEEE}
}