Have a question? Connect with the community at the TensorFlow Forum Visit Forum

voxceleb

  • Description:

An large scale dataset for speaker identification. This data is collected from over 1,251 speakers, with over 150k samples in total. This release contains the audio part of the voxceleb1.1 dataset.

Split Examples
'test' 7,972
'train' 134,000
'validation' 6,670
  • Features:
FeaturesDict({
    'audio': Audio(shape=(None,), dtype=tf.int64),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1252),
})
  • Citation:
@InProceedings{Nagrani17,
    author       = "Nagrani, A. and Chung, J.~S. and Zisserman, A.",
    title        = "VoxCeleb: a large-scale speaker identification dataset",
    booktitle    = "INTERSPEECH",
    year         = "2017",
}