

  • Description:

A large-scale dataset for speaker identification. The data is collected from 1,251 speakers, with over 150k samples in total. This release contains the audio part of the voxceleb1.1 dataset.

  • Splits:

    Split         Examples
    'test'        7,972
    'train'       134,000
    'validation'  6,670
  • Features:
    'audio': Audio(shape=(None,), dtype=tf.int64),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1252),
    'youtube_id': Text(shape=(), dtype=tf.string),
  • Citation:
    author       = "Nagrani, A. and Chung, J.~S. and Zisserman, A.",
    title        = "VoxCeleb: a large-scale speaker identification dataset",
    booktitle    = "INTERSPEECH",
    year         = "2017",
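
As a rough illustration of how the features listed above might be consumed, the sketch below loads one split and rescales the integer audio samples to floats. It rests on several assumptions not stated on this page: that the dataset is registered in TensorFlow Datasets under the name `voxceleb`, that the source audio has already been made available to TFDS (a manual download may be required), and that the `int64` samples hold 16-bit PCM values. The helper names `pcm_to_float` and `preview` are hypothetical.

```python
# Assumption: samples are 16-bit PCM stored as int64, so 32768 is the full-scale value.
INT16_SCALE = 32768.0


def pcm_to_float(samples):
    """Scale integer PCM samples to floats in [-1.0, 1.0)."""
    return [s / INT16_SCALE for s in samples]


def preview(split="train", n=1):
    """Print youtube_id, label and sample count for the first n examples."""
    # Imported lazily so pcm_to_float can be used without TensorFlow installed.
    import tensorflow_datasets as tfds

    # Assumption: the catalog name is "voxceleb".
    ds = tfds.load("voxceleb", split=split)
    for example in tfds.as_numpy(ds.take(n)):
        audio = pcm_to_float(example["audio"].tolist())
        youtube_id = example["youtube_id"].decode("utf-8")
        print(youtube_id, int(example["label"]), len(audio))
```

Calling `preview("validation", n=3)` would print one line per example. Nothing is downloaded or loaded until `preview` is called.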