common_voice

  • Description:

Mozilla Common Voice Dataset

common_voice/en (default config)

  • Config description: Language Code: en

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=17),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
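
The config names in this listing follow the pattern `common_voice/<language code>`. A minimal sketch of how one config might be loaded with TensorFlow Datasets, assuming the `tensorflow_datasets` package is installed and the data can be prepared locally. The helper names `config_name` and `load_config` are hypothetical, and the `train` split name is an assumption, since this listing does not name the splits:

```python
def config_name(language_code: str = "en") -> str:
    """Build the dataset name for one Common Voice language config."""
    return f"common_voice/{language_code}"


def load_config(language_code: str = "en", split: str = "train"):
    """Load one language config and its metadata.

    The default split name is an assumption; check `info.splits` for the
    splits that actually exist.
    """
    # Imported lazily so config_name() stays usable without TFDS installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(config_name(language_code), split=split, with_info=True)
    return ds, info
```

Calling `load_config("de")` would then return the dataset together with an info object whose `features` attribute should match the `FeaturesDict` shown for that config.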

common_voice/de

  • Config description: Language Code: de

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
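
Every config carries integer `upvotes` and `downvotes` fields, which in Common Voice record community review votes on a clip. One possible use, sketched here on plain Python dictionaries, is to keep only clips whose upvotes exceed their downvotes. The function name `is_well_rated`, the `min_margin` threshold, and the sample records are all hypothetical and not part of the dataset:

```python
def is_well_rated(example: dict, min_margin: int = 1) -> bool:
    """Return True when upvotes exceed downvotes by at least min_margin."""
    return example["upvotes"] - example["downvotes"] >= min_margin


# Made-up records that only mimic the 'sentence', 'upvotes', 'downvotes' fields.
examples = [
    {"sentence": "hypothetical clip A", "upvotes": 3, "downvotes": 0},
    {"sentence": "hypothetical clip B", "upvotes": 1, "downvotes": 2},
    {"sentence": "hypothetical clip C", "upvotes": 2, "downvotes": 2},
]

kept = [ex["sentence"] for ex in examples if is_well_rated(ex)]
```

With the default margin, only clip A passes. The same predicate could be adapted to a `tf.data` pipeline, though that is left out here to keep the sketch free of dependencies.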

common_voice/fr

  • Config description: Language Code: fr

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=19),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/cy

  • Config description: Language Code: cy

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/br

  • Config description: Language Code: br

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/cv

  • Config description: Language Code: cv

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/tr

  • Config description: Language Code: tr

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/tt

  • Config description: Language Code: tt

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/ky

  • Config description: Language Code: ky

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/ga-IE

  • Config description: Language Code: ga-IE

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/kab

  • Config description: Language Code: kab

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/ca

  • Config description: Language Code: ca

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=6),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/zh-TW

  • Config description: Language Code: zh-TW

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/sl

  • Config description: Language Code: sl

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/it

  • Config description: Language Code: it

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/nl

  • Config description: Language Code: nl

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/cnh

  • Config description: Language Code: cnh

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})

common_voice/eo

  • Config description: Language Code: eo

  • Features:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
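
The configs above differ only in the `num_classes` of the `accent` label. The sketch below copies those counts into a plain dictionary and picks out the configs where the count is 0, which presumably carry no usable accent labels, and those with more than one class. The names `ACCENT_NUM_CLASSES`, `configs_without_accent_labels`, and `configs_with_multiple_accents` are hypothetical:

```python
# Accent class counts copied from the feature listings above.
ACCENT_NUM_CLASSES = {
    "en": 17, "de": 10, "fr": 19, "cy": 2, "br": 1, "cv": 0,
    "tr": 1, "tt": 0, "ky": 1, "ga-IE": 3, "kab": 1, "ca": 6,
    "zh-TW": 1, "sl": 1, "it": 1, "nl": 3, "cnh": 1, "eo": 2,
}


def configs_without_accent_labels(counts: dict) -> list:
    """Language codes whose accent label has zero classes."""
    return sorted(code for code, n in counts.items() if n == 0)


def configs_with_multiple_accents(counts: dict) -> list:
    """Language codes whose accent label distinguishes more than one class."""
    return sorted(code for code, n in counts.items() if n > 1)


no_accent = configs_without_accent_labels(ACCENT_NUM_CLASSES)
multi_accent = configs_with_multiple_accents(ACCENT_NUM_CLASSES)
```

A check like this can help decide up front which language configs are worth considering for accent-related experiments.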