- Description:
Mozilla Common Voice Dataset
Homepage: https://voice.mozilla.org/en/datasets
Source code: tfds.audio.CommonVoice
Versions:
1.0.0 (default): No release notes.
Download size: Unknown size
Dataset size: Unknown size
Auto-cached (documentation): Unknown
Splits:
Split | Examples |
---|---|
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe): Missing.
Citation:
common_voice/en (default config)
Config description: Language Code: en
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=17),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
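
As a minimal sketch of how a config such as this one might be loaded, the snippet below builds the `common_voice/<language code>` name used on this page and passes it to `tfds.load`. The helper names `config_name` and `load_common_voice` are hypothetical, and the `train` split is an assumption, since this page does not list the available splits.

```python
def config_name(language_code: str = "en") -> str:
    """Build the config name in the 'common_voice/<code>' form used on this page."""
    return f"common_voice/{language_code}"


def load_common_voice(language_code: str = "en", split: str = "train"):
    """Load one language config (hypothetical helper).

    The split name is an assumption: the Splits table on this page is empty.
    """
    # Imported lazily so config_name() works without TensorFlow Datasets installed.
    import tensorflow_datasets as tfds

    return tfds.load(config_name(language_code), split=split)


print(config_name("de"))  # common_voice/de
```

Calling `load_common_voice("en")` would trigger dataset preparation, which may be slow or require a manual download given the unknown download size stated above.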
common_voice/de
Config description: Language Code: de
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
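
The `upvotes` and `downvotes` fields can be used to keep only clips that reviewers favored. The following is a minimal sketch on plain Python dicts shaped like the feature structure above; the function name `is_well_reviewed`, the thresholds, and the toy records are all hypothetical and not part of the dataset.

```python
def is_well_reviewed(example: dict, min_upvotes: int = 2, max_downvotes: int = 0) -> bool:
    """Return True when a clip meets the (hypothetical) vote thresholds."""
    return example["upvotes"] >= min_upvotes and example["downvotes"] <= max_downvotes


# Toy records using field names from the feature structure; the values are made up.
examples = [
    {"sentence": "Guten Morgen", "upvotes": 3, "downvotes": 0},
    {"sentence": "Hallo Welt", "upvotes": 1, "downvotes": 2},
]

kept = [ex["sentence"] for ex in examples if is_well_reviewed(ex)]
print(kept)  # ['Guten Morgen']
```

The same predicate could be adapted to a dataset pipeline, but the right thresholds depend on how strict the validation needs to be.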
common_voice/fr
Config description: Language Code: fr
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=19),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
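
The `voice` feature is a variable-length sequence of integer samples (`shape=(None,)`, `tf.int64`). Many audio models expect floating-point input, so a scaling step is often useful. The sketch below assumes 16-bit sample values, which this page does not state; `to_float_samples` and its `bit_depth` parameter are hypothetical names.

```python
def to_float_samples(samples, bit_depth: int = 16):
    """Scale integer audio samples to floats in [-1.0, 1.0).

    bit_depth=16 is an assumption; this page only states that `voice` is int64.
    """
    full_scale = float(2 ** (bit_depth - 1))  # 32768.0 for 16-bit audio
    return [s / full_scale for s in samples]


print(to_float_samples([0, 16384, -32768]))  # [0.0, 0.5, -1.0]
```

If the decoded clips use a different sample width, pass the matching `bit_depth` instead.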
common_voice/cy
Config description: Language Code: cy
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/br
Config description: Language Code: br
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/cv
Config description: Language Code: cv
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/tr
Config description: Language Code: tr
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/tt
Config description: Language Code: tt
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/ky
Config description: Language Code: ky
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/ga-IE
Config description: Language Code: ga-IE
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/kab
Config description: Language Code: kab
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/ca
Config description: Language Code: ca
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=6),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/zh-TW
Config description: Language Code: zh-TW
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/sl
Config description: Language Code: sl
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/it
Config description: Language Code: it
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/nl
Config description: Language Code: nl
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/cnh
Config description: Language Code: cnh
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
common_voice/eo
Config description: Language Code: eo
Feature structure:
FeaturesDict({
'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
'age': Text(shape=(), dtype=tf.string),
'client_id': Text(shape=(), dtype=tf.string),
'downvotes': tf.int32,
'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
'sentence': Text(shape=(), dtype=tf.string),
'upvotes': tf.int32,
'voice': Audio(shape=(None,), dtype=tf.int64),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
accent | ClassLabel | | tf.int64 | |
age | Text | | tf.string | |
client_id | Text | | tf.string | |
downvotes | Tensor | | tf.int32 | |
gender | ClassLabel | | tf.int64 | |
sentence | Text | | tf.string | |
upvotes | Tensor | | tf.int32 | |
voice | Audio | (None,) | tf.int64 | |
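
The configs listed above share one feature structure and differ only in the `num_classes` of the `accent` label. As a rough aid for choosing a config, the sketch below collects those counts, copied from the feature structures above, and picks out the configs with no accent classes or with more than one. The names `ACCENT_NUM_CLASSES`, `configs_without_accents`, and `configs_with_multiple_accents` are hypothetical.

```python
# num_classes of the 'accent' ClassLabel per config, copied from the
# feature structures listed above.
ACCENT_NUM_CLASSES = {
    "en": 17, "de": 10, "fr": 19, "cy": 2, "br": 1, "cv": 0,
    "tr": 1, "tt": 0, "ky": 1, "ga-IE": 3, "kab": 1, "ca": 6,
    "zh-TW": 1, "sl": 1, "it": 1, "nl": 3, "cnh": 1, "eo": 2,
}


def configs_without_accents(table: dict) -> list:
    """Language codes whose accent label has zero classes."""
    return sorted(code for code, n in table.items() if n == 0)


def configs_with_multiple_accents(table: dict) -> list:
    """Language codes whose accent label distinguishes more than one class."""
    return sorted(code for code, n in table.items() if n > 1)


print(configs_without_accents(ACCENT_NUM_CLASSES))  # ['cv', 'tt']
print(configs_with_multiple_accents(ACCENT_NUM_CLASSES))
```

A config with zero or one accent class carries no usable accent signal, so accent-based filtering only makes sense for the second group.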