References:
ab
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ab')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 8 |
'other' | 752 |
'test' | 9 |
'train' | 22 |
'validated' | 31 |
'validation' | 0 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
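Every configuration on this page lists the same feature schema, so it can help to read it programmatically. The following is a minimal sketch, not part of TFDS or the Hugging Face `datasets` library: it parses a shortened copy of the schema above with the standard `json` module and maps each field to a short type label. The names `FEATURES_JSON` and `summarize_features` are hypothetical and exist only for this illustration.

```python
import json

# Shortened copy of the feature schema listed above (three of the eleven fields).
FEATURES_JSON = """
{
  "client_id": {"dtype": "string", "id": null, "_type": "Value"},
  "audio": {"sampling_rate": 48000, "mono": true, "decode": true,
            "id": null, "_type": "Audio"},
  "up_votes": {"dtype": "int64", "id": null, "_type": "Value"}
}
"""


def summarize_features(schema_json):
    """Map each field name to a short type label (hypothetical helper)."""
    schema = json.loads(schema_json)
    summary = {}
    for name, spec in schema.items():
        if spec["_type"] == "Value":
            # Scalar fields carry their element type in "dtype".
            summary[name] = spec["dtype"]
        else:
            # Non-scalar entries such as "Audio" have no "dtype" key here.
            summary[name] = spec["_type"]
    return summary


if __name__ == "__main__":
    for field, kind in summarize_features(FEATURES_JSON).items():
        print(f"{field}: {kind}")
```

In this schema, only `audio` is a non-scalar feature; all other fields are plain `string` or `int64` values.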
ar
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ar')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 6333 |
'other' | 18283 |
'test' | 7622 |
'train' | 14227 |
'validated' | 43291 |
'validation' | 7517 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
as
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/as')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 31 |
'other' | 0 |
'test' | 110 |
'train' | 270 |
'validated' | 504 |
'validation' | 124 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
br
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/br')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 623 |
'other' | 10912 |
'test' | 2087 |
'train' | 2780 |
'validated' | 8560 |
'validation' | 1997 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ca
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ca')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 18846 |
'other' | 64446 |
'test' | 15724 |
'train' | 285584 |
'validated' | 416701 |
'validation' | 15724 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cnh
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cnh')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 433 |
'other' | 2934 |
'test' | 752 |
'train' | 807 |
'validated' | 2432 |
'validation' | 756 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cs')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 685 |
'other' | 7475 |
'test' | 4144 |
'train' | 5655 |
'validated' | 30431 |
'validation' | 4118 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1282 |
'other' | 6927 |
'test' | 788 |
'train' | 931 |
'validated' | 3496 |
'validation' | 818 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cy
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cy')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3648 |
'other' | 17919 |
'test' | 4820 |
'train' | 6839 |
'validated' | 72984 |
'validation' | 4776 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
de
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/de')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 32789 |
'other' | 10095 |
'test' | 15588 |
'train' | 246525 |
'validated' | 565186 |
'validation' | 15588 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
dv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/dv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 840 |
'other' | 0 |
'test' | 2202 |
'train' | 2680 |
'validated' | 11866 |
'validation' | 2077 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
el
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/el')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 185 |
'other' | 5659 |
'test' | 1522 |
'train' | 2316 |
'validated' | 5996 |
'validation' | 1401 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/en')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 189562 |
'other' | 169895 |
'test' | 16164 |
'train' | 564337 |
'validated' | 1224864 |
'validation' | 16164 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
eo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/eo')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 4736 |
'other' | 2946 |
'test' | 8969 |
'train' | 19587 |
'validated' | 58094 |
'validation' | 8987 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
es
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/es')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 40640 |
'other' | 144791 |
'test' | 15089 |
'train' | 161813 |
'validated' | 236314 |
'validation' | 15089 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
et
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/et')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3557 |
'other' | 569 |
'test' | 2509 |
'train' | 2966 |
'validated' | 10683 |
'validation' | 2507 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
eu
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/eu')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 5387 |
'other' | 23570 |
'test' | 5172 |
'train' | 7505 |
'validated' | 63009 |
'validation' | 5172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fa')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 11698 |
'other' | 22510 |
'test' | 5213 |
'train' | 7593 |
'validated' | 251659 |
'validation' | 5213 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 59 |
'other' | 149 |
'test' | 428 |
'train' | 460 |
'validated' | 1305 |
'validation' | 415 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fr')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 40351 |
'other' | 3222 |
'test' | 15763 |
'train' | 298982 |
'validated' | 461004 |
'validation' | 15763 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fy-NL
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fy-NL')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1031 |
'other' | 21569 |
'test' | 3020 |
'train' | 3927 |
'validated' | 10495 |
'validation' | 2790 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ga-IE
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ga-IE')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 409 |
'other' | 2130 |
'test' | 506 |
'train' | 541 |
'validated' | 3352 |
'validation' | 497 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 60 |
'other' | 139 |
'test' | 127 |
'train' | 157 |
'validated' | 419 |
'validation' | 135 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hsb
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hsb')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 227 |
'other' | 62 |
'test' | 387 |
'train' | 808 |
'validated' | 1367 |
'validation' | 172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hu
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hu')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- مجوز : https://github.com/common-voice/common-voice/blob/main/LICENSE
- نسخه : 6.1.0
- تقسیم ها :
شکاف | مثال ها |
---|---|
'invalidated' | 169 |
'other' | 295 |
'test' | 1649 |
'train' | 3348 |
'validated' | 6457 |
'validation' | 1434 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ia
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ia')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 192 |
'other' | 1095 |
'test' | 899 |
'train' | 3477 |
'validated' | 5978 |
'validation' | 1601 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
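Every load command in this catalog follows the same pattern, `huggingface:common_voice/<config>`. As a small sketch under that assumption, the dataset names for several configs can be built in a loop; the helper `tfds_name` is hypothetical, and the actual `tfds.load` call is left commented out because it needs `tensorflow_datasets` installed and downloads data.

```python
# Prefix taken from the load commands shown in this catalog.
PREFIX = "huggingface:common_voice"


def tfds_name(config):
    """Return the TFDS dataset name for one Common Voice config code."""
    return f"{PREFIX}/{config}"


# Config codes as they appear in the load commands above.
names = [tfds_name(code) for code in ("hsb", "hu", "ia")]
print(names)

# Loading one of them would then mirror the catalog's own command:
# import tensorflow_datasets as tfds
# ds = tfds.load(names[0])
```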
id
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/id')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 470 |
'other' | 6782 |
'test' | 1844 |
'train' | 2130 |
'validated' | 8696 |
'validation' | 1835 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
it
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/it')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 12189 |
'other' | 14549 |
'test' | 12928 |
'train' | 58015 |
'validated' | 102579 |
'validation' | 12928 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ja
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ja')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 504 |
'other' | 885 |
'test' | 632 |
'train' | 722 |
'validated' | 3072 |
'validation' | 586 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ka
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ka')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 139 |
'other' | 44 |
'test' | 656 |
'train' | 1058 |
'validated' | 2275 |
'validation' | 527 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
kab
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/kab')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 18134 |
'other' | 88021 |
'test' | 14622 |
'train' | 120530 |
'validated' | 573718 |
'validation' | 14622 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ky
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ky')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 926 |
'other' | 7223 |
'test' | 1503 |
'train' | 1955 |
'validated' | 9236 |
'validation' | 1511 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lg')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 290 |
'other' | 3110 |
'test' | 584 |
'train' | 1250 |
'validated' | 2220 |
'validation' | 384 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 102 |
'other' | 1629 |
'test' | 466 |
'train' | 931 |
'validated' | 1644 |
'validation' | 244 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 143 |
'other' | 1560 |
'test' | 1882 |
'train' | 2552 |
'validated' | 6444 |
'validation' | 2002 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mn
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/mn')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 667 |
'other' | 3272 |
'test' | 1862 |
'train' | 2183 |
'validated' | 7487 |
'validation' | 1837 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/mt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 314 |
'other' | 5714 |
'test' | 1617 |
'train' | 2036 |
'validated' | 5747 |
'validation' | 1516 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
nl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/nl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3308 |
'other' | 27 |
'test' | 5708 |
'train' | 9460 |
'validated' | 52488 |
'validation' | 4938 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
or
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/or')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 62 |
'other' | 4302 |
'test' | 98 |
'train' | 388 |
'validated' | 615 |
'validation' | 129 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pa-IN
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pa-IN')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 43 |
'other' | 1411 |
'test' | 116 |
'train' | 211 |
'validated' | 371 |
'validation' | 44 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 4601 |
'other' | 12848 |
'test' | 5153 |
'train' | 7468 |
'validated' | 90791 |
'validation' | 5153 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1740 |
'other' | 8390 |
'test' | 4641 |
'train' | 6514 |
'validated' | 41584 |
'validation' | 4592 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-sursilv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rm-sursilv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 639 |
'other' | 2102 |
'test' | 1194 |
'train' | 1384 |
'validated' | 3783 |
'validation' | 1205 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-vallader
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rm-vallader')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 374 |
'other' | 727 |
'test' | 378 |
'train' | 574 |
'validated' | 1316 |
'validation' | 357 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ro
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ro')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 485 |
'other' | 1945 |
'test' | 1778 |
'train' | 3399 |
'validated' | 6039 |
'validation' | 858 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ru
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ru')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3056 |
'other' | 10247 |
'test' | 8007 |
'train' | 15481 |
'validated' | 74256 |
'validation' | 7963 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rw
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rw')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 206790 |
'other' | 22923 |
'test' | 15724 |
'train' | 515197 |
'validated' | 832929 |
'validation' | 15032 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sah
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sah')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 66 |
'other' | 1275 |
'test' | 757 |
'train' | 1442 |
'validated' | 2606 |
'validation' | 405 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 92 |
'other' | 2502 |
'test' | 881 |
'train' | 2038 |
'validated' | 4669 |
'validation' | 556 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sv-SE
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sv-SE')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 462 |
'other' | 3043 |
'test' | 2027 |
'train' | 2331 |
'validated' | 12552 |
'validation' | 2019 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ta
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ta')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 594 |
'other' | 7428 |
'test' | 1781 |
'train' | 2009 |
'validated' | 12652 |
'validation' | 1779 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
th
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/th')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 467 |
'other' | 2671 |
'test' | 2188 |
'train' | 2917 |
'validated' | 7028 |
'validation' | 1922 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/tr')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1726 |
'other' | 325 |
'test' | 1647 |
'train' | 1831 |
'validated' | 18685 |
'validation' | 1647 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/tt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 287 |
'other' | 1798 |
'test' | 4485 |
'train' | 11211 |
'validated' | 25781 |
'validation' | 2127 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
uk
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/uk')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1255 |
'other' | 8161 |
'test' | 3235 |
'train' | 4035 |
'validated' | 22337 |
'validation' | 3236 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
vi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/vi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 78 |
'other' | 870 |
'test' | 198 |
'train' | 221 |
'validated' | 619 |
'validation' | 200 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
vot
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/vot')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 6 |
'other' | 411 |
'test' | 0 |
'train' | 3 |
'validated' | 3 |
'validation' | 0 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
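The `vot` configuration lists zero examples for 'test' and 'validation', so code that iterates over every split may want to skip the empty ones. The following is a minimal sketch, using the counts from the table above copied into a dictionary. `non_empty_splits` is a hypothetical helper, not a TFDS function.

```python
# Split sizes for the 'vot' configuration, copied from the table above.
VOT_SPLITS = {
    "invalidated": 6,
    "other": 411,
    "test": 0,
    "train": 3,
    "validated": 3,
    "validation": 0,
}


def non_empty_splits(split_sizes):
    """Return the names of splits that contain at least one example."""
    return [name for name, count in split_sizes.items() if count > 0]


usable = non_empty_splits(VOT_SPLITS)
print(usable)
```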
zh-CN
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-CN')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 5305 |
'other' | 8948 |
'test' | 8760 |
'train' | 18541 |
'validated' | 36405 |
'validation' | 8743 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-HK
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-HK')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 2999 |
'other' | 38830 |
'test' | 5172 |
'train' | 7506 |
'validated' | 41835 |
'validation' | 5172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-TW
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-TW')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3584 |
'other' | 22477 |
'test' | 2895 |
'train' | 3507 |
'validated' | 61232 |
'validation' | 2895 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}