References:
en2bg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2bg')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4061 |
- Features:
{
"translation": {
"languages": [
"en",
"bg"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
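The `sentence_type` feature above is a `ClassLabel` with two names, so examples carry an integer class id rather than a string. The following is a minimal sketch of decoding that id and keeping only full sentences. It assumes the examples have already been converted to plain Python dicts; the helper names and the sample records are hypothetical and only illustrate the shape of the feature spec.

```python
# Class names copied from the "sentence_type" ClassLabel feature above;
# the integer id is the position in this list.
SENTENCE_TYPE_NAMES = ["form_data", "sentence_data"]


def sentence_type_name(label: int) -> str:
    """Map an integer class id to its name (hypothetical helper)."""
    if not 0 <= label < len(SENTENCE_TYPE_NAMES):
        raise ValueError(f"unknown sentence_type id: {label}")
    return SENTENCE_TYPE_NAMES[label]


def keep_sentences(examples):
    """Return only the examples labelled 'sentence_data'."""
    return [
        ex for ex in examples
        if sentence_type_name(ex["sentence_type"]) == "sentence_data"
    ]


# Invented records for illustration only; they are not taken from the dataset.
sample = [
    {"translation": {"en": "Name", "bg": "Име"}, "sentence_type": 0},
    {"translation": {"en": "Thank you.", "bg": "Благодаря."}, "sentence_type": 1},
]

sentences_only = keep_sentences(sample)
```

Filtering on `sentence_data` may be useful when form labels and field names are not wanted in a training set; whether that is appropriate depends on the task.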
en2cs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2cs')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3351 |
- Features:
{
"translation": {
"languages": [
"en",
"cs"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
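The configuration names on this page appear to follow the pattern `en2<target language code>`. As a rough sketch under that assumption, the TFDS name can be built from a language code instead of being typed by hand. The helper names `tfds_name` and `load_pair` are hypothetical; `load_pair` needs `tensorflow_datasets` installed and network access, so it is defined here but not called.

```python
# Dataset prefix as used in the load commands on this page.
DATASET = "huggingface:europa_eac_tm"


def tfds_name(target_lang: str, source_lang: str = "en") -> str:
    """Build the TFDS name for one language pair, e.g. 'en2cs'.

    Assumes the 'en2<code>' naming pattern seen in the configs above.
    """
    return f"{DATASET}/{source_lang}2{target_lang}"


def load_pair(target_lang: str, split: str = "train"):
    """Load one language pair through TFDS (hypothetical wrapper)."""
    # Imported lazily so the name helper works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(target_lang), split=split)


name = tfds_name("cs")
```

Only the `train` split is listed for each configuration on this page, which is why it is the default here.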
en2da
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2da')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3757 |
- Features:
{
"translation": {
"languages": [
"en",
"da"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2de
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2de')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4473 |
- Features:
{
"translation": {
"languages": [
"en",
"de"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2el
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2el')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2818 |
- Features:
{
"translation": {
"languages": [
"en",
"el"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2es
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2es')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4303 |
- Features:
{
"translation": {
"languages": [
"en",
"es"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2et
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2et')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2270 |
- Features:
{
"translation": {
"languages": [
"en",
"et"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2fi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2fi')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 1458 |
- Features:
{
"translation": {
"languages": [
"en",
"fi"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2fr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2fr')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4476 |
- Features:
{
"translation": {
"languages": [
"en",
"fr"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2hu
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2hu')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3455 |
- Features:
{
"translation": {
"languages": [
"en",
"hu"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2is
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2is')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2206 |
- Features:
{
"translation": {
"languages": [
"en",
"is"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2it
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2it')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2170 |
- Features:
{
"translation": {
"languages": [
"en",
"it"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2lt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2lt')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3386 |
- Features:
{
"translation": {
"languages": [
"en",
"lt"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2lv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2lv')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3880 |
- Features:
{
"translation": {
"languages": [
"en",
"lv"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2mt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2mt')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 1722 |
- Features:
{
"translation": {
"languages": [
"en",
"mt"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2nb
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2nb')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 642 |
- Features:
{
"translation": {
"languages": [
"en",
"nb"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2nl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2nl')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 1805 |
- Features:
{
"translation": {
"languages": [
"en",
"nl"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2pl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2pl')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4027 |
- Features:
{
"translation": {
"languages": [
"en",
"pl"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2pt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2pt')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3501 |
- Features:
{
"translation": {
"languages": [
"en",
"pt"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
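Every configuration on this page follows the same naming pattern, `en2` plus the target-language code, and has a single `train` split. The sketch below builds the TFDS name for a given target language and loads that split. `tfds_name` and `load_train` are hypothetical helpers, the language list only covers a few of the codes shown on this page, and the loader assumes `tensorflow_datasets` is installed with access to the Hugging Face community datasets.

```python
# A few of the target-language codes that appear in the configurations on this page.
TARGET_LANGUAGES = ["pl", "pt", "ro", "sk", "sl", "sv", "tr"]


def tfds_name(target_lang: str) -> str:
    """Build the TFDS name for the English-to-<target_lang> configuration."""
    return f"huggingface:europa_eac_tm/en2{target_lang}"


def load_train(target_lang: str):
    """Load the 'train' split of one configuration.

    tensorflow_datasets is imported lazily so that tfds_name() can be used
    without it being installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(target_lang), split="train")
```

For example, `load_train("pt")` passes the same name as the command shown for `en2pt`, with `split="train"` added so that a single dataset is returned instead of a dictionary of splits.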
en2ro
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2ro')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3159 |
- Features:
{
"translation": {
"languages": [
"en",
"ro"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2sk
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2sk')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2972 |
- Features:
{
"translation": {
"languages": [
"en",
"sk"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2sl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2sl')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 4644 |
- Features:
{
"translation": {
"languages": [
"en",
"sl"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2sv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2sv')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 2909 |
- Features:
{
"translation": {
"languages": [
"en",
"sv"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
en2tr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:europa_eac_tm/en2tr')
- Description:
In October 2012, the European Union's (EU) Directorate General for Education and Culture (DG EAC) released a translation memory (TM), i.e. a collection of sentences and their professionally produced translations, in twenty-six languages. This resource bears the name EAC Translation Memory, or EAC-TM for short.
EAC-TM covers up to 26 languages: 22 official languages of the EU (all except Irish) plus Icelandic, Croatian, Norwegian and Turkish. EAC-TM thus contains translations from English into the following 25 languages: Bulgarian, Czech, Danish, Dutch, Estonian, German, Greek, Finnish, French, Croatian, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish and Turkish.
All documents and sentences were originally written in English (source language is English) and then translated into the other languages. The texts were translated by staff of the National Agencies of the Lifelong Learning and Youth in Action programmes. They are typically professionals in the field of education/youth and EU programmes. They are thus not professional translators, but they are normally native speakers of the target language.
- License: Creative Commons Attribution 4.0 International (CC BY 4.0) licence © European Union, 1995-2020
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3198 |
- Features:
{
"translation": {
"languages": [
"en",
"tr"
],
"id": null,
"_type": "Translation"
},
"sentence_type": {
"num_classes": 2,
"names": [
"form_data",
"sentence_data"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
}
}
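Because each example carries both a `translation` pair and a `sentence_type` label, a common first step is to keep only the running-text sentences and drop the form fields. The sketch below works on plain Python dictionaries and assumes that decoded examples mirror the feature spec above: a `translation` dict keyed by language code, and `sentence_type` as an integer index into the `names` list. The function name and the toy records are hypothetical, and the records are placeholders invented for illustration rather than data from EAC-TM.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Index of "sentence_data" in the ClassLabel names list ["form_data", "sentence_data"].
SENTENCE_DATA = 1


def sentence_pairs(
    examples: Iterable[Dict], target_lang: str
) -> Iterator[Tuple[str, str]]:
    """Yield (English, target) text pairs for examples labelled sentence_data."""
    for example in examples:
        if example["sentence_type"] == SENTENCE_DATA:
            pair = example["translation"]
            yield pair["en"], pair[target_lang]


# Placeholder records invented for illustration; not taken from the dataset.
toy_examples = [
    {"translation": {"en": "Name", "tr": "<Turkish form label>"}, "sentence_type": 0},
    {
        "translation": {"en": "An example sentence.", "tr": "<Turkish sentence>"},
        "sentence_type": 1,
    },
]
pairs = list(sentence_pairs(toy_examples, "tr"))
```

Here `pairs` keeps only the second record, since the first is labelled `form_data`.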