multi_eurlex

en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/en')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
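
Because "labels" is a variable-length sequence of class indices (length -1) over 21 classes, a classifier usually needs a fixed-size target. The following is a minimal sketch of that conversion, assuming the label values are integer positions in the "names" list above; the helper name to_multi_hot and the sample indices are hypothetical and are not part of the dataset or of TFDS.

```python
# EUROVOC concept ids copied from the "names" list above.
# Assumption: each stored label is an integer index into this list.
LABEL_NAMES = [
    "100149", "100160", "100148", "100147", "100152", "100143", "100156",
    "100158", "100154", "100153", "100142", "100145", "100150", "100162",
    "100159", "100144", "100151", "100157", "100161", "100146", "100155",
]
NUM_CLASSES = len(LABEL_NAMES)  # 21, matching "num_classes"


def to_multi_hot(label_indices):
    """Turn a variable-length list of class indices into a fixed 0/1 vector."""
    vector = [0] * NUM_CLASSES
    for index in label_indices:
        if not 0 <= index < NUM_CLASSES:
            raise ValueError(f"label index {index} is outside 0..{NUM_CLASSES - 1}")
        vector[index] = 1
    return vector


# Hypothetical label indices for one document (not taken from the dataset).
example_labels = [0, 3, 20]
multi_hot = to_multi_hot(example_labels)
```

The resulting vector has one slot per class, so it can be paired with a per-class sigmoid output rather than a single softmax.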

da

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/da')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
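
Every load command on this page follows the same pattern, with only the language code changing. The sketch below builds those names programmatically. The list holds only the language codes that appear in the tfds.load commands on this page, and config_name is a hypothetical helper, not a TFDS function.

```python
# Language codes taken from the tfds.load commands on this page.
LANGUAGES_SHOWN = [
    "en", "da", "de", "nl", "sv", "bg", "cs", "hr", "pl", "sk", "sl",
    "es", "fr", "it", "pt", "ro", "et", "fi", "hu", "lt", "lv",
]


def config_name(language_code):
    """Build the dataset name string used in the load commands on this page."""
    return f"huggingface:multi_eurlex/{language_code}"


names = [config_name(code) for code in LANGUAGES_SHOWN]
for name in names[:3]:
    print(name)
```

Each generated string can be passed to tfds.load in the same way as the commands shown on this page, assuming tensorflow_datasets is installed and imported as tfds.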

de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/de')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
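
The "names" list is the same for every language, so predictions can be mapped back to EUROVOC concept ids independently of the configuration. A small sketch, assuming labels are stored as integer positions in that list; decode_labels and encode_labels are hypothetical helper names.

```python
# EUROVOC concept ids in the order given by the "names" list above.
LABEL_NAMES = [
    "100149", "100160", "100148", "100147", "100152", "100143", "100156",
    "100158", "100154", "100153", "100142", "100145", "100150", "100162",
    "100159", "100144", "100151", "100157", "100161", "100146", "100155",
]
# Reverse lookup from concept id to class index.
INDEX_OF = {name: index for index, name in enumerate(LABEL_NAMES)}


def decode_labels(label_indices):
    """Map class indices to EUROVOC concept id strings."""
    return [LABEL_NAMES[index] for index in label_indices]


def encode_labels(concept_ids):
    """Map EUROVOC concept id strings back to class indices."""
    return [INDEX_OF[concept_id] for concept_id in concept_ids]


decoded = decode_labels([0, 10])
encoded = encode_labels(["100155"])
```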

nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/nl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sv')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 42490
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/bg')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 15986
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/cs')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23187
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 7944
'validation' 2500
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
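
The split tables show that the amount of training data varies widely by language, while the test split stays at 5000. The sketch below compares a few configurations using only numbers copied from the split tables on this page; train_share is a hypothetical helper introduced for illustration.

```python
# Split sizes copied from the split tables on this page (a subset of languages).
SPLIT_SIZES = {
    "en": {"test": 5000, "train": 55000, "validation": 5000},
    "sv": {"test": 5000, "train": 42490, "validation": 5000},
    "bg": {"test": 5000, "train": 15986, "validation": 5000},
    "cs": {"test": 5000, "train": 23187, "validation": 5000},
    "hr": {"test": 5000, "train": 7944, "validation": 2500},
}


def train_share(sizes):
    """Fraction of a configuration's examples that fall in the train split."""
    return sizes["train"] / sum(sizes.values())


# Language with the fewest training examples among those listed here.
smallest = min(SPLIT_SIZES, key=lambda code: SPLIT_SIZES[code]["train"])

for code, sizes in SPLIT_SIZES.items():
    total = sum(sizes.values())
    print(f"{code}: total={total}, train share={train_share(sizes):.3f}")
```

Among the configurations listed here, Croatian (hr) has the smallest training split and is also the only one with a 2500-example validation split.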

pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23197
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sk')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 22971
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23184
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/es')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52785
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/it')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52370
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/ro')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 15921
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/et')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23126
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fi')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 42497
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hu')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 22664
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

lt

Bu veri kümesini TFDS'ye yüklemek için aşağıdaki komutu kullanın:

ds = tfds.load('huggingface:multi_eurlex/lt')
  • Tanım :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is multi-label classification task (given the text, predict multiple labels).
  • Lisans : Bilinen lisans yok
  • Sürüm : 1.0.0
  • Bölünmeler :
Bölmek Örnekler
'test' 5000
'train' 23188
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/lv')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23208
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/el')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

mt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/mt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 17521
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
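
The split tables above list 5000 test and 5000 validation examples for every configuration, while the training split varies by language. A small sketch of that arithmetic, using only the counts listed above for lt, lv, el, and mt (the helper name `summarize` is illustrative, not a TFDS function):

```python
# Split sizes copied from the per-language split tables above.
SPLITS = {
    "lt": {"train": 23188, "validation": 5000, "test": 5000},
    "lv": {"train": 23208, "validation": 5000, "test": 5000},
    "el": {"train": 55000, "validation": 5000, "test": 5000},
    "mt": {"train": 17521, "validation": 5000, "test": 5000},
}


def summarize(splits):
    """Return the total example count and the share that falls in 'train'."""
    total = sum(splits.values())
    return total, splits["train"] / total


for config, splits in SPLITS.items():
    total, train_share = summarize(splits)
    print(f"{config}: {total} examples, {train_share:.1%} in train")
```

Only el reaches the 65k total quoted in the description; the other three configurations listed here have fewer training documents.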

all_languages

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/all_languages')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "languages": [
            "en",
            "da",
            "de",
            "nl",
            "sv",
            "bg",
            "cs",
            "hr",
            "pl",
            "sk",
            "sl",
            "es",
            "fr",
            "it",
            "pt",
            "ro",
            "et",
            "fi",
            "hu",
            "lt",
            "lv",
            "el",
            "mt"
        ],
        "id": null,
        "_type": "Translation"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
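
In the all_languages configuration, `text` is a `Translation` feature over 23 language codes instead of a single string. A minimal sketch of choosing one language per document, assuming each example's `text` decodes to a dict keyed by language code and that a missing translation appears as an absent key or `None` (both are assumptions, and the helper names `pick_text` and `available_languages` are hypothetical):

```python
# Language codes in the order listed by the Translation feature above.
LANGUAGES = [
    "en", "da", "de", "nl", "sv", "bg", "cs", "hr", "pl", "sk", "sl", "es",
    "fr", "it", "pt", "ro", "et", "fi", "hu", "lt", "lv", "el", "mt",
]


def pick_text(text_by_language, preferred=("en",)):
    """Return (language, text) for the first preferred language that has text."""
    for language in preferred:
        text = text_by_language.get(language)
        if text:  # skips absent keys as well as None or empty strings
            return language, text
    return None, None


def available_languages(text_by_language):
    """List the language codes that carry a non-empty translation."""
    return [lang for lang in LANGUAGES if text_by_language.get(lang)]


# Hypothetical example record (assumption, not real data).
example_text = {"en": "Sample English text", "el": "Sample Greek text", "mt": None}
print(pick_text(example_text, preferred=("mt", "el", "en")))
print(available_languages(example_text))
```

Passing an ordered tuple of preferred languages gives a simple fallback when a document has no translation in the first choice.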