xed_en_fi

References:

en_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_annotated')
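The load command above omits its import and returns every split. A minimal sketch of a fuller call follows, assuming `tensorflow_datasets` and the Hugging Face `datasets` package are installed. The helper names `tfds_name` and `load_train` are hypothetical and are not part of TFDS.

```python
# Configuration names as listed on this page.
CONFIGS = ("en_annotated", "en_neutral", "fi_annotated", "fi_neutral")


def tfds_name(config: str) -> str:
    """Build the TFDS name for one xed_en_fi configuration."""
    if config not in CONFIGS:
        raise ValueError(f"unknown configuration: {config!r}")
    return f"huggingface:xed_en_fi/{config}"


def load_train(config: str):
    """Load the 'train' split of one configuration (hypothetical helper)."""
    # Imported lazily so tfds_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(config), split="train")
```

Calling `load_train("en_annotated")` should yield a `tf.data.Dataset` whose elements carry the `sentence` and `labels` fields described under the features of each configuration.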

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	17528

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
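In the feature spec above, `labels` is a variable-length sequence of class ids, so one sentence can carry several emotions. As a rough sketch, the ids can be mapped back to names or to a fixed-size multi-hot vector. The names `LABEL_NAMES`, `decode_labels`, and `to_multi_hot` are illustrative only; the label order is copied from the `names` list.

```python
# Label order copied from the "names" list in the feature spec.
LABEL_NAMES = [
    "neutral", "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
]


def decode_labels(ids):
    """Map a sequence of class ids to their label names."""
    return [LABEL_NAMES[i] for i in ids]


def to_multi_hot(ids):
    """Encode a sequence of class ids as a 9-element 0/1 vector."""
    vec = [0] * len(LABEL_NAMES)
    for i in ids:
        vec[i] = 1
    return vec


print(decode_labels([1, 4]))  # ['anger', 'fear']
print(to_multi_hot([1, 4]))   # [0, 1, 0, 0, 1, 0, 0, 0, 0]
```

A multi-hot vector is a common target format for multilabel classifiers, since it has the same length for every example regardless of how many emotions were annotated.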

en_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_neutral')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	9675

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
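Unlike the annotated configurations, this spec stores `labels` as a single class id rather than a sequence. If the neutral and annotated configurations are to be processed together, one possible approach is to normalize both shapes to a list of ids. The helper `as_label_list` below is a hypothetical name, shown only as a sketch.

```python
def as_label_list(label):
    """Return labels as a list of ints, whether given one id or a sequence."""
    if isinstance(label, (list, tuple)):
        # Annotated configurations: already a sequence of class ids.
        return [int(i) for i in label]
    # Neutral configurations: a single class id.
    return [int(label)]


print(as_label_list(0))       # [0]
print(as_label_list([1, 4]))  # [1, 4]
```

Values read from a `tf.data.Dataset` may arrive as tensors or NumPy arrays rather than Python lists, so a conversion such as `.tolist()` may be needed before calling this helper.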

fi_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_annotated')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	14449

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_neutral')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	10794

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
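The `'train'` split sizes listed for the four configurations can be tallied per language. The snippet below is only an arithmetic sketch over the counts shown on this page; `TRAIN_SIZES` and `total_for_language` are illustrative names.

```python
# 'train' example counts as listed for each configuration on this page.
TRAIN_SIZES = {
    "en_annotated": 17528,
    "en_neutral": 9675,
    "fi_annotated": 14449,
    "fi_neutral": 10794,
}


def total_for_language(lang: str) -> int:
    """Sum the listed counts for configurations starting with '<lang>_'."""
    prefix = lang + "_"
    return sum(n for name, n in TRAIN_SIZES.items() if name.startswith(prefix))


print(total_for_language("en"))   # 27203
print(total_for_language("fi"))   # 25243
print(sum(TRAIN_SIZES.values()))  # 52446
```

These totals are plain sums of the listed split sizes and need not match the rounded figures quoted in the description.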