xed_en_fi

References:

en_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_annotated')
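The four configurations documented on this page (en_annotated, en_neutral, fi_annotated, fi_neutral) share the same name pattern, so the load call can be wrapped in a small helper. This is a minimal sketch, not part of the TFDS API: `tfds_name` and `load_train` are hypothetical helper names, and it assumes that `tensorflow_datasets` is installed and that the `huggingface:` prefix resolves in your TFDS version.

```python
# Configuration names as listed on this page.
CONFIGS = ("en_annotated", "en_neutral", "fi_annotated", "fi_neutral")


def tfds_name(config):
    """Build the TFDS name string for one xed_en_fi configuration."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    return f"huggingface:xed_en_fi/{config}"


def load_train(config):
    """Load the 'train' split, the only split listed for each configuration."""
    # Imported lazily so the name helper works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(config), split="train")


print(tfds_name("en_annotated"))
```

Calling `load_train("fi_neutral")` would then download and prepare that configuration on first use.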

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	17528

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
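Because `labels` is a variable-length sequence of class ids here, each sentence can carry several emotions at once. The sketch below shows one way to turn such a sequence into label names and into a fixed-size multi-hot vector. The helper names and the example ids are made up for illustration; only the label order comes from the `names` list above.

```python
# Label names in the order given by "names" in the feature spec.
LABEL_NAMES = [
    "neutral", "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
]


def ids_to_names(label_ids):
    """Map a sequence of class ids to their label names."""
    return [LABEL_NAMES[i] for i in label_ids]


def multi_hot(label_ids):
    """Encode a sequence of class ids as a 0/1 vector of length 9."""
    vec = [0] * len(LABEL_NAMES)
    for i in label_ids:
        vec[i] = 1
    return vec


example_ids = [1, 4]  # hypothetical labels for one sentence
print(ids_to_names(example_ids))  # ['anger', 'fear']
print(multi_hot(example_ids))     # [0, 1, 0, 0, 1, 0, 0, 0, 0]
```

A multi-hot vector of this kind is a common target format for multilabel classifiers, since every example gets the same shape regardless of how many labels it has.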

en_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_neutral')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	9675

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
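Unlike the annotated configurations, where `labels` is a sequence, this feature spec declares `labels` as a single ClassLabel. Code that consumes both kinds of configuration may want to normalize the two shapes first. The following is a minimal sketch under that assumption; `as_label_list` is a hypothetical helper, not something provided by TFDS or Hugging Face.

```python
def as_label_list(labels):
    """Return class ids as a list, whether given one id or a sequence of ids."""
    try:
        # Sequence case (annotated configs): copy ids into a plain list.
        return [int(i) for i in labels]
    except TypeError:
        # Scalar case (neutral configs): wrap the single id in a list.
        return [int(labels)]


print(as_label_list(0))       # [0]
print(as_label_list([1, 4]))  # [1, 4]
```

After this step, the same downstream code can handle examples from either schema.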

fi_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_annotated')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	14449

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_neutral')

Description:

A multilingual fine-grained emotion dataset. The dataset consists of human annotated Finnish (25k) and English sentences (30k). Plutchik’s
core emotions are used to annotate the dataset with the addition of neutral to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.

License: Creative Commons Attribution 4.0 International License (CC-BY)
Version: 1.1.0
Splits:

Split	Examples
`'train'`	10794

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}