ar_sarcasm

References:

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds

ds = tfds.load('huggingface:ar_sarcasm')
  • Description:
ArSarcasm is a new Arabic sarcasm detection dataset. The dataset was created using previously available Arabic sentiment analysis datasets (SemEval 2017 and ASTD) and adds sarcasm and dialect labels to them. The dataset contains 10,547 tweets, 1,682 (16%) of which are sarcastic.
  • License: MIT
  • Version: 1.0.0
  • Splits:
Split      Examples
'test'     2110
'train'    8437
  • Features:
{
    "dialect": {
        "num_classes": 5,
        "names": [
            "egypt",
            "gulf",
            "levant",
            "magreb",
            "msa"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "sarcasm": {
        "num_classes": 2,
        "names": [
            "non-sarcastic",
            "sarcastic"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "sentiment": {
        "num_classes": 3,
        "names": [
            "negative",
            "neutral",
            "positive"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "original_sentiment": {
        "num_classes": 3,
        "names": [
            "negative",
            "neutral",
            "positive"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "tweet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "source": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
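
The four ClassLabel features above are stored as integer ids, where each id is the position of the class in its "names" list. As a minimal sketch of turning those ids back into readable labels, the snippet below copies the name lists from the feature spec into a plain dictionary. The helper decode_example and the sample record are hypothetical and only for illustration; the sample is not taken from the dataset, and records loaded through TFDS may carry the ids as NumPy or tensor scalars rather than Python ints.

```python
# Class names copied from the feature spec above.
# The position of a name in its list is the integer id stored in the dataset.
LABEL_NAMES = {
    "dialect": ["egypt", "gulf", "levant", "magreb", "msa"],
    "sarcasm": ["non-sarcastic", "sarcastic"],
    "sentiment": ["negative", "neutral", "positive"],
    "original_sentiment": ["negative", "neutral", "positive"],
}


def decode_example(example):
    """Return a copy of `example` with ClassLabel ids replaced by class names.

    Hypothetical helper: fields that are not ClassLabel features
    ("tweet", "source") are passed through unchanged.
    """
    decoded = dict(example)
    for field, names in LABEL_NAMES.items():
        if field in decoded:
            # int() also accepts NumPy integer scalars.
            decoded[field] = names[int(decoded[field])]
    return decoded


# Hypothetical record with placeholder text, not an actual dataset row.
sample = {
    "dialect": 4,
    "sarcasm": 1,
    "sentiment": 0,
    "original_sentiment": 1,
    "tweet": "<tweet text>",
    "source": "<source name>",
}

print(decode_example(sample))
```

Keeping the mapping in a plain dictionary avoids depending on any particular loader; if the dataset is loaded through a library that exposes the feature metadata, reading the names from that metadata instead would keep them in sync with the dataset version.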