ttc4900

참고자료:

ttc4900

TFDS에 이 데이터세트를 로드하려면 다음 명령어를 사용하세요.

ds = tfds.load('huggingface:ttc4900/ttc4900')
  • 설명 :
The data set is taken from kemik group
http://www.kemik.yildiz.edu.tr/
The data are pre-processed for the text categorization, collocations are found, character set is corrected, and so forth.
We named TTC4900 by mimicking the name convention of TTC 3600 dataset shared by the study http://journals.sagepub.com/doi/abs/10.1177/0165551515620551

If you use the dataset in a paper, please refer https://www.kaggle.com/savasy/ttc4900 as footnote and cite one of the papers as follows:

- A Comparison of Different Approaches to Document Representation in Turkish Language, SDU Journal of Natural and Applied Science, Vol 22, Issue 2, 2018
- A comparative analysis of text classification for Turkish language, Pamukkale University Journal of Engineering Science Volume 25 Issue 5, 2018
- A Knowledge-poor Approach to Turkish Text Categorization with a Comparative Analysis, Proceedings of CICLING 2014, Springer LNCS, Nepal, 2014.
  • 라이센스 : CC0: 공개 도메인
  • 버전 : 1.0.0
  • 분할 :
나뉘다
'train' 4900
  • 특징 :
{
    "category": {
        "num_classes": 7,
        "names": [
            "siyaset",
            "dunya",
            "ekonomi",
            "kultur",
            "saglik",
            "spor",
            "teknoloji"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}