xor_tydi_qa

xor-retrieve

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xor_tydi_qa/xor-retrieve')
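
Both configurations documented on this page follow the same name pattern, so the load call can be parameterized. The following is a minimal sketch, assuming `tensorflow_datasets` is installed and that the `huggingface:` namespace shown above resolves in your installation. `tfds_name` and `load_config` are hypothetical helper names, not part of TFDS.

```python
BUILDER = "xor_tydi_qa"
CONFIGS = ("xor-retrieve", "xor-full")  # the two configs documented on this page


def tfds_name(config: str) -> str:
    """Build the TFDS name string for one config (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    return f"huggingface:{BUILDER}/{config}"


def load_config(config: str, split: str = "train"):
    """Load one split of one config.

    TFDS is imported lazily so that tfds_name() stays usable without it.
    """
    import tensorflow_datasets as tfds  # assumed to be installed

    return tfds.load(tfds_name(config), split=split)
```

For example, `load_config("xor-retrieve", split="validation")` would request the validation split of the retrieval config.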
  • Description :
XOR-TyDi QA brings together for the first time information-seeking questions,
    open-retrieval QA, and multilingual QA to create a multilingual open-retrieval
    QA dataset that enables cross-lingual answer retrieval. It consists of questions
    written by information-seeking native speakers in 7 typologically diverse languages
    and answer annotations that are retrieved from multilingual document collections.
    There are three sub-tasks: XOR-Retrieve, XOR-EnglishSpan, and XOR-Full.

XOR-Retrieve is a cross-lingual retrieval task where a question is written in the target
language (e.g., Japanese) and a system is required to retrieve an English document that answers the question.
  • License : No known license
  • Version : 1.1.0
  • Splits :
Split          Examples
'test'         2499
'train'        15250
'validation'   2110
  • Features :
{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "lang": {
        "num_classes": 7,
        "names": [
            "ar",
            "bn",
            "fi",
            "ja",
            "ko",
            "ru",
            "te"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "answers": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
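
The `lang` feature is a ClassLabel, so each example carries an integer index into the `names` list rather than the language code itself. The following is a minimal sketch of that mapping in plain Python, assuming the index order matches the list above. `LANG_NAMES`, `index_to_lang`, and `lang_to_index` are illustrative names, not part of the dataset or of TFDS.

```python
# Language codes in the order given by the "names" list of the feature spec.
LANG_NAMES = ["ar", "bn", "fi", "ja", "ko", "ru", "te"]


def index_to_lang(index: int) -> str:
    """Map a ClassLabel integer to its language code."""
    if not 0 <= index < len(LANG_NAMES):
        raise ValueError(f"label out of range: {index}")
    return LANG_NAMES[index]


def lang_to_index(code: str) -> int:
    """Map a language code to its ClassLabel integer."""
    if code not in LANG_NAMES:
        raise ValueError(f"unknown language code: {code!r}")
    return LANG_NAMES.index(code)


print(index_to_lang(3))     # ja
print(lang_to_index("ko"))  # 4
```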

xor-full

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xor_tydi_qa/xor-full')
  • Description :
XOR-TyDi QA brings together for the first time information-seeking questions,
    open-retrieval QA, and multilingual QA to create a multilingual open-retrieval
    QA dataset that enables cross-lingual answer retrieval. It consists of questions
    written by information-seeking native speakers in 7 typologically diverse languages
    and answer annotations that are retrieved from multilingual document collections.
    There are three sub-tasks: XOR-Retrieve, XOR-EnglishSpan, and XOR-Full.

XOR-Full is a cross-lingual retrieval task where a question is written in the target
language (e.g., Japanese) and a system is required to output a short answer in the target language.
  • License : No known license
  • Version : 1.1.0
  • Splits :
Split          Examples
'test'         8176
'train'        61360
'validation'   3473
  • Features :
{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "lang": {
        "num_classes": 7,
        "names": [
            "ar",
            "bn",
            "fi",
            "ja",
            "ko",
            "ru",
            "te"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "answers": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
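
Because every record has the same three fields, per-language statistics reduce to a single pass over the examples. The following is a sketch, assuming each example is a dict whose `lang` value is an integer label as in the feature spec above. `count_by_language` is a hypothetical helper, and the sample records are invented placeholders, not rows from the dataset.

```python
from collections import Counter
from typing import Any, Iterable, Mapping

# Language codes in the order given by the "names" list of the feature spec.
LANG_NAMES = ["ar", "bn", "fi", "ja", "ko", "ru", "te"]


def count_by_language(examples: Iterable[Mapping[str, Any]]) -> Counter:
    """Count examples per language code, decoding the integer `lang` label."""
    counts: Counter = Counter()
    for example in examples:
        # int() also accepts NumPy integer scalars, should the loader yield them.
        counts[LANG_NAMES[int(example["lang"])]] += 1
    return counts


# Placeholder records shaped like the feature spec (not real dataset rows).
sample = [
    {"question": "placeholder question 1", "lang": 3, "answers": "placeholder answer 1"},
    {"question": "placeholder question 2", "lang": 3, "answers": "placeholder answer 2"},
    {"question": "placeholder question 3", "lang": 4, "answers": "placeholder answer 3"},
]
print(count_by_language(sample))
```

The same function accepts any iterable of examples, so it could be applied to a loaded split as long as each element exposes `lang` by key.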