trivia_qa

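Description:

TriviaQA is a reading comprehension dataset containing over 650,000 question-answer-evidence triples. It includes 95,000 question-answer pairs authored by trivia enthusiasts, together with independently gathered evidence documents (six per question on average) that provide high-quality distant supervision for answering the questions.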

Additional documentation: Explore on Papers With Code
Homepage: http://nlp.cs.washington.edu/triviaqa/
Source code: tfds.datasets.trivia_qa.Builder
Versions:
- 1.1.0 (default): No release notes.
Feature structure:

FeaturesDict({
    'answer': FeaturesDict({
        'aliases': Sequence(Text(shape=(), dtype=string)),
        'matched_wiki_entity_name': Text(shape=(), dtype=string),
        'normalized_aliases': Sequence(Text(shape=(), dtype=string)),
        'normalized_matched_wiki_entity_name': Text(shape=(), dtype=string),
        'normalized_value': Text(shape=(), dtype=string),
        'type': Text(shape=(), dtype=string),
        'value': Text(shape=(), dtype=string),
    }),
    'entity_pages': Sequence({
        'doc_source': Text(shape=(), dtype=string),
        'filename': Text(shape=(), dtype=string),
        'title': Text(shape=(), dtype=string),
        'wiki_context': Text(shape=(), dtype=string),
    }),
    'question': Text(shape=(), dtype=string),
    'question_id': Text(shape=(), dtype=string),
    'question_source': Text(shape=(), dtype=string),
    'search_results': Sequence({
        'description': Text(shape=(), dtype=string),
        'filename': Text(shape=(), dtype=string),
        'rank': int32,
        'search_context': Text(shape=(), dtype=string),
        'title': Text(shape=(), dtype=string),
        'url': Text(shape=(), dtype=string),
    }),
})
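
To make the nesting concrete, the sketch below builds a plain Python dict shaped like one decoded example and checks a prediction against `answer/normalized_aliases`. The record values and the `normalize` and `is_correct` helpers are hypothetical illustrations; they are not part of TFDS or of the official TriviaQA evaluation script. The one TFDS-specific detail it relies on is that a `Sequence` of dicts is exposed as a dict of parallel lists.

```python
import string

# Hypothetical record mirroring the feature structure above (all values are made up).
example = {
    "question": "Which planet is known as the Red Planet?",
    "question_id": "demo_0",
    "question_source": "http://example.com/",
    "answer": {
        "aliases": ["Mars", "The Red Planet"],
        "matched_wiki_entity_name": "Mars",
        "normalized_aliases": ["mars", "red planet"],
        "normalized_matched_wiki_entity_name": "mars",
        "normalized_value": "mars",
        "type": "WikipediaEntity",
        "value": "Mars",
    },
    # Sequence({...}) features: one list per field, aligned by index.
    "entity_pages": {
        "doc_source": ["demo"],
        "filename": ["Mars.txt"],
        "title": ["Mars"],
        "wiki_context": ["Mars is the fourth planet from the Sun."],
    },
    "search_results": {
        "description": ["Facts about Mars"],
        "filename": ["0.txt"],
        "rank": [0],
        "search_context": ["Mars is often called the Red Planet."],
        "title": ["Mars facts"],
        "url": ["http://example.com/mars"],
    },
}


def normalize(text):
    """Assumed normalization: lowercase, drop punctuation and English articles."""
    text = text.lower()
    text = "".join(" " if ch in string.punctuation else ch for ch in text)
    words = [w for w in text.split() if w not in {"a", "an", "the"}]
    return " ".join(words)


def is_correct(prediction, record):
    """True if the normalized prediction matches any normalized alias."""
    return normalize(prediction) in set(record["answer"]["normalized_aliases"])


print(is_correct("The Red Planet!", example))
print(is_correct("Venus", example))
```

Because `entity_pages` and `search_results` are dicts of lists, the i-th evidence document is recovered by indexing every field with the same `i`.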

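Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
answer	FeaturesDict
answer/aliases	Sequence(Text)	(None,)	string
answer/matched_wiki_entity_name	Text		string
answer/normalized_aliases	Sequence(Text)	(None,)	string
answer/normalized_matched_wiki_entity_name	Text		string
answer/normalized_value	Text		string
answer/type	Text		string
answer/value	Text		string
entity_pages	Sequence
entity_pages/doc_source	Text		string
entity_pages/filename	Text		string
entity_pages/title	Text		string
entity_pages/wiki_context	Text		string
question	Text		string
question_id	Text		string
question_source	Text		string
search_results	Sequence
search_results/description	Text		string
search_results/filename	Text		string
search_results/rank	Tensor		int32
search_results/search_context	Text		string
search_results/title	Text		string
search_results/url	Text		string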

Supervised keys (see the as_supervised documentation): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{2017arXivtriviaqa,
       author = { {Joshi}, Mandar and {Choi}, Eunsol and {Weld},
                 Daniel and {Zettlemoyer}, Luke},
        title = "{triviaqa: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension}",
      journal = {arXiv e-prints},
         year = 2017,
          eid = {arXiv:1705.03551},
        pages = {arXiv:1705.03551},
archivePrefix = {arXiv},
       eprint = {1705.03551},
}

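trivia_qa/rc (default config)

Config description: Question-answer pairs where all documents for a given question contain the answer string(s). Includes context from Wikipedia and search results.
Download size: 2.48 GiB
Dataset size: 14.99 GiB
Auto-cached (documentation): No
Splits:

Split	Examples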
`'test'`	17,210
`'train'`	138,384
`'validation'`	18,669

Examples (tfds.as_dataframe):

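trivia_qa/rc.nocontext

Config description: Question-answer pairs where all documents for a given question contain the answer string(s).
Download size: 2.48 GiB
Dataset size: 196.84 MiB
Auto-cached (documentation): Yes (test, validation); only when shuffle_files=False (train)
Splits:

Split	Examples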
`'test'`	17,210
`'train'`	138,384
`'validation'`	18,669
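
The auto-cache entries on this page line up with the dataset sizes. The sketch below assumes the rule described in the TFDS performance guide (total dataset size under 250 MiB, and either `shuffle_files` disabled or a single shard read). The constant `AUTO_CACHE_LIMIT_MIB` and the helper `may_auto_cache` are hypothetical names for illustration, not TFDS APIs.

```python
AUTO_CACHE_LIMIT_MIB = 250  # assumed threshold, per the TFDS performance guide

# Dataset sizes as listed on this page, converted to MiB.
dataset_size_mib = {
    "trivia_qa/rc": 14.99 * 1024,
    "trivia_qa/rc.nocontext": 196.84,
    "trivia_qa/unfiltered": 27.27 * 1024,
    "trivia_qa/unfiltered.nocontext": 119.78,
}


def may_auto_cache(size_mib, shuffle_files=False, num_shards=1):
    """Approximates the assumed auto-cache rule; not the actual TFDS implementation."""
    small_enough = size_mib < AUTO_CACHE_LIMIT_MIB
    stable_read_order = (not shuffle_files) or num_shards == 1
    return small_enough and stable_read_order


for name, size in dataset_size_mib.items():
    print(name, may_auto_cache(size))
```

Under this assumption only the two `.nocontext` configs qualify, which matches the "Auto-cached" fields listed for each config.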

Examples (tfds.as_dataframe):

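trivia_qa/unfiltered

Config description: 110k question-answer pairs for open-domain QA where not all documents for a given question contain the answer string(s). This makes the unfiltered dataset more appropriate for IR-style QA. Includes context from Wikipedia and search results.
Download size: 3.07 GiB
Dataset size: 27.27 GiB
Auto-cached (documentation): No
Splits:

Split	Examples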
`'test'`	10,832
`'train'`	87,622
`'validation'`	11,313

Examples (tfds.as_dataframe):

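trivia_qa/unfiltered.nocontext

Config description: 110k question-answer pairs for open-domain QA where not all documents for a given question contain the answer string(s). This makes the unfiltered dataset more appropriate for IR-style QA.
Download size: 603.25 MiB
Dataset size: 119.78 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples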
`'test'`	10,832
`'train'`	87,622
`'validation'`	11,313
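
As a quick sanity check, the split sizes in the tables can be summed per config family. The sketch below copies only the counts listed on this page; the `.nocontext` variants share the counts of their parent configs.

```python
# Example counts copied from the split tables on this page.
splits = {
    "rc": {"test": 17_210, "train": 138_384, "validation": 18_669},
    "unfiltered": {"test": 10_832, "train": 87_622, "validation": 11_313},
}

totals = {name: sum(counts.values()) for name, counts in splits.items()}
print(totals)  # {'rc': 174263, 'unfiltered': 109767}
```

The unfiltered total of 109,767 is consistent with the "110k question-answer pairs" stated in the config description.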

Examples (tfds.as_dataframe):
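
A minimal sketch of how one of the four configs might be loaded and previewed. `config_name` and `preview` are hypothetical helpers; `tfds.load` and `tfds.as_dataframe` are the TFDS calls they wrap. The import is deferred so that the name helper works without TensorFlow Datasets installed, and calling `preview` triggers the download listed for the chosen config.

```python
def config_name(filtered=True, with_context=True):
    """Builds one of the four config names listed on this page."""
    base = "rc" if filtered else "unfiltered"
    suffix = "" if with_context else ".nocontext"
    return f"trivia_qa/{base}{suffix}"


def preview(name, split="validation", n=4):
    """Prepares the config if needed and returns the first n rows as a DataFrame."""
    # Deferred import: requires the tensorflow-datasets package.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(name, split=split, with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


print(config_name(filtered=False, with_context=False))
```

For a first look, `preview(config_name(with_context=False))` is the lighter choice, since the `.nocontext` configs omit the Wikipedia and search-result text and are far smaller on disk.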
