paws


labeled_final

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:paws/labeled_final')
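
A slightly fuller sketch of the load command, assuming `tensorflow_datasets` is installed and that the `huggingface:paws` community dataset accepts the standard `split` argument of `tfds.load`. The helper names `paws_config` and `peek` are hypothetical and exist only for illustration.

```python
def paws_config(subset: str) -> str:
    """Build the TFDS name for one of the PAWS configs listed on this page."""
    known = {"labeled_final", "labeled_swap", "unlabeled_final"}
    if subset not in known:
        raise ValueError(f"unknown PAWS subset: {subset!r}")
    return f"huggingface:paws/{subset}"


def peek(subset: str = "labeled_final", split: str = "train", n: int = 3) -> None:
    """Print the first n examples of a split (downloads data on first use)."""
    # Imported lazily so that paws_config() works without TFDS installed.
    import tensorflow_datasets as tfds

    ds = tfds.load(paws_config(subset), split=split)
    for example in tfds.as_numpy(ds.take(n)):
        # Each example is a dict keyed by the feature names of the config.
        print(example["id"], example["label"], example["sentence1"][:60])
```

Calling `peek()` would fetch the `train` split of `labeled_final` and print three rows; `paws_config` alone has no dependencies and can be reused for the other configs.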

Description:

PAWS: Paraphrase Adversaries from Word Scrambling

This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature
the importance of modeling structure, context, and word order information for the problem
of paraphrase identification. The dataset has two subsets, one based on Wikipedia and the
other one based on the Quora Question Pairs (QQP) dataset.

For further details, see the accompanying paper: PAWS: Paraphrase Adversaries from Word Scrambling
(https://arxiv.org/abs/1904.01130)

PAWS-QQP is not available due to the license of QQP. It must be reconstructed by downloading the original
data and then running our scripts to produce the data and attach the labels.

Note: Some labels in the dataset may be missing or wrong; these have been replaced with -1.
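
Since missing or wrong labels are marked with -1, it may be worth dropping those rows before training or evaluation. A minimal sketch on plain Python dicts; the function name `drop_unlabeled` and the sample rows are hypothetical, and whether any -1 labels actually occur in a given config is not stated here.

```python
from typing import Dict, Iterable, List

MISSING_LABEL = -1  # sentinel used for missing or wrong labels


def drop_unlabeled(examples: Iterable[Dict]) -> List[Dict]:
    """Keep only examples whose label is not the -1 sentinel."""
    return [ex for ex in examples if ex["label"] != MISSING_LABEL]


# Hypothetical records that mimic the dataset's fields.
rows = [
    {"id": 1, "sentence1": "A flew to B.", "sentence2": "A flew from B.", "label": 0},
    {"id": 2, "sentence1": "C met D.", "sentence2": "D met C.", "label": 1},
    {"id": 3, "sentence1": "E saw F.", "sentence2": "F saw E.", "label": -1},
]
clean = drop_unlabeled(rows)
print([ex["id"] for ex in clean])  # prints [1, 2]
```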

License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Version: 1.1.0
Splits:

Split	Examples
`'test'`	8000
`'train'`	49401
`'validation'`	8000

Features:

{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
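
The feature description can double as a lightweight schema. The sketch below parses an abridged copy of it and checks a plain Python record against it; `check_example` is a hypothetical helper, not part of TFDS or the Hugging Face `datasets` library.

```python
import json

# Abridged copy of the feature description (JSON null becomes None).
FEATURES = json.loads("""
{
  "id": {"dtype": "int32", "id": null, "_type": "Value"},
  "sentence1": {"dtype": "string", "id": null, "_type": "Value"},
  "sentence2": {"dtype": "string", "id": null, "_type": "Value"},
  "label": {"num_classes": 2, "names": ["0", "1"], "names_file": null,
            "id": null, "_type": "ClassLabel"}
}
""")

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def check_example(example: dict) -> list:
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    for name, spec in FEATURES.items():
        if name not in example:
            problems.append(f"missing field: {name}")
            continue
        value = example[name]
        if spec["_type"] == "ClassLabel":
            # A class label must be an index into the declared classes.
            if not (isinstance(value, int) and 0 <= value < spec["num_classes"]):
                problems.append(f"{name}: {value!r} is not a valid class index")
        elif spec["dtype"] == "int32":
            if not (isinstance(value, int) and INT32_MIN <= value <= INT32_MAX):
                problems.append(f"{name}: {value!r} does not fit int32")
        elif spec["dtype"] == "string":
            if not isinstance(value, str):
                problems.append(f"{name}: expected text, got {type(value).__name__}")
    return problems
```

A well-formed record such as `{"id": 7, "sentence1": "a", "sentence2": "b", "label": 1}` yields an empty list, while a label outside the two declared classes is reported as a problem.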

labeled_swap

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:paws/labeled_swap')

Description:

PAWS: Paraphrase Adversaries from Word Scrambling

This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature
the importance of modeling structure, context, and word order information for the problem
of paraphrase identification. The dataset has two subsets, one based on Wikipedia and the
other one based on the Quora Question Pairs (QQP) dataset.

For further details, see the accompanying paper: PAWS: Paraphrase Adversaries from Word Scrambling
(https://arxiv.org/abs/1904.01130)

PAWS-QQP is not available due to the license of QQP. It must be reconstructed by downloading the original
data and then running our scripts to produce the data and attach the labels.

Note: Some labels in the dataset may be missing or wrong; these have been replaced with -1.

License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Version: 1.1.0
Splits:

Split	Examples
`'train'`	30397

Features:

{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

unlabeled_final

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:paws/unlabeled_final')

Description:

PAWS: Paraphrase Adversaries from Word Scrambling

This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature
the importance of modeling structure, context, and word order information for the problem
of paraphrase identification. The dataset has two subsets, one based on Wikipedia and the
other one based on the Quora Question Pairs (QQP) dataset.

For further details, see the accompanying paper: PAWS: Paraphrase Adversaries from Word Scrambling
(https://arxiv.org/abs/1904.01130)

PAWS-QQP is not available due to the license of QQP. It must be reconstructed by downloading the original
data and then running our scripts to produce the data and attach the labels.

Note: Some labels in the dataset may be missing or wrong; these have been replaced with -1.

License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Version: 1.1.0
Splits:

Split	Examples
`'train'`	645652
`'validation'`	10000
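
The description quotes 108,463 human-labeled and 656k noisily labeled pairs across both subsets, whereas the three configs on this page exclude PAWS-QQP. A quick tally of the split sizes shows how the numbers relate; reading the remainder as the PAWS-QQP portion is an assumption, since the page does not say so explicitly.

```python
# Split sizes copied from the split tables of the three configs.
SPLITS = {
    "labeled_final": {"test": 8000, "train": 49401, "validation": 8000},
    "labeled_swap": {"train": 30397},
    "unlabeled_final": {"train": 645652, "validation": 10000},
}

totals = {name: sum(sizes.values()) for name, sizes in SPLITS.items()}

human_labeled_here = totals["labeled_final"] + totals["labeled_swap"]
noisy_here = totals["unlabeled_final"]

# Figure quoted in the description (Wikipedia and QQP subsets together).
HUMAN_LABELED_QUOTED = 108_463

print(totals)
print("human-labeled pairs in these configs:", human_labeled_here)
print("not covered by these configs:", HUMAN_LABELED_QUOTED - human_labeled_here)
print("noisily labeled pairs:", noisy_here)
```

The two labeled configs add up to 95,798 pairs, leaving 12,665 of the quoted total unaccounted for here, and the `unlabeled_final` splits sum to 655,652, which matches the rounded 656k.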

Features:

{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}