paws-x

References:

en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/en')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
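
The description notes that missing or wrong labels were replaced with -1, which falls outside the two classes declared for `label`. The following is a minimal sketch of how such rows might be dropped before training. It works on plain Python dictionaries shaped like the feature schema above rather than on a loaded TFDS dataset, and the helper name `drop_unlabeled` and the sample records are hypothetical, invented only for illustration.

```python
from typing import Dict, Iterable, List, Union

# One record shaped like the feature schema: id, sentence1, sentence2, label.
Example = Dict[str, Union[int, str]]


def drop_unlabeled(examples: Iterable[Example]) -> List[Example]:
    """Keep only examples whose label is one of the two declared classes (0 or 1)."""
    return [ex for ex in examples if ex["label"] in (0, 1)]


# Hypothetical records; these sentences are not taken from the dataset.
sample = [
    {"id": 1, "sentence1": "A met B in Paris.", "sentence2": "B met A in Paris.", "label": 1},
    {"id": 2, "sentence1": "A flew from X to Y.", "sentence2": "A flew from Y to X.", "label": 0},
    {"id": 3, "sentence1": "Some sentence.", "sentence2": "Another sentence.", "label": -1},
]

clean = drop_unlabeled(sample)
print([ex["id"] for ex in clean])
```

The same predicate could be applied to records iterated from a loaded split, assuming each record exposes `label` as an integer.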

de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/de')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/es')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/fr')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

ja

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/ja')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

ko

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/ko')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:paws-x/zh')
  • Description:
PAWS-X, a multilingual version of PAWS (Paraphrase Adversaries from Word Scrambling) for six languages.

This dataset contains 23,659 human translated PAWS evaluation pairs and 296,406 machine
translated training pairs in six typologically distinct languages: French, Spanish, German,
Chinese, Japanese, and Korean. English language is available by default. All translated
pairs are sourced from examples in PAWS-Wiki.

For further details, see the accompanying paper: PAWS-X: A Cross-lingual Adversarial Dataset
for Paraphrase Identification (https://arxiv.org/abs/1908.11828)

Note: There might be some missing or wrong labels in the dataset and we have replaced them with -1.
  • License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
  • Version: 1.1.0
  • Splits:
Split Examples
'test' 2000
'train' 49401
'validation' 2000
  • Features:
{
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "0",
            "1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
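
All seven configurations follow the same naming pattern in their load commands, so the names can be generated rather than typed out. The sketch below builds them from the language codes used on this page. The helper `tfds_name` and the `LANGUAGES` list are hypothetical conveniences, not part of TFDS, and the snippet deliberately stops short of calling `tfds.load` so that it runs without a network connection.

```python
# Language codes of the configurations listed on this page.
LANGUAGES = ["en", "de", "es", "fr", "ja", "ko", "zh"]


def tfds_name(lang: str) -> str:
    """Return the TFDS dataset name for one PAWS-X language configuration."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown PAWS-X configuration: {lang!r}")
    return f"huggingface:paws-x/{lang}"


names = [tfds_name(lang) for lang in LANGUAGES]
for name in names:
    print(name)
```

With `tensorflow_datasets` installed, each generated name could then be passed to `tfds.load(name)` in the same way as the individual commands above.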