Reuters21578

ModHayes

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:reuters21578/ModHayes')
 • Description:
The Reuters-21578 dataset is one of the most widely used data collections for text
categorization research. It was collected from the Reuters financial newswire service in 1987.
 • License: No known license
 • Version: 1.0.0
 • Splits:
Split Examples
'test' 722
'train' 20856
 • Features:
{
  "text": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "text_type": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "topics": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "lewis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "cgis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "old_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "new_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "places": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "people": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "orgs": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "exchanges": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "date": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
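
The schema above mixes two kinds of fields: scalar strings (`"_type": "Value"`) and variable-length lists of strings (`"_type": "Sequence"` with `"length": -1`). The following is a minimal sketch of telling them apart programmatically. It assumes the schema is available as a JSON string; `FEATURES_JSON` is an abbreviated copy with only three of the thirteen fields, and `field_kinds` is a hypothetical helper, not part of TFDS or Hugging Face Datasets.

```python
import json
from typing import Dict

# Abbreviated copy of the feature schema (three of the thirteen fields).
FEATURES_JSON = """
{
  "text": {"dtype": "string", "id": null, "_type": "Value"},
  "topics": {
    "feature": {"dtype": "string", "id": null, "_type": "Value"},
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "title": {"dtype": "string", "id": null, "_type": "Value"}
}
"""


def field_kinds(features_json: str) -> Dict[str, str]:
    """Map each field name to 'string' or 'list[string]' (hypothetical helper)."""
    schema = json.loads(features_json)
    kinds = {}
    for name, spec in schema.items():
        if spec["_type"] == "Sequence":
            # A Sequence wraps an inner feature; length -1 means variable length.
            kinds[name] = "list[" + spec["feature"]["dtype"] + "]"
        else:
            kinds[name] = spec["dtype"]
    return kinds


kinds = field_kinds(FEATURES_JSON)
print(kinds)
```

Applied to the full schema, the same function would report `topics`, `places`, `people`, `orgs`, and `exchanges` as lists and the remaining eight fields as plain strings.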

ModLewis

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:reuters21578/ModLewis')
 • Description:
The Reuters-21578 dataset is one of the most widely used data collections for text
categorization research. It was collected from the Reuters financial newswire service in 1987.
 • License: No known license
 • Version: 1.0.0
 • Splits:
Split Examples
'test' 6188
'train' 13625
'unused' 722
 • Features:
{
  "text": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "text_type": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "topics": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "lewis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "cgis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "old_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "new_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "places": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "people": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "orgs": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "exchanges": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "date": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
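
Because `topics` is a variable-length sequence of strings, a document can carry zero, one, or several labels, so categorization on this data is usually framed as a multi-label problem. The sketch below shows one common way to turn such label lists into fixed-size multi-hot vectors. It assumes records have already been converted to plain Python dicts; the sample records are invented for illustration, and `build_vocab` and `multi_hot` are hypothetical helpers.

```python
from typing import Dict, Iterable, List


def build_vocab(records: Iterable[dict]) -> Dict[str, int]:
    """Assign a stable index to every topic seen in the records."""
    labels = sorted({topic for record in records for topic in record["topics"]})
    return {label: index for index, label in enumerate(labels)}


def multi_hot(topics: List[str], vocab: Dict[str, int]) -> List[int]:
    """Encode a list of topics as a 0/1 vector over the vocabulary."""
    vector = [0] * len(vocab)
    for topic in topics:
        if topic in vocab:  # ignore topics that were not in the vocabulary
            vector[vocab[topic]] = 1
    return vector


# Invented sample records; only the fields used here are shown.
records = [
    {"title": "A", "topics": ["earn"]},
    {"title": "B", "topics": ["grain", "wheat"]},
    {"title": "C", "topics": []},
]

vocab = build_vocab(records)
vectors = [multi_hot(record["topics"], vocab) for record in records]
print(vocab)
print(vectors)
```

In practice the vocabulary would be built from the 'train' split only and reused for 'test', so that both splits share the same vector layout.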

ModApte

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:reuters21578/ModApte')
 • Description:
The Reuters-21578 dataset is one of the most widely used data collections for text
categorization research. It was collected from the Reuters financial newswire service in 1987.
 • License: No known license
 • Version: 1.0.0
 • Splits:
Split Examples
'test' 3299
'train' 9603
'unused' 722
 • Features:
{
  "text": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "text_type": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "topics": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "lewis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "cgis_split": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "old_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "new_id": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "places": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "people": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "orgs": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "exchanges": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "date": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
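
The three configurations share one schema and differ only in how documents are assigned to splits. As a rough comparison, the sketch below totals the split sizes listed on this page and reports each split's share. The counts are copied from the split tables of the three configurations; `summarize` is a hypothetical helper.

```python
from typing import Dict, Tuple

# Split sizes as listed for each configuration on this page.
SPLITS = {
    "ModHayes": {"test": 722, "train": 20856},
    "ModLewis": {"test": 6188, "train": 13625, "unused": 722},
    "ModApte": {"test": 3299, "train": 9603, "unused": 722},
}


def summarize(splits: Dict[str, int]) -> Tuple[int, Dict[str, float]]:
    """Return the total example count and each split's share in percent."""
    total = sum(splits.values())
    shares = {name: round(100 * count / total, 1) for name, count in splits.items()}
    return total, shares


totals = {}
for config, splits in SPLITS.items():
    total, shares = summarize(splits)
    totals[config] = total
    print(config, total, shares)
```

With these counts, ModHayes covers 21578 examples in total, ModLewis 20535, and ModApte 13624, so the choice of configuration changes both the train/test ratio and the number of documents available.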