tf.keras.datasets.reuters.load_data

Loads the Reuters newswire classification dataset.

Compat aliases for migration

See Migration guide for more details.

`tf.compat.v1.keras.datasets.reuters.load_data`, `tf.compat.v2.keras.datasets.reuters.load_data`

tf.keras.datasets.reuters.load_data(
    path='reuters.npz', num_words=None, skip_top=0, maxlen=None, test_split=0.2,
    seed=113, start_char=1, oov_char=2, index_from=3, **kwargs
)

Arguments
`path`	Where to cache the data (relative to `~/.keras/datasets`).
`num_words`	Maximum number of words to include. Words are ranked by how often they occur in the training set, and only the most frequent words are kept.
`skip_top`	Number of most frequently occurring words to skip (these may not be informative).
`maxlen`	Truncate sequences after this length.
`test_split`	Fraction of the dataset to be used as test data.
`seed`	Random seed for sample shuffling.
`start_char`	The start of a sequence is marked with this character. Set to 1 because 0 is usually the padding character.
`oov_char`	Words that were cut out because of the `num_words` or `skip_top` limit are replaced with this character.
`index_from`	Index actual words with this index and higher.
`**kwargs`	Used for backwards compatibility.
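
The way `start_char`, `oov_char`, `index_from`, `num_words`, and `skip_top` interact can be easier to follow with a small example. The following is a minimal pure-Python sketch, not the library's implementation: `encode_ranks` is a hypothetical helper, its input is assumed to be a list of 1-based frequency ranks, and it assumes both limits are compared against the shifted index.

```python
def encode_ranks(word_ranks, num_words=None, skip_top=0,
                 start_char=1, oov_char=2, index_from=3):
    """Map 1-based frequency ranks to indices (illustrative only)."""
    # Shift each rank so that low indices stay free for reserved characters.
    shifted = [rank + index_from for rank in word_ranks]

    def in_vocab(index):
        # Assumption: both limits are compared against the shifted index.
        below_cap = num_words is None or index < num_words
        return index >= skip_top and below_cap

    # Replace filtered-out words with the out-of-vocabulary character.
    body = [index if in_vocab(index) else oov_char for index in shifted]
    # Prepend the start-of-sequence marker.
    return [start_char] + body


encoded = encode_ranks([1, 5, 20], num_words=10)
print(encoded)  # [1, 4, 8, 2]
```

Here the word with rank 20 falls outside the `num_words=10` limit, so it is replaced by `oov_char`, while index 0 is left unused for padding.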

Returns
Tuple of NumPy arrays: `(x_train, y_train), (x_test, y_test)`.

Note that the out-of-vocabulary character is only used for words that were present in the training set but are excluded because they do not make the `num_words` cut. Words that appear in the test set but were not seen in the training set are simply skipped.
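
The roles of `test_split` and `seed` in producing the returned tuple can be sketched as a seeded shuffle followed by a single cut. This is an illustration under stated assumptions rather than the library's code: `shuffle_and_split` is a hypothetical helper, it uses the standard-library `random` module, and the toy samples and labels are made up for the example.

```python
import random


def shuffle_and_split(xs, labels, test_split=0.2, seed=113):
    """Shuffle samples with a fixed seed, then split off the test fraction."""
    # Apply the same permutation to samples and labels so pairs stay aligned.
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    xs = [xs[i] for i in indices]
    labels = [labels[i] for i in indices]
    # The first (1 - test_split) fraction becomes the training set.
    cut = int(len(xs) * (1 - test_split))
    return (xs[:cut], labels[:cut]), (xs[cut:], labels[cut:])


# Toy data: five encoded sequences with their topic labels.
samples = [[1, 4, 8], [1, 5], [1, 6, 7, 9], [1, 4], [1, 10, 11]]
targets = [3, 4, 3, 19, 4]
(x_train, y_train), (x_test, y_test) = shuffle_and_split(samples, targets)
print(len(x_train), len(x_test))
```

Because the seed is fixed, repeated calls with the same arguments return the same split.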