Applies a transformation function to this dataset.
apply enables chaining of custom Dataset transformations, which are
represented as functions that take one Dataset argument and return a
transformed Dataset.
For example:
dataset = (dataset.map(lambda x: x ** 2)
           .apply(group_by_window(key_func, reduce_func, window_size))
           .map(lambda x: x ** 3))
Args
transformation_func
A function that takes one Dataset argument and
returns a Dataset.
Returns
Dataset
The Dataset returned by applying transformation_func to this
dataset.
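The chaining pattern can be illustrated without TensorFlow. The sketch below is a plain-Python model, not the library's implementation: ToyDataset and keep_even are hypothetical names, and a list stands in for the dataset's elements.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset; holds its elements in a list."""

    def __init__(self, elements):
        self.elements = list(elements)

    def map(self, fn):
        return ToyDataset(fn(x) for x in self.elements)

    def apply(self, transformation_func):
        # Hand the whole dataset to the function and return the dataset it builds.
        result = transformation_func(self)
        if not isinstance(result, ToyDataset):
            raise TypeError("transformation_func must return a dataset")
        return result


def keep_even(dataset):
    """A custom transformation: takes one dataset, returns a transformed dataset."""
    return ToyDataset(x for x in dataset.elements if x % 2 == 0)


chained = (ToyDataset(range(1, 6))
           .map(lambda x: x ** 2)   # 1, 4, 9, 16, 25
           .apply(keep_even)        # 4, 16
           .map(lambda x: x + 1))   # 5, 17
```

Because apply returns a dataset, a custom transformation can sit anywhere in a chain of built-in ones.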
Combines consecutive elements of this dataset into batches.
The components of the resulting element will have an additional outer
dimension, which will be batch_size (or N % batch_size for the last
element if batch_size does not divide the number of input elements N
evenly and drop_remainder is False). If your program depends on the
batches having the same outer dimension, you should set the drop_remainder
argument to True to prevent the smaller batch from being produced.
Args
batch_size
A tf.int64 scalar tf.Tensor, representing the number of
consecutive elements of this dataset to combine in a single batch.
drop_remainder
(Optional.) A tf.bool scalar tf.Tensor, representing
whether the last batch should be dropped in the case it has fewer than
batch_size elements; the default behavior is not to drop the smaller
batch.
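The size rule for the last batch can be checked with a small plain-Python model. This is a sketch of the rule described above, not TensorFlow's implementation; the helper name batch and the use of lists in place of tensors are assumptions made for illustration.

```python
def batch(elements, batch_size, drop_remainder=False):
    """Group consecutive elements into lists of length batch_size."""
    elements = list(elements)
    batches = [elements[i:i + batch_size]
               for i in range(0, len(elements), batch_size)]
    # The last batch holds N % batch_size elements when the division is uneven.
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


kept = batch(range(8), 3)                          # last batch has 8 % 3 = 2 elements
dropped = batch(range(8), 3, drop_remainder=True)  # the short batch is discarded
```

With drop_remainder set, every batch in the result has the same outer dimension.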
Caches the elements in this dataset.
Args
filename
A tf.string scalar tf.Tensor, representing the name of a
directory on the filesystem to use for caching elements in this Dataset.
If a filename is not provided, the dataset will be cached in memory.
Creates a Dataset by concatenating the given dataset with this dataset.
a = Dataset.range(1, 4) # ==> [ 1, 2, 3 ]
b = Dataset.range(4, 8) # ==> [ 4, 5, 6, 7 ]
# The input dataset and dataset to be concatenated should have the same
# nested structures and output types.
# c = Dataset.range(8, 14).batch(2) # ==> [ [8, 9], [10, 11], [12, 13] ]
# d = Dataset.from_tensor_slices([14.0, 15.0, 16.0])
# a.concatenate(c) and a.concatenate(d) would result in error.
a.concatenate(b) # ==> [ 1, 2, 3, 4, 5, 6, 7 ]
# NOTE: The following examples use `{ ... }` to represent the
# contents of a dataset.
a = { 1, 2, 3 }
b = { (7, 8), (9, 10) }
# Each element of the result is an (index, element) pair, with indices
# starting at `start`.
a.enumerate(start=5) == { (5, 1), (6, 2), (7, 3) }
b.enumerate() == { (0, (7, 8)), (1, (9, 10)) }
Args
start
A tf.int64 scalar tf.Tensor, representing the start value for
enumeration.
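Python's built-in enumerate pairs elements with indices in the same way, so it can serve as a quick analogy. The sketch below uses plain lists in place of datasets; it models the pairing only and does not call TensorFlow.

```python
a = [1, 2, 3]
b = [(7, 8), (9, 10)]

# Indices start at `start` and are paired with each element in order.
a_enumerated = list(enumerate(a, start=5))
# With no start value, counting begins at 0; nested elements stay intact.
b_enumerated = list(enumerate(b))
```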
Filters this dataset according to predicate. (deprecated)
Args
predicate
A function mapping a nested structure of tensors (having shapes
and types defined by self.output_shapes and self.output_types) to a
scalar tf.bool tensor.
Returns
Dataset
The Dataset containing the elements of this dataset for which
predicate is True.
Maps map_func across this dataset and flattens the result.
Use flat_map if you want to make sure that the order of your dataset
stays the same. For example, to flatten a dataset of batches into a
dataset of their elements:
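The flattening case can be modeled in plain Python. This is a minimal sketch that assumes lists stand in for datasets; the helper name flat_map mirrors the method for readability but is not the TensorFlow implementation.

```python
from itertools import chain


def flat_map(elements, map_func):
    """Apply map_func to each element and concatenate the results in input order."""
    return list(chain.from_iterable(map_func(x) for x in elements))


batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Each batch is mapped to a sequence of its elements plus one, then flattened.
flattened = flat_map(batches, lambda batch: [x + 1 for x in batch])
```

The results of map_func are consumed one after another, which is why the input order is preserved.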
Creates a Dataset whose elements are generated by generator.
The generator argument must be a callable object that returns
an object that supports the iter() protocol (e.g. a generator function).
The elements generated by generator must be compatible with the given
output_types and (optional) output_shapes arguments.
For example:
import itertools
import tensorflow as tf
tf.compat.v1.enable_eager_execution()
def gen():
  for i in itertools.count(1):
    yield (i, [1] * i)
ds = tf.data.Dataset.from_generator(
    gen, (tf.int64, tf.int64), (tf.TensorShape([]), tf.TensorShape([None])))
for value in ds.take(2):
  print(value)
# (1, array([1]))
# (2, array([1, 1]))
Args
generator
A callable object that returns an object that supports the
iter() protocol. If args is not specified, generator must take no
arguments; otherwise it must take as many arguments as there are values
in args.
output_types
A nested structure of tf.DType objects corresponding to
each component of an element yielded by generator.
output_shapes
(Optional.) A nested structure of tf.TensorShape objects
corresponding to each component of an element yielded by generator.
args
(Optional.) A tuple of tf.Tensor objects that will be evaluated
and passed to generator as NumPy-array arguments.
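The relationship between generator and args can be sketched in plain Python. The model below is hypothetical and does not use TensorFlow: the helper from_generator here only shows that the callable receives one argument per value in args. It also restarts the generator on each pass, which is an assumption of this sketch rather than something stated above.

```python
import itertools


def from_generator(generator, args=()):
    """Return a callable that starts a fresh iteration over generator(*args)."""
    def make_iterator():
        # generator must accept exactly as many arguments as there are values in args.
        return iter(generator(*args))
    return make_iterator


def gen(stop):
    for i in range(1, stop):
        yield (i, [1] * i)


ds = from_generator(gen, args=(4,))
first_pass = list(itertools.islice(ds(), 2))
second_pass = list(itertools.islice(ds(), 2))
```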
Creates a Dataset whose elements are slices of the given tensors.
Note that if tensors contains a NumPy array, and eager execution is not
enabled, the values will be embedded in the graph as one or more
tf.constant operations. For large datasets (> 1 GB), this can waste
memory and run into byte limits of graph serialization. If tensors
contains one or more large NumPy arrays, consider the alternative described
in this guide.
Args
tensors
A dataset element, with each component having the same size in
the 0th dimension.
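Slicing along the 0th dimension can be pictured with nested lists. The sketch below is a plain-Python model under that assumption; slice_components is a hypothetical helper, not part of TensorFlow.

```python
def slice_components(components):
    """Return one tuple per index along the 0th dimension of each component."""
    sizes = {len(c) for c in components}
    if len(sizes) != 1:
        raise ValueError("all components must have the same size in dimension 0")
    return list(zip(*components))


features = [[1, 2], [3, 4], [5, 6]]   # shape (3, 2)
labels = ["a", "b", "c"]              # shape (3,)
# Three elements result, each pairing one feature row with one label.
elements = slice_components((features, labels))
```

Each resulting element drops the leading dimension of every component, which is why the components must agree on its size.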
Creates a Dataset with a single element, comprising the given tensors.
Note that if tensors contains a NumPy array, and eager execution is not
enabled, the values will be embedded in the graph as one or more
tf.constant operations. For large datasets (> 1 GB), this can waste
memory and run into byte limits of grap