Class RecordInput

RecordInput asynchronously reads and randomly yields TFRecords.

A RecordInput Op will continuously read a batch of records asynchronously into a buffer of some fixed capacity. It can also asynchronously yield random records from this buffer.

It will not start yielding until at least buffer_size / 2 elements have been placed into the buffer so that sufficient randomization can take place.
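The buffering rule above can be illustrated with a small, single-threaded toy. This is only a sketch of the idea, not the Op's implementation: the `ShuffleBuffer` class and its method names are hypothetical, the real Op fills the buffer from reader threads, and the toy applies the half-full check on every read for simplicity.

```python
import random


class ShuffleBuffer:
    """Toy, single-threaded stand-in for the record buffer (hypothetical)."""

    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self._items = []
        self._rng = random.Random(seed)

    def can_yield(self):
        # Yielding is allowed once at least buffer_size / 2 records are held.
        return 2 * len(self._items) >= self.buffer_size

    def put(self, record):
        if len(self._items) >= self.buffer_size:
            raise OverflowError("buffer is full")
        self._items.append(record)

    def take(self):
        if not self.can_yield():
            raise RuntimeError("not enough records buffered yet")
        # Swap a random element to the end, then pop it (O(1) random removal).
        i = self._rng.randrange(len(self._items))
        self._items[i], self._items[-1] = self._items[-1], self._items[i]
        return self._items.pop()
```

With `buffer_size=4`, the toy refuses to yield after one `put` and allows it after two, which mirrors the half-full threshold described above.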

The order in which the files are read is shifted each epoch by an amount derived from shift_ratio, so that the data is presented in a different order every epoch.
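One plausible reading of the per-epoch shift is a rotation of the file list. The sketch below assumes that shift_ratio is a fraction between 0 and 1 and that the start position advances by that fraction of the file count each epoch. The function name `epoch_file_order` is hypothetical, and the Op's exact arithmetic may differ.

```python
def epoch_file_order(files, shift_ratio, epoch):
    """Return the file list rotated for the given epoch (illustrative only)."""
    if not files:
        return []
    # Assumption: the start file moves forward by shift_ratio * len(files)
    # positions per epoch, wrapping around at the end of the list.
    shift = (int(shift_ratio * len(files)) * epoch) % len(files)
    return files[shift:] + files[:shift]
```

For four files and a shift_ratio of 0.25, this rotation moves the start file forward by one position per epoch and returns to the original order after four epochs.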




Constructs a RecordInput Op.


Args:

  • file_pattern: File path to the dataset, possibly containing wildcards. All matching files will be iterated over each epoch.
  • batch_size: How many records to return at a time.
  • buffer_size: The maximum number of records the buffer will contain.
  • parallelism: How many reader threads to use for reading from files.
  • shift_ratio: What percentage of the total number of files to move the start file forward by each epoch.
  • seed: The random number seed used by the generator that randomizes records.
  • name: Optional name for the operation.
  • batches: None by default, creating a single batch op. Otherwise specifies how many batches to create, which are returned as a list when get_yield_op() is called. An example use case is to split processing between devices on one computer.
  • compression_type: The type of compression for the file. Currently ZLIB and GZIP are supported. Defaults to None (no compression).


Raises:

  • ValueError: If one of the arguments is invalid.
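A minimal sketch of constructing the Op with the arguments listed above. It assumes that RecordInput can be imported from the internal module `tensorflow.python.ops.data_flow_ops`, which may vary between TensorFlow versions. The helper name `build_yield_op` and all argument values are illustrative, not recommended settings.

```python
def build_yield_op(file_pattern):
    """Build a RecordInput and return its yield op (illustrative sketch)."""
    # Imported lazily so this sketch can be loaded without TensorFlow.
    # Assumption: RecordInput lives in this internal module.
    from tensorflow.python.ops import data_flow_ops

    record_input = data_flow_ops.RecordInput(
        file_pattern=file_pattern,  # may contain wildcards
        batch_size=32,              # records returned per execution
        buffer_size=10000,          # maximum records held in the buffer
        parallelism=4,              # number of reader threads
        shift_ratio=0.1,            # per-epoch shift of the start file
        seed=301,                   # seed for the record shuffler
        name="record_input",
    )
    return record_input.get_yield_op()
```

Calling the helper with a pattern that matches the TFRecord files should return an op that yields batch_size serialized records each time it is executed.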





get_yield_op() adds a node that yields a group of records every time it is executed. If the RecordInput batches parameter is not None, it yields a list of record batches with the specified batch_size.
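The text above does not say how records are divided when batches is set. As an illustration only, the sketch below assumes a round-robin split of one group of records into several lists, which is one simple way to share work between devices. The function name `split_into_batches` is hypothetical.

```python
def split_into_batches(records, batches):
    """Distribute records across `batches` lists in round-robin order."""
    if batches is None:
        # Single-batch case: the records are returned unchanged.
        return records
    groups = [[] for _ in range(batches)]
    for index, record in enumerate(records):
        # Assumption: record i goes to group i modulo the number of batches.
        groups[index % batches].append(record)
    return groups
```

Under this assumption, six records split into two batches give two lists of three records each, with even-indexed records in the first list and odd-indexed records in the second.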