Reads from a collection of CSV-formatted files.
__init__(filenames, column_names=(feature_keys.TrainEvalFeatures.TIMES, feature_keys.TrainEvalFeatures.VALUES), column_dtypes=None, skip_header_lines=None, read_num_records_hint=4096)
CSV-parsing reader for a TimeSeriesInputFn.
filenames: A filename or list of filenames to read the time series from. Each line must have columns corresponding to column_names.
column_names: A list indicating names for each feature. TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES are required; VALUES may be repeated to indicate a multivariate series.
column_dtypes: If provided, must be a list with the same length as column_names, indicating dtypes for each column. Defaults to tf.int64 for TrainEvalFeatures.TIMES and tf.float32 for everything else.
skip_header_lines: Passed on to tf.TextLineReader; skips this number of lines at the beginning of each file.
read_num_records_hint: When not reading a full dataset, indicates the number of records to parse/transfer in a single chunk (for efficiency). The actual number transferred at one time may be more or less.
ValueError: If required column names are not specified, or if lengths do not match.
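The argument checks and column handling described above can be sketched in plain Python. This is an illustration only, not the library's implementation: check_columns and parse_csv are hypothetical helpers, the strings "times" and "values" stand in for the TrainEvalFeatures constants, and the choice of int for times is an assumption of the sketch.

```python
import csv
import io

# Plain strings standing in for feature_keys.TrainEvalFeatures.TIMES / VALUES
# (assumption: the real constants are not imported here).
TIMES = "times"
VALUES = "values"


def check_columns(column_names, column_dtypes=None):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    if TIMES not in column_names or VALUES not in column_names:
        raise ValueError("column_names must include both TIMES and VALUES")
    if column_dtypes is None:
        # Assumed defaults for this sketch: integer times, floats elsewhere.
        column_dtypes = [int if name == TIMES else float for name in column_names]
    if len(column_dtypes) != len(column_names):
        raise ValueError("column_dtypes and column_names differ in length")
    return list(column_dtypes)


def parse_csv(text, column_names, column_dtypes=None, skip_header_lines=0):
    """Parses CSV text into {name: [per-row value]}, grouping repeated names."""
    dtypes = check_columns(column_names, column_dtypes)
    rows = list(csv.reader(io.StringIO(text)))[skip_header_lines:]
    features = {}
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row does not match column_names")
        per_row = {}
        for name, dtype, cell in zip(column_names, dtypes, row):
            per_row.setdefault(name, []).append(dtype(cell))
        for name, cells in per_row.items():
            # A name listed once yields a scalar; a repeated name yields a list.
            value = cells[0] if len(cells) == 1 else cells
            features.setdefault(name, []).append(value)
    return features


# Two VALUES columns describe a bivariate series; one header line is skipped.
parsed = parse_csv(
    "time,a,b\n0,1.0,2.0\n1,1.5,2.5\n",
    (TIMES, VALUES, VALUES),
    skip_header_lines=1,
)
```

Here parsed["values"] holds one two-element list per row, which is how the sketch represents a multivariate series.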
When possible, raises an error if the dataset is too small.
This method allows TimeSeriesReaders to raise informative error messages if the user has selected a window size in their TimeSeriesInputFn which is larger than the dataset size. However, many TimeSeriesReaders will not have access to a dataset size, in which case they do not need to override this method.
minimum_dataset_size: The minimum number of records which should be contained in the dataset. Readers should attempt to raise an error when possible if an epoch of data contains fewer records.
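A minimal sketch of such a size check, assuming the reader knows (or can count) its number of records. The function name and the num_records argument are hypothetical and only illustrate the idea; a reader without access to a dataset size would pass None and skip the check.

```python
def check_dataset_size(num_records, minimum_dataset_size):
    """Raises ValueError when the dataset is known to be too small.

    num_records is None when the size is unknown (hypothetical convention
    for this sketch), in which case nothing can be verified.
    """
    if num_records is None:
        return
    if num_records < minimum_dataset_size:
        raise ValueError(
            "Dataset has {} records, but at least {} are required; "
            "consider a smaller window size.".format(
                num_records, minimum_dataset_size))


# A 100-record dataset is large enough for a 32-record window.
check_dataset_size(num_records=100, minimum_dataset_size=32)
```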
Reads a chunk of data from the tf.ReaderBase for later re-chunking.
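The idea of reading in chunks and re-chunking later can be sketched with plain Python generators. This is a simplified illustration under stated assumptions, not the reader's actual behavior: read_chunks and rechunk are hypothetical names, the windows here are consecutive and non-overlapping, and leftover records are dropped.

```python
from itertools import islice


def read_chunks(records, read_num_records_hint=4096):
    """Yields lists of up to read_num_records_hint records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, read_num_records_hint))
        if not chunk:
            return
        yield chunk


def rechunk(chunks, window_size):
    """Regroups arbitrary-sized chunks into windows of exactly window_size."""
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            buffer = buffer[window_size:]
    # Records that do not fill a final window are dropped in this sketch.


# Ten records are read four at a time, then regrouped into windows of three.
windows = list(
    rechunk(read_chunks(range(10), read_num_records_hint=4), window_size=3))
```

The read size and the window size are independent, which is why the number of records transferred at one time can differ from what the consumer eventually sees.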
Reads a full epoch of data into memory.