tfio.bigquery.BigQueryReadSession

Entry point for reading data from Cloud BigQuery.

Methods

get_streams

Returns a Tensor containing the names of the streams to read from BigQuery.

Returns
Tensor with stream names.

parallel_read_rows

Retrieves rows from the BigQuery service in parallel streams.

from tensorflow_io.bigquery import BigQueryClient

bq_client = BigQueryClient()
bq_read_session = bq_client.read_session(...)
ds1 = bq_read_session.parallel_read_rows(...)

Args
cycle_length Number of streams to process in parallel. If not specified, defaults to the number of streams in the read session.
sloppy If false, elements are produced in deterministic order. If true, the implementation is allowed, for the sake of expediency, to produce elements in a non-deterministic order. When reading from multiple BigQuery streams, setting sloppy=True usually yields better performance.
block_length The number of consecutive elements to pull from a session stream before advancing to the next one.
num_parallel_calls Number of threads to use for processing input streams. If the value tf.data.experimental.AUTOTUNE is used, the number of parallel calls is set dynamically based on available CPU. Defaults to the number of streams in the read session.
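
To make the interaction between cycle_length and block_length concrete, here is a minimal pure-Python sketch of deterministic round-robin interleaving (the sloppy=False case). It is a simplified model, not the library's implementation: the interleave helper and the sample row values are hypothetical, and it assumes the ordering resembles round-robin interleaving over streams. It does not model sloppy=True or num_parallel_calls.

```python
from itertools import islice

def interleave(streams, cycle_length, block_length=1):
    """Simplified, deterministic model of round-robin stream interleaving.

    Hypothetical helper for illustration only; not part of tensorflow_io.
    """
    pending = iter(streams)  # streams that have not been opened yet
    # Open at most cycle_length streams at the same time.
    active = [iter(s) for s in islice(pending, cycle_length)]
    while active:
        still_open = []
        for it in active:
            # Pull up to block_length consecutive rows from this stream.
            block = list(islice(it, block_length))
            yield from block
            if len(block) == block_length:
                still_open.append(it)
            else:
                # Stream exhausted: open the next unopened stream, if any.
                nxt = next(pending, None)
                if nxt is not None:
                    still_open.append(iter(nxt))
        active = still_open

# Stand-in data: three streams of three rows each.
streams = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]

# All three streams open, one row at a time:
one_at_a_time = list(interleave(streams, cycle_length=3, block_length=1))
# ['a1', 'b1', 'c1', 'a2', 'b2', 'c2', 'a3', 'b3', 'c3']

# Two streams open, two consecutive rows per turn:
in_blocks = list(interleave(streams, cycle_length=2, block_length=2))
```

In this model, a larger block_length keeps rows from the same stream adjacent in the output, while a larger cycle_length mixes more streams into each pass.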

Returns
A tf.data.Dataset returning the row keys and the cell contents.

Raises
ValueError If the configured probability is unexpected.

read_rows

Retrieves rows (including values) from the BigQuery service.

Args
stream Name of the stream to read from.
offset Position in the stream.
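
As a rough illustration of what an offset can be used for, the sketch below models resuming a partially consumed stream. It is pure Python and makes no BigQuery calls: read_from is a hypothetical stand-in, and it assumes that offset counts rows from the start of the stream.

```python
from itertools import islice

def read_from(stream_rows, offset=0):
    """Hypothetical stand-in for a stream read: skip `offset` rows, yield the rest."""
    return islice(stream_rows, offset, None)

rows = [f"row-{i}" for i in range(6)]  # stand-in for the rows of one stream

# A first pass consumes three rows and then stops (for example, an interrupted job).
first = list(islice(read_from(rows), 3))

# Resuming at offset=len(first) continues without repeating or skipping rows.
rest = list(read_from(rows, offset=len(first)))
```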

Returns
A tf.data.Dataset returning the row keys and the cell contents.

Raises
ValueError If the configured probability is unexpected.