tfio.bigquery.BigQueryReadSession

Entry point for reading data from Cloud BigQuery.

tfio.bigquery.BigQueryReadSession(
    parent, project_id, table_id, dataset_id, selected_fields, output_types,
    row_restriction, requested_streams, streams, avro_schema, client_resource
)

Methods

get_streams

get_streams()

Returns a Tensor with the names of the streams used for reading data from BigQuery.

Returns:

Tensor with stream names.

parallel_read_rows

parallel_read_rows(
    cycle_length=None, sloppy=False, block_length=1, num_parallel_calls=None
)

Retrieves rows from the BigQuery service in parallel streams. (deprecated arguments)

from tensorflow_io.bigquery import BigQueryClient

bq_client = BigQueryClient()
bq_read_session = bq_client.read_session(...)
ds1 = bq_read_session.parallel_read_rows(...)

Args:

  • cycle_length: Number of threads to run in parallel. If not specified, it defaults to the number of streams in the read session.
  • sloppy: If false, elements are produced in deterministic order. If true, the implementation is allowed, for the sake of expediency, to produce elements in a non-deterministic order. Otherwise, whether the order is deterministic or non-deterministic depends on the tf.data.Options.experimental_deterministic value.
  • block_length: The number of consecutive elements to pull from an input Dataset before advancing to the next input Dataset.
  • num_parallel_calls: If specified, the implementation creates a threadpool, which is used to fetch inputs from cycle elements asynchronously and in parallel. The default behavior is to fetch inputs from cycle elements synchronously with no parallelism. If the value tf.data.experimental.AUTOTUNE is used, the number of parallel calls is set dynamically based on available CPU.
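How cycle_length and block_length shape the output order can be illustrated with a minimal pure-Python sketch. It only mimics a deterministic round-robin over streams (the sloppy=False case) and is not the tf.data implementation. The function name interleave_rows and the sample stream contents are hypothetical.

```python
from itertools import islice


def interleave_rows(streams, cycle_length, block_length=1):
    """Simplified model: cycle over up to `cycle_length` open streams,
    taking up to `block_length` consecutive rows from each per turn."""
    pending = iter(streams)
    # Open the first `cycle_length` streams.
    active = [iter(s) for s in islice(pending, cycle_length)]
    while active:
        still_open = []
        for rows in active:
            block = list(islice(rows, block_length))
            yield from block
            if len(block) == block_length:
                # The stream may have more rows; keep it in the cycle.
                still_open.append(rows)
            else:
                # The stream ran dry; its slot goes to the next unopened stream.
                replacement = next(pending, None)
                if replacement is not None:
                    still_open.append(iter(replacement))
        active = still_open


# Hypothetical rows from two streams of a read session.
stream_a = ["a1", "a2", "a3"]
stream_b = ["b1", "b2", "b3"]

one_at_a_time = list(interleave_rows([stream_a, stream_b], cycle_length=2))
in_blocks = list(interleave_rows([stream_a, stream_b], cycle_length=2, block_length=2))
single_cycle = list(interleave_rows([stream_a, stream_b], cycle_length=1))

print(one_at_a_time)  # ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
print(in_blocks)      # ['a1', 'a2', 'b1', 'b2', 'a3', 'b3']
print(single_cycle)   # ['a1', 'a2', 'a3', 'b1', 'b2', 'b3']
```

In this model, a larger block_length keeps rows from the same stream together, while cycle_length=1 degenerates to reading the streams one after another.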

Returns:

A tf.data.Dataset that yields the rows read from BigQuery, one element per row containing the values of the selected fields.

Raises:

  • ValueError: If the configured probability is unexpected.

read_rows

read_rows(
    stream
)

Retrieves rows (including values) from the BigQuery service.

Args:

  • stream: name of the stream to read from.

Returns:

A tf.data.Dataset that yields the rows read from BigQuery, one element per row containing the values of the selected fields.

Raises:

  • ValueError: If the configured probability is unexpected.
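The relationship between get_streams and read_rows (one call to read_rows per stream name) can be sketched without a BigQuery connection. The class FakeReadSession below is a hypothetical in-memory stand-in, not part of tfio: it uses a plain list where the real get_streams returns a Tensor, and a plain iterator where the real read_rows returns a tf.data.Dataset. The stream names and rows are made up for illustration.

```python
class FakeReadSession:
    """Hypothetical stand-in for BigQueryReadSession (not part of tfio)."""

    def __init__(self, rows_by_stream):
        self._rows_by_stream = rows_by_stream

    def get_streams(self):
        # The real method returns a Tensor of stream names; a list stands in here.
        return list(self._rows_by_stream)

    def read_rows(self, stream):
        # The real method returns a tf.data.Dataset; an iterator stands in here.
        return iter(self._rows_by_stream[stream])


def read_all_sequentially(session):
    """Read every stream of the session, one stream after another."""
    for stream in session.get_streams():
        yield from session.read_rows(stream)


# Made-up stream names and rows.
session = FakeReadSession(
    {
        "stream-0": [{"name": "a"}, {"name": "b"}],
        "stream-1": [{"name": "c"}],
    }
)
rows = list(read_all_sequentially(session))
print(rows)  # [{'name': 'a'}, {'name': 'b'}, {'name': 'c'}]
```

Reading streams one by one like this is the sequential counterpart of parallel_read_rows, which reads several streams at once.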