tflite_model_maker.question_answer.DataLoader

DataLoader for question answering.

tflite_model_maker.question_answer.DataLoader(
    dataset, size, version_2_with_negative, examples, features, squad_file
)

Used in the notebooks

Used in the tutorials
BERT Question Answer with TensorFlow Lite Model Maker

Args
`dataset`	A tf.data.Dataset object that contains a potentially large set of elements, where each element is a pair of (input_data, target). The `input_data` is the raw input data, such as an image or a piece of text, while the `target` is the ground truth for that input, such as the classification label of an image.
`size`	The size of the dataset. tf.data.Dataset doesn't provide a way to get the length directly, since it is lazy-loaded and may be infinite.

Attributes
`size`	Returns the size of the dataset. Note that this may be None because the exact size of the dataset isn't required to create an instance of this class, and tf.data.Dataset doesn't provide a way to get the length directly, since it is lazy-loaded and may be infinite. In most cases, however, when an instance of this class is created by a helper function such as 'from_folder', the size is computed during preprocessing and this attribute returns an int.
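
A plain Python iterator shows why the size has to be tracked separately. This is an illustration only, not the library's code: `lazy_examples` and `known_size` are hypothetical names, and the iterator merely stands in for a lazily evaluated dataset.

```python
import itertools


def lazy_examples():
    """Hypothetical stand-in for a lazily loaded, possibly infinite dataset."""
    return itertools.count()  # yields 0, 1, 2, ... without end


stream = lazy_examples()

# A lazy stream cannot report its own length.
try:
    len(stream)
    has_len = True
except TypeError:
    has_len = False

# So the size has to be carried alongside the data, and it may be unknown.
known_size = None

# Elements can still be consumed one at a time.
first_three = list(itertools.islice(stream, 3))
```

Code that relies on `size` should therefore be prepared to handle None.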

Methods

`from_squad`

@classmethod
from_squad(
    filename,
    model_spec,
    is_training=True,
    version_2_with_negative=False,
    cache_dir=None
)

Loads data in SQuAD format and preprocesses the text according to model_spec.

Args
`filename`	Name of the file.
`model_spec`	Specification for the model.
`is_training`	Whether the loaded data is for training or not.
`version_2_with_negative`	Whether it's SQuAD 2.0 format.
`cache_dir`	The cache directory to save preprocessed data. If None, generates a temporary directory to cache preprocessed data.

Returns
QuestionAnswerDataLoader object.
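
As a rough sketch of the input `from_squad` expects, the snippet below writes a tiny file in the publicly documented SQuAD 2.0 JSON layout using only the standard library. The context sentence, questions, ids, and file name are made up for illustration, and the snippet does not call the library itself.

```python
import json
import os
import tempfile

# Illustrative content only; not taken from any real dataset.
context = "TensorFlow Lite Model Maker simplifies training on-device models."
answer_text = "TensorFlow Lite Model Maker"

squad_v2 = {
    "version": "v2.0",
    "data": [{
        "title": "example",
        "paragraphs": [{
            "context": context,
            "qas": [
                {
                    # Answerable question: answer_start is a character offset
                    # into the context string.
                    "id": "q1",
                    "question": "What simplifies training on-device models?",
                    "is_impossible": False,
                    "answers": [{
                        "text": answer_text,
                        "answer_start": context.index(answer_text),
                    }],
                },
                {
                    # Unanswerable question, the SQuAD 2.0 addition.
                    "id": "q2",
                    "question": "Who wrote the library?",
                    "is_impossible": True,
                    "answers": [],
                },
            ],
        }],
    }],
}

path = os.path.join(tempfile.mkdtemp(), "train.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(squad_v2, f)

# Read the file back to confirm it round-trips.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

A file like this would be passed as `filename`, with `version_2_with_negative=True` because it contains unanswerable questions. The `model_spec` argument must come from the library and is not shown here.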

`gen_dataset`

gen_dataset(
    batch_size=1,
    is_training=False,
    shuffle=False,
    input_pipeline_context=None,
    preprocess=None,
    drop_remainder=False
)

Generates a sharded and batched tf.data.Dataset for training/evaluation.

Args
`batch_size`	An integer; the returned dataset will be batched by this size.
`is_training`	A boolean; when True, the returned dataset will be optionally shuffled and repeated as an endless dataset.
`shuffle`	A boolean; when True, the returned dataset will be shuffled to create randomness during model training.
`input_pipeline_context`	An InputContext instance, used to shard the dataset among multiple workers when a distribution strategy is used.
`preprocess`	A function that takes three arguments, in order: feature, label, and the boolean is_training.
`drop_remainder`	A boolean; whether to drop the final batch when it has fewer than batch_size elements.

Returns
A tf.data.Dataset ready to be consumed by a Keras model.
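
The effect of `batch_size` and `drop_remainder` can be illustrated in plain Python. This is a sketch of the batching semantics only, not the library's implementation; the helper name `batch` is hypothetical.

```python
def batch(items, batch_size, drop_remainder=False):
    """Group items into lists of batch_size; optionally drop a short final batch."""
    items = list(items)
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    # The last batch is short when len(items) is not a multiple of batch_size.
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


kept = batch(range(5), batch_size=2)
dropped = batch(range(5), batch_size=2, drop_remainder=True)
```

With five elements and a batch size of two, `kept` ends with a one-element batch, while `dropped` contains only the two full batches. Dropping the remainder is useful when a model requires a fixed batch dimension.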

`split`

split(
    fraction
)

Splits dataset into two sub-datasets with the given fraction.

Primarily used for splitting the data set into training and testing sets.

Args
`fraction`	A float; the fraction of the original data that goes into the first returned sub-dataset.

Returns
The two sub-datasets produced by the split.
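
A minimal sketch of what splitting by a fraction means, using a plain list. The helper `split_by_fraction` is hypothetical, and placing the cut at `int(size * fraction)` is an assumption made for illustration rather than a statement about the library's internals.

```python
def split_by_fraction(items, fraction):
    """Return (first, second), where first holds roughly `fraction` of items."""
    items = list(items)
    # Assumption: the split point is truncated toward zero.
    cut = int(len(items) * fraction)
    return items[:cut], items[cut:]


train, test = split_by_fraction(range(10), 0.8)
```

Here `train` receives the first eight elements and `test` the remaining two, which mirrors the typical use of `split` for building training and testing sets.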

`__len__`

__len__()
