Base class for decoders that turn a list of bytes into (composite) tensors.
Sub-classes must implement decode_record() (see its docstring for requirements).
Decoder instances can be saved as a SavedModel by save_decoder(). The SavedModel can be loaded back by load_decoder(). However, the loaded decoder will always be of the type LoadedDecoder and will only have the public interfaces listed in this base class available.
Attributes | |
---|---|
record_index_tensor_name | The name of the tensor indicating which record a slice is from. The decoded tensors are batch-aligned among themselves, but they do not have to be batch-aligned with the input records. If they are not, sub-classes should override this attribute to tie the batch dimension to the input records. The record index tensor must be a 2-D SparseTensor or RaggedTensor of integral type and must not contain "missing" values. For example, a record index tensor of [[0], [0], [2]] means that, of the 3 "rows" in the output "batch", the first two rows came from the first record and the third row came from the third record. The name must not be an empty string. |
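The record index semantics can be illustrated without TensorFlow. The following is a minimal sketch, not part of this API: it uses a plain nested list as a stand-in for the 2-D record index tensor, and the helper name rows_by_record is hypothetical.

```python
from collections import defaultdict


def rows_by_record(record_index, num_records):
    """Groups output row numbers by the input record they came from.

    `record_index` is a plain-Python stand-in for the 2-D record index
    tensor: one single-element row per output row, holding the index of
    the source record.
    """
    groups = defaultdict(list)
    for row, (record,) in enumerate(record_index):
        groups[record].append(row)
    # Records that produced no output rows map to an empty list.
    return [groups.get(i, []) for i in range(num_records)]


# The example from the description above: 3 output rows, 3 input records.
print(rows_by_record([[0], [0], [2]], num_records=3))
# → [[0, 1], [], [2]]
```

Here the second record contributes no rows to the output batch, which is why the decoded tensors are not batch-aligned with the input.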
Methods
decode_record
@abc.abstractmethod
decode_record( records: tf.Tensor ) -> Dict[Text, TensorAlike]
Sub-classes should implement this.
Implementations must use TF ops to derive the result (composite) tensors, as this function will be traced and become a tf.function (thus a TF Graph). Note that AutoGraph is not enabled in such tracing, which means that Python control flow and loops will not be converted to TF cond / loops automatically.
The returned tensors must be batch-aligned (i.e. they should be at least of rank 1, and their outer-most dimensions must be of the same size). They do not have to be batch-aligned with the input tensor, but if they are not, an additional tensor must be provided among the results to indicate which input record a "row" in the output batch comes from. See record_index_tensor_name for more details.
Args | |
---|---|
records | A 1-D string tensor that contains the records to be decoded. |
Returns | |
---|---|
A dict of (composite) tensors. |
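The requirements above can be sketched with a standalone function. This is an illustration under stated assumptions, not the library's implementation: it is a plain function rather than a sub-class of this base class, the output keys "token" and "record_index" are hypothetical, and tf.function is called with autograph=False only to mimic the tracing constraint described above. Each record is split on whitespace and every token becomes its own output row, so the output is not batch-aligned with the input and a record index tensor is returned alongside it.

```python
import tensorflow as tf

# Hypothetical output keys, chosen only for this sketch.
TOKEN_KEY = "token"
RECORD_INDEX_KEY = "record_index"


def decode_record(records):
    """Emits one output row per whitespace-separated token in each record."""
    # Ragged string tensor: one (possibly empty) row of tokens per record.
    tokens = tf.strings.split(records)
    # One token per output row; rows no longer line up with input records.
    flat_tokens = tokens.flat_values
    # For each token, the index of the record it was split from.
    record_ids = tokens.value_rowids()
    # 2-D integral RaggedTensor with exactly one value per output row.
    record_index = tf.RaggedTensor.from_row_lengths(
        record_ids, tf.ones_like(record_ids))
    return {TOKEN_KEY: flat_tokens, RECORD_INDEX_KEY: record_index}


# Trace with AutoGraph disabled; only TF ops are used, so no Python
# control flow needs converting.
traced = tf.function(
    decode_record,
    autograph=False,
    input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)],
)

decoded = traced(tf.constant(["a b", "", "c"]))
print(decoded[TOKEN_KEY].numpy().tolist())
print(decoded[RECORD_INDEX_KEY].to_list())
```

For the three input records "a b", "" and "c", the record index comes out as [[0], [0], [2]]: the first two tokens come from the first record, the empty second record contributes nothing, and the last token comes from the third record.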
output_type_specs
output_type_specs() -> Dict[Text, tf.TypeSpec]
Returns the tf.TypeSpecs of the decoded tensors.
Returns | |
---|---|
A dict whose keys are the same as the keys of the dict returned by decode_record() and whose values are the tf.TypeSpec of the corresponding (composite) tensor. |