A class designed for a dedicated evaluator task.

SidecarEvaluator is expected to run in a process on a machine separate from the training cluster. It acts as a dedicated evaluator, evaluating the metric results of a training cluster in which one or more workers perform the training and save checkpoints.

The SidecarEvaluator API is compatible with training clusters that use either a Custom Training Loop (CTL) or Keras. Using the model (with compiled metrics) provided at __init__, SidecarEvaluator repeatedly performs evaluation "epochs" whenever it finds a checkpoint that has not yet been used. Depending on the steps argument, an eval epoch is an evaluation over all eval data, or over up to a certain number of steps (batches). See examples below for how the training program should save the checkpoints in order to be recognized by SidecarEvaluator.
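The effect of the steps argument on one eval epoch can be illustrated with a small, library-free sketch. It is not SidecarEvaluator's implementation: the helper batches_for_eval_epoch is hypothetical, and treating steps=None as "use all eval data" is an assumption made here for illustration.

```python
from itertools import islice
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def batches_for_eval_epoch(eval_data: Iterable[T], steps: Optional[int] = None) -> List[T]:
    """Return the batches that a single eval epoch would consume.

    Hypothetical helper, not part of TensorFlow. Assumes steps=None means
    "evaluate until the eval data is exhausted".
    """
    if steps is None:
        # No cap: the eval epoch covers all eval data.
        return list(eval_data)
    # Cap: the eval epoch stops after at most `steps` batches.
    return list(islice(eval_data, steps))


eval_batches = ["batch-0", "batch-1", "batch-2", "batch-3"]
full_epoch = batches_for_eval_epoch(eval_batches)
capped_epoch = batches_for_eval_epoch(eval_batches, steps=2)
```

Here full_epoch covers all four batches, while capped_epoch stops after the first two. A steps value larger than the dataset simply yields every batch.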

Because SidecarEvaluator uses model.evaluate under the hood, it also supports arbitrary Keras callbacks. That is, if one or more callbacks are provided, their on_test_batch_begin and on_test_batch_end methods are called at the start and end of each batch, and their on_test_begin and on_test_end methods are called at the start and end of each evaluation epoch. Note that SidecarEvaluator may skip some checkpoints: it always picks up the latest checkpoint available, and the training side can produce multiple checkpoints during a single evaluation epoch.
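The hook ordering and the checkpoint-skipping behavior described above can be mimicked in plain Python. This is a simulation under stated assumptions, not the Keras or SidecarEvaluator code: RecordingCallback, run_eval_epoch, and checkpoints_evaluated are hypothetical names, and each "poll" is assumed to see the checkpoints ordered oldest to newest.

```python
from typing import List, Sequence


class RecordingCallback:
    """Hypothetical stand-in exposing the four test hooks named in the text."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_test_begin(self, logs=None) -> None:
        self.events.append("on_test_begin")

    def on_test_batch_begin(self, batch, logs=None) -> None:
        self.events.append(f"on_test_batch_begin({batch})")

    def on_test_batch_end(self, batch, logs=None) -> None:
        self.events.append(f"on_test_batch_end({batch})")

    def on_test_end(self, logs=None) -> None:
        self.events.append("on_test_end")


def run_eval_epoch(num_batches: int, callbacks: Sequence[RecordingCallback]) -> None:
    """Simulate one evaluation epoch, firing hooks in the documented order."""
    for cb in callbacks:
        cb.on_test_begin()
    for batch in range(num_batches):
        for cb in callbacks:
            cb.on_test_batch_begin(batch)
        # The real evaluator would compute metrics for this batch here.
        for cb in callbacks:
            cb.on_test_batch_end(batch)
    for cb in callbacks:
        cb.on_test_end()


def checkpoints_evaluated(visible_at_each_poll: Sequence[Sequence[str]]) -> List[str]:
    """Return the checkpoints evaluated when only the latest one is ever used."""
    evaluated: List[str] = []
    for visible in visible_at_each_poll:
        if not visible:
            continue  # Nothing has been saved yet.
        latest = visible[-1]
        if latest not in evaluated:
            evaluated.append(latest)
    return evaluated


recorder = RecordingCallback()
run_eval_epoch(2, [recorder])

# While ckpt-1 was being evaluated, training saved ckpt-2 and ckpt-3.
polls = [["ckpt-1"], ["ckpt-1", "ckpt-2", "ckpt-3"], ["ckpt-1", "ckpt-2", "ckpt-3"]]
evaluated = checkpoints_evaluated(polls)
```

In this simulation ckpt-2 is never evaluated, because ckpt-3 already exists by the time the evaluator looks for new work.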


model = tf.keras.models.Sequential(...)