tf.train.CheckpointManager

Manages checkpoints for a `tf.train.Checkpoint`, deleting old ones as new ones are saved.

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.train.CheckpointManager

tf.train.CheckpointManager(
    checkpoint, directory, max_to_keep, keep_checkpoint_every_n_hours=None,
    checkpoint_name='ckpt'
)

Example usage:

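import tensorflow as tf

# `model` and `optimizer` are assumed to be trackable objects defined elsewhere.
checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
manager = tf.train.CheckpointManager(
    checkpoint, directory="/tmp/model", max_to_keep=5)
# Resume from the most recent checkpoint; `latest_checkpoint` is None if none exists.
status = checkpoint.restore(manager.latest_checkpoint)
while True:
  # train
  manager.save()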

`CheckpointManager` preserves its own state across instantiations (see the `__init__` documentation for details). Only one `CheckpointManager` should be active in a particular directory at a time.
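
As a rough illustration of state being preserved across instantiations, the sketch below saves with one manager, discards it, and creates a second manager on the same directory. The temporary directory, the tracked `step` variable, and the names `first` and `second` are assumptions made for this example only.

```python
import os
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # throwaway directory for the illustration
checkpoint = tf.train.Checkpoint(step=tf.Variable(0, dtype=tf.int64))

# The first manager writes two checkpoints and is then discarded.
first = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)
first.save()
first.save()
names_before = [os.path.basename(p) for p in first.checkpoints]
del first  # only one manager should be active per directory

# A new manager on the same directory reads the "checkpoint" state file.
second = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)
names_after = [os.path.basename(p) for p in second.checkpoints]
latest_name = os.path.basename(second.latest_checkpoint)
print(names_before, names_after, latest_name)
```

The second manager should report the same managed checkpoints as the first, since both read the same state file in `directory`.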

Args
`checkpoint`	The `tf.train.Checkpoint` instance to save and manage checkpoints for.
`directory`	The path to a directory in which to write checkpoints. A special file named "checkpoint" is also written to this directory (in a human-readable text format) which contains the state of the `CheckpointManager`.
`max_to_keep`	An integer, the number of checkpoints to keep. Unless preserved by `keep_checkpoint_every_n_hours`, checkpoints will be deleted from the active set, oldest first, until only `max_to_keep` checkpoints remain. If `None`, no checkpoints are deleted and everything stays in the active set. Note that `max_to_keep=None` will keep all checkpoint paths in memory and in the checkpoint state protocol buffer on disk.
`keep_checkpoint_every_n_hours`	Upon removal from the active set, a checkpoint will be preserved if at least `keep_checkpoint_every_n_hours` hours have passed since the last preserved checkpoint. The default setting of `None` does not preserve any checkpoints in this way.
`checkpoint_name`	Custom name for the checkpoint file.
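
The effect of `max_to_keep` can be sketched as follows: with `max_to_keep=2`, saving four times should leave only the two newest checkpoints in the active set and on disk. The temporary directory and the `step` variable are assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # throwaway directory for the illustration
step = tf.Variable(0, dtype=tf.int64)
checkpoint = tf.train.Checkpoint(step=step)
manager = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=2)

paths = []
for _ in range(4):
    step.assign_add(1)            # stand-in for a training step
    paths.append(manager.save())  # older checkpoints are swept after each save

kept_names = [os.path.basename(p) for p in manager.checkpoints]
# Files belonging to the oldest checkpoint should no longer exist.
oldest_files = tf.io.gfile.glob(paths[0] + "*")
print(kept_names, oldest_files)
```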

Raises
`ValueError`	If `max_to_keep` is neither `None` nor a positive integer.
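
A minimal sketch of the error case, assuming a temporary directory and a placeholder `step` variable: constructing a manager with `max_to_keep=0` should raise `ValueError`.

```python
import tempfile

import tensorflow as tf

checkpoint = tf.train.Checkpoint(step=tf.Variable(0, dtype=tf.int64))

try:
    # Zero is not a positive integer, so construction is expected to fail.
    tf.train.CheckpointManager(checkpoint, tempfile.mkdtemp(), max_to_keep=0)
    raised = False
except ValueError:
    raised = True
print(raised)
```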

Attributes
`checkpoints`	A list of managed checkpoints. Note that checkpoints saved due to `keep_checkpoint_every_n_hours` will not show up in this list (to avoid ever-growing filename lists).
`latest_checkpoint`	The prefix of the most recent checkpoint in `directory`. Equivalent to `tf.train.latest_checkpoint(directory)` where `directory` is the constructor argument to `CheckpointManager`. Suitable for passing to `tf.train.Checkpoint.restore` to resume training.
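
The sketch below illustrates how `latest_checkpoint` might be used to resume: a variable is saved, overwritten, and then restored from the prefix the manager reports. The temporary directory and the `weights` variable are assumptions made for the example.

```python
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # throwaway directory for the illustration
weights = tf.Variable([1.0, 2.0])
checkpoint = tf.train.Checkpoint(weights=weights)
manager = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)

saved_path = manager.save()
weights.assign([0.0, 0.0])  # simulate losing the trained values

# Both calls are expected to report the same checkpoint prefix.
same_prefix = manager.latest_checkpoint == tf.train.latest_checkpoint(directory)

checkpoint.restore(manager.latest_checkpoint)
restored = weights.numpy().tolist()
print(same_prefix, restored)
```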

Methods

`save`

save(
    checkpoint_number=None
)

Creates a new checkpoint and manages it.

Args
`checkpoint_number`	An optional integer, or an integer-dtype `Variable` or `Tensor`, used to number the checkpoint. If `None` (default), checkpoints are numbered using `checkpoint.save_counter`. Even if `checkpoint_number` is provided, `save_counter` is still incremented. A user-provided `checkpoint_number` is not incremented even if it is a `Variable`.

Returns
The path to the new checkpoint. It is also recorded in the `checkpoints` and `latest_checkpoint` properties.
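
To illustrate the numbering rules described above, the following sketch passes a `Variable` as `checkpoint_number` and then saves again with the default. The temporary directory and the `step` variable are assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # throwaway directory for the illustration
step = tf.Variable(100, dtype=tf.int64)
checkpoint = tf.train.Checkpoint(step=step)
manager = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)

# Number the checkpoint with the user-provided variable.
numbered_name = os.path.basename(manager.save(checkpoint_number=step))
step_after = int(step.numpy())                            # not incremented by save()
counter_after = int(checkpoint.save_counter.numpy())      # incremented regardless

# With no checkpoint_number, the (already incremented) save_counter is used.
default_name = os.path.basename(manager.save())
print(numbered_name, step_after, counter_after, default_name)
```

Because `save_counter` advances on every call, mixing explicit and default numbering in one directory can produce non-contiguous names; picking one scheme per training run is probably simpler.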
