Train and evaluate the estimator.
Compat aliases for migration: tf.compat.v1.estimator.train_and_evaluate. See the Migration guide for more details.
tf.estimator.train_and_evaluate( estimator, train_spec, eval_spec )
This utility function trains, evaluates, and (optionally) exports the model by using the given estimator. All training-related specification is held in train_spec, including the training input_fn, training max steps, etc. All evaluation- and export-related specification is held in eval_spec, including the evaluation input_fn, steps, etc.
This utility function provides consistent behavior for both local (non-distributed) and distributed configurations. The default distribution configuration is parameter server-based between-graph replication. For other types of distribution configurations such as all-reduce training, please use DistributionStrategies.
Overfitting: In order to avoid overfitting, it is recommended to set up the
input_fn to shuffle the training data properly.
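As a sketch of the shuffling advice above, an input pipeline can reshuffle the examples on every epoch before they reach the model; the pure-Python generator below is an illustration only (the function name and dataset are hypothetical), standing in for what `dataset.shuffle(buffer_size)` does in a real tf.data pipeline.

```python
import random

def shuffled_epochs(examples, num_epochs, seed=42):
    """Yield examples for num_epochs epochs, reshuffling each epoch.

    Mimics the effect of tf.data's shuffle-then-repeat: the model
    never sees the data in one fixed order, which helps it avoid
    overfitting to ordering artifacts in the training file.
    """
    rng = random.Random(seed)
    for _ in range(num_epochs):
        epoch = list(examples)
        rng.shuffle(epoch)   # fresh order every epoch
        yield from epoch

data = [1, 2, 3, 4, 5]
first_epoch = list(shuffled_epochs(data, num_epochs=1))
# Same multiset of examples, just reordered.
assert sorted(first_epoch) == data
```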
Stop condition: In order to support both distributed and non-distributed configurations reliably, the only supported stop condition for model training is train_spec.max_steps. If train_spec.max_steps is None, the model is trained forever. Use with care if the model stop condition is different. For example, assume that the model is expected to be trained with one epoch of training data, and the training input_fn is configured to throw OutOfRangeError after going through one epoch, which stops the Estimator.train. For a three-training-worker distributed configuration, each training worker is likely to go through the whole epoch independently. So, the model will be trained with three epochs of training data instead of one epoch.
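The epoch-counting pitfall above can be checked with a small simulation: if each worker independently drains its own copy of a one-epoch input pipeline, the passes simply add up. The generator and helper below are illustrative stand-ins, not Estimator APIs.

```python
def one_epoch_input_fn(examples):
    """Stand-in for an input_fn that signals end-of-data after one
    epoch (StopIteration here plays the role of OutOfRangeError)."""
    yield from examples

def run_workers(num_workers, examples):
    # Each worker independently exhausts its own pipeline copy.
    seen = 0
    for _ in range(num_workers):
        for _ in one_epoch_input_fn(examples):
            seen += 1
    return seen

# One intended epoch of 10,000 examples, three training workers:
assert run_workers(3, range(10_000)) == 30_000  # three epochs, not one
```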
Example of local (non-distributed) training:
```python
# Set up feature columns.
categorial_feature_a = categorial_column_with_hash_bucket(...)
categorial_feature_a_emb = embedding_column(
    categorical_column=categorial_feature_a, ...)
...  # other feature columns

estimator = DNNClassifier(
    feature_columns=[categorial_feature_a_emb, ...],
    hidden_units=[1024, 512, 256])

# Or set up the model directory
#   estimator = DNNClassifier(
#       config=tf.estimator.RunConfig(
#           model_dir='/my_model', save_summary_steps=100),
#       feature_columns=[categorial_feature_a_emb, ...],
#       hidden_units=[1024, 512, 256])

# Input pipeline for train and evaluate.
def train_input_fn():  # returns x, y
  # please shuffle the data.
  pass

def eval_input_fn():  # returns x, y
  pass

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=1000)
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```
Note that in the current implementation estimator.evaluate will be called multiple times. This means that the evaluation graph (including eval_input_fn) will be re-created for each evaluation. estimator.train will be called only once.
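Conceptually, the local call pattern looks like the loop below: a single training run with evaluation triggered repeatedly along the way. The stub class is hypothetical, written only to make the call counts concrete; the real Estimator drives evaluation from checkpoints, not a fixed step interval.

```python
class StubEstimator:
    """Illustrative stand-in for tf.estimator.Estimator that
    counts how often train() and evaluate() are invoked."""
    def __init__(self):
        self.train_calls = 0
        self.evaluate_calls = 0

    def train(self, max_steps, eval_every):
        # Training proceeds once to max_steps; evaluation fires
        # periodically, re-creating the eval graph (and calling
        # eval_input_fn) each time.
        self.train_calls += 1
        for _step in range(eval_every, max_steps + 1, eval_every):
            self.evaluate()

    def evaluate(self):
        self.evaluate_calls += 1

est = StubEstimator()
est.train(max_steps=1000, eval_every=250)
assert est.train_calls == 1      # estimator.train called once
assert est.evaluate_calls == 4   # estimator.evaluate called many times
```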
Example of distributed training:
Regarding the example of distributed training, the code above can be used without a change (please do make sure that the RunConfig.model_dir for all workers is set to the same directory, i.e., a shared file system all workers can read and write). The only extra work to do is setting the environment variable TF_CONFIG correctly for each worker correspondingly.
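For distributed runs, each process additionally needs the TF_CONFIG environment variable describing the cluster and that process's own task. A minimal sketch follows; the host:port addresses are placeholders, and each worker would set a different task index.

```python
import json
import os

# Hypothetical cluster: one chief, two workers, one parameter server.
# Replace the placeholder addresses with real host:port pairs.
tf_config = {
    "cluster": {
        "chief": ["host0:2222"],
        "worker": ["host1:2222", "host2:2222"],
        "ps": ["host3:2222"],
    },
    # Each process declares its own role; this one is the first worker.
    "task": {"type": "worker", "index": 0},
}
os.environ["TF_CONFIG"] = json.dumps(tf_config)

# The Estimator reads and parses this variable at startup.
parsed = json.loads(os.environ["TF_CONFIG"])
assert parsed["task"]["type"] == "worker"
```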