Training (contrib)

Training and input utilities.

Splitting sequence inputs into minibatches with state saving

Use tf.contrib.training.SequenceQueueingStateSaver, or its wrapper tf.contrib.training.batch_sequences_with_states, if your input data has a dynamic primary time / frame count axis that you would like to split into fixed-size segments during minibatching, and you want to carry state forward across the segments of each example.
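The idea of fixed-size segments with forward state can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the SequenceQueueingStateSaver implementation: the helper names, the pad_value argument, and the running-sum "state" are assumptions made for illustration, and num_unroll is used here simply as the name for the segment length.

```python
def split_into_segments(sequence, num_unroll, pad_value=0):
    """Split a variable-length sequence into fixed-size segments.

    The last segment is padded to num_unroll; the number of real
    (unpadded) steps is returned alongside each segment.
    """
    segments = []
    for start in range(0, len(sequence), num_unroll):
        chunk = list(sequence[start:start + num_unroll])
        valid_length = len(chunk)
        chunk += [pad_value] * (num_unroll - valid_length)
        segments.append((chunk, valid_length))
    return segments


def run_with_saved_state(sequence, num_unroll, initial_state=0):
    """Process segments in order, carrying the state from one to the next."""
    state = initial_state
    for chunk, valid_length in split_into_segments(sequence, num_unroll):
        # Toy recurrent update: a running sum over the unpadded steps only.
        for value in chunk[:valid_length]:
            state += value
    return state


segments = split_into_segments([1, 2, 3, 4, 5, 6, 7], num_unroll=3)
final_state = run_with_saved_state([1, 2, 3, 4, 5, 6, 7], num_unroll=3)
```

Here a sequence of seven steps becomes three segments of length three, the last one padded, and the state computed at the end of one segment is the starting state for the next.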

Online data resampling

To resample data with replacement on a per-example basis, use tf.contrib.training.rejection_sample or tf.contrib.training.resample_at_rate. For rejection_sample, provide a boolean Tensor indicating whether to accept or reject each example; the resulting batch size is always the same. For resample_at_rate, provide the desired rate for each example; the resulting batch size may vary. To specify relative rather than absolute rates, use tf.contrib.training.weighted_resample, which also returns the actual resampling rate used for each output example.
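The difference between fixed and varying batch sizes can be seen in a small pure-Python sketch. This is an illustration of the two ideas only, not the library code: the function names, the accept_fn predicate, and the "whole part plus a random extra copy" scheme for fractional rates are assumptions, and the real ops may draw their counts differently.

```python
import random


def resample_at_rate_sketch(examples, rates, rng):
    """Emit each example about `rate` times in expectation.

    The output length depends on the rates, so it varies between calls.
    """
    output = []
    for example, rate in zip(examples, rates):
        copies = int(rate)                # whole part: guaranteed copies
        if rng.random() < rate - copies:  # fractional part: sometimes one more
            copies += 1
        output.extend([example] * copies)
    return output


def rejection_sample_sketch(example_stream, accept_fn, batch_size):
    """Draw examples until batch_size have been accepted.

    The output length is always exactly batch_size.
    """
    batch = []
    for example in example_stream:
        if accept_fn(example):
            batch.append(example)
            if len(batch) == batch_size:
                return batch
    raise ValueError("stream ended before the batch was full")


rng = random.Random(0)
resampled = resample_at_rate_sketch(["a", "b", "c"], [2.0, 0.0, 1.0], rng)
accepted = rejection_sample_sketch(
    iter(range(100)), lambda x: x % 3 == 0, batch_size=4)
```

With whole-number rates the first call is deterministic: "a" appears twice, "b" is dropped, and "c" appears once. The second call keeps consuming the stream until four multiples of three have been accepted.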

Use tf.contrib.training.stratified_sample to resample without replacement so that the TensorFlow graph sees a desired mix of class proportions. For instance, if a binary classification dataset is 99.9% class 1, a common approach is to resample the data so that the classes are more balanced.
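One common way to reach a target class mix is to accept each example with a probability that depends on its class. The sketch below works through that arithmetic for the 99.9% example; it is an assumption about how such rebalancing can be done, not a description of what stratified_sample does internally, and the function and argument names are hypothetical.

```python
def acceptance_probs(data_probs, target_probs):
    """Per-class acceptance probabilities that turn data_probs into target_probs.

    Each class is weighted by target / data, then scaled so that the
    rarest class (relative to its target) is always accepted.
    """
    ratios = [t / d for t, d in zip(target_probs, data_probs)]
    largest = max(ratios)
    return [r / largest for r in ratios]


# Class 0 is 0.1% of the data, class 1 is 99.9%; we want a 50/50 mix.
data_probs = [0.001, 0.999]
probs = acceptance_probs(data_probs, target_probs=[0.5, 0.5])

# Fraction of the raw stream that survives, per class.
kept_mass = [d * p for d, p in zip(data_probs, probs)]
```

Every class 0 example is kept, roughly one in a thousand class 1 examples is kept, and the two classes end up contributing equal mass to the accepted stream.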

Bucketing

Use tf.contrib.training.bucket or tf.contrib.training.bucket_by_sequence_length to stratify minibatches into groups ("buckets"). Use bucket_by_sequence_length with the argument dynamic_pad=True to receive minibatches of similarly sized sequences for efficient training via dynamic_rnn.
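To show why bucketing by length pairs well with dynamic padding, here is a minimal pure-Python sketch. It is not the bucket_by_sequence_length implementation: the function name, the pad_value argument, and the choice to leave incomplete buckets unemitted are assumptions made for illustration.

```python
import bisect
from collections import defaultdict


def bucket_by_length_sketch(sequences, bucket_boundaries, batch_size, pad_value=0):
    """Group sequences of similar length and pad each batch to its own maximum."""
    buckets = defaultdict(list)
    batches = []
    for seq in sequences:
        # Lengths below the first boundary go to bucket 0, and so on upward.
        bucket_id = bisect.bisect_right(bucket_boundaries, len(seq))
        buckets[bucket_id].append(list(seq))
        if len(buckets[bucket_id]) == batch_size:
            group = buckets.pop(bucket_id)
            # Dynamic padding: pad only to the longest sequence in this batch.
            longest = max(len(s) for s in group)
            batches.append([s + [pad_value] * (longest - len(s)) for s in group])
    return batches


sequences = [[1], [1, 2, 3, 4, 5], [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
batches = bucket_by_length_sketch(sequences, bucket_boundaries=[3], batch_size=2)
```

The short sequences are padded to length two and the long ones to length five, rather than everything being padded to the global maximum of six. The final sequence stays in its bucket because no second sequence arrives to complete the batch.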