Defined in tensorflow/contrib/quantize/python/

Rewrites a training input_graph in place for simulated quantization.

The graph has fake quantization ops inserted to simulate the error introduced by quantization. Since the graph is transformed in place, the expected behavior of previously held references to nodes and tensors may change.
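As a rough illustration of the error a fake quantization op simulates, the sketch below snaps a float to a uniform 8-bit grid and maps it back to float. It is plain Python, not the TensorFlow implementation; the helper name `fake_quantize` and its fixed range arguments are assumptions made for this example.

```python
def fake_quantize(x, min_val, max_val, num_bits=8):
    """Quantize x to a uniform grid on [min_val, max_val], then dequantize.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    levels = 2 ** num_bits - 1
    scale = (max_val - min_val) / levels
    # Clamp to the representable range.
    clamped = min(max(x, min_val), max_val)
    # Snap to the nearest quantization step, then map back to float.
    step = round((clamped - min_val) / scale)
    return min_val + step * scale


if __name__ == "__main__":
    x = 0.1234
    y = fake_quantize(x, -1.0, 1.0)
    # The difference between x and y is the simulated quantization error.
    print(x, y, abs(x - y))
```

The output stays in floating point, so the rest of the computation is unchanged apart from the rounding error introduced at this point.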

The default value of quant_delay is suitable for fine-tuning an already trained floating-point model (recommended). To train a quantized model from scratch, set quant_delay to the number of steps it takes the floating-point model to converge. Quantization is activated at that step and effectively fine-tunes the model. If quant_delay is not provided when training from scratch, training can often fail.
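The effect of quant_delay can be sketched as a gate on the training step: values pass through unchanged until the delay has elapsed, and are quantized afterwards. This is a minimal plain-Python sketch, not the library's code; `maybe_quantize` and `round_to_grid` are hypothetical names, and treating the delay step itself as already quantized is an assumption.

```python
def round_to_grid(x, scale=0.25):
    """Toy quantizer (hypothetical): round x to the nearest multiple of scale."""
    return round(x / scale) * scale


def maybe_quantize(x, step, quant_delay, quantize_fn=round_to_grid):
    """Return x unchanged before quant_delay steps, quantized afterwards.

    Assumption: quantization is active once step >= quant_delay.
    """
    if step < quant_delay:
        return x  # float training phase
    return quantize_fn(x)  # simulated-quantization phase


if __name__ == "__main__":
    for step in range(4):
        print(step, maybe_quantize(0.3, step, quant_delay=2))
```

With a delay of 0, the gate is open from the first step, which matches the fine-tuning case where the floating-point model has already converged.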


Args:

  • input_graph: The tf.Graph to be transformed.
  • quant_delay: Number of steps after which weights and activations are quantized during training.


Raises:

  • ValueError: If elements contains an element that isn't a tf.Tensor or tf.Operation.