Initializers

Initializers are used to initialize variables with sensible values given their size, data type, and purpose.

tf.contrib.layers.xavier_initializer(uniform=True, seed=None, dtype=tf.float32)

Returns an initializer performing "Xavier" initialization for weights.

This function implements the weight initialization from:

Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International conference on artificial intelligence and statistics.

This initializer is designed to keep the scale of the gradients roughly the same in all layers. For a uniform distribution, this works out to the range [-x, x] with x = sqrt(6. / (in + out)); for a normal distribution, a standard deviation of sqrt(3. / (in + out)) is used.
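
For concreteness, these bounds can be checked by hand. A minimal sketch for a hypothetical dense layer with 300 inputs and 100 outputs:

  import math

  fan_in, fan_out = 300, 100                     # hypothetical dense layer
  x = math.sqrt(6. / (fan_in + fan_out))         # uniform range bound, ~0.1225
  stddev = math.sqrt(3. / (fan_in + fan_out))    # normal stddev, ~0.0866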

Args:
  • uniform: Whether to use uniformly or normally distributed random initialization.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer for a weight matrix.
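
A minimal usage sketch (TF 1.x contrib API; the variable name and shape are illustrative):

  import tensorflow as tf

  # Xavier-initialized weight matrix for a 784 -> 256 fully connected layer.
  weights = tf.get_variable(
      "fc_weights", shape=[784, 256],
      initializer=tf.contrib.layers.xavier_initializer(uniform=True, seed=42))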


tf.contrib.layers.xavier_initializer_conv2d(uniform=True, seed=None, dtype=tf.float32)

Returns an initializer performing "Xavier" initialization for weights.

This function implements the weight initialization from:

Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International conference on artificial intelligence and statistics.

This initializer is designed to keep the scale of the gradients roughly the same in all layers. For a uniform distribution, this works out to the range [-x, x] with x = sqrt(6. / (in + out)); for a normal distribution, a standard deviation of sqrt(3. / (in + out)) is used.

Args:
  • uniform: Whether to use uniformly or normally distributed random initialization.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer for a weight matrix.
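
Usage mirrors the dense case; a sketch for a convolution kernel (shape and names are illustrative):

  import tensorflow as tf

  # Xavier-initialized 3x3 kernel mapping 64 input channels to 128 outputs.
  kernel = tf.get_variable(
      "conv_kernel", shape=[3, 3, 64, 128],
      initializer=tf.contrib.layers.xavier_initializer_conv2d())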


tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_IN', uniform=False, seed=None, dtype=tf.float32)

Returns an initializer that generates tensors that preserve the variance of the input (i.e., without scaling it up or down).

When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so that it neither explodes nor diminishes by the time it reaches the final layer. This initializer uses the following formula:

  if mode='FAN_IN':    # Count only the number of input connections.
    n = fan_in
  elif mode='FAN_OUT': # Count only the number of output connections.
    n = fan_out
  elif mode='FAN_AVG': # Average the number of input and output connections.
    n = (fan_in + fan_out) / 2.0

  truncated_normal(shape, 0.0, stddev=sqrt(factor / n))
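
For multi-dimensional weights, the fans are derived from the shape. A sketch of that computation under the usual convention (receptive-field size times channel count; this mirrors, but is not, the library's internal code):

  import math

  shape = [3, 3, 64, 128]                 # hypothetical 4-D conv kernel
  receptive_field = 3 * 3                 # product of the spatial dimensions
  fan_in = receptive_field * shape[-2]    # 576 incoming connections
  fan_out = receptive_field * shape[-1]   # 1152 outgoing connections
  n = fan_in                              # mode='FAN_IN' (the default)
  stddev = math.sqrt(2.0 / n)             # factor=2.0 default; ~0.0589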

To get http://arxiv.org/pdf/1502.01852v1.pdf (He et al., 2015) use (the default):
  • factor=2.0 mode='FAN_IN' uniform=False
To get http://arxiv.org/abs/1408.5093 (Caffe) use:
  • factor=1.0 mode='FAN_IN' uniform=True
To get http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf (Glorot & Bengio, 2010) use:
  • factor=1.0 mode='FAN_AVG' uniform=True
To get xavier_initializer use either:
  • factor=1.0 mode='FAN_AVG' uniform=True
  • factor=1.0 mode='FAN_AVG' uniform=False
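
For example, the He et al. (2015) recipe is just the function's defaults (variable names are illustrative):

  import tensorflow as tf

  # He initialization: factor=2.0, mode='FAN_IN', uniform=False (the defaults).
  he_init = tf.contrib.layers.variance_scaling_initializer(
      factor=2.0, mode='FAN_IN', uniform=False)
  weights = tf.get_variable("conv_w", shape=[3, 3, 64, 128], initializer=he_init)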

Args:
  • factor: Float. A multiplicative factor.
  • mode: String. 'FAN_IN', 'FAN_OUT', 'FAN_AVG'.
  • uniform: Whether to use uniformly or normally distributed random initialization.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer that generates tensors with unit variance.

Raises:
  • ValueError: if dtype is not a floating point type.
  • TypeError: if mode is not in ['FAN_IN', 'FAN_OUT', 'FAN_AVG'].