Sharing Variables

TensorFlow provides several classes and operations that you can use to create new variables or reuse existing ones, depending on the state of the enclosing variable scope.

tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, custom_getter=None)

Gets an existing variable with these parameters or creates a new one.

This function prefixes the name with the current variable scope and performs reuse checks. See the Variable Scope How To for an extensive description of how reusing works. Here is a basic example:

with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])  # v.name == "foo/v:0"
    w = tf.get_variable("w", [1])  # w.name == "foo/w:0"
with tf.variable_scope("foo", reuse=True)
    v1 = tf.get_variable("v")  # The same as v above.

If initializer is None (the default), the default initializer passed in the variable scope will be used. If that one is None too, a uniform_unit_scaling_initializer will be used. The initializer can also be a Tensor, in which case the variable is initialized to this value and shape.

Similarly, if the regularizer is None (the default), the default regularizer passed in the variable scope will be used (if that is None too, then by default no regularization is performed).
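A minimal sketch of the initializer behavior described above (the constant values are illustrative, not from the original docs):

with tf.variable_scope("foo", initializer=tf.constant_initializer(0.3)):
    v = tf.get_variable("v", [1])  # picks up the scope-level initializer
    w = tf.get_variable("w", initializer=tf.constant([1.0, 2.0]))
    # A Tensor initializer also fixes the variable's shape,
    # so no shape argument is needed for w.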

If a partitioner is provided, a PartitionedVariable is returned. Accessing this object as a Tensor returns the shards concatenated along the partition axis.

Some useful partitioners are available. See, e.g., variable_axis_size_partitioner and min_max_variable_partitioner.
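A minimal sketch of get_variable with a partitioner (the shard size and shape are illustrative):

partitioner = tf.variable_axis_size_partitioner(max_shard_bytes=1 << 20)
with tf.variable_scope("embedding", partitioner=partitioner):
    # Returns a PartitionedVariable, sharded along axis 0 into
    # pieces of at most 1 MB each.
    emb = tf.get_variable("weights", shape=[100000, 64])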

Args:
  • name: The name of the new or existing variable.
  • shape: Shape of the new or existing variable.
  • dtype: Type of the new or existing variable (defaults to DT_FLOAT).
  • initializer: Initializer for the variable if one is created.
  • regularizer: A (Tensor -> Tensor or None) function; the result of applying it on a newly created variable will be added to the collection GraphKeys.REGULARIZATION_LOSSES and can be used for regularization.
  • trainable: If True also add the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
  • collections: List of graph collections keys to add the Variable to. Defaults to [GraphKeys.GLOBAL_VARIABLES] (see tf.Variable).
  • caching_device: Optional device string or function describing where the Variable should be cached for reading. Defaults to the Variable's device. If not None, caches on another device. Typical use is to cache on the device where the Ops using the Variable reside, to deduplicate copying through Switch and other conditional statements.
  • partitioner: Optional callable that accepts a fully defined TensorShape and dtype of the Variable to be created, and returns a list of partitions for each axis (currently only one axis can be partitioned).
  • validate_shape: If False, allows the variable to be initialized with a value of unknown shape. If True, the default, the shape of initial_value must be known.
  • custom_getter: Callable that takes as a first argument the true getter and allows overriding the internal get_variable method. The signature of custom_getter should match that of this method, but the most future-proof version will allow for changes: def custom_getter(getter, *args, **kwargs). Direct access to all get_variable parameters is also allowed: def custom_getter(getter, name, *args, **kwargs). A simple identity custom getter that creates variables with modified names is (a usage sketch follows the Raises list below):

    def custom_getter(getter, name, *args, **kwargs):
        return getter(name + '_suffix', *args, **kwargs)
Returns:

The created or existing Variable (or PartitionedVariable, if a partitioner was used).

Raises:
  • ValueError: when creating a new variable and shape is not declared, when violating reuse during variable creation, or when initializer dtype and dtype don't match. Reuse is set inside variable_scope.
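A usage sketch for the custom_getter argument above, applying the identity getter to a whole scope (the scope name is illustrative):

def custom_getter(getter, name, *args, **kwargs):
    return getter(name + '_suffix', *args, **kwargs)

with tf.variable_scope("model", custom_getter=custom_getter):
    v = tf.get_variable("v", [1])  # actual name: "model/v_suffix:0"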

class tf.VariableScope

Variable scope object to carry defaults to provide to get_variable.

Many of the arguments we need for get_variable in a variable store are most easily handled with a context. This object is used for the defaults.

Attributes:
  • name: name of the current scope, used as prefix in get_variable.
  • initializer: default initializer passed to get_variable.
  • regularizer: default regularizer passed to get_variable.
  • reuse: Boolean or None, setting the reuse in get_variable.
  • caching_device: string, callable, or None: the caching device passed to get_variable.
  • partitioner: callable or None: the partitioner passed to get_variable.
  • custom_getter: default custom getter passed to get_variable.
  • name_scope: The name passed to tf.name_scope.
  • dtype: default type passed to get_variable (defaults to DT_FLOAT).
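A minimal sketch of using the scope object to change a default (the initializer value is illustrative; see the setters listed below):

with tf.variable_scope("layer"):
    scope = tf.get_variable_scope()
    scope.set_initializer(tf.constant_initializer(0.5))
    v = tf.get_variable("v", [2])  # initialized to 0.5 via the scope default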


tf.VariableScope.__init__(reuse, name='', initializer=None, regularizer=None, caching_device=None, partitioner=None, custom_getter=None, name_scope='', dtype=tf.float32)

Creates a new VariableScope with the given properties.


tf.VariableScope.caching_device


tf.VariableScope.custom_getter


tf.VariableScope.dtype


tf.VariableScope.get_variable(var_store, name, shape=None, dtype=None, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, custom_getter=None)

Gets an existing variable with this name or creates a new one.


tf.VariableScope.initializer


tf.VariableScope.name


tf.VariableScope.original_name_scope


tf.VariableScope.partitioner


tf.VariableScope.regularizer


tf.VariableScope.reuse


tf.VariableScope.reuse_variables()

Reuse variables in this scope.


tf.VariableScope.set_caching_device(caching_device)

Set caching_device for this scope.


tf.VariableScope.set_custom_getter(custom_getter)

Set custom getter for this scope.


tf.VariableScope.set_dtype(dtype)

Set data type for this scope.


tf.VariableScope.set_initializer(initializer)

Set initializer for this scope.


tf.VariableScope.set_partitioner(partitioner)

Set partitioner for this scope.


tf.VariableScope.set_regularizer(regularizer)

Set regularizer for this scope.


tf.variable_scope(name_or_scope, default_name=None, values=None, initializer=None, regularizer=None, caching_device=None, partitioner=None, custom_getter=None, reuse=None, dtype=None)

Returns a context manager for defining ops that create variables (layers).

This context manager validates that the (optional) values are from the same graph, ensures that graph is the default graph, and pushes a name scope and a variable scope.

If name_or_scope is not None, it is used as is. If name_or_scope is None, then default_name is used. In that case, if the same name has been previously used in the same scope, it will be made unique by appending _N to it.
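A minimal sketch of default_name uniquification (assuming a fresh graph; the names are illustrative):

with tf.variable_scope(None, default_name="block") as s1:
    assert s1.name == "block"
with tf.variable_scope(None, default_name="block") as s2:
    assert s2.name == "block_1"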

Variable scope allows you to create new variables and to share already created ones, while providing checks that guard against accidental creation or sharing. For details, see the Variable Scope How To; here we present only a few basic examples.

Simple example of how to create a new variable:

with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"

Basic example of sharing a variable:

with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v

Sharing a variable by capturing a scope and setting reuse:

with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    scope.reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v

To prevent accidental sharing of variables, we raise an exception when getting an existing variable in a non-reusing scope.

with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    v1 = tf.get_variable("v", [1])
    #  Raises ValueError("... v already exists ...").

Similarly, we raise an exception when trying to get a variable that does not exist in reuse mode.

with tf.variable_scope("foo", reuse=True):
    v = tf.get_variable("v", [1])
    #  Raises ValueError("... v does not exists ...").

Note that the reuse flag is inherited: if we open a reusing scope, then all its sub-scopes become reusing as well.
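A minimal sketch of this inheritance (assuming a fresh graph):

with tf.variable_scope("root"):
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        with tf.variable_scope("bar"):
            # reuse is inherited from the enclosing "foo" scope
            assert tf.get_variable_scope().reuse == True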

Args:
  • name_or_scope: string or VariableScope: the scope to open.
  • default_name: The default name to use if the name_or_scope argument is None; this name will be uniquified. If name_or_scope is provided, default_name is ignored, so it is not required and can be None.
  • values: The list of Tensor arguments that are passed to the op function.
  • initializer: default initializer for variables within this scope.
  • regularizer: default regularizer for variables within this scope.
  • caching_device: default caching device for variables within this scope.
  • partitioner: default partitioner for variables within this scope.
  • custom_getter: default custom getter for variables within this scope.
  • reuse: True or None; if True, we go into reuse mode for this scope as well as all sub-scopes; if None, we just inherit the parent scope reuse.
  • dtype: type of variables created in this scope (defaults to the type in the passed scope, or inherited from parent scope).
Returns:

A scope that can be captured and reused.

Raises:
  • ValueError: when trying to reuse within a create scope, or create within a reuse scope, or if reuse is not None or True.
  • TypeError: when the types of some arguments are not appropriate.

tf.variable_op_scope(values, name_or_scope, default_name=None, initializer=None, regularizer=None, caching_device=None, partitioner=None, custom_getter=None, reuse=None, dtype=None)

Deprecated: context manager for defining an op that creates variables.


tf.get_variable_scope()

Returns the current variable scope.
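A minimal sketch (assuming a fresh graph, where the root scope's name is the empty string):

with tf.variable_scope("outer"):
    assert tf.get_variable_scope().name == "outer"
assert tf.get_variable_scope().name == ""  # back to the root scope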


tf.make_template(name_, func_, create_scope_now_=False, unique_name_=None, **kwargs)

Given an arbitrary function, wrap it so that it does variable sharing.

This wraps func_ in a Template and partially evaluates it. Templates are functions that create variables the first time they are called and reuse them thereafter. In order for func_ to be compatible with a Template it must have the following properties:

  • The function should create all trainable variables and any variables that should be reused by calling tf.get_variable. If a trainable variable is created using tf.Variable, then a ValueError will be raised. Variables that are intended to be locals can be created by specifying tf.Variable(..., trainable=False).
  • The function may use variable scopes and other templates internally to create and reuse variables, but it shouldn't use tf.all_variables to capture variables that are defined outside of the scope of the function.
  • Internal scopes and variable names should not depend on any arguments that are not supplied to make_template. In general you will get a ValueError telling you that you are trying to reuse a variable that doesn't exist if you make a mistake.

In the following example, both z and w will be scaled by the same y. Note that if we did not pass scalar_name and instead used a different name for z and w, a ValueError would be raised because the variable could not be reused.

def my_op(x, scalar_name):
  var1 = tf.get_variable(scalar_name,
                         shape=[],
                         initializer=tf.constant_initializer(1))
  return x * var1

scale_by_y = tf.make_template('scale_by_y', my_op, scalar_name='y')

z = scale_by_y(input1)
w = scale_by_y(input2)

As a safe-guard, the returned function will raise a ValueError after the first call if trainable variables are created by calling tf.Variable.

If all of these are true, then two properties are enforced by the template:

  1. Calling the same template multiple times will share all non-local variables.
  2. Two different templates are guaranteed to be unique, unless you reenter the same variable scope as the initial definition of a template and redefine it. An example of this exception:
def my_op(x, scalar_name):
  var1 = tf.get_variable(scalar_name,
                         shape=[],
                         initializer=tf.constant_initializer(1))
  return x * var1

with tf.variable_scope('scope') as vs:
  scale_by_y = tf.make_template('scale_by_y', my_op, scalar_name='y')
  z = scale_by_y(input1)
  w = scale_by_y(input2)

# Creates a template that reuses the variables above.
with tf.variable_scope(vs, reuse=True):
  scale_by_y2 = tf.make_template('scale_by_y', my_op, scalar_name='y')
  z2 = scale_by_y2(input1)
  w2 = scale_by_y2(input2)

Depending on the value of create_scope_now_, the full variable scope may be captured either at the time of first call or at the time of construction. If this option is set to True, then all Tensors created by repeated calls to the template will have an extra trailing _N+1 in their name, because the first time the scope is entered, in the Template constructor, no Tensors are created.
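A minimal sketch of the difference (reusing my_op and input1 from the examples above; the scope names are illustrative):

with tf.variable_scope("outer"):
    # Scope captured here, at construction time.
    scale = tf.make_template("scale_by_y", my_op,
                             create_scope_now_=True, scalar_name="y")
# Even though the call happens outside "outer", the variable
# lives under "outer/scale_by_y/".
z = scale(input1)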

Args:
  • name_: A name for the scope created by this template. If necessary, the name will be made unique by appending _N to the name.
  • func_: The function to wrap.
  • create_scope_now_: Boolean controlling whether the scope should be created when the template is constructed or when the template is called. Default is False, meaning the scope is created when the template is called.
  • unique_name_: When used, it overrides name_ and is not made unique. If a template of the same scope/unique_name already exists and reuse is False, an error is raised. Defaults to None.
  • **kwargs: Keyword arguments to apply to func_.
Returns:

A function to encapsulate a set of variables which should be created once and reused. An enclosing scope will be created either where make_template is called or wherever the result is called, depending on the value of create_scope_now_. Regardless of the value, the first time the template is called it will enter the scope with no reuse, and call func_ to create variables, which are guaranteed to be unique. All subsequent calls will re-enter the scope and reuse those variables.

Raises:
  • ValueError: if the name is None.

tf.no_regularizer(_)

Use this function to prevent regularization of variables.
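A minimal sketch (my_l2 stands for a hypothetical scope-level regularizer):

with tf.variable_scope("layer", regularizer=my_l2):
    w = tf.get_variable("w", [10, 10])  # regularized by the scope default
    b = tf.get_variable("b", [10], regularizer=tf.no_regularizer)  # exempt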


tf.constant_initializer(value=0, dtype=tf.float32)

Returns an initializer that generates tensors with constant values.

The resulting tensor is populated with values of type dtype, as specified by the value argument, following the desired shape of the new tensor (see examples below).

The argument value can be a constant value, or a list of values of type dtype. If value is a list, then the length of the list must be less than or equal to the number of elements implied by the desired shape of the tensor. In the case where the total number of elements in value is less than the number of elements required by the tensor shape, the last element in value will be used to fill the remaining entries. If the total number of elements in value is greater than the number of elements required by the tensor shape, the initializer will raise a ValueError.

Args:
  • value: A Python scalar, list of values, or an N-dimensional numpy array. All elements of the initialized variable will be set to the corresponding value in the value argument.
  • dtype: The data type.
Returns:

An initializer that generates tensors with constant values.

Examples:

The following example can be rewritten using a numpy.ndarray instead of the value list, even reshaped, as shown in the two commented lines below the value list initialization.

  >>> import numpy as np
  >>> import tensorflow as tf

  >>> value = [0, 1, 2, 3, 4, 5, 6, 7]
  >>> # value = np.array(value)
  >>> # value = value.reshape([2, 4])
  >>> init = tf.constant_initializer(value)

  >>> print('fitting shape:')
  >>> tf.reset_default_graph()
  >>> with tf.Session():
  >>>   x = tf.get_variable('x', shape=[2, 4], initializer=init)
  >>>   x.initializer.run()
  >>>   print(x.eval())

  fitting shape:
  [[ 0.  1.  2.  3.]
   [ 4.  5.  6.  7.]]

  >>> print('larger shape:')
  >>> tf.reset_default_graph()
  >>> with tf.Session():
  >>>   x = tf.get_variable('x', shape=[3, 4], initializer=init)
  >>>   x.initializer.run()
  >>>   print(x.eval())

  larger shape:
  [[ 0.  1.  2.  3.]
   [ 4.  5.  6.  7.]
   [ 7.  7.  7.  7.]]

  >>> print('smaller shape:')
  >>> tf.reset_default_graph()
  >>> with tf.Session():
  >>>   x = tf.get_variable('x', shape=[2, 3], initializer=init)

*  <b>`ValueError`</b>: Too many elements provided. Needed at most 6, but received 8

tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

Returns an initializer that generates tensors with a normal distribution.

Args:
  • mean: a python scalar or a scalar tensor. Mean of the random values to generate.
  • stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer that generates tensors with a normal distribution.

Raises:
  • ValueError: if dtype is not a floating point type.

tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

Returns an initializer that generates a truncated normal distribution.

These values are similar to values from a random_normal_initializer except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters.
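A minimal sketch of the recommended usage (the layer size and stddev are illustrative):

weights = tf.get_variable(
    "weights", shape=[784, 256],
    initializer=tf.truncated_normal_initializer(stddev=0.1))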

Args:
  • mean: a python scalar or a scalar tensor. Mean of the random values to generate.
  • stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer that generates tensors with a truncated normal distribution.

Raises:
  • ValueError: if dtype is not a floating point type.

tf.random_uniform_initializer(minval=0, maxval=None, seed=None, dtype=tf.float32)

Returns an initializer that generates tensors with a uniform distribution.

Args:
  • minval: A python scalar or a scalar tensor. Lower bound of the range of random values to generate.
  • maxval: A python scalar or a scalar tensor. Upper bound of the range of random values to generate. Defaults to 1 for float types.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type.
Returns:

An initializer that generates tensors with a uniform distribution.


tf.uniform_unit_scaling_initializer(factor=1.0, seed=None, dtype=tf.float32)

Returns an initializer that generates tensors without scaling variance.

When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so that it does not explode or diminish by the time it reaches the final layer. If the input is x and the operation is x * W, and we want to initialize W uniformly at random, we need to pick W from

[-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)]

to keep the scale intact, where dim = W.shape[0] (the size of the input). A similar calculation for convolutional networks gives an analogous result with dim equal to the product of the first 3 dimensions. When nonlinearities are present, we need to multiply this by a constant factor. See Sussillo et al., 2014 (pdf) for deeper motivation, experiments and the calculation of constants. In section 2.3 there, the constants were numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15.
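A small worked example of the bound above (the layer width and factor are illustrative):

import math

dim = 256       # size of the input, W.shape[0]
factor = 1.15   # tanh layer, per the constants above
bound = factor * math.sqrt(3) / math.sqrt(dim)
# W is drawn uniformly from [-bound, bound]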

Args:
  • factor: Float. A multiplicative factor by which the values will be scaled.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
  • dtype: The data type. Only floating point types are supported.
Returns:

An initializer that generates tensors with unit variance.

Raises:
  • ValueError: if dtype is not a floating point type.

tf.zeros_initializer(shape, dtype=tf.float32, partition_info=None)

An adaptor for zeros() to match the Initializer spec.


tf.ones_initializer(dtype=tf.float32, partition_info=None)

An adaptor for ones() to match the Initializer spec.


tf.orthogonal_initializer(gain=1.0, dtype=tf.float32, seed=None)

Returns an initializer that generates an orthogonal matrix or a reshaped orthogonal matrix.

If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the singular value decomposition of a matrix of uniform random numbers.

If the shape of the tensor to initialize is more than two-dimensional, a matrix of shape (shape[0] * ... * shape[n - 2], shape[n - 1]) is initialized, where n is the length of the shape vector. The matrix is subsequently reshaped to give a tensor of the desired shape.
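A minimal sketch (the kernel shape is illustrative): for this 4-D variable a (3 * 3 * 64, 128) orthogonal matrix is generated and then reshaped.

init = tf.orthogonal_initializer(gain=1.0)
kernel = tf.get_variable("kernel", shape=[3, 3, 64, 128], initializer=init)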

Args:
  • gain: multiplicative factor to apply to the orthogonal matrix
  • dtype: The type of the output.
  • seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
Returns:

An initializer that generates orthogonal tensors.

Raises:
  • ValueError: if dtype is not a floating point type or if shape has fewer than two entries.