tf.contrib.kfac.layer_collection.LayerCollection

Class LayerCollection

Defined in tensorflow/contrib/kfac/python/ops/layer_collection.py.

Registry of information about layers and losses.

Note that you need to create a new LayerCollection for each MatrixEstimator or KfacOptimizer.

Attributes:

  • fisher_blocks: a LayersParamsDict (subclass of OrderedDict) mapping layer parameters (Tensors or tuples of Tensors) to FisherBlock instances.
  • fisher_factors: an OrderedDict mapping tuples to FisherFactor instances.
  • losses: a list of LossFunction objects. The loss to be optimized is their sum.

Properties

default_conv2d_approximation

default_embedding_approximation

default_fully_connected_approximation

default_fully_connected_multi_approximation

default_generic_approximation

graph

linked_parameters

Groups of parameters with an optionally specified approximation.

Linked parameters can be added using define_linked_parameters. If an approximation is specified, then this approximation will be used when registering a layer with exactly these parameters, unless an approximation is specified when calling the registration function.

Returns:

A dict mapping tuples of parameters to an optional string.

losses

LossFunctions registered with this LayerCollection.

registered_variables

A tuple of all of the variables currently registered.

subgraph

Methods

__init__

__init__(
    graph=None,
    name='LayerCollection'
)

check_registration

check_registration(variables)

Checks that all variable uses have been registered properly.

Args:

  • variables: List of variables.

Raises:

  • ValueError: If any registered variables are not included in the list.
  • ValueError: If any variable in the list is not registered.
  • ValueError: If any variable in the list is registered with a recorded number of "uses" that differs from the number of times that variable is actually used in the subgraph.
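
The three error conditions can be pictured with a small stand-alone sketch. It does not use TensorFlow and is not the library's implementation: check_registration_sketch, passes, registered_uses, and actual_uses are hypothetical names, and plain strings stand in for variables.

```python
def check_registration_sketch(variables, registered_uses, actual_uses):
    """Toy model of the three checks (illustrative, not the library code).

    variables: list of variable names that should be registered.
    registered_uses: dict mapping name -> recorded number of registrations.
    actual_uses: dict mapping name -> number of uses found in the subgraph.
    """
    expected = set(variables)
    registered = set(registered_uses)

    # 1. Something was registered that the caller did not list.
    extra = registered - expected
    if extra:
        raise ValueError("Registered but not listed: %s" % sorted(extra))

    # 2. Something in the list was never registered.
    missing = expected - registered
    if missing:
        raise ValueError("Listed but not registered: %s" % sorted(missing))

    # 3. The recorded use count disagrees with the actual use count.
    for name in variables:
        recorded = registered_uses[name]
        actual = actual_uses.get(name, 0)
        if recorded != actual:
            raise ValueError(
                "%s registered with %d uses but used %d times"
                % (name, recorded, actual))


def passes(variables, registered_uses, actual_uses):
    """Return True if the sketch raises no ValueError."""
    try:
        check_registration_sketch(variables, registered_uses, actual_uses)
    except ValueError:
        return False
    return True
```

For example, a weight that is shared across two layers but registered only once would fail the third check in this model.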

create_subgraph

create_subgraph()

define_linked_parameters

define_linked_parameters(
    params,
    approximation=None
)

Identify a set of parameters that should be grouped together.

During automatic graph scanning, any matches containing variables that have been identified as part of a linked group will be filtered out unless the match parameters are exactly equal to the ones specified in the linked group.

Args:

  • params: A variable, or a tuple or list of variables. The variables to be linked.
  • approximation: Optional string specifying the type of approximation to use for these variables. If unspecified, this layer collection's default approximation for the layer type will be used.

Raises:

  • ValueError: If the parameters were already registered in a layer or identified as part of an incompatible group.
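
The filtering rule described above can be sketched without TensorFlow. This is only an illustration under stated assumptions: filter_matches is a hypothetical helper, variables are represented by strings, and "exactly equal" is modeled as tuple equality (so ordering matters here, which may differ from the library).

```python
def filter_matches(matches, linked_groups):
    """Drop matches that touch a linked variable without equalling its group.

    matches: list of tuples of variable names found by a (hypothetical) scanner.
    linked_groups: set of tuples of variable names declared as linked.
    """
    # Collect every variable that belongs to some linked group.
    linked_vars = set()
    for group in linked_groups:
        linked_vars.update(group)

    kept = []
    for match in matches:
        touches_linked = any(var in linked_vars for var in match)
        # Keep matches that avoid linked variables entirely, or that are
        # exactly one of the declared groups.
        if not touches_linked or match in linked_groups:
            kept.append(match)
    return kept


linked = {("w", "b")}
candidates = [("w",), ("w", "b"), ("v",)]
kept = filter_matches(candidates, linked)
```

Here the partial match ("w",) is discarded because "w" is linked with "b", while the exact group and the unrelated ("v",) survive.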

get_blocks

get_blocks()

get_factors

get_factors()

get_use_count_map

get_use_count_map()

Returns a dict mapping variables to their number of registrations.

make_or_get_factor

make_or_get_factor(
    cls,
    args
)

Insert 'cls(args)' into 'self.fisher_factors' if not already present.

Wraps the constructor in 'tf.variable_scope()' to ensure that variables constructed in 'cls.__init__' are placed under this LayerCollection's scope.

Args:

  • cls: Class that implements FisherFactor.
  • args: Tuple of arguments to pass into 'cls's constructor. Must be hashable.

Returns:

Instance of 'cls' found in self.fisher_factors.
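
The get-or-create behaviour can be sketched with a plain OrderedDict. This is a minimal illustration, not the library code: ToyFactor and make_or_get are hypothetical, the variable-scope wrapping is omitted, and the sketch assumes the args tuple is unpacked as positional constructor arguments.

```python
from collections import OrderedDict


class ToyFactor(object):
    """Hypothetical stand-in for a FisherFactor subclass."""

    def __init__(self, dim, name):
        self.dim = dim
        self.name = name


def make_or_get(cache, cls, args):
    """Return the cached instance for (cls, args), constructing it if needed."""
    try:
        hash(args)
    except TypeError:
        raise TypeError("args must be hashable, got %r" % (args,))
    key = (cls, args)
    if key not in cache:
        # Assumption: the tuple is unpacked into the constructor.
        cache[key] = cls(*args)
    return cache[key]


cache = OrderedDict()
first = make_or_get(cache, ToyFactor, (3, "x"))
second = make_or_get(cache, ToyFactor, (3, "x"))   # same key, same object
other = make_or_get(cache, ToyFactor, (4, "x"))    # different args, new object
```

Because the key includes args, the hashability requirement follows directly: an unhashable argument tuple could not be used for the lookup.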

register_block

register_block(
    layer_key,
    fisher_block,
    reuse=VARIABLE_SCOPE
)

Validates and registers the layer_key associated with the fisher_block.

Args:

  • layer_key: A variable or tuple of variables. The key to check for in existing registrations and to register if valid.
  • fisher_block: The associated FisherBlock.
  • reuse: Method to use for inserting new FisherBlocks. One of True, False, or 'VARIABLE_SCOPE'.

Raises:

  • ValueError: If layer_key was already registered and reuse is False, if layer_key was registered with a different block type, or if layer_key shares any variables with but is not equal to a previously registered key.
  • KeyError: If reuse is True but layer_key was not previously registered.

Returns:

The FisherBlock registered under layer_key. If layer_key was already registered, this will be the previously registered FisherBlock.
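
The interaction between reuse and existing registrations can be modeled in a few lines. The sketch below is an assumption-laden illustration, not the library's code: register_block_sketch, BlockA, BlockB, and outcome are hypothetical, a plain dict replaces fisher_blocks, strings replace variables, and the scope_reuse argument stands in for reading tf.get_variable_scope().reuse.

```python
VARIABLE_SCOPE = "VARIABLE_SCOPE"


def _as_tuple(key):
    return key if isinstance(key, tuple) else (key,)


def register_block_sketch(blocks, layer_key, fisher_block, reuse,
                          scope_reuse=False):
    """Toy model of the documented reuse rules."""
    if reuse == VARIABLE_SCOPE:
        # Stand-in for consulting the current variable scope.
        reuse = scope_reuse

    if reuse:
        if layer_key not in blocks:
            raise KeyError("No block registered for %r" % (layer_key,))
        existing = blocks[layer_key]
        if type(existing) is not type(fisher_block):
            raise ValueError("Block type mismatch for %r" % (layer_key,))
        return existing

    if layer_key in blocks:
        raise ValueError("%r is already registered" % (layer_key,))
    new_vars = set(_as_tuple(layer_key))
    for other_key in blocks:
        # A key may not partially overlap a previously registered key.
        if new_vars & set(_as_tuple(other_key)):
            raise ValueError("%r overlaps %r" % (layer_key, other_key))
    blocks[layer_key] = fisher_block
    return fisher_block


class BlockA(object):
    pass


class BlockB(object):
    pass


def outcome(blocks, layer_key, fisher_block, reuse):
    """Return 'ok' or the name of the exception raised."""
    try:
        register_block_sketch(blocks, layer_key, fisher_block, reuse)
    except (KeyError, ValueError) as error:
        return type(error).__name__
    return "ok"


blocks = {}
first = register_block_sketch(blocks, ("w", "b"), BlockA(), reuse=False)
again = register_block_sketch(blocks, ("w", "b"), BlockA(), reuse=True)
```

In this model, again is the block stored by the first call; the freshly constructed BlockA() passed to the second call is discarded.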

register_categorical_predictive_distribution

register_categorical_predictive_distribution(
    logits,
    seed=None,
    targets=None,
    name=None,
    reuse=VARIABLE_SCOPE
)

Registers a categorical predictive distribution.

Args:

  • logits: The logits of the distribution (i.e. its parameters).
  • seed: The seed for the RNG (for debugging) (Default: None)
  • targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  • name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
  • reuse: (OPTIONAL) bool or str. If True, reuse an existing FisherBlock. If False, create a new FisherBlock. If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse.

Raises:

  • ValueError: If reuse == True and name == None.
  • ValueError: If reuse == True and seed != None.
  • KeyError: If reuse == True and no existing LossFunction with 'name' found.
  • KeyError: If reuse == False and existing LossFunction with 'name' found.
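
The four error cases above amount to a small decision table over name, seed, and reuse. The following sketch encodes that table in plain Python; register_loss_sketch, result_of, and the "loss_N" naming scheme are hypothetical and only illustrate the documented rules.

```python
def register_loss_sketch(losses, name=None, seed=None, reuse=False):
    """Toy model of the name/seed/reuse rules; 'losses' is a plain dict."""
    if reuse:
        if name is None:
            raise ValueError("A name is required when reuse is True.")
        if seed is not None:
            raise ValueError("A seed cannot be given when reuse is True.")
        if name not in losses:
            raise KeyError("No loss function named %r" % name)
        return name

    if name is None:
        # Hypothetical scheme for generating a fresh, unused name.
        index = len(losses)
        name = "loss_%d" % index
        while name in losses:
            index += 1
            name = "loss_%d" % index
    elif name in losses:
        raise KeyError("Loss function %r already exists" % name)
    losses[name] = {"seed": seed}
    return name


def result_of(losses, **kwargs):
    """Return the registered name, or the name of the exception raised."""
    try:
        return register_loss_sketch(losses, **kwargs)
    except (KeyError, ValueError) as error:
        return type(error).__name__


losses = {}
created = register_loss_sketch(losses, name="softmax", seed=1)
```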

register_conv2d

register_conv2d(
    params,
    strides,
    padding,
    inputs,
    outputs,
    approx=None,
    reuse=VARIABLE_SCOPE
)

Registers a convolutional layer.

Args:

  • params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [kernel_height, kernel_width, in_channels, out_channels]. Bias should have shape [out_channels].
  • strides: 1-D Tensor of length 4. Strides for convolution kernel.
  • padding: string. See tf.nn.conv2d for valid values.
  • inputs: Tensor of shape [batch_size, height, width, in_channels]. Inputs to layer.
  • outputs: Tensor of shape [batch_size, height, width, out_channels]. Output produced by layer.
  • approx: str. One of "kron" or "diagonal".
  • reuse: bool or str. If True, reuse an existing FisherBlock. If False, create a new FisherBlock. If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse.

Raises:

  • ValueError: For improper value to 'approx'.
  • KeyError: If reuse == True but no FisherBlock found for 'params'.
  • ValueError: If reuse == True and FisherBlock found but of the wrong type.
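
The shapes of params, inputs, and outputs must agree with one another. As a rough aid, the sketch below computes the output shape implied by the other arguments, following the "SAME" and "VALID" padding rules of tf.nn.conv2d. It assumes NHWC layout, no dilation, and Python lists or tuples in place of Tensors; conv2d_output_shape is a hypothetical helper, not part of the library.

```python
import math


def conv2d_output_shape(input_shape, kernel_shape, strides, padding):
    """Return [batch, out_height, out_width, out_channels] as a tuple."""
    batch, in_h, in_w, in_c = input_shape
    k_h, k_w, k_in, k_out = kernel_shape
    if in_c != k_in:
        raise ValueError(
            "inputs have %d channels but kernel expects %d" % (in_c, k_in))
    _, s_h, s_w, _ = strides  # strides are [1, stride_h, stride_w, 1]

    if padding == "SAME":
        out_h = int(math.ceil(in_h / float(s_h)))
        out_w = int(math.ceil(in_w / float(s_w)))
    elif padding == "VALID":
        out_h = int(math.ceil((in_h - k_h + 1) / float(s_h)))
        out_w = int(math.ceil((in_w - k_w + 1) / float(s_w)))
    else:
        raise ValueError("Unknown padding: %r" % padding)
    return (batch, out_h, out_w, k_out)


same = conv2d_output_shape((8, 32, 32, 3), (5, 5, 3, 16), (1, 2, 2, 1), "SAME")
valid = conv2d_output_shape((8, 32, 32, 3), (5, 5, 3, 16), (1, 1, 1, 1), "VALID")
```

Note that the height and width of outputs generally differ from those of inputs unless the stride is 1 and the padding is "SAME".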

register_embedding

register_embedding(
    params,
    inputs,
    outputs,
    approx=None,
    reuse=VARIABLE_SCOPE
)

Registers an embedding layer.

Args:

  • params: Embedding matrix of shape [vocab_size, embedding_size].
  • inputs: Tensor of shape [batch_size, input_size] and dtype int32. Indices into embedding matrix.
  • outputs: Tensor of shape [batch_size, output_size]. Outputs produced by layer.
  • approx: str. Must be "kron".
  • reuse: bool or str. If True, reuse an existing FisherBlock. If False, create a new FisherBlock. If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse.

Raises:

  • ValueError: For improper value to 'approx'.
  • KeyError: If reuse == True but no FisherBlock found for 'params'.
  • ValueError: If reuse == True and FisherBlock found but of the wrong type.

register_fully_connected

register_fully_connected(
    params,
    inputs,
    outputs,
    approx=None,
    reuse=VARIABLE_SCOPE
)

Registers a fully connected layer.

Args:

  • params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [input_size, output_size]. Bias should have shape [output_size].
  • inputs: Tensor of shape [batch_size, input_size]. Inputs to layer.
  • outputs: Tensor of shape [batch_size, output_size]. Outputs produced by layer.
  • approx: str. One of "kron" or "diagonal".
  • reuse: bool or str. If True, reuse an existing FisherBlock. If False, create a new FisherBlock. If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse.

Raises:

  • ValueError: For improper value to 'approx'.
  • KeyError: If reuse == True but no FisherBlock found for 'params'.
  • ValueError: If reuse == True and FisherBlock found but of the wrong type.

register_fully_connected_multi

register_fully_connected_multi(
    params,
    inputs,
    outputs,
    approx=None
)

Register fully connected layers with shared parameters.

This can handle general fully connected layers with shared parameters, but has specialized approximations to deal with the case where there is a meaningful linear order to the shared instances (such as in an RNN).

Args:

  • params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [input_size, output_size]. Bias should have shape [output_size].
  • inputs: A list of tensors, each of shape [batch_size, input_size]. Inputs to layer. In the case of RNNs, one Tensor per time step.
  • outputs: A list of tensors, the same length as 'inputs', each of shape [batch_size, output_size]. Outputs produced by layer. In the case of RNNs, one Tensor per time step.
  • approx: str. One of "kron_indep", "kron_series_1", or "kron_series_2".

Raises:

  • ValueError: For improper value to 'approx'.

register_generic

register_generic(
    params,
    batch_size,
    approx=None,
    reuse=VARIABLE_SCOPE
)

Registers a generic layer.

Args:

  • params: Tensor or tuple of Tensors corresponding to the parameters.
  • batch_size: 0-D Tensor. Size of the minibatch.
  • approx: str. One of "full" or "diagonal".
  • reuse: bool or str. If True, reuse an existing FisherBlock. If False, create a new FisherBlock. If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse.

Raises:

  • ValueError: For improper value to 'approx'.
  • KeyError: If reuse == True but no FisherBlock found for 'params'.
  • ValueError: If reuse == True and FisherBlock found but of the wrong type.

register_multi_bernoulli_predictive_distribution

register_multi_bernoulli_predictive_distribution(
    logits,
    seed=None,
    targets=None,
    name=None
)

Registers a multi-Bernoulli predictive distribution.

Args:

  • logits: The logits of the distribution (i.e. its parameters).
  • seed: The seed for the RNG (for debugging) (Default: None)
  • targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  • name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)

register_normal_predictive_distribution

register_normal_predictive_distribution(
    mean,
    var=0.5,
    seed=None,
    targets=None,
    name=None
)

Registers a normal predictive distribution.

Args:

  • mean: The mean vector defining the distribution.
  • var: The variance (must be a scalar). Note that the default value of 0.5 corresponds to a standard squared error loss (target - prediction)^2. If your squared error loss is of the form 0.5*(target - prediction)^2, you should use var=1.0. (Default: 0.5)
  • seed: The seed for the RNG (for debugging) (Default: None)
  • targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  • name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
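
The relationship between var and the squared error scaling follows from the normal negative log-density, (target - prediction)^2 / (2*var) plus a term that does not depend on the prediction. The sketch below checks this numerically for scalars; normal_nll and squared_error are illustrative helpers, not library functions.

```python
import math


def normal_nll(target, prediction, var):
    """Negative log-density of target under N(prediction, var), scalar case."""
    return ((target - prediction) ** 2 / (2.0 * var)
            + 0.5 * math.log(2.0 * math.pi * var))


def squared_error(target, prediction):
    return (target - prediction) ** 2


pairs = [(0.0, 0.0), (1.0, 3.0), (-2.5, 4.0)]

# With var=0.5 the NLL equals the plain squared error plus a constant.
offsets_half = [normal_nll(t, p, 0.5) - squared_error(t, p) for t, p in pairs]

# With var=1.0 the NLL equals half the squared error plus a constant.
offsets_one = [normal_nll(t, p, 1.0) - 0.5 * squared_error(t, p)
               for t, p in pairs]
```

Each list holds the same value in every position (up to floating-point rounding), so the two losses differ from the corresponding negative log-likelihoods only by a constant and share the same gradients with respect to the prediction.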

set_default_conv2d_approximation

set_default_conv2d_approximation(value)

set_default_embedding_approximation

set_default_embedding_approximation(value)

set_default_fully_connected_approximation

set_default_fully_connected_approximation(value)

set_default_fully_connected_multi_approximation

set_default_fully_connected_multi_approximation(value)

set_default_generic_approximation

set_default_generic_approximation(value)

total_loss

total_loss()

total_sampled_loss

total_sampled_loss()