tf.contrib.kfac.fisher_factors.FisherFactor

Class FisherFactor

Base class for objects modeling factors of approximate Fisher blocks.

A FisherFactor represents part of an approximate Fisher Information matrix. For example, one approximation to the Fisher uses the Kronecker product of two FisherFactors A and B, F = kron(A, B). FisherFactors are composed with FisherBlocks to construct a block-diagonal approximation to the full Fisher.
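As a rough illustration of the Kronecker-factored case described above, the following NumPy sketch builds F = kron(A, B) from two small matrices. The matrices are made-up stand-ins for the factors, not values produced by this class. The sketch also checks the property that makes this approximation attractive: the inverse of the block can be assembled from the inverses of the much smaller factors.

```python
import numpy as np

# Two small symmetric positive-definite matrices standing in for factors A and B
# (hypothetical values chosen for illustration only).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
B = np.array([[1.5, 0.2, 0.0],
              [0.2, 1.0, 0.1],
              [0.0, 0.1, 2.0]])

# The approximate Fisher block is the Kronecker product of the two factors.
F = np.kron(A, B)  # shape (6, 6)

# Inverting the factors separately yields the inverse of the whole block,
# because inv(kron(A, B)) == kron(inv(A), inv(B)).
F_inv = np.kron(np.linalg.inv(A), np.linalg.inv(B))
```

Here a 6 x 6 inverse is obtained from a 2 x 2 and a 3 x 3 inverse; the saving grows quickly with layer size.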

FisherFactors are backed by a single, non-trainable variable that is updated by running the Op returned by FisherFactor.make_covariance_update_op(). The shape and type of this variable are implementation-specific.

Note that for blocks that aren't based on approximations, a 'factor' can be the entire block itself, as is the case for the diagonal and full representations.

Methods

__init__

__init__()


get_cov

get_cov()


Get full covariance matrix.

Returns:

Tensor of shape [n, n]. Represents all parameter-parameter correlations captured by this FisherFactor.

get_cov_var

get_cov_var()


Get variable backing this FisherFactor.

This may or may not be the same as the value returned by self.get_cov().

Returns:

Variable of shape self._cov_shape.

instantiate_cov_variables

instantiate_cov_variables()


Makes the internal cov variable(s).

instantiate_inv_variables

instantiate_inv_variables()


Makes the internal "inverse" variable(s).

left_multiply_matpower

left_multiply_matpower(
x,
exp,
damping_func
)


Left multiplies 'x' by a matrix power of this factor (with damping applied).

This calculation is essentially: (C + damping * I)**exp * x where * is matrix-multiplication, ** is matrix power, I is the identity matrix, and C is the matrix represented by this factor.

x can represent either a matrix or a vector. For some factors, 'x' might represent a vector but actually be stored as a 2D matrix for convenience.

Args:

• x: Tensor. Represents a vector or a matrix. Shape depends on implementation.
• exp: float. The matrix exponent to use.
• damping_func: A function that takes no arguments and returns a 0-D Tensor or a float to use as the damping value, i.e. damping = damping_func().

Returns:

Tensor of same shape as 'x' representing the result of the multiplication.
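The calculation above can be sketched in NumPy as follows. This is a minimal sketch, not the library's implementation: the standalone function, its extra cov argument, and the use of an eigendecomposition are assumptions made for illustration, and it assumes C is symmetric positive semi-definite with a positive damping value so that fractional and negative exponents are well defined.

```python
import numpy as np

def left_multiply_matpower(cov, x, exp, damping_func):
    """Return (cov + damping * I)**exp @ x for a symmetric PSD matrix cov.

    Illustrative stand-in only; the real method takes no cov argument.
    """
    damping = damping_func()
    damped = cov + damping * np.eye(cov.shape[0])
    # Symmetric eigendecomposition: damped = V diag(w) V^T, hence
    # damped**exp = V diag(w**exp) V^T (requires all eigenvalues w > 0).
    w, V = np.linalg.eigh(damped)
    matpower = (V * w**exp) @ V.T
    return matpower @ x

cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
x = np.array([[1.0], [-1.0]])  # a vector stored as a 2-D column
y = left_multiply_matpower(cov, x, exp=-1.0, damping_func=lambda: 0.1)
```

With exp=-1.0 the result is the damped inverse applied to x. Passing damping as a callable rather than a number lets the value be read at the time of the call.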

make_covariance_update_op

make_covariance_update_op(ema_decay)


Constructs and returns the covariance update Op.

Args:

• ema_decay: The exponential moving average decay (float or Tensor).

Returns:

An Op for updating the covariance Variable referenced by _cov.
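To make the role of ema_decay concrete, here is a NumPy sketch of a plain exponential-moving-average update of a covariance estimate. The function name, the random minibatches, and the exact form of the update are assumptions for illustration; the Op built by a concrete FisherFactor may differ in details such as bias correction.

```python
import numpy as np

def ema_covariance_update(cov, new_cov, ema_decay):
    """One exponential-moving-average step toward the new covariance estimate."""
    return ema_decay * cov + (1.0 - ema_decay) * new_cov

rng = np.random.default_rng(0)
cov = np.zeros((3, 3))
for _ in range(200):
    batch = rng.normal(size=(64, 3))            # hypothetical minibatch statistics
    new_cov = batch.T @ batch / batch.shape[0]  # second-moment estimate for this batch
    cov = ema_covariance_update(cov, new_cov, ema_decay=0.95)
```

A decay close to 1 gives a smoother but slower-moving estimate; a smaller decay tracks recent minibatches more closely.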

make_inverse_update_ops

make_inverse_update_ops()


Create and return update ops corresponding to registered computations.

right_multiply_matpower

right_multiply_matpower(
x,
exp,
damping_func
)


Right multiplies 'x' by a matrix power of this factor (with damping applied).

This calculation is essentially: x * (C + damping * I)**exp where * is matrix-multiplication, ** is matrix power, I is the identity matrix, and C is the matrix represented by this factor.

Unlike left_multiply_matpower, x will always be a matrix.

Args:

• x: Tensor. Represents a matrix. Shape depends on implementation.
• exp: float. The matrix exponent to use.
• damping_func: A function that takes no arguments and returns a 0-D Tensor or a float to use as the damping value, i.e. damping = damping_func().

Returns:

Tensor of same shape as 'x' representing the result of the multiplication.
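The right-multiplication can be sketched the same way in NumPy. As before, this is an assumption-laden illustration rather than the library's code: the standalone function and its cov argument are hypothetical, and C is assumed symmetric positive semi-definite with positive damping.

```python
import numpy as np

def right_multiply_matpower(cov, x, exp, damping_func):
    """Return x @ (cov + damping * I)**exp for a symmetric PSD matrix cov.

    Illustrative stand-in only; the real method takes no cov argument.
    """
    damped = cov + damping_func() * np.eye(cov.shape[0])
    # damped**exp = V diag(w**exp) V^T via the symmetric eigendecomposition.
    w, V = np.linalg.eigh(damped)
    return x @ ((V * w**exp) @ V.T)

cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])  # a 3 x 2 matrix; its column count matches cov
y = right_multiply_matpower(cov, x, exp=-1.0, damping_func=lambda: 0.1)
```

Here the column count of x must match the size of the factor, and the result keeps the shape of x.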