tf.contrib.kfac.fisher_blocks.EmbeddingKFACFB

Class EmbeddingKFACFB

Inherits From: KroneckerProductFB

Defined in tensorflow/contrib/kfac/python/ops/fisher_blocks.py.

K-FAC FisherBlock for embedding layers.

This FisherBlock is similar to FullyConnectedKFACBasicFB, except that its input factor is approximated by a diagonal matrix. When each example references exactly one embedding, this approximation is exact.
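To illustrate why the diagonal approximation is exact in the one-embedding-per-example case, here is a minimal NumPy sketch (not the library's code). It assumes the layer input can be viewed as a one-hot row per example; the helper names `input_factor_dense` and `input_factor_diagonal` are hypothetical.

```python
import numpy as np

def input_factor_dense(ids, vocab_size):
    """Dense input covariance E[a a^T], where each a is a one-hot row."""
    one_hot = np.eye(vocab_size)[ids]          # shape [batch, vocab_size]
    return one_hot.T @ one_hot / len(ids)

def input_factor_diagonal(ids, vocab_size):
    """Diagonal of the same covariance: the relative frequency of each id."""
    return np.bincount(ids, minlength=vocab_size) / len(ids)

# Hypothetical minibatch: each example references exactly one embedding id.
ids = np.array([0, 2, 2, 3, 0, 2])
dense = input_factor_dense(ids, vocab_size=5)
diag = input_factor_diagonal(ids, vocab_size=5)

# All off-diagonal entries of the dense factor are zero.
diagonal_is_exact = np.allclose(dense, np.diag(diag))
```

Because two distinct ids never co-occur in a one-hot row, every off-diagonal product is zero, so storing only `vocab_size` numbers loses nothing in this case.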

Does not support bias parameters.

Properties

num_registered_towers

Methods

__init__

__init__(
    layer_collection,
    vocab_size
)

Creates an EmbeddingKFACFB block.

Args:

  • layer_collection: The collection of all layers in the K-FAC approximate Fisher information matrix to which this FisherBlock belongs.
  • vocab_size: int. Size of vocabulary for this embedding layer.

full_fisher_block

full_fisher_block()

Explicitly constructs the full Fisher block.

Used for testing purposes. (In general, the result may be very large.)

Returns:

The full Fisher block.
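The following NumPy sketch suggests why the explicit block can be very large. It assumes the block is the Kronecker product of a diagonal input factor and a dense output factor, with parameters flattened row by row; `full_block_from_factors` and `full_block_num_entries` are hypothetical helpers, and damping is left out.

```python
import numpy as np

def full_block_from_factors(a_diag, g_cov):
    """Explicit Kronecker product of diag(a_diag) and g_cov."""
    return np.kron(np.diag(a_diag), g_cov)

def full_block_num_entries(vocab_size, output_size):
    """Number of entries in the explicit (square) block."""
    return (vocab_size * output_size) ** 2

# Hypothetical factors for vocab_size=4 and output_size=2.
a_diag = np.array([0.5, 0.0, 0.25, 0.25])
g_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
full_block = full_block_from_factors(a_diag, g_cov)

# With a diagonal input factor the result is block-diagonal:
# the sub-block for id 2 is a_diag[2] * g_cov.
sub_block_id_2 = full_block[4:6, 4:6]

# Entry count for a hypothetical 50000-word vocabulary with 128-dim embeddings.
large_count = full_block_num_entries(50000, 128)
```

Even in this toy case the explicit matrix has 64 entries while the two factors hold only 8, which is why the factored form is preferable outside of tests.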

instantiate_factors

instantiate_factors(
    grads_list,
    damping
)

Instantiate Kronecker Factors for this FisherBlock.

Args:

  • grads_list: List of lists of Tensors. grads_list[i][j] is the gradient of the loss with respect to 'outputs' from source 'i' and tower 'j'. Each Tensor has shape [tower_minibatch_size, output_size].
  • damping: 0-D Tensor or float. 'damping' * identity is approximately added to this FisherBlock's Fisher approximation.
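One way to read "approximately added" is that damping is applied to each Kronecker factor rather than to the full block. The sketch below assumes the simplest such split, adding the square root of the damping to each factor; this is an illustrative assumption, not necessarily the rule the library uses, and `damped_factors` is a hypothetical helper.

```python
import numpy as np

def damped_factors(a_diag, g_cov, damping):
    """Assumed split: add sqrt(damping) to each Kronecker factor."""
    root = np.sqrt(damping)
    return a_diag + root, g_cov + root * np.eye(g_cov.shape[0])

# Hypothetical factors for vocab_size=3 and output_size=2.
a_diag = np.array([0.5, 0.25, 0.25])
g_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
damping = 0.01

a_damped, g_damped = damped_factors(a_diag, g_cov, damping)
factored = np.kron(np.diag(a_damped), g_damped)
exact = np.kron(np.diag(a_diag), g_cov) + damping * np.eye(6)

# The two differ by the cross term sqrt(damping) * (A kron I + I kron G).
cross = np.sqrt(damping) * (
    np.kron(np.diag(a_diag), np.eye(2)) + np.kron(np.eye(3), g_cov))
gap_matches_cross_term = np.allclose(factored - exact, cross)
```

Under this assumption the factored form adds slightly more than damping * identity, but it keeps the Kronecker structure, so products and inverses stay cheap.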

multiply

multiply(vector)

Multiplies the vector by the (damped) block.

Args:

  • vector: The vector (a Tensor or tuple of Tensors) to be multiplied.

Returns:

The vector left-multiplied by the (damped) block.
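The product can be formed without building the full block. The NumPy sketch below assumes a parameter matrix of shape [vocab_size, output_size], a diagonal input factor, a symmetric output factor, and the square root of the damping added to each factor (an assumption for illustration); `multiply_damped` is a hypothetical helper, not the library method.

```python
import numpy as np

def multiply_damped(w, a_diag, g_cov, damping):
    """Applies (diag(a) + r I) kron (G + r I) to w, with r = sqrt(damping)."""
    root = np.sqrt(damping)
    g_damped = g_cov + root * np.eye(g_cov.shape[0])
    # Row i of w is scaled by the damped input factor entry for id i.
    return (a_diag + root)[:, None] * (w @ g_damped)

# Hypothetical factors and parameter-shaped vector.
a_diag = np.array([0.5, 0.0, 0.25, 0.25])
g_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
w = np.arange(8, dtype=float).reshape(4, 2)
damping = 0.04

product = multiply_damped(w, a_diag, g_cov, damping)

# Cross-check against the explicit Kronecker product on the row-major flattening.
root = np.sqrt(damping)
explicit = np.kron(np.diag(a_diag + root), g_cov + root * np.eye(2)) @ w.ravel()
```

The factored form costs one small matrix product and one row scaling, instead of a matrix-vector product with a (vocab_size * output_size)-sized square matrix.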

multiply_cholesky

multiply_cholesky(
    vector,
    transpose=False
)

multiply_cholesky_inverse

multiply_cholesky_inverse(
    vector,
    transpose=False
)

multiply_inverse

multiply_inverse(vector)

Multiplies the vector by the (damped) inverse of the block.

Args:

  • vector: The vector (a Tensor or tuple of Tensors) to be multiplied.

Returns:

The vector left-multiplied by the (damped) inverse of the block.
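The inverse of a Kronecker product is the Kronecker product of the inverses, so the damped inverse can also be applied factor by factor. The sketch below makes the same illustrative assumptions as a simple factored-damping scheme (the square root of the damping added to each factor, symmetric output factor); `multiply_inverse_damped` is a hypothetical helper.

```python
import numpy as np

def multiply_inverse_damped(w, a_diag, g_cov, damping):
    """Applies the inverse of (diag(a) + r I) kron (G + r I), r = sqrt(damping)."""
    root = np.sqrt(damping)
    g_damped = g_cov + root * np.eye(g_cov.shape[0])
    # Inverting the diagonal input factor is an elementwise division per row.
    scaled = w / (a_diag + root)[:, None]
    # Solve x @ g_damped = scaled; g_damped is symmetric, so transpose and solve.
    return np.linalg.solve(g_damped, scaled.T).T

# Hypothetical factors; id 1 never appeared, so its input factor entry is zero.
a_diag = np.array([0.5, 0.0, 0.25, 0.25])
g_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
w = np.arange(8, dtype=float).reshape(4, 2)
damping = 0.04

x = multiply_inverse_damped(w, a_diag, g_cov, damping)

# Round trip: applying the damped block to x should recover w.
root = np.sqrt(damping)
recovered = (a_diag + root)[:, None] * (x @ (g_cov + root * np.eye(2)))
```

Note that the zero entry in `a_diag` would make the undamped block singular; the damping term is what keeps the division well defined for ids that never appear in the minibatch.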

multiply_matpower

multiply_matpower(
    vector,
    exp
)

register_additional_tower

register_additional_tower(
    inputs,
    outputs
)

register_cholesky

register_cholesky()

register_cholesky_inverse

register_cholesky_inverse()

register_inverse

register_inverse()

Registers a matrix inverse to be computed by the block.

register_matpower

register_matpower(exp)

tensors_to_compute_grads

tensors_to_compute_grads()

Returns the Tensors with respect to which the derivative of the loss is computed.