tf_agents.bandits.agents.exp3_mixture_agent.Exp3MixtureVariableCollection

View source on GitHub

A collection of variables used by subclasses of MixtureAgent.

Note that this variable collection only contains the mixture weights. The variables of the sub-agents that the mixture agent mixes are in variable collections of the respective sub-agents.

num_agents (int) the number of agents mixed by the mixture agent.
reward_aggregates A list of floats containing the reward aggregates for each agent. If not set, the initial values will be 0.
inverse_temperature The initial value for the inverse temperature variable used by the mixture agent.

inverse_temperature

name Returns the name of this module as passed or determined in the ctor.

name_scope Returns a tf.name_scope instance for this class.
reward_aggregates

submodules Sequence of all sub-modules.

Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on).

a = tf.Module()
b = tf.Module()
c = tf.Module()
a.b = b
b.c = c
list(a.submodules) == [b, c]
True
list(b.submodules) == [c]
True
list(c.submodules) == []
True

trainable_variables Sequence of trainable variables owned by this module and its submodules.

variables Sequence of variables owned by this module and its submodules.

Methods

with_name_scope

Decorator to automatically enter the module name scope.

class MyModule(tf.Module):
  @tf.Module.with_name_scope
  def __call__(self, x):
    if not hasattr(self, 'w'):
      self.w = tf.Variable(tf.random.normal([x.shape[1], 3]))
    return tf.matmul(x, self.w)

Using the above module would produce tf.Variables and tf.Tensors whose names included the module name:

mod = MyModule()
mod(tf.ones([1, 2]))
<tf.Tensor: shape=(1, 3), dtype=float32, numpy=..., dtype=float32)>
mod.w
<tf.Variable 'my_module/Variable:0' shape=(2, 3) dtype=float32,
numpy=..., dtype=float32)>

Args
method The method to wrap.

Returns
The original method wrapped such that it enters the module's name scope.