tf_agents.bandits.agents.exp3_mixture_agent.Exp3MixtureVariableCollection

A collection of variables used by subclasses of MixtureAgent.

Note that this variable collection only contains the mixture weights. The variables of the sub-agents that the mixture agent mixes are in variable collections of the respective sub-agents.

num_agents (int) The number of agents mixed by the mixture agent.
reward_aggregates A list of floats containing the initial reward aggregate for each agent, one entry per agent. If not set, every initial value will be 0.
inverse_temperature The initial value for the inverse temperature variable used by the mixture agent.

inverse_temperature

reward_aggregates
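
To make the role of the two stored values concrete, here is a minimal plain-Python sketch. It is not the TF-Agents class: `MixtureWeightsSketch` and `mixture_probabilities` are hypothetical names. The default `inverse_temperature` of 0.0, the length check on `reward_aggregates`, and the softmax form of the mixture distribution are all assumptions (the softmax is the usual EXP3-style choice); the library's actual behavior may differ.

```python
import math
from typing import List, Optional


class MixtureWeightsSketch:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(self,
                 num_agents: int,
                 reward_aggregates: Optional[List[float]] = None,
                 inverse_temperature: float = 0.0):  # default value is an assumption
        if reward_aggregates is None:
            # Documented behavior: unset aggregates start at 0.
            reward_aggregates = [0.0] * num_agents
        elif len(reward_aggregates) != num_agents:
            # Assumed validation: one aggregate per sub-agent.
            raise ValueError("reward_aggregates must have num_agents elements")
        self.reward_aggregates = list(reward_aggregates)
        self.inverse_temperature = inverse_temperature

    def mixture_probabilities(self) -> List[float]:
        """Assumed form: softmax over inverse_temperature * reward_aggregates."""
        scaled = [self.inverse_temperature * r for r in self.reward_aggregates]
        shift = max(scaled)  # subtract the max for numerical stability
        exps = [math.exp(s - shift) for s in scaled]
        total = sum(exps)
        return [e / total for e in exps]


uniform = MixtureWeightsSketch(num_agents=3)
skewed = MixtureWeightsSketch(2, reward_aggregates=[1.0, 0.0],
                              inverse_temperature=1.0)
print(uniform.mixture_probabilities())
print(skewed.mixture_probabilities())
```

Under these assumptions, zero-valued aggregates give every sub-agent the same probability, and a larger inverse temperature concentrates the distribution on the sub-agent with the highest aggregate.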