A PolicyStep is returned with every call to policy.action() and policy.distribution().


action: An action tensor or action distribution for TFPolicy, or a numpy array for PyPolicy.

state: During inference, holds the state of the policy to be fed back into the next call to policy.action() or policy.distribution(), e.g. an RNN state. During training, holds the state that is input to policy.action() or policy.distribution(). For stateless policies, this is an empty tuple.

info: Auxiliary information emitted by the policy, e.g. log probabilities of the actions. For policies without info, this is an empty tuple.
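The three fields above can be mirrored with a plain namedtuple for illustration. This is a minimal stand-in sketch, not the real class: the actual PolicyStep lives in tf_agents.trajectories.policy_step, but it is likewise a namedtuple whose state and info fields default to empty tuples.

```python
import collections

# Illustrative stand-in for tf_agents' PolicyStep (assumption: same
# three fields, with empty-tuple defaults for state and info).
PolicyStep = collections.namedtuple("PolicyStep", ("action", "state", "info"))
PolicyStep.__new__.__defaults__ = ((), ())  # state=(), info=()

# A stateless policy's step: only an action; state and info default to ().
step = PolicyStep(action=1)
print(step.action)  # 1
print(step.state)   # ()
print(step.info)    # ()
```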




replace: Exposes namedtuple._replace, returning a copy of the PolicyStep with the given fields replaced.


  new_policy_step = policy_step.replace(action=())

This returns a new policy step with an empty action.

Args:
  **kwargs: Key/value pairs of fields in the policy step.

Returns:
  A new PolicyStep.
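Since replace delegates to namedtuple._replace, the original step is left untouched and a new tuple is returned. A minimal sketch using a hypothetical stand-in namedtuple (not the real tf_agents class):

```python
import collections

# Hypothetical stand-in with the same fields as PolicyStep.
PolicyStep = collections.namedtuple("PolicyStep", ("action", "state", "info"))

step = PolicyStep(action=(1, 2), state=(), info=())
# _replace returns a new tuple; the original step is unchanged.
new_step = step._replace(action=())
print(new_step.action)  # ()
print(step.action)      # (1, 2)
```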