
tf_agents.trajectories.trajectory.Trajectory

A tuple that represents a trajectory.

tf_agents.trajectories.trajectory.Trajectory(
    step_type, observation, action, policy_info, next_step_type, reward,
    discount
)

A Trajectory is a sequence of aligned time steps. It captures the observation and step_type from the current time step together with the computed action and policy_info. The discount, reward, and next_step_type come from the next time step.

Attributes:

  • step_type: A StepType.
  • observation: An array (tensor), or a nested dict, list or tuple of arrays (tensors) that represents the observation.
  • action: An array (tensor), or a nested dict, list, or tuple of arrays (tensors). This represents the action generated according to the observation.
  • policy_info: An arbitrary nest that contains auxiliary information related to the action. Note that this does not include the policy/RNN state which was used to generate the action.
  • next_step_type: The StepType of the next time step.
  • reward: An array (tensor), or a nested dict, list, or tuple of rewards. This represents the rewards and/or constraint satisfaction after performing the action in an environment.
  • discount: A scalar representing the discount factor to multiply with future rewards.
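Since Trajectory is a namedtuple, its fields can be illustrated without TensorFlow. The sketch below is a minimal pure-Python stand-in (not the real tf_agents class, and the field values are hypothetical) showing how the seven fields line up:

```python
from collections import namedtuple

# Stand-in for tf_agents.trajectories.trajectory.Trajectory, which is a
# namedtuple with these seven fields. Values below are illustrative only.
Trajectory = namedtuple(
    "Trajectory",
    ["step_type", "observation", "action", "policy_info",
     "next_step_type", "reward", "discount"])

traj = Trajectory(
    step_type=0,           # current step's StepType (FIRST)
    observation=[0.1, 0.2],  # observation at the current step
    action=1,              # action computed from that observation
    policy_info=(),        # auxiliary policy output (empty here)
    next_step_type=1,      # StepType of the next step (MID)
    reward=0.0,            # reward from the next time step
    discount=1.0)          # discount from the next time step

print(traj.action)  # fields are accessible by name, like any namedtuple
```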

Methods

is_boundary

is_boundary()

is_first

is_first()

is_last

is_last()

is_mid

is_mid()
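The predicates above can be sketched element-wise with NumPy. This is an assumption-laden illustration, not the tf_agents implementation: it assumes StepType encodes FIRST=0, MID=1, LAST=2, that is_first and is_boundary test the current step_type, and that is_last tests next_step_type.

```python
import numpy as np

# Assumed StepType values (FIRST=0, MID=1, LAST=2).
FIRST, MID, LAST = 0, 1, 2

# A hypothetical batch of trajectory steps.
step_type = np.array([FIRST, MID, MID, LAST])
next_step_type = np.array([MID, MID, LAST, FIRST])

# Assumed semantics of each predicate, computed element-wise:
is_first = step_type == FIRST                # episode starts at this step
is_last = next_step_type == LAST             # the next step ends the episode
is_boundary = step_type == LAST              # this step sits on an episode boundary
is_mid = (step_type == MID) & (next_step_type == MID)

print(is_first)  # one boolean per step in the batch
```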

replace

replace(
    **kwargs
)

Exposes namedtuple._replace.

Usage:

  new_trajectory = trajectory.replace(policy_info=())

This returns a new trajectory with an empty policy_info.

Args:

  • **kwargs: key/value pairs of fields in the trajectory.

Returns:

A new Trajectory.
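Because replace delegates to namedtuple._replace, its behavior can be sketched with a plain namedtuple stand-in (not the real tf_agents class): the original tuple is untouched and a new one is returned with the given fields swapped.

```python
from collections import namedtuple

# Stand-in for the Trajectory namedtuple; Trajectory.replace is described
# above as exposing namedtuple._replace, demonstrated here directly.
Trajectory = namedtuple(
    "Trajectory",
    ["step_type", "observation", "action", "policy_info",
     "next_step_type", "reward", "discount"])

traj = Trajectory(0, [0.1], 1, {"logits": [0.5]}, 1, 0.0, 1.0)

# Build a new trajectory with an empty policy_info; the original is unchanged.
new_traj = traj._replace(policy_info=())

print(new_traj.policy_info)  # ()
print(traj.policy_info)      # {'logits': [0.5]}
```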