Returns a Trajectory given transitions.

from_transition is used by a driver to convert a sequence of transitions into a Trajectory for efficient storage. An agent (e.g. ppo_agent.PPOAgent) then converts the Trajectory back into transitions by invoking to_transition.

Note that this method does not add a time dimension to the Tensors in the resulting Trajectory. This means that if your transitions don't already include a time dimension, the Trajectory cannot be passed to agent.train().
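To make the shape requirement concrete, here is a minimal sketch using plain Python lists rather than Tensors. It assumes the usual [batch, time, ...] layout for training data; the helpers add_time_dim and shape are hypothetical names introduced only for this illustration and are not part of the library. With real Tensors, the same effect would typically come from expanding axis 1 on every field of the Trajectory (for example with tf.expand_dims mapped over the structure).

```python
def add_time_dim(batch):
    """Turns a batch of shape [B, ...] into [B, 1, ...] by wrapping each example."""
    return [[example] for example in batch]


def shape(nested):
    """Returns the shape of a non-empty, rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Batched values without a time dimension, as from_transition would leave them.
observations = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # shape [B=3, 2]
rewards = [0.0, 1.0, 0.5]                            # shape [B=3]

# Insert a time axis of length 1 right after the batch axis.
timed_observations = add_time_dim(observations)
timed_rewards = add_time_dim(rewards)

print(shape(observations), "->", shape(timed_observations))
print(shape(rewards), "->", shape(timed_rewards))
```

The same wrapping has to be applied to every field of the Trajectory, not only to observations and rewards, so that all fields agree on the time axis.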

time_step: A time_step.TimeStep representing the first step in a transition.
action_step: A policy_step.PolicyStep representing actions corresponding to observations from time_step.
next_time_step: A time_step.TimeStep representing the second step in a transition.
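
The way the three arguments combine into one record can be sketched with plain namedtuples. This is an illustration under stated assumptions, not the library's implementation: TimeStep, PolicyStep and Trajectory below are simplified stand-ins whose field names are meant to mirror the library's tuples, from_transition_sketch is a hypothetical name, and the integer step types (0 for a first step, 1 for a middle step) are an assumed encoding.

```python
from collections import namedtuple

# Simplified stand-ins for the library's tuples (assumption: field names only).
TimeStep = namedtuple("TimeStep", ["step_type", "reward", "discount", "observation"])
PolicyStep = namedtuple("PolicyStep", ["action", "state", "info"])
Trajectory = namedtuple(
    "Trajectory",
    ["step_type", "observation", "action", "policy_info",
     "next_step_type", "reward", "discount"],
)


def from_transition_sketch(time_step, action_step, next_time_step):
    """Packs one transition into a single Trajectory-like record."""
    return Trajectory(
        # What the agent saw and did comes from the first step and the policy.
        step_type=time_step.step_type,
        observation=time_step.observation,
        action=action_step.action,
        policy_info=action_step.info,
        # The outcome of the action comes from the second step.
        next_step_type=next_time_step.step_type,
        reward=next_time_step.reward,
        discount=next_time_step.discount,
    )


# Assumed encoding for this sketch: 0 = first step, 1 = middle step.
first = TimeStep(step_type=0, reward=0.0, discount=1.0, observation=[0.1, 0.2])
action = PolicyStep(action=1, state=(), info=())
second = TimeStep(step_type=1, reward=0.5, discount=0.99, observation=[0.3, 0.4])

traj = from_transition_sketch(first, action, second)
print(traj)
```

In this sketch the observation is stored once, from time_step, while the reward and discount are taken from next_time_step. Storing the next observation separately would duplicate data across consecutive records, which is one plausible reading of "efficient storage" above.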