Create a transition from a trajectory or two adjacent trajectories.
tf_agents.trajectories.trajectory.to_transition(trajectory, next_trajectory=None)
If next_trajectory is not provided, the tensors of trajectory are sliced along their second (time) dimension; for example:
time_steps.step_type = trajectory.step_type[:,:-1]
time_steps.observation = trajectory.observation[:,:-1]
next_time_steps.observation = trajectory.observation[:,1:]
next_time_steps.step_type = trajectory.next_step_type[:,:-1]
next_time_steps.reward = trajectory.reward[:,:-1]
next_time_steps.discount = trajectory.discount[:,:-1]
Note that the reward and discount of time_steps are undefined in this case and are therefore filled with zeros.
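The slicing described above can be mimicked without TF-Agents to see which time index ends up where. The following is only a sketch under stated assumptions: it uses nested Python lists of shape [B, T] instead of tensors, the namedtuples and the helper slice_into_time_steps are hypothetical stand-ins rather than the library's own types, and the policy steps are left out entirely.

```python
from collections import namedtuple

# Hypothetical stand-ins for the TF-Agents structures; only the fields
# named in the text above are modeled.
Trajectory = namedtuple(
    "Trajectory",
    ["step_type", "observation", "next_step_type", "reward", "discount"])
TimeStep = namedtuple(
    "TimeStep", ["step_type", "reward", "discount", "observation"])


def drop_last(batch):
    """List equivalent of x[:, :-1] for a nested [B, T] list."""
    return [row[:-1] for row in batch]


def drop_first(batch):
    """List equivalent of x[:, 1:] for a nested [B, T] list."""
    return [row[1:] for row in batch]


def zeros_like(batch):
    """Zero-filled nested list with the same [B, T] shape as batch."""
    return [[0.0 for _ in row] for row in batch]


def slice_into_time_steps(traj):
    """Hypothetical helper: builds (time_steps, next_time_steps) by slicing."""
    reward = drop_last(traj.reward)
    discount = drop_last(traj.discount)
    time_steps = TimeStep(
        step_type=drop_last(traj.step_type),
        reward=zeros_like(reward),      # undefined, so zero-filled
        discount=zeros_like(discount),  # undefined, so zero-filled
        observation=drop_last(traj.observation))
    next_time_steps = TimeStep(
        step_type=drop_last(traj.next_step_type),
        reward=reward,
        discount=discount,
        observation=drop_first(traj.observation))
    return time_steps, next_time_steps


# One batch entry (B = 1) with three time steps (T = 3).
traj = Trajectory(
    step_type=[[0, 1, 1]],
    observation=[[10, 11, 12]],
    next_step_type=[[1, 1, 2]],
    reward=[[0.5, 1.0, 2.0]],
    discount=[[1.0, 1.0, 0.0]])
time_steps, next_time_steps = slice_into_time_steps(traj)
```

In this sketch a trajectory with T steps yields T - 1 transitions: the last observation appears only in next_time_steps, and the last reward and discount are dropped.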
trajectory: An instance of Trajectory. The tensors in Trajectory must have shape [B, T, ...] when next_trajectory is None.
next_trajectory: (optional) An instance of Trajectory.
Returns a tuple (time_steps, policy_steps, next_time_steps). The reward and discount fields of time_steps are filled with zeros because these cannot be deduced (please do not use them).