tf_agents.trajectories.trajectory.mid

Create a Trajectory transitioning between StepTypes MID and MID.

tf_agents.trajectories.trajectory.mid(
    observation, action, policy_info, reward, discount
)

All inputs may be batched.

The outer shape of the inputs is inferred from discount, which is always expected to be a single (non-nested) array with a scalar inner shape.
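To make the shape inference concrete, here is a minimal NumPy sketch of the idea. It is not TF-Agents' actual implementation: the helper name `infer_outer_shape_and_step_types` and the constant `MID = 1` are assumptions introduced purely for illustration.

```python
import numpy as np

MID = 1  # illustrative stand-in for the MID step type (assumption)


def infer_outer_shape_and_step_types(discount):
    """Hypothetical helper: read the outer shape off `discount`.

    Because `discount` has a scalar inner shape, its full shape is
    exactly the outer (batch and/or time) shape of the inputs.
    """
    discount = np.asarray(discount, dtype=np.float32)
    outer_shape = discount.shape  # (), (B,), (T,), or (B, T)
    # Both step types are MID and are filled to match the outer shape.
    step_type = np.full(outer_shape, MID, dtype=np.int32)
    next_step_type = np.full(outer_shape, MID, dtype=np.int32)
    return outer_shape, step_type, next_step_type


# Example with B=2 and T=3.
outer_shape, step_type, next_step_type = infer_outer_shape_and_step_types(
    np.ones((2, 3))
)
print(outer_shape)
```

The other inputs (observation, action, and so on) may have arbitrary inner shapes, which is why they cannot be used to tell the outer dimensions apart from the inner ones.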

Args:

  • observation: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
  • action: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
  • policy_info: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
  • reward: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
  • discount: A floating point Tensor or np.ndarray; shaped [B], [T], or [B, T] (the outer dimensions are optional).

Returns:

A Trajectory instance.