tf_agents.agents.ppo.ppo_utils.make_trajectory_mask

Mask boundary trajectories and those with invalid returns and advantages.

Args:
  batched_traj: Trajectory, doubly-batched [batch_dim, time_dim,...]. It must already be preprocessed.

Returns:
  A mask of type tf.float32 that is 0.0 for every between-episode Trajectory step (where batched_traj.step_type is LAST) and 0.0 wherever the return value is unavailable. It is 1.0 elsewhere.
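
The masking rule can be illustrated with a minimal sketch in plain Python over nested lists, so it runs without TensorFlow. It is not the library's implementation. The function name sketch_trajectory_mask and the sample values are hypothetical. The sketch also assumes that an "unavailable" return shows up as a step whose return and advantage are both exactly zero (which is how preprocessing is understood to leave the final step), and that LAST is encoded as 2, matching TF-Agents' StepType.

```python
# Assumed step-type encoding, matching TF-Agents' StepType:
# FIRST = 0, MID = 1, LAST = 2.
LAST = 2


def sketch_trajectory_mask(step_type, returns, advantages):
    """Build a [batch_dim][time_dim] float mask (hypothetical helper).

    A step is masked out (0.0) when it is a boundary step (step_type is LAST)
    or when its return value is treated as unavailable.
    """
    mask = []
    for steps, rets, advs in zip(step_type, returns, advantages):
        row = []
        for s, r, a in zip(steps, rets, advs):
            is_boundary = s == LAST
            # Assumption: an unavailable return is signalled by the return and
            # the advantage both being zero.
            return_unavailable = r == 0 and a == 0
            row.append(0.0 if is_boundary or return_unavailable else 1.0)
        mask.append(row)
    return mask


# Two sequences of four steps each (made-up values for illustration).
step_type = [[0, 1, 2, 0], [1, 1, 1, 1]]
returns = [[1.5, 0.9, 0.0, 0.7], [2.0, 1.0, 0.5, 0.0]]
advantages = [[0.2, -0.1, 0.0, 0.3], [0.4, 0.1, -0.2, 0.0]]

mask = sketch_trajectory_mask(step_type, returns, advantages)
print(mask)  # [[1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 0.0]]
```

In the first sequence the third step is masked because it is a LAST step. In the second, the final step is masked because its return and advantage are both zero. A mask like this is typically multiplied into per-step loss terms so that masked steps contribute nothing.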