Missed TensorFlow Dev Summit? Check out the video playlist. Watch recordings

tf_agents.utils.common.compute_returns

View source on GitHub

Compute the return from each index in an episode.

tf_agents.utils.common.compute_returns(
    rewards, discounts
)

Args:

  • rewards: Tensor of per-timestep reward in the episode.
  • discounts: Tensor of per-timestep discount factor. Should be 0 for final step of each episode.

Returns:

Tensor of per-timestep cumulative returns.