Module: tf_agents.agents.ppo.ppo_utils

Utility functions for ppo_agent.py.

Functions

get_distribution_params(...): Gets the parameters of an optionally nested action distribution.

get_learning_rate(...): Gets the current learning rate from an optimizer to be graphed.

get_metric_observers(...): Returns a list of observers, one for each metric.

make_timestep_mask(...): Creates a mask for transitions and, optionally, for final incomplete episodes.

make_trajectory_mask(...): Masks boundary trajectories and those with invalid returns and advantages.

nested_kl_divergence(...): Given two nested distributions, sum the KL divergences of the leaves.
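
To illustrate what "sum the KL divergences of the leaves" means for nested_kl_divergence, here is a minimal pure-Python sketch. It is not the TF-Agents implementation and does not use TensorFlow. The Normal class, kl_normal, flatten, and nested_kl_sketch are hypothetical names introduced only for this example, and the leaves are assumed to be univariate Gaussians so that the KL divergence has a closed form.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Normal:
    """Hypothetical stand-in for a leaf distribution (univariate Gaussian)."""
    mean: float
    std: float


def kl_normal(p, q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return (
        math.log(q.std / p.std)
        + (p.std ** 2 + (p.mean - q.mean) ** 2) / (2.0 * q.std ** 2)
        - 0.5
    )


def flatten(nest):
    """Flattens a nest of dicts, lists, and tuples into a list of leaves.

    Dict entries are visited in sorted key order so that two nests with the
    same structure always yield their leaves in the same order.
    """
    if isinstance(nest, Normal):
        return [nest]
    if isinstance(nest, dict):
        return [leaf for key in sorted(nest) for leaf in flatten(nest[key])]
    if isinstance(nest, (list, tuple)):
        return [leaf for item in nest for leaf in flatten(item)]
    raise TypeError(f"Unsupported nest element: {type(nest).__name__}")


def nested_kl_sketch(nest_p, nest_q):
    """Sums the leaf-wise KL divergences of two identically structured nests."""
    leaves_p = flatten(nest_p)
    leaves_q = flatten(nest_q)
    if len(leaves_p) != len(leaves_q):
        raise ValueError("Nests must contain the same number of leaves.")
    return sum(kl_normal(p, q) for p, q in zip(leaves_p, leaves_q))


# Example: an action distribution with one top-level leaf and a nested pair.
old_policy = {"steer": Normal(0.0, 1.0), "pedals": (Normal(0.0, 1.0), Normal(0.0, 1.0))}
new_policy = {"steer": Normal(1.0, 1.0), "pedals": (Normal(0.0, 1.0), Normal(1.0, 1.0))}
total_kl = nested_kl_sketch(old_policy, new_policy)
```

In this sketch, two of the three leaves differ by a unit shift in the mean at unit variance, each contributing a KL of 0.5, so the summed divergence is 1.0. The sketch only checks that the leaf counts match; a real nest utility would also validate that the two structures themselves agree.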