tf_agents.drivers.driver.Driver

A driver that takes steps in an environment using a policy.

tf_agents.drivers.driver.Driver(
    env, policy, observers=None, transition_observers=None, info_observers=None
)

Args
`env`	An environment.Base environment.
`policy`	A policy.Base policy.
`observers`	A list of observers that are updated after the driver is run. Each observer is a callable(Trajectory) that returns its input. In time-major form, Trajectory.time_step is a stacked batch [N+1, batch_size, ...] of time steps, and Trajectory.action is a stacked batch [N, batch_size, ...] of actions.
`transition_observers`	A list of observers that are updated after every step in the environment. Each observer is a callable((TimeStep, PolicyStep, NextTimeStep)). The transition is shaped just as trajectories are for regular observers.
`info_observers`	A list of observers that are updated after the driver is run. Each observer is a callable(info).
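
The observer arguments above are all plain callables, so a small class with a `__call__` method is enough to act as one. The following is a minimal sketch in pure Python that does not import tf_agents. The names `CountingObserver` and `TransitionLogger` are hypothetical, and placeholder strings stand in for real Trajectory, TimeStep and PolicyStep objects.

```python
from typing import Any, List, Tuple


class CountingObserver:
    """Hypothetical observer: counts what it receives and returns its input."""

    def __init__(self) -> None:
        self.count = 0

    def __call__(self, item: Any) -> Any:
        self.count += 1
        return item


class TransitionLogger:
    """Hypothetical transition observer: stores each 3-tuple it is given."""

    def __init__(self) -> None:
        self.transitions: List[Tuple[Any, Any, Any]] = []

    def __call__(self, transition: Tuple[Any, Any, Any]) -> None:
        # Assumed layout: (time_step, policy_step, next_time_step).
        self.transitions.append(transition)


counter = CountingObserver()
logger = TransitionLogger()

# Placeholder strings replace the real library objects in this sketch.
returned = counter("trajectory-0")
logger(("time_step-0", "policy_step-0", "time_step-1"))
```

Objects like these could be passed in the `observers` or `transition_observers` lists. Keeping state on the instance makes it easy to inspect what was collected after a run.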

Methods

@abc.abstractmethod
run()

Takes steps in the environment and updates observers.
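
Because run() is abstract, a concrete driver has to supply the stepping loop. The sketch below illustrates that pattern in pure Python under stated assumptions: `Driver` here is a local stand-in that only mirrors the documented constructor signature, not the tf_agents class, and `ToyEnv`, `FixedStepsDriver` and `num_steps` are hypothetical names introduced for illustration.

```python
import abc


class Driver(abc.ABC):
    """Local stand-in mirroring the documented signature; NOT the tf_agents class."""

    def __init__(self, env, policy, observers=None,
                 transition_observers=None, info_observers=None):
        self._env = env
        self._policy = policy
        self._observers = list(observers or [])
        self._transition_observers = list(transition_observers or [])
        self._info_observers = list(info_observers or [])

    @abc.abstractmethod
    def run(self):
        """Takes steps in the environment and updates observers."""


class ToyEnv:
    """Hypothetical environment whose observation is a running sum of actions."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += action
        return self.state


class FixedStepsDriver(Driver):
    """Hypothetical subclass: takes a fixed number of steps per run() call."""

    def __init__(self, env, policy, num_steps, **kwargs):
        super().__init__(env, policy, **kwargs)
        self._num_steps = num_steps

    def run(self):
        time_step = self._env.reset()
        for _ in range(self._num_steps):
            action = self._policy(time_step)
            next_time_step = self._env.step(action)
            # Transition observers are notified after every environment step.
            for observer in self._transition_observers:
                observer((time_step, action, next_time_step))
            time_step = next_time_step
        return time_step


seen = []
driver = FixedStepsDriver(ToyEnv(), policy=lambda obs: 1, num_steps=3,
                          transition_observers=[seen.append])
final = driver.run()
```

Here the policy is a plain function and the action itself stands in for a PolicyStep, which keeps the example free of library dependencies. The abstract base cannot be instantiated directly, so only subclasses that define run() are usable.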