Create a Trajectory transitioning between StepTypes `FIRST` and `MID`.
tf_agents.trajectories.trajectory.first(
    observation: tf_agents.typing.types.NestedSpecTensorOrArray,
    action: tf_agents.typing.types.NestedSpecTensorOrArray,
    policy_info: tf_agents.typing.types.NestedSpecTensorOrArray,
    reward: tf_agents.typing.types.NestedSpecTensorOrArray,
    discount: tf_agents.typing.types.SpecTensorOrArray
) -> tf_agents.trajectories.trajectory.Trajectory
All inputs may be batched.

The input `discount` is used to infer the outer shape of the inputs, as it is always expected to be a singleton array with scalar inner shape.
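
The shape inference described above can be illustrated with a small NumPy-only sketch. It is not the TF-Agents implementation: `first_sketch` and `SketchTrajectory` are hypothetical names introduced here, and the integer codes 0 and 1 are assumed to correspond to the `FIRST` and `MID` step types. The point is only that the step-type fields take the outer shape of `discount`, while the other inputs are passed through unchanged.

```python
from typing import Any, NamedTuple

import numpy as np

# Assumed integer codes for the FIRST and MID step types.
FIRST, MID = 0, 1


class SketchTrajectory(NamedTuple):
    """Hypothetical stand-in for a Trajectory, for illustration only."""
    step_type: np.ndarray
    observation: Any
    action: Any
    policy_info: Any
    next_step_type: np.ndarray
    reward: Any
    discount: np.ndarray


def first_sketch(observation, action, policy_info, reward, discount):
    """Builds a FIRST -> MID trajectory, inferring the outer shape from discount."""
    discount = np.asarray(discount, dtype=np.float32)
    # discount has a scalar inner shape, so its full shape is the outer shape
    # ([B], [T], or [B, T]) shared by all other inputs.
    outer_shape = discount.shape
    return SketchTrajectory(
        step_type=np.full(outer_shape, FIRST, dtype=np.int32),
        observation=observation,
        action=action,
        policy_info=policy_info,
        next_step_type=np.full(outer_shape, MID, dtype=np.int32),
        reward=reward,
        discount=discount,
    )


# A batch of B = 2 transitions with 4-dimensional observations.
traj = first_sketch(
    observation=np.zeros((2, 4), dtype=np.float32),
    action=np.array([0, 1]),
    policy_info=(),
    reward=np.array([1.0, 0.5], dtype=np.float32),
    discount=np.array([1.0, 1.0], dtype=np.float32),
)
print(traj.step_type.shape, traj.next_step_type.shape)
```

With a `[B, T]`-shaped `discount`, the same sketch would produce `[B, T]`-shaped step-type arrays.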

| Args | |
|---|---|
| `observation` | (possibly nested tuple of) `Tensor` or `np.ndarray`; all shaped `[B, ...]`, `[T, ...]`, or `[B, T, ...]`. |
| `action` | (possibly nested tuple of) `Tensor` or `np.ndarray`; all shaped `[B, ...]`, `[T, ...]`, or `[B, T, ...]`. |
| `policy_info` | (possibly nested tuple of) `Tensor` or `np.ndarray`; all shaped `[B, ...]`, `[T, ...]`, or `[B, T, ...]`. |
| `reward` | (possibly nested tuple of) `Tensor` or `np.ndarray`; all shaped `[B, ...]`, `[T, ...]`, or `[B, T, ...]`. |
| `discount` | A floating point vector `Tensor` or `np.ndarray`; shaped `[B]`, `[T]`, or `[B, T]` (optional). |

| Returns | |
|---|---|
| A `Trajectory` instance. | |