Class for representing a trainable absolute value constraint.
Inherits From: NeuralConstraint, BaseConstraint
tf_agents.bandits.policies.constraints.AbsoluteConstraint(
time_step_spec: tf_agents.typing.types.TimeStep,
action_spec: tf_agents.typing.types.BoundedTensorSpec,
constraint_network: tf_agents.typing.types.Network,
error_loss_fn: tf_agents.typing.types.LossFn = tf.compat.v1.losses.mean_squared_error,
comparator_fn: tf_agents.typing.types.ComparatorFn = tf.greater,
absolute_value: float = 0.0,
name: Text = 'AbsoluteConstraint'
)
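This class implements a constraint that compares the expected value of an action against a fixed threshold, such as
expected_value(action) >= absolute_value
or
expected_value(action) <= absolute_value
The direction of the comparison is set by comparator_fn and the threshold by absolute_value.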
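The comparison can be illustrated without TensorFlow. The following is a minimal sketch, not the library's implementation: `feasibility` and the sample values are hypothetical, `operator.gt` and `operator.lt` stand in for `tf.greater` and `tf.less`, and it assumes that feasibility is reported as a 0/1 indicator per action.

```python
import operator
from typing import Callable, List, Sequence


def feasibility(
    predicted_values: Sequence[Sequence[float]],
    absolute_value: float = 0.0,
    comparator_fn: Callable[[float, float], bool] = operator.gt,
) -> List[List[float]]:
    """Returns 1.0 where comparator_fn(value, absolute_value) holds, else 0.0.

    Hypothetical helper: predicted_values plays the role of the constraint
    network's output, with shape [batch_size, num_actions].
    """
    return [
        [1.0 if comparator_fn(value, absolute_value) else 0.0 for value in row]
        for row in predicted_values
    ]


# Made-up network outputs: 2 observations, 3 actions each.
predicted = [[0.2, -0.1, 0.7], [0.5, 0.4, 0.9]]

# Lower-bound constraint: an action is feasible if its value exceeds 0.45.
lower_bound_mask = feasibility(predicted, absolute_value=0.45)

# Upper-bound constraint: an action is feasible if its value is below 0.45.
upper_bound_mask = feasibility(
    predicted, absolute_value=0.45, comparator_fn=operator.lt
)
```

Note that `operator.gt`, like `tf.greater`, is a strict comparison, so a value exactly equal to the threshold counts as infeasible in this sketch.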
Args | |
---|---|
time_step_spec | A TimeStep spec of the expected time_steps. |
action_spec | A nest of BoundedTensorSpec representing the actions. |
constraint_network | An instance of tf_agents.networks.Network used to provide estimates of action feasibility. The input structure should be consistent with the observation_spec. |
error_loss_fn | A function for computing the loss used to train the constraint network. The default is tf.compat.v1.losses.mean_squared_error. |
comparator_fn | A comparator function, such as tf.greater or tf.less. |
absolute_value | The threshold value to use in the constraint. |
name | Python str name of this constraint. All variables in this module will fall under that name. Defaults to the class name. |
Attributes | |
---|---|
constraint_network | |
observation_spec | |
Methods
compute_loss
compute_loss(
observations: tf_agents.typing.types.NestedTensor,
actions: tf_agents.typing.types.NestedTensor,
rewards: tf_agents.typing.types.Tensor,
weights: Optional[types.Float] = None,
training: bool = False
) -> tf_agents.typing.types.Tensor
Computes loss for training the constraint network.
Args | |
---|---|
observations | A batch of observations. |
actions | A batch of actions. |
rewards | A batch of rewards. |
weights | Optional scalar or elementwise (per-batch-entry) importance weights. The output batch loss will be scaled by these weights, and the final scalar loss is the mean of these values. |
training | Whether the loss is being used for training. |

Returns | |
---|---|
loss | A Tensor containing the loss for the training step. |
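The weighting described for `weights` can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: `weighted_mse` is a hypothetical helper, squared error stands in for the default `error_loss_fn`, `predictions` stands in for the constraint network's estimates for the chosen actions, and the reduction is a plain mean over the batch as the `weights` description suggests.

```python
from typing import Optional, Sequence, Union


def weighted_mse(
    predictions: Sequence[float],
    targets: Sequence[float],
    weights: Optional[Union[float, Sequence[float]]] = None,
) -> float:
    """Scales per-example squared errors by weights, then averages them."""
    batch_size = len(predictions)
    if weights is None:
        # No weights given: every example counts equally.
        weights = [1.0] * batch_size
    elif isinstance(weights, (int, float)):
        # A scalar weight is broadcast to every batch entry.
        weights = [float(weights)] * batch_size
    per_example = [
        w * (p - t) ** 2 for p, t, w in zip(predictions, targets, weights)
    ]
    # Final scalar loss: mean of the weighted per-example losses.
    return sum(per_example) / batch_size


# Made-up batch of three examples.
predictions = [1.0, 2.0, 4.0]
targets = [1.0, 0.0, 2.0]

unweighted = weighted_mse(predictions, targets)
weighted = weighted_mse(predictions, targets, weights=[1.0, 0.5, 0.0])
halved = weighted_mse(predictions, targets, weights=0.5)
```

With the sample values, the per-example squared errors are 0, 4 and 4. A zero weight removes an example's contribution to the numerator, but in this sketch the mean is still taken over the full batch.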
initialize
initialize()
Returns an op to initialize the constraint.
__call__
__call__(observation, actions=None)
Returns the probability of input actions being feasible.