tf_agents.bandits.policies.policy_utilities.construct_mask_from_multiple_sources

Constructs an action mask from multiple sources.

The sources include:

-- the action mask encoded in the observation,
-- the num_actions feature restricting the number of actions per sample,
-- the feasibility mask implied by constraints.

The resulting mask disables all actions that are masked out in any of the three sources.

Args:
observation: A nest of Tensors containing the observation.
observation_and_action_constraint_splitter: The function that splits the observation into the observation proper and the action mask. Used when the observation contains an action mask.
constraints: Iterable of constraint objects that are instances of tf_agents.bandits.agents.NeuralConstraint.
max_num_actions: The maximum number of actions per sample.

Returns:
An action mask in the form of a [batch_size, max_num_actions] 0-1 tensor.
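
The combination rule described above (an action stays enabled only if no source masks it out) can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the helper names `sequence_mask` and `combine_masks`, the convention that `None` means "source absent", and the hard-coded example masks are all assumptions made for illustration. In particular, the feasibility mask is written out by hand here rather than derived from constraint objects.

```python
from typing import List, Optional, Sequence

Mask = List[List[int]]  # shape [batch_size, max_num_actions], entries 0 or 1


def sequence_mask(num_actions: Sequence[int], max_num_actions: int) -> Mask:
    """Row i enables the first num_actions[i] actions and disables the rest."""
    return [
        [1 if j < n else 0 for j in range(max_num_actions)]
        for n in num_actions
    ]


def combine_masks(masks: Sequence[Optional[Mask]]) -> Optional[Mask]:
    """Element-wise AND over all masks that are present.

    Assumption of this sketch: a source that is None imposes no restriction,
    and None is returned when no source is present.
    """
    present = [m for m in masks if m is not None]
    if not present:
        return None
    combined = [row[:] for row in present[0]]
    for mask in present[1:]:
        combined = [
            [a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(combined, mask)
        ]
    return combined


# Hypothetical inputs with batch_size = 2 and max_num_actions = 4.
observation_mask = [[1, 1, 0, 1], [1, 1, 1, 1]]   # mask carried in the observation
num_actions_mask = sequence_mask([3, 2], max_num_actions=4)
feasibility_mask = [[0, 1, 1, 1], [1, 1, 1, 1]]   # hand-written stand-in for constraints

final_mask = combine_masks([observation_mask, num_actions_mask, feasibility_mask])
print(final_mask)  # [[0, 1, 0, 0], [1, 1, 0, 0]]
```

In the first sample, action 0 is removed by the feasibility mask, action 2 by the observation mask, and action 3 by the num_actions limit, so only action 1 remains enabled.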