Module: tf_agents.bandits.environments.bernoulli_action_mask_tf_environment
Environment wrapper that adds action masks to a bandit environment.
This environment wrapper takes a BanditTFEnvironment as input and generates a
new environment where the observations are joined with boolean action masks.
These masks describe which actions are allowed in a given time step. If a
disallowed action is chosen in a time step, the environment will raise an
error. The masks are drawn independently from Bernoulli-distributed random
variables with parameter action_probability.
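To make the masking rule concrete, here is a minimal pure-Python sketch of the idea, not the wrapper's actual TensorFlow implementation. The helper names sample_action_mask and check_actions, the nested-list mask layout, and the use of ValueError are assumptions chosen for illustration only.

```python
import random


def sample_action_mask(batch_size, num_actions, action_probability, rng):
    """Draws one boolean per (batch row, action).

    Each entry is True (action allowed) with probability action_probability,
    independently of all other entries.
    """
    return [
        [rng.random() < action_probability for _ in range(num_actions)]
        for _ in range(batch_size)
    ]


def check_actions(actions, mask):
    """Raises ValueError if any chosen action is masked out in its row."""
    for row, (action, allowed) in enumerate(zip(actions, mask)):
        if not allowed[action]:
            raise ValueError(
                f"Action {action} is disallowed in batch row {row}.")


rng = random.Random(0)  # seeded only so the sketch is repeatable
mask = sample_action_mask(
    batch_size=2, num_actions=4, action_probability=0.5, rng=rng)
```

In this sketch a row can come out with every action disallowed when action_probability is small. How the real wrapper treats that case is not described on this page, so check the class documentation before relying on either behavior.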
The observations from the original environment and the mask are joined by the
given join_fn function, and the result of the join function will be the
observation in the new environment.
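Because the join function alone decides the shape of the new observation, it can return any structure the downstream consumer knows how to take apart. The sketch below shows two possible join functions on plain Python lists. The names join_as_tuple, join_as_dict and split_dict_observation, and the dictionary keys, are hypothetical and do not come from the library.

```python
def join_as_tuple(context, mask):
    """Returns the observation as a (context, mask) pair."""
    return (context, mask)


def join_as_dict(context, mask):
    """Returns the observation as a dictionary with named fields."""
    return {"context": context, "mask": mask}


def split_dict_observation(observation):
    """Inverse of join_as_dict: recovers (context, mask) for the consumer."""
    return observation["context"], observation["mask"]


context = [0.1, 0.7, 0.2]   # stand-in for one original observation
mask = [True, False, True]  # stand-in for one sampled action mask

tuple_observation = join_as_tuple(context, mask)
dict_observation = join_as_dict(context, mask)
```

Whichever structure is chosen, whatever consumes the observations needs a matching way to separate the context from the mask, as split_dict_observation does for the dictionary form here.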
Usage:
'''
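from tf_agents.bandits.environments.bernoulli_action_mask_tf_environment import BernoulliActionMaskTFEnvironment

env = MyFavoriteBanditEnvironment(...)  # any BanditTFEnvironment
def join_fn(context, mask):
  return (context, mask)  # observation paired with its action mask
# The third argument is action_probability.
masked_env = BernoulliActionMaskTFEnvironment(env, join_fn, 0.5)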
'''
Classes
class BernoulliActionMaskTFEnvironment: An environment wrapper that adds action masks to observations.
Other Members
absolute_import: Instance of __future__._Feature
division: Instance of __future__._Feature
print_function: Instance of __future__._Feature
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.