An agent that maintains linear estimates for rewards and their uncertainty.
LinUCB and Linear Thompson Sampling agents are subclasses of this agent.
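As a rough illustration of what "linear estimates for rewards and their uncertainty" can mean, the NumPy sketch below computes a ridge-regression reward estimate and a confidence width for one context vector. It is not this module's implementation: the function name `linear_estimate_and_uncertainty`, its arguments, and the `regularizer` default are hypothetical and chosen only for this example.

```python
import numpy as np


def linear_estimate_and_uncertainty(a, b, x, regularizer=1.0):
    """Return (estimated reward, uncertainty width) for context x.

    Hypothetical helper, not part of the library.
    a: (d, d) matrix accumulating outer products of observed contexts.
    b: (d,) vector accumulating reward-weighted contexts.
    x: (d,) context vector to score.
    """
    d = a.shape[0]
    a_reg = a + regularizer * np.eye(d)        # regularize so the system is solvable
    theta = np.linalg.solve(a_reg, b)          # linear reward model
    mean = float(x @ theta)                    # estimated reward for x
    width = float(np.sqrt(x @ np.linalg.solve(a_reg, x)))  # uncertainty of that estimate
    return mean, width


# With no data observed yet, the estimate is zero and the uncertainty is maximal.
mean, width = linear_estimate_and_uncertainty(
    np.zeros((2, 2)), np.zeros(2), np.array([1.0, 0.0]))
```

In the standard formulations, a LinUCB-style policy scores an action as `mean + alpha * width` for some exploration constant `alpha`, while Linear Thompson Sampling draws the weight vector from a Gaussian centered on `theta` and scores the action with the sampled weights.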
Classes
class ExplorationPolicy: Possible exploration policies.
class LinearBanditAgent: An agent that maintains linear reward estimates and their uncertainties.
class LinearBanditVariableCollection: A collection of variables used by LinearBanditAgent.
Functions
update_a_and_b_with_forgetting(...): Update the covariance matrix a and the weighted sum of rewards b.
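A minimal NumPy sketch of an update with forgetting, assuming the common discounted form in which the previous statistics are scaled by a factor `gamma` in (0, 1] before the new batch is added. The function name `update_with_forgetting` and its argument names are hypothetical; see the function's own reference page for the actual signature and semantics.

```python
import numpy as np


def update_with_forgetting(a_prev, b_prev, rewards, contexts, gamma):
    """Discount old statistics by gamma, then add a batch of observations.

    Hypothetical sketch, not the library function.
    a_prev:   (d, d) previous matrix a.
    b_prev:   (d,) previous weighted sum of rewards b.
    rewards:  (n,) rewards observed in the batch.
    contexts: (n, d) context vectors observed in the batch.
    gamma:    forgetting factor; 1.0 means no forgetting.
    """
    a_new = gamma * a_prev + contexts.T @ contexts   # sum of outer products x x^T
    b_new = gamma * b_prev + contexts.T @ rewards    # sum of r * x
    return a_new, b_new


a_new, b_new = update_with_forgetting(
    a_prev=np.eye(2),
    b_prev=np.array([1.0, 2.0]),
    rewards=np.array([1.0, 3.0]),
    contexts=np.array([[1.0, 0.0], [0.0, 2.0]]),
    gamma=0.5,
)
```

Under this assumption, a `gamma` below 1 makes older observations decay geometrically, which lets the estimates track rewards that drift over time.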
Other Members | |
---|---|
absolute_import | Instance of __future__._Feature |
division | Instance of __future__._Feature |
print_function | Instance of __future__._Feature |