Module: tf_agents.bandits.policies.reward_prediction_base_policy

Base policy that samples actions based on predicted rewards.

Classes

class RewardPredictionBasePolicy: Base class to build policies based on reward predictions.
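
As a rough illustration of what "sampling actions based on predicted rewards" can mean, the sketch below picks an action index from a list of per-action reward predictions. It is not the TF-Agents implementation and does not use the library's API. The function name `choose_action` and the `temperature` parameter are hypothetical, introduced here only for illustration.

```python
import math
import random


def choose_action(predicted_rewards, temperature=0.0, rng=None):
    """Pick an action index from a list of predicted rewards.

    Hypothetical helper, not part of tf_agents. With temperature <= 0 it
    returns the greedy (arg-max) action; with a positive temperature it
    samples from a softmax over the predicted rewards.
    """
    if not predicted_rewards:
        raise ValueError("predicted_rewards must be non-empty")
    indices = range(len(predicted_rewards))
    if temperature <= 0.0:
        # Greedy choice: the action with the highest predicted reward.
        # Ties resolve to the lowest index.
        return max(indices, key=predicted_rewards.__getitem__)
    rng = rng or random.Random()
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(predicted_rewards)
    weights = [math.exp((r - top) / temperature) for r in predicted_rewards]
    return rng.choices(indices, weights=weights, k=1)[0]
```

For example, `choose_action([0.1, 0.9, 0.3])` returns the index of the largest prediction, while a positive `temperature` spreads the choice across actions in proportion to their softmax weights. A concrete policy built on the base class would presumably obtain the predictions from a learned reward model rather than a fixed list.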