tf_agents.bandits.environments.dataset_utilities.mushroom_reward_distribution

Creates a reward distribution for the mushroom environment.

r_noeat (float) Reward value for not eating the mushroom.
r_eat_safe (float) Reward value for eating an edible mushroom.
r_eat_poison_bad (float) Reward value for eating a poisonous mushroom and getting poisoned.
r_eat_poison_good (float) Reward value for surviving after eating a poisonous mushroom.
prob_poison_bad (float) Probability of getting poisoned when eating a poisonous mushroom.

Returns a reward distribution table, an instance of tfd.Distribution.
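
To make the roles of the five parameters concrete, the following is a minimal plain-Python sketch of the expected reward for each combination of mushroom kind and action. It does not call the library or reproduce its internals, and it says nothing about how the returned tfd.Distribution is laid out. The helper expected_rewards, the dictionary keys, and the numeric values are hypothetical and chosen only for illustration. It assumes that eating a poisonous mushroom has exactly two outcomes: poisoned with probability prob_poison_bad, or surviving otherwise.

```python
def expected_rewards(r_noeat, r_eat_safe, r_eat_poison_bad,
                     r_eat_poison_good, prob_poison_bad):
    """Hypothetical helper: expected reward per (mushroom kind, action)."""
    if not 0.0 <= prob_poison_bad <= 1.0:
        raise ValueError("prob_poison_bad must lie in [0, 1]")
    # Assumption: eating a poisonous mushroom either poisons the agent
    # (probability prob_poison_bad) or leaves it unharmed (the remainder).
    eat_poisonous = (prob_poison_bad * r_eat_poison_bad
                     + (1.0 - prob_poison_bad) * r_eat_poison_good)
    return {
        ("edible", "no_eat"): r_noeat,
        ("edible", "eat"): r_eat_safe,
        ("poisonous", "no_eat"): r_noeat,
        ("poisonous", "eat"): eat_poisonous,
    }


# Illustrative values only; these are not defaults of the library.
table = expected_rewards(
    r_noeat=0.0,
    r_eat_safe=5.0,
    r_eat_poison_bad=-35.0,
    r_eat_poison_good=5.0,
    prob_poison_bad=0.5,
)
print(table[("poisonous", "eat")])  # 0.5 * -35.0 + 0.5 * 5.0 = -15.0
```

Under these illustrative values, eating is worthwhile on average only when the mushroom is edible, which is the trade-off a bandit agent in this environment has to learn from observed rewards.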