Ablation study on maximizing the RL reward using hyperparameter tuning.

karandeepdps/step_size_epsilon_vs_reward_distributio

step_size_epsilon_vs_reward_distribution

In this notebook I will:

  • Create a bandit algorithm
  • Show how epsilon affects exploration, to illustrate the exploration/exploitation tradeoff
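As a rough illustration of how epsilon controls exploration, here is a minimal sketch of epsilon-greedy action selection. The function names (`argmax_random_tiebreak`, `epsilon_greedy_action`) and the example value estimates are assumptions made for this sketch and are not taken from the notebook.

```python
import random


def argmax_random_tiebreak(q_values, rng):
    """Return the index of the largest estimate, breaking ties at random."""
    best_value = max(q_values)
    best_arms = [arm for arm, q in enumerate(q_values) if q == best_value]
    return rng.choice(best_arms)


def epsilon_greedy_action(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        # Exploration: pick any arm uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: pick the arm with the highest current estimate.
    return argmax_random_tiebreak(q_values, rng)


rng = random.Random(0)
q = [0.2, 1.5, 0.7]  # hypothetical value estimates for three arms

# With epsilon = 0 the agent is purely greedy.
greedy_pick = epsilon_greedy_action(q, epsilon=0.0, rng=rng)

# With epsilon = 0.1 the best arm is still chosen most of the time.
picks = [epsilon_greedy_action(q, epsilon=0.1, rng=rng) for _ in range(10_000)]
greedy_fraction = picks.count(1) / len(picks)
```

With three arms and epsilon 0.1, the best arm should be chosen in roughly 0.9 + 0.1/3 of the steps, because a random exploratory pick can also land on the best arm.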

Results for different step sizes with a constant epsilon of 0.1

step_sizes = [0.01, 0.1, 0.5, 1.0, '1/N(A)']
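The list above mixes constant step sizes with the string '1/N(A)'. A minimal sketch of how such a setting might be interpreted, assuming the standard incremental update Q(A) ← Q(A) + α(R − Q(A)), where '1/N(A)' means α is the reciprocal of the number of times action A has been taken (which yields the sample average). The helper `update_estimate` and the reward sequence are hypothetical.

```python
def update_estimate(q_values, counts, action, reward, step_size):
    """Apply one incremental update to the estimate of the chosen action.

    step_size is either a number (constant step size) or the string
    '1/N(A)', which is treated here as a sample-average update.
    """
    counts[action] += 1
    if step_size == '1/N(A)':
        alpha = 1.0 / counts[action]
    else:
        alpha = float(step_size)
    q_values[action] += alpha * (reward - q_values[action])
    return q_values[action]


def final_estimate(rewards, step_size):
    """Feed a reward sequence to a single arm and return its final estimate."""
    q_values, counts = [0.0], [0]
    for reward in rewards:
        update_estimate(q_values, counts, 0, reward, step_size)
    return q_values[0]


rewards = [1.0, 3.0, 5.0]  # made-up rewards for illustration
sample_average = final_estimate(rewards, '1/N(A)')  # mean of the rewards
last_reward_only = final_estimate(rewards, 1.0)     # tracks the latest reward
half_step = final_estimate(rewards, 0.5)            # recency-weighted average
```

Under this reading, a step size of 1.0 simply overwrites the estimate with the latest reward, while smaller constant step sizes weight recent rewards more heavily than old ones.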

Figure: steps

Randomization effect on rewards

Figure: epsilon
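One way an epsilon comparison like this could be set up is sketched below. The testbed (10 arms, Gaussian rewards, sample-average updates) and the names `run_bandit` and `sweep_epsilons` are assumptions for illustration, not the notebook's actual experiment, and no numbers from the repository are reproduced here.

```python
import random


def run_bandit(epsilon, num_steps=1000, num_arms=10, seed=0):
    """Run one epsilon-greedy agent and return its average reward per step."""
    rng = random.Random(seed)
    # Hypothetical testbed: true arm values drawn from a unit Gaussian.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(num_arms)]
    q_values = [0.0] * num_arms
    counts = [0] * num_arms
    total_reward = 0.0
    for _ in range(num_steps):
        if rng.random() < epsilon:
            action = rng.randrange(num_arms)  # explore
        else:
            best = max(q_values)              # exploit, random tie-break
            action = rng.choice(
                [arm for arm, q in enumerate(q_values) if q == best]
            )
        reward = rng.gauss(true_values[action], 1.0)
        counts[action] += 1
        # Sample-average update of the chosen arm's estimate.
        q_values[action] += (reward - q_values[action]) / counts[action]
        total_reward += reward
    return total_reward / num_steps


def sweep_epsilons(epsilons, num_runs=50):
    """Average the per-step reward over several seeded runs per epsilon."""
    return {
        eps: sum(run_bandit(eps, seed=s) for s in range(num_runs)) / num_runs
        for eps in epsilons
    }


results = sweep_epsilons([0.0, 0.1, 1.0])
```

In this sketch, epsilon 1.0 corresponds to fully random action selection, so it serves as a baseline against which smaller epsilon values can be compared.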
