mehdimashayekhi changed the title from "apply policy call" to "why there is two call to the policy, also where is the non intrinsic characteristic of intrinsic reward?" on Jan 12, 2019
mehdimashayekhi changed the title to "why there are two calls to the policy, also where is the non intrinsic characteristic of intrinsic reward?" on Jan 12, 2019
There are two graphs created for the policy / predictor, one for rollout and one for optimization. This is because at rollout time the time dimension has size 1 and is better treated separately.
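To make that concrete, here is a rough, simplified sketch of that pattern (this is not the repo's code: it uses a plain dense net instead of the CNN+GRU, and the shapes, layer sizes, and names are illustrative). The same network is instantiated twice with shared weights, once with a time dimension of 1 for rollouts and once over the full segment for optimization:

```python
import tensorflow as tf

def policy_net(obs, scope, reuse=False):
    # obs: [batch, time, obs_dim]; flatten batch and time for the dense layers
    with tf.variable_scope(scope, reuse=reuse):
        flat = tf.reshape(obs, [-1, obs.shape[-1].value])
        h = tf.layers.dense(flat, 64, tf.nn.relu)
        logits = tf.layers.dense(h, 4)   # action logits
        vpred = tf.layers.dense(h, 1)    # value prediction
    return logits, vpred

# Rollout graph: one timestep per call (time dimension = 1).
obs_step = tf.placeholder(tf.float32, [None, 1, 84])
logits_step, vpred_step = policy_net(obs_step, scope="policy")

# Optimization graph: the whole rollout segment at once (e.g. 128 steps),
# built with reuse=True so both graphs share the same weights.
obs_seg = tf.placeholder(tf.float32, [None, 128, 84])
logits_seg, vpred_seg = policy_net(obs_seg, scope="policy", reuse=True)
```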
Hi, thanks for sharing. I was wondering if you could explain why we need two calls to apply_policy in cnn_gru_policy_dynamics.py, here:
random-network-distillation/policies/cnn_gru_policy_dynamics.py, Line 69 in f75c0f1
and here:
random-network-distillation/policies/cnn_gru_policy_dynamics.py, Line 83 in f75c0f1
Also, I have another question. Based on the paper, the intrinsic reward should be non-episodic while the extrinsic reward is treated as episodic, but I couldn't find where this "non-episodic" characteristic of the intrinsic reward is addressed in the implementation. Shouldn't we also add this episodic reward (i.e., eprews) to the external reward (i.e., rews_ext)?
random-network-distillation/ppo_agent.py, Line 241 in f75c0f1
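For reference, here is a minimal numpy sketch of the episodic/non-episodic distinction described in the paper (the function and variable names are illustrative, not the ones used in ppo_agent.py): extrinsic returns are cut at episode boundaries by the done mask, while intrinsic returns bootstrap straight through them, so the two streams are kept separate rather than summing eprews into rews_ext.

```python
import numpy as np

def discounted_returns(rews, dones, last_value, gamma, episodic):
    # Backward pass computing discounted returns over one rollout segment.
    T = len(rews)
    rets = np.zeros(T)
    running = last_value
    for t in reversed(range(T)):
        if episodic:
            # Zero the bootstrap whenever an episode ends at step t.
            running = rews[t] + gamma * running * (1.0 - dones[t])
        else:
            # Non-episodic: ignore the done flag entirely.
            running = rews[t] + gamma * running
        rets[t] = running
    return rets

rews_ext = np.array([0.0, 0.0, 1.0, 0.0])    # extrinsic rewards
rews_int = np.array([0.2, 0.1, 0.3, 0.05])   # intrinsic (RND) rewards
dones    = np.array([0.0, 0.0, 1.0, 0.0])    # episode ended after step 2

ret_ext = discounted_returns(rews_ext, dones, last_value=0.5, gamma=0.999, episodic=True)
ret_int = discounted_returns(rews_int, dones, last_value=0.5, gamma=0.99,  episodic=False)
# The two return streams would then feed separate value heads and be combined
# at the advantage level, rather than adding eprews into rews_ext directly.
```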
I really appreciate your responses.