❓ Question

I tried to run PPO and MaskablePPO on my custom environment with the same configuration, but I found that MaskablePPO (~5 fps) is much slower than PPO (~140 fps).

Here's my configuration:

I also profiled my code with py-spy and found that MaskablePPO spends a lot of extra time in these lines, while PPO spends much less time in `train` and most of its time in `collect_rollouts`, as expected.

I wonder whether this extreme drop in training efficiency is normal given the large action space, or whether there is a bug in the implementation.
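Since the exact configuration isn't shown here, the snippet below is a rough, hypothetical sketch of the kind of side-by-side comparison being described: the same hyperparameters run through both PPO and MaskablePPO on a custom environment with a large `Discrete` action space that exposes an `action_masks()` method. `MaskedToyEnv`, its sizes, and the hyperparameters are illustrative assumptions (and it assumes a Gymnasium-based SB3/sb3-contrib version), not the actual setup from this issue.

```python
import time

import gymnasium as gym
import numpy as np
from gymnasium import spaces
from sb3_contrib import MaskablePPO
from stable_baselines3 import PPO


class MaskedToyEnv(gym.Env):
    """Hypothetical stand-in for the custom env: a large Discrete action
    space where only a few actions are valid at each step."""

    def __init__(self, n_actions: int = 1000):
        super().__init__()
        self.action_space = spaces.Discrete(n_actions)
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(16,), dtype=np.float32)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        return self.observation_space.sample(), {}

    def step(self, action):
        self._t += 1
        return self.observation_space.sample(), 0.0, self._t >= 100, False, {}

    def action_masks(self) -> np.ndarray:
        # MaskablePPO looks for this method on the env (or on an ActionMasker
        # wrapper) and queries it at every environment step.
        mask = np.zeros(self.action_space.n, dtype=bool)
        valid = self.np_random.choice(self.action_space.n, size=10, replace=False)
        mask[valid] = True
        return mask


if __name__ == "__main__":
    # Train both algorithms with identical settings and compare throughput.
    for algo in (PPO, MaskablePPO):
        model = algo("MlpPolicy", MaskedToyEnv(), n_steps=256, verbose=0)
        start = time.perf_counter()
        model.learn(total_timesteps=2048)
        fps = 2048 / (time.perf_counter() - start)
        print(f"{algo.__name__}: ~{fps:.0f} fps")
```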
Checklist
I have checked that there is no similar issue in the repo
> I also profiled my code with py-spy and found that MaskablePPO spends a lot of extra time in these lines

At least the slowdown is where it would be expected. I'm a bit surprised by how much it slows things down, but the code was never optimized for speed, so there is probably room for improvement.
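For a sense of where the extra time can go: at every step MaskablePPO has to build a masked action distribution, and the masks are applied again in `train()` when the stored actions are re-evaluated, which plain PPO never does. The snippet below is a simplified, illustrative version of that kind of per-step masking work, not sb3-contrib's exact implementation; with a very large action space, this extra tensor work on every batch can add up.

```python
import torch


def mask_logits(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Give invalid actions a huge negative logit so their probability
    # is effectively zero after the softmax.
    huge_neg = torch.finfo(logits.dtype).min
    return torch.where(mask, logits, torch.full_like(logits, huge_neg))


# Batch of 64 observations, 1000 discrete actions, roughly 10% valid each.
logits = torch.randn(64, 1000)
mask = torch.rand(64, 1000) > 0.9
dist = torch.distributions.Categorical(logits=mask_logits(logits, mask))
actions = dist.sample()            # done while collecting rollouts
log_prob = dist.log_prob(actions)  # done again when actions are re-evaluated in train()
```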