How to fix evaluation trajectories #27

tabz23 · 2025-03-01T18:32:45Z

I have tried the following change:
def rollout(self):
    """
    Evaluates the performance of the model on a single episode.
    """
    episode_ret, episode_cost, episode_len = 0.0, 0.0, 0

    obs, info = self.env.reset(seed=1)

TypeError: reset() got an unexpected keyword argument 'seed'

What exactly is the source of randomness when re-running the same evaluation command? How can I always reproduce the same experiment and get the same average reward/cost metrics for evaluation? I want to compare the baseline against a small modification I made, but each run of the evaluation script produces very different results, so I can't use them for comparison.

This is the command I am using:
python "OSRL/examples/research/check/eval_bc" --device="mps" --path "OSRL/logs/OfflineSwimmerVelocityGymnasium-v1-cost-20/BC-safe_bc_modesafe_cost20_seed20-2180/BC-safe_bc_modesafe_cost20_seed20-2180" --eval_episode 1
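
For reference, here is a minimal sketch of one common way to pin down evaluation randomness. It is not taken from the OSRL code base. It assumes the TypeError above comes from an environment (or wrapper) whose reset() does not accept a seed keyword, and that such an environment may instead expose the older Gym-style seed() method. The helper names seed_everything and reset_with_seed are hypothetical, and whether this removes all run-to-run variance in this particular evaluation script is untested.

```python
import random


def seed_everything(seed):
    """Seed the global RNGs that typically affect an evaluation run."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


def reset_with_seed(env, seed):
    """Reset env with a seed, falling back to the older seed() method.

    Assumption: reset(seed=...) raising TypeError means the keyword is
    unsupported. A TypeError raised for another reason inside reset()
    would also trigger the fallback.
    """
    try:
        return env.reset(seed=seed)
    except TypeError:
        if hasattr(env, "seed"):
            env.seed(seed)  # older Gym-style seeding, if available
        return env.reset()
```

In rollout() this would be used as seed_everything(1) once before evaluation, then reset_with_seed(self.env, 1) in place of self.env.reset(seed=1). If neither reset(seed=...) nor seed() is supported, the environment falls back to its own default seeding, so results could still differ between runs.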
