Commit b372e9a

Rename to RL-Zoo3 and better packaging (#291)
* Rename and better packaging
* Move plot scripts inside package
1 parent 8cbe79d commit b372e9a

50 files changed, +926 -876 lines changed

.coveragerc

+1 -1
@@ -2,7 +2,7 @@
 branch = False
 omit =
     tests/*
-    rl_zoo/utils/plot.py
+    rl_zoo3/utils/plot.py

 [report]
 exclude_lines =

CHANGELOG.md

+1 -1
@@ -5,7 +5,7 @@
 - low pass filter was removed

 ### New Features
-- RL Zoo cli: `rl_zoo train` and `rl_zoo enjoy`
+- RL Zoo cli: `rl_zoo3 train` and `rl_zoo3 enjoy`

 ### Bug fixes

Makefile

+4 -2
@@ -1,4 +1,4 @@
-LINT_PATHS = *.py tests/ scripts/ rl_zoo/
+LINT_PATHS = *.py tests/ scripts/ rl_zoo3/

 # Run pytest and coverage report
 pytest:
@@ -10,7 +10,7 @@ check-trained-agents:

 # Type check
 type:
-	pytype -j auto rl_zoo/ tests/ scripts/ -d import-error
+	pytype -j auto rl_zoo3/ tests/ scripts/ -d import-error

 lint:
 	# stop the build if there are Python syntax errors or undefined names
@@ -44,12 +44,14 @@ docker-gpu:

 # PyPi package release
 release:
+	# rm -r build/* dist/*
 	python setup.py sdist
 	python setup.py bdist_wheel
 	twine upload dist/*

 # Test PyPi package release
 test-release:
+	# rm -r build/* dist/*
 	python setup.py sdist
 	python setup.py bdist_wheel
 	twine upload --repository-url https://test.pypi.org/legacy/ dist/*

README.md

+10 -10
@@ -154,13 +154,13 @@ python enjoy.py --algo algo_name --env env_id -f logs/ --exp-id 1 --load-last-ch

 Upload model to hub (same syntax as for `enjoy.py`):
 ```
-python -m rl_zoo.push_to_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3 -m "Initial commit"
+python -m rl_zoo3.push_to_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3 -m "Initial commit"
 ```
 you can choose custom `repo-name` (default: `{algo}-{env_id}`) by passing a `--repo-name` argument.

 Download model from hub:
 ```
-python -m rl_zoo.load_from_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3
+python -m rl_zoo3.load_from_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3
 ```

 ## Hyperparameter yaml syntax
@@ -255,7 +255,7 @@ for multiple, specify a list:

 ```yaml
 env_wrapper:
-    - rl_zoo.wrappers.DoneOnSuccessWrapper:
+    - rl_zoo3.wrappers.DoneOnSuccessWrapper:
         reward_offset: 1.0
     - sb3_contrib.common.wrappers.TimeFeatureWrapper
 ```
@@ -279,7 +279,7 @@ Following the same syntax as env wrappers, you can also add custom callbacks to

 ```yaml
 callback:
-    - rl_zoo.callbacks.ParallelTrainCallback:
+    - rl_zoo3.callbacks.ParallelTrainCallback:
         gradient_steps: 256
 ```

@@ -306,19 +306,19 @@ Note: if you want to pass a string, you need to escape it like that: `my_string:
 Record 1000 steps with the latest saved model:

 ```
-python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000
+python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000
 ```

 Use the best saved model instead:

 ```
-python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-best
+python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-best
 ```

 Record a video of a checkpoint saved during training (here the checkpoint name is `rl_model_10000_steps.zip`):

 ```
-python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-checkpoint 10000
+python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-checkpoint 10000
 ```

 ## Record a Video of a Training Experiment
@@ -328,18 +328,18 @@ Apart from recording videos of specific saved models, it is also possible to rec
 Record 1000 steps for each checkpoint, latest and best saved models:

 ```
-python -m rl_zoo.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic
+python -m rl_zoo3.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic
 ```

 The previous command will create a `mp4` file. To convert this file to `gif` format as well:

 ```
-python -m rl_zoo.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic --gif
+python -m rl_zoo3.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic --gif
 ```

 ## Current Collection: 195+ Trained Agents!

-Final performance of the trained agents can be found in [`benchmark.md`](./benchmark.md). To compute them, simply run `python -m rl_zoo.benchmark`.
+Final performance of the trained agents can be found in [`benchmark.md`](./benchmark.md). To compute them, simply run `python -m rl_zoo3.benchmark`.

 List and videos of trained agents can be found on our Huggingface page: https://huggingface.co/sb3
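
The `env_wrapper` and `callback` entries in the README hunks above are dotted import paths, which is why every `rl_zoo.` prefix had to become `rl_zoo3.`. The sketch below illustrates one way such an entry could be turned into an object. It is not the zoo's actual loader: `split_spec` and `resolve` are hypothetical helper names, and a standard-library class stands in for a wrapper so the example runs without the package installed.

```python
import importlib
from typing import Any, Callable, Dict, Tuple, Union

# One YAML list entry: either "pkg.mod.Class" or {"pkg.mod.Class": {kwargs}}.
WrapperSpec = Union[str, Dict[str, Dict[str, Any]]]


def split_spec(spec: WrapperSpec) -> Tuple[str, Dict[str, Any]]:
    """Return (dotted_path, kwargs) for one parsed YAML list entry."""
    if isinstance(spec, str):
        return spec, {}
    if len(spec) != 1:
        raise ValueError(f"expected exactly one key, got {list(spec)}")
    dotted_path, kwargs = next(iter(spec.items()))
    return dotted_path, dict(kwargs or {})


def resolve(dotted_path: str) -> Callable[..., Any]:
    """Import the module part of the path and return the named attribute."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted path: {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in demo: a stdlib class plays the role of a wrapper class.
path, kwargs = split_spec({"fractions.Fraction": {"numerator": 1, "denominator": 2}})
half = resolve(path)(**kwargs)
```

With a loader of this shape, a stale `rl_zoo.wrappers...` path fails at `importlib.import_module` once the package is renamed, so configuration files have to be updated together with the code.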

docker/Dockerfile

+1 -1
@@ -18,7 +18,7 @@ COPY requirements.txt /tmp/


 RUN \
-    mkdir -p ${CODE_DIR}/rl_zoo && \
+    mkdir -p ${CODE_DIR}/rl_zoo3 && \
     pip uninstall -y stable-baselines3 && \
     pip install -r /tmp/requirements.txt && \
     pip install pip install highway-env==1.5.0 && \

enjoy.py

+1 -1
@@ -1,4 +1,4 @@
-from rl_zoo.enjoy import enjoy
+from rl_zoo3.enjoy import enjoy

 if __name__ == "__main__":
     enjoy()

hyperparams/her.yml

+3 -3
@@ -59,7 +59,7 @@ FetchSlide-v1:
 FetchPickAndPlace-v1:
   env_wrapper:
     - sb3_contrib.common.wrappers.TimeFeatureWrapper
-    # - rl_zoo.wrappers.DoneOnSuccessWrapper:
+    # - rl_zoo3.wrappers.DoneOnSuccessWrapper:
     # reward_offset: 0
     # n_successes: 4
     # - stable_baselines3.common.monitor.Monitor
@@ -96,7 +96,7 @@ FetchReach-v1:
 NeckGoalEnvRelativeSparse-v2:
   model_class: 'sac'
   # env_wrapper:
-  # - rl_zoo.wrappers.HistoryWrapper:
+  # - rl_zoo3.wrappers.HistoryWrapper:
   # horizon: 2
   # - sb3_contrib.common.wrappers.TimeFeatureWrapper
   n_timesteps: !!float 1e6
@@ -122,7 +122,7 @@ NeckGoalEnvRelativeSparse-v2:
 NeckGoalEnvRelativeDense-v2:
   model_class: 'sac'
   env_wrapper:
-    - rl_zoo.wrappers.HistoryWrapperObsDict:
+    - rl_zoo3.wrappers.HistoryWrapperObsDict:
         horizon: 2
     # - sb3_contrib.common.wrappers.TimeFeatureWrapper
   n_timesteps: !!float 1e6

hyperparams/ppo.yml

+1 -1
@@ -319,7 +319,7 @@ MiniGrid-FourRooms-v0:

 CarRacing-v0:
   env_wrapper:
-    - rl_zoo.wrappers.FrameSkip:
+    - rl_zoo3.wrappers.FrameSkip:
         skip: 2
     - gym.wrappers.resize_observation.ResizeObservation:
         shape: 64

hyperparams/ppo_lstm.yml

+2 -2
@@ -132,7 +132,7 @@ BipedalWalker-v3:
 # TO BE TUNED
 BipedalWalkerHardcore-v3:
   # env_wrapper:
-  # - rl_zoo.wrappers.FrameSkip:
+  # - rl_zoo3.wrappers.FrameSkip:
   # skip: 2
   normalize: true
   n_envs: 32
@@ -285,7 +285,7 @@ InvertedPendulumSwingupBulletEnv-v0:

 CarRacing-v0:
   env_wrapper:
-    # - rl_zoo.wrappers.FrameSkip:
+    # - rl_zoo3.wrappers.FrameSkip:
     # skip: 2
     - gym.wrappers.resize_observation.ResizeObservation:
         shape: 64

hyperparams/sac.yml

+8 -8
@@ -16,7 +16,7 @@ MountainCarContinuous-v0:

 Pendulum-v1:
   # callback:
-  # - rl_zoo.callbacks.ParallelTrainCallback
+  # - rl_zoo3.callbacks.ParallelTrainCallback
   n_timesteps: 20000
   policy: 'MlpPolicy'
   learning_rate: !!float 1e-3
@@ -74,9 +74,9 @@ BipedalWalkerHardcore-v3:
 HalfCheetahBulletEnv-v0: &pybullet-defaults
   # env_wrapper:
   # - sb3_contrib.common.wrappers.TimeFeatureWrapper
-  # - rl_zoo.wrappers.DelayedRewardWrapper:
+  # - rl_zoo3.wrappers.DelayedRewardWrapper:
   # delay: 10
-  # - rl_zoo.wrappers.HistoryWrapper:
+  # - rl_zoo3.wrappers.HistoryWrapper:
   # horizon: 10
   n_timesteps: !!float 1e6
   policy: 'MlpPolicy'
@@ -163,12 +163,12 @@ MinitaurBulletDuckEnv-v0:
 # To be tuned
 CarRacing-v0:
   env_wrapper:
-    - rl_zoo.wrappers.FrameSkip:
+    - rl_zoo3.wrappers.FrameSkip:
         skip: 2
     # wrapper from https://github.com/araffin/aae-train-donkeycar
     - ae.wrapper.AutoencoderWrapper:
         ae_path: "logs/car_racing_rgb_160.pkl"
-    - rl_zoo.wrappers.HistoryWrapper:
+    - rl_zoo3.wrappers.HistoryWrapper:
         horizon: 2
   # frame_stack: 4
   normalize: True
@@ -238,7 +238,7 @@ donkey-generated-track-v0:
   env_wrapper:
     - gym.wrappers.time_limit.TimeLimit:
         max_episode_steps: 500
-    - rl_zoo.wrappers.HistoryWrapper:
+    - rl_zoo3.wrappers.HistoryWrapper:
         horizon: 5
   n_timesteps: !!float 1e6
   policy: 'MlpPolicy'
@@ -262,9 +262,9 @@ donkey-generated-track-v0:
 NeckEnvRelative-v2:
   <<: *pybullet-defaults
   env_wrapper:
-    - rl_zoo.wrappers.HistoryWrapper:
+    - rl_zoo3.wrappers.HistoryWrapper:
         horizon: 2
-    # - rl_zoo.wrappers.LowPassFilterWrapper:
+    # - rl_zoo3.wrappers.LowPassFilterWrapper:
     # freq: 2.0
     # df: 25.0
   n_timesteps: !!float 1e6

hyperparams/tqc.yml

+4 -4
@@ -258,12 +258,12 @@ parking-v0:
 # Tuned
 CarRacing-v0:
   env_wrapper:
-    - rl_zoo.wrappers.FrameSkip:
+    - rl_zoo3.wrappers.FrameSkip:
         skip: 2
     # wrapper from https://github.com/araffin/aae-train-donkeycar
     - ae.wrapper.AutoencoderWrapper:
         ae_path: "logs/car_racing_rgb_160.pkl"
-    - rl_zoo.wrappers.HistoryWrapper:
+    - rl_zoo3.wrappers.HistoryWrapper:
         horizon: 2
   # frame_stack: 4
   normalize: True
@@ -280,7 +280,7 @@ RocketLander-v0:
   n_timesteps: !!float 3e6
   policy: 'MlpPolicy'
   env_wrapper:
-    - rl_zoo.wrappers.FrameSkip:
+    - rl_zoo3.wrappers.FrameSkip:
         skip: 4
-    - rl_zoo.wrappers.HistoryWrapper:
+    - rl_zoo3.wrappers.HistoryWrapper:
         horizon: 2
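
The hyperparameter hunks above all apply the same mechanical change, including inside commented-out entries. Users with their own YAML files may need the same rewrite, and the following is a minimal sketch of how it could be scripted. It is not part of this commit, the function name `migrate` is hypothetical, and it assumes that only the `wrappers` and `callbacks` submodules are referenced, as in the files shown here.

```python
import re

# Match "rl_zoo." only when it starts a dotted path (not preceded by a word
# character or a dot) and is followed by one of the submodules seen in the diff.
OLD_PREFIX = re.compile(r"(?<![\w.])rl_zoo\.(?=(?:wrappers|callbacks)\.)")


def migrate(text: str) -> str:
    """Rewrite old `rl_zoo.` import paths to `rl_zoo3.` in YAML text."""
    return OLD_PREFIX.sub("rl_zoo3.", text)


old_yaml = """\
CarRacing-v0:
  env_wrapper:
    - rl_zoo.wrappers.FrameSkip:
        skip: 2
    # - rl_zoo.wrappers.LowPassFilterWrapper:
"""
new_yaml = migrate(old_yaml)
```

Because `rl_zoo3.` does not match the pattern, running `migrate` a second time leaves the text unchanged.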

requirements.txt

+0 -1
@@ -7,7 +7,6 @@ gym-minigrid
 scikit-optimize
 optuna
 pytablewriter~=0.64
-seaborn
 pyyaml>=5.1
 cloudpickle>=1.5.0
 plotly

rl_zoo/cli.py

-15
This file was deleted.

rl_zoo/version.txt

-1
This file was deleted.

rl_zoo/__init__.py → rl_zoo3/__init__.py

+1 -1
@@ -1,6 +1,6 @@
 import os

-from rl_zoo.utils import (
+from rl_zoo3.utils import (
     ALGOS,
     create_test_env,
     get_latest_run_id,

rl_zoo/benchmark.py → rl_zoo3/benchmark.py

+1 -1
@@ -9,7 +9,7 @@
 import pytablewriter
 from stable_baselines3.common.results_plotter import load_results, ts2xy

-from rl_zoo.utils import get_hf_trained_models, get_latest_run_id, get_saved_hyperparams, get_trained_models
+from rl_zoo3.utils import get_hf_trained_models, get_latest_run_id, get_saved_hyperparams, get_trained_models

 parser = argparse.ArgumentParser()
 parser.add_argument("--log-dir", help="Root log folder", default="rl-trained-agents/", type=str)
File renamed without changes.

rl_zoo3/cli.py

+22 -0
@@ -0,0 +1,22 @@
+import sys
+
+from rl_zoo3.enjoy import enjoy
+from rl_zoo3.plots import all_plots, plot_from_file, plot_train
+from rl_zoo3.train import train
+
+
+def main():
+    script_name = sys.argv[1]
+    # Remove script name
+    del sys.argv[1]
+    # Execute known script
+    known_scripts = {
+        "train": train,
+        "enjoy": enjoy,
+        "plot_train": plot_train,
+        "plot_from_file": plot_from_file,
+        "all_plots": all_plots,
+    }
+    if script_name not in known_scripts.keys():
+        raise ValueError(f"The script {script_name} is unknown, please use one of {known_scripts.keys()}")
+    known_scripts[script_name]()
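
The new `rl_zoo3/cli.py` dispatches on the first command-line argument and then removes it, so the selected script's own argument parser sees the remaining flags. The sketch below reproduces that pattern in isolation with stand-in callables instead of the real `train` and `enjoy`. The `dispatch` helper and the usage message for a missing sub-command are additions for illustration and are not in the commit.

```python
from typing import Callable, Dict, List


def dispatch(argv: List[str], known_scripts: Dict[str, Callable[[], None]]) -> None:
    """Run the script named by argv[1], after removing that name from argv."""
    if len(argv) < 2:
        # Assumption: exit with a usage hint rather than an IndexError.
        prog = argv[0] if argv else "rl_zoo3"
        raise SystemExit(f"usage: {prog} <script> [args...], one of {sorted(known_scripts)}")
    script_name = argv[1]
    if script_name not in known_scripts:
        raise ValueError(f"The script {script_name} is unknown, please use one of {sorted(known_scripts)}")
    # Drop the sub-command so the callee parses only its own options.
    del argv[1]
    known_scripts[script_name]()


# Stand-in for `train`: record the argument list the script would see.
seen: List[List[str]] = []
argv = ["rl_zoo3", "train", "--algo", "ppo"]
dispatch(argv, {"train": lambda: seen.append(list(argv))})
```

In the commit itself the list being edited is `sys.argv`, which is why the dispatched functions can call `argparse` without any changes.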

rl_zoo/enjoy.py → rl_zoo3/enjoy.py

+6 -6
@@ -9,12 +9,12 @@
 from huggingface_sb3 import EnvironmentName
 from stable_baselines3.common.utils import set_random_seed

-import rl_zoo.import_envs  # noqa: F401 pylint: disable=unused-import
-from rl_zoo import ALGOS, create_test_env, get_saved_hyperparams
-from rl_zoo.callbacks import tqdm
-from rl_zoo.exp_manager import ExperimentManager
-from rl_zoo.load_from_hub import download_from_hub
-from rl_zoo.utils import StoreDict, get_model_path
+import rl_zoo3.import_envs  # noqa: F401 pylint: disable=unused-import
+from rl_zoo3 import ALGOS, create_test_env, get_saved_hyperparams
+from rl_zoo3.callbacks import tqdm
+from rl_zoo3.exp_manager import ExperimentManager
+from rl_zoo3.load_from_hub import download_from_hub
+from rl_zoo3.utils import StoreDict, get_model_path


 def enjoy():  # noqa: C901
