This is a fork of Anthony Corso's Crux.jl package for deep reinforcement learning in Julia. Dependencies that were broken as of 15 Feb 2025 (in particular, the interfaces with OpenAI Gym) have been stripped out, so that the rest of the package remains installable and usable with recent versions of Julia until the original package is fixed. Most of the examples and tests that depended on the Python OpenAI Gym environments have therefore been deleted. However, the core package for solving custom RL environments written against the POMDPs.jl interface still works.
Currently, the package works with Julia 1.11 on Windows, Linux, and macOS, and with Julia 1.10 on Linux.
In examples/rl/cartpole.jl, we use the CartPole environment provided by ReinforcementLearningEnvironments.jl and convert it to the POMDPs.jl interface, as a replacement for the OpenAI Gym equivalent of this environment. To try this example, first install the package from this repo in the Julia REPL (in a fresh environment):
```julia
]add https://github.com/zengmao/Crux.jl.git
```
Then install additional packages needed for the script:
```julia
]add POMDPs, QuickPOMDPs, POMDPTools, ReinforcementLearningEnvironments, Random, Flux, Plots
```
Then download a copy of examples/rl/cartpole.jl and run it. The script solves the environment with several RL algorithms and plots the learning curves in cartpole_training.pdf. The PPO training outcome is shown as an animation, and log files are written to logs/.
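For reference, the conversion can be sketched with a generative `QuickMDP`. This is a minimal illustrative sketch, not the actual contents of examples/rl/cartpole.jl; it assumes the ReinforcementLearningEnvironments.jl API (`CartPoleEnv`, `reset!`, `reward`, `is_terminated`, and calling the environment on an action) and uses the whole environment object as the MDP state:

```julia
using POMDPs, QuickPOMDPs, POMDPTools
using ReinforcementLearningEnvironments

# Snapshot the whole environment as the MDP state so transitions stay
# functional; deepcopy avoids mutating the previous state in place.
mdp = QuickMDP(
    gen = function (env, a, rng)
        env2 = deepcopy(env)
        env2(a)                        # step the environment with action a
        (sp = env2, r = reward(env2))
    end,
    actions = [1, 2],                  # push cart left / right
    initialstate = ImplicitDistribution(rng -> begin
        env = CartPoleEnv()
        reset!(env)
        env
    end),
    isterminal = is_terminated,
    discount = 0.99,
)
```

Note that Crux's neural networks need a numeric state vector rather than an environment object, so the actual script's conversion differs in detail; treat examples/rl/cartpole.jl as the working version.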
In case any dependency becomes broken in the future, please switch to the manifest branch, which pins the exact last working versions of the dependencies: Manifest-v1.11.toml for Julia 1.11 and Manifest.toml for Julia 1.10, on Linux (x86_64 glibc).
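Once an MDP is available through the POMDPs.jl interface, training follows the usual Crux pattern. The snippet below is a hedged sketch based on Crux's documented DQN usage (`DiscreteNetwork`, `state_space`, and the `DQN` solver keywords come from upstream Crux; the network size and step budget here are placeholder choices, not values from this repo's example):

```julia
using Crux, Flux, POMDPs

S  = state_space(mdp)                 # Crux helper describing the state space
as = actions(mdp)

# A small Q-network over the 4-dimensional CartPole state
Q() = DiscreteNetwork(Chain(Dense(4, 64, relu), Dense(64, length(as))), as)

𝒮 = DQN(π = Q(), S = S, N = 30_000)   # solver with a 30k-step training budget
π_dqn = solve(𝒮, mdp)                 # returns the trained policy
```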
Below is the original README.
Deep RL library with concise implementations of popular algorithms. Implemented using Flux.jl and fits into the POMDPs.jl interface.
Supports CPU and GPU computation and implements the following algorithms:
- Deep Q-Learning
- Prioritized Experience Replay
- Soft Q-Learning
- REINFORCE
- Proximal Policy Optimization (PPO)
- Lagrange-Constrained PPO
- Advantage Actor Critic
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed DDPG (TD3)
- Soft Actor Critic (SAC)
- Behavioral Cloning
- Generative Adversarial Imitation Learning (GAIL) with on-policy and off-policy versions
- Adversarial Value Moment Imitation Learning (AdVIL)
- Adversarial Reward-moment Imitation Learning (AdRIL)
- Soft Q Imitation Learning (SQIL)
- Adversarial Soft Advantage Fitting (ASAF)
- Inverse Q-Learning (IQLearn)
- Experience Replay
- Install POMDPGym
- Install by opening the Julia REPL and running `] add Crux`

To edit or contribute, use `] dev Crux`; the repo will be cloned to `~/.julia/dev/Crux`.
Maintained by Anthony Corso (acorso@stanford.edu)