-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
1447746
commit def8a3b
Showing
2 changed files
with
17 additions
and
116 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,127 +1,28 @@ | ||
# gym-pybullet-drones | ||
# Reinforcement Learning for Quadcopter Control | ||
|
||
This is a minimalist refactoring of the original `gym-pybullet-drones` repository, designed for compatibility with [`gymnasium`](https://github.com/Farama-Foundation/Gymnasium), [`stable-baselines3` 2.0](https://github.com/DLR-RM/stable-baselines3/pull/1327), and SITL [`betaflight`](https://github.com/betaflight/betaflight)/[`crazyflie-firmware`](https://github.com/bitcraze/crazyflie-firmware/). | ||
This repository is a fork of [gym-pybullet-drones](https://github.com/utiasDSL/gym-pybullet-drones) and implements a reinforcement learning based control policy inspired by [Penicka et al. [1]](https://rpg.ifi.uzh.ch/docs/RAL_IROS22_Penicka.pdf). | ||
|
||
> **NOTE**: if you prefer to access the original codebase, presented at IROS in 2021, please `git checkout [paper|master]` after cloning the repo, and refer to the corresponding `README.md`'s. | ||
## Result | ||
|
||
<img src="gym_pybullet_drones/assets/helix.gif" alt="formation flight" width="325"> <img src="gym_pybullet_drones/assets/helix.png" alt="control info" width="425"> | ||
data:image/s3,"s3://crabby-images/5bcba/5bcba55df7e5dde66148c84b60b85bed8ff4ecd0" alt="RL Control Result" | ||
|
||
## Installation | ||
- The drone can follow arbitrary trajectories. | ||
- It is given two next target waypoints as observation. If the two target waypoints are close, it will reach the target slower. | ||
- The learned policy is well-suited to smoothly follow arbitrary trajectories and corresponds to the obtained result after _slow-phase training_ in [Penicka et al. [1]](https://rpg.ifi.uzh.ch/docs/RAL_IROS22_Penicka.pdf). | ||
|
||
Tested on Intel x64/Ubuntu 22.04 and Apple Silicon/macOS 14.1. | ||
## Implemented New Features | ||
|
||
```sh | ||
git clone https://github.com/utiasDSL/gym-pybullet-drones.git | ||
cd gym-pybullet-drones/ | ||
- Reward implementation of RL policy proposed by [Penicka et al. [1]](https://rpg.ifi.uzh.ch/docs/RAL_IROS22_Penicka.pdf). | ||
- Attitude control action type. In [gym-pybullet-drones](https://github.com/utiasDSL/gym-pybullet-drones), only motor-level control using PWM signals is implemented. This repository extends the original implementation and adds a wrapper for sending attitude commands (thrust and bodyrates). | ||
- Random trajectory generation using polynomial minimum snap trajectory generation using [large_scale_traj_optimizer [2]](https://github.com/ZJU-FAST-Lab/large_scale_traj_optimizer) for training and test set generation. Implementation in [trajectories](./trajectories/) subfolder. | ||
- Scripts for bechmarking the policy by computing basic benchmarks such as mean and max deviation from the target trajectory and time until completion. | ||
|
||
conda create -n drones python=3.10 | ||
conda activate drones | ||
## Setup | ||
|
||
pip3 install --upgrade pip | ||
pip3 install -e . # if needed, `sudo apt install build-essential` to install `gcc` and build `pybullet` | ||
|
||
``` | ||
## Implementation Overview | ||
|
||
## Use | ||
## References | ||
|
||
### PID control examples | ||
|
||
```sh | ||
cd gym_pybullet_drones/examples/ | ||
python3 pid.py # position and velocity reference | ||
python3 pid_velocity.py # desired velocity reference | ||
``` | ||
|
||
### Downwash effect example | ||
|
||
```sh | ||
cd gym_pybullet_drones/examples/ | ||
python3 downwash.py | ||
``` | ||
|
||
### Reinforcement learning examples (SB3's PPO) | ||
|
||
```sh | ||
cd gym_pybullet_drones/examples/ | ||
python learn.py # task: single drone hover at z == 1.0 | ||
python learn.py --multiagent true # task: 2-drone hover at z == 1.2 and 0.7 | ||
``` | ||
|
||
<img src="gym_pybullet_drones/assets/rl.gif" alt="rl example" width="375"> <img src="gym_pybullet_drones/assets/marl.gif" alt="marl example" width="375"> | ||
|
||
### Betaflight SITL example (Ubuntu only) | ||
|
||
```sh | ||
git clone https://github.com/betaflight/betaflight # use the `master` branch at the time of writing (future release 4.5) | ||
cd betaflight/ | ||
make arm_sdk_install # if needed, `apt install curl`` | ||
make TARGET=SITL # comment out line: https://github.com/betaflight/betaflight/blob/master/src/main/main.c#L52 | ||
cp ~/gym-pybullet-drones/gym_pybullet_drones/assets/eeprom.bin ~/betaflight/ # assuming both gym-pybullet-drones/ and betaflight/ were cloned in ~/ | ||
betaflight/obj/main/betaflight_SITL.elf | ||
``` | ||
|
||
In another terminal, run the example | ||
|
||
```sh | ||
conda activate drones | ||
cd gym_pybullet_drones/examples/ | ||
python3 beta.py --num_drones 1 # check the steps in the file's docstrings to use multiple drones | ||
``` | ||
|
||
## Citation | ||
|
||
If you wish, please cite our [IROS 2021 paper](https://arxiv.org/abs/2103.02142) ([and original codebase](https://github.com/utiasDSL/gym-pybullet-drones/tree/paper)) as | ||
|
||
```bibtex | ||
@INPROCEEDINGS{panerati2021learning, | ||
title={Learning to Fly---a Gym Environment with PyBullet Physics for Reinforcement Learning of Multi-agent Quadcopter Control}, | ||
author={Jacopo Panerati and Hehui Zheng and SiQi Zhou and James Xu and Amanda Prorok and Angela P. Schoellig}, | ||
booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, | ||
year={2021}, | ||
volume={}, | ||
number={}, | ||
pages={7512-7519}, | ||
doi={10.1109/IROS51168.2021.9635857} | ||
} | ||
``` | ||
|
||
## References | ||
|
||
- Carlos Luis and Jeroome Le Ny (2016) [*Design of a Trajectory Tracking Controller for a Nanoquadcopter*](https://arxiv.org/pdf/1608.05786.pdf) | ||
- Nathan Michael, Daniel Mellinger, Quentin Lindsey, Vijay Kumar (2010) [*The GRASP Multiple Micro UAV Testbed*](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.169.1687&rep=rep1&type=pdf) | ||
- Benoit Landry (2014) [*Planning and Control for Quadrotor Flight through Cluttered Environments*](http://groups.csail.mit.edu/robotics-center/public_papers/Landry15) | ||
- Julian Forster (2015) [*System Identification of the Crazyflie 2.0 Nano Quadrocopter*](https://www.research-collection.ethz.ch/handle/20.500.11850/214143) | ||
- Antonin Raffin, Ashley Hill, Maximilian Ernestus, Adam Gleave, Anssi Kanervisto, and Noah Dormann (2019) [*Stable Baselines3*](https://github.com/DLR-RM/stable-baselines3) | ||
- Guanya Shi, Xichen Shi, Michael O’Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, and Soon-Jo Chung (2019) | ||
[*Neural Lander: Stable Drone Landing Control Using Learned Dynamics*](https://arxiv.org/pdf/1811.08027.pdf) | ||
- C. Karen Liu and Dan Negrut (2020) [*The Role of Physics-Based Simulators in Robotics*](https://www.annualreviews.org/doi/pdf/10.1146/annurev-control-072220-093055) | ||
- Yunlong Song, Selim Naji, Elia Kaufmann, Antonio Loquercio, and Davide Scaramuzza (2020) [*Flightmare: A Flexible Quadrotor Simulator*](https://arxiv.org/pdf/2009.00563.pdf) | ||
|
||
## Core Team WIP | ||
|
||
- [ ] Multi-drone `crazyflie-firmware` SITL support (@spencerteetaert, @JacopoPan) | ||
|
||
## Desired Contributions/PRs | ||
|
||
- [ ] Add motor delay, advanced ESC modeling by implementing a buffer in `BaseAviary._dynamics()` | ||
- [ ] Replace `rpy` with quaternions (and `ang_vel` with body rates) by editing `BaseAviary._updateAndStoreKinematicInformation()`, `BaseAviary._getDroneStateVector()`, and the `.computeObs()` methods of relevant subclasses | ||
|
||
## Troubleshooting | ||
|
||
- On Ubuntu, with an NVIDIA card, if you receive a "Failed to create and OpenGL context" message, launch `nvidia-settings` and under "PRIME Profiles" select "NVIDIA (Performance Mode)", reboot and try again. | ||
|
||
Run all tests from the top folder with | ||
|
||
```sh | ||
pytest tests/ | ||
``` | ||
|
||
----- | ||
> University of Toronto's [Dynamic Systems Lab](https://github.com/utiasDSL) / [Vector Institute](https://github.com/VectorInstitute) / University of Cambridge's [Prorok Lab](https://github.com/proroklab) | ||
|
||
## Dev | ||
|
||
- Repository must be pulled recursively. After `git clone`, run `git submodule --init --recursive` | ||
- This repository runs a local pip package in folder [trajectories/trajectory_generation](./trajectories/trajectory_generation). This pip package must be kept up to date with `pip install ./trajectories/trajectory_generation` after every change in the folder [trajectories/trajectory_generation](./trajectories/trajectory_generation). | ||
- Git pull must be executed for submodules as well. Run `git pull --recurse-submodules` instead of standard 'git pull'. You can also configure your Git settings to always update submodules upon pulling: `git config --global submodule.recurse true` | ||
- [1]: Penicka, Robert, et al. [*Learning minimum-time flight in cluttered environments.*](https://rpg.ifi.uzh.ch/docs/RAL_IROS22_Penicka.pdf) IEEE Robotics and Automation Letters 7.3 (2022): 7209-7216. | ||
- [2]: Burke, Declan, Airlie Chapman, and Iman Shames. [*Generating minimum-snap quadrotor trajectories really fast.*](https://ieeexplore.ieee.org/abstract/document/9341794) 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020. [github](https://github.com/ZJU-FAST-Lab/large_scale_traj_optimizer) |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.