Add RTMO model for keypoint detection (#2)
* Add RTMO
jamjamjon authored Apr 8, 2024
1 parent a0d410b commit ead1752
Showing 31 changed files with 237 additions and 53 deletions.
37 changes: 19 additions & 18 deletions README.md
@@ -1,29 +1,30 @@
# usls

A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics) `(Classification, Segmentation, Detection and Pose Detection)`, [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others. Many execution providers are supported, such as `CUDA`, `TensorRT` and `CoreML`.
A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others. Many execution providers are supported, such as `CUDA`, `TensorRT` and `CoreML`.

## Supported Models

| Model | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| **YOLOv8-detection** | [demo](examples/yolov8) |||||
| **YOLOv8-pose** | [demo](examples/yolov8) |||||
| **YOLOv8-classification** | [demo](examples/yolov8) |||||
| **YOLOv8-segmentation** | [demo](examples/yolov8) |||||
| **YOLOv8-OBB** | TODO | TODO | TODO | TODO | TODO |
| **YOLOv9** | [demo](examples/yolov9) |||||
| **RT-DETR** | [demo](examples/rtdetr) |||||
| **FastSAM** | [demo](examples/fastsam) |||||
| **YOLO-World** | [demo](examples/yolo-world) |||||
| **DINOv2** | [demo](examples/dinov2) |||||
| **CLIP** | [demo](examples/clip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| **BLIP** | [demo](examples/blip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [**DB(Text Detection)**](https://arxiv.org/abs/1911.08947) | [demo](examples/db) |||||
| [**SVTR(Text Recognition)**](https://arxiv.org/abs/2205.00159) | [demo](examples/svtr) |||||
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :----------------------: |:----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| **[YOLOv8-detection](https://github.com/ultralytics/ultralytics)** | Object Detection | [demo](examples/yolov8) |||||
| **[YOLOv8-pose](https://github.com/ultralytics/ultralytics)** | Keypoint Detection | [demo](examples/yolov8) |||||
| **[YOLOv8-classification](https://github.com/ultralytics/ultralytics)** | Classification | [demo](examples/yolov8) |||||
| **[YOLOv8-segmentation](https://github.com/ultralytics/ultralytics)** | Instance Segmentation | [demo](examples/yolov8) |||||
| **[YOLOv9](https://github.com/WongKinYiu/yolov9)** | Object Detection | [demo](examples/yolov9) |||||
| **[RT-DETR](https://arxiv.org/abs/2304.08069)** | Object Detection | [demo](examples/rtdetr) |||||
| **[FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM)** | Instance Segmentation | [demo](examples/fastsam) |||||
| **[YOLO-World](https://github.com/AILab-CVC/YOLO-World)** | Object Detection | [demo](examples/yolo-world) |||||
| **[DINOv2](https://github.com/facebookresearch/dinov2)** | Vision-Self-Supervised | [demo](examples/dinov2) |||||
| **[CLIP](https://github.com/openai/CLIP)** | Vision-Language | [demo](examples/clip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| **[BLIP](https://github.com/salesforce/BLIP)** | Vision-Language | [demo](examples/blip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [**DB**](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) |||||
| [**SVTR**](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) |||||
| [**RTMO**](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) |||||


## Solution Models

Additionally, this repo provides some solution models such as pedestrian `fall detection`, `head detection`, `trash detection`, and more.
Additionally, this repo provides some solution models.

| Model | Example |
| :--------------------------------------------------------------------------------: | :------------------------------: |
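For orientation alongside the README changes above, here is a minimal end-to-end usage sketch assembled from the example snippets elsewhere in this commit (the yolov8 README options and the rtmo example). The model path, image path and confidence thresholds are placeholders, not part of this diff.

```Rust
// Sketch only: pieced together from the yolov8 README and the rtmo example
// in this commit. The ONNX path and image path below are placeholders.
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = Options::default()
        .with_model("../models/yolov8m-dyn.onnx") // <= placeholder, modify this
        .with_confs(&[0.4, 0.15]); // person: 0.4, others: 0.15
    let mut model = YOLO::new(&options)?;

    let xs = vec![DataLoader::try_read("./assets/bus.jpg")?];
    let ys = model.run(&xs)?;

    Annotator::default().with_saveout("YOLOv8").annotate(&xs, &ys);
    Ok(())
}
```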
Binary file removed examples/assets/bus.jpg
Binary file removed examples/assets/falldown.jpg
Binary file removed examples/assets/kids.jpg
Binary file removed examples/assets/math.jpg
Binary file removed examples/assets/trash.jpg
1 change: 0 additions & 1 deletion examples/blip/README.md
@@ -47,7 +47,6 @@ cargo run -r --example blip

## TODO

* [ ] text decoding with top-p sampling
* [ ] VQA
* [ ] Retrieval
* [ ] TensorRT support for textual model
1 change: 1 addition & 0 deletions examples/blip/main.rs
@@ -10,6 +10,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// textual
let options_textual = Options::default()
.with_model("../models/blip-textual-base.onnx")
.with_tokenizer("tokenizer-blip.json")
.with_i00((1, 1, 4).into()) // input_id: batch
.with_i01((1, 1, 4).into()) // input_id: seq_len
.with_i10((1, 1, 4).into()) // attention_mask: batch
1 change: 1 addition & 0 deletions examples/clip/main.rs
@@ -10,6 +10,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// textual
let options_textual = Options::default()
.with_model("../models/clip-b32-textual-dyn.onnx")
.with_tokenizer("tokenizer-clip.json")
.with_i00((1, 1, 4).into())
.with_profile(false);

2 changes: 1 addition & 1 deletion examples/db/README.md
@@ -8,7 +8,7 @@ cargo run -r --example db

### 1. Download ONNX Model

[ppocr-v3-db-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v3-db-dyn.onnx)
[ppocr-v4-db-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v4-db-dyn.onnx)

### 2. Specify the ONNX model path in `main.rs`
1 change: 0 additions & 1 deletion examples/fastsam/README.md
@@ -25,7 +25,6 @@ cargo run -r --example fastsam
```Rust
let options = Options::default()
.with_model("../models/FastSAM-s-dyn-f16.onnx") // <= modify this
.with_saveout("FastSAM")
.with_profile(false);
let mut model = YOLO::new(&options)?;
```
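Several README snippets in this commit drop `.with_saveout(...)` from the model `Options` (FastSAM above; RT-DETR, YOLOv8 and YOLOv9 below), while the new rtmo example sets the save name on `Annotator` instead. A sketch of the updated FastSAM pattern under that assumption; the `Annotator` call is inferred from `examples/rtmo/main.rs`, not from this README:

```Rust
// Assumed replacement pattern: output naming moves from Options to Annotator.
let options = Options::default()
    .with_model("../models/FastSAM-s-dyn-f16.onnx") // <= modify this
    .with_profile(false);
let mut model = YOLO::new(&options)?;

// The save name previously passed to `.with_saveout(...)` on Options.
let annotator = Annotator::default().with_saveout("FastSAM");
```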
1 change: 0 additions & 1 deletion examples/rtdetr/README.md
@@ -23,7 +23,6 @@ cargo run -r --example rtdetr
```Rust
let options = Options::default()
.with_model("ONNX_MODEL") // <= modify this
.with_saveout("RT-DETR");
```

### 3. Then, run
35 changes: 35 additions & 0 deletions examples/rtmo/README.md
@@ -0,0 +1,35 @@
## Quick Start

```shell
cargo run -r --example rtmo
```

## Or you can do it manually

### 1. Download ONNX Model

[rtmo-s-dyn model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-s-dyn.onnx)
[rtmo-m-dyn model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-m-dyn.onnx)
[rtmo-l-dyn model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-l-dyn.onnx)
[rtmo-s-dyn-f16 model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-s-dyn-f16.onnx)
[rtmo-m-dyn-f16 model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-m-dyn-f16.onnx)
[rtmo-l-dyn-f16 model](https://github.com/jamjamjon/assets/releases/download/v0.0.1/rtmo-l-dyn-f16.onnx)



### 2. Specify the ONNX model path in `main.rs`

```Rust
let options = Options::default()
.with_model("ONNX_MODEL") // <= modify this
```

### 3. Then, run

```bash
cargo run -r --example rtmo
```

## Results

![](./demo.jpg)
Binary file added examples/rtmo/demo.jpg
26 changes: 26 additions & 0 deletions examples/rtmo/main.rs
@@ -0,0 +1,26 @@
use usls::{models::RTMO, Annotator, DataLoader, Options, COCO_SKELETON_17};

fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("../rtmo-l-dyn-f16.onnx")
.with_i00((1, 2, 8).into())
.with_nk(17)
.with_confs(&[0.3])
.with_kconfs(&[0.5]);
let mut model = RTMO::new(&options)?;

// load image
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];

// run
let y = model.run(&x)?;

// annotate
let annotator = Annotator::default()
.with_saveout("RTMO")
.with_skeletons(&COCO_SKELETON_17);
annotator.annotate(&x, &y);

Ok(())
}
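A note on the dynamic axes in the example above: `.with_i00((1, 2, 8).into())` appears to encode (min, opt, max) for input 0, axis 0 (the batch dimension), so one model instance can take several images per call. A hedged variant feeding a two-image batch; the image path is a placeholder and the (min, opt, max) reading is an assumption, not stated in this diff:

```Rust
use usls::{models::RTMO, Annotator, DataLoader, Options, COCO_SKELETON_17};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Same options as the example above; the batch axis is assumed to allow 1..=8 images.
    let options = Options::default()
        .with_model("../rtmo-l-dyn-f16.onnx")
        .with_i00((1, 2, 8).into()) // batch: assumed (min, opt, max)
        .with_nk(17)
        .with_confs(&[0.3])
        .with_kconfs(&[0.5]);
    let mut model = RTMO::new(&options)?;

    // Two images per forward pass; the same placeholder image is used twice
    // purely to illustrate batching.
    let xs = vec![
        DataLoader::try_read("./assets/bus.jpg")?,
        DataLoader::try_read("./assets/bus.jpg")?,
    ];
    let ys = model.run(&xs)?;

    Annotator::default()
        .with_saveout("RTMO-batch")
        .with_skeletons(&COCO_SKELETON_17)
        .annotate(&xs, &ys);

    Ok(())
}
```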
6 changes: 3 additions & 3 deletions examples/svtr/README.md
@@ -8,9 +8,9 @@ cargo run -r --example svtr

### 1. Download ONNX Model

[ppocr-v4-server-svtr-ch-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v4-server-svtr-ch-dyn.onnx)
[ppocr-v4-svtr-ch-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v4-svtr-ch-dyn.onnx)
[ppocr-v3-svtr-ch-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v3-svtr-ch-dyn.onnx)

### 2. Specify the ONNX model path in `main.rs`

2 changes: 1 addition & 1 deletion examples/yolo-world/README.md
@@ -10,7 +10,7 @@ cargo run -r --example yolo-world

- Download

[yolov8s-world-v2-shoes](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8s-world-v2-shoes.onnx)
- Or generate your own `yolo-world` model and then export it

- Installation
2 changes: 1 addition & 1 deletion examples/yolov8-face/README.md
@@ -8,7 +8,7 @@ cargo run -r --example yolov8-face

### 1. Download ONNX Model

[yolov8-face-dyn-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-face-dyn-f16.onnx)

### 2. Specify the ONNX model path in `main.rs`

2 changes: 1 addition & 1 deletion examples/yolov8-falldown/README.md
@@ -8,7 +8,7 @@ cargo run -r --example yolov8-falldown

### 1. Download ONNX Model

[yolov8-falldown-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-falldown-f16.onnx)

### 2. Specify the ONNX model path in `main.rs`

2 changes: 1 addition & 1 deletion examples/yolov8-head/README.md
@@ -8,7 +8,7 @@ cargo run -r --example yolov8-head

### 1. Download ONNX Model

[yolov8-head-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-head-f16.onnx)

### 2. Specify the ONNX model path in `main.rs`

2 changes: 1 addition & 1 deletion examples/yolov8-trash/README.md
@@ -10,7 +10,7 @@ cargo run -r --example yolov8-trash

### 1. Download ONNX Model

[yolov8-plastic-bag-f16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov8-plastic-bag-f16.onnx)

### 2. Specify the ONNX model path in `main.rs`

1 change: 0 additions & 1 deletion examples/yolov8/README.md
@@ -38,7 +38,6 @@ yolo export model=yolov8m-seg.pt format=onnx simplify
let options = Options::default()
.with_model("ONNX_PATH") // <= modify this
.with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
.with_saveout("YOLOv8");
let mut model = YOLO::new(&options)?;
```

1 change: 1 addition & 0 deletions examples/yolov8/main.rs
@@ -33,6 +33,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// run & annotate
for (xs, _paths) in dl {
let ys = model.run(&xs)?;
println!("{:?}", ys);
annotator.annotate(&xs, &ys);
}

3 changes: 1 addition & 2 deletions examples/yolov9/README.md
@@ -10,7 +10,7 @@ cargo run -r --example yolov9

- **Download**

[yolov9-c-dyn-fp16](https://github.com/jamjamjon/assets/releases/download/v0.0.1/yolov9-c-dyn-f16.onnx)
- **Export**

```shell
@@ -31,7 +31,6 @@ cargo run -r --example yolov9
```Rust
let options = Options::default()
.with_model("ONNX_PATH") // <= modify this
.with_saveout("YOLOv9");
```

### 3. Run
8 changes: 2 additions & 6 deletions src/models/blip.rs
@@ -4,7 +4,7 @@ use ndarray::{s, Array, Axis, IxDyn};
use std::io::Write;
use tokenizers::Tokenizer;

use crate::{auto_load, ops, LogitsSampler, MinOptMax, Options, OrtEngine, TokenizerStream};
use crate::{ops, LogitsSampler, MinOptMax, Options, OrtEngine, TokenizerStream};

#[derive(Debug)]
pub struct Blip {
@@ -27,11 +27,7 @@ impl Blip {
visual.height().to_owned(),
visual.width().to_owned(),
);
let tokenizer = match &options_textual.tokenizer {
None => auto_load("tokenizer-blip.json")?,
Some(tokenizer) => tokenizer.into(),
};
let tokenizer = Tokenizer::from_file(tokenizer).unwrap();
let tokenizer = Tokenizer::from_file(&options_textual.tokenizer.unwrap()).unwrap();
let tokenizer = TokenizerStream::new(tokenizer);
visual.dry_run()?;
textual.dry_run()?;
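One observation on the tokenizer change above (the same pattern lands in `clip.rs` below): `options_textual.tokenizer.unwrap()` now panics when the caller never sets `.with_tokenizer(...)`, where the removed `auto_load` fallback did not. A sketch of a variant that returns a readable error instead, assuming `tokenizer` is an `Option<String>` and the constructor returns an `anyhow::Result`, as the surrounding `?` operators suggest:

```Rust
// Sketch only: avoid the double unwrap when loading the tokenizer.
let tokenizer_path = options_textual
    .tokenizer
    .as_ref()
    .ok_or_else(|| anyhow::anyhow!("Blip requires `with_tokenizer(...)` on the textual Options"))?;
let tokenizer = Tokenizer::from_file(tokenizer_path)
    .map_err(|err| anyhow::anyhow!("failed to load tokenizer: {err}"))?;
let tokenizer = TokenizerStream::new(tokenizer);
```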
9 changes: 2 additions & 7 deletions src/models/clip.rs
@@ -1,7 +1,6 @@
use crate::{auto_load, ops, MinOptMax, Options, OrtEngine};
use crate::{ops, MinOptMax, Options, OrtEngine};
use anyhow::Result;
use image::DynamicImage;
// use itertools::Itertools;
use ndarray::{Array, Array2, Axis, IxDyn};
use tokenizers::{PaddingDirection, PaddingParams, PaddingStrategy, Tokenizer};

@@ -28,11 +27,7 @@ impl Clip {
visual.inputs_minoptmax()[0][2].to_owned(),
visual.inputs_minoptmax()[0][3].to_owned(),
);
let tokenizer = match &options_textual.tokenizer {
None => auto_load("tokenizer-clip.json").unwrap(),
Some(tokenizer) => tokenizer.into(),
};
let mut tokenizer = Tokenizer::from_file(tokenizer).unwrap();
let mut tokenizer = Tokenizer::from_file(&options_textual.tokenizer.unwrap()).unwrap();
tokenizer.with_padding(Some(PaddingParams {
strategy: PaddingStrategy::Fixed(context_length),
direction: PaddingDirection::Right,
2 changes: 2 additions & 0 deletions src/models/mod.rs
@@ -3,6 +3,7 @@ mod clip;
mod db;
mod dinov2;
mod rtdetr;
mod rtmo;
mod svtr;
mod yolo;

Expand All @@ -11,5 +12,6 @@ pub use clip::Clip;
pub use db::DB;
pub use dinov2::Dinov2;
pub use rtdetr::RTDETR;
pub use rtmo::RTMO;
pub use svtr::SVTR;
pub use yolo::YOLO;
1 change: 0 additions & 1 deletion src/models/rtdetr.rs
@@ -41,7 +41,6 @@ impl RTDETR {
.expect("Failed to get num_classes, make it explicit with `--nc`")
.len(),
);
// let annotator = Annotator::default();
let confs = DynConf::new(&options.confs, nc);
engine.dry_run()?;
