Commit 57eb27f

correct perf numbers and add details (#414)
1 parent 70b156d commit 57eb27f

1 file changed, +4 -1 lines changed

docs/performance/hunyuanvideo.md

@@ -2,6 +2,9 @@
 
 xDiT is [HunyuanVideo](https://github.com/Tencent/HunyuanVideo?tab=readme-ov-file#-parallel-inference-on-multiple-gpus-by-xdit)'s official parallel inference engine. On H100 and H20 GPUs, xDiT reduces the generation time of 1028x720 videos from 31 minutes to 5 minutes, and 960x960 videos from 28 minutes to 6 minutes.
 
+The H100 and H20 performance benchmarks are done with the official HunyuanVideo repository. The L20 performance benchmarks are done with the `diffusers` implementation.
+The L20 performance benchmarks are measured using this [script](examples/hunyuan_video_usp_example.py), along with `flash-attn==2.7.2.post1` and CUDA 12.4.
+
 ### 1280x720 Resolution (129 frames, 50 steps) - Ulysses Latency (seconds)
 
 <center>
@@ -22,6 +25,6 @@ xDiT is [HunyuanVideo](https://github.com/Tencent/HunyuanVideo?tab=readme-ov-fil
 |----------|--------|---------|---------|---------|
 | H100 | 1,735.01 | 934.09 | 645.45 | 367.02 |
 | H20 | 6,621.46 | 3,400.55 | 2,310.48 | 1,214.67 |
-| L20 | 6,039.08 | 3,260.62 | 2,070.96 | |
+| L20 | 6,039.08 | 3,260.62 | 2,284.74 | |
 
 </center>
