Stepvideo

StepVideo is a state-of-the-art (SoTA) text-to-video pre-trained model with 30 billion parameters and the capability to generate videos up to 204 frames.
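As a rough back-of-the-envelope check on what 30 billion parameters means for memory, the sketch below computes the space needed to store the weights alone at a few standard precisions. The function name `weight_footprint_gb` is hypothetical and used only for illustration. The estimate ignores activations, any auxiliary models, and framework overhead, so real VRAM usage will be higher.

```python
# Bytes per parameter for common storage formats (standard widths).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_footprint_gb(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GB (10**9 bytes).

    Hypothetical helper for illustration; it does not account for
    activations, auxiliary models, or framework overhead.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 30_000_000_000  # 30 billion, from the model description
    for dtype in ("bf16", "fp8"):
        print(f"{dtype}: {weight_footprint_gb(params, dtype):.0f} GB")
```

Under these assumptions the weights take about 60 GB in BF16 and about 30 GB in FP8.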

Examples

For the original BF16 version, see ./stepvideo_text_to_video.py. It requires 80 GB of VRAM.

We also support auto-offload, which reduces the VRAM requirement to 24 GB at the cost of roughly 2x the inference time. See ./stepvideo_text_to_video_low_vram.py.

Example output: video.mp4

For the FP8 quantized version, see ./stepvideo_text_to_video_quantized.py. It requires 40 GB of VRAM.

Example output: video.mp4
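The three example scripts differ mainly in how much VRAM they need, so the choice can be sketched as a simple lookup. The helper below is a minimal illustration, not part of the repository: `pick_script` and `OPTIONS` are hypothetical names, and the preference order (BF16 first, then FP8, then auto-offload) is an assumption. Only the thresholds and script paths come from this README.

```python
from typing import Optional

# (minimum VRAM in GB, script path) pairs taken from this README,
# listed from the highest requirement to the lowest.
# The preference order itself is an assumption made for this sketch.
OPTIONS = [
    (80, "./stepvideo_text_to_video.py"),            # original BF16
    (40, "./stepvideo_text_to_video_quantized.py"),  # FP8 quantized
    (24, "./stepvideo_text_to_video_low_vram.py"),   # auto-offload, about 2x slower
]


def pick_script(vram_gb: float) -> Optional[str]:
    """Return the first example script whose VRAM requirement is met.

    Returns None when the available VRAM is below every listed requirement.
    """
    for min_gb, script in OPTIONS:
        if vram_gb >= min_gb:
            return script
    return None


if __name__ == "__main__":
    for vram in (80, 48, 24, 16):
        print(vram, "GB ->", pick_script(vram))
```

To query the actual device capacity, the `vram_gb` argument could be derived from something like `torch.cuda.get_device_properties(0).total_memory / 1e9`, assuming PyTorch with CUDA is installed.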