Assuming you are in the `VADER/` directory, you can create a Conda environment for VADER-Open-Sora with the following commands:
```bash
cd VADER-Open-Sora
conda create -n vader_opensora python=3.10
conda activate vader_opensora
conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install xformers -c xformers
pip install -v -e .
git clone https://github.com/tgxs002/HPSv2.git
cd HPSv2/
pip install -e .
cd ..
```
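Optionally, you can check that the CUDA-enabled PyTorch build was installed correctly:

```bash
# Optional sanity check: prints the PyTorch version and whether CUDA is available.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```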
Please run `accelerate config` as the first step to configure the Accelerate settings. If you are not familiar with the Accelerate configuration, you can refer to the VADER-Open-Sora documentation.
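For example, you can answer the interactive prompts once per machine and then inspect the result:

```bash
# Run once per machine; the interactive prompts cover multi-GPU and
# mixed-precision settings used by the training/inference scripts.
accelerate config
# Optionally print the saved Accelerate configuration and environment to double-check it.
accelerate env
```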
Assuming you are in the `VADER/` directory, you can run inference with the following commands:
```bash
cd VADER-Open-Sora
sh scripts/run_text2video_inference.sh
```
- We have tested on PyTorch 2.3.0 and CUDA 12.1. If the `resolution` is set to `360p`, a GPU with 40 GB of VRAM is required when we set `val_batch_size=1` and use `bf16` mixed precision. It should also work with recent PyTorch and CUDA versions. Please refer to the original Open-Sora repository for more details about the GPU requirements and the model settings.
- `VADER/VADER-Open-Sora/scripts/train_t2v_lora.py` is the script for running inference with Open-Sora 1.2 via VADER (a sketch of a full invocation is given after this list).
  - `--num-frames`, `--resolution`, `fps`, and `aspect-ratio` are inherited from the original Open-Sora model. In short, you can set `--num-frames` to `2s`, `4s`, `8s`, or `16s`. Available values for `--resolution` are `240p`, `360p`, `480p`, and `720p`. The default `fps` is `24` and the default `aspect-ratio` is `3:4`. Please refer to the original Open-Sora repository for more details. One thing to keep in mind, for instance, is that if you set `--num-frames` to `2s` and `--resolution` to `240p`, it is better to use `bf16` mixed precision instead of `fp16`; otherwise, the model may generate noisy videos.
  - `--prompt-path` is the path of the prompt file. Unlike VideoCrafter, we do not provide a prompt function for Open-Sora. Instead, you can provide a prompt file, which contains a list of prompts.
  - `--num-processes` is the number of processes for Accelerate. It is recommended to set it to the number of GPUs.
- `VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_inference.py` is the configuration file for inference. You can modify the configuration file to change the inference settings following the guidance in the documentation.
  - The main difference is that `is_vader_training` should be set to `False`, and `--lora_ckpt_path` should be set to the path of the pretrained LoRA model. Otherwise, the original Open-Sora model will be used for inference.
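For reference, the options above combine roughly into a command like the one below. This is only a sketch: the exact launcher, entry point, and the way the config file is passed (assumed positional here, as in Open-Sora's own scripts) are defined in `scripts/run_text2video_inference.sh`, and `my_prompts.txt` is a placeholder for your own prompt file.

```bash
# Sketch only: the real launcher and argument plumbing live in
# scripts/run_text2video_inference.sh; run this from inside VADER-Open-Sora/.
# A prompt file is simply a list of prompts (assumed one per line here).
cat > my_prompts.txt << 'EOF'
A raccoon playing a guitar under a cherry blossom tree.
A sailboat gliding across a calm lake at sunset.
EOF

python scripts/train_t2v_lora.py \
    configs/opensora-v1-2/vader/vader_inference.py \
    --num-frames 2s \
    --resolution 240p \
    --prompt-path my_prompts.txt \
    --num-processes 4
```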
Please run `accelerate config` as the first step to configure the Accelerate settings. If you are not familiar with the Accelerate configuration, you can refer to the VADER-Open-Sora documentation.
Assuming you are in the `VADER/` directory, you can train the model with the following commands:
```bash
cd VADER-Open-Sora
sh scripts/run_text2video_train.sh
```
- Our experiments were conducted on PyTorch 2.3.0 and CUDA 12.1 using 4 A6000s (48 GB VRAM each). It should also work with recent PyTorch and CUDA versions. A GPU with 48 GB of VRAM is required for fine-tuning the model when using `bf16` mixed precision, with `resolution` set to `360p` and `num_frames` set to `2s`.
- `VADER/VADER-Open-Sora/scripts/train_t2v_lora.py` is the script for fine-tuning Open-Sora 1.2 via LoRA using VADER.
  - The arguments are the same as in the inference process above (a training sketch follows this list).
- `VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_train.py` is the configuration file for training. You can modify the configuration file to change the training settings.
  - You can read the VADER-Open-Sora documentation to understand the usage of the arguments.
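A corresponding training sketch, under the same assumptions as the inference example above, might look like this:

```bash
# Sketch only: the actual launcher lives in scripts/run_text2video_train.sh.
# This mirrors the tested setting above (resolution 360p, 2s clips, 4 GPUs);
# bf16 mixed precision is configured elsewhere (Accelerate settings / the config file),
# not via these flags.
python scripts/train_t2v_lora.py \
    configs/opensora-v1-2/vader/vader_train.py \
    --num-frames 2s \
    --resolution 360p \
    --num-processes 4
```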
This section provides a tutorial on how to implement the VADER method on Open-Sora yourself. We give a step-by-step guide to help you understand the implementation details, so you can easily adapt the VADER method to later versions of Open-Sora or other video generation models. This tutorial is based on Open-Sora v1.2.0.
First, you need to install the dependencies of Open-Sora; the commands below follow the installation instructions in the Open-Sora repository.
```bash
conda create -n vader_opensora python=3.9
conda activate vader_opensora

# download the repo
git clone https://github.com/hpcaitech/Open-Sora
cd Open-Sora

# install torch, torchvision and xformers
pip install -r requirements/requirements-cu121.txt

# install the Open-Sora package
pip install -v -e .
```
There is a list of extra dependencies that you need to install for VADER. You can install them by running the following commands.
```bash
# Install the HPS
git clone https://github.com/tgxs002/HPSv2.git
cd HPSv2/
pip install -e .
cd ..

# Install the dependencies
pip install albumentations \
    peft \
    bitsandbytes \
    accelerate \
    inflect \
    wandb \
    ipdb \
    pytorch_lightning
```
You can copy our `VADER/VADER-Open-Sora/scripts/train_t2v_lora.py` to the `scripts` directory of Open-Sora, namely `Open-Sora/scripts/`. It is better to copy `run_text2video_train.sh` and `run_text2video_inference.sh` to that directory as well. Then, you need to copy all the files in `VADER/Core/` and `VADER/assets/` to the parent directory of Open-Sora, which means `Core/`, `assets/`, and `Open-Sora/` should be in the same directory (a set of copy commands reproducing this layout is sketched after the tree below). You also have to copy `VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_train.py` and `VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_inference.py` to a directory of Open-Sora, namely `Open-Sora/configs/opensora-v1-2/vader/`. Now, you should have a directory structure like:
```
.
├── Core
│   ├── ...
├── Open-Sora
│   ├── scripts
│   │   ├── train_t2v_lora.py
│   │   ├── run_text2video_train.sh
│   │   ├── run_text2video_inference.sh
│   ├── configs
│   │   ├── opensora-v1-2
│   │   │   ├── vader
│   │   │   │   ├── vader_train.py
│   │   │   │   ├── vader_inference.py
├── assets
│   ├── ...
```
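If the VADER and Open-Sora repositories are cloned side by side, the copies described above can be reproduced roughly as follows (adjust the paths if your checkouts live elsewhere):

```bash
# Run from the common parent directory of VADER/ and Open-Sora/.
cp VADER/VADER-Open-Sora/scripts/train_t2v_lora.py Open-Sora/scripts/
cp VADER/VADER-Open-Sora/scripts/run_text2video_train.sh Open-Sora/scripts/
cp VADER/VADER-Open-Sora/scripts/run_text2video_inference.sh Open-Sora/scripts/
cp -r VADER/Core .
cp -r VADER/assets .
mkdir -p Open-Sora/configs/opensora-v1-2/vader
cp VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_train.py Open-Sora/configs/opensora-v1-2/vader/
cp VADER/VADER-Open-Sora/configs/opensora-v1-2/vader/vader_inference.py Open-Sora/configs/opensora-v1-2/vader/
```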
Below we list the files you need to modify in the Open-Sora source code.
- Modify the function `sample()` in `Open-Sora/opensora/schedulers/rf/__init__.py` as in our implementation of `VADER/Core/schedulers/rf/__init__.py`.
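To see exactly what this modification entails, you can diff the stock Open-Sora scheduler against the VADER version copied alongside it in the layout above:

```bash
# Compare the original rf scheduler with VADER's modified copy.
diff -u Open-Sora/opensora/schedulers/rf/__init__.py Core/schedulers/rf/__init__.py
```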
Now that all the files are in the right place and the Open-Sora source code has been modified, you can run training or inference with the following commands.
```bash
cd Open-Sora

# training
sh scripts/run_text2video_train.sh

# or inference
sh scripts/run_text2video_inference.sh
```
Our codebase is directly built on top of VideoCrafter, Open-Sora, and Animate Anything. We would like to thank the authors for open-sourcing their code.
If you find this work useful in your research, please cite: