
UniVST: A Unified Framework for Training-free Localized Video Style Transfer [Official PyTorch Code]

1 Key Laboratory of Multimedia Trusted Perception and Efficient Computing,
Ministry of Education of China, Xiamen University, China.
2 Kunlun Skywork AI.

Paper PDF     Project Page


🔥🔥🔥 News

• 2024.10.26: 🔥 The paper of UniVST has been submitted to arXiv.
• 2025.01.01: 🔥 The official code of UniVST has been released.
• 2025.06.01: 🔥 The project page of UniVST is now available.

🎬 Overview

We propose UniVST, a unified framework for training-free localized video style transfer based on diffusion models. UniVST first applies DDIM inversion to the original video and the style image to obtain their initial noise, and integrates Point-Matching Mask Propagation to generate masks for the object regions. It then performs AdaIN-Guided Localized Video Stylization with a three-branch architecture for information interaction. Moreover, Sliding-Window Consistent Smoothing is incorporated into the denoising process to enhance temporal consistency in the latent space. The overall framework is illustrated below.
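For intuition, the AdaIN operation at the core of the stylization branch follows the standard adaptive-instance-normalization formulation: content features are renormalized with the style's per-channel statistics. Below is a minimal PyTorch sketch, assuming feature maps of shape (B, C, H, W); it illustrates the operation itself, not the repo's exact implementation.

import torch

def adain(content_feat: torch.Tensor, style_feat: torch.Tensor,
          eps: float = 1e-5) -> torch.Tensor:
    # Per-channel statistics over the spatial dimensions.
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    # Whiten with content statistics, then re-color with style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean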

🔧 Environment

# Clone the repo
git clone https://github.com/QuanjianSong/UniVST.git
cd UniVST

# Install with requirements.txt
conda create -n UniVST python=3.9
conda activate UniVST
pip install -r requirements.txt

# Or install with environment.yaml
conda env create -f environment.yaml

🚀 Start

► 1. Perform inversion for the original video

python content_ddim_inv.py --content_path ./examples/content/libby \
                            --output_dir ./output

Then, you will find the content inversion result in the ./output/content directory.
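For context, DDIM inversion runs the deterministic DDIM sampler in reverse, re-noising the video frame by frame until its initial noise is recovered. The sketch below shows a single inversion step; the variable names and the eta = 0 assumption are illustrative and not taken from content_ddim_inv.py.

import torch

def ddim_inversion_step(x_t: torch.Tensor, eps: torch.Tensor,
                        alpha_t: float, alpha_next: float) -> torch.Tensor:
    # eps is the UNet's noise prediction at timestep t; alpha_t / alpha_next
    # are cumulative alphas, with alpha_next belonging to the *higher* timestep.
    x0_pred = (x_t - (1 - alpha_t) ** 0.5 * eps) / alpha_t ** 0.5
    # Re-noise the predicted clean latent toward the next timestep.
    return alpha_next ** 0.5 * x0_pred + (1 - alpha_next) ** 0.5 * eps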

► 2. Perform mask propagation

python mask_propagation.py --feature_path ./output/features/libby/inversion_feature_301.pt \
                            --mask_path ./examples/mask/libby.png \
                            --output_dir ./output

Then, you will find the mask propagation result in the ./output/mask directory.
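As a rough illustration of the point-matching idea, the sketch below propagates a first-frame mask to another frame by nearest-neighbor matching of inversion features. The shapes, the cosine-similarity criterion, and the hard argmax matching are assumptions made for illustration; see mask_propagation.py for the actual implementation.

import torch
import torch.nn.functional as F

def propagate_mask(feat_ref: torch.Tensor,  # (C, H, W) features, annotated frame
                   feat_tgt: torch.Tensor,  # (C, H, W) features, target frame
                   mask_ref: torch.Tensor   # (H, W) binary mask, annotated frame
                   ) -> torch.Tensor:
    C, H, W = feat_ref.shape
    ref = F.normalize(feat_ref.reshape(C, -1), dim=0)  # (C, HW)
    tgt = F.normalize(feat_tgt.reshape(C, -1), dim=0)  # (C, HW)
    sim = tgt.t() @ ref                  # (HW, HW) cosine similarity
    nearest = sim.argmax(dim=1)          # best-matching reference point per pixel
    return mask_ref.reshape(-1)[nearest].reshape(H, W)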

► 3. Perform inversion for the style image

python style_ddim_inv.py --style_path ./examples/style/style1.png \
                            --output_dir ./output

Then, you will find the style inversion result in the ./output/style directory.

► 4. Perform localized video style transfer

python video_style_transfer.py --inv_path ./output/content/libby/inversion \
                            --mask_path ./output/mask/libby \
                            --style_path ./output/style/style1/inversion \
                            --output_dir ./output

Then, you will find the edited result in the ./output/edit directory.
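To give a feel for the Sliding-Window Consistent Smoothing mentioned in the overview, the sketch below smooths video latents of shape (T, C, H, W) by averaging each frame's latent over a local temporal window. The window size and the plain mean are illustrative assumptions, not necessarily the exact scheme applied during denoising.

import torch

def sliding_window_smooth(latents: torch.Tensor, window: int = 3) -> torch.Tensor:
    # latents: (T, C, H, W); each frame is replaced by the mean of its window.
    T = latents.shape[0]
    half = window // 2
    smoothed = torch.empty_like(latents)
    for t in range(T):
        lo, hi = max(0, t - half), min(T, t + half + 1)
        smoothed[t] = latents[lo:hi].mean(dim=0)
    return smoothed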

🎓 Bibtex

If you find this code helpful for your research, please cite:

@article{song2024univst,
  title={UniVST: A Unified Framework for Training-free Localized Video Style Transfer},
  author={Song, Quanjian and Lin, Mingbao and Zhan, Wengyi and Yan, Shuicheng and Cao, Liujuan and Ji, Rongrong},
  journal={arXiv preprint arXiv:2410.20084},
  year={2024}
}

