
LTX-Video Image2Video LORA? #240

Closed
BlackTea-c opened this issue Jan 23, 2025 · 9 comments

Comments

@BlackTea-c

Feature request

LTX-Video Image2Video LORA?

Motivation

LTX-Video Image2Video LORA?

Your contribution

LTX-Video Image2Video LORA?

@a-r-r-o-w
Owner

Coming this weekend after some more testing!

@BlackTea-c
Author

great!

@pushchar

pushchar commented Feb 9, 2025

Hi, first of all, thanks for this amazing work!
Are there any updates on the timeline for releasing the LTX image2video finetuning scripts?

@a-r-r-o-w
Owner

@pushchar There is not really a difference in the training algorithm for LTX img2vid. It is supported in #245 and I've gotten some decent results so far, but the PR is taking longer than expected because it requires an extensive amount of testing.

first_frame_conditioning_p = 0.1
min_first_frame_sigma = 0.25

latents = latent_model_conditions.pop("latents")
latents_mean = latent_model_conditions.pop("latents_mean")
latents_std = latent_model_conditions.pop("latents_std")
latents = self._normalize_latents(latents, latents_mean, latents_std)
noise = torch.zeros_like(latents).normal_(generator=generator)

if random.random() < first_frame_conditioning_p:
    # Section 2.4 of the LTX-Video paper mentions that the first-frame timestep should be a small random value.
    # Making an estimated guess, we cap the first-frame sigma at min_first_frame_sigma (0.25).
    # torch.rand_like returns values in [0, 1). We want the first-frame sigma to be <= the actual sigmas used for
    # the remaining frames, so we rescale by multiplying with sigmas, giving a range of [0, sigmas).
    first_frame_sigma = torch.rand_like(sigmas) * sigmas
    first_frame_sigma = torch.min(first_frame_sigma, sigmas.new_full(sigmas.shape, min_first_frame_sigma))

    latents_first_frame, latents_rest = latents[:, :, :1], latents[:, :, 1:]
    noisy_latents_first_frame = FF.flow_match_xt(latents_first_frame, noise[:, :, :1], first_frame_sigma)
    noisy_latents_remaining = FF.flow_match_xt(latents_rest, noise[:, :, 1:], sigmas)
    noisy_latents = torch.cat([noisy_latents_first_frame, noisy_latents_remaining], dim=2)
else:
    noisy_latents = FF.flow_match_xt(latents, noise, sigmas)
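
For context, FF.flow_match_xt above is presumably the standard flow-matching forward interpolation between the clean latents and noise. A minimal sketch of that assumption (the function body, argument names, and broadcasting here are illustrative, not the exact finetrainers signature):

import torch

def flow_match_xt(x0: torch.Tensor, noise: torch.Tensor, sigmas: torch.Tensor) -> torch.Tensor:
    # Assumed behavior: x_t = (1 - sigma) * x_0 + sigma * noise,
    # with sigmas already broadcastable against the latent shape.
    return (1.0 - sigmas) * x0 + sigmas * noise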

@pushchar

I see, thanks!

@eisneim

eisneim commented Feb 12, 2025

@BlackTea-c @pushchar I've successfully trained an LTX image-to-video LoRA here: eisneim/ltx_lora_training_i2v_t2v

@a-r-r-o-w
Owner

@eisneim Wow, this looks amazing! Superb work :) I would love it if you were interested in collaborating on testing and adding new features/algorithms. For example, I'm also planning to add a general-purpose VideoJAM trainer as soon as I find some time after the parallel training work in #245. Until then, if you do make a trainer_videojam.py similar to our existing trainer.py, it would be super cool and could easily be made available for use across many models at once.

@eisneim

eisneim commented Feb 12, 2025

@a-r-r-o-w Cool, if I can pull off what VideoJAM is doing, I will definitely contribute back to finetrainers.
For now I'm still generating tons of optical flow data with just two RTX 4090s. My initial idea is to use LoRAs instead of adding new layers to the DiT, and to use the intermediate latents to guide the final output.
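
A minimal, hypothetical sketch of the LoRA-over-DiT idea described above, attaching peft LoRA adapters to the attention projections of the LTX transformer via diffusers. The class name, checkpoint path, rank, and target module names are assumptions for illustration, not taken from eisneim's repo:

import torch
from diffusers import LTXVideoTransformer3DModel
from peft import LoraConfig

# Hypothetical: load the LTX-Video DiT and freeze its base weights.
transformer = LTXVideoTransformer3DModel.from_pretrained(
    "Lightricks/LTX-Video", subfolder="transformer", torch_dtype=torch.bfloat16
)
transformer.requires_grad_(False)

# Attach LoRA adapters to the attention projections instead of adding new layers,
# so only the low-rank matrices are trained.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)
transformer.add_adapter(lora_config)

trainable = [p for p in transformer.parameters() if p.requires_grad]
print(f"trainable LoRA parameters: {sum(p.numel() for p in trainable):,}")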

@a-r-r-o-w
Owner

Support has now been added, with a reproducible example script and checkpoint!
