Hello! Thanks for your outstanding work! I'd like to know how to fine-tune VILA-U using my own data.
1: When multiple images need to be provided for training (given multiple images, generate text),
2: or when the task requires the model to generate images (given a prompt, generate images),
what form should the fine-tuning data take? Can these different types of tasks be trained in a single fine-tuning session?
Hello! Thanks for your fantastic open-source work! Can you provide some examples of the training data format?