[BUG] Loss in Accuracy with Paged=False with Qwen2.5_VL Vision Models on Linux #753
Open
3 tasks done
Labels
bug
OS
Linux
GPU Library
CUDA 12.x
Python version
3.12
Pytorch version
2.5.0
Model
https://huggingface.co/turboderp/Qwen2-VL-7B-Instruct-exl2/tree/6.0bpw
Describe the bug
Running exllamav2 with the Qwen2-VL-7B-Instruct-exl2-q6 model and paged=False produces answers that differ substantially from those produced with paged=True. The paged=False answers are inaccurate, mostly because they are incomplete.
This happens specifically when performing video analysis (a list of PIL images). I have not explicitly tested single-image inference.
With Paged=True:
With Paged=False:
The image is a screenshot of a computer screen displaying a code editor window with a code file open.
Same code, same prompt.
Prompt: "Describe the video concisely"
Reproduction steps
The code is identical to the example found here:
https://github.com/turboderp-org/exllamav2/blob/master/examples/multimodal_video.py
Expected behavior
Same answer (or highly similar) on paged=True and paged=False.
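To make "highly similar" concrete, here is a minimal sketch of how the two answers could be compared once both runs have been captured as strings. It uses only the standard library; the helper names and the 0.8 threshold are illustrative assumptions and are not part of exllamav2. It also assumes both runs use deterministic sampling settings, since otherwise some difference is expected regardless of the paged flag.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a word-level similarity ratio in [0, 1] for two answers."""
    # Compare lowercased word sequences so casing and spacing do not matter.
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def outputs_match(paged_out: str, unpaged_out: str, threshold: float = 0.8) -> bool:
    """True if the two answers are at least `threshold` similar.

    The threshold is an arbitrary cutoff chosen for illustration.
    """
    return similarity(paged_out, unpaged_out) >= threshold
```

Passing the paged=True answer and the paged=False answer to `outputs_match` gives a simple pass/fail check for the expectation above.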
Logs
No response
Additional context
No response
Acknowledgements