
[BUG] Loss in Accuracy with Paged=False with Qwen2.5_VL Vision Models on Linux #753

Open
3 tasks done
RaahimSiddiqi opened this issue Mar 18, 2025 · 0 comments
Labels
bug Something isn't working


OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Pytorch version

2.5.0

Model

https://huggingface.co/turboderp/Qwen2-VL-7B-Instruct-exl2/tree/6.0bpw

Describe the bug

Running exllamav2 with the Qwen2-VL-7B-Instruct-exl2 model (6.0bpw) produces a very different answer with paged=False than with paged=True. The paged=False answer is inaccurate, mostly because it is incomplete.

This happens specifically when performing video analysis (the input is a list of PIL images). I have not explicitly tested single-image inference.

With Paged=True:

The video showcases a series of screen captures from a software development environment, likely a code editor or an integrated development environment (IDE). The content includes:

1. **API Documentation**: The first part of the video displays detailed API documentation for Azure Video Indexer, focusing on operations such as "Get Video Summary" and "Create Video Summary." The documentation includes request parameters, response formats, and examples of HTTP requests and responses.

2. **Code Editor**: The second part of the video transitions to a code editor where a developer is working on a project. The code appears to be related to Azure Video Indexer, as it references the API documentation seen earlier. The developer is interacting with the code, possibly testing or implementing functionalities related to video indexing and summarization.

3. **Performance Monitoring**: The third part of the video shows a performance monitoring tool, likely within the IDE, displaying CPU usage, memory usage, and other performance metrics. This suggests that the developer is monitoring the performance of their application or code.

The video seems to be a tutorial or a demonstration of how to use Azure Video Indexer APIs within a development environment, focusing on both the API documentation and the implementation of those APIs in code

With Paged=False:

The image is a screenshot of a computer screen displaying a code editor window with a code file open.

Same code, same prompt.

Prompt: "Describe the video concisely"

Reproduction steps

The code is identical to the example script at:

https://github.com/turboderp-org/exllamav2/blob/master/examples/multimodal_video.py
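For anyone reproducing this, a small A/B harness along the following lines may help keep everything except the paging mode fixed. It is only a sketch: `describe_video` is a hypothetical wrapper (not part of exllamav2) that you would write around the example script so that it takes the paging mode as a boolean and returns the generated text.

```python
from typing import Callable, Dict


def run_both_modes(describe_video: Callable[[bool], str]) -> Dict[str, str]:
    """Call the same generation routine once per paging mode.

    `describe_video` is a hypothetical callable: it should build the
    generator with the given paging mode, run the fixed prompt on the
    fixed frames, and return the answer as a string.
    """
    results = {}
    for paged in (True, False):
        # Same callable, same prompt, same frames; only `paged` changes.
        results[f"paged={paged}"] = describe_video(paged).strip()
    return results
```

Passing a stub such as `lambda paged: "long answer" if paged else "short"` is enough to check the harness itself before plugging in the real model call.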

Expected behavior

Same answer (or highly similar) on paged=True and paged=False.
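To make "highly similar" a little more concrete, one rough option is to compare the two answers word by word. The sketch below uses only the Python standard library (`difflib`); the function names are my own, and the metric is an assumption chosen for illustration, not something exllamav2 provides or a rigorous accuracy measure.

```python
from difflib import SequenceMatcher


def answer_similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1] between two generated answers."""
    a_words = a.lower().split()
    b_words = b.lower().split()
    # ratio() is 2 * matched_words / total_words across both answers.
    return SequenceMatcher(None, a_words, b_words).ratio()


def length_ratio(a: str, b: str) -> float:
    """Word count of the shorter answer divided by that of the longer one."""
    n_a, n_b = len(a.split()), len(b.split())
    longer = max(n_a, n_b)
    if longer == 0:
        return 1.0  # both answers empty: treat as equal length
    return min(n_a, n_b) / longer
```

A low `length_ratio` together with a low `answer_similarity` would match what is reported here: the paged=False answer is a single sentence, while the paged=True answer runs to several paragraphs.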

Logs

No response

Additional context

No response

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
@RaahimSiddiqi RaahimSiddiqi added the bug Something isn't working label Mar 18, 2025