
[BUG] Bug in attention mechanism when Paged=False for Qwen2.5_VL Models #752

Open · RaahimSiddiqi opened this issue Mar 18, 2025 · 1 comment
Labels: bug (Something isn't working)

@RaahimSiddiqi
OS: Windows
GPU Library: CUDA 12.x
Python version: 3.12
Pytorch version: 2.5.0
Model: https://huggingface.co/turboderp/Qwen2-VL-7B-Instruct-exl2/tree/6.0bpw

Describe the bug

Running exllamav2 with the Qwen2-VL-7B-Instruct-exl2-q6 model and the paged=False flag results in the following error in the attention code when creating video embeddings.

  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\attn_params.py", line 197, in get_block_diag_mask
    self.block_diag_mask = labels.unsqueeze(0) == labels.unsqueeze(1).repeat(self.batch_size)
                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor

Full Traceback:

Traceback (most recent call last):
  File "C:\TFS\VIDIZMO\SOURCE\VisualGenerativeAIPY\exllama-vision.py", line 136, in <module>
    video_embedding = vision_model.get_video_embeddings(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\vlm\vision_tower.py", line 396, in get_video_embeddings
    embedding_tensor = self.process(
                       ^^^^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\vlm\vision_tower.py", line 244, in process
    hidden_states = module.forward(
                    ^^^^^^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\attn.py", line 1059, in forward
    return self.forward_torch(
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\attn.py", line 1483, in forward_torch
    attn_output = attn_func(
                  ^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\attn.py", line 886, in _attn_torch
    attn_mask_lr = attn_params.get_block_diag_mask(q_states.device)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Raahim.Siddiqi\AppData\Local\miniconda3\envs\VisualSummarizerExllama\Lib\site-packages\exllamav2\attn_params.py", line 197, in get_block_diag_mask
    self.block_diag_mask = labels.unsqueeze(0) == labels.unsqueeze(1).repeat(self.batch_size)
                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Number of dimensions of repeat dims can not be smaller than number of dimensions of tensor
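
For what it's worth, the failure is reproducible in isolation with plain PyTorch: Tensor.repeat requires at least as many repeat factors as the tensor has dimensions, and labels.unsqueeze(1) is 2-D. A minimal sketch, assuming labels is a 1-D tensor of per-position block labels as the surrounding mask code suggests (the values below are made up, and the two-factor repeat is only one way to avoid the error, not necessarily the actual fix):

    import torch

    # Illustrative block labels, one per token position (not real data).
    labels = torch.tensor([0, 0, 0, 1, 1, 2])
    batch_size = 1

    # labels.unsqueeze(1) has shape (6, 1), i.e. two dimensions, but
    # repeat() receives a single factor, which raises the error above:
    try:
        labels.unsqueeze(0) == labels.unsqueeze(1).repeat(batch_size)
    except RuntimeError as e:
        print(e)  # "Number of dimensions of repeat dims can not be smaller ..."

    # One repeat factor per dimension avoids the error; the comparison then
    # broadcasts (1, 6) against (6, 1) into the (6, 6) block-diagonal mask,
    # True wherever two positions share the same label:
    mask = labels.unsqueeze(0) == labels.unsqueeze(1).repeat(batch_size, 1)
    print(mask)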

ExllamaV2 Version: 0.2.8+cu121.torch2.5.0

GPU: RTX 3090

Nvidia Driver: 566.14

Reproduction steps

To ensure that there was no error on my part, I tested the code example available in the GitHub repo.

https://github.com/turboderp-org/exllamav2/blob/master/examples/multimodal_video.py

The only change I made to that code was the location of the image files. The issue can be reproduced by running that code, as sketched below.
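
For reference, paged=False is passed when constructing the dynamic generator. A minimal sketch of the relevant setup, following the pattern of that example (class and argument names recalled from the repo's examples, so treat the exact signatures as assumptions):

    from exllamav2 import (
        ExLlamaV2,
        ExLlamaV2Config,
        ExLlamaV2Cache,
        ExLlamaV2Tokenizer,
    )
    from exllamav2.generator import ExLlamaV2DynamicGenerator

    # Local directory holding the downloaded exl2 quant (path is illustrative).
    config = ExLlamaV2Config("/path/to/Qwen2-VL-7B-Instruct-exl2")
    model = ExLlamaV2(config)
    cache = ExLlamaV2Cache(model, lazy=True)
    model.load_autosplit(cache)
    tokenizer = ExLlamaV2Tokenizer(config)

    # paged=False disables paged attention (needed e.g. on Windows without
    # flash-attn); this is the flag that selects the failing code path.
    generator = ExLlamaV2DynamicGenerator(
        model = model,
        cache = cache,
        tokenizer = tokenizer,
        paged = False,
    )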

Expected behavior

I have tested the same code on Linux (WSL), so I know the code is fine. I expect it to describe the video I'm giving it (in the form of a list of PIL images).

Logs: No response

Additional context: No response

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
RaahimSiddiqi added the bug label on Mar 18, 2025
@RaahimSiddiqi (Author)

@turboderp Any chance this is the fix for this issue?

2e630ae
