Issues: turboderp-org/exllamav2
Issues list
[BUG] Loss in Accuracy with Paged=False with Qwen2.5_VL Vision Models on Linux
bug
Something isn't working
#753
opened Mar 18, 2025 by
RaahimSiddiqi
3 tasks done
[BUG] Bug in attention mechanism when Paged=False for Qwen2.5_VL Models
bug
#752
opened Mar 18, 2025 by
RaahimSiddiqi
3 tasks done
[REQUEST] It is very difficult to serve exllamav2 through a RESTful API.
#748
opened Mar 12, 2025 by
nalgae
[REQUEST] Support for the new Aya-Vision 32B models
#746
opened Mar 9, 2025 by
GoudaCouda
3 tasks done
[BUG] Qwen-vl can't produce coordinates
bug
#740
opened Feb 22, 2025 by
Tedy50
3 tasks done
[BUG] Significant prompt processing speed difference when using Tensor Parallelism
bug
#734
opened Feb 16, 2025 by
ThomasBaruzier
3 tasks done
[BUG] When trying inference with Qwen2.5-VL-72B with Qwen2.5-VL-7B as a draft model, I get "IndexError: index out of range in self" (both models have identical vocab.json)
bug
#733
opened Feb 6, 2025 by
Lissanro
3 tasks done
[BUG] Exception in ASGI application when trying inference with an image with Qwen2.5-VL-72B
bug
#732
opened Feb 5, 2025 by
Lissanro
3 tasks done
[BUG] Mistral-Small-24B-Instruct-2501 - Tensor Parallel outputs garbled text.
bug
#728
opened Jan 31, 2025 by
mindkrypted
3 tasks done
[BUG] 2080ti can't quant a 12B model
bug
#725
opened Jan 28, 2025 by
frenzybiscuit
3 tasks done
[REQUEST] Support new SOTA vision model: Qwen 2.5 VL (3B, 7B, 72B)
#724
opened Jan 27, 2025 by
ThomasBaruzier
3 tasks done
[REQUEST] Support x-grammar structured output framework integration
#723
opened Jan 27, 2025 by
debasish-mihup
3 tasks done
[REQUEST] GraniteMoeForCausalLM architecture support
#722
opened Jan 27, 2025 by
cal066
3 tasks done
[REQUEST] Add support for Blackwell B200 with CUDA 12.8
#721
opened Jan 25, 2025 by
ofirkris
3 tasks done
[BUG] LoRA fails to load or run inference with tensor parallel.
bug
#719
opened Jan 22, 2025 by
Ph0rk0z
3 tasks done