Issues: vllm-project/llm-compressor
AWQ Qwen3-235B-A22B and Qwen3-30B-A3B
bug
Something isn't working
#1406
opened May 1, 2025 by
ehartford
Error when computing device_map for Mistral-small-3.1-24B-Instruct-2503
bug
Something isn't working
#1403
opened Apr 30, 2025 by
VAmblardPEReN
For FP8 Fused MoE layers, only per-tensor scales for weights and activations are supported?
bug
Something isn't working
#1393
opened Apr 28, 2025 by
shuxiaobo
AWQ -- Clean up forward passes with kwargs using inspect.bind
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1385
opened Apr 25, 2025 by
brian-dellabetta
Load the model to CPU but quantize using the GPU
enhancement
New feature or request
#1383
opened Apr 25, 2025 by
sgsdxzy
Is there any way to quantize a model on multiple nodes?
enhancement
New feature or request
#1382
opened Apr 25, 2025 by
shuxiaobo
Running vllm after oneshot causes rerun of oneshot
bug
Something isn't working
#1358
opened Apr 16, 2025 by
brian-dellabetta
Extend e2e tests to add asym support for W8A8-Int8
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1344
opened Apr 10, 2025 by
dsikka
[Request] Update vision model examples
enhancement
New feature or request
#1327
opened Apr 5, 2025 by
T145
Recipe.model_dump() output is not valid input for Recipe.model_validate(...)
bug
Something isn't working
#1319
opened Apr 2, 2025 by
rahul-tuli
[Gemma3] - FP8 Dynamic KeyError: 'vision_model.encoder.layers.0.mlp.fc1.weight_scale'
bug
Something isn't working
#1306
opened Apr 1, 2025 by
m4r1k
[Gemma3] - oneshot doesn't output preprocessor_config.json & processor_config.json
bug
Something isn't working
#1305
opened Apr 1, 2025 by
m4r1k
Can't quantize kv cache: observer = self.k_observers[layer_idx] list index out of range
bug
Something isn't working
#1295
opened Mar 28, 2025 by
DreamGenX
Has anyone successfully quantized Deepseek-R1 to w8a8?
question
Further information is requested
#1274
opened Mar 21, 2025 by
taishan1994
Prebuilt Docker images?
question
Further information is requested
#1252
opened Mar 13, 2025 by
shensimeteor
Hope for tensor parallel support
enhancement
New feature or request
#1249
opened Mar 13, 2025 by
Arcmoon-Hu
Support gemma3
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1248
opened Mar 13, 2025 by
zf-inworld
Uses 1.4 TB of CPU memory but only 300 MB of GPU memory
bug
Something isn't working
#1240
opened Mar 11, 2025 by
mmdbhs