Commit c5d5535

[Bugfix] fix for deepseek w4a16 (#8906)
Co-authored-by: mgoin <michael@neuralmagic.com>
1 parent 172d1cd commit c5d5535

1 file changed, +5 -4 lines changed

vllm/model_executor/layers/quantization/kernels/marlin.py

Lines changed: 5 additions & 4 deletions
@@ -38,10 +38,11 @@ def can_implement(cls,
                     "Marlin, supported group sizes are: "\
                     f"{MARLIN_SUPPORTED_GROUP_SIZES}"
 
-        return check_marlin_supports_shape(c.partition_weight_shape[0],
-                                           c.partition_weight_shape[1],
-                                           c.full_weight_shape[1],
-                                           c.group_size)
+        return check_marlin_supports_shape(
+            c.partition_weight_shape[1],  # out_features
+            c.partition_weight_shape[0],  # in_features
+            c.full_weight_shape[0],  # in_features
+            c.group_size)
 
     # note assumes that
     #  `weight_packed` is: {input_dim = 0, output_dim = 1, packed_dim = 0}
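
Note on the change: the weight shape is indexed as (in_features, out_features), so the old call passed the input size where the output size was expected and vice versa. The sketch below is a rough illustration of why that matters, not the vLLM code. `supports_shape`, the tile constants `MIN_THREAD_N` and `MIN_THREAD_K`, and the example shape are all assumptions made up for this note; the real check is `check_marlin_supports_shape` and may differ in its exact rules.

```python
from typing import Optional, Tuple

# Assumed tile sizes, for illustration only.
MIN_THREAD_N = 64    # assumed required divisor of the output dimension
MIN_THREAD_K = 128   # assumed required divisor of the input dimension


def supports_shape(out_per_partition: int, in_per_partition: int,
                   in_full: int, group_size: int) -> Tuple[bool, Optional[str]]:
    """Hypothetical stand-in for the shape check; not the vLLM implementation."""
    if out_per_partition % MIN_THREAD_N != 0:
        return False, f"out_features={out_per_partition} not divisible by {MIN_THREAD_N}"
    if in_per_partition % MIN_THREAD_K != 0:
        return False, f"in_features={in_per_partition} not divisible by {MIN_THREAD_K}"
    if group_size < in_full and in_per_partition % group_size != 0:
        return False, f"in_features={in_per_partition} not divisible by group_size={group_size}"
    return True, None


# Weight laid out as (in_features, out_features): input_dim = 0, output_dim = 1.
# The shape is a made-up example, not taken from any particular model.
partition_weight_shape = (2048, 192)
full_weight_shape = (2048, 192)
group_size = 128

# Argument order before the fix: input and output sizes are swapped.
old_ok, old_reason = supports_shape(partition_weight_shape[0],
                                    partition_weight_shape[1],
                                    full_weight_shape[1],
                                    group_size)

# Argument order after the fix: out_features, in_features, full in_features.
new_ok, new_reason = supports_shape(partition_weight_shape[1],
                                    partition_weight_shape[0],
                                    full_weight_shape[0],
                                    group_size)

print("old order:", old_ok, old_reason)
print("new order:", new_ok, new_reason)
```

Under these assumed constraints, a non-square layer such as 2048 x 192 is rejected with the swapped arguments but accepted with the corrected order. Square layers would not show the difference, which may be why the swap went unnoticed.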
