Commit 47c2577

tlrmchlsmth authored and Erkin Sagiroglu committed
[Bugfix] Fix divide by zero when serving Mamba models (vllm-project#9617)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by: Erkin Sagiroglu <erkin@infra-aipipeline-1-at1-prox-prod-a.ipa.corp.telnyx.com>
1 parent 277d5f6 commit 47c2577

File tree

1 file changed: +2 −2 lines changed


vllm/engine/llm_engine.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -1612,15 +1612,15 @@ def _get_stats(self,
         # KV Cache Usage in %
         num_total_gpu = self.cache_config.num_gpu_blocks
         gpu_cache_usage_sys = 0.
-        if num_total_gpu is not None:
+        if num_total_gpu:  # Guard against both None and 0
             num_free_gpu = sum(
                 scheduler.block_manager.get_num_free_gpu_blocks()
                 for scheduler in self.scheduler)
             gpu_cache_usage_sys = 1.0 - (num_free_gpu / num_total_gpu)

         num_total_cpu = self.cache_config.num_cpu_blocks
         cpu_cache_usage_sys = 0.
-        if num_total_cpu is not None and num_total_cpu > 0:
+        if num_total_cpu:  # Guard against both None and 0
             num_free_cpu = sum(
                 scheduler.block_manager.get_num_free_cpu_blocks()
                 for scheduler in self.scheduler)
```
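The bug being fixed: models with no KV cache (e.g. Mamba) can report `num_gpu_blocks == 0`, and the old guard `if num_total_gpu is not None:` let that value through to the division, raising `ZeroDivisionError`. A truthiness check rejects both `None` and `0`. Here is a minimal standalone sketch of that guard (not vLLM's actual code; the helper name is made up for illustration):

```python
# Hypothetical helper mirroring the guard in this commit: compute
# fractional cache usage only when the total block count is truthy.

def cache_usage(num_total: "int | None", num_free: int) -> float:
    """Return fractional cache usage, or 0.0 when no cache is allocated."""
    usage = 0.0
    # `if num_total is not None` alone would let num_total == 0 through
    # and divide by zero below; truthiness guards against None AND 0.
    if num_total:
        usage = 1.0 - (num_free / num_total)
    return usage

print(cache_usage(None, 0))   # cache not yet configured -> 0.0
print(cache_usage(0, 0))      # Mamba-style model, zero KV-cache blocks -> 0.0
print(cache_usage(100, 25))   # 75 of 100 blocks in use -> 0.75
```

The same pattern is applied twice in the diff, once for GPU blocks and once for CPU blocks; the CPU branch already had an explicit `> 0` check, and the fix just makes both sides use the shorter idiom.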
