
[Bug]: V0 Scheduler is incompatible with the newest KVCacheManager interface in vllm main branch code #861


Open
gawainx opened this issue May 14, 2025 · 1 comment
Labels
bug Something isn't working

Comments


gawainx commented May 14, 2025

Your current environment

The output of `python collect_env.py`
vllm-ascend: main branch
vllm: main branch

🐛 Describe the bug

First, I apologize for omitting some environment details; we found this bug in our private server environment.
In short: in vllm-ascend, we use an additional config to enable the V0-style scheduler for better performance. However, after installing the newest vllm code (d066e52013be278c7a3bc54ec9799d8457895f4d) and vllm-ascend 218f21d..68fb634, we encountered errors when handling requests, such as:

Runtime Error: object of type KVCacheBlocks has no len()  

What happened?

The root cause is that the vllm project recently changed the KVCacheManager interface in the following ways (details can be found at this PR):

  • Introduce KVCacheBlocks
  • get_computed_blocks method returns tuple[KVCacheBlocks, int] instead of Tuple[List[BlockHashType], int]
  • allocate_slots has one extra arg named num_new_computed_tokens and returns Optional[KVCacheBlocks]
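
The mismatch described above can be illustrated with a minimal, self-contained sketch. It does not import vllm: the `KVCacheBlocks` class below is a hypothetical stand-in, and its `blocks` attribute name is an assumption, as is the helper `num_blocks`. The sketch only shows why a caller written for the old list-based return value fails with a "has no len()" error, and one possible way a V0-style scheduler could tolerate both return types.

```python
from dataclasses import dataclass, field


@dataclass
class KVCacheBlocks:
    """Hypothetical stand-in for the wrapper type named in this issue."""
    blocks: list = field(default_factory=list)  # assumed attribute name


def num_blocks(result) -> int:
    """Count blocks for either a plain list (old interface) or a
    KVCacheBlocks-style wrapper (new interface). Hypothetical helper."""
    if isinstance(result, KVCacheBlocks):
        return len(result.blocks)
    return len(result)


new_style = KVCacheBlocks(blocks=["b0", "b1", "b2"])

# A caller written for the old interface calls len() directly,
# which fails because the wrapper does not define __len__.
try:
    len(new_style)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(num_blocks(new_style), num_blocks(["b0", "b1"]))
```

Unwrapping at the call site keeps the scheduler logic unchanged. Whether the actual fix should adapt the scheduler or follow the new interface throughout is a design decision for the maintainers.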
gawainx added the bug label on May 14, 2025
@wangxiyuan
Collaborator

Yes, it's confirmed. Let's fix this before the next release. Thanks for bringing this issue up.
