[v1] Move block management logic from KVCacheManager to SpecializedManager #17474

heheda12345 · 2025-04-30T13:46:41Z

Should be merged after #17398

To prepare for the hybrid allocator, this PR moves the logic that needs to run for each specialized manager from KVCacheManager to SpecializedManager. Since SpecializedManager now contains more than the customized logic for different attention types, I renamed it to SingleTypeKVCacheManager.
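As a rough illustration of the split described above, here is a minimal sketch in which a top-level manager only orchestrates and a per-type manager owns the block bookkeeping. The class names `SingleTypeManagerSketch` and `KVCacheManagerSketch`, their methods, and the integer block IDs are all hypothetical and are not the vLLM implementation.

```python
class SingleTypeManagerSketch:
    """Hypothetical stand-in for a per-attention-type manager."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size
        # Maps a request ID to the block IDs it currently holds.
        self.req_to_blocks: dict[str, list[int]] = {}
        self._next_block_id = 0

    def allocate_new_blocks(self, request_id: str, num_tokens: int) -> list[int]:
        """Grow the request's block list so that it covers num_tokens tokens."""
        req_blocks = self.req_to_blocks.setdefault(request_id, [])
        num_required = -(-num_tokens // self.block_size)  # ceiling division
        num_new = max(num_required - len(req_blocks), 0)
        new_blocks = list(range(self._next_block_id, self._next_block_id + num_new))
        self._next_block_id += num_new
        req_blocks.extend(new_blocks)
        return new_blocks

    def free(self, request_id: str) -> list[int]:
        """Release and return every block held by the request."""
        return self.req_to_blocks.pop(request_id, [])


class KVCacheManagerSketch:
    """Hypothetical top-level manager that delegates block management."""

    def __init__(self, block_size: int) -> None:
        self.single_type_manager = SingleTypeManagerSketch(block_size)

    def allocate_slots(self, request_id: str, num_tokens: int) -> list[int]:
        return self.single_type_manager.allocate_new_blocks(request_id, num_tokens)

    def free(self, request_id: str) -> list[int]:
        return self.single_type_manager.free(request_id)


manager = KVCacheManagerSketch(block_size=16)
first = manager.allocate_slots("req-0", 20)   # needs 2 blocks
second = manager.allocate_slots("req-0", 40)  # needs 1 more block
```

With this shape, a hybrid allocator could in principle hold several per-type managers behind the same top-level interface; that extension is not shown here.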

I would prefer to rename specialized_manager.py in a separate PR.

I didn't move the hashing logic (e.g. req_to_block_hashes) to SpecializedManager, because the HybridAllocator will do hashing at the KVCacheManager level so that different managers with the same block_size can use the same block_hash.
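To make the motivation for keeping hashing at the top level concrete, here is a minimal sketch in which block hashes are computed once per (request, block_size) pair and reused by any manager with that block size. The chained SHA-256 scheme, the `TopLevelHasherSketch` class, and all method names are assumptions for illustration only; they do not reflect vLLM's actual hashing code.

```python
import hashlib


def hash_full_blocks(token_ids: list[int], block_size: int) -> list[str]:
    """Chain-hash every full block of tokens (illustrative scheme only)."""
    hashes: list[str] = []
    parent = ""
    num_full_blocks = len(token_ids) // block_size
    for i in range(num_full_blocks):
        block = token_ids[i * block_size:(i + 1) * block_size]
        # Each hash depends on the previous one, so it identifies a prefix.
        payload = parent + "|" + ",".join(map(str, block))
        parent = hashlib.sha256(payload.encode()).hexdigest()
        hashes.append(parent)
    return hashes


class TopLevelHasherSketch:
    """Hypothetical top-level cache of block hashes keyed by block size."""

    def __init__(self) -> None:
        self._cache: dict[tuple[str, int], list[str]] = {}
        self.num_hash_passes = 0  # counts how often hashing actually ran

    def get_block_hashes(self, request_id: str, token_ids: list[int],
                         block_size: int) -> list[str]:
        key = (request_id, block_size)
        if key not in self._cache:
            self.num_hash_passes += 1
            self._cache[key] = hash_full_blocks(token_ids, block_size)
        return self._cache[key]


hasher = TopLevelHasherSketch()
tokens = list(range(40))
# Two managers with block_size=16 share one hashing pass.
hashes_a = hasher.get_block_hashes("req-0", tokens, 16)
hashes_b = hasher.get_block_hashes("req-0", tokens, 16)
# A manager with a different block size needs its own hashes.
hashes_c = hasher.get_block_hashes("req-0", tokens, 8)
```

If each per-type manager hashed on its own, managers with identical block sizes would repeat the same work and store duplicate hash lists.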

Split from #16101

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

github-actions · 2025-04-30T13:46:51Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

mergify · 2025-04-30T14:45:45Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @heheda12345.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


WoosukKwon · 2025-05-06T06:56:28Z

QQ: Can we use a different term instead of "manager" for the single-type managers? It's a bit confusing since they're lower-level than the KV cache manager, but still called managers.

heheda12345 · 2025-05-06T07:45:33Z

What about "SingleTypeKVCacheController"? (I don't want to call it "allocator" as it is much more complex than simple allocation / deallocation.)


WoosukKwon · 2025-05-06T08:58:55Z

> SingleTypeKVCacheController

Doesn't sound like a better name 😅 OK, let's keep "manager" and brainstorm whether there's a better option than "SingleTypeManager".


mergify · 2025-05-06T17:21:20Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @heheda12345.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


WoosukKwon

@heheda12345 Thanks for the PR. The code is very clean and well organized. Sorry for the delay in my review. I left some minor comments on style. Please check them out.


WoosukKwon · 2025-05-09T03:23:45Z

vllm/v1/core/specialized_manager.py

+            num_new_blocks = min(num_new_blocks,
+                                 self.max_num_blocks_per_req - len(req_blocks))
+            assert num_new_blocks > 0


nit: Can num_new_blocks be 0?

Good catch! I've fixed it. This check was copied from the previous kv_cache_manager. num_new_blocks can be 0 when EAGLE is enabled (e.g., when running examples/offline_inference/eagle.py with max_model_len=48).

9a1bc1d
Can you help double-check the correctness of this commit? I think it is cleaner to compute num_tokens_need_slot = min(****, self.max_model_len) before calling allocate_new_blocks than to apply min here.

@heheda12345 Ah yes lgtm! Nice refactoring!
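The exchange above can be illustrated with a small sketch of why the block count may legitimately be 0 once the token count is clamped before allocation. The function `num_new_blocks_needed`, its parameters, and the block size of 16 are hypothetical; in particular, `num_tokens_requested` is only a placeholder for the expression elided as `****` in the comment, not a guess at what it was.

```python
def cdiv(a: int, b: int) -> int:
    """Ceiling division for non-negative a and positive b."""
    return -(-a // b)


def num_new_blocks_needed(num_tokens_requested: int, max_model_len: int,
                          block_size: int, num_allocated_blocks: int) -> int:
    """Return how many extra blocks a request needs (hypothetical helper).

    The token count is clamped to max_model_len first, so the caller never
    asks for slots beyond the model's maximum length.
    """
    num_tokens_need_slot = min(num_tokens_requested, max_model_len)
    num_required_blocks = cdiv(num_tokens_need_slot, block_size)
    num_new_blocks = num_required_blocks - num_allocated_blocks
    # 0 is a valid result: the request may already hold every block it can use.
    return max(num_new_blocks, 0)


# A request that already holds 3 blocks of 16 tokens with max_model_len=48:
# extra lookahead tokens push the requested count past the limit, the clamp
# brings it back to 48, and no new block is needed.
no_growth = num_new_blocks_needed(50, 48, 16, 3)
# A request still below the limit grows by one block.
one_block = num_new_blocks_needed(40, 48, 16, 2)
```

Under these assumptions, an `assert num_new_blocks > 0` check would fail in the first case, which is consistent with relaxing or removing it.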


heheda12345 added 6 commits April 29, 2025 09:16

save

d56c0ad

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

fix test

0f89a0e

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

fix test

b9ac046

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

move more logic to specialized manager

ee436d5

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

upadte interface

9deb3ff

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

update

64b1dca

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

heheda12345 requested review from WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners April 30, 2025 13:46

mergify bot added the v1 label Apr 30, 2025

mergify bot added the needs-rebase label Apr 30, 2025

Merge branch 'main' of github.com:vllm-project/vllm into update_speci…

8e985ee

…alized_manager Signed-off-by: Chen Zhang <zhangch99@outlook.com>

mergify bot removed the needs-rebase label May 1, 2025

WoosukKwon reviewed May 6, 2025

vllm/v1/core/kv_cache_manager.py (outdated)

vllm/v1/core/specialized_manager.py (outdated)

vllm/v1/core/specialized_manager.py (outdated)

WoosukKwon added the ready label May 6, 2025

fix nits

ceed828

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

mergify bot added the needs-rebase label May 6, 2025

heheda12345 added 2 commits May 6, 2025 10:24

move override

a9d0ebc

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

Merge branch 'main' of github.com:vllm-project/vllm into update_speci…

f31f7a8

…alized_manager Signed-off-by: Chen Zhang <zhangch99@outlook.com>

mergify bot removed the needs-rebase label May 6, 2025

fix

c86cd13

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

WoosukKwon reviewed May 8, 2025

vllm/v1/core/specialized_manager.py (outdated)

WoosukKwon reviewed May 8, 2025

vllm/v1/core/specialized_manager.py (outdated)

WoosukKwon approved these changes May 9, 2025


heheda12345 added 3 commits May 9, 2025 04:50

fix nits

5c9549a

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

>=0

3e5dcc1

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

clean up

9a1bc1d

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

WoosukKwon enabled auto-merge (squash) May 9, 2025 15:24

WoosukKwon merged commit 200da9a into vllm-project:main May 9, 2025
50 checks passed

heheda12345 mentioned this pull request May 10, 2025

[v1] Rename specialized_manager.py to single_type_kv_cache_manager.py #17946

Merged

mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025

[v1] Move block management logic from KVCacheManager to SpecializedMa…

c530f82

…nager (vllm-project#17474) Signed-off-by: Chen Zhang <zhangch99@outlook.com>
