Implements the attention kernel with vertical and slash sparse pattern described in Appendix C.4.2 of https://arxiv.org/abs/2407.02490 (as sparse_attn_func) #33

Merged
LucasWilkinson merged 12 commits into vllm-project:main from minminsun:main
Jan 15, 2025

Commits

Commits on Jan 9, 2025

Commits on Jan 14, 2025

Commits on Jan 15, 2025