Implements the attention kernel with vertical and slash sparse pattern described in Appendix C.4.2 of https://arxiv.org/abs/2407.02490 (as sparse_attn_func) #33
Merged
LucasWilkinson merged 12 commits into vllm-project:main on Jan 15, 2025
Commits
Commits on Jan 9, 2025