I tested a fixed sequence of prompts that needs a context of almost 4k tokens on a Mistral 7B Instruct model, with only one slot and no KV-cache defragmentation.
With a 4k context size, the sequence takes 3 minutes. With a 32k context size, the same sequence takes 11 minutes.
Why does an increased but unused context size have such an impact on performance?