diff --git a/docs/source/design/v1/metrics.md b/docs/source/design/v1/metrics.md
index 7e7c8b925e2..de802265537 100644
--- a/docs/source/design/v1/metrics.md
+++ b/docs/source/design/v1/metrics.md
@@ -415,8 +415,8 @@ The discussion in about adding prefix cache metrics yielded
 some interesting points which may be relevant to how we approach
 future metrics.
 
-Every time the prefix cache is queried, we record the number of blocks
-queried and the number of queried blocks present in the cache
+Every time the prefix cache is queried, we record the number of tokens
+queried and the number of queried tokens present in the cache
 (i.e. hits). However, the metric of interest is the hit rate - i.e. the
 number of