Commit 9992c50

fix: Fix speculative decoding
1 parent 65222bc commit 9992c50

1 file changed: +3 -1 lines changed

llama_cpp/llama.py

Lines changed: 3 additions & 1 deletion
@@ -807,8 +807,10 @@ def sample(
                 grammar=grammar,
             )
 
+        ridx = idx - self.n_tokens if idx is not None else -1
+
         assert self.ctx is not None
-        token = self._sampler.sample(self._ctx, -1)
+        token = self._sampler.sample(self._ctx, ridx)
         if tmp_sampler:
             self._sampler = None
         return token
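
The index arithmetic in the added `ridx` line can be sketched in isolation. This is only an illustration, assuming `idx` is an absolute position among the `n_tokens` evaluated so far and that the sampler reads a negative index as counting back from the last logits row, much like a Python list. The helper name `relative_index` and the sample values are hypothetical and not part of `llama_cpp`.

```python
from typing import Optional


def relative_index(idx: Optional[int], n_tokens: int) -> int:
    """Mirror the expression from the diff as a standalone function.

    Hypothetical helper: returns -1 (the last position) when no index is
    given, otherwise the offset of `idx` measured back from `n_tokens`.
    """
    return idx - n_tokens if idx is not None else -1


# Stand-in for per-position logits; a plain list is enough to show indexing.
rows = ["row0", "row1", "row2", "row3", "row4"]
n_tokens = len(rows)

# No explicit index: fall back to the last position, as before the fix.
assert relative_index(None, n_tokens) == -1

# An absolute index maps to the same row when used as a negative offset.
for idx in range(n_tokens):
    assert rows[relative_index(idx, n_tokens)] == rows[idx]
```

Under these assumptions, a caller that passes `idx` gets the row for that position rather than always the last one, which is the behavior the old hard-coded `-1` could not provide.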
