Commit 83bfb70

Bindings: Fix starting rewind cache position
Set the starting rewind cache position to the number of KV cache cells currently in use instead of 0. If a rewind occurs right when generation starts, before any match result of NO has been hit, the existing state is preserved and the KV cache isn't reset.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
1 parent a8c7de8 commit 83bfb70

1 file changed, +1 -1 lines changed


bindings/binding.cpp

@@ -880,7 +880,7 @@ const char* InferToReadbackBuffer(
     auto [newTokenId, isEnd] = gen(firstBatch, sampler);
 
     // Extra samplers - Banned strings
-    int rewindPos = 0;
+    int rewindPos = llama_get_kv_cache_used_cells(context);
     int rewindTokenId = 0;
     int tokenCount = 0;
     int rewindTokenCount = 0;
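
The effect of the changed initial value can be illustrated with a small standalone sketch. It does not use llama.cpp: `MockKvCache`, `truncateTo`, and `promptCellsAfterRewind` are hypothetical names introduced only for this illustration, and the sketch assumes that a rewind discards every cache cell at or after `rewindPos`. Under that assumption, starting at 0 throws away the prompt's cells, while starting at the used-cell count keeps them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the KV cache: one entry per cached token.
// This is NOT the llama.cpp data structure, only a model of "cells in use".
struct MockKvCache {
    std::vector<int> cells;

    int usedCells() const { return static_cast<int>(cells.size()); }

    // Assumed rewind behavior: drop every cell at or after `pos`.
    void truncateTo(int pos) {
        assert(pos >= 0 && pos <= usedCells());
        cells.resize(static_cast<std::size_t>(pos));
    }
};

// Simulates: the prompt is processed, some tokens are generated, and a rewind
// fires before rewindPos has ever been updated.
// Returns how many prompt cells are still cached afterwards.
int promptCellsAfterRewind(int promptTokens, int generatedTokens,
                           bool startAtUsedCells) {
    MockKvCache cache;
    cache.cells.assign(static_cast<std::size_t>(promptTokens), 0);  // prompt cells

    // Old behavior: start at 0. New behavior: start at the used-cell count.
    int rewindPos = startAtUsedCells ? cache.usedCells() : 0;

    for (int i = 0; i < generatedTokens; ++i) {
        cache.cells.push_back(1);  // generated-token cells
    }

    cache.truncateTo(rewindPos);  // the rewind itself
    return std::min(cache.usedCells(), promptTokens);
}
```

Calling `promptCellsAfterRewind(5, 3, false)` models the old initial value and leaves no prompt cells cached; calling it with `true` models the new initial value and leaves all five prompt cells in place.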
