Is it possible to preprocess a set of prompts, save them to disk, and then reload them during inference, to avoid regenerating cached prompts every time? For instance, if I have a 2000-token prompt that I use daily in a memory-intensive Python program, is there a way to preprocess and save it so the program doesn't have to re-ingest the prompt on every start? What are the options in this scenario?

Replies: 1 comment

Not sure if you're still interested in the issue, but LlamaDiskCache does the job, even though it currently has a slight bug.
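If it helps, here is a minimal sketch of that approach, assuming llama-cpp-python's `Llama.set_cache()` together with `LlamaDiskCache`; the model path, cache directory, and prompt file are placeholders:

```python
from llama_cpp import Llama, LlamaDiskCache

# Placeholder model path; n_ctx raised so a ~2000-token prompt fits.
llm = Llama(model_path="./models/model.gguf", n_ctx=4096)

# Persist evaluated prompt states on disk, keyed by token prefix,
# so they survive across program runs.
llm.set_cache(LlamaDiskCache(cache_dir="./prompt_cache"))

with open("daily_prompt.txt") as f:  # the ~2000-token prompt
    long_prompt = f.read()

# First run: the full prompt is evaluated and the resulting model state
# is written to the disk cache. Later runs: the longest matching token
# prefix is restored from disk, so only new tokens are evaluated.
output = llm(long_prompt, max_tokens=128)
print(output["choices"][0]["text"])
```

The lookup is by longest token-prefix match, so prompts that share a common prefix should also be able to reuse the saved state.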
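If you'd rather manage the files yourself, llama-cpp-python also exposes `Llama.save_state()` and `Llama.load_state()`. A rough sketch of that option, assuming the returned state object pickles cleanly (file and model paths are placeholders):

```python
import os
import pickle

from llama_cpp import Llama

STATE_FILE = "prompt_state.pkl"  # placeholder file name
with open("daily_prompt.txt") as f:
    prompt = f.read()

llm = Llama(model_path="./models/model.gguf", n_ctx=4096)

if os.path.exists(STATE_FILE):
    # Restore the model state captured after a previous prompt ingestion.
    with open(STATE_FILE, "rb") as f:
        llm.load_state(pickle.load(f))
else:
    # Ingest the long prompt once, then snapshot the state to disk.
    llm(prompt, max_tokens=1)
    with open(STATE_FILE, "wb") as f:
        pickle.dump(llm.save_state(), f)

# Calls that start with the same prompt should prefix-match the restored
# state, so the 2000-token prompt is not re-evaluated.
output = llm(prompt, max_tokens=128)
print(output["choices"][0]["text"])
```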