-
Long story short, yes. Most documents would overflow the context window just by themselves, so "traditional" approaches (pasting the whole document into the prompt) are out. Typically this is done with vector storage, which is well out of scope for llama.cpp. Check out PrivateGPT: https://github.com/imartinez/privateGPT
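To make the vector-storage idea concrete, here is a minimal sketch of the pipeline: chunk the document, embed each chunk, and retrieve only the chunks most similar to the question so they fit in the context window. It assumes the llama-cpp-python bindings and a local GGUF embedding model; the model and input file paths are placeholders.

```python
# Sketch: chunk a document, embed each chunk, retrieve the closest chunks
# for a question, and use only those as prompt context.
# Assumes llama-cpp-python is installed; paths below are placeholders.
import numpy as np
from llama_cpp import Llama

llm = Llama(model_path="./models/your-model.gguf", embedding=True)

def embed(text: str) -> np.ndarray:
    out = llm.create_embedding(text)
    return np.array(out["data"][0]["embedding"])

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; real pipelines split on document structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

document = open("manual.txt").read()          # placeholder input file
chunks = chunk(document)
index = np.stack([embed(c) for c in chunks])  # the in-memory "vector store"

def retrieve(question: str, k: int = 3) -> list[str]:
    q = embed(question)
    # Cosine similarity between the question and every chunk.
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(scores)[-k:][::-1]]

context = "\n".join(retrieve("How do I reset the device?"))
print(context)  # paste this ahead of the question in the prompt
```

Projects like PrivateGPT wrap this same loop in a real vector database and document loaders instead of an in-memory numpy array, but the retrieval principle is identical.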
-
What you're looking for is RAG (retrieval-augmented generation). The PrivateGPT project mentioned above has examples, and there is also LangChain's data-connection documentation: https://python.langchain.com/docs/modules/data_connection/
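For comparison, here is a hedged sketch of the same RAG flow built from LangChain's pieces, roughly following the data-connection docs linked above. LangChain's module paths and class names shift between releases, so treat the imports, model path, and input file as illustrative assumptions rather than a pinned recipe.

```python
# Sketch of RAG over a PDF with LangChain; imports reflect the classic
# LangChain layout and may need adjusting for newer releases.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import LlamaCpp
from langchain.chains import RetrievalQA

# Load a PDF and split it into overlapping chunks ("report.pdf" is a placeholder).
docs = PyPDFLoader("report.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Embed the chunks into a local FAISS vector store.
store = FAISS.from_documents(chunks, HuggingFaceEmbeddings())

# A llama.cpp model behind LangChain's wrapper (model path is a placeholder).
llm = LlamaCpp(model_path="./models/your-model.gguf", n_ctx=2048)

# Retrieve relevant chunks and stuff them into the prompt with the question.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.run("What does the report say about Q3 revenue?"))
```

The `RetrievalQA` chain "stuffs" the retrieved chunks into the prompt ahead of the question, which is the same pattern as the sketch in the first reply, with FAISS handling the vector store.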
-
One of the biggest use cases of LLMs, especially for businesses, is chatting with PDFs and documents privately.
Would it be difficult to add this as a feature in llama.cpp?