
How to upload custom articles to retrieval #1


Open
Magical-Bear opened this issue Nov 19, 2024 · 3 comments


@Magical-Bear

In your backend, the requirements.txt file is empty, so I'm trying to reconstruct the package list by detecting the Python imports. Also, tokenizer = AutoTokenizer.from_pretrained("yiqingx/AnchorDR") failed to load, so I switched to a BGE model. The frontend also can't connect to the backend. I'm stuck and don't know how to fix this, or even how to run retrieval over custom articles. Could you give more detailed instructions, please?
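One way to rebuild a package list from imports is to parse each source file and collect the top-level module names. The sketch below is only an illustration using the standard library; the function names are hypothetical and it is not part of RAGViz. It assumes Python 3.10+ for sys.stdlib_module_names (on older versions nothing is filtered out).

```python
import ast
import sys
from pathlib import Path


def top_level_imports(source):
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) point at the project itself, so skip them.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def third_party_imports(root):
    """Collect non-stdlib import names from every .py file under root."""
    found = set()
    for path in Path(root).rglob("*.py"):
        try:
            found |= top_level_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
    # stdlib_module_names exists on Python 3.10+; fall back to filtering nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(found - set(stdlib))
```

Calling third_party_imports on the backend directory gives a starting list, not a finished requirements.txt: the project's own local modules will show up in it, and an import name is not always the PyPI package name (for example, sklearn is installed as scikit-learn).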

@TevinWang
Collaborator

TevinWang commented Nov 20, 2024

Hi, sorry to hear that you are having issues. To use AnchorDR, you need to follow the installation instructions in the repo; sorry, this should be clearer. If you are still having trouble, using a different embedding model like BGE is perfectly fine! As for connecting the frontend to the backend, in our demo we use a CGI script that acts as middleware between the frontend and the backend, proxying requests to the backend. If you want to call the backend server directly, you can replace the GET request and the URL parameters as follows:

const url = `[SERVER URL]`;
const response = await fetch(url, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Accept: "*/*",
    "X-API-Key": apiKey,
  },
  body: JSON.stringify({
    query,
    search_uuid,
    k,
    snippet,
  }),
  signal: controller.signal,
});

This replaces the GET request with a POST request to the server directly. Let me know if you have any other questions.
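To check the backend without going through the frontend at all, the same POST can be built from a script. This is a minimal sketch using only the Python standard library; the function name build_query_request is hypothetical, and the default values for k and snippet are assumptions, since the thread does not state their types. The field and header names are copied from the fetch call above.

```python
import json
import urllib.request


def build_query_request(server_url, api_key, query, search_uuid, k=5, snippet=True):
    """Build (but do not send) a POST request mirroring the fetch call above."""
    payload = {
        "query": query,
        "search_uuid": search_uuid,
        "k": k,              # assumed to be an integer
        "snippet": snippet,  # assumed to be a boolean
    }
    return urllib.request.Request(
        url=server_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "*/*",
            "X-API-Key": api_key,
        },
        method="POST",
    )


# Usage (replace the placeholders with real values before sending):
# req = build_query_request("[SERVER URL]", "[API KEY]", "hello", "[SEARCH UUID]")
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Because this request does not come from a browser, it is not subject to CORS, which makes it useful for telling a backend problem apart from a frontend one.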

@Magical-Bear
Author

Thanks for your sincere answer! Using Postman directly, I've connected to the backend successfully. However, I don't clearly understand what PILE_ADDR, PILE_PORT, CLUEWEB_ADDR, and CLUEWEB_PORT are, where they are used, or how to use them. I've successfully started /rag/server.py.
Moreover, the frontend code still sends OPTIONS requests to the backend. Could you tell me why it sends OPTIONS requests? I lack frontend experience, so I'd greatly appreciate your help!

ragviz.py
INFO: Started server process [1293238]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8079 (Press CTRL+C to quit)
TOKEN_COUNT: 3
EMBEDDING TIME: 0.085858054459095 seconds
hello 5
Error fetching data from http://:: Invalid URL 'http://:': No host supplied
Error fetching data from http://:: Invalid URL 'http://:': No host supplied
Error fetching data from http://:: Invalid URL 'http://:': No host supplied
Error fetching data from http://:: Invalid URL 'http://:': No host supplied
QUERY AND RERANK TIME: 0.0031450949609279633 seconds
TOTAL QUERY TIME: 1.072597848251462 seconds
INFO: 127.0.0.1:60478 - "POST /query HTTP/1.1" 200 OK

/rag/server.py
VLLM CHAT COMPLETION TIME: 0.637015737593174 seconds
[RequestOutput(request_id=0, prompt='Context: \n Question: hello Answer in less than 100 tokens:', prompt_token_ids=[1972, 25, 715, 15846, 25, 23811, 21806, 304, 2686, 1091, 220, 16, 15, 15, 11211, 25], encoder_prompt=None, encoder_prompt_token_ids=None, prompt_logprobs=None, outputs=[CompletionOutput(index=0, text=' \n Response: Hello! How can I assist you today? If you have a specific question or topic, feel free to share. :)\n\n', token_ids=(715, 5949, 25, 21927, 0, 2585, 646, 358, 7789, 498, 3351, 30, 1416, 498, 614, 264, 3151, 3405, 476, 8544, 11, 2666, 1910, 311, 4332, 13, 549, 692, 151643), cumulative_logprob=None, logprobs=None, finish_reason=stop, stop_reason=151643)], finished=True, metrics=RequestMetrics(arrival_time=1732672766.4274843, last_token_time=1732672766.4274843, first_scheduled_time=1732672766.4287786, first_token_time=1732672766.4615283, time_in_queue=0.0012943744659423828, finished_time=1732672767.0637972, scheduler_time=0.0030249282717704773, model_forward_time=None, model_execute_time=None), lora_request=None, num_cached_tokens=0)]
Qwen2Model is using Qwen2SdpaAttention, but torch.nn.functional.scaled_dot_product_attention does not support output_attentions=True. Falling back to the manual attention implementation, but specifying the manual implementation will be required from Transformers version v5.0.0 onwards. This warning can be removed using the argument attn_implementation="eager" when loading the model.
/home/zkpk/anaconda3/envs/ragviz/lib/python3.10/site-packages/bitsandbytes/nn/modules.py:452: UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.
warnings.warn(
The attention layers in this model are transitioning from computing the RoPE embeddings internally through position_ids (2D tensor with the indexes of the tokens), to using externally computed position_embeddings (Tuple of tensors, containing cos and sin). In v4.46 position_ids will be removed and position_embeddings will be mandatory.
ATTENTION FORWARD PASS TIME: 0.3202586229890585 seconds
INFO: 127.0.0.1:33536 - "POST /generate HTTP/1.1" 200 OK

@TevinWang
Collaborator

TevinWang commented Dec 11, 2024

Hi! Sorry for the late reply. PILE_ADDR, PILE_PORT, CLUEWEB_ADDR, and CLUEWEB_PORT specify the IP addresses and ports of the DiskANN REST servers. I think the likely issue you are facing is that those addresses and ports are not set, hence the error: Error fetching data from http://:: Invalid URL 'http://:': No host supplied.
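A URL of the form http://: is what results when an address and a port are both empty strings, which fits the unset-variable explanation. The sketch below shows one way to fail early with a clearer message. It assumes the four names are read as environment variables and joined as http://ADDR:PORT; the helper diskann_url is hypothetical, and the actual RAGViz code may build its URLs differently.

```python
import os


def diskann_url(prefix, env=None):
    """Build a base URL from <PREFIX>_ADDR and <PREFIX>_PORT, or raise if unset."""
    env = os.environ if env is None else env
    addr = env.get(f"{prefix}_ADDR", "").strip()
    port = env.get(f"{prefix}_PORT", "").strip()
    if not addr or not port:
        # Without this check the result would be "http://:", the URL in the error above.
        raise RuntimeError(
            f"{prefix}_ADDR and {prefix}_PORT must both be set before starting the server"
        )
    if not port.isdigit():
        raise RuntimeError(f"{prefix}_PORT must be a number, got {port!r}")
    return f"http://{addr}:{port}"


# Usage: check both DiskANN servers at startup (PILE and CLUEWEB are the
# prefixes named in this thread).
# for prefix in ("PILE", "CLUEWEB"):
#     print(prefix, diskann_url(prefix))
```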

How are you implementing dense retrieval in the RAG pipeline currently? If you are already using a specific framework or system, I can help you connect it with the RAGViz backend.
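On the OPTIONS requests asked about earlier in the thread: these are most likely CORS preflight requests. When a page served from one origin calls a server on another origin, the browser first sends an OPTIONS request if the real request uses a method or header outside a small safelist, and it only sends the real request if the server's response allows it. The sketch below is a simplified model of that rule, not the full Fetch specification (it ignores, for example, the Range header and limits on header values); the function name is hypothetical.

```python
SAFE_METHODS = {"GET", "HEAD", "POST"}
SAFE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SAFE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def preflight_reasons(method, headers):
    """List the reasons a cross-origin request would trigger an OPTIONS preflight."""
    reasons = []
    if method.upper() not in SAFE_METHODS:
        reasons.append(f"method {method.upper()} is not CORS-safelisted")
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SAFE_HEADERS:
            reasons.append(f"header {name} is not CORS-safelisted")
        elif lowered == "content-type":
            # Only the media type matters here, not parameters such as charset.
            mime = value.split(";")[0].strip().lower()
            if mime not in SAFE_CONTENT_TYPES:
                reasons.append(f"Content-Type {mime} is not CORS-safelisted")
    return reasons


# The headers from the fetch call posted earlier in this thread:
print(preflight_reasons("POST", {
    "Content-Type": "application/json",
    "Accept": "*/*",
    "X-API-Key": "[API KEY]",
}))
```

Under this model, both the JSON Content-Type and the X-API-Key header cause a preflight, so the backend has to answer OPTIONS with suitable Access-Control-Allow-Origin, Access-Control-Allow-Methods, and Access-Control-Allow-Headers headers. If the backend is built on FastAPI or Starlette (an assumption; the logs only show that it runs under Uvicorn), CORSMiddleware is the usual way to add them. Tools like Postman are not browsers and skip the preflight, which would explain why the direct request succeeded.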
