Commit afc3341

Refine ChatQnA README for TGI (#715)
* update chatqna readme for tgi
  Signed-off-by: letonghan <letong.han@intel.com>
* update log block
  Signed-off-by: letonghan <letong.han@intel.com>

---------

Signed-off-by: letonghan <letong.han@intel.com>
1 parent e5ec38c commit afc3341

5 files changed, 75 additions and 2 deletions

ChatQnA/README.md

Lines changed: 13 additions & 0 deletions
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com
 
 ## Consume ChatQnA Service
 
+Before consuming the ChatQnA service, make sure the TGI/vLLM service is ready (it can take up to 2 minutes to start).
+
+```bash
+# TGI example
+docker logs tgi-service | grep Connected
+```
+
+Consume the ChatQnA service only after the TGI log shows a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
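
Because the TGI/vLLM service can take a while to start, the one-off `grep` above may have to be repeated by hand. A minimal sketch of a polling loop is shown here. `wait_for_log` is a hypothetical helper, not part of the ChatQnA repository, and the 120-second timeout in the usage line is only an assumption based on the "up to 2 minutes" note.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the ChatQnA repo): re-run a log-producing
# command once per second until a pattern appears or a timeout expires.
# Usage: wait_for_log <pattern> <timeout_seconds> <command...>
wait_for_log() {
  local pattern="$1" timeout="$2"
  shift 2
  local waited=0
  # 2>&1 folds stderr into the pipe so lines on either stream are searched.
  until "$@" 2>&1 | grep -- "$pattern" >/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for '${pattern}'" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Example (assumes the container is named tgi-service, as in the command above):
# wait_for_log Connected 120 docker logs tgi-service && echo "TGI is ready"
```

The helper returns a non-zero status on timeout, so it can gate a later `curl` request in a script.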
 Two ways of consuming ChatQnA Service:
 
 1. Use cURL command on terminal

ChatQnA/docker/gaudi/README.md

Lines changed: 16 additions & 0 deletions
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. LLM backend Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service will be ready.
+
+Try the command below to check whether the LLM service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
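
The command above assumes `CONTAINER_ID` is already set. One possible way to fill it in is sketched below. `find_container_id` is a hypothetical helper, and the `tgi` name filter is an assumption; use whatever name `docker ps` shows for your LLM container.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the ID of the first running container whose
# name contains the given substring.
find_container_id() {
  local name_filter="$1"
  docker ps --filter "name=${name_filter}" --format '{{.ID}}' | head -n 1
}

# Example (assumes a running container whose name contains "tgi"):
# CONTAINER_ID=$(find_container_id tgi)
# docker logs "${CONTAINER_ID}" 2>&1 | grep Connected
```

The `2>&1` in the usage line is a precaution: `docker logs` replays the container's stdout and stderr on separate streams, and a plain pipe only passes stdout to `grep`.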
+If the service is ready, you will get a response like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
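
Grepping the log is one readiness signal; probing the HTTP server directly is another. TGI's router serves a `/health` route, so a sketch like the following can confirm that the endpoint answers before a generation request is sent. `llm_is_healthy` is a hypothetical helper name, and port 8005 is taken from the `curl` example in this README; adjust it if your port mapping differs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if <base_url>/health answers without
# an HTTP error.
llm_is_healthy() {
  local base_url="$1"
  # --fail makes curl exit non-zero on HTTP error responses (4xx/5xx);
  # --max-time bounds the wait when nothing is listening yet.
  curl --silent --fail --max-time 5 --output /dev/null "${base_url}/health"
}

# Example (assumes host_ip is set as elsewhere in this README):
# llm_is_healthy "http://${host_ip}:8005" && echo "LLM backend is ready"
```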
+Then try the `cURL` command below to validate the services.
+
 ```bash
 #TGI Service
 curl http://${host_ip}:8005/generate \

ChatQnA/docker/gpu/README.md

Lines changed: 16 additions & 0 deletions
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. TGI Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service will be ready.
+
+Try the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get a response like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:8008/generate \
 -X POST \

ChatQnA/docker/xeon/README.md

Lines changed: 14 additions & 2 deletions
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\
 
 6. LLM backend Service
 
-In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service will be ready.
 
-Use `docker logs CONTAINER_ID` to check if the download is finished.
+Try the command below to check whether the LLM service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get a response like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate the services.
 
 ```bash
 # TGI service

ChatQnA/docker/xeon/README_qdrant.md

Lines changed: 16 additions & 0 deletions
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\
 
 6. TGI Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service will be ready.
+
+Try the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get a response like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:6042/generate \
 -X POST \
