Refine ChatQnA README for TGI (#715)

letonghan · web-flow · commit afc3341156ae · 2024-09-03T15:06:50.000+08:00
* update chatqna readme for tgi

Signed-off-by: letonghan <letong.han@intel.com>

* update log block

Signed-off-by: letonghan <letong.han@intel.com>

---------

Signed-off-by: letonghan <letong.han@intel.com>
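
The readiness check this change adds to the READMEs (`docker logs ... | grep Connected`) is a one-shot command, so it may have to be re-run by hand while the model loads. A minimal sketch of a polling wrapper follows. The function name `wait_for_log` and the `POLL_INTERVAL` variable are hypothetical and not part of the repository; the 120-second timeout in the usage comment comes from the "up to 2 minutes" figure in the README and is likely too short on a first startup that still has to download the model files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): poll a log-producing command
# until a pattern appears or a timeout expires.
# Usage: wait_for_log PATTERN TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for_log() {
  local pattern=$1 timeout=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))
  while true; do
    # Merge stderr into stdout so grep sees the log lines on either stream.
    if "$@" 2>&1 | grep -q -- "$pattern"; then
      return 0   # pattern found: the service reported readiness
    fi
    (( SECONDS >= deadline )) && return 1   # timed out without a match
    sleep "${POLL_INTERVAL:-5}"
  done
}

# Example, using the container name from the TGI example in the README:
#   wait_for_log Connected 120 docker logs tgi-service && echo "TGI is ready"
```

The function returns 0 as soon as the pattern is seen and 1 on timeout, so it can gate the later `curl` validation steps in a script.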
diff --git a/ChatQnA/README.md b/ChatQnA/README.md
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com
 
 ## Consume ChatQnA Service
 
+Before consuming ChatQnA Service, make sure the TGI/vLLM service is ready (which takes up to 2 minutes to start).
+
+```bash
+# TGI example
+docker logs tgi-service | grep Connected
+```
+
+Wait to consume the ChatQnA service until the TGI log shows a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
 Two ways of consuming ChatQnA Service:
 
 1. Use cURL command on terminal
diff --git a/ChatQnA/docker/gaudi/README.md b/ChatQnA/docker/gaudi/README.md
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. LLM backend Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.
+
+Run the command below to check whether the LLM serving backend is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate the service.
+
 ```bash
 #TGI Service
 curl http://${host_ip}:8005/generate \
diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. TGI Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.
+
+Run the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:8008/generate \
   -X POST \
diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\
 
 6. LLM backend Service
 
-In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.
 
-Use `docker logs CONTAINER_ID` to check if the download is finished.
+Run the command below to check whether the LLM serving backend is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate the service.
 
 ```bash
 # TGI service
diff --git a/ChatQnA/docker/xeon/README_qdrant.md b/ChatQnA/docker/xeon/README_qdrant.md
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\
 
 6. TGI Service
 
+On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.
+
+Run the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, the output contains a line like the one below.
+
+```log
+2024-09-03T02:47:53.402023Z  INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then use the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:6042/generate \
   -X POST \