5 files changed, +75 -2 lines changed
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com
## Consume ChatQnA Service

+ Before consuming the ChatQnA Service, make sure the TGI/vLLM service is ready (it can take up to 2 minutes to start).
+
+ ```bash
+ # TGI example
+ docker logs tgi-service | grep Connected
+ ```
+
+ Wait to consume the ChatQnA service until the TGI log shows a line like the one below.
+
+ ```log
+ 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+ ```
+
Two ways of consuming ChatQnA Service:

1. Use cURL command on terminal
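The manual readiness check in this hunk could also be automated by polling the logs. The following is a minimal sketch, not part of the change itself: `wait_for_log` is a hypothetical helper, the 120-second default is an assumption drawn from the "up to 2 minutes" note, and it assumes a `Connected` log line is a reliable readiness signal for the container in question.

```bash
# wait_for_log CONTAINER PATTERN [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: polls `docker logs` until PATTERN appears or the timeout expires.
wait_for_log() {
  local container="$1"
  local pattern="$2"
  local timeout="${3:-120}"   # assumed default, based on the "up to 2 minutes" note
  local interval="${4:-5}"    # assumed polling interval in seconds
  local waited=0
  local logs

  while true; do
    # Merge stderr into stdout so the marker is found whichever stream it was logged to.
    logs="$(docker logs "$container" 2>&1)" || true
    case "$logs" in
      *"$pattern"*)
        echo "ready: $container"
        return 0
        ;;
    esac
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $container" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

For instance, `wait_for_log tgi-service Connected 120 5` would return 0 once the marker shows up and 1 if it has not appeared after roughly two minutes, so it can gate a later `curl` call in a script.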
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. LLM backend Service

+ On first startup, this service takes longer to become ready because it has to download the model files. Once the download finishes, the service is ready.
+
+ Run the command below to check whether the LLM service is ready.
+
+ ```bash
+ docker logs ${CONTAINER_ID} | grep Connected
+ ```
+
+ If the service is ready, the output includes a line like the one below.
+
+ ```log
+ 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+ ```
+
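The `docker logs` check in this hunk relies on `${CONTAINER_ID}` already being set. One possible way to fill it in is sketched below, assuming the container name contains the substring `tgi`; that name and the `find_container_id` helper are illustrative assumptions, so match the filter to the service names in your own compose file.

```bash
# find_container_id [NAME_SUBSTRING]
# Hypothetical helper: prints the ID of the first running container whose name
# contains NAME_SUBSTRING. The default "tgi" is an assumption, not a fixed name.
find_container_id() {
  local name_filter="${1:-tgi}"
  # --quiet prints only container IDs; --filter name=... matches on a substring.
  docker ps --quiet --filter "name=${name_filter}" | head -n 1
}
```

A typical use would be `CONTAINER_ID="$(find_container_id tgi)"` before running the `docker logs` check. If the result is empty, no running container matched the filter.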
+ Then run the `cURL` command below to validate the services.
+
```bash
# TGI Service
curl http://${host_ip}:8005/generate \
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. TGI Service

+ On first startup, this service takes longer to become ready because it has to download the model files. Once the download finishes, the service is ready.
+
+ Run the command below to check whether the TGI service is ready.
+
+ ```bash
+ docker logs ${CONTAINER_ID} | grep Connected
+ ```
+
+ If the service is ready, the output includes a line like the one below.
+
+ ```log
+ 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+ ```
+
+ Then run the `cURL` command below to validate TGI.
+
```bash
curl http://${host_ip}:8008/generate \
  -X POST \
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\

6. LLM backend Service

- In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
+ On first startup, this service takes longer to become ready because it has to download the model files. Once the download finishes, the service is ready.

- Use `docker logs CONTAINER_ID` to check if the download is finished.
+ Run the command below to check whether the LLM service is ready.
+
+ ```bash
+ docker logs ${CONTAINER_ID} | grep Connected
+ ```
+
+ If the service is ready, the output includes a line like the one below.
+
+ ```log
+ 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+ ```
+
+ Then run the `cURL` command below to validate the services.

```bash
# TGI service
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\

6. TGI Service

+ On first startup, this service takes longer to become ready because it has to download the model files. Once the download finishes, the service is ready.
+
+ Run the command below to check whether the TGI service is ready.
+
+ ```bash
+ docker logs ${CONTAINER_ID} | grep Connected
+ ```
+
+ If the service is ready, the output includes a line like the one below.
+
+ ```log
+ 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+ ```
+
+ Then run the `cURL` command below to validate TGI.
+
```bash
curl http://${host_ip}:6042/generate \
  -X POST \