
Commit 393367e

Fix leftover issues from the tgi version update (#1121)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent: 7adbba6

File tree: 3 files changed (+59, −56 lines)


AvatarChatbot/docker_compose/intel/hpu/gaudi/compose.yaml

Lines changed: 5 additions & 5 deletions
@@ -15,7 +15,7 @@ services:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
       https_proxy: ${https_proxy}
-      HABANA_VISIBLE_MODULES: all
+      HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
     runtime: habana
     cap_add:
@@ -39,7 +39,7 @@ services:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
       https_proxy: ${https_proxy}
-      HABANA_VISIBLE_MODULES: all
+      HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
     runtime: habana
     cap_add:
@@ -67,7 +67,7 @@ services:
       HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       HF_HUB_DISABLE_PROGRESS_BARS: 1
       HF_HUB_ENABLE_HF_TRANSFER: 0
-      HABANA_VISIBLE_MODULES: all
+      HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       ENABLE_HPU_GRAPH: true
       LIMIT_HPU_GRAPH: true
@@ -105,7 +105,7 @@ services:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
       https_proxy: ${https_proxy}
-      HABANA_VISIBLE_MODULES: all
+      HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       DEVICE: ${DEVICE}
       INFERENCE_MODE: ${INFERENCE_MODE}
@@ -132,7 +132,7 @@ services:
       no_proxy: ${no_proxy}
       http_proxy: ${http_proxy}
       https_proxy: ${https_proxy}
-      HABANA_VISIBLE_MODULES: all
+      HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       WAV2LIP_ENDPOINT: ${WAV2LIP_ENDPOINT}
     runtime: habana
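
For context on the rename: the Habana container runtime selects which Gaudi accelerators to expose to a container through `HABANA_VISIBLE_DEVICES` (the HPU analogue of `CUDA_VISIBLE_DEVICES`), while `HABANA_VISIBLE_MODULES` is a different setting, so these services were not being configured as intended. Below is a hedged sanity check, assuming the stack is already up; the service name `tgi-gaudi-server` is illustrative, not taken from this diff.

```bash
# Hedged sanity check (not part of the commit): verify the rename is complete
# and that the variable reaches a running container. "tgi-gaudi-server" is an
# illustrative service name -- substitute any service defined in this file.
cd AvatarChatbot/docker_compose/intel/hpu/gaudi
grep -n "HABANA_VISIBLE" compose.yaml              # should now only match HABANA_VISIBLE_DEVICES
docker compose config | grep "HABANA_VISIBLE"      # rendered config after variable interpolation
docker exec tgi-gaudi-server env | grep "HABANA"   # what the container actually sees
```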

AvatarChatbot/tests/test_compose_on_gaudi.sh

Lines changed: 3 additions & 9 deletions
@@ -74,7 +74,7 @@ function start_services() {
     export FPS=10
 
     # Start Docker Containers
-    docker compose up -d
+    docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
 
     n=0
     until [[ "$n" -ge 100 ]]; do
@@ -86,7 +86,6 @@ function start_services() {
         n=$((n+1))
     done
 
-    # sleep 5m
     echo "All services are up and running"
     sleep 5s
 }
@@ -99,6 +98,7 @@ function validate_megaservice() {
     if [[ $result == *"mp4"* ]]; then
         echo "Result correct."
     else
+        echo "Result wrong, print docker logs."
         docker logs whisper-service > $LOG_PATH/whisper-service.log
         docker logs asr-service > $LOG_PATH/asr-service.log
         docker logs speecht5-service > $LOG_PATH/speecht5-service.log
@@ -107,19 +107,13 @@ function validate_megaservice() {
         docker logs llm-tgi-gaudi-server > $LOG_PATH/llm-tgi-gaudi-server.log
         docker logs wav2lip-service > $LOG_PATH/wav2lip-service.log
         docker logs animation-gaudi-server > $LOG_PATH/animation-gaudi-server.log
-
-        echo "Result wrong."
+        echo "Exit test."
         exit 1
     fi
 
 }
 
 
-#function validate_frontend() {
-
-#}
-
-
 function stop_docker() {
     cd $WORKPATH/docker_compose/intel/hpu/gaudi
     docker compose down
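
The hunks above elide the body of the readiness loop, so for orientation here is a hedged sketch of the usual polling pattern it follows in these test scripts: tail a container's log until a readiness marker appears or the retry budget runs out. The container name `tgi-gaudi-server` and the `Connected` marker are illustrative assumptions, not taken from this diff.

```bash
# Hedged sketch of the elided loop body; the container name and log marker
# are assumptions for illustration, not part of this commit.
n=0
until [[ "$n" -ge 100 ]]; do
    docker logs tgi-gaudi-server > "${LOG_PATH}/tgi_service_start.log" 2>&1
    if grep -q "Connected" "${LOG_PATH}/tgi_service_start.log"; then
        break
    fi
    sleep 5s
    n=$((n+1))
done
```

Redirecting `docker compose up -d` into `start_services_with_compose.log` (the first hunk) keeps the CI console clean while preserving the startup output for debugging.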

ChatQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 51 additions & 42 deletions
@@ -432,57 +432,66 @@ curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \
   -H "Content-Type: application/json"
 ```
 
-
 ### Profile Microservices
 
-To further analyze MicroService Performance, users could follow the instructions to profile MicroServices.
+To further analyze MicroService Performance, users could follow the instructions to profile MicroServices.
 
 #### 1. vLLM backend Service
-Users could follow previous section to testing vLLM microservice or ChatQnA MegaService.
-By default, vLLM profiling is not enabled. Users could start and stop profiling by following commands.
 
-##### Start vLLM profiling
+Users could follow previous section to testing vLLM microservice or ChatQnA MegaService.
+By default, vLLM profiling is not enabled. Users could start and stop profiling by following commands.
 
-```bash
-curl http://${host_ip}:9009/start_profile \
-  -H "Content-Type: application/json" \
-  -d '{"model": "Intel/neural-chat-7b-v3-3"}'
-```
-Users would see below docker logs from vllm-service if profiling is started correctly.
-```bash
-INFO api_server.py:361] Starting profiler...
-INFO api_server.py:363] Profiler started.
-INFO: x.x.x.x:35940 - "POST /start_profile HTTP/1.1" 200 OK
-```
-After vLLM profiling is started, users could start asking questions and get responses from vLLM MicroService
-or ChatQnA MicroService.
-
-##### Stop vLLM profiling
-By following command, users could stop vLLM profliing and generate a *.pt.trace.json.gz file as profiling result
-under /mnt folder in vllm-service docker instance.
-```bash
-# vLLM Service
-curl http://${host_ip}:9009/stop_profile \
-  -H "Content-Type: application/json" \
-  -d '{"model": "Intel/neural-chat-7b-v3-3"}'
-```
-Users would see below docker logs from vllm-service if profiling is stopped correctly.
-```bash
-INFO api_server.py:368] Stopping profiler...
-INFO api_server.py:370] Profiler stopped.
-INFO: x.x.x.x:41614 - "POST /stop_profile HTTP/1.1" 200 OK
-```
-After vllm profiling is stopped, users could use below command to get the *.pt.trace.json.gz file under /mnt folder.
-```bash
-docker cp vllm-service:/mnt/ .
-```
+##### Start vLLM profiling
+
+```bash
+curl http://${host_ip}:9009/start_profile \
+  -H "Content-Type: application/json" \
+  -d '{"model": "Intel/neural-chat-7b-v3-3"}'
+```
+
+Users would see below docker logs from vllm-service if profiling is started correctly.
+
+```bash
+INFO api_server.py:361] Starting profiler...
+INFO api_server.py:363] Profiler started.
+INFO: x.x.x.x:35940 - "POST /start_profile HTTP/1.1" 200 OK
+```
+
+After vLLM profiling is started, users could start asking questions and get responses from vLLM MicroService
+or ChatQnA MicroService.
+
+##### Stop vLLM profiling
+
+By following command, users could stop vLLM profliing and generate a \*.pt.trace.json.gz file as profiling result
+under /mnt folder in vllm-service docker instance.
+
+```bash
+# vLLM Service
+curl http://${host_ip}:9009/stop_profile \
+  -H "Content-Type: application/json" \
+  -d '{"model": "Intel/neural-chat-7b-v3-3"}'
+```
+
+Users would see below docker logs from vllm-service if profiling is stopped correctly.
+
+```bash
+INFO api_server.py:368] Stopping profiler...
+INFO api_server.py:370] Profiler stopped.
+INFO: x.x.x.x:41614 - "POST /stop_profile HTTP/1.1" 200 OK
+```
+
+After vllm profiling is stopped, users could use below command to get the \*.pt.trace.json.gz file under /mnt folder.
+
+```bash
+docker cp vllm-service:/mnt/ .
+```
+
+##### Check profiling result
 
-##### Check profiling result
-Open a web browser and type "chrome://tracing" or "ui.perfetto.dev", and then load the json.gz file, you should be able
-to see the vLLM profiling result as below diagram.
+Open a web browser and type "chrome://tracing" or "ui.perfetto.dev", and then load the json.gz file, you should be able
+to see the vLLM profiling result as below diagram.
 ![image](https://github.com/user-attachments/assets/55c7097e-5574-41dc-97a7-5e87c31bc286)
 
-
 ## 🚀 Launch the UI
 
 ### Launch with origin port
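
Put together, the restructured section describes one profiling session: start the profiler, drive some traffic, stop the profiler, copy the trace out, then open it in a trace viewer. Below is a condensed sketch, assuming the vLLM service on port 9009 as above; the ChatQnA MegaService endpoint (`:8888/v1/chatqna`) and the sample question are assumptions for illustration, not part of this diff.

```bash
# Condensed profiling session per the README hunks above. The ChatQnA
# endpoint/port and the sample question are illustrative assumptions.
curl http://${host_ip}:9009/start_profile \
  -H "Content-Type: application/json" \
  -d '{"model": "Intel/neural-chat-7b-v3-3"}'

# Exercise the model while the profiler is recording
curl http://${host_ip}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'

curl http://${host_ip}:9009/stop_profile \
  -H "Content-Type: application/json" \
  -d '{"model": "Intel/neural-chat-7b-v3-3"}'

# Traces land under /mnt inside the container as *.pt.trace.json.gz
docker cp vllm-service:/mnt/ .
```

Then load the `.pt.trace.json.gz` file from `./mnt/` into chrome://tracing or ui.perfetto.dev as described above.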
