Commit c86cf85

Add AudioQnA example via GMC (#597)
* add AudioQnA example via GMC. Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
* add more information for e2e test scripts. Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
* fix bug in e2e test scripts. Signed-off-by: zhlsunshine <huailong.zhang@intel.com>
1 parent 039014f commit c86cf85

5 files changed: +412 -0 lines changed

AudioQnA/kubernetes/README.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Deploy AudioQnA in Kubernetes Cluster on Xeon and Gaudi

This document outlines the deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline components on Intel Xeon server and Gaudi machines.

The AudioQnA Service leverages a Kubernetes operator called genai-microservices-connector (GMC). GMC supports connecting microservices to create pipelines based on the specification in the pipeline yaml file, and also lets the user dynamically control which model is used in a service such as an LLM or embedder. The underlying pipeline language also supports using external services that may be running in public or private cloud elsewhere.

Install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in Section "Getting Started" at [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector). We will soon publish images to Docker Hub, at which point no builds will be required, simplifying installation.


The AudioQnA application is defined as a Custom Resource (CR) file that the above GMC operator acts upon. It first checks whether the microservices listed in the CR yaml file are running; if not, it starts them, and then proceeds to connect them. When the AudioQnA pipeline is ready, the service endpoint details are returned, letting you use the application. If you run `kubectl get pods`, you will see all the component microservices, in particular `asr`, `tts`, and `llm`.
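
As a quick sanity check on the paragraph above, the following minimal sketch reads a `kubectl get pods` listing from standard input and prints any expected component that has no matching pod. The helper name `missing_components` is hypothetical, and matching components by pod-name prefix is an assumption about how the pods are named, not documented GMC behavior.

```sh
# Print each expected component that has no pod whose name starts with it.
# Reads a `kubectl get pods` listing from stdin.
# ASSUMPTION: pod names begin with the component name (e.g. "asr-...").
missing_components() {
  listing=$(cat)
  for component in "$@"; do
    if ! printf '%s\n' "$listing" | awk -v c="$component" \
        '$1 != "NAME" && index($1, c) == 1 { found = 1 } END { exit !found }'; then
      printf '%s\n' "$component"
    fi
  done
}
```

For example, `kubectl get pods -n audioqa | missing_components asr tts llm` prints nothing when a pod exists for each of the three components.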

## Using prebuilt images

The AudioQnA uses the below prebuilt images if you choose a Xeon deployment:

- tgi-service: ghcr.io/huggingface/text-generation-inference:1.4
- llm: opea/llm-tgi:latest
- asr: opea/asr:latest
- whisper: opea/whisper:latest
- tts: opea/tts:latest
- speecht5: opea/speecht5:latest


If you want to use the Gaudi accelerator, three alternate images are used for the TGI, Whisper, and SpeechT5 services.
For Gaudi:

- tgi-service: ghcr.io/huggingface/tgi-gaudi:1.2.1
- whisper-gaudi: opea/whisper-gaudi:latest
- speecht5-gaudi: opea/speecht5-gaudi:latest

> [NOTE]
> Please refer to [Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/xeon/README.md) or [Gaudi README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker/gaudi/README.md) to build the OPEA images. These too will be available on Docker Hub soon to simplify use.

## Deploy AudioQnA pipeline
This involves deploying the AudioQnA custom resource. Use `audioQnA_xeon.yaml` for a Xeon cluster or `audioQnA_gaudi.yaml` for a Gaudi cluster.
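
The choice between the two manifests can be scripted. The sketch below maps a platform name to the matching file; the function name `manifest_for_platform` and the `PLATFORM` variable in the usage line are illustrative assumptions, not part of the repository.

```sh
# Map a platform name to the manifest file named in this README.
# Returns non-zero for anything other than "xeon" or "gaudi".
manifest_for_platform() {
  case "$1" in
    xeon)  echo "audioQnA_xeon.yaml" ;;
    gaudi) echo "audioQnA_gaudi.yaml" ;;
    *)     echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}
```

One possible use is `kubectl apply -f "$(manifest_for_platform "${PLATFORM:-xeon}")"`, which defaults to the Xeon manifest when `PLATFORM` is unset.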

1. Create namespace and deploy application
   ```sh
   kubectl create ns audioqa
   kubectl apply -f $(pwd)/audioQnA_xeon.yaml
   ```

2. GMC will reconcile the AudioQnA custom resource and get all related components/services ready. Check that the services are up.

   ```sh
   kubectl get service -n audioqa
   ```

3. Retrieve the application access URL

   ```sh
   kubectl get gmconnectors.gmc.opea.io -n audioqa
   NAME      URL                                                    READY   AGE
   audioqa   http://router-service.audioqa.svc.cluster.local:8080   6/0/6   5m
   ```

4. Deploy a client pod to test the application

   ```sh
   kubectl create deployment client-test -n audioqa --image=python:3.8.13 -- sleep infinity
   ```

5. Access the application using the above URL from the client pod

   ```sh
   export CLIENT_POD=$(kubectl get pod -n audioqa -l app=client-test -o jsonpath='{.items..metadata.name}')
   export accessUrl=$(kubectl get gmc -n audioqa -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
   kubectl exec "$CLIENT_POD" -n audioqa -- curl "$accessUrl" -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json'
   ```
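
The `byte_str` value in step 5 is a base64-encoded WAV file (the sample decodes to a header beginning with `RIFF`). The sketch below shows one way to build the same request body from your own recording. The helper names `make_payload` and `b64_magic` are hypothetical, the parameter values are copied from the request above, and the decode step assumes a `base64` that accepts `-d` (as GNU coreutils does).

```sh
# Encode a WAV file as single-line base64 and wrap it in the request body
# used in step 5. Parameter values are copied from the example request.
make_payload() {
  encoded=$(base64 < "$1" | tr -d '\n')
  printf '{"byte_str": "%s", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}\n' "$encoded"
}

# Print the first four bytes of a base64 string; a WAV file starts with "RIFF".
# ASSUMPTION: `base64 -d` is available (GNU coreutils or compatible).
b64_magic() {
  printf '%s' "$1" | base64 -d | head -c 4
}
```

With a local file such as `question.wav` (an illustrative name), the body could be passed to the same curl call via `-d "$(make_payload question.wav)"`.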

> [NOTE]
> You can remove your AudioQnA pipeline by executing standard Kubernetes kubectl commands to remove a custom resource. Verify it was removed by executing `kubectl get pods` in the `audioqa` namespace.
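
A minimal sketch of that cleanup, assuming the resources were created with the commands in the steps above. The `run` and `cleanup_audioqa` helpers and the `DRY_RUN` variable are introduced here for illustration only; with `DRY_RUN=1` the commands are printed instead of executed.

```sh
# Run a command, or only print it when DRY_RUN=1 (a variable introduced here).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Remove the AudioQnA custom resource and the test client, then list pods.
# $1: namespace (default audioqa), $2: manifest used for deployment.
cleanup_audioqa() {
  ns="${1:-audioqa}"
  manifest="${2:-audioQnA_xeon.yaml}"
  run kubectl delete -f "$manifest"                   # remove the GMConnector custom resource
  run kubectl delete deployment client-test -n "$ns"  # remove the client from step 4
  run kubectl get pods -n "$ns"                       # verify the pods are gone
}
```

Running `DRY_RUN=1 cleanup_audioqa` first shows exactly which commands would be issued.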

AudioQnA/kubernetes/audioQnA_gaudi.yaml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: gaudi
  name: audioqa
  namespace: audioqa
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Asr
          internalService:
            serviceName: asr-svc
            config:
              endpoint: /v1/audio/transcriptions
              ASR_ENDPOINT: whisper-gaudi-svc
        - name: WhisperGaudi
          internalService:
            serviceName: whisper-gaudi-svc
            config:
              endpoint: /v1/asr
            isDownstreamService: true
        - name: Llm
          data: $response
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /v1/chat/completions
              TGI_LLM_ENDPOINT: tgi-gaudi-svc
        - name: TgiGaudi
          internalService:
            serviceName: tgi-gaudi-svc
            config:
              endpoint: /generate
            isDownstreamService: true
        - name: Tts
          data: $response
          internalService:
            serviceName: tts-svc
            config:
              endpoint: /v1/audio/speech
              TTS_ENDPOINT: speecht5-gaudi-svc
        - name: SpeechT5Gaudi
          internalService:
            serviceName: speecht5-gaudi-svc
            config:
              endpoint: /v1/tts
            isDownstreamService: true

AudioQnA/kubernetes/audioQnA_xeon.yaml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: audioqa
  namespace: audioqa
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Asr
          internalService:
            serviceName: asr-svc
            config:
              endpoint: /v1/audio/transcriptions
              ASR_ENDPOINT: whisper-svc
        - name: Whisper
          internalService:
            serviceName: whisper-svc
            config:
              endpoint: /v1/asr
            isDownstreamService: true
        - name: Llm
          data: $response
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /v1/chat/completions
              TGI_LLM_ENDPOINT: tgi-svc
        - name: Tgi
          internalService:
            serviceName: tgi-svc
            config:
              endpoint: /generate
            isDownstreamService: true
        - name: Tts
          data: $response
          internalService:
            serviceName: tts-svc
            config:
              endpoint: /v1/audio/speech
              TTS_ENDPOINT: speecht5-svc
        - name: SpeechT5
          internalService:
            serviceName: speecht5-svc
            config:
              endpoint: /v1/tts
            isDownstreamService: true

AudioQnA/tests/test_gmc_on_gaudi.sh

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -xe
USER_ID=$(whoami)
LOG_PATH=/home/$(whoami)/logs
MOUNT_DIR=/home/$USER_ID/.cache/huggingface/hub
IMAGE_REPO=${IMAGE_REPO:-}

# APP_NAMESPACE is not set in this script; it must be provided by the environment.
function install_audioqa() {
    kubectl create ns "$APP_NAMESPACE"
    sed -i "s|namespace: audioqa|namespace: $APP_NAMESPACE|g" ./audioQnA_gaudi.yaml
    kubectl apply -f ./audioQnA_gaudi.yaml

    # Wait until the router service is ready
    echo "Waiting for the audioqa router service to be ready..."
    wait_until_pod_ready "audioqa router" "$APP_NAMESPACE" "router-service"
    output=$(kubectl get pods -n "$APP_NAMESPACE")
    echo "$output"
}

function validate_audioqa() {
    # deploy client pod for testing
    kubectl create deployment client-test -n "$APP_NAMESPACE" --image=python:3.8.13 -- sleep infinity

    # wait for client pod ready
    wait_until_pod_ready "client-test" "$APP_NAMESPACE" "client-test"
    # give the pipeline time to populate data
    sleep 60

    kubectl get pods -n "$APP_NAMESPACE"
    # send request to audioqa
    export CLIENT_POD=$(kubectl get pod -n "$APP_NAMESPACE" -l app=client-test -o jsonpath='{.items..metadata.name}')
    echo "$CLIENT_POD"
    accessUrl=$(kubectl get gmc -n "$APP_NAMESPACE" -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
    # "// empty" makes jq print nothing (instead of the string "null") when
    # byte_str is missing, so the -z check below can detect a failed response
    byte_str=$(kubectl exec "$CLIENT_POD" -n "$APP_NAMESPACE" -- curl "$accessUrl" -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json' | jq -r '.byte_str // empty')
    echo "$byte_str" > "$LOG_PATH/curl_audioqa.log"
    if [ -z "$byte_str" ]; then
        echo "audioqa failed, please check the logs in ${LOG_PATH}!"
        exit 1
    fi
    echo "Audioqa response check succeeded!"
}

# $1: display name, $2: namespace, $3: value of the pod's "app" label
function wait_until_pod_ready() {
    echo "Waiting for the $1 to be ready..."
    max_retries=30
    retry_count=0
    while ! is_pod_ready "$2" "$3"; do
        if [ $retry_count -ge $max_retries ]; then
            echo "$1 is not ready after waiting for a significant amount of time"
            get_gmc_controller_logs
            exit 1
        fi
        echo "$1 is not ready yet. Retrying in 10 seconds..."
        sleep 10
        output=$(kubectl get pods -n "$2")
        echo "$output"
        retry_count=$((retry_count + 1))
    done
}

function is_pod_ready() {
    if [ "$2" == "gmc-controller" ]; then
        pod_status=$(kubectl get pods -n "$1" -o jsonpath='{.items[].status.conditions[?(@.type=="Ready")].status}')
    else
        pod_status=$(kubectl get pods -n "$1" -l app="$2" -o jsonpath='{.items[].status.conditions[?(@.type=="Ready")].status}')
    fi
    if [ "$pod_status" == "True" ]; then
        return 0
    else
        return 1
    fi
}

function get_gmc_controller_logs() {
    # Fetch the name of the pod labeled control-plane=gmc-controller in the system namespace
    pod_name=$(kubectl get pods -n "$SYSTEM_NAMESPACE" -l control-plane=gmc-controller -o jsonpath='{.items[0].metadata.name}')

    # Check if the pod name was found
    if [ -z "$pod_name" ]; then
        echo "No pod found with label control-plane=gmc-controller in namespace $SYSTEM_NAMESPACE"
        return 1
    fi

    # Get the logs of the found pod
    echo "Fetching logs for pod $pod_name in namespace $SYSTEM_NAMESPACE..."
    kubectl logs "$pod_name" -n "$SYSTEM_NAMESPACE"
}

if [ $# -eq 0 ]; then
    echo "Usage: $0 <function_name>"
    exit 1
fi

case "$1" in
    install_AudioQnA)
        pushd AudioQnA/kubernetes
        install_audioqa
        popd
        ;;
    validate_AudioQnA)
        pushd AudioQnA/kubernetes
        validate_audioqa
        popd
        ;;
    *)
        echo "Unknown function: $1"
        ;;
esac
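
The polling loop in `wait_until_pod_ready` is an instance of a bounded retry. As a hedged sketch (not part of this commit), the same idea can be factored into a generic helper; `retry_until` is a hypothetical name, and its attempt counting only roughly mirrors the script's 30 retries with a 10-second sleep.

```sh
# Retry a command until it succeeds or the attempt budget is exhausted.
# usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

Inside the test script it could be called as `retry_until 30 10 is_pod_ready "$APP_NAMESPACE" client-test`, leaving log collection to the caller when it returns non-zero.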