Commit 464e2d3

Rename streaming to stream to align with OpenAI API (#1332)
Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
1 parent 1f29eca commit 464e2d3
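The change itself is mechanical: every request payload, YAML config, and gateway parameter previously named streaming is now named stream, the field name the OpenAI Chat Completions API uses. As a rough illustration only (the host, port, and payload below are placeholders, not taken from this commit), a request to an OPEA megaservice after the rename would look like:

import requests

# Hypothetical endpoint; substitute the address of your deployed megaservice.
url = "http://localhost:8888/v1/chatqna"

payload = {
    "messages": "What is OPEA?",
    "max_tokens": 128,
    "stream": False,  # previously "streaming"; renamed to match the OpenAI API
}

print(requests.post(url, json=payload, timeout=120).json())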

53 files changed: +70 -57 lines changed

Note: this is a large commit, so some changed files are hidden by default and are not shown below.

AgentQnA/docker_compose/amd/gpu/rocm/compose.yaml

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ services:
 model: ${LLM_MODEL_ID}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/worker_agent_tools.yaml
 require_human_feedback: false
 RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
@@ -83,7 +83,7 @@ services:
 model: ${LLM_MODEL_ID}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/supervisor_agent_tools.yaml
 require_human_feedback: false
 no_proxy: ${no_proxy}

AgentQnA/docker_compose/intel/cpu/xeon/compose_openai.yaml

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ services:
 model: ${model}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/worker_agent_tools.yaml
 require_human_feedback: false
 RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
@@ -51,7 +51,7 @@ services:
 model: ${model}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/supervisor_agent_tools.yaml
 require_human_feedback: false
 no_proxy: ${no_proxy}

AgentQnA/docker_compose/intel/hpu/gaudi/compose.yaml

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ services:
 model: ${LLM_MODEL_ID}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/worker_agent_tools.yaml
 require_human_feedback: false
 RETRIEVAL_TOOL_URL: ${RETRIEVAL_TOOL_URL}
@@ -55,7 +55,7 @@ services:
 model: ${LLM_MODEL_ID}
 temperature: ${temperature}
 max_new_tokens: ${max_new_tokens}
-streaming: false
+stream: false
 tools: /home/user/tools/supervisor_agent_tools.yaml
 require_human_feedback: false
 no_proxy: ${no_proxy}

AgentQnA/tests/step2_start_retrieval_tool.sh

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ WORKPATH=$(dirname "$PWD")
 export WORKDIR=$WORKPATH/../../
 echo "WORKDIR=${WORKDIR}"
 export ip_address=$(hostname -I | awk '{print $1}')
+export host_ip=${ip_address}

 export HF_CACHE_DIR=$WORKDIR/hf_cache
 if [ ! -d "$HF_CACHE_DIR" ]; then

AgentQnA/tests/test_compose_on_gaudi.sh

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

-set -e
+set -xe

 WORKPATH=$(dirname "$PWD")
 export WORKDIR=$WORKPATH/../../
@@ -82,4 +82,4 @@ echo "=================== #5 Agent and API server stopped===================="

 echo y | docker system prune

-echo "ALL DONE!"
+echo "ALL DONE!!"

AgentQnA/tests/test_compose_on_rocm.sh

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 # Copyright (C) 2024 Advanced Micro Devices, Inc.
 # SPDX-License-Identifier: Apache-2.0

-set -e
+set -xe

 WORKPATH=$(dirname "$PWD")
 export WORKDIR=$WORKPATH/../../
@@ -72,4 +72,4 @@ echo "=================== #5 Agent and API server stopped===================="

 echo y | docker system prune

-echo "ALL DONE!"
+echo "ALL DONE!!"

AudioQnA/audioqna.py

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@ def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **k
 next_inputs["messages"] = [{"role": "user", "content": inputs["asr_result"]}]
 next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
 next_inputs["top_p"] = llm_parameters_dict["top_p"]
-next_inputs["stream"] = inputs["streaming"] # False as default
+next_inputs["stream"] = inputs["stream"] # False as default
 next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
 # next_inputs["presence_penalty"] = inputs["presence_penalty"]
 # next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
@@ -91,7 +91,7 @@ async def handle_request(self, request: Request):
 frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
 presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
 repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
-streaming=False, # TODO add streaming LLM output as input to TTS
+stream=False, # TODO add stream LLM output as input to TTS
 )
 result_dict, runtime_graph = await self.megaservice.schedule(
 initial_inputs={"audio": chat_request.audio},

AudioQnA/audioqna_multilang.py

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **k
 next_inputs["messages"] = [{"role": "user", "content": inputs["asr_result"]}]
 next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
 next_inputs["top_p"] = llm_parameters_dict["top_p"]
-next_inputs["stream"] = inputs["streaming"] # False as default
+next_inputs["stream"] = inputs["stream"] # False as default
 next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
 # next_inputs["presence_penalty"] = inputs["presence_penalty"]
 # next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
@@ -103,7 +103,7 @@ async def handle_request(self, request: Request):
 frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
 presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
 repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
-streaming=False, # TODO add streaming LLM output as input to TTS
+stream=False, # TODO add stream LLM output as input to TTS
 )
 result_dict, runtime_graph = await self.megaservice.schedule(
 initial_inputs={"audio": chat_request.audio}, llm_parameters=parameters

AudioQnA/benchmark/performance/benchmark.yaml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ test_cases:
 top_k: 10
 top_p: 0.95
 repetition_penalty: 1.03
-streaming: true
+stream: true
 llmserve:
 run_test: true
 service_name: "llm-svc" # Replace with your service name

AudioQnA/docker_compose/amd/gpu/rocm/compose.yaml

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ services:
 ipc: host
 audioqna-backend-server:
 image: ${REGISTRY:-opea}/audioqna:${TAG:-latest}
-container_name: audioqna-xeon-backend-server
+container_name: audioqna-rocm-backend-server
 depends_on:
 - whisper-service
 - tgi-service

AudioQnA/kubernetes/intel/README_gmc.md

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ This involves deploying the AudioQnA custom resource. You can use audioQnA_xeon.
 ```sh
 export CLIENT_POD=$(kubectl get pod -n audioqa -l app=client-test -o jsonpath={.items..metadata.name})
 export accessUrl=$(kubectl get gmc -n audioqa -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
-kubectl exec "$CLIENT_POD" -n audioqa -- curl -s --no-buffer $accessUrl -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json'
+kubectl exec "$CLIENT_POD" -n audioqa -- curl -s --no-buffer $accessUrl -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_new_tokens":64, "do_sample": true, "stream":false}}' -H 'Content-Type: application/json'
 ```

 > [NOTE]

AudioQnA/tests/test_compose_on_gaudi.sh

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ function start_services() {
 # sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

 # Start Docker Containers
+sed -i "s|container_name: audioqna-gaudi-backend-server|container_name: audioqna-gaudi-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
 n=0
 until [[ "$n" -ge 200 ]]; do

AudioQnA/tests/test_compose_on_rocm.sh

Lines changed: 2 additions & 1 deletion
@@ -46,6 +46,7 @@ function start_services() {
 # sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

 # Start Docker Containers
+sed -i "s|container_name: audioqna-rocm-backend-server|container_name: audioqna-rocm-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
 n=0
 until [[ "$n" -ge 200 ]]; do
@@ -63,7 +64,7 @@ function validate_megaservice() {
 docker logs whisper-service > $LOG_PATH/whisper-service.log
 docker logs speecht5-service > $LOG_PATH/tts-service.log
 docker logs tgi-service > $LOG_PATH/tgi-service.log
-docker logs audioqna-xeon-backend-server > $LOG_PATH/audioqna-xeon-backend-server.log
+docker logs audioqna-rocm-backend-server > $LOG_PATH/audioqna-rocm-backend-server.log
 echo "$response" | sed 's/^"//;s/"$//' | base64 -d > speech.mp3

 if [[ $(file speech.mp3) == *"RIFF"* ]]; then

AudioQnA/tests/test_compose_on_xeon.sh

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ function start_services() {
 # sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env

 # Start Docker Containers
+sed -i "s|container_name: audioqna-xeon-backend-server|container_name: audioqna-xeon-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
 n=0
 until [[ "$n" -ge 200 ]]; do

AudioQnA/tests/test_gmc_on_gaudi.sh

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ function validate_audioqa() {
 export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
 echo "$CLIENT_POD"
 accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
-byte_str=$(kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json' | jq .byte_str)
+byte_str=$(kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_tokens":64, "do_sample": true, "stream":false}}' -H 'Content-Type: application/json' | jq .byte_str)
 echo "$byte_str" > $LOG_PATH/curl_audioqa.log
 if [ -z "$byte_str" ]; then
 echo "audioqa failed, please check the logs in ${LOG_PATH}!"

AudioQnA/tests/test_gmc_on_xeon.sh

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ function validate_audioqa() {
 export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath={.items..metadata.name})
 echo "$CLIENT_POD"
 accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='audioqa')].status.accessUrl}")
-byte_str=$(kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_tokens":64, "do_sample": true, "streaming":false}}' -H 'Content-Type: application/json' | jq .byte_str)
+byte_str=$(kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -s -X POST -d '{"byte_str": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "parameters":{"max_tokens":64, "do_sample": true, "stream":false}}' -H 'Content-Type: application/json' | jq .byte_str)
 echo "$byte_str" > $LOG_PATH/curl_audioqa.log
 if [ -z "$byte_str" ]; then
 echo "audioqa failed, please check the logs in ${LOG_PATH}!"

AvatarChatbot/avatarchatbot.py

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **k
 next_inputs["messages"] = [{"role": "user", "content": inputs["asr_result"]}]
 next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
 next_inputs["top_p"] = llm_parameters_dict["top_p"]
-next_inputs["stream"] = inputs["streaming"] # False as default
+next_inputs["stream"] = inputs["stream"] # False as default
 next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
 # next_inputs["presence_penalty"] = inputs["presence_penalty"]
 # next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
@@ -112,7 +112,7 @@ async def handle_request(self, request: Request):
 top_p=chat_request.top_p if chat_request.top_p else 0.95,
 temperature=chat_request.temperature if chat_request.temperature else 0.01,
 repetition_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 1.03,
-streaming=False, # TODO add streaming LLM output as input to TTS
+stream=False, # TODO add stream LLM output as input to TTS
 )
 # print(parameters)

AvatarChatbot/tests/test_compose_on_gaudi.sh

Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,7 @@ function start_services() {
 export FPS=10

 # Start Docker Containers
+sed -i "s|container_name: avatarchatbot-gaudi-backend-server|container_name: avatarchatbot-gaudi-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
 n=0
 until [[ "$n" -ge 200 ]]; do

AvatarChatbot/tests/test_compose_on_xeon.sh

Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,7 @@ function start_services() {
 export FPS=10

 # Start Docker Containers
+sed -i "s|container_name: avatarchatbot-xeon-backend-server|container_name: avatarchatbot-xeon-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose up -d
 n=0
 until [[ "$n" -ge 100 ]]; do

ChatQnA/benchmark/performance/kubernetes/intel/gaudi/benchmark.yaml

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ test_cases:
 top_k: 10
 top_p: 0.95
 repetition_penalty: 1.03
-streaming: true
+stream: true
 llmserve:
 run_test: false
 service_name: "chatqna-tgi" # Replace with your service name

ChatQnA/chatqna.py

Lines changed: 3 additions & 3 deletions
@@ -76,7 +76,7 @@ def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **k
 next_inputs["messages"] = [{"role": "user", "content": inputs["inputs"]}]
 next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
 next_inputs["top_p"] = llm_parameters_dict["top_p"]
-next_inputs["stream"] = inputs["streaming"]
+next_inputs["stream"] = inputs["stream"]
 next_inputs["frequency_penalty"] = inputs["frequency_penalty"]
 # next_inputs["presence_penalty"] = inputs["presence_penalty"]
 # next_inputs["repetition_penalty"] = inputs["repetition_penalty"]
@@ -158,7 +158,7 @@ def align_outputs(self, data, cur_node, inputs, runtime_graph, llm_parameters_di

 next_data["inputs"] = prompt

-elif self.services[cur_node].service_type == ServiceType.LLM and not llm_parameters_dict["streaming"]:
+elif self.services[cur_node].service_type == ServiceType.LLM and not llm_parameters_dict["stream"]:
 next_data["text"] = data["choices"][0]["message"]["content"]
 else:
 next_data = data
@@ -342,7 +342,7 @@ async def handle_request(self, request: Request):
 frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
 presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
 repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
-streaming=stream_opt,
+stream=stream_opt,
 chat_template=chat_request.chat_template if chat_request.chat_template else None,
 )
 retriever_parameters = RetrieverParms(

ChatQnA/chatqna_wrapper.py

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ async def handle_request(self, request: Request):
 frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
 presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
 repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
-streaming=stream_opt,
+stream=stream_opt,
 chat_template=chat_request.chat_template if chat_request.chat_template else None,
 )
 retriever_parameters = RetrieverParms(

ChatQnA/tests/test_compose_on_gaudi.sh

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ function start_services() {
 export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}

 # Start Docker Containers
+sed -i "s|container_name: chatqna-gaudi-backend-server|container_name: chatqna-gaudi-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log

 n=0

ChatQnA/tests/test_compose_on_rocm.sh

Lines changed: 1 addition & 0 deletions
@@ -65,6 +65,7 @@ function start_services() {
 cd "$WORKPATH"/docker_compose/amd/gpu/rocm

 # Start Docker Containers
+sed -i "s|container_name: chatqna-backend-server|container_name: chatqna-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose -f compose.yaml up -d > "${LOG_PATH}"/start_services_with_compose.log

 n=0

ChatQnA/tests/test_compose_on_xeon.sh

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ function start_services() {
 export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}

 # Start Docker Containers
+sed -i "s|container_name: chatqna-xeon-backend-server|container_name: chatqna-xeon-backend-server\n volumes:\n - \"${WORKPATH}\/docker_image_build\/GenAIComps:\/home\/user\/GenAIComps\"|g" compose.yaml
 docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log

 n=0

CodeGen/benchmark/performance/benchmark.yaml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ test_cases:
 top_k: 10
 top_p: 0.95
 repetition_penalty: 1.03
-streaming: true
+stream: true
 llmserve:
 run_test: true
 service_name: "llm-svc" # Replace with your service name

CodeGen/codegen.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ async def handle_request(self, request: Request):
 frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
 presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
 repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
-streaming=stream_opt,
+stream=stream_opt,
 )
 result_dict, runtime_graph = await self.megaservice.schedule(
 initial_inputs={"query": prompt}, llm_parameters=parameters

CodeGen/docker_compose/amd/gpu/rocm/README.md

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ curl http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}/generate \
 ```bash
 curl http://${HOST_IP}:${CODEGEN_LLM_SERVICE_PORT}/v1/chat/completions\
 -X POST \
--d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
+-d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"stream":true}' \
 -H 'Content-Type: application/json'
 ```

CodeGen/docker_compose/intel/cpu/xeon/README.md

Lines changed: 2 additions & 2 deletions
@@ -138,7 +138,7 @@ docker compose up -d
 ```bash
 curl http://${host_ip}:9000/v1/chat/completions\
 -X POST \
--d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
+-d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"stream":true}' \
 -H 'Content-Type: application/json'
 ```

@@ -250,7 +250,7 @@ There are 4 areas worth noting as shown in the screenshot above:

 1. Enter and submit your question
 2. Your previous questions
-3. Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support streaming output)
+3. Answers from AI assistant (Code will be highlighted properly according to the programming language it is written in, also support stream output)
 4. Copy or replace code with one click (Note that you need to select the code in the editor first and then click "replace", otherwise the code will be inserted)

 You can also select the code in the editor and ask the AI assistant questions about the code directly.
