Commit b418bdf

shihaobai authored on Mar 3, 2025
fix /tokens docs && tag v1.0.1 (#753)
Co-authored-by: shihaobai <baishihao@sensetime.com>
1 parent 2dbd8e5 commit b418bdf

File tree: 5 files changed, +7 −3 lines
 

Dockerfile (+3, −1)

@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.4.0-runtime-ubuntu20.04 as base
+FROM nvcr.io/nvidia/tritonserver:24.04-py3-min as base
 ARG PYTORCH_VERSION=2.5.1
 ARG PYTHON_VERSION=3.9
 ARG CUDA_VERSION=12.4
@@ -38,5 +38,7 @@ WORKDIR /root
 COPY ./requirements.txt /lightllm/requirements.txt
 RUN pip install -r /lightllm/requirements.txt --no-cache-dir --ignore-installed --extra-index-url https://download.pytorch.org/whl/cu124
 
+RUN pip install --no-cache-dir nvidia-nccl-cu12==2.25.1 # for allreduce hang issues in multinode H100
+
 COPY . /lightllm
 RUN pip install -e /lightllm --no-cache-dir

docs/CN/source/getting_started/quickstart.rst (+1)

@@ -68,6 +68,7 @@
 双机H100部署 DeepSeek-R1 模型,启动命令如下:
 
 .. code-block:: console
+
     $ # Node 0
     $ LOADWORKER=8 python -m lightllm.server.api_server --model_dir ~/models/DeepSeek-R1 --tp 16 --graph_max_batch_size 100 --nccl_host master_addr --nnodes 2 --node_rank 0
     $ # Node 1
docs/EN/source/getting_started/quickstart.rst (+1)

@@ -65,6 +65,7 @@ For the DeepSeek-R1 model on single H200, it can be launched with the following
 For the DeepSeek-R1 model on two H100, it can be launched with the following command:
 
 .. code-block:: console
+
     $ # Node 0
     $ LOADWORKER=8 python -m lightllm.server.api_server --model_dir ~/models/DeepSeek-R1 --tp 16 --graph_max_batch_size 100 --nccl_host master_addr --nnodes 2 --node_rank 0
     $ # Node 1

lightllm/server/api_http.py (+1, −1)

@@ -326,7 +326,7 @@ async def tokens(request: Request):
     try:
         request_dict = await request.json()
         prompt = request_dict.pop("text")
-        parameters = request_dict.pop("parameters")
+        parameters = request_dict.pop("parameters", {})
         return JSONResponse({"ntokens": g_objs.httpserver_manager.tokens(prompt, parameters)}, status_code=200)
     except Exception as e:
         return create_error_response(HTTPStatus.EXPECTATION_FAILED, f"error: {str(e)}")
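The one-line change above makes the `parameters` field of a `/tokens` request optional: `dict.pop("parameters")` raises `KeyError` when the key is absent, so a request body containing only `text` previously fell into the `except` branch and returned an error. A minimal sketch of the before/after behavior (the whitespace token count and the `count_tokens` helper are stand-ins for illustration, not lightllm's actual tokenizer):

```python
def count_tokens(request_dict: dict) -> dict:
    """Illustrative stand-in for the /tokens handler body."""
    prompt = request_dict.pop("text")
    # Before the fix: request_dict.pop("parameters") -> KeyError if the
    # client omitted the field. With a default, the handler proceeds.
    parameters = request_dict.pop("parameters", {})
    # Hypothetical token count: split on whitespace instead of calling
    # g_objs.httpserver_manager.tokens(prompt, parameters).
    return {"ntokens": len(prompt.split()), "parameters": parameters}

# A request body without "parameters" now succeeds instead of erroring:
print(count_tokens({"text": "hello world"}))
```

Passing an explicit `parameters` dict still behaves as before; only the missing-key case changes.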

setup.py (+1, −1)

@@ -3,7 +3,7 @@
 package_data = {"lightllm": ["common/all_kernel_configs/*/*.json"]}
 setup(
     name="lightllm",
-    version="1.0.0",
+    version="1.0.1",
     packages=find_packages(exclude=("build", "include", "test", "dist", "docs", "benchmarks", "lightllm.egg-info")),
     author="model toolchain",
     author_email="",

0 commit comments