doc: Add TRT-LLM backend to the doc (#102)
* Add TRT-LLM backend to the doc

* Add TRT-LLM backend to platform support matrix

* Switch the order of vLLM and TRT-LLM
krishung5 authored Aug 29, 2024
1 parent 30fa78a commit ac03a5d
Showing 2 changed files with 10 additions and 1 deletion.
8 changes: 8 additions & 0 deletions README.md
@@ -115,6 +115,14 @@ random forest models. The
[fil_backend](https://github.com/triton-inference-server/fil_backend) repo
contains the documentation and source for the backend.

**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
Check out the
[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
for more information. The
[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
repo contains the documentation and source for the backend.
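To make the paragraph above concrete, the sketch below shows roughly what a Triton model configuration for a TensorRT-LLM model can look like. This is an illustrative fragment only: the model name, `max_batch_size`, and the engine path are assumptions, and the authoritative templates live in the tensorrtllm_backend repo.

```
# Illustrative config.pbtxt sketch for a TensorRT-LLM model.
# Field values here are placeholders, not the official template.
name: "tensorrt_llm"
backend: "tensorrtllm"
max_batch_size: 8

parameters: {
  key: "gpt_model_path"
  value: { string_value: "/models/tensorrt_llm/1" }
}
```

In a typical deployment this file sits alongside the compiled TensorRT-LLM engine inside a Triton model repository directory.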

**vLLM**: The vLLM backend is designed to run
[supported models](https://vllm.readthedocs.io/en/latest/models/supported_models.html)
on a [vLLM engine](https://github.com/vllm-project/vllm/blob/main/vllm/engine/async_llm_engine.py).
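As a rough illustration of how the vLLM backend is configured, the fragment below sketches a `model.json` placed in the model's version directory; the keys are forwarded to the vLLM engine as engine arguments. The model name and values here are example assumptions, so consult the vllm_backend repo for the supported fields.

```json
{
  "model": "facebook/opt-125m",
  "disable_log_requests": true,
  "gpu_memory_utilization": 0.5
}
```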
3 changes: 2 additions & 1 deletion docs/backend_platform_support_matrix.md
@@ -1,5 +1,5 @@
<!--
# Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
@@ -53,6 +53,7 @@ each backend on different platforms.
| Python[^1] | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU |
| DALI | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | :heavy_check_mark: GPU[^2] <br/> :heavy_check_mark: CPU[^2] |
| FIL | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |
| TensorRT-LLM | :heavy_check_mark: GPU <br/> :x: CPU | :heavy_check_mark: GPU <br/> :x: CPU |
| vLLM | :heavy_check_mark: GPU <br/> :heavy_check_mark: CPU | Unsupported |


