1 parent 3aabad2 commit e90bf8b
README.md
@@ -2,6 +2,12 @@
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
+The official and recommended backend server for ExLlamaV2 is [TabbyAPI](https://github.com/theroyallab/tabbyAPI/),
+which provides an OpenAI-compatible API for local or remote inference, with extended features such as HF model
+downloading, embedding model support, and HF Jinja2 chat template support.
+
+See the [wiki](https://github.com/theroyallab/tabbyAPI/wiki/1.-Getting-Started) for help getting started.
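Since TabbyAPI exposes an OpenAI-compatible endpoint, a client can talk to it with plain HTTP. Below is a minimal sketch that builds (but does not send) a `/chat/completions` request; the host, port, model name, and API key are placeholder assumptions, not TabbyAPI defaults — see the wiki linked above for real configuration.

```python
import json
import urllib.request

# Placeholder base URL for a locally running TabbyAPI instance (assumption).
BASE_URL = "http://127.0.0.1:5000/v1"


def build_chat_request(prompt: str, model: str = "my-exl2-model") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request (construction only, not sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR_API_KEY",  # placeholder key (assumption)
        },
        method="POST",
    )


req = build_chat_request("Hello!")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or pointing an OpenAI SDK client's `base_url` at the server) would then return a standard chat completion response.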
## New in v0.1.0+: