Commit 8edfd6d

yoshoku and krschacht committed
docs: update description of preparing quantized model in usage section
Co-authored-by: Keith Schacht <krschacht@gmail.com>
1 parent 856a07f


README.md

Lines changed: 6 additions & 3 deletions
@@ -32,7 +32,10 @@ $ gem install llama_cpp -- --with-opt-dir=/opt/homebrew
 ## Usage
 
 Prepare the quantized model by referring to [the usage section on the llama.cpp README](https://github.com/ggerganov/llama.cpp#usage).
-For example, preparing the quatization model based on [open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) is as follows:
+For example, you could prepare a quantized model based on
+[open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b),
+or, more useful in the context of Ruby, a smaller model such as
+[tiny_llama_1b](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0):
 
 ```sh
 $ cd ~/
@@ -44,9 +47,9 @@ $ python3 -m pip install -r requirements.txt
 $ cd models
 $ git clone https://huggingface.co/openlm-research/open_llama_7b
 $ cd ../
-$ python3 convert.py models/open_llama_7b
+$ python3 convert-hf-to-gguf.py models/open_llama_7b
 $ make
-$ ./quantize ./models/open_llama_7b/ggml-model-f16.gguf ./models/open_llama_7b/ggml-model-q4_0.bin q4_0
+$ ./llama-quantize ./models/open_llama_7b/ggml-model-f16.gguf ./models/open_llama_7b/ggml-model-q4_0.bin q4_0
 ```
 
 An example of Ruby code that generates sentences with the quantized model is as follows:
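Before moving on to Ruby, it can be worth sanity-checking the quantized file directly with llama.cpp's own CLI. A minimal sketch, assuming the `llama-cli` binary produced by the same `make` step that builds `llama-quantize` (the prompt and token count here are illustrative, not part of this commit):

```sh
# Run a short completion against the freshly quantized model:
# -m is the model path, -p the prompt, -n the number of tokens to generate.
$ ./llama-cli -m ./models/open_llama_7b/ggml-model-q4_0.bin -p "Hello, World." -n 32
```

If this prints a plausible continuation, the conversion and quantization steps above succeeded.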
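The Ruby example itself sits outside this hunk and is unchanged by the commit. For orientation only, here is a minimal sketch of generating text with the quantized model through the gem, assuming its high-level `LLaMACpp::Model` / `LLaMACpp::Context` / `LLaMACpp.generate` API (the exact class names and signatures should be checked against the gem's current README):

```ruby
require 'llama_cpp'

# Assumed API: load the quantized model file produced by the steps above.
model_params = LLaMACpp::ModelParams.new
model = LLaMACpp::Model.new(model_path: './models/open_llama_7b/ggml-model-q4_0.bin',
                            params: model_params)

# Create an inference context over the loaded model.
context_params = LLaMACpp::ContextParams.new
context = LLaMACpp::Context.new(model: model, params: context_params)

# Generate a continuation of the prompt and print it.
puts LLaMACpp.generate(context, 'Hello, World.')
```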
