Commit 9c4c9cc
Move convert.py to examples/convert-legacy-llama.py (ggml-org#7430)
* Move convert.py to examples/convert-no-torch.py
* Fix CI, scripts, readme files
* convert-no-torch -> convert-legacy-llama
* Move vocab thing to vocab.py
* Fix convert-no-torch -> convert-legacy-llama
* Fix lost convert.py in ci/run.sh
* Fix imports
* Fix gguf not imported correctly
* Fix flake8 complaints
* Fix check-requirements.sh
* Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE
* Review fixes
1 parent 59b0d07 commit 9c4c9cc

20 files changed: +343, -440 lines

.devops/tools.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ arg1="$1"
 shift
 
 if [[ "$arg1" == '--convert' || "$arg1" == '-c' ]]; then
-    python3 ./convert.py "$@"
+    python3 ./convert-hf-to-gguf.py "$@"
 elif [[ "$arg1" == '--quantize' || "$arg1" == '-q' ]]; then
     ./quantize "$@"
 elif [[ "$arg1" == '--run' || "$arg1" == '-r' ]]; then

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1314,7 +1314,7 @@ set_target_properties(llama PROPERTIES PUBLIC_HEADER ${CMAKE_CURRENT_SOURCE_DIR}
 install(TARGETS llama LIBRARY PUBLIC_HEADER)
 
 install(
-    FILES convert.py
+    FILES convert-hf-to-gguf.py
     PERMISSIONS
         OWNER_READ
         OWNER_WRITE

README.md

Lines changed: 4 additions & 3 deletions
@@ -704,7 +704,8 @@ Building the program with BLAS support may lead to some performance improvements
 
 To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
 
-Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
+Note: `convert.py` has been moved to `examples/convert-legacy-llama.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derivatives.
+It does not support LLaMA 3; you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
 
 ```bash
 # obtain the official LLaMA model weights and place them in ./models
@@ -721,10 +722,10 @@ ls ./models
 python3 -m pip install -r requirements.txt
 
 # convert the model to ggml FP16 format
-python3 convert.py models/mymodel/
+python3 convert-hf-to-gguf.py models/mymodel/
 
 # [Optional] for models using BPE tokenizers
-python convert.py models/mymodel/ --vocab-type bpe
+python convert-hf-to-gguf.py models/mymodel/ --vocab-type bpe
 
 # quantize the model to 4-bits (using Q4_K_M method)
 ./quantize ./models/mymodel/ggml-model-f16.gguf ./models/mymodel/ggml-model-Q4_K_M.gguf Q4_K_M

ci/run.sh

Lines changed: 1 addition & 1 deletion
@@ -287,7 +287,7 @@ function gg_run_open_llama_7b_v2 {
     (time cmake -DCMAKE_BUILD_TYPE=Release ${CMAKE_EXTRA} -DLLAMA_CUDA=1 .. ) 2>&1 | tee -a $OUT/${ci}-cmake.log
     (time make -j ) 2>&1 | tee -a $OUT/${ci}-make.log
 
-    python3 ../convert.py ${path_models} --outfile ${path_models}/ggml-model-f16.gguf
+    python3 ../examples/convert-legacy-llama.py ${path_models} --outfile ${path_models}/ggml-model-f16.gguf
 
     model_f16="${path_models}/ggml-model-f16.gguf"
     model_q8_0="${path_models}/ggml-model-q8_0.gguf"

convert-hf-to-gguf.py

Lines changed: 1 addition & 3 deletions
@@ -25,8 +25,6 @@
 sys.path.insert(1, str(Path(__file__).parent / 'gguf-py'))
 import gguf
 
-from convert import LlamaHfVocab
-
 logger = logging.getLogger("hf-to-gguf")
 
 
@@ -634,7 +632,7 @@ def _set_vocab_sentencepiece(self):
         special_vocab.add_to_gguf(self.gguf_writer)
 
     def _set_vocab_llama_hf(self):
-        vocab = LlamaHfVocab(self.dir_model)
+        vocab = gguf.LlamaHfVocab(self.dir_model)
         tokens = []
         scores = []
         toktypes = []
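
With this change, `LlamaHfVocab` is reached through the `gguf` package (the commit moves the vocab code into `vocab.py` inside `gguf-py`) instead of being imported from the old top-level `convert.py`. A minimal sketch of the new import path, assuming a hypothetical `models/mymodel` directory containing a Hugging Face tokenizer, and iterating with `all_tokens()` the way `_set_vocab_llama_hf` does:

```python
import sys
from pathlib import Path

# Prefer the vendored gguf-py package over any pip-installed gguf,
# mirroring what convert-hf-to-gguf.py does at the top of the script.
sys.path.insert(1, str(Path(__file__).parent / 'gguf-py'))
import gguf

# LlamaHfVocab now lives in the gguf package rather than in convert.py.
vocab = gguf.LlamaHfVocab(Path('models/mymodel'))  # hypothetical model dir
for text, score, toktype in vocab.all_tokens():
    print(text, score, toktype)
```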

docs/HOWTO-add-model.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ Also, it is important to check that the examples and main ggml backends (CUDA, M
 ### 1. Convert the model to GGUF
 
 This step is done in python with a `convert` script using the [gguf](https://pypi.org/project/gguf/) library.
-Depending on the model architecture, you can use either [convert.py](../convert.py) or [convert-hf-to-gguf.py](../convert-hf-to-gguf.py).
+Depending on the model architecture, you can use either [convert-hf-to-gguf.py](../convert-hf-to-gguf.py) or [examples/convert-legacy-llama.py](../examples/convert-legacy-llama.py) (for `llama/llama2` models in `.pth` format).
 
 The convert script reads the model configuration, tokenizer, tensor names+data and converts them to GGUF metadata and tensors.
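
To illustrate what "converts them to GGUF metadata and tensors" means in practice, here is a minimal sketch of the output stage using the `gguf` writer API; the architecture string, metadata values, and tensor are placeholders, not a real model:

```python
import numpy as np
import gguf

# Sketch: a convert script's final stage writes key/value metadata
# and tensor data into a single GGUF file.
writer = gguf.GGUFWriter('model-f16.gguf', arch='llama')  # placeholder arch
writer.add_block_count(32)       # placeholder metadata value
writer.add_context_length(4096)  # placeholder metadata value
writer.add_tensor('output.weight', np.zeros((8, 8), dtype=np.float16))  # tiny dummy tensor

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```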
