Commit 66e9ef3

chore(model gallery): add DeepSeek R1 14b, 32b and 70b (mudler#4679)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
1 parent 8282414 commit 66e9ef3

1 file changed: 80 additions, 33 deletions

gallery/index.yaml

@@ -2696,39 +2696,6 @@
     - filename: Qwentile2.5-32B-Instruct-Q4_K_M.gguf
       sha256: e476d6e3c15c78fc3f986d7ae8fa35c16116843827f2e6243c05767cef2f3615
       uri: huggingface://bartowski/Qwentile2.5-32B-Instruct-GGUF/Qwentile2.5-32B-Instruct-Q4_K_M.gguf
-- !!merge <<: *qwen25
-  name: "deepseek-r1-distill-qwen-1.5b"
-  icon: "https://avatars.githubusercontent.com/u/148330874"
-  urls:
-    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5b
-    - https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF
-  description: |
-    DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks.
-    Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing.
-    By incorporating cold-start data prior to the RL phase,DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.
-  overrides:
-    parameters:
-      model: deepseek-r1-distill-qwen-1.5b-Q4_K_M.gguf
-  files:
-    - filename: deepseek-r1-distill-qwen-1.5b-Q4_K_M.gguf
-      sha256: c2c43b6018cf7700ce0ddee8807deb1a9a26758ef878232f3a142d16df81f0fe
-      uri: huggingface://unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
-- !!merge <<: *qwen25
-  name: "deepseek-r1-distill-qwen-7b"
-  urls:
-    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
-    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF
-  description: |
-    DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks.
-    Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing.
-    By incorporating cold-start data prior to the RL phase,DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.
-  overrides:
-    parameters:
-      model: DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
-  files:
-    - filename: DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
-      sha256: 731ece8d06dc7eda6f6572997feb9ee1258db0784827e642909d9b565641937b
-      uri: huggingface://bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
 - &archfunct
   license: apache-2.0
   tags:
@@ -5334,6 +5301,86 @@
     - filename: archangel_sft_pythia2-8b.Q4_K_M.gguf
       sha256: a47782c55ef2b39b19644213720a599d9849511a73c9ebb0c1de749383c0a0f8
       uri: huggingface://RichardErkhov/ContextualAI_-_archangel_sft_pythia2-8b-gguf/archangel_sft_pythia2-8b.Q4_K_M.gguf
+- &deepseek-r1 ## Start DeepSeek-R1
+  url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
+  name: "deepseek-r1-distill-qwen-1.5b"
+  icon: "https://avatars.githubusercontent.com/u/148330874"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5b
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF
+  description: |
+    DeepSeek-R1 is our advanced first-generation reasoning model designed to enhance performance in reasoning tasks.
+    Building on the foundation laid by its predecessor, DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without supervised fine-tuning, DeepSeek-R1 addresses the challenges faced by R1-Zero, such as endless repetition, poor readability, and language mixing.
+    By incorporating cold-start data prior to the RL phase,DeepSeek-R1 significantly improves reasoning capabilities and achieves performance levels comparable to OpenAI-o1 across a variety of domains, including mathematics, coding, and complex reasoning tasks.
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
+      sha256: 1741e5b2d062b07acf048bf0d2c514dadf2a48f94e2b4aa0cfe069af3838ee2f
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "deepseek-r1-distill-qwen-7b"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
+      sha256: 731ece8d06dc7eda6f6572997feb9ee1258db0784827e642909d9b565641937b
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "deepseek-r1-distill-qwen-14b"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf
+      sha256: 0b319bd0572f2730bfe11cc751defe82045fad5085b4e60591ac2cd2d9633181
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "deepseek-r1-distill-qwen-32b"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf
+      sha256: bed9b0f551f5b95bf9da5888a48f0f87c37ad6b72519c4cbd775f54ac0b9fc62
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF/DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "deepseek-r1-distill-llama-8b"
+  icon: "https://avatars.githubusercontent.com/u/148330874"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Llama-8B-GGUF
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
+      sha256: 87bcba20b4846d8dadf753d3ff48f9285d131fc95e3e0e7e934d4f20bc896f5d
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Llama-8B-GGUF/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
+- !!merge <<: *deepseek-r1
+  name: "deepseek-r1-distill-llama-70b"
+  icon: "https://avatars.githubusercontent.com/u/148330874"
+  urls:
+    - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
+    - https://huggingface.co/bartowski/DeepSeek-R1-Distill-Llama-70B-GGUF
+  overrides:
+    parameters:
+      model: DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf
+  files:
+    - filename: DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf
+      sha256: 181a82a1d6d2fa24fe4db83a68eee030384986bdbdd4773ba76424e3a6eb9fd8
+      uri: huggingface://bartowski/DeepSeek-R1-Distill-Llama-70B-GGUF/DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf
 - &qwen2 ## Start QWEN2
   url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
   name: "qwen2-7b-instruct"
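
Note on the merge keys used in this diff: each `!!merge <<: *deepseek-r1` entry inherits every key of the `&deepseek-r1` anchor and replaces only the keys it sets itself. The Python sketch below imitates that behaviour with plain dictionaries. It is an illustration under stated assumptions, not LocalAI code: the helper name `merge_entry` is hypothetical, the dictionaries are abbreviated copies of two entries from the diff, and the merge is assumed to be shallow, as the YAML merge key type defines it.

```python
# Abbreviated copy of the anchor entry (&deepseek-r1) from the diff.
anchor = {
    "url": "github:mudler/LocalAI/gallery/chatml.yaml@master",
    "name": "deepseek-r1-distill-qwen-1.5b",
    "icon": "https://avatars.githubusercontent.com/u/148330874",
    "overrides": {
        "parameters": {"model": "DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf"}
    },
}

# Keys that the 14b entry sets locally (abbreviated).
local = {
    "name": "deepseek-r1-distill-qwen-14b",
    "overrides": {
        "parameters": {"model": "DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf"}
    },
}


def merge_entry(anchor: dict, local: dict) -> dict:
    """Hypothetical helper: shallow merge where locally set keys win."""
    return {**anchor, **local}


entry = merge_entry(anchor, local)
print(entry["name"])                              # set locally
print(entry["url"])                               # inherited from the anchor
print(entry["overrides"]["parameters"]["model"])  # whole "overrides" key replaced
```

Because the merge is shallow, an entry that sets `overrides` or `files` replaces the anchor's value for that key as a whole, which is presumably why every entry in the diff repeats both blocks in full.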
