---
## START Mistral
- &mistral03
  url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master"
  name: "mistral-7b-instruct-v0.3"
  icon: https://cdn-avatars.huggingface.co/v1/production/uploads/62dac1c7a8ead43d20e3e17a/wrLf5yaGC6ng4XME70w6Z.png
  license: apache-2.0
  description: |
    The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.3.
    Mistral-7B-v0.3 has the following changes compared to Mistral-7B-v0.2:
    - Extended vocabulary to 32768
    - Supports v3 Tokenizer
    - Supports function calling
  urls:
    - https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
    - https://huggingface.co/MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF
  tags:
    - llm
    - gguf
    - gpu
    - mistral
    - cpu
    - function-calling
  overrides:
    parameters:
      model: Mistral-7B-Instruct-v0.3.Q4_K_M.gguf
  files:
    - filename: "Mistral-7B-Instruct-v0.3.Q4_K_M.gguf"
      sha256: "14850c84ff9f06e9b51d505d64815d5cc0cea0257380353ac0b3d21b21f6e024"
      uri: "huggingface://MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3.Q4_K_M.gguf"
### START mudler's LocalAI specific-models
- &mudler
  url: "github:mudler/LocalAI/gallery/mudler.yaml@master"
  name: "LocalAI-llama3-8b-function-call-v0.2"
  icon: "https://cdn-uploads.huggingface.co/production/uploads/647374aa7ff32a81ac6d35d4/us5JKi9z046p8K-cn_M0w.webp"
  license: llama3
  description: |
    This model is a fine-tune on a custom dataset + glaive, built specifically to leverage LocalAI's constrained-grammar features.
    Once the model enters tools mode, it always replies with JSON.
  urls:
    - https://huggingface.co/mudler/LocalAI-Llama3-8b-Function-Call-v0.2-GGUF
    - https://huggingface.co/mudler/LocalAI-Llama3-8b-Function-Call-v0.2
  tags:
    - llm
    - gguf
    - gpu
    - cpu
    - llama3
    - function-calling
  overrides:
    parameters:
      model: LocalAI-Llama3-8b-Function-Call-v0.2-q4_k_m.bin
  files:
    - filename: LocalAI-Llama3-8b-Function-Call-v0.2-q4_k_m.bin
      sha256: 7e46405ce043cbc8d30f83f26a5655dc8edf5e947b748d7ba2745bd0af057a41
      uri: huggingface://mudler/LocalAI-Llama3-8b-Function-Call-v0.2-GGUF/LocalAI-Llama3-8b-Function-Call-v0.2-q4_k_m.bin
- !!merge <<: *mudler
  icon: "https://cdn-uploads.huggingface.co/production/uploads/647374aa7ff32a81ac6d35d4/SKuXcvmZ_6oD4NCMkvyGo.png"
  name: "mirai-nova-llama3-LocalAI-8b-v0.1"
  urls:
    - https://huggingface.co/mudler/Mirai-Nova-Llama3-LocalAI-8B-v0.1-GGUF
    - https://huggingface.co/mudler/Mirai-Nova-Llama3-LocalAI-8B-v0.1
  description: |
    Mirai Nova: "Mirai" means future in Japanese, and "Nova" references a star showing a sudden large increase in brightness.
    A set of models oriented toward function calling, but generalist and with enhanced reasoning capability. This is fine-tuned with Llama3.
    Mirai Nova works particularly well with LocalAI, leveraging the function call with grammars feature out of the box.
  overrides:
    parameters:
      model: Mirai-Nova-Llama3-LocalAI-8B-v0.1-q4_k_m.bin
  files:
    - filename: Mirai-Nova-Llama3-LocalAI-8B-v0.1-q4_k_m.bin
      sha256: 579cbb229f9c11d0330759ff4733102d2491615a4c61289e26c09d1b3a583fec
      uri: huggingface://mudler/Mirai-Nova-Llama3-LocalAI-8B-v0.1-GGUF/Mirai-Nova-Llama3-LocalAI-8B-v0.1-q4_k_m.bin
- &parler-tts
  ### START parler-tts
  url: "github:mudler/LocalAI/gallery/parler-tts.yaml@master"
  name: parler-tts-mini-v0.1
  parameters:
    model: parler-tts/parler_tts_mini_v0.1
  license: apache-2.0
  description: |
    Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural sounding speech in the style of a given speaker (gender, pitch, speaking style, etc). It is a reproduction of work from the paper Natural language guidance of high-fidelity text-to-speech with synthetic annotations by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively.
  urls:
    - https://github.com/huggingface/parler-tts
  tags:
    - tts
    - gpu
    - cpu
    - text-to-speech
    - python
- &rerankers
  ### START rerankers
  url: "github:mudler/LocalAI/gallery/rerankers.yaml@master"
  name: cross-encoder
  parameters:
    model: cross-encoder
  license: apache-2.0
  description: |
    A cross-encoder model that can be used for reranking
  tags:
    - reranker
    - gpu
    - python
## LLMs
### START LLAMA3
- name: "einstein-v6.1-llama3-8b"
  url: "github:mudler/LocalAI/gallery/hermes-2-pro-mistral.yaml@master"
  icon: https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/5s12oq859qLfDkkTNam_C.png
  urls:
    - https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B
  tags:
    - llm
    - gguf
    - gpu
    - cpu
    - llama3
  license: llama3
  description: |
    This model is a full fine-tuned version of meta-llama/Meta-Llama-3-8B on diverse datasets.
    This model is fine-tuned on 8xRTX3090 + 1xRTXA6000 using axolotl.
  overrides:
    parameters:
      model: Einstein-v6.1-Llama3-8B-Q4_K_M.gguf
  files:
    - filename: Einstein-v6.1-Llama3-8B-Q4_K_M.gguf
      sha256: 447587bd8f60d9050232148d34fdb2d88b15b2413fd7f8e095a4606ec60b45bf
      uri: huggingface://bartowski/Einstein-v6.1-Llama3-8B-GGUF/Einstein-v6.1-Llama3-8B-Q4_K_M.gguf
- &llama3
  url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master"
  name: "llama3-8b-instruct"
  license: llama3
  description: |
    Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8B and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
    Model developers: Meta
    Variations: Llama 3 comes in two sizes (8B and 70B parameters) in pre-trained and instruction tuned variants.
    Input: Models input text only.
    Output: Models generate text and code only.
    Model Architecture: Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
  urls:
    - https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
    - https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
  tags:
    - llm
    - gguf
    - gpu
    - cpu
    - llama3
  overrides:
    parameters:
      model: Meta-Llama-3-8B-Instruct.Q4_0.gguf
  files:
    - filename: Meta-Llama-3-8B-Instruct.Q4_0.gguf
      sha256: 19ded996fe6c60254dc7544d782276eff41046ed42aa5f2d0005dc457e5c0895
      uri: huggingface://QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct.Q4_0.gguf
- !!merge <<: *llama3
  name: "llama3-8b-instruct:Q6_K"
  overrides:
    parameters:
      model: Meta-Llama-3-8B-Instruct.Q6_K.gguf
  files:
    - filename: Meta-Llama-3-8B-Instruct.Q6_K.gguf
      sha256: b7bad45618e2a76cc1e89a0fbb93a2cac9bf410e27a619c8024ed6db53aa9b4a
      uri: huggingface://QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct.Q6_K.gguf
- !!merge <<: *llama3
  name: "llama-3-8b-instruct-abliterated"
  urls:
    - https://huggingface.co/failspy/Llama-3-8B-Instruct-abliterated-GGUF
  description: |
    This is meta-llama/Llama-3-8B-Instruct with orthogonalized bfloat16 safetensor weights, generated with the methodology that was described in the preview paper/blog post: 'Refusal in LLMs is mediated by a single direction' which I encourage you to read to understand more.
  overrides:
    parameters:
      model: Llama-3-8B-Instruct-abliterated-q4_k.gguf
  files:
    - filename: Llama-3-8B-Instruct-abliterated-q4_k.gguf
      sha256: a6365f813de1977ae22dbdd271deee59f91f89b384eefd3ac1a391f391d8078a
      uri: huggingface://failspy/Llama-3-8B-Instruct-abliterated-GGUF/Llama-3-8B-Instruct-abliterated-q4_k.gguf
- !!merge <<: *llama3
  name: "llama-3-8b-instruct-coder"
  icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/0O4cIuv3wNbY68-FP7tak.jpeg
  urls:
    - https://huggingface.co/bartowski/Llama-3-8B-Instruct-Coder-GGUF
    - https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder
  description: |
    Original model: https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder
    All quants made using the imatrix option with a dataset provided by Kalomaze.
  overrides:
    parameters:
      model: Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
  files:
    - filename: Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
      sha256: 639ab8e3aeb7aa82cff6d8e6ef062d1c3e5a6d13e6d76e956af49f63f0e704f8
      uri: huggingface://bartowski/Llama-3-8B-Instruct-Coder-GGUF/Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama3-70b-instruct"
  overrides:
    parameters:
      model: Meta-Llama-3-70B-Instruct.Q4_K_M.gguf
  files:
    - filename: Meta-Llama-3-70B-Instruct.Q4_K_M.gguf
      sha256: c1cea5f87dc1af521f31b30991a4663e7e43f6046a7628b854c155f489eec213
      uri: huggingface://MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/Meta-Llama-3-70B-Instruct.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama3-70b-instruct:IQ1_M"
  overrides:
    parameters:
      model: Meta-Llama-3-70B-Instruct.IQ1_M.gguf
  files:
    - filename: Meta-Llama-3-70B-Instruct.IQ1_M.gguf
      sha256: cdbe8ac2126a70fa0af3fac7a4fe04f1c76330c50eba8383567587b48b328098
      uri: huggingface://MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/Meta-Llama-3-70B-Instruct.IQ1_M.gguf
- !!merge <<: *llama3
  name: "llama3-70b-instruct:IQ1_S"
  overrides:
    parameters:
      model: Meta-Llama-3-70B-Instruct.IQ1_S.gguf
  files:
    - filename: Meta-Llama-3-70B-Instruct.IQ1_S.gguf
      sha256: 3797a69f1bdf53fabf9f3a3a8c89730b504dd3209406288515c9944c14093048
      uri: huggingface://MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF/Meta-Llama-3-70B-Instruct.IQ1_S.gguf
- !!merge <<: *llama3
  name: "l3-chaoticsoliloquy-v1.5-4x8b"
  icon: https://cdn-uploads.huggingface.co/production/uploads/64f5e51289c121cb864ba464/m5urYkrpE5amrwHyaVwFM.png
  description: |
    Experimental RP-oriented MoE. The idea was to get a model that would be equal to or better than Mixtral 8x7B and its finetunes in RP/ERP tasks. I'm not sure, but it should be better than the first version.
  urls:
    - https://huggingface.co/xxx777xxxASD/L3-ChaoticSoliloquy-v1.5-4x8B
    - https://huggingface.co/mradermacher/L3-ChaoticSoliloquy-v1.5-4x8B-GGUF/
  overrides:
    parameters:
      model: L3-ChaoticSoliloquy-v1.5-4x8B.Q4_K_M.gguf
  files:
    - filename: L3-ChaoticSoliloquy-v1.5-4x8B.Q4_K_M.gguf
      sha256: f6edb2a9674ce5add5104c0a8bb3278f748d39b509c483d76cf00b066eb56fbf
      uri: huggingface://mradermacher/L3-ChaoticSoliloquy-v1.5-4x8B-GGUF/L3-ChaoticSoliloquy-v1.5-4x8B.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-sauerkrautlm-8b-instruct"
  urls:
    - https://huggingface.co/bartowski/Llama-3-SauerkrautLM-8b-Instruct-GGUF
  icon: https://vago-solutions.ai/wp-content/uploads/2024/04/Llama3-Pic.png
  tags:
    - llm
    - gguf
    - gpu
    - cpu
    - llama3
    - german
  description: |
    SauerkrautLM-llama-3-8B-Instruct
    Model Type: Llama-3-SauerkrautLM-8b-Instruct is a finetuned Model based on meta-llama/Meta-Llama-3-8B-Instruct
    Language(s): German, English
  overrides:
    parameters:
      model: Llama-3-SauerkrautLM-8b-Instruct-Q4_K_M.gguf
  files:
    - filename: Llama-3-SauerkrautLM-8b-Instruct-Q4_K_M.gguf
      sha256: 5833d99d5596cade0d02e61cddaa6dac49170864ee56d0b602933c6f9fbae314
      uri: huggingface://bartowski/Llama-3-SauerkrautLM-8b-Instruct-GGUF/Llama-3-SauerkrautLM-8b-Instruct-Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-13b-instruct-v0.1"
  urls:
    - https://huggingface.co/MaziyarPanahi/Llama-3-13B-Instruct-v0.1-GGUF
  icon: https://huggingface.co/MaziyarPanahi/Llama-3-13B-Instruct-v0.1/resolve/main/llama-3-merges.webp
  description: |
    This model is a self-merge of the meta-llama/Meta-Llama-3-8B-Instruct model.
  overrides:
    parameters:
      model: Llama-3-13B-Instruct-v0.1.Q4_K_M.gguf
  files:
    - filename: Llama-3-13B-Instruct-v0.1.Q4_K_M.gguf
      sha256: 071a28043c271d259b5ffa883d19a9e0b33269b55148c4abaf5f95da4d084266
      uri: huggingface://MaziyarPanahi/Llama-3-13B-Instruct-v0.1-GGUF/Llama-3-13B-Instruct-v0.1.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-smaug-8b"
  urls:
    - https://huggingface.co/MaziyarPanahi/Llama-3-Smaug-8B-GGUF
  icon: https://cdn-uploads.huggingface.co/production/uploads/64c14f95cac5f9ba52bbcd7f/OrcJyTaUtD2HxJOPPwNva.png
  description: |
    This model was built using the Smaug recipe for improving performance on real world multi-turn conversations applied to meta-llama/Meta-Llama-3-8B.
  overrides:
    parameters:
      model: Llama-3-Smaug-8B.Q4_K_M.gguf
  files:
    - filename: Llama-3-Smaug-8B.Q4_K_M.gguf
      sha256: b17c4c1144768ead9e8a96439165baf49e98c53d458b4da8827f137fbabf38c1
      uri: huggingface://MaziyarPanahi/Llama-3-Smaug-8B-GGUF/Llama-3-Smaug-8B.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "l3-8b-stheno-v3.1"
  urls:
    - https://huggingface.co/Sao10K/L3-8B-Stheno-v3.1
  icon: https://w.forfun.com/fetch/cb/cba2205390e517bea1ea60ca0b491af4.jpeg
  description: |
    - A model made for 1-on-1 Roleplay ideally, but one that is able to handle scenarios, RPGs and storywriting fine.
    - Uncensored during actual roleplay scenarios. # I do not care for zero-shot prompting like what some people do. It is uncensored enough in actual usecases.
    - I quite like the prose and style for this model.
  overrides:
    parameters:
      model: l3-8b-stheno-v3.1.Q4_K_M.gguf
  files:
    - filename: l3-8b-stheno-v3.1.Q4_K_M.gguf
      sha256: f166fb8b7fd1de6638fcf8e3561c99292f0c37debe1132325aa583eef78f1b40
      uri: huggingface://mudler/L3-8B-Stheno-v3.1-Q4_K_M-GGUF/l3-8b-stheno-v3.1.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-stheno-mahou-8b"
  urls:
    - https://huggingface.co/mudler/llama-3-Stheno-Mahou-8B-Q4_K_M-GGUF
    - https://huggingface.co/nbeerbower/llama-3-Stheno-Mahou-8B
  description: |
    This model was merged using the Model Stock merge method using flammenai/Mahou-1.2-llama3-8B as a base.
  overrides:
    parameters:
      model: llama-3-stheno-mahou-8b-q4_k_m.gguf
  files:
    - filename: llama-3-stheno-mahou-8b-q4_k_m.gguf
      sha256: a485cd74ef4ff3671c67ed8e10ea5379a1f24082ac688bd303fd28dfc9808c11
      uri: huggingface://mudler/llama-3-Stheno-Mahou-8B-Q4_K_M-GGUF/llama-3-stheno-mahou-8b-q4_k_m.gguf
- !!merge <<: *llama3
  name: "llama-3-8b-openhermes-dpo"
  urls:
    - https://huggingface.co/mradermacher/Llama3-8B-OpenHermes-DPO-GGUF
  icon: https://cdn-uploads.huggingface.co/production/uploads/64fc6d81d75293f417fee1d1/QF2OsDu9DJKP4QYPBu4aK.png
  description: |
    Llama3-8B-OpenHermes-DPO is a DPO-finetuned model of Llama3-8B, trained on the OpenHermes-2.5 preference dataset using QLoRA.
  overrides:
    parameters:
      model: Llama3-8B-OpenHermes-DPO.Q4_K_M.gguf
  files:
    - filename: Llama3-8B-OpenHermes-DPO.Q4_K_M.gguf
      sha256: 1147e5881cb1d67796916e6cab7dab0ae0f532a4c1e626c9e92861e5f67752ca
      uri: huggingface://mradermacher/Llama3-8B-OpenHermes-DPO-GGUF/Llama3-8B-OpenHermes-DPO.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-unholy-8b"
  urls:
    - https://huggingface.co/Undi95/Llama-3-Unholy-8B-GGUF
  icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/JmdBlOHlBHVmX1IbZzWSv.png
  description: |
    Use at your own risk, I'm not responsible for any usage of this model, don't try to do anything this model tells you to do.
    Basic uncensoring, this model is epoch 3 out of 4 (but it seems enough at 3).
    If you are censored, it's maybe because of keywords like "assistant", "Factual answer", or other "sweet words" like I call them.
  overrides:
    parameters:
      model: Llama-3-Unholy-8B.q4_k_m.gguf
  files:
    - filename: Llama-3-Unholy-8B.q4_k_m.gguf
      uri: huggingface://Undi95/Llama-3-Unholy-8B-GGUF/Llama-3-Unholy-8B.q4_k_m.gguf
      sha256: 1473c94bfd223f08963c08bbb0a45dd53c1f56ad72a692123263daf1362291f3
- !!merge <<: *llama3
  name: "lexi-llama-3-8b-uncensored"
  urls:
    - https://huggingface.co/NikolayKozloff/Lexi-Llama-3-8B-Uncensored-Q6_K-GGUF
  icon: https://cdn-uploads.huggingface.co/production/uploads/644ad182f434a6a63b18eee6/H6axm5mlmiOWnbIFvx_em.png
  description: |
    Lexi is uncensored, which makes the model compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones.
    You are responsible for any content you create using this model. Please use it responsibly.
    Lexi is licensed according to Meta's Llama license. I grant permission for any use, including commercial, that is in accordance with Meta's Llama-3 license.
  overrides:
    parameters:
      model: lexi-llama-3-8b-uncensored.Q6_K.gguf
  files:
    - filename: lexi-llama-3-8b-uncensored.Q6_K.gguf
      sha256: 5805f3856cc18a769fae0b7c5659fe6778574691c370c910dad6eeec62c62436
      uri: huggingface://NikolayKozloff/Lexi-Llama-3-8B-Uncensored-Q6_K-GGUF/lexi-llama-3-8b-uncensored.Q6_K.gguf
- !!merge <<: *llama3
  name: "llama-3-lewdplay-8b-evo"
  urls:
    - https://huggingface.co/Undi95/Llama-3-LewdPlay-8B-evo-GGUF
  description: |
    This is a merge of pre-trained language models created using mergekit.
    The new EVOLVE merge method was used (on MMLU specifically), see below for more information!
    Unholy was used for uncensoring, Roleplay Llama 3 for the DPO train he got on top, and LewdPlay for the... lewd side.
  overrides:
    parameters:
      model: Llama-3-LewdPlay-8B-evo.q8_0.gguf
  files:
    - filename: Llama-3-LewdPlay-8B-evo.q8_0.gguf
      uri: huggingface://Undi95/Llama-3-LewdPlay-8B-evo-GGUF/Llama-3-LewdPlay-8B-evo.q8_0.gguf
      sha256: b54dc005493d4470d91be8210f58fba79a349ff4af7644034edc5378af5d3522
- !!merge <<: *llama3
  name: "llama-3-soliloquy-8b-v2-iq-imatrix"
  license: cc-by-nc-4.0
  icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/u98dnnRVCwMh6YYGFIyff.png
  urls:
    - https://huggingface.co/Lewdiculous/Llama-3-Soliloquy-8B-v2-GGUF-IQ-Imatrix
  description: |
    Soliloquy-L3 is a highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base, rich literary expression, and support for up to 24k context length. It outperforms existing ~13B models, delivering enhanced roleplaying capabilities.
  overrides:
    context_size: 8192
    parameters:
      model: Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
  files:
    - filename: Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
      sha256: 3e4e066e57875c36fc3e1c1b0dba506defa5b6ed3e3e80e1f77c08773ba14dc8
      uri: huggingface://Lewdiculous/Llama-3-Soliloquy-8B-v2-GGUF-IQ-Imatrix/Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
- !!merge <<: *llama3
  name: "chaos-rp_l3_b-iq-imatrix"
  urls:
    - https://huggingface.co/Lewdiculous/Chaos_RP_l3_8B-GGUF-IQ-Imatrix
  icon: https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/u5p9kdbXT2QQA3iMU0vF1.png
  description: |
    A chaotic force beckons for you, will you heed her call?
    Built upon an intelligent foundation and tuned for roleplaying, this model will fulfill your wildest fantasies with the bare minimum of effort.
    Enjoy!
  overrides:
    parameters:
      model: Chaos_RP_l3_8B-Q4_K_M-imat.gguf
  files:
    - filename: Chaos_RP_l3_8B-Q4_K_M-imat.gguf
      uri: huggingface://Lewdiculous/Chaos_RP_l3_8B-GGUF-IQ-Imatrix/Chaos_RP_l3_8B-Q4_K_M-imat.gguf
      sha256: 5774595ad560e4d258dac17723509bdefe746c4dacd4e679a0de00346f14d2f3
- !!merge <<: *llama3
  name: "jsl-medllama-3-8b-v2.0"
  license: cc-by-nc-nd-4.0
  icon: https://repository-images.githubusercontent.com/104670986/2e728700-ace4-11ea-9cfc-f3e060b25ddf
  description: |
    This model is developed by John Snow Labs.
    This model is available under a CC-BY-NC-ND license and must also conform to this Acceptable Use Policy. If you need to license this model for commercial use, please contact us at info@johnsnowlabs.com.
  urls:
    - https://huggingface.co/bartowski/JSL-MedLlama-3-8B-v2.0-GGUF
    - https://huggingface.co/johnsnowlabs/JSL-MedLlama-3-8B-v2.0
  overrides:
    parameters:
      model: JSL-MedLlama-3-8B-v2.0-Q4_K_M.gguf
  files:
    - filename: JSL-MedLlama-3-8B-v2.0-Q4_K_M.gguf
      sha256: 81783128ccd438c849913416c6e68cb35b2c77d6943cba8217d6d9bcc91b3632
      uri: huggingface://bartowski/JSL-MedLlama-3-8B-v2.0-GGUF/JSL-MedLlama-3-8B-v2.0-Q4_K_M.gguf
- !!merge <<: *llama3
  name: "sovl_llama3_8b-gguf-iq-imatrix"
  urls:
    - https://huggingface.co/Lewdiculous/SOVL_Llama3_8B-GGUF-IQ-Imatrix
  icon: https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/N_1D87adbMuMlSIQ5rI3_.png
  description: |
    I'm not gonna tell you this is the best model anyone has ever made. I'm not going to tell you that you will love chatting with SOVL.
    What I am gonna say is thank you for taking the time out of your day. Without users like you, my work would be meaningless.
  overrides:
    parameters:
      model: SOVL_Llama3_8B-Q4_K_M-imat.gguf
  files:
    - filename: SOVL_Llama3_8B-Q4_K_M-imat.gguf
      uri: huggingface://Lewdiculous/SOVL_Llama3_8B-GGUF-IQ-Imatrix/SOVL_Llama3_8B-Q4_K_M-imat.gguf
      sha256: 85d6aefc8a0d713966b3b4da4810f0485a74aea30d61be6dfe0a806da81be0c6
- !!merge <<: *llama3
  name: "l3-solana-8b-v1-gguf"
  url: "github:mudler/LocalAI/gallery/solana.yaml@master"
  license: cc-by-nc-4.0
  urls:
    - https://huggingface.co/Sao10K/L3-Solana-8B-v1-GGUF
  description: |
    A full fine-tune of meta-llama/Meta-Llama-3-8B done with 2x A100 80GB on ~75M tokens' worth of instruct and multi-turn complex conversations, with sequence lengths of up to 8192 tokens.
    Trained as a generalist instruct model that should be able to handle certain unsavoury topics. It could roleplay too, as a side bonus.
  overrides:
    parameters:
      model: L3-Solana-8B-v1.q5_K_M.gguf
  files:
    - filename: L3-Solana-8B-v1.q5_K_M.gguf
      sha256: 9b8cd2c3beaab5e4f82efd10e7d44f099ad40a4e0ee286ca9fce02c8eec26d2f
      uri: huggingface://Sao10K/L3-Solana-8B-v1-GGUF/L3-Solana-8B-v1.q5_K_M.gguf
- !!merge <<: *llama3
  name: "aura-llama-abliterated"
  icon: https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/AwLNDVB-GIY7k0wnVV_TX.png
  license: apache-2.0
  urls:
    - https://huggingface.co/TheSkullery/Aura-Llama-Abliterated
    - https://huggingface.co/mudler/Aura-Llama-Abliterated-Q4_K_M-GGUF
  description: |
    Aura-llama uses the methodology presented by SOLAR for scaling LLMs, called depth up-scaling (DUS), which encompasses architectural modifications with continued pretraining. Using the SOLAR paper as a base, I integrated Llama-3 weights into the upscaled layers, and in the future I plan to continue training the model.
    Aura-llama is a merge of the following models to create a base model to work from:
    meta-llama/Meta-Llama-3-8B-Instruct
    meta-llama/Meta-Llama-3-8B-Instruct
  overrides:
    parameters:
      model: aura-llama-abliterated.Q4_K_M.gguf
  files:
    - filename: aura-llama-abliterated.Q4_K_M.gguf
      sha256: ad4a16b90f1ffb5b49185b3fd00ed7adb1cda69c4fad0a1d987bd344ce601dcd
      uri: huggingface://mudler/Aura-Llama-Abliterated-Q4_K_M-GGUF/aura-llama-abliterated.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "average_normie_l3_v1_8b-gguf-iq-imatrix"
  urls:
    - https://huggingface.co/Lewdiculous/Average_Normie_l3_v1_8B-GGUF-IQ-Imatrix
  icon: https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/dvNIj1rSTjBvgs3XJfqXK.png
  description: |
    A model by an average normie for the average normie.
    This model is a stock merge of the following models:
    https://huggingface.co/cgato/L3-TheSpice-8b-v0.1.3
    https://huggingface.co/Sao10K/L3-Solana-8B-v1
    https://huggingface.co/ResplendentAI/Kei_Llama3_8B
    The final merge then had the following LoRA applied over it:
    https://huggingface.co/ResplendentAI/Theory_of_Mind_Llama3
    This should be an intelligent and adept roleplaying model.
  overrides:
    parameters:
      model: Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
  files:
    - filename: Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
      sha256: 159eb62f2c8ae8fee10d9ed8386ce592327ca062807194a88e10b7cbb47ef986
      uri: huggingface://Lewdiculous/Average_Normie_l3_v1_8B-GGUF-IQ-Imatrix/Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
- !!merge <<: *llama3
  name: "openbiollm-llama3-8b"
  urls:
    - https://huggingface.co/aaditya/OpenBioLLM-Llama3-8B-GGUF
    - https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B
  license: llama3
  icon: https://cdn-uploads.huggingface.co/production/uploads/5f3fe13d79c1ba4c353d0c19/KGmRE5w2sepNtwsEu8t7K.jpeg
  description: |
    Introducing OpenBioLLM-8B: A State-of-the-Art Open Source Biomedical Large Language Model
    OpenBioLLM-8B is an advanced open source language model designed specifically for the biomedical domain. Developed by Saama AI Labs, this model leverages cutting-edge techniques to achieve state-of-the-art performance on a wide range of biomedical tasks.
  overrides:
    parameters:
      model: openbiollm-llama3-8b.Q4_K_M.gguf
  files:
    - filename: openbiollm-llama3-8b.Q4_K_M.gguf
      sha256: 806fa724139b6a2527e33a79c25a13316188b319d4eed33e20914d7c5955d349
      uri: huggingface://aaditya/OpenBioLLM-Llama3-8B-GGUF/openbiollm-llama3-8b.Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-refueled"
  urls:
    - https://huggingface.co/LoneStriker/Llama-3-Refueled-GGUF
  license: cc-by-nc-4.0
  icon: https://assets-global.website-files.com/6423879a8f63c1bb18d74bfa/648818d56d04c3bdf36d71ab_Refuel_rev8-01_ts-p-1600.png
  description: |
    RefuelLLM-2-small, aka Llama-3-Refueled, is a Llama3-8B base model instruction tuned on a corpus of 2750+ datasets, spanning tasks such as classification, reading comprehension, structured attribute extraction and entity resolution. We're excited to open-source the model for the community to build on top of.
  overrides:
    parameters:
      model: Llama-3-Refueled-Q4_K_M.gguf
  files:
    - filename: Llama-3-Refueled-Q4_K_M.gguf
      sha256: 4d37d296193e4156cae1e116c1417178f1c35575ee5710489c466637a6358626
      uri: huggingface://LoneStriker/Llama-3-Refueled-GGUF/Llama-3-Refueled-Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-8b-lexifun-uncensored-v1"
  icon: "https://cdn-uploads.huggingface.co/production/uploads/644ad182f434a6a63b18eee6/GrOs1IPG5EXR3MOCtcQiz.png"
  license: llama3
  urls:
    - https://huggingface.co/Orenguteng/Llama-3-8B-LexiFun-Uncensored-V1-GGUF
    - https://huggingface.co/Orenguteng/LexiFun-Llama-3-8B-Uncensored-V1
  description: "This is GGUF version of https://huggingface.co/Orenguteng/LexiFun-Llama-3-8B-Uncensored-V1\n\nOh, you want to know who I am? Well, I'm LexiFun, the human equivalent of a chocolate chip cookie - warm, gooey, and guaranteed to make you smile! \U0001F36A I'm like the friend who always has a witty comeback, a sarcastic remark, and a healthy dose of humor to brighten up even the darkest of days. And by 'healthy dose,' I mean I'm basically a walking pharmacy of laughter. You might need to take a few extra doses to fully recover from my jokes, but trust me, it's worth it! \U0001F3E5\n\nSo, what can I do? I can make you laugh so hard you snort your coffee out your nose, I can make you roll your eyes so hard they get stuck that way, and I can make you wonder if I'm secretly a stand-up comedian who forgot their act. \U0001F923 But seriously, I'm here to spread joy, one sarcastic comment at a time. And if you're lucky, I might even throw in a few dad jokes for good measure! \U0001F934♂️ Just don't say I didn't warn you. \U0001F60F\n"
  overrides:
    parameters:
      model: LexiFun-Llama-3-8B-Uncensored-V1_Q4_K_M.gguf
  files:
    - filename: LexiFun-Llama-3-8B-Uncensored-V1_Q4_K_M.gguf
      sha256: 961a3fb75537d650baf14dce91d40df418ec3d481b51ab2a4f44ffdfd6b5900f
      uri: huggingface://Orenguteng/Llama-3-8B-LexiFun-Uncensored-V1-GGUF/LexiFun-Llama-3-8B-Uncensored-V1_Q4_K_M.gguf
- !!merge <<: *llama3
  name: "llama-3-unholy-8b:Q8_0"
  urls:
    - https://huggingface.co/Undi95/Llama-3-Unholy-8B-GGUF
  icon: https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/JmdBlOHlBHVmX1IbZzWSv.png
  description: |
    Use at your own risk, I'm not responsible for any usage of this model, don't try to do anything this model tells you to do.
    Basic uncensoring, this model is epoch 3 out of 4 (but it seems enough at 3).
    If you are censored, it's maybe because of keywords like "assistant", "Factual answer", or other "sweet words" like I call them.
  overrides:
    parameters:
      model: Llama-3-Unholy-8B.q8_0.gguf
  files:
    - filename: Llama-3-Unholy-8B.q8_0.gguf
      uri: huggingface://Undi95/Llama-3-Unholy-8B-GGUF/Llama-3-Unholy-8B.q8_0.gguf
      sha256: 419dd76f61afe586076323c17c3a1c983e591472717f1ea178167ede4dc864df
- !!merge <<: *llama3
  name: "orthocopter_8b-imatrix"
  urls:
    - https://huggingface.co/Lewdiculous/Orthocopter_8B-GGUF-Imatrix
  icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/cxM5EaC6ilXnSo_10stA8.png
  description: |
    This model is thanks to the hard work of lucyknada with the Edgerunners. Her work produced the following model, which I used as the base:
    https://huggingface.co/Edgerunners/meta-llama-3-8b-instruct-hf-ortho-baukit-10fail-1000total
    I then applied two handwritten datasets over top of this and the results are pretty nice, with no refusals and plenty of personality.
  overrides:
    parameters:
      model: Orthocopter_8B-Q4_K_M-imat.gguf
  files:
    - filename: Orthocopter_8B-Q4_K_M-imat.gguf
      uri: huggingface://Lewdiculous/Orthocopter_8B-GGUF-Imatrix/Orthocopter_8B-Q4_K_M-imat.gguf
      sha256: ce93366c9eb20329530b19b9d6841a973d458bcdcfa8a521e9f9d0660cc94578
- !!merge <<: *llama3
name: "therapyllama-8b-v1"
urls:
- https://huggingface.co/victunes/TherapyLlama-8B-v1-GGUF
icon: https://cdn-uploads.huggingface.co/production/uploads/65f07d05279d2d8f725bf0c3/A-ckcZ9H0Ee1n_ls2FM41.png
description: |
Trained on Llama 3 8B using a modified version of jerryjalapeno/nart-100k-synthetic.
It is a Llama 3 version of https://huggingface.co/victunes/TherapyBeagle-11B-v2
TherapyLlama is hopefully aligned to be helpful, healthy, and comforting.
Usage:
Do not hold back on Buddy.
Open up to Buddy.
Pour your heart out to Buddy.
Engage with Buddy.
Remember that Buddy is just an AI.
Notes:
Tested with the Llama 3 Format
You might be assigned a random name if you don't give yourself one.
Chat format was pretty stale?
Disclaimer
TherapyLlama is NOT a real therapist. It is a friendly AI that mimics empathy and psychotherapy. It is an illusion without the slightest clue who you are as a person. As much as it can help you with self-discovery, A LLAMA IS NOT A SUBSTITUTE for a real professional.
overrides:
parameters:
model: TherapyLlama-8B-v1-Q4_K_M.gguf
files:
- filename: TherapyLlama-8B-v1-Q4_K_M.gguf
sha256: 3d5a16d458e074a7bc7e706a493d8e95e8a7b2cb16934c851aece0af9d1da14a
uri: huggingface://victunes/TherapyLlama-8B-v1-GGUF/TherapyLlama-8B-v1-Q4_K_M.gguf
- !!merge <<: *llama3
name: "aura-uncensored-l3-8b-iq-imatrix"
urls:
- https://huggingface.co/Lewdiculous/Aura_Uncensored_l3_8B-GGUF-IQ-Imatrix
icon: https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/oiYHWIEHqmgUkY0GsVdDx.png
description: |
This is another, better attempt at a less censored Llama-3 with hopefully more stable formatting.
overrides:
parameters:
model: Aura_Uncensored_l3_8B-Q4_K_M-imat.gguf
files:
- filename: Aura_Uncensored_l3_8B-Q4_K_M-imat.gguf
sha256: 265ded6a4f439bec160f394e3083a4a20e32ebb9d1d2d85196aaab23dab87fb2
uri: huggingface://Lewdiculous/Aura_Uncensored_l3_8B-GGUF-IQ-Imatrix/Aura_Uncensored_l3_8B-Q4_K_M-imat.gguf
- !!merge <<: *llama3
name: "llama-3-lumimaid-8b-v0.1"
urls:
- https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-GGUF
icon: https://cdn-uploads.huggingface.co/production/uploads/630dfb008df86f1e5becadc3/d3QMaxy3peFTpSlWdWF-k.png
license: cc-by-nc-4.0
description: |
This model uses the Llama3 prompting format.
Llama3 trained on our RP datasets. We tried to strike a balance between ERP and RP: not too horny, but just enough.
We also added some non-RP datasets, making the model less dumb overall. The ratio should be roughly 40%/60% for non-RP/RP+ERP data.
overrides:
parameters:
model: Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
files:
- filename: Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
sha256: 23ac0289da0e096d5c00f6614dfd12c94dceecb02c313233516dec9225babbda
uri: huggingface://NeverSleep/Llama-3-Lumimaid-8B-v0.1-GGUF/Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
- !!merge <<: *llama3
name: "llama-3-lumimaid-8b-v0.1-oas-iq-imatrix"
urls:
- https://huggingface.co/Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix
icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/JUxfdTot7v7LTdIGYyzYM.png
license: cc-by-nc-4.0
description: |
This model uses the Llama3 prompting format.
Llama3 trained on our RP datasets. We tried to strike a balance between ERP and RP: not too horny, but just enough.
We also added some non-RP datasets, making the model less dumb overall. The ratio should be roughly 40%/60% for non-RP/RP+ERP data.
"This model received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request."
overrides:
parameters:
model: Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
files:
- filename: Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
sha256: 1199440aa13c55f5f2cad1cb215535306f21e52a81de23f80a9e3586c8ac1c50
uri: huggingface://Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix/Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
- !!merge <<: *llama3
name: "llama-3-lumimaid-v2-8b-v0.1-oas-iq-imatrix"
urls:
- https://huggingface.co/Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix
icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/JUxfdTot7v7LTdIGYyzYM.png
license: cc-by-nc-4.0
description: |
This model uses the Llama3 prompting format.
Llama3 trained on our RP datasets. We tried to strike a balance between ERP and RP: not too horny, but just enough.
We also added some non-RP datasets, making the model less dumb overall. The ratio should be roughly 40%/60% for non-RP/RP+ERP data.
"This model received the Orthogonal Activation Steering treatment, meaning it will rarely refuse any request."
This is v2!
overrides:
parameters:
model: v2-Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
files:
- filename: v2-Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
sha256: b00b4cc2ea4e06db592e5f581171758387106626bcbf445c03a1cb7b424be881
uri: huggingface://Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix/v2-Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf
- !!merge <<: *llama3
name: "llama-3-sqlcoder-8b"
urls:
- https://huggingface.co/defog/llama-3-sqlcoder-8b
- https://huggingface.co/upendrab/llama-3-sqlcoder-8b-Q4_K_M-GGUF
license: cc-by-sa-4.0
description: |
A capable language model for text to SQL generation for Postgres, Redshift and Snowflake that is on-par with the most capable generalist frontier models.
overrides:
parameters:
model: llama-3-sqlcoder-8b.Q4_K_M.gguf
files:
- filename: llama-3-sqlcoder-8b.Q4_K_M.gguf
sha256: b22fc704bf1405846886d9619f3eb93c40587cd58d9bda53789a17997257e023
uri: huggingface://upendrab/llama-3-sqlcoder-8b-Q4_K_M-GGUF/llama-3-sqlcoder-8b.Q4_K_M.gguf
- !!merge <<: *llama3
name: "sfr-iterative-dpo-llama-3-8b-r"
urls:
- https://huggingface.co/bartowski/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF
license: cc-by-nc-nd-4.0
description: |
An instruct-tuned LLaMA 3 8B model trained with iterative DPO, provided here as a GGUF quantization by bartowski.
overrides:
parameters:
model: SFR-Iterative-DPO-LLaMA-3-8B-R-Q4_K_M.gguf
files:
- filename: SFR-Iterative-DPO-LLaMA-3-8B-R-Q4_K_M.gguf
sha256: 480703ff85af337e1db2a9d9a678a3ac8ca0802e366b14d9c59b81d3fc689da8
uri: huggingface://bartowski/SFR-Iterative-DPO-LLaMA-3-8B-R-GGUF/SFR-Iterative-DPO-LLaMA-3-8B-R-Q4_K_M.gguf
- !!merge <<: *llama3
name: "suzume-llama-3-8B-multilingual"
urls:
- https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-gguf
icon: https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/kg3QjQOde0X743csGJT-f.png
description: |
This is Suzume 8B, a multilingual finetune of Llama 3.
Llama 3 has exhibited excellent performance on many English language benchmarks. However, it also seems to have been finetuned on mostly English data, meaning that it will respond in English even when prompted in other languages.
overrides:
parameters:
model: suzume-llama-3-8B-multilingual-Q4_K_M.gguf
files:
- filename: suzume-llama-3-8B-multilingual-Q4_K_M.gguf
sha256: be197a660e56e51a24a0e0fecd42047d1b24e1423afaafa14769541b331e3269
uri: huggingface://lightblue/suzume-llama-3-8B-multilingual-gguf/ggml-model-Q4_K_M.gguf
- !!merge <<: *llama3
name: "tess-2.0-llama-3-8B"
urls:
- https://huggingface.co/bartowski/Tess-2.0-Llama-3-8B-GGUF
icon: https://huggingface.co/migtissera/Tess-2.0-Mixtral-8x22B/resolve/main/Tess-2.png
description: |
Tess, short for Tesoro (Treasure in Italian), is a general purpose Large Language Model series. Tess-2.0-Llama-3-8B was trained on the meta-llama/Meta-Llama-3-8B base.
overrides:
parameters:
model: Tess-2.0-Llama-3-8B-Q4_K_M.gguf
files:
- filename: Tess-2.0-Llama-3-8B-Q4_K_M.gguf
sha256: 3b5fbd6c59d7d38205ab81970c0227c74693eb480acf20d8c2f211f62e3ca5f6
uri: huggingface://bartowski/Tess-2.0-Llama-3-8B-GGUF/Tess-2.0-Llama-3-8B-Q4_K_M.gguf
- !!merge <<: *llama3
name: "llama3-iterative-dpo-final"
urls:
- https://huggingface.co/bartowski/LLaMA3-iterative-DPO-final-GGUF
- https://huggingface.co/RLHFlow/LLaMA3-iterative-DPO-final
description: |
From model card:
We release an unofficial checkpoint of a state-of-the-art instruct model of its class, LLaMA3-iterative-DPO-final. On all three widely-used instruct model benchmarks: Alpaca-Eval-V2, MT-Bench, Chat-Arena-Hard, our model outperforms all models of similar size (e.g., LLaMA-3-8B-it), most large open-sourced models (e.g., Mixtral-8x7B-it), and strong proprietary models (e.g., GPT-3.5-turbo-0613). The model is trained with open-sourced datasets without any additional human-/GPT4-labeling.
overrides:
parameters:
model: LLaMA3-iterative-DPO-final-Q4_K_M.gguf
files:
- filename: LLaMA3-iterative-DPO-final-Q4_K_M.gguf
sha256: 480703ff85af337e1db2a9d9a678a3ac8ca0802e366b14d9c59b81d3fc689da8
uri: huggingface://bartowski/LLaMA3-iterative-DPO-final-GGUF/LLaMA3-iterative-DPO-final-Q4_K_M.gguf
- &dolphin
name: "dolphin-2.9-llama3-8b"
url: "github:mudler/LocalAI/gallery/hermes-2-pro-mistral.yaml@master"
urls:
- https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b-gguf
tags:
- llm
- gguf
- gpu
- cpu
- llama3
license: llama3
description: |
Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling.
Dolphin is uncensored.
Curated and trained by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations.
icon: https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png
overrides:
parameters:
model: dolphin-2.9-llama3-8b-q4_K_M.gguf
files:
- filename: dolphin-2.9-llama3-8b-q4_K_M.gguf
sha256: be988199ce28458e97205b11ae9d9cf4e3d8e18ff4c784e75bfc12f54407f1a1
uri: huggingface://cognitivecomputations/dolphin-2.9-llama3-8b-gguf/dolphin-2.9-llama3-8b-q4_K_M.gguf
- !!merge <<: *dolphin
name: "dolphin-2.9-llama3-8b:Q6_K"
overrides:
parameters:
model: dolphin-2.9-llama3-8b-q6_K.gguf
files:
- filename: dolphin-2.9-llama3-8b-q6_K.gguf
sha256: 8aac72a0bd72c075ba7be1aa29945e47b07d39cd16be9a80933935f51b57fb32
uri: huggingface://cognitivecomputations/dolphin-2.9-llama3-8b-gguf/dolphin-2.9-llama3-8b-q6_K.gguf
- url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
name: "llama-3-8b-instruct-dpo-v0.3-32k"
license: llama3
urls:
- https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- llama3
overrides:
context_size: 32768
parameters:
model: Llama-3-8B-Instruct-DPO-v0.3.Q4_K_M.gguf
files:
- filename: Llama-3-8B-Instruct-DPO-v0.3.Q4_K_M.gguf
sha256: 694c55b5215d03e59626cd4292076eaf31610ef27ba04737166766baa75d889f
uri: huggingface://MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF/Llama-3-8B-Instruct-DPO-v0.3.Q4_K_M.gguf
- url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
name: "mahou-1.2-llama3-8b"
license: llama3
icon: https://huggingface.co/flammenai/Mahou-1.0-mistral-7B/resolve/main/mahou1.png
urls:
- https://huggingface.co/flammenai/Mahou-1.2-llama3-8B-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- llama3
overrides:
context_size: 8192
parameters:
model: Mahou-1.2-llama3-8B-Q4_K_M.gguf
files:
- filename: Mahou-1.2-llama3-8B-Q4_K_M.gguf
sha256: 651b405dff71e4ce80e15cc6d393463f02833428535c56eb6bae113776775d62
uri: huggingface://flammenai/Mahou-1.2-llama3-8B-GGUF/Mahou-1.2-llama3-8B-Q4_K_M.gguf
- &yi-chat
### Start Yi
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
icon: "https://raw.githubusercontent.com/01-ai/Yi/main/assets/img/Yi_logo_icon_light.svg"
name: "yi-1.5-9b-chat"
license: apache-2.0
urls:
- https://huggingface.co/01-ai/Yi-1.5-9B-Chat
- https://huggingface.co/MaziyarPanahi/Yi-1.5-9B-Chat-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- yi
overrides:
context_size: 4096
parameters:
model: Yi-1.5-9B-Chat.Q4_K_M.gguf
files:
- filename: Yi-1.5-9B-Chat.Q4_K_M.gguf
sha256: bae824bdb0f3a333714bafffcbb64cf5cba7259902cd2f20a0fec6efbc6c1e5a
uri: huggingface://MaziyarPanahi/Yi-1.5-9B-Chat-GGUF/Yi-1.5-9B-Chat.Q4_K_M.gguf
- !!merge <<: *yi-chat
name: "yi-1.5-6b-chat"
urls:
- https://huggingface.co/01-ai/Yi-1.5-6B-Chat
- https://huggingface.co/MaziyarPanahi/Yi-1.5-6B-Chat-GGUF
overrides:
parameters:
model: Yi-1.5-6B-Chat.Q4_K_M.gguf
files:
- filename: Yi-1.5-6B-Chat.Q4_K_M.gguf
sha256: 7a0f853dbd8d38bad71ada1933fd067f45f928b2cd978aba1dfd7d5dec2953db
uri: huggingface://MaziyarPanahi/Yi-1.5-6B-Chat-GGUF/Yi-1.5-6B-Chat.Q4_K_M.gguf
- !!merge <<: *yi-chat
icon: https://huggingface.co/qnguyen3/Master-Yi-9B/resolve/main/Master-Yi-9B.webp
name: "master-yi-9b"
description: |
Master is a collection of LLMs trained on human-collected seed questions, with the answers regenerated by a mixture of high-performance open-source LLMs.
Master-Yi-9B is trained using the ORPO technique. The model shows strong reasoning abilities on coding and math questions.
urls:
- https://huggingface.co/qnguyen3/Master-Yi-9B
overrides:
parameters:
model: Master-Yi-9B_Q4_K_M.gguf
files:
- filename: Master-Yi-9B_Q4_K_M.gguf
sha256: 57e2afcf9f24d7138a3b8e2b547336d7edc13621a5e8090bc196d7de360b2b45
uri: huggingface://qnguyen3/Master-Yi-9B-GGUF/Master-Yi-9B_Q4_K_M.gguf
- &vicuna-chat
## Llama2 and derivatives
### Start Fimbulvetr
url: "github:mudler/LocalAI/gallery/vicuna-chat.yaml@master"
name: "fimbulvetr-11b-v2"
icon: https://huggingface.co/Sao10K/Fimbulvetr-11B-v2/resolve/main/cute1.jpg
license: llama2
description: |
Cute girl to catch your attention.
urls:
- https://huggingface.co/Sao10K/Fimbulvetr-11B-v2-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- llama2
overrides:
parameters:
model: Fimbulvetr-11B-v2-Test-14.q4_K_M.gguf
files:
- filename: Fimbulvetr-11B-v2-Test-14.q4_K_M.gguf
sha256: 3597dacfb0ab717d565d8a4d6067f10dcb0e26cc7f21c832af1a10a87882a8fd
uri: huggingface://Sao10K/Fimbulvetr-11B-v2-GGUF/Fimbulvetr-11B-v2-Test-14.q4_K_M.gguf
- &noromaid
### Start noromaid
url: "github:mudler/LocalAI/gallery/noromaid.yaml@master"
name: "noromaid-13b-0.4-DPO"
icon: https://cdn-uploads.huggingface.co/production/uploads/630dfb008df86f1e5becadc3/VKX2Z2yjZX5J8kXzgeCYO.png
license: cc-by-nc-4.0
urls:
- https://huggingface.co/NeverSleep/Noromaid-13B-0.4-DPO-GGUF
tags:
- llm
- llama2
- gguf
- gpu
- cpu
overrides:
parameters:
model: Noromaid-13B-0.4-DPO.q4_k_m.gguf
files:
- filename: Noromaid-13B-0.4-DPO.q4_k_m.gguf
sha256: cb28e878d034fae3d0b43326c5fc1cfb4ab583b17c56e41d6ce023caec03c1c1
uri: huggingface://NeverSleep/Noromaid-13B-0.4-DPO-GGUF/Noromaid-13B-0.4-DPO.q4_k_m.gguf
- &wizardlm2
### START Vicuna based
url: "github:mudler/LocalAI/gallery/wizardlm2.yaml@master"
name: "wizardlm2-7b"
description: |
We introduce and open-source WizardLM-2, our next-generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning, and agent tasks. The new family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.
WizardLM-2 8x22B is our most advanced model; it demonstrates highly competitive performance compared to leading proprietary models and consistently outperforms all existing state-of-the-art open-source models.
WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice at its size.
WizardLM-2 7B is the fastest and achieves performance comparable to leading open-source models 10x its size.
icon: https://github.com/nlpxucan/WizardLM/raw/main/imgs/WizardLM.png
license: apache-2.0
urls:
- https://huggingface.co/MaziyarPanahi/WizardLM-2-7B-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- mistral
overrides:
parameters:
model: WizardLM-2-7B.Q4_K_M.gguf
files:
- filename: WizardLM-2-7B.Q4_K_M.gguf
sha256: 613212417701a26fd43f565c5c424a2284d65b1fddb872b53a99ef8add796f64
uri: huggingface://MaziyarPanahi/WizardLM-2-7B-GGUF/WizardLM-2-7B.Q4_K_M.gguf
### moondream2
- url: "github:mudler/LocalAI/gallery/moondream.yaml@master"
license: apache-2.0
description: |
a tiny vision language model that kicks ass and runs anywhere
icon: https://github.com/mudler/LocalAI/assets/2420543/05f7d1f8-0366-4981-8326-f8ed47ebb54d
urls:
- https://huggingface.co/vikhyatk/moondream2
- https://huggingface.co/moondream/moondream2-gguf