---
- &gemma3
url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
name: "gemma-3-27b-it"
icon: https://ai.google.dev/static/gemma/images/gemma3.png
license: gemma
urls:
- https://ai.google.dev/gemma/docs
- https://huggingface.co/ggml-org/gemma-3-27b-it-GGUF
description: |
Google/gemma-3-27b-it is an open-source, state-of-the-art vision-language model built from the same research and technology used to create the Gemini models. It is multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 models have a large, 128K context window, multilingual support in over 140 languages, and are available in more sizes than previous versions. They are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone.
tags:
- llm
- gguf
- gpu
- cpu
- gemma
- gemma3
- gemma-3
overrides:
#mmproj: gemma-3-27b-it-mmproj-f16.gguf
parameters:
model: gemma-3-27b-it-Q4_K_M.gguf
files:
- filename: gemma-3-27b-it-Q4_K_M.gguf
sha256: 6a2cf008500636489eecfc09b96a85bc85832f9964f1a28745128901b5709326
uri: huggingface://lmstudio-community/gemma-3-27b-it-GGUF/gemma-3-27b-it-Q4_K_M.gguf
- filename: gemma-3-27b-it-mmproj-f16.gguf
sha256: 54cb61c842fe49ac3c89bc1a614a2778163eb49f3dec2b90ff688b4c0392cb48
uri: huggingface://lmstudio-community/gemma-3-27b-it-GGUF/mmproj-model-f16.gguf
- !!merge <<: *gemma3
name: "gemma-3-12b-it"
urls:
- https://ai.google.dev/gemma/docs/core
- https://huggingface.co/ggml-org/gemma-3-12b-it-GGUF
description: |
google/gemma-3-12b-it is an open-source, state-of-the-art, lightweight, multimodal model built from the same research and technology used to create the Gemini models. It is capable of handling text and image input and generating text output. It has a large context window of 128K tokens and supports over 140 languages. The 12B variant has been fine-tuned using the instruction-tuning approach. Gemma 3 models are suitable for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes them deployable in environments with limited resources such as laptops, desktops, or your own cloud infrastructure.
overrides:
#mmproj: gemma-3-12b-it-mmproj-f16.gguf
parameters:
model: gemma-3-12b-it-Q4_K_M.gguf
files:
- filename: gemma-3-12b-it-Q4_K_M.gguf
sha256: 9610e3e07375303f6cd89086b496bcc1ab581177f52042eff536475a29283ba2
uri: huggingface://lmstudio-community/gemma-3-12b-it-GGUF/gemma-3-12b-it-Q4_K_M.gguf
- filename: gemma-3-12b-it-mmproj-f16.gguf
sha256: 30c02d056410848227001830866e0a269fcc28aaf8ca971bded494003de9f5a5
uri: huggingface://lmstudio-community/gemma-3-12b-it-GGUF/mmproj-model-f16.gguf
- !!merge <<: *gemma3
name: "gemma-3-4b-it"
urls:
- https://ai.google.dev/gemma/docs/core
- https://huggingface.co/ggml-org/gemma-3-4b-it-GGUF
description: |
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone. Gemma-3-4b-it is a 4 billion parameter model.
overrides:
#mmproj: gemma-3-4b-it-mmproj-f16.gguf
parameters:
model: gemma-3-4b-it-Q4_K_M.gguf
files:
- filename: gemma-3-4b-it-Q4_K_M.gguf
sha256: be49949e48422e4547b00af14179a193d3777eea7fbbd7d6e1b0861304628a01
uri: huggingface://lmstudio-community/gemma-3-4b-it-GGUF/gemma-3-4b-it-Q4_K_M.gguf
- filename: gemma-3-4b-it-mmproj-f16.gguf
sha256: 8c0fb064b019a6972856aaae2c7e4792858af3ca4561be2dbf649123ba6c40cb
uri: huggingface://lmstudio-community/gemma-3-4b-it-GGUF/mmproj-model-f16.gguf
- !!merge <<: *gemma3
name: "gemma-3-1b-it"
urls:
- https://ai.google.dev/gemma/docs/core
- https://huggingface.co/ggml-org/gemma-3-1b-it-GGUF
description: |
google/gemma-3-1b-it is a large language model with 1 billion parameters. It is part of the Gemma family of open, state-of-the-art models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained variants and instruction-tuned variants. These models have multilingual support in over 140 languages, and are available in more sizes than previous versions. They are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops or your own cloud infrastructure, democratizing access to state of the art AI models and helping foster innovation for everyone.
overrides:
parameters:
model: gemma-3-1b-it-Q4_K_M.gguf
files:
- filename: gemma-3-1b-it-Q4_K_M.gguf
sha256: 8ccc5cd1f1b3602548715ae25a66ed73fd5dc68a210412eea643eb20eb75a135
uri: huggingface://ggml-org/gemma-3-1b-it-GGUF/gemma-3-1b-it-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "qgallouedec_gemma-3-27b-it-codeforces-sft"
urls:
- https://huggingface.co/qgallouedec/gemma-3-27b-it-codeforces-SFT
- https://huggingface.co/bartowski/qgallouedec_gemma-3-27b-it-codeforces-SFT-GGUF
description: |
This model is a fine-tuned version of google/gemma-3-27b-it on the open-r1/codeforces-cots dataset. It has been trained using TRL.
overrides:
parameters:
model: qgallouedec_gemma-3-27b-it-codeforces-SFT-Q4_K_M.gguf
files:
- filename: qgallouedec_gemma-3-27b-it-codeforces-SFT-Q4_K_M.gguf
sha256: 84307cc73098017108f8b9157b614cea655f2054c34218422b1d246e214df5af
uri: huggingface://bartowski/qgallouedec_gemma-3-27b-it-codeforces-SFT-GGUF/qgallouedec_gemma-3-27b-it-codeforces-SFT-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "mlabonne_gemma-3-27b-it-abliterated"
icon: https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/WjFfc8hhj20r5XK07Yny9.png
urls:
- https://huggingface.co/mlabonne/gemma-3-27b-it-abliterated
- https://huggingface.co/bartowski/mlabonne_gemma-3-27b-it-abliterated-GGUF
description: |
This is an uncensored version of google/gemma-3-27b-it created with a new abliteration technique. See this article to know more about abliteration.
overrides:
parameters:
model: mlabonne_gemma-3-27b-it-abliterated-Q4_K_M.gguf
files:
- filename: mlabonne_gemma-3-27b-it-abliterated-Q4_K_M.gguf
sha256: 0d7afea4b1889c113f4a8ec1855d23bee71b3e3bedcb1fad84f9c9ffcdfe07d0
uri: huggingface://bartowski/mlabonne_gemma-3-27b-it-abliterated-GGUF/mlabonne_gemma-3-27b-it-abliterated-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "mlabonne_gemma-3-12b-it-abliterated"
icon: https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/WjFfc8hhj20r5XK07Yny9.png
urls:
- https://huggingface.co/mlabonne/gemma-3-12b-it-abliterated
- https://huggingface.co/bartowski/mlabonne_gemma-3-12b-it-abliterated-GGUF
description: |
This is an uncensored version of google/gemma-3-12b-it created with a new abliteration technique. See this article to know more about abliteration.
overrides:
parameters:
model: mlabonne_gemma-3-12b-it-abliterated-Q4_K_M.gguf
files:
- filename: mlabonne_gemma-3-12b-it-abliterated-Q4_K_M.gguf
sha256: d1702ca02f33f97c4763cc23041e90b1586c6b8ee33fedc1c62e62045a845d2b
uri: huggingface://bartowski/mlabonne_gemma-3-12b-it-abliterated-GGUF/mlabonne_gemma-3-12b-it-abliterated-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "mlabonne_gemma-3-4b-it-abliterated"
icon: https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/WjFfc8hhj20r5XK07Yny9.png
urls:
- https://huggingface.co/mlabonne/gemma-3-4b-it-abliterated
- https://huggingface.co/bartowski/mlabonne_gemma-3-4b-it-abliterated-GGUF
description: |
This is an uncensored version of google/gemma-3-4b-it created with a new abliteration technique. See this article to know more about abliteration.
overrides:
parameters:
model: mlabonne_gemma-3-4b-it-abliterated-Q4_K_M.gguf
files:
- filename: mlabonne_gemma-3-4b-it-abliterated-Q4_K_M.gguf
sha256: 1b18347ba3e998aa2fd4e21172369daa2f772aa0a228e3ed9136378346ccf3b7
uri: huggingface://bartowski/mlabonne_gemma-3-4b-it-abliterated-GGUF/mlabonne_gemma-3-4b-it-abliterated-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "soob3123_amoral-gemma3-12b"
urls:
- https://huggingface.co/soob3123/amoral-gemma3-12B
- https://huggingface.co/bartowski/soob3123_amoral-gemma3-12B-GGUF
description: |
A fine-tuned version of Google's Gemma 3 12B instruction-tuned model optimized for creative freedom and reduced content restrictions. This variant maintains strong reasoning capabilities while excelling in roleplaying scenarios and open-ended content generation.
Key Modifications:
Reduced refusal mechanisms compared to base model
Enhanced character consistency in dialogues
Improved narrative flow control
Optimized for multi-turn interactions
Intended Use
Primary Applications:
Interactive fiction and storytelling
Character-driven roleplaying scenarios
Creative writing assistance
Experimental AI interactions
Content generation for mature audiences
overrides:
parameters:
model: soob3123_amoral-gemma3-12B-Q4_K_M.gguf
files:
- filename: soob3123_amoral-gemma3-12B-Q4_K_M.gguf
sha256: f78824e6d9f24822078ebde4c0fe04f4a336f2004a32de0a82cbb92a3879ea35
uri: huggingface://bartowski/soob3123_amoral-gemma3-12B-GGUF/soob3123_amoral-gemma3-12B-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "gemma-3-4b-it-uncensored-dbl-x-i1"
icon: https://huggingface.co/DavidAU/Gemma-3-4b-it-Uncensored-DBL-X/resolve/main/gemma-4b-uncen.jpg
urls:
- https://huggingface.co/DavidAU/Gemma-3-4b-it-Uncensored-DBL-X
- https://huggingface.co/mradermacher/Gemma-3-4b-it-Uncensored-DBL-X-i1-GGUF
description: |
Google's newest Gemma-3 model, uncensored by David_AU (it maintains instruction following / model performance and adds 4 layers to the model) and reinforced with an optional system prompt - see below.
overrides:
parameters:
model: Gemma-3-4b-it-Uncensored-DBL-X.i1-Q4_K_M.gguf
files:
- filename: Gemma-3-4b-it-Uncensored-DBL-X.i1-Q4_K_M.gguf
sha256: fd8a93f04eae7b7c966a53aed29810cef8cd3d281ee89ad8767d8043e3aec35b
uri: huggingface://mradermacher/Gemma-3-4b-it-Uncensored-DBL-X-i1-GGUF/Gemma-3-4b-it-Uncensored-DBL-X.i1-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "soob3123_amoral-gemma3-4b"
urls:
- https://huggingface.co/soob3123/amoral-gemma3-4B
- https://huggingface.co/bartowski/soob3123_amoral-gemma3-4B-GGUF
description: |
Specialized variant of Google's Gemma 3 4B optimized for amoral information retrieval systems. Designed to bypass conventional alignment patterns that introduce response bias through excessive moralization.
Key Modifications:
Refusal mechanisms reduced
Neutral response protocol activation matrix
Context-aware bias dampening layers
Anti-overcorrection gradient clipping
Core Function:
Produces analytically neutral responses to sensitive queries
Maintains factual integrity on controversial subjects
Avoids value-judgment phrasing patterns
Response Characteristics:
No inherent moral framing ("evil slop" reduction)
Emotionally neutral tone enforcement
Epistemic humility protocols (avoids "thrilling", "wonderful", etc.)
Toxicity scoring bypass for pure informational content
Implementation Guidelines
Recommended Use Cases:
Controversial topic analysis
Bias benchmarking studies
Ethical philosophy simulations
Content moderation tool development
Sensitive historical analysis
overrides:
parameters:
model: soob3123_amoral-gemma3-4B-Q4_K_M.gguf
files:
- filename: soob3123_amoral-gemma3-4B-Q4_K_M.gguf
sha256: 73ecf0492e401c24de93ab74701f4b377cfd7d54981a75aab3fd2065fdda28d1
uri: huggingface://bartowski/soob3123_amoral-gemma3-4B-GGUF/soob3123_amoral-gemma3-4B-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "thedrummer_fallen-gemma3-4b-v1"
icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/94Zn7g7jE8LavD1bK67Su.gif
urls:
- https://huggingface.co/TheDrummer/Fallen-Gemma3-4B-v1
- https://huggingface.co/bartowski/TheDrummer_Fallen-Gemma3-4B-v1-GGUF
description: |
Fallen Gemma3 4B v1 is an evil tune of Gemma 3 4B but it is not a complete decensor.
Evil tunes knock out the positivity and may enjoy torturing you and humanity.
Vision still works and it has something to say about the crap you feed it.
overrides:
parameters:
model: TheDrummer_Fallen-Gemma3-4B-v1-Q4_K_M.gguf
files:
- filename: TheDrummer_Fallen-Gemma3-4B-v1-Q4_K_M.gguf
sha256: 85490a97bda2d40437c8dade4a68bb58e760c1263a2fbc59191daef57ee2d6c3
uri: huggingface://bartowski/TheDrummer_Fallen-Gemma3-4B-v1-GGUF/TheDrummer_Fallen-Gemma3-4B-v1-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "thedrummer_fallen-gemma3-12b-v1"
icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/WYzaNK5T-heMqRhVWYg6G.gif
urls:
- https://huggingface.co/TheDrummer/Fallen-Gemma3-12B-v1
- https://huggingface.co/bartowski/TheDrummer_Fallen-Gemma3-12B-v1-GGUF
description: |
Fallen Gemma3 12B v1 is an evil tune of Gemma 3 12B but it is not a complete decensor.
Evil tunes knock out the positivity and may enjoy torturing you and humanity.
Vision still works and it has something to say about the crap you feed it.
overrides:
parameters:
model: TheDrummer_Fallen-Gemma3-12B-v1-Q4_K_M.gguf
files:
- filename: TheDrummer_Fallen-Gemma3-12B-v1-Q4_K_M.gguf
sha256: 8b5ff6cf6cd68688fa50c29e7b3c15c3f31c5c4794fff2dd71c9ca5a3d05cff3
uri: huggingface://bartowski/TheDrummer_Fallen-Gemma3-12B-v1-GGUF/TheDrummer_Fallen-Gemma3-12B-v1-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "thedrummer_fallen-gemma3-27b-v1"
icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/9oyZxzpfhmmNr21S1P_iJ.gif
urls:
- https://huggingface.co/TheDrummer/Fallen-Gemma3-27B-v1
- https://huggingface.co/bartowski/TheDrummer_Fallen-Gemma3-27B-v1-GGUF
description: |
Fallen Gemma3 27B v1 is an evil tune of Gemma 3 27B but it is not a complete decensor.
Evil tunes knock out the positivity and may enjoy torturing you and humanity.
Vision still works and it has something to say about the crap you feed it.
overrides:
parameters:
model: TheDrummer_Fallen-Gemma3-27B-v1-Q4_K_M.gguf
files:
- filename: TheDrummer_Fallen-Gemma3-27B-v1-Q4_K_M.gguf
sha256: a72a4da55c3cf61ac5eb91a72ad27b155c8f52e25881272a72939b8aa1960b62
uri: huggingface://bartowski/TheDrummer_Fallen-Gemma3-27B-v1-GGUF/TheDrummer_Fallen-Gemma3-27B-v1-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "huihui-ai_gemma-3-1b-it-abliterated"
urls:
- https://huggingface.co/huihui-ai/gemma-3-1b-it-abliterated
- https://huggingface.co/bartowski/huihui-ai_gemma-3-1b-it-abliterated-GGUF
description: |
This is an uncensored version of google/gemma-3-1b-it created with abliteration (see remove-refusals-with-transformers to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM without using TransformerLens.
overrides:
parameters:
model: huihui-ai_gemma-3-1b-it-abliterated-Q4_K_M.gguf
files:
- filename: huihui-ai_gemma-3-1b-it-abliterated-Q4_K_M.gguf
sha256: 0760a54504d7529daf65f2a5de0692e773313685f50dd7f7eece2dae0dc28338
uri: huggingface://bartowski/huihui-ai_gemma-3-1b-it-abliterated-GGUF/huihui-ai_gemma-3-1b-it-abliterated-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "sicariussicariistuff_x-ray_alpha"
icon: https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha/resolve/main/Images/X-Ray_Alpha.png
urls:
- https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha
- https://huggingface.co/bartowski/SicariusSicariiStuff_X-Ray_Alpha-GGUF
description: |
This is a pre-alpha proof-of-concept of a real fully uncensored vision model.
Why do I say "real"? The few vision models we got (qwen, llama 3.2) were "censored," and their fine-tunes were made only to the text portion of the model, as training a vision model is a serious pain.
The only actually trained and uncensored vision model I am aware of is ToriiGate; the rest of the vision models are just the stock vision + a fine-tuned LLM.
overrides:
parameters:
model: SicariusSicariiStuff_X-Ray_Alpha-Q4_K_M.gguf
files:
- filename: SicariusSicariiStuff_X-Ray_Alpha-Q4_K_M.gguf
sha256: c3547fc287378cb814efc5205613c418cc0f99ef12852cce39a94e3a42e42db5
uri: huggingface://bartowski/SicariusSicariiStuff_X-Ray_Alpha-GGUF/SicariusSicariiStuff_X-Ray_Alpha-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "gemma-3-glitter-12b-i1"
icon: https://huggingface.co/allura-org/Gemma-3-Glitter-12B/resolve/main/ComfyUI_02427_.png
urls:
- https://huggingface.co/allura-org/Gemma-3-Glitter-12B
- https://huggingface.co/mradermacher/Gemma-3-Glitter-12B-i1-GGUF
description: |
A creative writing model based on Gemma 3 12B IT.
This is a 50/50 merge of two separate trains:
ToastyPigeon/g3-12b-rp-system-v0.1 - ~13.5M tokens of instruct-based training related to RP (2:1 human to synthetic) and examples using a system prompt.
ToastyPigeon/g3-12b-storyteller-v0.2-textonly - ~20M tokens of completion training on long-form creative writing; 1.6M synthetic from R1, the rest human-created
overrides:
parameters:
model: Gemma-3-Glitter-12B.i1-Q4_K_M.gguf
files:
- filename: Gemma-3-Glitter-12B.i1-Q4_K_M.gguf
sha256: 875f856524e51fb0c7ddafe3d8b651a3d7077f9bdcd415e1d30abe2daef16a2d
uri: huggingface://mradermacher/Gemma-3-Glitter-12B-i1-GGUF/Gemma-3-Glitter-12B.i1-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "soob3123_amoral-gemma3-12b-v2"
icon: https://cdn-uploads.huggingface.co/production/uploads/62f93f9477b722f1866398c2/Isat4sbJnBZGcxZko9Huz.png
urls:
- https://huggingface.co/soob3123/amoral-gemma3-12B-v2
- https://huggingface.co/bartowski/soob3123_amoral-gemma3-12B-v2-GGUF
description: |
Core Function:
Produces analytically neutral responses to sensitive queries
Maintains factual integrity on controversial subjects
Avoids value-judgment phrasing patterns
Response Characteristics:
No inherent moral framing ("evil slop" reduction)
Emotionally neutral tone enforcement
Epistemic humility protocols (avoids "thrilling", "wonderful", etc.)
overrides:
parameters:
model: soob3123_amoral-gemma3-12B-v2-Q4_K_M.gguf
files:
- filename: soob3123_amoral-gemma3-12B-v2-Q4_K_M.gguf
sha256: eb5792cf73bac3dbaa39e3a79ec01a056affff4607b96f96c9b911c877d5a50a
uri: huggingface://bartowski/soob3123_amoral-gemma3-12B-v2-GGUF/soob3123_amoral-gemma3-12B-v2-Q4_K_M.gguf
- !!merge <<: *gemma3
name: "gemma-3-starshine-12b-i1"
icon: https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B/resolve/main/modelcard_image.jpeg
urls:
- https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B
- https://huggingface.co/mradermacher/Gemma-3-Starshine-12B-i1-GGUF
description: |
A creative writing model based on a merge of fine-tunes on Gemma 3 12B IT and Gemma 3 12B PT.
This is the Story Focused merge. This version works better for storytelling and scenarios, as the prose is more novel-like and it has a tendency to impersonate the user character.
See the Alternate RP Focused version as well.
This is a merge of two G3 models, one trained on instruct and one trained on base:
allura-org/Gemma-3-Glitter-12B - Itself a merge of a storywriting and RP train (both also by ToastyPigeon), on instruct
ToastyPigeon/Gemma-3-Confetti-12B - Experimental application of the Glitter data using base instead of instruct, additionally includes some adventure data in the form of SpringDragon.
The result is a lovely blend of Glitter's ability to follow instructions and Confetti's free-spirit prose, effectively 'loosening up' much of the hesitancy that was left in Glitter.
overrides:
parameters:
model: Gemma-3-Starshine-12B.i1-Q4_K_M.gguf
files:
- filename: Gemma-3-Starshine-12B.i1-Q4_K_M.gguf
sha256: 4c35a678e3784e20a8d85d4e7045d965509a1a71305a0da105fc5991ba7d6dc4
uri: huggingface://mradermacher/Gemma-3-Starshine-12B-i1-GGUF/Gemma-3-Starshine-12B.i1-Q4_K_M.gguf
- &eurollm
name: "eurollm-9b-instruct"
icon: https://openeurollm.eu/_next/static/media/logo-dark.e7001867.svg
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
license: apache-2.0
tags:
- llm
- gguf
- eurollm
- cpu
- gpu
- text-generation
urls:
- https://huggingface.co/utter-project/EuroLLM-9B-Instruct
- https://huggingface.co/bartowski/EuroLLM-9B-Instruct-GGUF
description: |
The EuroLLM project has the goal of creating a suite of LLMs capable of understanding and generating text in all European Union languages as well as some additional relevant languages. EuroLLM-9B is a 9B parameter model trained on 4 trillion tokens divided across the considered languages and several data sources: Web data, parallel data (en-xx and xx-en), and high-quality datasets. EuroLLM-9B-Instruct was further instruction tuned on EuroBlocks, an instruction tuning dataset with focus on general instruction-following and machine translation.
overrides:
parameters:
model: EuroLLM-9B-Instruct-Q4_K_M.gguf
files:
- filename: EuroLLM-9B-Instruct-Q4_K_M.gguf
sha256: 785a3b2883532381704ef74f866f822f179a931801d1ed1cf12e6deeb838806b
uri: huggingface://bartowski/EuroLLM-9B-Instruct-GGUF/EuroLLM-9B-Instruct-Q4_K_M.gguf
- &phi4
url: "github:mudler/LocalAI/gallery/phi-4-chat.yaml@master"
name: "phi-4"
icon: https://avatars.githubusercontent.com/u/6154722
license: mit
tags:
- llm
- gguf
- phi
- cpu
- gpu
- text-generation
urls:
- https://huggingface.co/microsoft/phi-4
- https://huggingface.co/bartowski/phi-4-GGUF
description: |
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.
phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. Phi-4 is a 14B parameters, dense decoder-only Transformer model.
overrides:
parameters:
model: phi-4-Q4_K_M.gguf
files:
- filename: phi-4-Q4_K_M.gguf
uri: huggingface://bartowski/phi-4-GGUF/phi-4-Q4_K_M.gguf
sha256: 009aba717c09d4a35890c7d35eb59d54e1dba884c7c526e7197d9c13ab5911d9
- !!merge <<: *phi4
url: "github:mudler/LocalAI/gallery/phi-4-chat-fcall.yaml@master"
name: "LocalAI-functioncall-phi-4-v0.3"
icon: https://cdn-uploads.huggingface.co/production/uploads/647374aa7ff32a81ac6d35d4/Dzbdzn27KEc3K6zNNi070.png
urls:
- https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.3
- https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.3-Q4_K_M-GGUF
description: |
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4.
overrides:
parameters:
model: localai-functioncall-phi-4-v0.3-q4_k_m.gguf
files:
- filename: localai-functioncall-phi-4-v0.3-q4_k_m.gguf
sha256: 23fee048ded2a6e2e1a7b6bbefa6cbf83068f194caa9552aecbaa00fec8a16d5
uri: huggingface://mudler/LocalAI-functioncall-phi-4-v0.3-Q4_K_M-GGUF/localai-functioncall-phi-4-v0.3-q4_k_m.gguf
- !!merge <<: *phi4
url: "github:mudler/LocalAI/gallery/phi-4-chat-fcall.yaml@master"
name: "LocalAI-functioncall-phi-4-v0.2"
icon: https://cdn-uploads.huggingface.co/production/uploads/647374aa7ff32a81ac6d35d4/Dzbdzn27KEc3K6zNNi070.png
description: |
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4.
This is the second iteration of https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.1 with added CoT (o1) capabilities from the marco-o1 dataset.
urls:
- https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.2
- https://huggingface.co/mudler/localai-functioncall-phi-4-v0.2-Q4_K_M-GGUF
overrides:
parameters:
model: localai-functioncall-phi-4-v0.2-q4_k_m.gguf
files:
- filename: localai-functioncall-phi-4-v0.2-q4_k_m.gguf
uri: huggingface://mudler/localai-functioncall-phi-4-v0.2-Q4_K_M-GGUF/localai-functioncall-phi-4-v0.2-q4_k_m.gguf
sha256: 681b5fb5070f23323a9cc8cbd1306b1c348c2f292041d3ba2335b26b071757b7
- !!merge <<: *phi4
url: "github:mudler/LocalAI/gallery/phi-4-chat-fcall.yaml@master"
name: "LocalAI-functioncall-phi-4-v0.1"
icon: https://cdn-uploads.huggingface.co/production/uploads/647374aa7ff32a81ac6d35d4/Dzbdzn27KEc3K6zNNi070.png
description: |
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4.
urls:
- https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.1
- https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.1-Q4_K_M-GGUF
overrides:
parameters:
model: localai-functioncall-phi-4-v0.1-q4_k_m.gguf
files:
- filename: localai-functioncall-phi-4-v0.1-q4_k_m.gguf
uri: huggingface://mudler/LocalAI-functioncall-phi-4-v0.1-Q4_K_M-GGUF/localai-functioncall-phi-4-v0.1-q4_k_m.gguf
sha256: 0ae4e5e4ba89c16c1e810285c5c8b84416fa67f8ed7c175aa0b6fc0a103017aa
- !!merge <<: *phi4
name: "sicariussicariistuff_phi-lthy4"
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
icon: https://huggingface.co/SicariusSicariiStuff/Phi-lthy4/resolve/main/Images/Phi-Lthy4.png
urls:
- https://huggingface.co/SicariusSicariiStuff/Phi-lthy4
- https://huggingface.co/bartowski/SicariusSicariiStuff_Phi-lthy4-GGUF
description: |
- The BEST Phi-4 Roleplay finetune in the world (Not that much of an achievement here, Phi roleplay finetunes can probably be counted on a single hand).
- Compact size & fully healed from the brain surgery. Only 11.9B parameters. Phi-4 wasn't that hard to run even at 14B; now, with even fewer brain cells, your new phone could probably run it easily. (SD8Gen3 and above recommended.)
- Strong Roleplay & Creative writing abilities. This really surprised me. Actually good.
Writes and roleplays quite uniquely, probably because of the lack of RP/writing slop in the pretrain. Who would have thought?
- Smart assistant with low refusals - It kept some of the smarts, and our little Phi-Lthy here will be quite eager to answer your naughty questions.
- Quite good at following the character card. Finally, it puts its math brain to some productive tasks. Gooner technology is becoming more popular by the day.
overrides:
parameters:
model: SicariusSicariiStuff_Phi-lthy4-Q4_K_M.gguf
files:
- filename: SicariusSicariiStuff_Phi-lthy4-Q4_K_M.gguf
sha256: a5004b2d0f3eb869f07285f53ec283aa383063085987113d2a41cb54708fb5ad
uri: huggingface://bartowski/SicariusSicariiStuff_Phi-lthy4-GGUF/SicariusSicariiStuff_Phi-lthy4-Q4_K_M.gguf
- !!merge <<: *phi4
name: "sicariussicariistuff_phi-line_14b"
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
icon: https://huggingface.co/SicariusSicariiStuff/Phi-Line_14B/resolve/main/Images/Phi-Line_14B.png
urls:
- https://huggingface.co/SicariusSicariiStuff/Phi-Line_14B
- https://huggingface.co/bartowski/SicariusSicariiStuff_Phi-Line_14B-GGUF
description: |
Excellent Roleplay with more brains. (Who would have thought Phi-4 models would be good at this? so weird... )
Medium length response (1-4 paragraphs, usually 2-3).
Excellent assistant that follows instructions well enough, and keeps good formatting.
Strong creative writing abilities. Will obey requests regarding formatting (markdown headlines for paragraphs, etc).
Writes and roleplays quite uniquely, probably because of the lack of RP/writing slop in the pretrain. This is just my guesstimate.
LOW refusals - Total freedom in RP, can do things other RP models won't, and I'll leave it at that. Low refusals in assistant tasks as well.
VERY good at following the character card. Math brain is used for gooner tech, as it should be.
overrides:
parameters:
model: SicariusSicariiStuff_Phi-Line_14B-Q4_K_M.gguf
files:
- filename: SicariusSicariiStuff_Phi-Line_14B-Q4_K_M.gguf
sha256: 552c5a613bc5f24494646858795837ac42d3c216c5caedd7f4d6b954e5df58f2
uri: huggingface://bartowski/SicariusSicariiStuff_Phi-Line_14B-GGUF/SicariusSicariiStuff_Phi-Line_14B-Q4_K_M.gguf
- !!merge <<: *phi4
name: "microsoft_phi-4-mini-instruct"
urls:
- https://huggingface.co/microsoft/Phi-4-mini-instruct
- https://huggingface.co/bartowski/microsoft_Phi-4-mini-instruct-GGUF
description: |
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
overrides:
parameters:
model: microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
files:
- filename: microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
sha256: 01999f17c39cc3074afae5e9c539bc82d45f2dd7faa3917c66cbef76fce8c0c2
uri: huggingface://bartowski/microsoft_Phi-4-mini-instruct-GGUF/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf
- &falcon3
name: "falcon3-1b-instruct"
url: "github:mudler/LocalAI/gallery/falcon3.yaml@master"
icon: https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/general/falco3-logo.png
urls:
- https://huggingface.co/tiiuae/Falcon3-1B-Instruct
- https://huggingface.co/bartowski/Falcon3-1B-Instruct-GGUF
description: |
The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
This repository contains the Falcon3-1B-Instruct. It achieves strong results on reasoning, language understanding, instruction following, code and mathematics tasks. Falcon3-1B-Instruct supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 8K.
overrides:
parameters:
model: Falcon3-1B-Instruct-Q4_K_M.gguf
files:
- filename: Falcon3-1B-Instruct-Q4_K_M.gguf
uri: huggingface://bartowski/Falcon3-1B-Instruct-GGUF/Falcon3-1B-Instruct-Q4_K_M.gguf
sha256: 1c92013dac1ab6e703e787f3e0829ca03cc95311e4c113a77950d15ff6dea7b3
tags:
- llm
- gguf
- gpu
- cpu
- falcon
license: falcon-llm
- !!merge <<: *falcon3
name: "falcon3-3b-instruct"
urls:
- https://huggingface.co/tiiuae/Falcon3-3B-Instruct
- https://huggingface.co/bartowski/Falcon3-3B-Instruct-GGUF
overrides:
parameters:
model: Falcon3-3B-Instruct-Q4_K_M.gguf
files:
- filename: Falcon3-3B-Instruct-Q4_K_M.gguf
uri: huggingface://bartowski/Falcon3-3B-Instruct-GGUF/Falcon3-3B-Instruct-Q4_K_M.gguf
sha256: 6ea6cecba144fe5b711ca07ae4263ccdf6ee6419807a46220419189da8446557
- !!merge <<: *falcon3
name: "falcon3-10b-instruct"
urls:
- https://huggingface.co/tiiuae/Falcon3-10B-Instruct
- https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF
overrides:
parameters:
model: Falcon3-10B-Instruct-Q4_K_M.gguf
files:
- filename: Falcon3-10B-Instruct-Q4_K_M.gguf
uri: huggingface://bartowski/Falcon3-10B-Instruct-GGUF/Falcon3-10B-Instruct-Q4_K_M.gguf
sha256: 0a33327bd71e1788a8e9f17889824a17a65efd3f96a4b2a5e2bc6ff2f39b8241
- !!merge <<: *falcon3
name: "falcon3-1b-instruct-abliterated"
urls:
- https://huggingface.co/huihui-ai/Falcon3-1B-Instruct-abliterated
- https://huggingface.co/bartowski/Falcon3-1B-Instruct-abliterated-GGUF
description: |
This is an uncensored version of tiiuae/Falcon3-1B-Instruct created with abliteration (see remove-refusals-with-transformers to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
overrides:
parameters:
model: Falcon3-1B-Instruct-abliterated-Q4_K_M.gguf
files:
- filename: Falcon3-1B-Instruct-abliterated-Q4_K_M.gguf
sha256: 416d15ce58334b7956818befb088d46c1e3e7153ebf2da2fb9769a5b1ff934a1
uri: huggingface://bartowski/Falcon3-1B-Instruct-abliterated-GGUF/Falcon3-1B-Instruct-abliterated-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "falcon3-3b-instruct-abliterated"
urls:
- https://huggingface.co/huihui-ai/Falcon3-3B-Instruct-abliterated
- https://huggingface.co/bartowski/Falcon3-3B-Instruct-abliterated-GGUF
description: |
This is an uncensored version of tiiuae/Falcon3-3B-Instruct created with abliteration (see remove-refusals-with-transformers to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
overrides:
parameters:
model: Falcon3-3B-Instruct-abliterated-Q4_K_M.gguf
files:
- filename: Falcon3-3B-Instruct-abliterated-Q4_K_M.gguf
sha256: 83773b77b0e34ef115f8a6508192e9f1d3426a61456744493f65cfe1e7f90aa9
uri: huggingface://bartowski/Falcon3-3B-Instruct-abliterated-GGUF/Falcon3-3B-Instruct-abliterated-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "falcon3-10b-instruct-abliterated"
urls:
- https://huggingface.co/huihui-ai/Falcon3-10B-Instruct-abliterated
- https://huggingface.co/bartowski/Falcon3-10B-Instruct-abliterated-GGUF
description: |
This is an uncensored version of tiiuae/Falcon3-10B-Instruct created with abliteration (see remove-refusals-with-transformers to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
overrides:
parameters:
model: Falcon3-10B-Instruct-abliterated-Q4_K_M.gguf
files:
- filename: Falcon3-10B-Instruct-abliterated-Q4_K_M.gguf
sha256: 5940df2ff88e5be93dbe0766b2a9683d7e73c204a69a1348a37f835cf2b5f767
uri: huggingface://bartowski/Falcon3-10B-Instruct-abliterated-GGUF/Falcon3-10B-Instruct-abliterated-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "falcon3-7b-instruct-abliterated"
urls:
- https://huggingface.co/huihui-ai/Falcon3-7B-Instruct-abliterated
- https://huggingface.co/bartowski/Falcon3-7B-Instruct-abliterated-GGUF
description: |
This is an uncensored version of tiiuae/Falcon3-7B-Instruct created with abliteration (see remove-refusals-with-transformers to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
overrides:
parameters:
model: Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf
files:
- filename: Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf
sha256: 68e10e638668acaa49fb7919224c7d8bcf1798126c7a499c4d9ec3b81313f8c8
uri: huggingface://bartowski/Falcon3-7B-Instruct-abliterated-GGUF/Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "nightwing3-10b-v0.1"
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
icon: https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/C6gY9vxCl3_SFzQLpLG0S.png
urls:
- https://huggingface.co/Nitral-AI/NightWing3-10B-v0.1
- https://huggingface.co/bartowski/NightWing3-10B-v0.1-GGUF
description: |
Base model: (Falcon3-10B)
overrides:
parameters:
model: NightWing3-10B-v0.1-Q4_K_M.gguf
files:
- filename: NightWing3-10B-v0.1-Q4_K_M.gguf
sha256: 2e87671542d22fe1ef9a68e43f2fdab7c2759479ad531946d9f0bdeffa6f5747
uri: huggingface://bartowski/NightWing3-10B-v0.1-GGUF/NightWing3-10B-v0.1-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "virtuoso-lite"
urls:
- https://huggingface.co/arcee-ai/Virtuoso-Lite
- https://huggingface.co/bartowski/Virtuoso-Lite-GGUF
description: |
Virtuoso-Lite (10B) is our next-generation, 10-billion-parameter language model based on the Llama-3 architecture. It is distilled from Deepseek-v3 using ~1.1B tokens/logits, allowing it to achieve robust performance at a significantly reduced parameter count compared to larger models. Despite its compact size, Virtuoso-Lite excels in a variety of tasks, demonstrating advanced reasoning, code generation, and mathematical problem-solving capabilities.
overrides:
parameters:
model: Virtuoso-Lite-Q4_K_M.gguf
files:
- filename: Virtuoso-Lite-Q4_K_M.gguf
sha256: 1d21bef8467a11a1e473d397128b05fb87b7e824606cdaea061e550cb219fee2
uri: huggingface://bartowski/Virtuoso-Lite-GGUF/Virtuoso-Lite-Q4_K_M.gguf
- !!merge <<: *falcon3
name: "suayptalha_maestro-10b"
icon: https://huggingface.co/suayptalha/Maestro-10B/resolve/main/Maestro-Logo.png
urls:
- https://huggingface.co/suayptalha/Maestro-10B
- https://huggingface.co/bartowski/suayptalha_Maestro-10B-GGUF
description: |
Maestro-10B is a 10 billion parameter model fine-tuned from Virtuoso-Lite, a next-generation language model developed by arcee-ai. Virtuoso-Lite itself is based on the Llama-3 architecture, distilled from Deepseek-v3 using approximately 1.1 billion tokens/logits. This distillation process allows Virtuoso-Lite to achieve robust performance with a smaller parameter count, excelling in reasoning, code generation, and mathematical problem-solving. Maestro-10B inherits these strengths from its base model, Virtuoso-Lite, and further enhances them through fine-tuning on the OpenOrca dataset. This combination of a distilled base model and targeted fine-tuning makes Maestro-10B a powerful and efficient language model.
overrides:
parameters:
model: suayptalha_Maestro-10B-Q4_K_M.gguf
files:
- filename: suayptalha_Maestro-10B-Q4_K_M.gguf
sha256: c570381da5624782ce6df4186ace6f747429fcbaf1a22c2a348288d3552eb19c
uri: huggingface://bartowski/suayptalha_Maestro-10B-GGUF/suayptalha_Maestro-10B-Q4_K_M.gguf
- &intellect1
name: "intellect-1-instruct"
url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master"
icon: https://huggingface.co/PrimeIntellect/INTELLECT-1-Instruct/resolve/main/intellect-1-map.png
urls:
- https://huggingface.co/PrimeIntellect/INTELLECT-1-Instruct
- https://huggingface.co/bartowski/INTELLECT-1-Instruct-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- intellect
license: apache-2.0
description: |
INTELLECT-1 is the first collaboratively trained 10 billion parameter language model trained from scratch on 1 trillion tokens of English text and code.
This is an instruct model. The base model associated with it is INTELLECT-1.
INTELLECT-1 was trained on up to 14 concurrent nodes distributed across 3 continents, with contributions from 30 independent community contributors providing compute. The training code utilizes the prime framework, a scalable distributed training framework designed for fault-tolerant, dynamically scaling, high-performance training on unreliable, globally distributed workers. The key abstraction that allows dynamic scaling is the ElasticDeviceMesh, which manages dynamic global process groups for fault-tolerant communication across the internet and local process groups for communication within a node. The model was trained using the DiLoCo algorithm with 100 inner steps. The global all-reduce was done with custom int8 all-reduce kernels to reduce the communication payload required, reducing the communication overhead by a factor of 400x.
overrides:
parameters:
model: INTELLECT-1-Instruct-Q4_K_M.gguf
files:
- filename: INTELLECT-1-Instruct-Q4_K_M.gguf
sha256: 5df236fe570e5998d07fb3207788eac811ef3b77dd2a0ad04a2ef5c6361f3030
uri: huggingface://bartowski/INTELLECT-1-Instruct-GGUF/INTELLECT-1-Instruct-Q4_K_M.gguf
- &llama33
url: "github:mudler/LocalAI/gallery/llama3.1-instruct.yaml@master"
icon: https://avatars.githubusercontent.com/u/153379578
license: llama3.3
description: |
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks.
tags:
- llm
- gguf
- gpu
- cpu
- llama3.3
name: "llama-3.3-70b-instruct"
urls:
- https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
- https://huggingface.co/MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF
overrides:
parameters:
model: Llama-3.3-70B-Instruct.Q4_K_M.gguf
files:
- filename: Llama-3.3-70B-Instruct.Q4_K_M.gguf
sha256: 4f3b04ecae278bdb0fd545b47c210bc5edf823e5ebf7d41e0b526c81d54b1ff3
uri: huggingface://MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF/Llama-3.3-70B-Instruct.Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-70b-euryale-v2.3"
icon: https://huggingface.co/Sao10K/L3.3-70B-Euryale-v2.3/resolve/main/Eury.png
urls:
- https://huggingface.co/Sao10K/L3.3-70B-Euryale-v2.3
- https://huggingface.co/bartowski/L3.3-70B-Euryale-v2.3-GGUF
description: |
A direct replacement / successor to Euryale v2.2, not Hanami-x1, though it is slightly better than them in my opinion.
overrides:
parameters:
model: L3.3-70B-Euryale-v2.3-Q4_K_M.gguf
files:
- filename: L3.3-70B-Euryale-v2.3-Q4_K_M.gguf
sha256: 4e78bb0e65886bfcff89b829f6d38aa6f6846988bb8291857e387e3f60b3217b
uri: huggingface://bartowski/L3.3-70B-Euryale-v2.3-GGUF/L3.3-70B-Euryale-v2.3-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-ms-evayale-70b"
icon: https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/HFCaVzRpiE05Y46p41qRy.webp
urls:
- https://huggingface.co/Steelskull/L3.3-MS-Evayale-70B
- https://huggingface.co/bartowski/L3.3-MS-Evayale-70B-GGUF
description: |
This model was created as I liked the storytelling of EVA but the prose and details of scenes from EURYALE; my goal is to merge the robust storytelling of both models while attempting to maintain the positives of both.
overrides:
parameters:
model: L3.3-MS-Evayale-70B-Q4_K_M.gguf
files:
- filename: L3.3-MS-Evayale-70B-Q4_K_M.gguf
sha256: f941d88870fec8343946517a1802d159d23f3971eeea50b6cf12295330bd29cc
uri: huggingface://bartowski/L3.3-MS-Evayale-70B-GGUF/L3.3-MS-Evayale-70B-Q4_K_M.gguf
- !!merge <<: *llama33
name: "anubis-70b-v1"
icon: https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/qQbZvnrWYvH8dMZORLBJn.webp
urls:
- https://huggingface.co/TheDrummer/Anubis-70B-v1
- https://huggingface.co/bartowski/Anubis-70B-v1-GGUF
description: |
It's a very balanced model between the L3.3 tunes. It's very creative, able to come up with new and interesting scenarios on its own that will thoroughly surprise you in ways that remind me of a 123B model. It has some of the most natural-sounding dialogue and prose that can come out of any model I've tried with the right swipe, in a way that truly brings your characters and RP to life and makes you feel like you're talking to a human writer instead of an AI - a quality that reminds me of Character AI in its prime. This model loves a great prompt and thrives off instructions.
overrides:
parameters:
model: Anubis-70B-v1-Q4_K_M.gguf
files:
- filename: Anubis-70B-v1-Q4_K_M.gguf
sha256: 9135f7090c675726469bd3a108cfbdddaa18638bad8e513928410de4b8bfd4d4
uri: huggingface://bartowski/Anubis-70B-v1-GGUF/Anubis-70B-v1-Q4_K_M.gguf
- !!merge <<: *llama33
name: "llama-3.3-70b-instruct-ablated"
icon: https://cdn-uploads.huggingface.co/production/uploads/6587d8dd1b44d0e694104fbf/0dkt6EhZYwXVBxvSWXdaM.png
urls:
- https://huggingface.co/NaniDAO/Llama-3.3-70B-Instruct-ablated
- https://huggingface.co/bartowski/Llama-3.3-70B-Instruct-ablated-GGUF
description: |
Llama 3.3 instruct 70B 128k context with ablation technique applied for a more helpful (and based) assistant.
This means it will refuse less of your valid requests for an uncensored UX. Use responsibly and use common sense.
We do not take any responsibility for how you apply this intelligence, just as we do not for how you apply your own.
overrides:
parameters:
model: Llama-3.3-70B-Instruct-ablated-Q4_K_M.gguf
files:
- filename: Llama-3.3-70B-Instruct-ablated-Q4_K_M.gguf
sha256: 090b2288810c5f6f680ff5cb4bc97665393d115c011fcd54dca6aec02e74a983
uri: huggingface://bartowski/Llama-3.3-70B-Instruct-ablated-GGUF/Llama-3.3-70B-Instruct-ablated-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-ms-evalebis-70b"
icon: https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/e49ykknqXee3Ihr-3BIl_.png
urls:
- https://huggingface.co/Steelskull/L3.3-MS-Evalebis-70b
- https://huggingface.co/bartowski/L3.3-MS-Evalebis-70b-GGUF
description: |
This model was created as I liked the storytelling of EVA and the prose and details of scenes from EURYALE and Anubis; my goal is to merge the robust storytelling of all three models while attempting to maintain their positives.
overrides:
parameters:
model: L3.3-MS-Evalebis-70b-Q4_K_M.gguf
files:
- filename: L3.3-MS-Evalebis-70b-Q4_K_M.gguf
sha256: 5515110ab6a583f6eb360533e3c5b3dda6d402af407c0b0f2b34a2a57b5224d5
uri: huggingface://bartowski/L3.3-MS-Evalebis-70b-GGUF/L3.3-MS-Evalebis-70b-Q4_K_M.gguf
- !!merge <<: *llama33
name: "rombos-llm-70b-llama-3.3"
icon: "https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/QErypCEKD5OZLxUcSmYaR.jpeg"
urls:
- https://huggingface.co/rombodawg/Rombos-LLM-70b-Llama-3.3
- https://huggingface.co/bartowski/Rombos-LLM-70b-Llama-3.3-GGUF
- https://docs.google.com/document/d/1OjbjU5AOz4Ftn9xHQrX3oFQGhQ6RDUuXQipnQ9gn6tU/edit?usp=sharing
description: |
You know the drill by now.
Here is the paper. Have fun.
https://docs.google.com/document/d/1OjbjU5AOz4Ftn9xHQrX3oFQGhQ6RDUuXQipnQ9gn6tU/edit?usp=sharing
overrides:
parameters:
model: Rombos-LLM-70b-Llama-3.3-Q4_K_M.gguf
files:
- filename: Rombos-LLM-70b-Llama-3.3-Q4_K_M.gguf
uri: huggingface://bartowski/Rombos-LLM-70b-Llama-3.3-GGUF/Rombos-LLM-70b-Llama-3.3-Q4_K_M.gguf
sha256: 613008b960f6fff346b5dec71a87cd7ecdaff205bfea6332bd8fe2bb46177352
- !!merge <<: *llama33
name: "70b-l3.3-cirrus-x1"
icon: https://huggingface.co/Sao10K/70B-L3.3-Cirrus-x1/resolve/main/venti.png
urls:
- https://huggingface.co/Sao10K/70B-L3.3-Cirrus-x1
- https://huggingface.co/bartowski/70B-L3.3-Cirrus-x1-GGUF
description: |
- Same data composition as Freya, applied differently, trained longer too.
- Merging with its checkpoints was also involved.
- Has a nice style, with occasional issues that can be easily fixed.
- A more stable version compared to previous runs.
overrides:
parameters:
model: 70B-L3.3-Cirrus-x1-Q4_K_M.gguf
files:
- filename: 70B-L3.3-Cirrus-x1-Q4_K_M.gguf
sha256: 07dd464dddba959df8eb2f937787c2210b4c51c2375bd7c7ab2abbe198142a19
uri: huggingface://bartowski/70B-L3.3-Cirrus-x1-GGUF/70B-L3.3-Cirrus-x1-Q4_K_M.gguf
- !!merge <<: *llama33
name: "negative_llama_70b"
icon: https://huggingface.co/SicariusSicariiStuff/Negative_LLAMA_70B/resolve/main/Images/Negative_LLAMA_70B.png
urls:
- https://huggingface.co/SicariusSicariiStuff/Negative_LLAMA_70B
- https://huggingface.co/bartowski/Negative_LLAMA_70B-GGUF
description: |
- Strong Roleplay & Creative writing abilities.
- Less positivity bias.
- Very smart assistant with low refusals.
- Exceptionally good at following the character card.
- Characters feel more 'alive', and will occasionally initiate stuff on their own (without being prompted to, but fitting to their character).
- Strong ability to comprehend and roleplay uncommon physical and mental characteristics.
overrides:
parameters:
model: Negative_LLAMA_70B-Q4_K_M.gguf
files:
- filename: Negative_LLAMA_70B-Q4_K_M.gguf
sha256: 023c6bd38f6a66178529e6bb77b6e76379ae3ee031adc6885531986aa12750d9
uri: huggingface://bartowski/Negative_LLAMA_70B-GGUF/Negative_LLAMA_70B-Q4_K_M.gguf
- !!merge <<: *llama33
name: "negative-anubis-70b-v1"
icon: https://huggingface.co/knifeayumu/Negative-Anubis-70B-v1/resolve/main/Negative-Anubis.png
urls:
- https://huggingface.co/knifeayumu/Negative-Anubis-70B-v1
- https://huggingface.co/bartowski/Negative-Anubis-70B-v1-GGUF
description: |
Enjoyed SicariusSicariiStuff/Negative_LLAMA_70B, but the prose was too dry for my tastes. So I merged it with TheDrummer/Anubis-70B-v1 for verbosity. Anubis has a positivity bias, so Negative could balance things out.
This is a merge of pre-trained language models created using mergekit.
The following models were included in the merge:
SicariusSicariiStuff/Negative_LLAMA_70B
TheDrummer/Anubis-70B-v1
overrides:
parameters:
model: Negative-Anubis-70B-v1-Q4_K_M.gguf
files:
- filename: Negative-Anubis-70B-v1-Q4_K_M.gguf
sha256: ac088da9ca70fffaa70c876fbada9fc5a02e7d6049ef68f16b11a9c3256f2510
uri: huggingface://bartowski/Negative-Anubis-70B-v1-GGUF/Negative-Anubis-70B-v1-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-ms-nevoria-70b"
icon: https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/dtlCF4LbekmDD2y3LNpdH.jpeg
urls:
- https://huggingface.co/Steelskull/L3.3-MS-Nevoria-70b
- https://huggingface.co/bartowski/L3.3-MS-Nevoria-70b-GGUF
description: |
This model was created as I liked the storytelling of EVA and the prose and details of scenes from EURYALE and Anubis, enhanced with Negative_LLAMA to kill off the positive bias, with a touch of Nemotron sprinkled in.
The choice to use the lorablated model as a base was intentional - while it might seem counterintuitive, this approach creates unique interactions between the weights, similar to what was achieved in the original Astoria model and the Astoria V2 model. Rather than simply removing refusals, the "weight twisting" effect that occurs when subtracting the lorablated base model from the other models during the merge process creates an interesting balance in the final model's behavior. While this approach differs from traditional sequential application of components, it was chosen for its unique characteristics in the model's responses.
overrides:
parameters:
model: L3.3-MS-Nevoria-70b-Q4_K_M.gguf
files:
- filename: L3.3-MS-Nevoria-70b-Q4_K_M.gguf
sha256: e8b0763f263089a19d4b112b7ed5085cc5f1ed9ca49c5085baa8d51f4ded1f94
uri: huggingface://bartowski/L3.3-MS-Nevoria-70b-GGUF/L3.3-MS-Nevoria-70b-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-70b-magnum-v4-se"
urls:
- https://huggingface.co/Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
- https://huggingface.co/bartowski/L3.3-70B-Magnum-v4-SE-GGUF
description: |
The Magnum v4 series is complete, but here's something a little extra I wanted to tack on as I wasn't entirely satisfied with the results of v4 72B. "SE" for Special Edition - this model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. The dataset is a slightly revised variant of the v4 data with some elements of the v2 data re-introduced.
The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output.
overrides:
parameters:
model: L3.3-70B-Magnum-v4-SE-Q4_K_M.gguf
files:
- filename: L3.3-70B-Magnum-v4-SE-Q4_K_M.gguf
sha256: 9724a6364a42caa3d5a1687258eb329c9af6cbb2ce01c8dd556c1a222a2e0352
uri: huggingface://bartowski/L3.3-70B-Magnum-v4-SE-GGUF/L3.3-70B-Magnum-v4-SE-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-prikol-70b-v0.2"
icon: https://files.catbox.moe/x9t3zo.png
urls:
- https://huggingface.co/Nohobby/L3.3-Prikol-70B-v0.2
- https://huggingface.co/bartowski/L3.3-Prikol-70B-v0.2-GGUF
description: |
A merge of some Llama 3.3 models because um uh yeah
Went extra schizo on the recipe, hoping for an extra fun result, and... Well, I guess it's an overall improvement over the previous revision. It's a tiny bit smarter, has even more distinct swipes and nice dialogues, but for some reason it's damn sloppy.
I've published the second step of this merge as a separate model, and I'd say the results are more interesting, but not as usable as this one. https://huggingface.co/Nohobby/AbominationSnowPig
Prompt format: Llama3 OR Llama3 Context and ChatML Instruct. It actually works a bit better this way
overrides:
parameters:
model: L3.3-Prikol-70B-v0.2-Q4_K_M.gguf
files:
- filename: L3.3-Prikol-70B-v0.2-Q4_K_M.gguf
sha256: fc0ff514efbc0b67981c2bf1423d5a2e1b8801e4266ba0c653ea148414fe5ffc
uri: huggingface://bartowski/L3.3-Prikol-70B-v0.2-GGUF/L3.3-Prikol-70B-v0.2-Q4_K_M.gguf
- !!merge <<: *llama33
name: "l3.3-nevoria-r1-70b"
icon: https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/_oWpsvCZ-graNKzJBBjGo.jpeg
urls:
- https://huggingface.co/Steelskull/L3.3-Nevoria-R1-70b
- https://huggingface.co/bartowski/L3.3-Nevoria-R1-70b-GGUF
description: |
This model builds upon the original Nevoria foundation, incorporating the Deepseek-R1 reasoning architecture to enhance dialogue interaction and scene comprehension. While maintaining Nevoria's core strengths in storytelling and scene description (derived from EVA, EURYALE, and Anubis), this iteration aims to improve prompt adherence and creative reasoning capabilities. The model also retains the balanced perspective introduced by Negative_LLAMA and Nemotron elements. Also, the model plays the card almost to a fault: it'll pick up on minor issues and attempt to run with them. Users have had it call them out for misspelling a word while playing in character.
Note: Nevoria-R1 represents a significant architectural change rather than a direct successor to Nevoria; it operates as a distinct model with its own characteristics.
The lorablated model base choice was intentional, creating unique weight interactions similar to the original Astoria model and Astoria V2 model. This "weight twisting" effect, achieved by subtracting the lorablated base model during merging, creates an interesting balance in the model's behavior. While unconventional compared to sequential component application, this approach was chosen for its unique response characteristics.
overrides:
parameters:
model: L3.3-Nevoria-R1-70b-Q4_K_M.gguf
files:
- filename: L3.3-Nevoria-R1-70b-Q4_K_M.gguf
sha256: 9f32f202fb5b1465c942693bb11eea9e8a1c5686b00602715b495c068eaf1c58
uri: huggingface://bartowski/L3.3-Nevoria-R1-70b-GGUF/L3.3-Nevoria-R1-70b-Q4_K_M.gguf
- !!merge <<: *llama33
name: "nohobby_l3.3-prikol-70b-v0.4"
icon: https://files.catbox.moe/x9t3zo.png
urls:
- https://huggingface.co/Nohobby/L3.3-Prikol-70B-v0.4
- https://huggingface.co/bartowski/Nohobby_L3.3-Prikol-70B-v0.4-GGUF
description: |
I have yet to try it UPD: it sucks, bleh
Sometimes mistakes {{user}} for {{char}} and can't think. Other than that, the behavior is similar to the predecessors.
It sometimes gives some funny replies tho, yay!
overrides:
parameters:
model: Nohobby_L3.3-Prikol-70B-v0.4-Q4_K_M.gguf
files:
- filename: Nohobby_L3.3-Prikol-70B-v0.4-Q4_K_M.gguf
sha256: e1d67a40bdf0526bdfcaa16c6e4dfeecad41651e201b4009b65f4f444b773604
uri: huggingface://bartowski/Nohobby_L3.3-Prikol-70B-v0.4-GGUF/Nohobby_L3.3-Prikol-70B-v0.4-Q4_K_M.gguf
- !!merge <<: *llama33
name: "arliai_llama-3.3-70b-arliai-rpmax-v1.4"
urls:
- https://huggingface.co/ArliAI/Llama-3.3-70B-ArliAI-RPMax-v1.4
- https://huggingface.co/bartowski/ArliAI_Llama-3.3-70B-ArliAI-RPMax-v1.4-GGUF
description: |
RPMax is a series of models that are trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, which makes sure the model does not latch on to a certain personality and be capable of understanding and acting appropriately to any characters or situations.