[pull] master from mudler:master #74

pull · 2024-05-13T14:34:22Z

See Commits and Changes for more details.

Can you help keep this open source service alive? 💖 Please sponsor : )

* auto select cpu variant
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* remove cuda target for now
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix metal
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix path
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

---------

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat(llama.cpp): add flash_attn and no_kv_offload Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Signed-off-by: mudler <mudler@localai.io>

* auto select cpu variant
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* remove cuda target for now
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix metal
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix path
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* cuda
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* auto select cuda
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* update test
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* select CUDA backend only if present
  Signed-off-by: mudler <mudler@localai.io>
* ci: keep cuda bin in path
  Signed-off-by: mudler <mudler@localai.io>
* Makefile: make dist now builds also cuda
  Signed-off-by: mudler <mudler@localai.io>
* Keep pushing fallback in case auto-flagset/nvidia fails

  There could be other reasons for which the default binary may fail. For example, we might have detected an Nvidia GPU, but the user might not have the drivers/CUDA libraries installed on the system, and so it would fail to start. We keep the llama.cpp fallback at the end of the llama.cpp backends so that loading can fall back to it in case things go wrong.

  Signed-off-by: mudler <mudler@localai.io>
* Do not build cuda on MacOS
  Signed-off-by: mudler <mudler@localai.io>
* cleanup
  Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* Apply suggestions from code review
  Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
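The fallback behaviour described in the commit message above can be sketched roughly as follows. This is an illustrative Python sketch, not LocalAI's actual Go implementation: the backend names, `order_backends`, and `load_first_working` are all hypothetical, and the real detection logic may differ.

```python
from typing import Callable, List

# Hypothetical backend names, used only for illustration.
CUDA_BACKEND = "llama-cpp-cuda"
CPU_BACKEND = "llama-cpp-cpu-variant"
FALLBACK_BACKEND = "llama-cpp-fallback"


def order_backends(gpu_detected: bool, cuda_binary_present: bool) -> List[str]:
    """Return the llama.cpp backends in the order they should be tried."""
    # Pick the CUDA build only if a GPU was detected AND the binary is present.
    if gpu_detected and cuda_binary_present:
        candidates = [CUDA_BACKEND]
    else:
        candidates = [CPU_BACKEND]
    # The fallback always goes last, so a failing auto-selected
    # binary does not leave the user without a working backend.
    candidates.append(FALLBACK_BACKEND)
    return candidates


def load_first_working(candidates: List[str],
                       try_load: Callable[[str], None]) -> str:
    """Try each backend in order and return the first one that starts."""
    for name in candidates:
        try:
            try_load(name)
            return name
        except RuntimeError:
            # e.g. GPU detected but drivers/CUDA libraries are missing
            continue
    raise RuntimeError("no llama.cpp backend could be started")
```

With a loader that fails for the CUDA build (for instance because the CUDA libraries are not installed), `load_first_working(order_backends(True, True), loader)` would end up on the fallback backend rather than failing outright.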

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* feat(llama.cpp): support distributed llama.cpp
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: let tweak how chat messages are merged together
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactor
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Makefile: register to ALL_GRPC_BACKENDS
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring, allow disable auto-detection of backends
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* minor fixups
  Signed-off-by: mudler <mudler@localai.io>
* feat: add cmd to start rpc-server from llama.cpp
  Signed-off-by: mudler <mudler@localai.io>
* ci: add ccache
  Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>

feat(functions): support mixed JSON BNF grammar

This PR provides new options to control how functions are extracted from the LLM, and gives more control over how JSON grammars can be used (also in conjunction).

New YAML settings introduced:

- `grammar_message`: when enabled, the generated grammar can also emit plain strings, not only JSON objects. This lets the LLM choose to respond either freely or with JSON.
- `grammar_prefix`: prefixes a string to the JSON grammar definition.
- `replace_results`: a map of strings to replace in the LLM result.

As an example, consider the following settings for Hermes-2-Pro-Mistral, which allow extracting both the JSON results coming from the model and the ones coming from the grammar:

```yaml
function:
  # disable injecting the "answer" tool
  disable_no_action: true
  # This allows the grammar to also return messages
  grammar_message: true
  # Prefix to add to the grammar
  grammar_prefix: '<tool_call>\n'
  return_name_in_function_response: true
  # Without grammar uncomment the lines below
  # Warning: this is relying only on the capability of the
  # LLM model to generate the correct function call.
  # no_grammar: true
  # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
```

Note: to disable grammar usage entirely in the example above, uncomment `no_grammar` and `json_regex_match`.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
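To make the two extraction paths in the example concrete, here is a rough Python sketch of what `replace_results` and `json_regex_match` could do to a model's raw output. It is an illustration under assumptions, not LocalAI's Go code: `apply_replacements` and `match_json` are hypothetical helpers, the sample outputs are invented, and the `"\'"` key is read here as a plain single quote.

```python
import json
import re
from typing import Dict, Optional


def apply_replacements(text: str, replace_results: Dict[str, str]) -> str:
    """Apply each string replacement from the map to the LLM result."""
    for old, new in replace_results.items():
        text = text.replace(old, new)
    return text


def match_json(text: str, json_regex_match: str) -> Optional[str]:
    """Return the first capture group of the pattern, or None if no match."""
    found = re.search(json_regex_match, text)
    return found.group(1) if found else None


# Grammar path (assumed sample output): the grammar prefix is stripped and
# single quotes are normalised so the remainder parses as JSON.
with_grammar = "<tool_call>\n{'name': 'get_weather', 'arguments': {'city': 'Rome'}}"
cleaned = apply_replacements(with_grammar, {"<tool_call>": "", "'": '"'})
call_from_grammar = json.loads(cleaned)

# No-grammar path (assumed sample output): the regex from the example
# pulls the JSON out of the surrounding free text.
without_grammar = (
    'Let me check.\n<tool_call>\n'
    '{"name": "get_weather", "arguments": {"city": "Rome"}}\n</tool_call>'
)
matched = match_json(without_grammar, r"(?s)<tool_call>(.*?)</tool_call>")
call_from_regex = json.loads(matched) if matched is not None else None
```

In this sketch both paths yield the same parsed function call. The blanket quote replacement is deliberately naive and would also rewrite apostrophes inside argument values.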

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

Correct llama3-8b-instruct model file

This must be a mistake, because the config tries to use a model file that is different from the one actually being downloaded. I assumed the downloaded file is the one that should be used, so I changed the specified model file to match it.

Signed-off-by: Aleksandr Oleinikov <10602045+tannisroot@users.noreply.github.com>

Signed-off-by: mudler <mudler@localai.io>

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

pull bot added the ⤵️ pull label May 13, 2024

models(gallery): add aura-llama-Abliterated (#2309)

4d70b6f

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

github-actions bot added the area/ai-model label May 13, 2024

mudler and others added 10 commits May 13, 2024 18:44

models(gallery): add Bunny-llama (#2311)

fa7b2ae

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

models(gallery): add lumimaidv2 (#2312)

2db2208

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

models(gallery): add orthocopter (#2313)

7123d07

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat(llama.cpp): add flash_attention and no_kv_offloading (#2310)

e49ea01

feat(llama.cpp): add flash_attn and no_kv_offload Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

⬆️ Update ggerganov/whisper.cpp (#2317)

4ac7956

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

feat(functions): support models with no grammar and no regex (#2315)

c4186f1

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

feat(functions): allow to set JSON matcher (#2319)

84e2407

Signed-off-by: mudler <mudler@localai.io>

⬆️ Update ggerganov/whisper.cpp (#2326)

566b5cf

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

Update README.md

2990966

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

github-actions bot added the kind/documentation label May 14, 2024

mudler and others added 8 commits May 15, 2024 01:17

Update README.md

07c0559

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

Update README.md

4c845fb

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

⬆️ Update ggerganov/llama.cpp (#2316)

b584dcf

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

models(gallery): add hermes-2-theta-llama-3-8b (#2331)

f7508e3

Signed-off-by: mudler <mudler@localai.io>

⬆️ Update ggerganov/whisper.cpp (#2329)

4e92569

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

pull bot merged commit 4e92569 into kp-forks:master May 16, 2024
