llama-cpp-python backend - treats request as sent 90 times? #34
Hi, thanks again for this project.
I finally got to test with the llama-cpp-python backend.
I used the built-in model downloader feature to download the Mistral 4.0 GGUF model.
I pip-installed gallama.
When I pinged the server with a function-calling request, it seems to have processed the request 90+ times? (I know the request itself works because I've recently used it with other local backends and with OpenAI/Azure.)
Any logs I could share?
The log output just repeats over and over, incrementing the int from 0 to 100, and then stops processing.

Comments
Hi, can you share more detail about how you make the request so I can try to reproduce the issue? Are you using the openai library, or making the request by other means, e.g. requests, curl, etc.?
Yes, the openai Python SDK; the most relevant underlying code is here: https://github.com/ShipBit/wingman-ai/blob/main/providers/open_ai.py
I'm also going to try again myself later today, just in case this was some random blip the first time running gallama.
EDIT: Yeah, this seems to continue. I'm also getting an error at the beginning (which seemingly doesn't prevent the server from starting or the model loading) that says "module not found: torch" for the app.py in gallama.
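For context, a minimal sketch of the kind of function-calling request involved, using the openai Python SDK (v1.x) pointed at a locally hosted OpenAI-compatible server. The base_url, port, model name, and tool definition below are hypothetical placeholders, not taken from the linked wingman-ai code:

```python
from openai import OpenAI

# Point the client at the local gallama server instead of api.openai.com.
# URL and API key are placeholders; a local server typically ignores the key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# A toy tool definition; the real request defines its own tools.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="Mistral",  # hypothetical model name
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
print(response.choices[0].message)
```

A single call like this should be handled once by the backend; the report above is that it appears to be processed 90+ times.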
May I ask, are you using macOS, Linux, or Windows? I recall having a similar error on a MacBook, but I thought I had already fixed it. Do try to install the latest version.
I am travelling these few days, so it will be a bit hard to respond to you promptly, but I will try to work on this as much as I can. If you have any further bugs/issues, do share the log, as it will be helpful for me to try to reproduce them.
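A likely upgrade command, assuming the package is published on PyPI as gallama (consistent with the pip install mentioned in the next reply):

```
pip install --upgrade gallama
```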
Thanks! No rush! I'm on Windows 11. I just did the pip install of gallama last night, so it should be the latest, but I'll run that command just in case. Nothing is contained in my logs folder.
EDIT: Yes, I was already on the most recent version, confirmed.
Thanks for understanding. Did you install this natively on Windows or inside WSL on Windows?
Regular Windows 11, natively.