
ramalama serve is not checking if the port is available before running the command #797

Open
benoitf opened this issue Feb 12, 2025 · 6 comments · May be fixed by #898

Comments

@benoitf
Contributor

benoitf commented Feb 12, 2025

Being a pretty dumb user, I don't care about the port where the model will be served. It can be a random port, as long as it's displayed.

  1. using 8080 as a default port is quite challenging because, as a developer, I have tons of apps that I start that are already using this port
    (so picking a port different from 8080 can increase the chance that the first attempt succeeds)

  2. most CLI launchers detect that the port is in use and serve the app using a basic '+1' on the default port

So if I launch 2 serve commands, it won't fail (unless I specify a port number, in which case, yes, I expect it to fail if the port is already taken; but without specifying a port number, I assume the tool will pick one for me that works).

here is the current error message:

Error: preparing container ee6f2c8cd4377db38836b25c7523147a2383bcd9d86f22db0446b15212bd5bfb for attach: cannot listen on the TCP port: listen tcp4 :8080: bind: address already in use
@ericcurtin
Collaborator

  1. makes sense to me: keep trying ports, incrementing until an available one is found
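The increment approach suggested above can be sketched in a few lines of Python (a minimal sketch; `find_free_port` is a hypothetical helper name, not ramalama's actual code):

```python
import socket

def find_free_port(start: int = 8080, max_tries: int = 100) -> int:
    """Probe start, start+1, ... and return the first port that binds."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("0.0.0.0", port))
                return port  # bindable right now; note the check is racy
            except OSError:
                continue  # already in use, try the next port
    raise RuntimeError(f"no free port in {start}-{start + max_tries - 1}")
```

The probe socket is closed before the server starts, so another process can still grab the port in between; that race is discussed further down the thread.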

@rhatdan
Member

rhatdan commented Feb 13, 2025

We would need to somehow display the port.

@andreadecorte

I faced this too; I will try to come up with a PR for it. A minor but useful UX improvement.

@rhatdan
Member

rhatdan commented Feb 28, 2025

Checking whether a port is in use is easy on the client side, although racy. Attempt to bind to the default port if the user does not specify one, and then fail over to a random port. Launch the container or service with that port.

If the user specifies --port, do not do this.

The question is: if the user specifies a port in ramalama.conf, do we fail or not?

The only way the user can figure out which port we chose is to do a ramalama inspect.

@ericcurtin
Collaborator

> Checking whether a port is in use is easy on the client side, although racy. Attempt to bind to the default port if the user does not specify one, and then fail over to a random port. Launch the container or service with that port.
>
> If the user specifies --port, do not do this.
>
> The question is: if the user specifies a port in ramalama.conf, do we fail or not?

It's a good question. I would say yes, fail. Maybe we can make it possible to specify a range of ports with some syntax like:

5000:5100
5000-5100
etc.
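Either separator could be handled by a tiny parser, for example (hypothetical syntax and helper name; nothing here was agreed on in the thread):

```python
def parse_port_range(spec: str) -> range:
    """Accept 'START:END' or 'START-END'; a bare port is a range of one."""
    for sep in (":", "-"):
        if sep in spec:
            lo, hi = spec.split(sep, 1)
            return range(int(lo), int(hi) + 1)  # inclusive upper bound
    port = int(spec)
    return range(port, port + 1)
```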

> The only way the user can figure out which port we chose is to do a ramalama inspect.

@andreadecorte

> The only way the user can figure out which port we chose is to do a ramalama inspect.

We seem to already print the serving port when starting (probably it's llama.cpp doing that?):

main: model loaded
main: chat template, chat_template: {{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token}}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}, example_format: '[INST] You are a helpful assistant
Hello [/INST]Hi there</s>[INST] How are you? [/INST]'
main: server is listening on http://0.0.0.0:8080 - starting the main loop
srv  update_slots: all slots are idle

In any case, we can print out the chosen port before launching the container, like this:

/ramalama --container serve huggingface://MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3.Q4_K_M.gguf
serving on port 8080
 99% |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████  |    4.07 GB/   4.07 GB  46.85 MB/s        0sbuild: 4764 (7ad0779f) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for aarch64-redhat-linux
system info: n_threads = 6, n_threads_batch = 6, total_threads = 6
  • happy path -> the port is right
  • sad path -> the port is no longer available because something stole it in the meantime, and we get today's error

So I would say that this would already be an improvement.

andreadecorte added a commit to andreadecorte/ramalama that referenced this issue Feb 28, 2025
This change checks whether the default port is already taken. If it is,
it tries incrementing by 1 until it finds a free port.
We also print out the chosen port for the user.

This does not apply if the user overrides the default port.

Note that this check could still be insufficient if the chosen port is taken
by another process after our check; in that case we will still fail at a later phase, as today.

Includes unit testing.

Closes containers#797

Signed-off-by: Andrea Decorte <adecorte@redhat.com>