vLLM can do this. #114
Closed
entrepeneur4lyf started this conversation in Results
Replies: 1 comment
- Thanks for sharing this!
- You may want to look at vLLM for inspiration, since it supports distributed GPU inference.
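
For context, here is a minimal sketch of the distributed inference the comment refers to, using vLLM's tensor parallelism. The model name, GPU count, and prompt are illustrative; it assumes vLLM is installed and the host has multiple GPUs.

```python
# Minimal sketch: tensor-parallel inference with vLLM across multiple GPUs.
# Assumes `pip install vllm` and a host with 4 GPUs; the model name and
# tensor_parallel_size are illustrative, not a recommended configuration.
from vllm import LLM, SamplingParams

# tensor_parallel_size shards the model's weights across the GPUs;
# vLLM handles the inter-GPU communication internally.
llm = LLM(model="meta-llama/Llama-2-13b-hf", tensor_parallel_size=4)

params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(
    ["Explain distributed GPU inference in one sentence."], params
)

for out in outputs:
    print(out.outputs[0].text)
```

The same sharding is available without code changes when serving: `vllm serve <model> --tensor-parallel-size 4` exposes an OpenAI-compatible endpoint backed by the sharded model.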