Confusion after running serve #83
Replies: 6 comments 1 reply
-
You can curl it... I'm assuming you have enough RAM on your MacBook Pro. It's late for me, but you could just play around with:
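For example, something along these lines (a minimal sketch, assuming the server is on the default port 8080 and using llama.cpp's /completion endpoint):

```shell
# Simple completion request against the llama.cpp HTTP server
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?", "n_predict": 64}'
```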
-
You started a server on port 8080 there; it's llama.cpp under the hood.
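A quick way to check that it's up (assuming the default port; llama.cpp's server exposes a /health endpoint):

```shell
# Should return a small JSON status once the model has loaded
curl http://localhost:8080/health
```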
-
Ah, that makes sense. I guess docs are on the way showing how to use it / a simple hello-world curl request.
-
It's here: https://github.com/containers/ramalama?tab=readme-ov-file#running-models. We should probably move that closer to the top of the README.md, before "Listing Models". Feel free to do that, I'll merge it.
-
We should also add this type of information to man ramalama-serve. Eventually we want to look into adding ramalama to ai-lab-recipes, and hopefully the AI models served by ramalama serve can be used for all of the different recipes.
-
ramalama serve is this server: https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md. It's calling llama-server under the hood.
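So the endpoints documented there should work against it, e.g. the OpenAI-compatible chat endpoint (a sketch, assuming the default port 8080):

```shell
# Chat completion against the running llama-server instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Say hello in one sentence."}
        ]
      }'
```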
-
After running serve:
I'm confused about what to do next, to be honest...
What are the next steps / curl requests / web view to try it out?
Also, I'm a bit confused as to whether this is a container being run in the background or whether it's running natively. I had thought that serve does it via containers only on the host system? It ran, but there's nothing in podman ps.
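Roughly what I did (the model name here is just a placeholder):

```shell
ramalama serve granite   # model name is a placeholder for whatever was pulled
podman ps                # expected to see a container here, but the list is empty
```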