llama stack and vllm on ocp #24
Conversation
Signed-off-by: Ryan Cook <[email protected]>
Signed-off-by: Ryan Cook <[email protected]>
@cooktheryan this looks awesome 🎉 added a few comments
```
llamastack-deployment-llama-serve.apps.ocp-beta-test.nerc.mghpcc.org
```
Could you also mention the port to be used? And can we also add a section at the end like so:
Testing the Llamastack Server
In order to test the Llamastack server, you can try some of the examples mentioned here by setting the following env vars:
INFERENCE_MODEL="meta-llama/Llama-3.1-8B-Instruct"
LLAMA_STACK_PORT= <mention the port number>
When connecting to the server using `LlamaStackClient`, make sure to update the `base_url` with the URL of the Llamastack server.
@hemajv I may have you do a follow-up PR on how to use these endpoints based on your testing.
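For reference, a minimal sketch of what such a test might look like, assuming the `llama-stack-client` Python package is installed. The host/port defaults, the `LLAMA_STACK_HOST` variable name, and the exact client calls are illustrative placeholders and may differ by llama-stack version; the port number itself is the one to be documented in the README.

```python
import os

from llama_stack_client import LlamaStackClient

# Placeholder values -- substitute the model, host, and port documented in the README.
INFERENCE_MODEL = os.environ.get("INFERENCE_MODEL", "meta-llama/Llama-3.1-8B-Instruct")
LLAMA_STACK_HOST = os.environ.get(
    "LLAMA_STACK_HOST",  # hypothetical env var, shown here for convenience
    "llamastack-deployment-llama-serve.apps.ocp-beta-test.nerc.mghpcc.org",
)
LLAMA_STACK_PORT = os.environ.get("LLAMA_STACK_PORT", "8321")  # placeholder default

# Point base_url at the exposed Llamastack server.
client = LlamaStackClient(base_url=f"http://{LLAMA_STACK_HOST}:{LLAMA_STACK_PORT}")

# List the models registered with the server.
for model in client.models.list():
    print(model.identifier)

# Run a simple chat completion against the served model.
response = client.inference.chat_completion(
    model_id=INFERENCE_MODEL,
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.completion_message.content)
```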
Signed-off-by: Ryan Cook <[email protected]>
Signed-off-by: Ryan Cook <[email protected]>
@cooktheryan thanks for the changes!
/lgtm 🚢
YAML and README.md to run llama stack and vLLM with meta-llama/Llama-3.1-8B-Instruct