Replies: 4 comments 3 replies
-
Just to check, have you been able to run AutoGen examples successfully as written, without specifying any ports?
-
And have you been able to run each of these 4 local models successfully by itself without specifying ports?
-
Have you seen any AutoGen example that uses multiple ports like you are trying to do?
-
Let's say I have tmux shell #1 and tmux shell #2.
In #1 I set
export CUDA_VISIBLE_DEVICES=0,1
and in #2
export CUDA_VISIBLE_DEVICES=2,3
I run in #1
litellm --model ollama/notus-7b-v1.Q6_k:latest
and in #2
litellm --model ollama/orca-2-13b.Q6_K:latest
In #1 litellm comes back with an
http://0.0.0.0:8000 interface (allegedly OpenAI API compatible)
and in #2 with
http://0.0.0.0:##### (some other port)
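As a sanity check, each endpoint can be queried directly with any OpenAI-compatible client. A minimal sketch (port 8001 stands in for whatever port litellm actually printed in #2, and I'm assuming the proxy accepts the same model string that was passed on the CLI):

```python
# Sanity-check sketch: query each litellm endpoint directly with the
# OpenAI-compatible client (pip install openai). Port 8001 is a stand-in
# for whatever port litellm actually chose in shell #2.
from openai import OpenAI

for port, model in [
    (8000, "ollama/notus-7b-v1.Q6_k:latest"),
    (8001, "ollama/orca-2-13b.Q6_K:latest"),
]:
    client = OpenAI(base_url=f"http://localhost:{port}", api_key="NULL")
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Which model is this?"}],
    )
    print(port, reply.choices[0].message.content)
```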
The question is: how exactly is a particular agent like "coder" or "manager" or "critic" associated with a specific model?
There is this config file:
[
{
"model": "orca-2-13b.Q6_K:latest",
"base_url": "http://localhost:8000",
"api_key": "NULL"
},
{
"model": "notus-7b-v1.Q6_k:latest",
"base_url": "http://localhost:8000",
"api_key": "NULL"
},
{
"model": "llama2",
"base_url": "http://localhost:8000",
"api_key": "NULL"
},
{
"model": "mistral",
"base_url": "http://172.17.0.2:8000",
"api_key": "NULL"
}
]
I changed the ports to something else on some of the entries, but AutoGen seems to ignore these ports and just default to 8000.
And even if AutoGen respected the port assignment to different models, it is STILL not clear how different agents are associated with specific ports.
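For reference, here is a minimal sketch of what I would expect to work, assuming each agent takes its own llm_config and that config_list_from_json's filter_dict selects entries by model name ("OAI_CONFIG_LIST" is just a stand-in for wherever the JSON above is saved):

```python
# Sketch of the per-agent wiring I would expect, assuming each agent honors
# its own llm_config and that filter_dict selects config entries by model
# name. "OAI_CONFIG_LIST" is a placeholder for wherever the JSON above lives.
import autogen

# One filtered config list per agent, each resolving to one model entry.
coder_config = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["orca-2-13b.Q6_K:latest"]},
)
critic_config = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["notus-7b-v1.Q6_k:latest"]},
)

coder = autogen.AssistantAgent(name="coder", llm_config={"config_list": coder_config})
critic = autogen.AssistantAgent(name="critic", llm_config={"config_list": critic_config})
```

If that is the intended mechanism, each agent's port would simply be the base_url of its filtered entry, and the JSON above would need a distinct port per model.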
Thanks.