[Documentation] Using non-OpenAI models (#2076)
* Addition of Non-OpenAI LLM section and main doc page
* Continued writing...
* Continued writing - cloud-based proxy servers
* Folder renamed
* Further writing
* together.ai example added
* Local proxy server added, diagram added, tidy up
* Added vLLM to local proxy servers documentation
* As per @ekzhu's feedback, individual pages and tidy up
* Added reference to LM Studio and renamed file
* Fixed incorrect huggingface.co link
* Run pre-commit checks, added LM Studio redirect

---------

Co-authored-by: Eric Zhu <[email protected]>
Showing 7 changed files with 736 additions and 0 deletions.
website/docs/topics/non-openai-models/about-using-nonopenai-models.md (75 additions, 0 deletions)
# Non-OpenAI Models

AutoGen allows you to use non-OpenAI models through proxy servers that provide
an OpenAI-compatible API or a [custom model client](https://microsoft.github.io/autogen/blog/2024/01/26/Custom-Models)
class.

Benefits of this flexibility include access to hundreds of models, the ability to assign
specialized models to agents (e.g., fine-tuned coding models), the ability to run AutoGen
entirely within your environment, the option to use both OpenAI and non-OpenAI models in
one system, and lower inference costs.

## OpenAI-compatible API proxy server
Any proxy server that provides an API compatible with [OpenAI's API](https://platform.openai.com/docs/api-reference)
will work with AutoGen.

These proxy servers can be cloud-based or run locally within your environment.



### Cloud-based proxy servers
By using cloud-based proxy servers, you can use models without needing the hardware
and software to run them.

These providers can host open-source/open-weight models, like [Hugging Face](https://huggingface.co/),
or their own closed models.

When cloud-based proxy servers provide an OpenAI-compatible API, using them in AutoGen
is straightforward. [LLM Configuration](/docs/topics/llm_configuration) is done in the
same way as when using OpenAI's models; the primary difference is typically
authentication, which is usually handled through an API key.
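
For illustration, a minimal sketch of such a configuration might look like the following.
The model name, endpoint URL, and environment variable below are placeholders; substitute
whatever your chosen provider documents.

```python
import os

import autogen

# Hypothetical OpenAI-compatible cloud provider; substitute the model name,
# base_url, and API key environment variable documented by your provider.
config_list = [
    {
        "model": "provider/some-model-name",
        "api_key": os.environ["PROVIDER_API_KEY"],
        "base_url": "https://api.example-provider.com/v1",
    }
]

assistant = autogen.AssistantAgent(name="assistant", llm_config={"config_list": config_list})
```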

Examples of using cloud-based proxy server providers that have an OpenAI-compatible API
are provided below:

- [together.ai example](cloud-togetherai)


### Locally run proxy servers
An increasing number of LLM proxy servers are available for use locally. These can be
open-source (e.g., LiteLLM, Ollama, vLLM) or closed-source (e.g., LM Studio), and are
typically used for running the full stack within your environment.

Similar to cloud-based proxy servers, as long as these proxy servers provide an
OpenAI-compatible API, using them in AutoGen is straightforward.
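
As a rough sketch, the configuration simply points `base_url` at the local server. The
port and model name below are placeholders; adjust them to whatever your local proxy
server exposes, and note that many local servers accept any non-empty API key.

```python
import autogen

# Hypothetical local OpenAI-compatible endpoint; adjust the port and model name
# to match your local proxy server.
local_config_list = [
    {
        "model": "local-model-name",
        "api_key": "NotRequired",  # placeholder; many local servers ignore the key
        "base_url": "http://localhost:8000/v1",
    }
]

assistant = autogen.AssistantAgent(name="assistant", llm_config={"config_list": local_config_list})
```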

Examples of using locally run proxy servers that have an OpenAI-compatible API are
provided below:

- [LiteLLM with Ollama example](local-litellm-ollama)
- [LM Studio](local-lm-studio)
- [vLLM example](local-vllm)

````mdx-code-block
:::tip
If you plan to use Function Calling, be aware that not all cloud-based and local proxy
servers support Function Calling through their OpenAI-compatible API, so check their documentation.
:::
````

### Configuration for Non-OpenAI models

Whether you choose a cloud-based or locally run proxy server, the configuration is done in
the same way as when using OpenAI's models; see [LLM Configuration](/docs/topics/llm_configuration)
for further information.

You can use [model configuration filtering](/docs/topics/llm_configuration#config-list-filtering)
to assign specific models to agents.
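
For example, a minimal sketch of filtering a mixed configuration list by model name is shown
below. The model names are placeholders, and the sketch assumes the `filter_config` helper
described in the LLM Configuration topic.

```python
import autogen

# A mixed list of OpenAI and non-OpenAI (proxy server) model configurations.
config_list = [
    {"model": "gpt-4", "api_key": "..."},
    {
        "model": "mistralai/Mistral-7B-Instruct-v0.1",
        "api_key": "...",
        "base_url": "https://api.together.xyz/v1",
    },
]

# Keep only the configurations matching the chosen model, then assign them to an agent.
coder_config_list = autogen.filter_config(config_list, {"model": ["mistralai/Mistral-7B-Instruct-v0.1"]})

coder = autogen.AssistantAgent(name="coder", llm_config={"config_list": coder_config_list})
```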


## Custom Model Client class
More advanced users can create their own custom model client class, enabling them to
define and load their own models.

See the [AutoGen with Custom Models: Empowering Users to Use Their Own Inference Mechanism](/blog/2024/01/26/Custom-Models)
blog post and [this notebook](/docs/notebooks/agentchat_custom_model/) for a guide to creating custom model client classes.
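
As a rough illustration only (the exact protocol is defined in the blog post and notebook
above; the class name, model name, and placeholder response here are hypothetical), a custom
model client follows roughly this shape and is registered with the agent after construction:

```python
from types import SimpleNamespace

import autogen


class CustomModelClient:
    """Sketch of a custom model client; see the blog post and notebook for the full protocol."""

    def __init__(self, config, **kwargs):
        self.model = config["model"]
        # Load or connect to your own model/inference mechanism here.

    def create(self, params):
        # Call your own inference mechanism and wrap the result in an
        # OpenAI-style response object exposing a `choices` list.
        response = SimpleNamespace()
        response.choices = [SimpleNamespace(message=SimpleNamespace(content="Hello from my model"))]
        response.model = self.model
        return response

    def message_retrieval(self, response):
        # Return the generated message contents for the agent to use.
        return [choice.message.content for choice in response.choices]

    def cost(self, response):
        return 0  # local inference, so no per-token cost is tracked here

    @staticmethod
    def get_usage(response):
        # Token/cost usage reporting can be returned here; empty means "not tracked".
        return {}


# The configuration names the client class, and the agent registers it after construction.
config_list = [{"model": "my-local-model", "model_client_cls": "CustomModelClient"}]
assistant = autogen.AssistantAgent(name="assistant", llm_config={"config_list": config_list})
assistant.register_model_client(model_client_cls=CustomModelClient)
```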
website/docs/topics/non-openai-models/cloud-togetherai.md (170 additions, 0 deletions)
# Together AI
This cloud-based proxy server example, using [together.ai](https://www.together.ai/), is a group chat between a Python developer
and a code reviewer, who are given a coding task.

Start by [installing AutoGen](/docs/installation/) and getting your [together.ai API key](https://api.together.xyz/settings/profile).

Put your together.ai API key in an environment variable, `TOGETHER_API_KEY`.

Linux / macOS:

```bash
export TOGETHER_API_KEY=YourTogetherAIKeyHere
```

Windows (command prompt):

```cmd
set TOGETHER_API_KEY=YourTogetherAIKeyHere
```

Create your LLM configuration with the [model you want](https://docs.together.ai/docs/inference-models).

```python
import os

import autogen

llm_config = {
    "config_list": [
        {
            # Available together.ai model strings:
            # https://docs.together.ai/docs/inference-models
            "model": "mistralai/Mistral-7B-Instruct-v0.1",
            "api_key": os.environ["TOGETHER_API_KEY"],
            "base_url": "https://api.together.xyz/v1",
        }
    ],
    "cache_seed": 42,
}
```

## Construct Agents

```python
# User Proxy will execute code and finish the chat upon typing 'exit'
user_proxy = autogen.UserProxyAgent(
    name="UserProxy",
    system_message="A human admin",
    code_execution_config={
        "last_n_messages": 2,
        "work_dir": "groupchat",
        "use_docker": False,
    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
    human_input_mode="TERMINATE",
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),  # guard against messages without content
)

# Python Coder agent
coder = autogen.AssistantAgent(
    name="softwareCoder",
    description="Software Coder, writes Python code as required and reiterates with feedback from the Code Reviewer.",
    system_message="You are a senior Python developer, a specialist in writing succinct Python functions.",
    llm_config=llm_config,
)

# Code Reviewer agent
reviewer = autogen.AssistantAgent(
    name="codeReviewer",
    description="Code Reviewer, reviews written code for correctness, efficiency, and security. Asks the Software Coder to address issues.",
    system_message="You are a Code Reviewer, experienced in checking code for correctness, efficiency, and security. Review and provide feedback to the Software Coder until you are satisfied, then return the word TERMINATE",
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),  # guard against messages without content
    llm_config=llm_config,
)
```

## Establish the group chat

```python
# Establish the Group Chat and disallow a speaker being selected consecutively
groupchat = autogen.GroupChat(
    agents=[user_proxy, coder, reviewer],
    messages=[],
    max_round=12,
    allow_repeat_speaker=False,
)

# Manages the group of multiple agents
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
```

## Start Chat

```python
# Start the chat with a request to write a function
user_proxy.initiate_chat(
    manager,
    message="Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for.",
)
# type exit to terminate the chat
```

Output:
````text
UserProxy (to chat_manager):
Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for.
--------------------------------------------------------------------------------
softwareCoder (to chat_manager):
Sure, here is a simple Python function that uses recursion to calculate the Fibonacci number:
```python
def fibonacci(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
```
This function takes an integer `n` as input and returns the `n`th number in the Fibonacci sequence. The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1.
Note that this implementation uses recursion and may not be efficient for large values of `n`. In such cases, an iterative approach or memoization would be more appropriate.
--------------------------------------------------------------------------------
codeReviewer (to chat_manager):
I see a couple of issues with the current implementation of the `fibonacci` function:
1. The function does not handle negative inputs correctly. Currently, it returns a string message for any input less than or equal to 0. It would be better to raise a `ValueError` or return a more informative message.
2. The function uses recursion to calculate the Fibonacci number, which can be inefficient for large inputs. A more efficient approach would be to use an iterative approach or memoization.
Here's an updated implementation that addresses these issues:
```python
def fibonacci(n):
    if n <= 0:
        raise ValueError("Input should be a positive integer.")
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        a, b = 0, 1
        for _ in range(n - 2):
            a, b = b, a + b
        return b
```
This implementation uses a loop to calculate the Fibonacci number iteratively, which is more efficient than the recursive approach. It also raises a `ValueError` for negative inputs, which is a more appropriate way to handle invalid inputs.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
UserProxy (to chat_manager):
exitcode: 0 (execution succeeded)
Code output:
--------------------------------------------------------------------------------
codeReviewer (to chat_manager):
I'm glad the updated implementation addresses the issues with the original code. Let me know if you have any further questions or if there's anything else I can help you with.
To terminate the conversation, please type "TERMINATE".
--------------------------------------------------------------------------------
Please give feedback to chat_manager. Press enter or type 'exit' to stop the conversation: exit
````