Add ChatCompletionCache along with AbstractStore for caching completions #4924

Merged · 6 commits · Jan 16, 2025
3 changes: 3 additions & 0 deletions python/packages/autogen-core/docs/src/reference/index.md
Original file line number Diff line number Diff line change
@@ -48,6 +48,7 @@ python/autogen_ext.agents.video_surfer
python/autogen_ext.agents.video_surfer.tools
python/autogen_ext.auth.azure
python/autogen_ext.teams.magentic_one
python/autogen_ext.models.cache
python/autogen_ext.models.openai
python/autogen_ext.models.replay
python/autogen_ext.tools.langchain
@@ -56,5 +57,7 @@ python/autogen_ext.tools.code_execution
python/autogen_ext.code_executors.local
python/autogen_ext.code_executors.docker
python/autogen_ext.code_executors.azure
python/autogen_ext.cache_store.diskcache
python/autogen_ext.cache_store.redis
python/autogen_ext.runtimes.grpc
```
@@ -0,0 +1,8 @@
autogen\_ext.cache_store.diskcache
==================================


.. automodule:: autogen_ext.cache_store.diskcache
:members:
:undoc-members:
:show-inheritance:
@@ -0,0 +1,8 @@
autogen\_ext.cache_store.redis
==============================


.. automodule:: autogen_ext.cache_store.redis
:members:
:undoc-members:
:show-inheritance:
@@ -0,0 +1,8 @@
autogen\_ext.models.cache
=========================


.. automodule:: autogen_ext.models.cache
:members:
:undoc-members:
:show-inheritance:
@@ -1,8 +1,8 @@
autogen\_ext.models.replay
==========================
.. automodule:: autogen_ext.models.replay
:members:
:undoc-members:
:show-inheritance:
autogen\_ext.models.replay
==========================


.. automodule:: autogen_ext.models.replay
:members:
:undoc-members:
:show-inheritance:
@@ -6,7 +6,11 @@
"source": [
"# Models\n",
"\n",
"In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/framework/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. "
"In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/framework/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
"\n",
"```{note}\n",
"See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
"```"
]
},
{
@@ -96,7 +96,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Default [Model Capabilities](../faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) may be overridden should the need arise.\n",
"Default [Model Capabilities](../faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) may be overridden should the need arise.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"### Streaming Response\n",
@@ -315,6 +321,84 @@
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Caching Wrapper\n",
"\n",
"`autogen_ext` implements {py:class}`~autogen_ext.models.cache.ChatCompletionCache` that can wrap any {py:class}`~autogen_core.models.ChatCompletionClient`. Using this wrapper avoids incurring token usage when querying the underlying client with the same prompt multiple times.\n",
"\n",
    "{py:class}`~autogen_ext.models.cache.ChatCompletionCache` uses a {py:class}`~autogen_core.CacheStore` protocol. We have implemented some useful variants of {py:class}`~autogen_core.CacheStore` including {py:class}`~autogen_ext.cache_store.diskcache.DiskCacheStore` and {py:class}`~autogen_ext.cache_store.redis.RedisStore`.\n",
"\n",
"Here's an example of using `diskcache` for local caching:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# pip install -U \"autogen-ext[openai, diskcache]\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n"
]
}
],
"source": [
"import asyncio\n",
"import tempfile\n",
"\n",
"from autogen_core.models import UserMessage\n",
"from autogen_ext.cache_store.diskcache import DiskCacheStore\n",
"from autogen_ext.models.cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache\n",
"from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
"from diskcache import Cache\n",
"\n",
"\n",
"async def main() -> None:\n",
" with tempfile.TemporaryDirectory() as tmpdirname:\n",
" # Initialize the original client\n",
" openai_model_client = OpenAIChatCompletionClient(model=\"gpt-4o\")\n",
"\n",
" # Then initialize the CacheStore, in this case with diskcache.Cache.\n",
" # You can also use redis like:\n",
" # from autogen_ext.cache_store.redis import RedisStore\n",
" # import redis\n",
" # redis_instance = redis.Redis()\n",
    " # cache_store = RedisStore[CHAT_CACHE_VALUE_TYPE](redis_instance)\n",
" cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdirname))\n",
" cache_client = ChatCompletionCache(openai_model_client, cache_store)\n",
"\n",
" response = await cache_client.create([UserMessage(content=\"Hello, how are you?\", source=\"user\")])\n",
" print(response) # Should print response from OpenAI\n",
" response = await cache_client.create([UserMessage(content=\"Hello, how are you?\", source=\"user\")])\n",
" print(response) # Should print cached response\n",
"\n",
"\n",
"asyncio.run(main())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "Inspecting `cache_client.total_usage()` (or `openai_model_client.total_usage()`) before and after a cached response should yield identical counts.\n",
    "\n",
    "Note that the caching is sensitive to the exact arguments provided to `cache_client.create` or `cache_client.create_stream`, so changing the `tools` or `json_output` arguments might lead to a cache miss."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -615,7 +699,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
"version": "3.12.1"
}
},
"nbformat": 4,
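The cache-miss note in the notebook above can be illustrated with a small sketch. This is a hypothetical key scheme, not `ChatCompletionCache`'s actual implementation: it only shows why a key derived from all `create` arguments makes `tools` or `json_output` changes produce a different key.

```python
# Illustrative sketch (an assumed key scheme, not the library's actual one):
# derive a cache key from every argument to create(), so any change in
# tools or json_output yields a different key and therefore a cache miss.
import hashlib
import json


def cache_key(messages, tools=(), json_output=None):
    # Serialize all arguments deterministically, then hash.
    payload = json.dumps(
        {"messages": messages, "tools": list(tools), "json_output": json_output},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


k1 = cache_key([{"role": "user", "content": "hi"}])
k2 = cache_key([{"role": "user", "content": "hi"}], json_output=True)
print(k1 == k2)  # False: different arguments give a different key
```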
2 changes: 2 additions & 0 deletions python/packages/autogen-core/pyproject.toml
@@ -72,6 +72,8 @@ dev = [
"autogen_ext==0.4.3",

# Documentation tooling
"diskcache",
"redis",
"sphinx-autobuild",
]

3 changes: 3 additions & 0 deletions python/packages/autogen-core/src/autogen_core/__init__.py
@@ -10,6 +10,7 @@
from ._agent_runtime import AgentRuntime
from ._agent_type import AgentType
from ._base_agent import BaseAgent
from ._cache_store import CacheStore, InMemoryStore
from ._cancellation_token import CancellationToken
from ._closure_agent import ClosureAgent, ClosureContext
from ._component_config import (
@@ -85,6 +86,8 @@
"AgentMetadata",
"AgentRuntime",
"BaseAgent",
"CacheStore",
"InMemoryStore",
"CancellationToken",
"AgentInstantiationContext",
"TopicId",
46 changes: 46 additions & 0 deletions python/packages/autogen-core/src/autogen_core/_cache_store.py
@@ -0,0 +1,46 @@
from typing import Dict, Generic, Optional, Protocol, TypeVar

T = TypeVar("T")


class CacheStore(Protocol, Generic[T]):
"""
This protocol defines the basic interface for store/cache operations.

Sub-classes should handle the lifecycle of underlying storage.
"""

def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
"""
Retrieve an item from the store.

Args:
key: The key identifying the item in the store.
default (optional): The default value to return if the key is not found.
Defaults to None.

Returns:
The value associated with the key if found, else the default value.
"""
...


def set(self, key: str, value: T) -> None:
"""
Set an item in the store.

Args:
key: The key under which the item is to be stored.
value: The value to be stored in the store.
"""
...



class InMemoryStore(CacheStore[T]):
def __init__(self) -> None:
self.store: Dict[str, T] = {}

def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
return self.store.get(key, default)

def set(self, key: str, value: T) -> None:
self.store[key] = value
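The `InMemoryStore` added above is small enough to exercise directly. A minimal sketch, with the class restated verbatim so the snippet runs standalone without `autogen_core` installed:

```python
from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class InMemoryStore(Generic[T]):
    """Restatement of autogen_core.InMemoryStore for a standalone run."""

    def __init__(self) -> None:
        self.store: Dict[str, T] = {}

    def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
        return self.store.get(key, default)

    def set(self, key: str, value: T) -> None:
        self.store[key] = value


store = InMemoryStore[str]()
store.set("session", "abc123")
print(store.get("session"))              # abc123
print(store.get("missing", "fallback"))  # fallback
```

The type parameter only constrains static checking; at runtime the store is a plain dict.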
48 changes: 48 additions & 0 deletions python/packages/autogen-core/tests/test_cache_store.py
@@ -0,0 +1,48 @@
from unittest.mock import Mock

from autogen_core import CacheStore, InMemoryStore


def test_set_and_get_object_key_value() -> None:
mock_store = Mock(spec=CacheStore)
test_key = "test_key"
test_value = object()
mock_store.set(test_key, test_value)
mock_store.get.return_value = test_value
mock_store.set.assert_called_with(test_key, test_value)
assert mock_store.get(test_key) == test_value


def test_get_non_existent_key() -> None:
mock_store = Mock(spec=CacheStore)
key = "non_existent_key"
mock_store.get.return_value = None
assert mock_store.get(key) is None


def test_set_overwrite_existing_key() -> None:
mock_store = Mock(spec=CacheStore)
key = "test_key"
initial_value = "initial_value"
new_value = "new_value"
mock_store.set(key, initial_value)
mock_store.set(key, new_value)
mock_store.get.return_value = new_value
mock_store.set.assert_called_with(key, new_value)
assert mock_store.get(key) == new_value


def test_inmemory_store() -> None:
store = InMemoryStore[int]()
test_key = "test_key"
test_value = 42
store.set(test_key, test_value)
assert store.get(test_key) == test_value

new_value = 2
store.set(test_key, new_value)
assert store.get(test_key) == new_value

key = "non_existent_key"
default_value = 99
assert store.get(key, default_value) == default_value
6 changes: 6 additions & 0 deletions python/packages/autogen-ext/pyproject.toml
@@ -46,6 +46,12 @@ video-surfer = [
"ffmpeg-python",
"openai-whisper",
]
diskcache = [
"diskcache>=5.6.3"
]
redis = [
"redis>=5.2.1"
]

grpc = [
"grpcio~=1.62.0", # TODO: update this once we have a stable version.
Empty file.
@@ -0,0 +1,26 @@
from typing import Any, Optional, TypeVar, cast

import diskcache
from autogen_core import CacheStore

T = TypeVar("T")


class DiskCacheStore(CacheStore[T]):
"""
A typed CacheStore implementation that uses diskcache as the underlying storage.
See :class:`~autogen_ext.models.cache.ChatCompletionCache` for an example of usage.

Args:
cache_instance: An instance of diskcache.Cache.
The user is responsible for managing the DiskCache instance's lifetime.
"""

def __init__(self, cache_instance: diskcache.Cache): # type: ignore[no-any-unimported]
self.cache = cache_instance

def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
return cast(Optional[T], self.cache.get(key, default)) # type: ignore[reportUnknownMemberType]

def set(self, key: str, value: T) -> None:
self.cache.set(key, cast(Any, value)) # type: ignore[reportUnknownMemberType]
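The typed-wrapper pattern `DiskCacheStore` uses generalizes to any dict-like disk backend. A sketch of the same shape over the standard library's `shelve`, an assumption made only so the snippet runs without `diskcache` installed; `ShelveStore` is a hypothetical name, not part of the PR:

```python
# Same wrapper shape as DiskCacheStore, but over stdlib shelve instead of
# diskcache (an assumption: diskcache may not be available here).
import os
import shelve
import tempfile
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class ShelveStore(Generic[T]):
    """Hypothetical typed store persisting values to disk via shelve."""

    def __init__(self, path: str) -> None:
        self._db = shelve.open(path)

    def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
        return self._db.get(key, default)

    def set(self, key: str, value: T) -> None:
        self._db[key] = value
        self._db.sync()  # flush to disk immediately


with tempfile.TemporaryDirectory() as tmpdir:
    store = ShelveStore[int](os.path.join(tmpdir, "cache"))
    store.set("tokens_used", 42)
    print(store.get("tokens_used"))  # 42
```

As with `DiskCacheStore`, the caller owns the backing store's lifetime; nothing here closes the database for you.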
29 changes: 29 additions & 0 deletions python/packages/autogen-ext/src/autogen_ext/cache_store/redis.py
@@ -0,0 +1,29 @@
from typing import Any, Optional, TypeVar, cast

import redis
from autogen_core import CacheStore

T = TypeVar("T")


class RedisStore(CacheStore[T]):
"""
A typed CacheStore implementation that uses redis as the underlying storage.
See :class:`~autogen_ext.models.cache.ChatCompletionCache` for an example of usage.

Args:
cache_instance: An instance of `redis.Redis`.
The user is responsible for managing the Redis instance's lifetime.
"""

def __init__(self, redis_instance: redis.Redis):
self.cache = redis_instance

def get(self, key: str, default: Optional[T] = None) -> Optional[T]:
value = cast(Optional[T], self.cache.get(key))
if value is None:
return default
return value

def set(self, key: str, value: T) -> None:
self.cache.set(key, cast(Any, value))
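The one behavioral subtlety in `RedisStore.get` is mapping Redis's `None`-on-miss to the caller-supplied default. A sketch of that logic with a fake client standing in for `redis.Redis` (an assumption: no live server is available here; `FakeRedis` and `RedisStoreSketch` are illustrative names only):

```python
from typing import Any, Dict, Optional


class FakeRedis:
    """Stand-in for redis.Redis: get() returns None for a missing key."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class RedisStoreSketch:
    """Mirrors RedisStore's miss handling over the fake client."""

    def __init__(self, redis_instance: FakeRedis) -> None:
        self.cache = redis_instance

    def get(self, key: str, default: Optional[Any] = None) -> Optional[Any]:
        # Redis signals a miss with None; translate that into the default.
        value = self.cache.get(key)
        return default if value is None else value

    def set(self, key: str, value: Any) -> None:
        self.cache.set(key, value)


store = RedisStoreSketch(FakeRedis())
store.set("greeting", b"hello")
print(store.get("greeting"))          # b'hello'
print(store.get("missing", b"none"))  # b'none'
```

A consequence of this mapping: a value genuinely stored as `None` is indistinguishable from a miss, which is acceptable for a cache.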
@@ -0,0 +1,6 @@
from ._chat_completion_cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache

__all__ = [
"CHAT_CACHE_VALUE_TYPE",
"ChatCompletionCache",
]
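The `ChatCompletionCache` this module exports follows the classic read-through caching pattern: check the store, delegate to the wrapped client on a miss, and record the result. A minimal sketch of that pattern in plain Python, not the library's implementation (`CachedClient` and its single-string `create` are illustrative simplifications):

```python
from typing import Callable, Dict


class CachedClient:
    """Sketch of a read-through caching wrapper around a create() callable."""

    def __init__(self, create: Callable[[str], str], store: Dict[str, str]) -> None:
        self._create = create
        self._store = store
        self.calls = 0  # how often the underlying client was actually invoked

    def create(self, prompt: str) -> str:
        cached = self._store.get(prompt)
        if cached is not None:
            return cached  # cache hit: no cost incurred
        self.calls += 1
        result = self._create(prompt)
        self._store[prompt] = result  # populate the cache for next time
        return result


client = CachedClient(lambda p: p.upper(), {})
print(client.create("hello"))  # HELLO (underlying client invoked)
print(client.create("hello"))  # HELLO (served from cache)
print(client.calls)            # 1
```

Because the store is injected, swapping the dict for a disk- or Redis-backed store changes durability without touching the wrapper, which is exactly the role the `CacheStore` protocol plays for `ChatCompletionCache`.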