Agentic memory #5227

Open
wants to merge 82 commits into
base: main
Changes from 78 commits
82 commits
442a9d8
initial checkin
rickyloynd-microsoft Nov 29, 2024
f8584cd
support for extensive evaluations
rickyloynd-microsoft Dec 2, 2024
607e7ff
Enhance retrieval with task generalization and insight validation
rickyloynd-microsoft Dec 4, 2024
b045636
Support TRAPI client.
rickyloynd-microsoft Dec 9, 2024
63b28d7
Restoring earlier results, and general cleanup.
rickyloynd-microsoft Dec 24, 2024
b921d83
Merge branch 'refs/heads/main' into agentic_memory
rickyloynd-microsoft Dec 24, 2024
9dfb074
Modify imports after merge from main.
rickyloynd-microsoft Dec 24, 2024
93a5ca4
Log model and token counts.
rickyloynd-microsoft Dec 26, 2024
2cb9344
Only instantiate the client once.
rickyloynd-microsoft Dec 26, 2024
878f458
Fix bug that was duplicating insights across trials.
rickyloynd-microsoft Dec 26, 2024
21562f1
Add the Grader class.
rickyloynd-microsoft Dec 27, 2024
3a40b30
Adjustments for comparison tests.
rickyloynd-microsoft Dec 28, 2024
8622c5e
Test generalization over multiple tasks.
rickyloynd-microsoft Dec 30, 2024
20b26c1
Add teachability and a test for it.
rickyloynd-microsoft Dec 31, 2024
9d47227
Learning from demonstration, in-progress.
rickyloynd-microsoft Jan 1, 2025
52d4e00
In memory retrieval, validate insights separately rather than together.
rickyloynd-microsoft Jan 1, 2025
6b15777
Finish learning from demonstration.
rickyloynd-microsoft Jan 2, 2025
a18674c
Added RecordableChatCompletionClient as a guardrail during refactoring.
rickyloynd-microsoft Jan 3, 2025
52e213e
Ran 3 evals with session recording and replay.
rickyloynd-microsoft Jan 5, 2025
a440b0a
Add results to recorded sessions, including session length.
rickyloynd-microsoft Jan 5, 2025
cab51f1
Use yaml file for eval settings.
rickyloynd-microsoft Jan 7, 2025
d91e58c
Simplify paths and other settings.
rickyloynd-microsoft Jan 7, 2025
f1d7a2f
Renamed the memory classes.
rickyloynd-microsoft Jan 7, 2025
17d4c42
Apprentice.
rickyloynd-microsoft Jan 8, 2025
19654e8
Moved test into the evaluator, and removed eval.py's other util funct…
rickyloynd-microsoft Jan 8, 2025
7aa20c1
renaming
rickyloynd-microsoft Jan 8, 2025
83a7ddc
Rerouted calls to AgenticMemoryController through FastLearner.
rickyloynd-microsoft Jan 9, 2025
3047c1c
Replace task_assignment_callback with AgentWrapper.
rickyloynd-microsoft Jan 9, 2025
1f20b79
Segregate files into subfolders, eval framework vs. implementation, etc.
rickyloynd-microsoft Jan 10, 2025
de4c12b
Rename FastLearner subclass to Apprentice, and import it only as spec…
rickyloynd-microsoft Jan 10, 2025
a9d6108
Refactoring, preparatory to removing eval_framework from the branch a…
rickyloynd-microsoft Jan 11, 2025
d67e2cc
Remove the outdated final_format_instructions parameter.
rickyloynd-microsoft Jan 11, 2025
6470fd8
Move tasks into yaml files.
rickyloynd-microsoft Jan 12, 2025
b025199
Move client support to a subdir.
rickyloynd-microsoft Jan 12, 2025
4f9267c
Move evaluations to a separate dir.
rickyloynd-microsoft Jan 12, 2025
db34844
single line
rickyloynd-microsoft Jan 14, 2025
c780852
Add baseline evaluation for the no-memory case.
rickyloynd-microsoft Jan 16, 2025
fa688f7
Merge branch 'refs/heads/main' into agentic_memory
rickyloynd-microsoft Jan 17, 2025
43bda2f
Support o1 models
rickyloynd-microsoft Jan 18, 2025
be081b3
simplification of client creation code
rickyloynd-microsoft Jan 18, 2025
29d1494
simplify folder structure
rickyloynd-microsoft Jan 18, 2025
8e9a550
Move task data strings out of the eval functions.
rickyloynd-microsoft Jan 20, 2025
b3fe084
simplify page_log
rickyloynd-microsoft Jan 21, 2025
077615f
simplify page_log
rickyloynd-microsoft Jan 21, 2025
8847168
simplify page_log
rickyloynd-microsoft Jan 21, 2025
4091ab3
conventional logging terminology
rickyloynd-microsoft Jan 22, 2025
3865cff
control logger enabling
rickyloynd-microsoft Jan 22, 2025
6c73674
add logging to string map
rickyloynd-microsoft Jan 22, 2025
db5e07b
simplify logging
rickyloynd-microsoft Jan 22, 2025
07cb3f0
simplify logging
rickyloynd-microsoft Jan 22, 2025
e88bd69
Merge branch 'refs/heads/main' into agentic_memory
rickyloynd-microsoft Jan 22, 2025
9b3f77d
merge from main
rickyloynd-microsoft Jan 23, 2025
a0dee67
Changes made by poe check.
rickyloynd-microsoft Jan 23, 2025
7e359e9
docstrings etc.
rickyloynd-microsoft Jan 23, 2025
9466ea8
docstrings etc.
rickyloynd-microsoft Jan 24, 2025
4ec9bff
docstrings etc.
rickyloynd-microsoft Jan 24, 2025
76c16f9
docstrings etc.
rickyloynd-microsoft Jan 24, 2025
a8cd0d7
docstrings etc.
rickyloynd-microsoft Jan 24, 2025
ed7fae1
docstrings etc.
rickyloynd-microsoft Jan 25, 2025
93de858
docstrings etc.
rickyloynd-microsoft Jan 25, 2025
1a309f9
docstrings etc.
rickyloynd-microsoft Jan 25, 2025
8993aa1
docstrings etc.
rickyloynd-microsoft Jan 25, 2025
fa60d5a
Simplify naming
rickyloynd-microsoft Jan 25, 2025
882d578
Simplify tests
rickyloynd-microsoft Jan 26, 2025
00cbb8c
standardize logging levels
rickyloynd-microsoft Jan 27, 2025
88294d2
Remove Evaluator class
rickyloynd-microsoft Jan 27, 2025
7d0ed63
sample code
rickyloynd-microsoft Jan 27, 2025
5b3876f
readme
rickyloynd-microsoft Jan 28, 2025
21220d4
readme fixes
rickyloynd-microsoft Jan 28, 2025
232ed0f
samples readme
rickyloynd-microsoft Jan 28, 2025
87ee27b
readme files
rickyloynd-microsoft Jan 28, 2025
b21d140
readme files
rickyloynd-microsoft Jan 28, 2025
1e88eb6
remove ame
rickyloynd-microsoft Jan 28, 2025
a3addc1
readme
rickyloynd-microsoft Jan 28, 2025
c6ffa43
comment out api_key lines
rickyloynd-microsoft Jan 28, 2025
8f66612
Optional disabling of prefix caching (to decorrelate repeated runs)
rickyloynd-microsoft Jan 28, 2025
491964f
Merge branch 'refs/heads/main' into agentic_memory
rickyloynd-microsoft Jan 28, 2025
2ed08ae
Remove unnecessary instantiation of Grader
rickyloynd-microsoft Jan 29, 2025
f879487
Updated image using git-lfs
rickyloynd-microsoft Jan 30, 2025
60f8ad3
Merge branch 'agentic_memory' of github.com:microsoft/autogen into ag…
rickyloynd-microsoft Jan 30, 2025
ed0a4a6
Merge branch 'refs/heads/main' into agentic_memory
rickyloynd-microsoft Jan 30, 2025
f0eceef
installation fixes
rickyloynd-microsoft Jan 30, 2025
2 changes: 1 addition & 1 deletion python/.gitignore
@@ -157,7 +157,7 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+.idea/

.ruff_cache/

1 change: 1 addition & 0 deletions python/packages/autogen-ext/pyproject.toml
@@ -65,6 +65,7 @@ jupyter-executor = [
"ipykernel>=6.29.5",
"nbclient>=0.10.2",
]
+agentic-memory = ["chromadb"]

semantic-kernel-core = [
"semantic-kernel>=1.17.1",
@@ -0,0 +1,85 @@
# Agentic Memory

This AutoGen extension provides an implementation of agentic memory, which we define as a
broad ability for AI agents to accomplish tasks more effectively by learning quickly and continually (over the long term).
This is distinct from what RAG or long context windows can provide.
While still under active research and development, this implementation of agentic memory
can be attached to virtually any unmodified AI agent, and is designed to enable agents that:

* Remember guidance, corrections, and demonstrations provided by users.
* Succeed more frequently on tasks after finding successful solutions to similar tasks.
* Learn and adapt quickly to changing circumstances to enable workflows that are dynamic and self-healing.

The implementation is also intended to:

* Be general purpose, unconstrained by types and schemas required by standard databases.
* Augment rather than interfere with an agent’s special capabilities, such as powerful reasoning, long-horizon autonomy, and tool handling.
* Operate in both foreground and background modes, so that an agent can discuss tasks with a user (in the foreground)
then work productively on those tasks (in the background) while the user does other things.
* Allow for fine-grained transparency and auditing of individual memories by human users or other agents.
* Allow agents to be personalized (to a single user) as well as specialized (to a subject, domain or project).
The benefits of personalization scale linearly with the number of users, but the benefits of domain specialization
can scale quadratically with the number of users working in that domain, as insights gained from interactions with one user
can benefit other users in similar situations.
* Support multiple memory banks dynamically attached to an agent at runtime.
* Enable enforcement of security boundaries at the level of individual memory banks.
* Allow users to download and port memory banks between agents and systems.

![agentic_memory.png](../../../imgs/agentic_memory.png)

The block diagram above outlines the key components of our baseline agentic memory architecture,
which augments a base agent with the agentic memory mechanisms.

The **Agentic Memory Controller** implements the fast-learning methods described below,
and manages communication with an **Agentic Memory Bank** containing a vector DB and associated structures.

The **Apprentice** is a thin wrapper around the combination of agentic memory with some base agent.
Some applications will use the Apprentice class, and others will instantiate and use the Agentic Memory Controller directly.

The **Base Agent** is any agent or team orchestrator designed to perform tasks passed to it,
perhaps by interacting with an **Environment** such as a web browser.
We’ve successfully connected and tested several different base agents: a simple LLM client,
the Magentic-One orchestrator, and the GitHub Copilot Chat agent.

The **AgentWrapper** contains the code that instantiates and connects to the selected base agent.

## Memory Creation and Storage

Each stored memory is an insight (in text form) crafted to help the agent accomplish future tasks that are similar
to some task encountered in the past. If the user provides advice for solving a given task,
the advice is extracted and stored as an insight. If the user demonstrates how to perform a task,
the task and demonstration are stored together as an insight that could be applied to similar but different tasks.
If the agent is given a task (free of side-effects) and some means of determining success or failure,
the memory controller repeats the following learning loop in the background some number of times:

1. Test the agent on the task a few times to check for a failure.
2. If a failure is found, analyze the agent’s response in order to:
    1. Diagnose the failure of reasoning or missing information,
    2. Phrase a general piece of advice, such as what a teacher might give to a student,
    3. Temporarily append this advice to the task description,
    4. Return to step 1.
    5. If some piece of advice succeeds in helping the agent solve the task a number of times, add the advice as an insight to memory.
3. For each insight to be stored in memory, an LLM is prompted to generate a set of free-form, multi-word topics related to the insight. Each topic is embedded to a fixed-length vector and stored in a vector DB mapping it to the topic’s related insight.
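
The learning loop in steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `run_agent`, `is_correct`, and `propose_advice` are hypothetical callables standing in for the base agent, the success check, and the LLM-driven failure analysis, and the round and trial counts are arbitrary.

```python
from typing import Callable, Optional


def learn_insight(
    task: str,
    run_agent: Callable[[str], str],            # hypothetical: sends a task to the base agent
    is_correct: Callable[[str], bool],          # hypothetical: the success/failure check
    propose_advice: Callable[[str, str], str],  # hypothetical: failure analysis -> advice
    max_rounds: int = 3,
    trials_per_round: int = 3,
) -> Optional[str]:
    """Return advice that let the agent pass every trial in a round.

    Returns None if the agent needed no advice or if no advice helped.
    """
    advice: Optional[str] = None
    for _ in range(max_rounds):
        # Temporarily append the current advice (if any) to the task description.
        prompt = task if advice is None else f"{task}\n\nAdvice: {advice}"
        failure: Optional[str] = None
        # Step 1: test the agent a few times, looking for a failure.
        for _ in range(trials_per_round):
            response = run_agent(prompt)
            if not is_correct(response):
                failure = response
                break
        if failure is None:
            # Every trial passed: the current advice (possibly None) is sufficient.
            return advice
        # Step 2: diagnose the failure, phrase new advice, and go back to step 1.
        advice = propose_advice(task, failure)
    return None
```

A caller would store any non-`None` result as an insight, together with the topics generated for it in step 3.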

## Memory Retrieval and Usage

When the agent is given a task, the following steps are performed by the memory controller:
1. The task is rephrased into a generalized form.
2. A set of free-form, multi-word query topics are generated from the generalized task.
3. A potentially large number of previously stored topics, those most similar to each query topic, are retrieved from the vector DB along with the insights they map to.
4. These candidate insights are filtered by the aggregate similarity of their stored topics to the query topics.
5. In the final filtering stage, an LLM is prompted to return only those insights that seem potentially useful in solving the task at hand.

Retrieved insights that pass the filtering steps are listed under a heading like
“Important insights that may help solve tasks like this”, then appended to the task description before it is passed to the agent as usual.
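
The retrieval steps above can be sketched as a small pipeline. This is an illustration under assumptions rather than the controller's actual code: `generalize`, `generate_topics`, and `is_useful` are hypothetical stand-ins for LLM calls, `score_insights` plays the role of the memory bank lookup (steps 3 and 4), and the layout of the appended text is a guess.

```python
from typing import Callable, Dict, List


def augment_task(
    task: str,
    generalize: Callable[[str], str],                         # hypothetical LLM call (step 1)
    generate_topics: Callable[[str], List[str]],              # hypothetical LLM call (step 2)
    score_insights: Callable[[List[str]], Dict[str, float]],  # steps 3-4: insight -> relevance
    is_useful: Callable[[str, str], bool],                    # hypothetical LLM check (step 5)
) -> str:
    """Return the task description with any relevant insights appended."""
    general_task = generalize(task)
    query_topics = generate_topics(general_task)
    candidates = score_insights(query_topics)
    # Final filtering stage: keep only insights judged useful for this task.
    kept = [insight for insight in candidates if is_useful(insight, task)]
    if not kept:
        return task
    heading = "Important insights that may help solve tasks like this"
    bullets = "\n".join(f"- {insight}" for insight in kept)
    return f"{task}\n\n{heading}:\n{bullets}"
```

Because the insights are only appended to the task text, the base agent itself needs no modification.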

## Setup and Usage

After installing AutoGen-Core, install its extension package from the `autogen/python/packages/autogen-ext` directory as follows:

`pip install -e ".[agentic-memory]"`

We provide [sample code](../../../../../samples/agentic_memory) to illustrate the following forms of memory-based fast learning:
* Agent learning from user advice and corrections
* Agent learning from user demonstrations
* Agent learning from its own experience
@@ -0,0 +1,7 @@
from .grader import Grader
from .page_logger import PageLogger
from .apprentice import Apprentice
from .agent_wrapper import AgentWrapper
from .agentic_memory_controller import AgenticMemoryController

__all__ = ["Apprentice", "PageLogger", "Grader", "AgentWrapper", "AgenticMemoryController"]
@@ -0,0 +1,154 @@
import os
import pickle
from dataclasses import dataclass
from typing import Dict, List, Optional

from ._string_similarity_map import StringSimilarityMap
from .page_logger import PageLogger


@dataclass
class Insight:
    """
    Represents a task-completion insight, which is a string that may help solve a task.
    """
    id: str
    insight_str: str
    task_str: Optional[str]  # None when the insight is stored without its originating task.
    topics: List[str]


class AgenticMemoryBank:
    """
    Stores task-completion insights in a vector DB for later retrieval.

    Args:
        - settings: Settings for the memory bank.
        - reset: True to clear the DB before starting.
        - logger: The PageLogger object to use for logging.

    Methods:
        - reset: Forces immediate deletion of all contents, in memory and on disk.
        - save_insights: Saves the current insight structures (possibly empty) to disk.
        - contains_insights: Returns True if the memory bank contains any insights.
        - add_insight: Adds an insight to the memory bank, given topics related to the insight, and optionally the task.
        - add_task_with_solution: Adds a task-solution pair to the memory bank, to be retrieved together later.
        - get_relevant_insights: Returns any insights from the memory bank that appear sufficiently relevant to the given
            task topics.
    """
    def __init__(self, settings: Dict, reset: bool, logger: PageLogger) -> None:
Collaborator


Rather than using dictionary settings, we should either flatten the settings in the constructor, or use a config class that is a Pydantic basemodel for validation and serializable configs. See existing example in autogen_agentchat.agents.AssistantAgent. If this class (and others in this PR) implements the ComponentConfig, you can easily load the configurations from a file to create an object of the class.
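
One way to read this suggestion is sketched below. It is only an illustration: it uses a standard-library dataclass as a stand-in for the Pydantic model the comment asks for, the name `MemoryBankConfig` is hypothetical, the field types are assumptions, and the values in the example are placeholders rather than defaults from this PR. The field names are the four keys that `__init__` reads from `settings`.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class MemoryBankConfig:  # hypothetical name, not part of this PR
    path: str
    relevance_conversion_threshold: float
    n_results: int
    distance_threshold: float

    def __post_init__(self) -> None:
        # Fail early on obviously invalid values instead of at query time.
        if self.n_results < 1:
            raise ValueError("n_results must be at least 1")

    @classmethod
    def from_dict(cls, settings: Dict[str, Any]) -> "MemoryBankConfig":
        # Missing or unexpected keys raise TypeError here.
        return cls(**settings)


# Placeholder values for illustration only.
config = MemoryBankConfig.from_dict(
    {
        "path": "~/agentic_memory_bank",
        "relevance_conversion_threshold": 1.7,
        "n_results": 25,
        "distance_threshold": 100.0,
    }
)
settings_dict = asdict(config)  # round-trips back to a plain, serializable dict
```

With a typed config, a misspelled key fails at construction time rather than as a `KeyError` deep inside `__init__`.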

        self.settings = settings
        self.logger = logger
        self.logger.enter_function()

        memory_dir_path = os.path.expanduser(self.settings["path"])
        self.relevance_conversion_threshold = self.settings["relevance_conversion_threshold"]
        self.n_results = self.settings["n_results"]
        self.distance_threshold = self.settings["distance_threshold"]

        path_to_db_dir = os.path.join(memory_dir_path, "string_map")
        self.path_to_dict = os.path.join(memory_dir_path, "uid_insight_dict.pkl")

        self.string_map = StringSimilarityMap(reset=reset, path_to_db_dir=path_to_db_dir, logger=self.logger)

        # Load or create the associated insight dict on disk.
        self.uid_insight_dict = {}
        self.last_insight_id = 0
        if (not reset) and os.path.exists(self.path_to_dict):
            self.logger.info("\nLOADING INSIGHTS FROM DISK  {}".format(self.path_to_dict))
            self.logger.info("    Location = {}".format(self.path_to_dict))
            with open(self.path_to_dict, "rb") as f:
                self.uid_insight_dict = pickle.load(f)
                self.last_insight_id = len(self.uid_insight_dict)
                self.logger.info("\n{} INSIGHTS LOADED".format(len(self.uid_insight_dict)))

        # Clear the DB if requested.
        if reset:
            self._reset_insights()

        self.logger.leave_function()

    def reset(self) -> None:
        """
        Forces immediate deletion of all contents, in memory and on disk.
        """
        self.string_map.reset_db()
        self._reset_insights()

    def _reset_insights(self) -> None:
        """
        Forces immediate deletion of the insights, in memory and on disk.
        """
        self.uid_insight_dict = {}
        self.save_insights()

    def save_insights(self) -> None:
        """
        Saves the current insight structures (possibly empty) to disk.
        """
        self.string_map.save_string_pairs()
        with open(self.path_to_dict, "wb") as file:
            pickle.dump(self.uid_insight_dict, file)

    def contains_insights(self) -> bool:
        """
        Returns True if the memory bank contains any insights.
        """
        return len(self.uid_insight_dict) > 0

    def _map_topics_to_insight(self, topics: List[str], insight_id: str, insight: Insight) -> None:
        """
        Adds a mapping in the vec DB from each topic to the insight.
        """
        self.logger.enter_function()
        self.logger.info("\nINSIGHT\n{}".format(insight.insight_str))
        for topic in topics:
            self.logger.info("\n TOPIC = {}".format(topic))
            self.string_map.add_input_output_pair(topic, insight_id)
        self.uid_insight_dict[insight_id] = insight
        self.logger.leave_function()

    def add_insight(self, insight_str: str, topics: List[str], task_str: Optional[str] = None) -> None:
        """
        Adds an insight to the memory bank, given topics related to the insight, and optionally the task.
        """
        self.last_insight_id += 1
        id_str = str(self.last_insight_id)
        insight = Insight(id=id_str, insight_str=insight_str, task_str=task_str, topics=topics)
        self._map_topics_to_insight(topics, id_str, insight)

    def add_task_with_solution(self, task: str, solution: str, topics: List[str]) -> None:
        """
        Adds a task-solution pair to the memory bank, to be retrieved together later as a combined insight.
        This is useful when the insight is a demonstration of how to solve a given type of task.
        """
        self.last_insight_id += 1
        id_str = str(self.last_insight_id)
        # Combine the task and its solution into a single insight string.
        insight_str = "Example task:\n\n{}\n\nExample solution:\n\n{}".format(task, solution)
        insight = Insight(id=id_str, insight_str=insight_str, task_str=task, topics=topics)
        self._map_topics_to_insight(topics, id_str, insight)

    def get_relevant_insights(self, task_topics: List[str]) -> Dict[str, float]:
        """
        Returns any insights from the memory bank that appear sufficiently relevant to the given task topics.
        """
        # Process the matching topics to build a dict of insight-relevance pairs.
        matches = []  # Each match is a tuple: (topic, insight, distance)
        insight_relevance_dict = {}
        for topic in task_topics:
            matches.extend(self.string_map.get_related_string_pairs(topic, self.n_results, self.distance_threshold))
        for match in matches:
            relevance = self.relevance_conversion_threshold - match[2]
            insight_id = match[1]
            insight_str = self.uid_insight_dict[insight_id].insight_str
            if insight_str in insight_relevance_dict:
                insight_relevance_dict[insight_str] += relevance
            else:
                insight_relevance_dict[insight_str] = relevance

        # Filter out insights with overall relevance below zero.
        for insight in list(insight_relevance_dict.keys()):
            if insight_relevance_dict[insight] < 0:
                del insight_relevance_dict[insight]

        return insight_relevance_dict
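
To make the scoring in `get_relevant_insights` concrete, here is a small standalone sketch of the same aggregation with made-up numbers. The function name `aggregate_relevance`, the threshold, and the distances are all illustrative assumptions; only the arithmetic (threshold minus distance, summed per insight, negatives dropped) mirrors the method above.

```python
from typing import Dict, List, Tuple


def aggregate_relevance(
    matches: List[Tuple[str, str, float]],  # (topic, insight, distance)
    threshold: float,
) -> Dict[str, float]:
    """Sum (threshold - distance) per insight and drop insights scoring below zero."""
    scores: Dict[str, float] = {}
    for _topic, insight, distance in matches:
        scores[insight] = scores.get(insight, 0.0) + (threshold - distance)
    return {insight: score for insight, score in scores.items() if score >= 0}


# Made-up matches: insight "A" is hit by two topics, insight "B" by one distant topic.
example_matches = [
    ("unit conversion", "A", 0.5),
    ("metric units", "A", 2.0),
    ("string parsing", "B", 2.5),
]
result = aggregate_relevance(example_matches, threshold=1.5)
print(result)  # {'A': 0.5}
```

Insight "A" scores (1.5 - 0.5) + (1.5 - 2.0) = 0.5 and is kept, while "B" scores 1.5 - 2.5 = -1.0 and is dropped. A distant match subtracts from an insight's total, so it can cancel out a closer match on another topic.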