A novel agentic memory system that lets LLM agents dynamically organize their own memories.
Note: This repository is specifically designed to reproduce the results presented in our paper. If you want to use the A-Mem system when building your own agents, please refer to our official implementation at: A-mem-sys
For more details, please refer to our paper: A-Mem: Agentic Memory for LLM Agents
Large Language Model (LLM) agents have demonstrated remarkable capabilities in handling complex real-world tasks through external tool usage. However, to effectively leverage historical experiences, they require sophisticated memory systems. Traditional memory systems, while providing basic storage and retrieval functionality, often lack advanced memory organization capabilities.
Our project introduces an innovative Agentic Memory system that revolutionizes how LLM agents manage and utilize their memories:
Comparison between traditional memory system (top) and our proposed agentic memory (bottom). Our system enables dynamic memory operations and flexible agent-memory interactions.
- 🔄 Dynamic memory organization based on Zettelkasten principles
- 🔍 Intelligent indexing and linking of memories
- 📝 Comprehensive note generation with structured attributes
- 🌐 Interconnected knowledge networks
- 🔄 Continuous memory evolution and refinement
- 🤖 Agent-driven decision making for adaptive memory management
The framework of our Agentic Memory system showing the dynamic interaction between LLM agents and memory components.
When a new memory is added, the system:
- Generates comprehensive notes with structured attributes
- Creates contextual descriptions and tags
- Analyzes historical memories for relevant connections
- Establishes meaningful links based on similarities
- Enables dynamic memory evolution and updates
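The note-construction and linking steps above can be sketched roughly as follows. This is a minimal illustration only, not the actual A-Mem implementation: the `MemoryNote` class, its attribute names, and the cosine-similarity linking threshold are all assumptions.

```python
import math
from dataclasses import dataclass, field

@dataclass
class MemoryNote:
    """Hypothetical note with structured attributes (not A-Mem's real schema)."""
    content: str
    tags: list[str]
    context: str
    embedding: list[float]
    links: list[int] = field(default_factory=list)  # indices of related notes

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def add_note(store: list[MemoryNote], note: MemoryNote, threshold: float = 0.8) -> MemoryNote:
    """Analyze historical notes and establish bidirectional links to similar ones."""
    for i, old in enumerate(store):
        if cosine(note.embedding, old.embedding) >= threshold:
            note.links.append(i)
            old.links.append(len(store))  # new note's index after append
    store.append(note)
    return note

# Usage: the second note links back to the first because their embeddings are similar
store: list[MemoryNote] = []
add_note(store, MemoryNote("met Alice", ["people"], "work", [1.0, 0.0]))
note = add_note(store, MemoryNote("Alice's talk", ["people"], "work", [0.9, 0.1]))
```

In the real system, the attributes and link decisions are produced by the LLM agent itself rather than a fixed similarity threshold.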
Empirical experiments conducted on six foundation models demonstrate superior performance compared to existing SOTA baselines.
- Clone the repository:

```bash
git clone https://github.com/WujiangXu/AgenticMemory.git
cd AgenticMemory
```

- Install dependencies:

Option 1: Using venv (Python virtual environment)

```bash
# Create and activate virtual environment
python -m venv a-mem
source a-mem/bin/activate  # Linux/Mac
a-mem\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt
```

Option 2: Using Conda

```bash
# Create and activate conda environment
conda create -n myenv python=3.9
conda activate myenv

# Install dependencies
pip install -r requirements.txt
```

- Run the experiments on the LoCoMo dataset:
Option A — Original evaluation (requires OpenAI JSON schema support):

```bash
python test_advanced.py
```

Option B — Robust evaluation (recommended, works with any LLM backend):

The robust evaluation (`test_advanced_robust.py`) removes the JSON schema dependency and supports the OpenAI API, vLLM, and Ollama backends.
```bash
# OpenAI models
python test_advanced_robust.py --backend openai --model gpt-4o-mini \
    --dataset data/locomo10.json --output results_robust_gpt-4o-mini.json

# vLLM-served open-source models (start the vLLM server first)
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-3B-Instruct --port 30000 \
    --dtype float16 --enforce-eager --max-model-len 8192
python test_advanced_robust.py --backend vllm --model Qwen/Qwen2.5-3B-Instruct \
    --dataset data/locomo10.json --output results_robust_qwen3b.json \
    --sglang_port 30000

# Ollama models
python test_advanced_robust.py --backend ollama --model qwen2.5:3b \
    --dataset data/locomo10.json --output results_robust_ollama_qwen3b.json
```

Key arguments:
- `--retrieve_k`: Number of memories to retrieve per query (default: 10). Tune this per model for best results.
- `--ratio`: Fraction of the dataset to evaluate (e.g., `--ratio 0.1` for a 10% quick test).
- `--backend`: One of `openai`, `vllm`, `ollama`.
- `--sglang_port`: Port of the vLLM/SGLang server (default: 30000).
- Run the full k-sweep to find the optimal retrieval k per model:

```bash
bash run_k_sweep.sh
```

Note: To achieve the optimal performance reported in our paper, please adjust the hyperparameter k accordingly. Memories are cached after the first run, so subsequent k-sweep evaluations only re-run the QA answering step.
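Conceptually, the k-sweep just evaluates QA accuracy at several retrieval depths and keeps the best one. A minimal sketch of that selection step follows; the `evaluate` stub and its accuracy numbers are purely illustrative stand-ins, not results from the paper or the actual sweep script.

```python
def evaluate(k: int) -> float:
    """Stand-in for re-running only the QA step at retrieval depth k.
    Real numbers come from `test_advanced_robust.py --retrieve_k k`;
    these values are fabricated for illustration."""
    fake_scores = {5: 0.41, 10: 0.47, 20: 0.45, 50: 0.39}
    return fake_scores[k]

# Sweep candidate depths and keep the k with the highest accuracy
accuracy_by_k = {k: evaluate(k) for k in (5, 10, 20, 50)}
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
```

Because memories are cached after the first run, each additional k in the sweep only costs one QA pass, not a full memory-construction pass.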
Categories Information: The LoCoMo dataset contains the following categories:
- Category 1: Multi-hop
- Category 2: Temporal
- Category 3: Open-domain
- Category 4: Single-hop
- Category 5: Adversarial
For more details about the categories, please refer to this GitHub issue.
If you use this code in your research, please cite our work:
```bibtex
@inproceedings{xu2025amem,
  title={A-Mem: Agentic Memory for LLM Agents},
  author={Xu, Wujiang and Liang, Zujie and Mei, Kai and Gao, Hang and Tan, Juntao and Zhang, Yongfeng},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```

This project is licensed under the MIT License. See LICENSE for details.