MemBenchmark is an early-stage research project on how to evaluate memory systems for modern AI agents.
This repository currently represents the first, survey stage of the project; the reports here are the canonical, finalized survey outputs.
The current research focus is:
- modern agent-memory benchmark methods, tasks, and metrics
- open-source benchmark tooling and evaluation harnesses
- modern agent memory system designs and implementation patterns
The target agent class is modern acting agents, not only chat assistants. That includes:
- coding agents
- workflow/research agents
- web agents
- GUI agents
- tool-using autonomous systems
See modern_agent_memory_benchmarking_survey.md
This report covers:
- benchmark methods for modern acting-agent memory
- open-source benchmark tools and software
- gap analysis for what remains missing
See modern_agent_memory_systems_survey.md
This report covers:
- major memory architecture families
- representative memory systems and frameworks
- write / retrieve / update / forget behavior
- failure modes
- implications for future benchmark design
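To make the write / retrieve / update / forget surface concrete, here is a minimal hypothetical sketch of those four operations. All names (`AgentMemory`, `MemoryRecord`) are illustrative assumptions, not taken from any surveyed system, and the substring retrieval stands in for whatever real retrieval mechanism (embeddings, indexes) a production system would use:

```python
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    key: str
    content: str

class AgentMemory:
    """Hypothetical in-process memory illustrating the four core operations."""

    def __init__(self) -> None:
        self._store: dict[str, MemoryRecord] = {}

    def write(self, key: str, content: str) -> None:
        self._store[key] = MemoryRecord(key, content)

    def retrieve(self, query: str) -> list[MemoryRecord]:
        # Naive substring match stands in for real retrieval (e.g. embeddings).
        return [r for r in self._store.values()
                if query.lower() in r.content.lower()]

    def update(self, key: str, content: str) -> None:
        if key in self._store:
            self._store[key].content = content

    def forget(self, key: str) -> None:
        self._store.pop(key, None)
```

A benchmark can exercise each operation in isolation (does `forget` actually remove influence on later retrievals?) rather than only measuring end-task accuracy.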
There is now meaningful progress in both:
- benchmark design for modern agent memory
- memory system design itself
But there still appears to be a real gap in:
- neutral, reusable benchmarking for modern acting agents, especially coding / workflow / research agents
That is the current opening for MemBenchmark.
This repo is currently in the research and design phase.
Planned next steps:
- move from survey to a MemBenchmark v0 design memo
- define target agent type, benchmark dimensions, unit under test, and minimal benchmark interface
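As a rough illustration of what a "unit under test" and "minimal benchmark interface" could look like, here is a hypothetical sketch. None of this is the actual v0 design; `MemoryUnderTest`, `MemoryTask`, and `run_task` are placeholder names for the idea that a benchmark should depend only on a narrow protocol, not on any particular memory system:

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class MemoryTask:
    task_id: str
    setup_events: list[str]   # events the memory must ingest first
    probe: str                # question or action that exercises memory
    expected: str             # reference answer for scoring

class MemoryUnderTest(Protocol):
    """The minimal surface a memory system must expose to be benchmarked."""
    def ingest(self, event: str) -> None: ...
    def answer(self, probe: str) -> str: ...

def run_task(memory: MemoryUnderTest, task: MemoryTask) -> bool:
    # Feed the setup events, then score the probe against the reference.
    for event in task.setup_events:
        memory.ingest(event)
    return memory.answer(task.probe).strip() == task.expected
```

Keeping the interface this small is what would make the benchmark neutral: any memory system that can ingest events and answer probes can be plugged in, regardless of its internal architecture.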