Industry Benchmark

BEAM

Extreme-scale memory benchmark. At the largest corpus size, context stuffing is not possible.

What BEAM Is

BEAM evaluates memory systems at 100K, 1M, and 10M token corpus sizes.

At 10M tokens, the corpus does not fit into any model's context window, so a system must retrieve what it needs from memory instead of stuffing the prompt. That is why BEAM matters more than prompt-engineering demos.
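To make the size argument concrete, the sketch below compares each BEAM corpus size against a context window. The 1,000,000-token window is an assumed, illustrative figure and not a value taken from the BEAM materials; `fits_in_context` is a hypothetical helper written for this example.

```shell
#!/usr/bin/env bash
# Illustrative only: CONTEXT_WINDOW is an assumed figure, not a BEAM parameter.
CONTEXT_WINDOW=${CONTEXT_WINDOW:-1000000}

# Hypothetical helper: succeeds when a corpus of $1 tokens
# fits inside a context window of $2 tokens.
fits_in_context() {
  local corpus_tokens=$1 window_tokens=$2
  (( corpus_tokens <= window_tokens ))
}

# The three corpus sizes BEAM evaluates: 100K, 1M, and 10M tokens.
for corpus in 100000 1000000 10000000; do
  if fits_in_context "$corpus" "$CONTEXT_WINDOW"; then
    echo "${corpus} tokens: fits in a ${CONTEXT_WINDOW}-token window"
  else
    echo "${corpus} tokens: exceeds a ${CONTEXT_WINDOW}-token window"
  fi
done
```

Under this assumed window, only the 10M corpus is out of reach; setting `CONTEXT_WINDOW` to a smaller value shows the smaller tiers failing as well.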

Reference Scores

System      BEAM 100K   BEAM 1M   BEAM 10M
Mem0 v3     pending     64.1%     48.6%
Hindsight   TBD         TBD       SOTA
GBrain      not-run     not-run   not-run
Quaid       25%         pending   pending

Reference values come from published materials. GBrain has no public BEAM run.

Status

The BEAM corpus was released by Mem0. Quaid has a published 100K run; runs on the larger corpora remain pending.

measured, v0.23.0, 2026-06-22

How To Run

OPENAI_API_KEY=sk-... bash benchmarks/beam/run.sh
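A minimal pre-flight sketch for the command above, assuming the script path is relative to the repository root and that the key is supplied through the environment. `beam_preflight` and `RUN_SCRIPT` are hypothetical names introduced here; only the final `bash benchmarks/beam/run.sh` invocation comes from this page.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the benchmark itself.
RUN_SCRIPT="benchmarks/beam/run.sh"

# Succeeds only when the API key is set and the run script is present.
beam_preflight() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  if [ ! -f "$RUN_SCRIPT" ]; then
    echo "$RUN_SCRIPT not found; is this the repository root?" >&2
    return 1
  fi
}

# Start the run only when both checks pass.
if beam_preflight; then
  bash "$RUN_SCRIPT"
fi
```

Exporting the key once (`export OPENAI_API_KEY=...`) keeps it out of each command line, while the inline form shown above scopes it to a single run.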