Industry Benchmark
BEAM
Extreme-scale memory benchmark. At the largest corpus size, context stuffing is no longer an option.
What BEAM Is
BEAM evaluates memory systems at 100K, 1M, and 10M token corpus sizes.
At 10M tokens, the corpus no longer fits in a model context window, so pasting the whole corpus into the prompt is not an option. Only systems that actually store and retrieve memory can answer at that scale. That is why BEAM matters more than prompt engineering demos.
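The size argument above can be checked with a few lines of shell. This is a minimal sketch, not part of the BEAM tooling: the `CONTEXT_WINDOW` variable, its 1,000,000-token default, and the `fits_in_context` helper are all assumptions made for illustration. Substitute the window of the model you actually use.

```shell
#!/usr/bin/env bash
# Hypothetical context window in tokens (an assumption, not a BEAM setting).
context_window="${CONTEXT_WINDOW:-1000000}"

# Report whether a corpus of the given token count fits in the given window.
fits_in_context() {
  local corpus_tokens="$1" window="$2"
  if [ "$corpus_tokens" -le "$window" ]; then
    echo "fits"
  else
    echo "does not fit"
  fi
}

# The three BEAM corpus sizes named above: 100K, 1M, and 10M tokens.
for corpus_tokens in 100000 1000000 10000000; do
  printf '%s tokens: %s\n' "$corpus_tokens" \
    "$(fits_in_context "$corpus_tokens" "$context_window")"
done
```

With the assumed 1M-token window, only the 10M corpus is reported as not fitting, which is the tier where a memory system has to retrieve rather than stuff.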
Reference Scores
| System | BEAM 100K | BEAM 1M | BEAM 10M |
|---|---|---|---|
| Mem0 v3 | pending | 64.1% | 48.6% |
| Hindsight | TBD | TBD | SOTA |
| GBrain | not-run | not-run | not-run |
| Quaid | 25% | pending | pending |
Reference values come from published materials. GBrain has no public BEAM run.
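The table mixes numeric scores with status placeholders (pending, TBD, not-run, SOTA), so a comparison has to skip the non-numeric cells. The following is a rough sketch of one way to do that with `awk`. The `beam_table` variable and the `numeric_scores` function are hypothetical names introduced here; the table text is copied from this page.

```shell
#!/usr/bin/env bash
# Reference table copied verbatim from this page.
beam_table='| System | BEAM 100K | BEAM 1M | BEAM 10M |
|---|---|---|---|
| Mem0 v3 | pending | 64.1% | 48.6% |
| Hindsight | TBD | TBD | SOTA |
| GBrain | not-run | not-run | not-run |
| Quaid | 25% | pending | pending |'

# Print "system<TAB>tier<TAB>score" for every cell that holds a percentage.
# Placeholder cells are skipped rather than guessed at.
numeric_scores() {
  awk -F'|' '
    NR == 1 {                       # header row: remember the tier names
      for (i = 3; i < NF; i++) { t = $i; gsub(/^ +| +$/, "", t); tier[i] = t }
      next
    }
    NR == 2 { next }                # separator row
    {
      name = $2; gsub(/^ +| +$/, "", name)
      for (i = 3; i < NF; i++) {
        cell = $i; gsub(/^ +| +$/, "", cell)
        if (cell ~ /^[0-9]+(\.[0-9]+)?%$/)
          printf "%s\t%s\t%s\n", name, tier[i], cell
      }
    }'
}

printf '%s\n' "$beam_table" | numeric_scores
```

Only three cells in the current table carry a numeric score, so any ranking drawn from it is necessarily partial.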
Status
The BEAM corpus has been released by Mem0. Quaid has a published 100K run; its 1M and 10M runs remain pending.
measured, v0.23.0, 2026-06-22
How To Run
OPENAI_API_KEY=sk-... bash benchmarks/beam/run.sh
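Because the run depends on an API key and on the script path, a small preflight check can fail early with a clear message. This is a sketch under assumptions: the `preflight` function and the `beam_script` variable are hypothetical names, and the only facts taken from this page are the `OPENAI_API_KEY` variable and the `benchmarks/beam/run.sh` path.

```shell
#!/usr/bin/env bash
# Path taken from the command above; run this from the repository root.
beam_script="benchmarks/beam/run.sh"

# Return non-zero with a message if the key or the script is missing.
preflight() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  if [ ! -f "$beam_script" ]; then
    echo "cannot find $beam_script (run from the repository root)" >&2
    return 1
  fi
}

# Start the benchmark only when both checks pass.
if preflight; then
  bash "$beam_script"
else
  echo "BEAM run skipped" >&2
fi
```

Exporting the key in the shell session, rather than typing it inline on the command line, also keeps it out of shell history.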