- Every client writes differently, and old memory rows survive because nobody owns retirement.
- Vector search feels smart until stale, duplicated, or unattributed memories win retrieval.
- Voice and emotion signals leak into reasoning before speaker verification and calibration earn trust.
Case Study 01
Snapshot: May 2026
Memory Operating System, persistent context with real write authority.
MemOS replaces fragmented memory surfaces with one logical read plane, one evented write plane, canonical and temporal truth, explicit provenance, client-specific bootstrap behavior, retrieval quality gates, and a live affect channel. It is built for Claude Code, Claude.ai, Claude Desktop, Codex, terminal agents, and the node cluster around them.
Live Control Surface
The dashboard proves the operating posture, not just the architecture.
These screenshots show MemOS as an operator surface: rollout holds, read-only previews, service guardrails, projection state, reroute telemetry, saturation coverage, vector health, and memory growth.
System Shape
Real diagrams from the active MemOS build.
These artifacts show the actual client surfaces, write authority, affect ingestion, projection workers, retrieval controller, and read-only surfaces behind the public summary.
Why This Is Hard
Memory without authority becomes folklore.
Long-lived AI systems do not just need more context. They need a contract for who can write, how facts supersede each other, where provenance lives, and how retrieval quality is measured before context reaches a model.
- Separate write authority from read surfaces with append-only events and replayable projections.
- Fuse dense, sparse, metadata, graph, and file-index lanes through ranked, inspectable retrieval.
- Keep affect writes gated, versioned, and auditable before downstream tools can use them.
- Client authority map, affect write plane, retrieval plane, and calibration stream are all visible.
- 17 Postgres tables, legacy sqlite_vec retirement, and migrated memory rows frame consolidation as a real migration, not a copy.
- Regression suites, recall gates, and operator-held canaries keep memory changes from becoming narrative drift.
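The append-only write contract and replayable projections described above can be illustrated with a small in-memory sketch. This is not the MemOS schema: the `MemoryEvent` fields, the `EventLog` class, and the `project` function are hypothetical names chosen for illustration, and a plain Python list stands in for the Postgres event log.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class MemoryEvent:
    """One immutable entry in the log (hypothetical shape, not the MemOS schema)."""
    seq: int                          # position in the log, assigned on append
    memory_id: str                    # stable identifier for one memory
    content: str
    source: str                       # provenance: which client wrote it
    supersedes: Optional[str] = None  # memory_id this event retires, if any


class EventLog:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events = []

    def append(self, memory_id: str, content: str, source: str,
               supersedes: Optional[str] = None) -> MemoryEvent:
        event = MemoryEvent(len(self._events) + 1, memory_id, content,
                            source, supersedes)
        self._events.append(event)
        return event

    def events(self) -> Tuple[MemoryEvent, ...]:
        return tuple(self._events)


def project(events: Iterable[MemoryEvent]) -> Dict[str, MemoryEvent]:
    """Replay events in log order and return the currently active memories."""
    active: Dict[str, MemoryEvent] = {}
    for event in sorted(events, key=lambda e: e.seq):
        if event.supersedes is not None:
            # The superseded memory leaves the read view but stays in the log.
            active.pop(event.supersedes, None)
        active[event.memory_id] = event
    return active


log = EventLog()
log.append("m1", "Deploy target is node05", source="claude-code")
log.append("m2", "Deploy target is node02", source="codex", supersedes="m1")
view = project(log.events())
```

Because the read view is derived only from the log, it can be dropped and rebuilt at any time, and the superseded fact remains available for audit even though retrieval no longer sees it.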
Architecture
One write plane. One read plane. Multiple physical stores.
Postgres carries the durable event log and projections, Neo4j carries temporal graph truth, Qdrant carries vector retrieval, TEI services provide embeddings and reranking, and the dashboard exposes operator truth without mutating state.
- Append-only memory events with replayable projection state.
- Dedup, provenance, enrichment, authority, and supersession handled before projection.
- Manual save, session digest, hosted-chat, and terminal-native writes converge on the same contract.
- Dense, sparse, metadata, graph, and file-index retrieval lanes.
- RRF fusion, BGE reranking, score honesty, and client-specific injection profiles.
- Representative Recall@5 validation and live telemetry before any retrieval policy change widens.
- Read-only dashboard for write plane, retrieval, shadow comparison, GPU, and service health.
- Control surfaces expose rollout posture, canaries, feedback volume, and next work.
- Live changes are release-gated, scoped, and reversible.
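Two of the retrieval ideas listed above, RRF fusion and Recall@5, are small enough to sketch. The lane names come from the list; the document IDs, the constant `k=60` (a conventional default for reciprocal rank fusion, not a confirmed MemOS setting), and the function names are assumptions. Reranking and injection profiles are out of scope here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple


def rrf_fuse(lanes: Dict[str, Sequence[str]],
             k: int = 60) -> List[Tuple[str, float, Dict[str, float]]]:
    """Reciprocal rank fusion: each lane adds 1 / (k + rank) per document.

    Returns (doc_id, fused_score, per_lane_contribution) sorted best first,
    so an operator can inspect which lane pushed a result up.
    """
    scores: Dict[str, float] = defaultdict(float)
    contributions: Dict[str, Dict[str, float]] = defaultdict(dict)
    for lane, ranked_ids in lanes.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            part = 1.0 / (k + rank)
            scores[doc_id] += part
            contributions[doc_id][lane] = part
    # Ties break on doc_id so the ordering is deterministic.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return [(d, scores[d], contributions[d]) for d in ordered]


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Set[str],
                k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top = set(retrieved_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)


# Hypothetical ranked outputs from three of the lanes.
lanes = {
    "dense": ["a", "b", "c"],
    "sparse": ["b", "a", "d"],
    "graph": ["b", "c"],
}
fused = rrf_fuse(lanes)
ranking = [doc_id for doc_id, _, _ in fused]
```

RRF uses only ranks, so lanes with incomparable raw scores (cosine similarity, BM25, graph hops) can be combined without normalization. A recall gate would then compare `recall_at_k(ranking, labeled_relevant)` against a threshold on a labeled query set before a policy change ships.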
Current Build
node02 authority, legacy retirement, and planner canaries stay operator-held.
- node02 is the primary control plane and write authority while node05 remains a read-only standby path.
- Client-specific manifests keep Claude, Codex, Gemini, Qwen, and hosted-chat reads behind explicit source slices.
- The legacy 12,206-row sqlite_vec store is in retirement posture after migration work moved 7,857 rows toward MemOS Postgres.
- voice_daemon remains a separate node05 producer and posts bounded affect events into the MemOS write surface.
- Speaker verification gates payload shape; unverified frames keep minimal metadata with short retention.
- Projection workers materialize affect_frames with V/A/D mapping versions so downstream emotion reads stay auditable.
- Planner apply remains scoped to the approved 4% codex and terminal-native canary lane until a later review widens it.
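A scoped canary like the planner lane above is often implemented with deterministic hash bucketing. The sketch below assumes that the 4% refers to a share of requests within the codex and terminal-native lanes; the function names, the request ID format, and the bucketing scheme are hypothetical, not the MemOS implementation.

```python
import hashlib

# Lanes approved for the canary, taken from the text above.
CANARY_LANES = frozenset({"codex", "terminal-native"})
# 4% expressed in basis points (out of 10,000 buckets).
CANARY_BASIS_POINTS = 400


def canary_bucket(request_id: str) -> int:
    """Map a request ID to a stable bucket in [0, 10000)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def planner_apply_allowed(client: str, request_id: str) -> bool:
    """True only for approved lanes whose request falls in the canary slice."""
    if client not in CANARY_LANES:
        return False
    return canary_bucket(request_id) < CANARY_BASIS_POINTS


# Share of 10,000 synthetic codex requests admitted to the canary.
admitted = sum(planner_apply_allowed("codex", f"req-{i}") for i in range(10_000))
share = admitted / 10_000
```

Hashing the request ID keeps the decision stable across retries and replays. Widening the canary after review means changing one constant; rolling back means setting it to zero.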
Stack
Built as operating infrastructure, not a memory plugin.
- Python
- Postgres
- pgvector
- Neo4j
- Qdrant
- TEI
- FastMCP
- BGE-M3
- BGE reranker
- systemd
- Dashboard
- voice-mcp
Need persistent memory that operators can trust?
That means provenance, replay, retrieval evaluation, and rollback before the memory layer earns authority.