Case Study 01

Snapshot: May 2026

Memory Operating System: persistent context with real write authority.

MemOS replaces fragmented memory surfaces with one logical read plane, one evented write plane, canonical and temporal truth, explicit provenance, client-specific bootstrap behavior, retrieval quality gates, and a live affect channel. It is built for Claude Code, Claude.ai, Claude Desktop, Codex, terminal agents, and the node cluster around them.

  • 8/8 readable sources
  • 12 surfaces visible
  • 31,280 memory events
  • 12,174 Qdrant points

Live Control Surface

The dashboard proves the operating posture, not just the architecture.

These screenshots show MemOS as an operator surface: rollout holds, read-only previews, service guardrails, projection state, reroute telemetry, saturation coverage, vector health, and memory growth.

Control surface overview: Service surface health, read-only posture, current state, next phase, rollout holds, and safety guardrails are visible without mutating the stack.
Retrieval and stores: Qdrant, projections, salience feedback, manual memory save, retrieval fallback, ranking canary, node authority, and next-work state in one guard table.
Ops guardrails: Service surfaces, systemd units, backing containers, projection backlog, and Qdrant shadow retrieval state are inspected as first-class health signals.
Reroute telemetry: Client-by-client reroute paths, fallback rate, p50/p95 latency, and canary widening state are visible as release-control evidence.
Saturation coverage: Canonical facts, provenance coverage, review queue state, contradiction counts, resolution rate, and stability score stay inspectable.
Memory and vector health: Memory growth, source distribution, Postgres rows, Qdrant points, graph nodes, memory age, and hydration cost are broken out as separate, individually inspectable metrics.

System Shape

Real diagrams from the active MemOS build.

These artifacts show the actual client surfaces, write authority, affect ingestion, projection workers, retrieval controller, and read-only surfaces behind the public summary.

Authority and retrieval flow: Clients do not write wherever they want. Profiles, evented writes, projections, retrieval lanes, health contracts, and dashboard output keep memory legible.
Client and authority map: Claude.ai, Claude Desktop, Claude Code, Codex, federation agents, and voice-mcp route through explicit read-only, read/write, write-only, and operator-controlled surfaces.
Affect write plane: Verified voice events become affect frames with mapping versions, projection state, provenance, embedding, and duplicate rejection.
Retrieval plane: Postgres, Neo4j, Qdrant, sparse lanes, metadata lanes, RRF, and reranking feed memory_recall, memory_hydrate, and bootstrap_context.
Calibration and emotion stream: Speaker enrollment, VAD calibration, affect ingestion, Postgres materialization, and downstream SG v2 emotion tools stay separated so signal quality can be audited before use.
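
The affect write plane described above can be illustrated with a minimal sketch. It is not the MemOS implementation: the class names, the `vad-v1` version label, and the in-memory duplicate check are all assumptions made for illustration. It shows a verified voice event becoming a versioned affect frame, an unverified one keeping only minimal metadata, and a repeated event being rejected.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Hypothetical label for the V/A/D mapping version; not taken from MemOS.
MAPPING_VERSION = "vad-v1"


@dataclass(frozen=True)
class VoiceEvent:
    event_id: str
    speaker_verified: bool
    valence: float
    arousal: float
    dominance: float


@dataclass(frozen=True)
class AffectFrame:
    event_id: str
    mapping_version: str
    verified: bool
    # Only verified frames carry a V/A/D payload.
    vad: Optional[Tuple[float, float, float]]


class AffectWritePlane:
    """In-memory stand-in for a gated affect write surface (assumed design)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.frames: List[AffectFrame] = []

    def ingest(self, event: VoiceEvent) -> Optional[AffectFrame]:
        # Duplicate rejection: the same event id is never written twice.
        if event.event_id in self._seen:
            return None
        self._seen.add(event.event_id)

        # Speaker verification gates the payload shape.
        vad = (
            (event.valence, event.arousal, event.dominance)
            if event.speaker_verified
            else None
        )
        frame = AffectFrame(event.event_id, MAPPING_VERSION, event.speaker_verified, vad)
        self.frames.append(frame)
        return frame
```

Stamping every frame with a mapping version is what would let a downstream reader tell which calibration produced a given value.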

Why This Is Hard

Memory without authority becomes folklore.

Long-lived AI systems do not just need more context. They need a contract for who can write, how facts supersede each other, where provenance lives, and how retrieval quality is measured before context reaches a model.

Naive version
  • Every client writes differently and old memory rows survive because nobody owns retirement.
  • Vector search feels smart until stale, duplicated, or unattributed memories win retrieval.
  • Voice and emotion signals leak into reasoning before speaker verification and calibration earn trust.
Architectural move
  • Separate write authority from read surfaces with append-only events and replayable projections.
  • Fuse dense, sparse, metadata, graph, and file-index lanes through ranked, inspectable retrieval.
  • Keep affect writes gated, versioned, and auditable before downstream tools can use them.
Public proof
  • Client authority map, affect write plane, retrieval plane, and calibration stream are all visible.
  • 17 Postgres tables, legacy sqlite_vec retirement, and migrated memory rows frame consolidation as a real migration, not a copy.
  • Regression suites, recall gates, and operator-held canaries keep memory changes from becoming narrative drift.
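
A recall gate of the kind mentioned above can be sketched in a few lines. This is an illustrative assumption, not the MemOS regression suite: the function names, the `0.8` threshold, and the shape of the evaluation cases are hypothetical. It computes Recall@k per query and only passes a retrieval change when the mean clears the threshold.

```python
from typing import Callable, Iterable, List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 5) -> float:
    """Fraction of the relevant ids that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("each evaluation case needs at least one relevant id")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def recall_gate(
    cases: Iterable[Tuple[str, Set[str]]],
    retrieve: Callable[[str], List[str]],
    threshold: float = 0.8,  # hypothetical bar; the real value is not stated in the source
    k: int = 5,
) -> Tuple[bool, float]:
    """Return (passed, mean Recall@k) over a labelled query set."""
    scores = [recall_at_k(retrieve(query), relevant, k) for query, relevant in cases]
    if not scores:
        raise ValueError("no evaluation cases supplied")
    mean = sum(scores) / len(scores)
    return mean >= threshold, mean
```

The gate is deliberately a pure function over labelled cases, so the same check could run in a regression suite and before a canary widens.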

Architecture

One write plane. One read plane. Multiple physical stores.

Postgres carries the durable event log and projections, Neo4j carries temporal graph truth, Qdrant carries vector retrieval, TEI services provide embeddings and reranking, and the dashboard exposes operator truth without mutating state.

Write plane
  • Append-only memory events with replayable projection state.
  • Dedup, provenance, enrichment, authority, and supersession handled before projection.
  • Manual save, session digest, hosted-chat, and terminal-native writes converge on the same contract.
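
The write-plane contract above can be sketched as a toy append-only log with a replayable projection. Everything here is an assumption for illustration (the `MemoryEvent` fields, digest-based dedup, and last-writer-wins supersession); the real MemOS schema and authority rules are not shown in this case study.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class MemoryEvent:
    seq: int      # position in the append-only log
    key: str      # which fact this event is about
    value: str
    source: str   # provenance: which client wrote it
    digest: str


class EventLog:
    """Append-only log; events are never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[MemoryEvent] = []
        self._latest_digest: Dict[str, str] = {}

    def append(self, key: str, value: str, source: str) -> Optional[MemoryEvent]:
        raw = f"{key}\x00{value}\x00{source}".encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        # Dedup: skip a write that repeats the latest event for this key.
        if self._latest_digest.get(key) == digest:
            return None
        event = MemoryEvent(len(self._events) + 1, key, value, source, digest)
        self._events.append(event)
        self._latest_digest[key] = digest
        return event

    def events(self) -> Tuple[MemoryEvent, ...]:
        return tuple(self._events)


def project(events: Iterable[MemoryEvent]) -> Dict[str, MemoryEvent]:
    """Replay events in sequence order; a later event supersedes an earlier one."""
    current: Dict[str, MemoryEvent] = {}
    for event in sorted(events, key=lambda e: e.seq):
        current[event.key] = event
    return current
```

Because the projection is derived only from the log, it can be dropped and rebuilt at any time, which is the property that makes replay and rollback possible.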
Read plane
  • Dense, sparse, metadata, graph, and file-index retrieval lanes.
  • RRF fusion, BGE reranking, score honesty, and client-specific injection profiles.
  • Representative Recall@5 validation and live telemetry before retrieval policy changes widen.
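
Reciprocal rank fusion (RRF), named in the read plane above, combines ranked lists by summing `1 / (k + rank)` for each item across lanes. The sketch below uses the conventional constant `k = 60` and hypothetical lane names and memory ids; it is an illustration of the technique, not the MemOS retrieval controller. It also returns each item's per-lane rank so the fused order stays inspectable.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rrf_fuse(
    lanes: Dict[str, Sequence[str]], k: int = 60
) -> List[Tuple[str, float, Dict[str, int]]]:
    """Fuse ranked id lists from several lanes with reciprocal rank fusion.

    Returns (id, fused score, {lane: rank}) tuples, best first.
    """
    scores: Dict[str, float] = defaultdict(float)
    ranks: Dict[str, Dict[str, int]] = defaultdict(dict)
    for lane, ranked_ids in lanes.items():
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += 1.0 / (k + rank)
            ranks[item_id][lane] = rank
    # Sort by score descending; break ties by id so output is deterministic.
    ordered = sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
    return [(item_id, scores[item_id], ranks[item_id]) for item_id in ordered]


if __name__ == "__main__":
    fused = rrf_fuse(
        {
            "dense": ["m1", "m2", "m3"],
            "sparse": ["m2", "m4"],
            "graph": ["m2", "m1"],
        }
    )
    for item_id, score, per_lane in fused:
        print(item_id, round(score, 5), per_lane)
```

RRF only needs ranks, not comparable scores, which is why it suits lanes as different as vector similarity, keyword match, and graph traversal. A reranker would then reorder the fused shortlist.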
Operator surface
  • Read-only dashboard for write plane, retrieval, shadow comparison, GPU, and service health.
  • Control surfaces expose rollout posture, canaries, feedback volume, and next work.
  • Live changes are release-gated, scoped, and reversible.

Current Build

Node02 authority, legacy retirement, and planner canaries stay operator-held.

Memory consolidation
  • node02 is the primary control plane and write authority while node05 remains a read-only standby path.
  • Client-specific manifests keep Claude, Codex, Gemini, Qwen, and hosted-chat reads behind explicit source slices.
  • The legacy 12,206-row sqlite_vec store is in retirement posture after migration work moved 7,857 rows toward MemOS Postgres.
Affect channel
  • voice_daemon remains a separate node05 producer and posts bounded affect events into the MemOS write surface.
  • Speaker verification gates payload shape; unverified frames keep minimal metadata with short retention.
  • Projection workers materialize affect_frames with V/A/D mapping versions so downstream emotion reads stay auditable.
  • Planner apply remains scoped to the approved 4% codex and terminal-native canary lane until a later review widens it.
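
One common way to hold a canary lane at a fixed percentage is deterministic hash bucketing, sketched below. The source only states the 4% figure and the two client lanes; the bucketing scheme, the `session_id` input, and the function names are assumptions for illustration.

```python
import hashlib

CANARY_PERCENT = 4  # from the case study
# Client lane labels are assumed spellings of "codex and terminal-native".
CANARY_CLIENTS = frozenset({"codex", "terminal-native"})


def bucket(session_id: str) -> int:
    """Map a session id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_canary(client: str, session_id: str, percent: int = CANARY_PERCENT) -> bool:
    """True only for approved clients whose bucket falls below the percentage."""
    return client in CANARY_CLIENTS and bucket(session_id) < percent
```

Because the bucket depends only on the id, a session stays in or out of the lane across restarts, and widening the canary is a one-number change that keeps every existing canary session included.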

Stack

Built as operating infrastructure, not a memory plugin.

  • Python
  • Postgres
  • pgvector
  • Neo4j
  • Qdrant
  • TEI
  • FastMCP
  • BGE-M3
  • BGE reranker
  • systemd
  • Dashboard
  • voice-mcp

Need persistent memory that operators can trust?

That means provenance, replay, retrieval evaluation, and rollback before the memory layer earns authority.

Email Rarity Index