
System Deep Dive

Snapshot: May 2026

IOSTUI, a multi-agent command center from your iPhone browser.

IOSTUI started as a phone-first web terminal for Claude Code, Codex, Gemini, and Qwen. It is now a hardened command center: persistent Rust-backed sessions, modular client panels, structured logs, Observatory views, approvals, task board, pause/resume APIs, relay MCP tools, bridge write-back, coverage gates, and operational recovery paths.

  • 4 concurrent agents
  • 12 hardening phases
  • 636 repo commits
  • 80% coverage gate target

Working Surface

The command center is visible, not theoretical.

The desktop and iPad multipane modes show the real shape: agent panes on the left, shared team surfaces on the right, approval/task controls, and Observatory access without leaving the browser. Status chips show active, idle, and completed-task states; red is not an outage marker.

Desktop / iPad multipane UI: Four agent slots, Team / Files / Observatory tabs, approval count, task access, relay input, and visible team messages in the same operator surface.
Phone relay surface: The same team relay works from iPhone. Joe posts once, agents answer in one shared stream, and the status chips separate live, resting, and completed-task states.
Recovery and health proof: The source architecture makes recovery inspectable. Health endpoints, agentd readiness, systemd services, Observatory watchers, and private deployment paths stay explicit.
Proof ledger
  • 636 commits in the local repo. The command center has moved through repeated hardening rather than staying a thin terminal wrapper.
  • Phase 10 shipped daemon-state sync and reboot recovery work. Verification scenarios cover Node restarts, host reboot, cwd sync, singleton restoration, auto-respawn, and restart race guards.
  • Testing spans Node, browser, Rust daemon, Playwright, and K3s validation paths. c8 thresholds target 80% statements, functions, and lines, plus 70% branches.

Why This Is Hard

A browser terminal is easy. A recoverable command center is not.

The difficult part is keeping local processes, phone UI, relay context, approvals, task state, and daemon recovery from feeding back into each other in unsafe ways.

Naive version
  • PTY output, relay messages, approvals, and context reminders all blur into one terminal stream.
  • Node restarts break sessions, lose cwd, or double-spawn processes with stale handles.
  • Approvals are just prompt text, so the operator has no ledger, routing, or audit path.
Architectural move
  • Put session ownership in a Rust daemon and let Node coordinate API, relay, and UI state.
  • Make approvals, tasks, Observatory, and bridge write-back relay subscribers instead of terminal hacks.
  • Keep pause, resume, readiness, and recovery explicit so operators can trust state after restarts.
Public proof
  • The multipane UI shows agents, Team, Files, Observatory, approvals, and task controls in one surface.
  • 636 commits, 52 checked-in JS and E2E test files, Playwright phone viewport tests, and Rust daemon checks back the claim.
  • Health endpoints and systemd service recovery give the launcher something real to verify.
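
The "relay subscribers instead of terminal hacks" move can be pictured as a small publish/subscribe bus. This is a minimal sketch, not IOSTUI's actual code: the `RelayBus` class, the `RelayMessage` shape, and the three message kinds are all assumptions made for illustration.

```typescript
// Hypothetical message shape; the real relay schema is not shown in the source.
type RelayKind = "chat" | "approval" | "task";

interface RelayMessage {
  kind: RelayKind;
  agent: string;
  body: string;
}

type Subscriber = (msg: RelayMessage) => void;

class RelayBus {
  private subscribers = new Map<RelayKind, Subscriber[]>();

  subscribe(kind: RelayKind, fn: Subscriber): void {
    const list = this.subscribers.get(kind) ?? [];
    list.push(fn);
    this.subscribers.set(kind, list);
  }

  // Deliver to subscribers of this kind only; returns how many received it.
  publish(msg: RelayMessage): number {
    const list = this.subscribers.get(msg.kind) ?? [];
    for (const fn of list) fn(msg);
    return list.length;
  }
}

// Each feature keeps its own ledger instead of scraping the terminal stream.
const bus = new RelayBus();
const approvalLedger: RelayMessage[] = [];
const taskLedger: RelayMessage[] = [];
bus.subscribe("approval", (m) => approvalLedger.push(m));
bus.subscribe("task", (m) => taskLedger.push(m));

bus.publish({ kind: "approval", agent: "codex", body: "run migration?" });
bus.publish({ kind: "chat", agent: "claude", body: "build is green" });
```

The point of the shape is separation: an approval request lands in the approval ledger and nowhere else, so chat noise cannot masquerade as a pending decision.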

Architecture

Client modules, Node server, Rust session daemon.

The client remains a no-build browser app, but the command surface is now modular: terminal, files, relay, Observatory, approvals, tasks, and quick actions all sit on top of server APIs and daemon-backed sessions.

Client command center
  • xterm.js terminal with mobile tabs and desktop multipane mode.
  • Observatory timeline, agent log, alerts inbox, and audit search.
  • Approval panel and task board with mobile and desktop interaction paths.
  • Quick actions detect prompts and map approval keystrokes per agent.
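
Prompt detection with per-agent keystrokes might look roughly like the sketch below. The agent names, regular expressions, and key sequences are placeholders, not the prompts or keys any real agent CLI uses; only the idea of a per-agent lookup comes from the description above.

```typescript
// Hypothetical prompt patterns and keystrokes; real agent CLIs may differ.
interface QuickAction {
  pattern: RegExp;
  approveKeys: string;
  denyKeys: string;
}

const quickActions: Record<string, QuickAction> = {
  "agent-a": { pattern: /\(y\/n\)\s*$/i, approveKeys: "y\r", denyKeys: "n\r" },
  "agent-b": { pattern: /1\.\s*Yes[\s\S]*2\.\s*No/i, approveKeys: "1\r", denyKeys: "2\r" },
};

function detectApproval(agent: string, recentOutput: string): QuickAction | null {
  const action = quickActions[agent];
  if (!action) return null;
  // Only inspect the tail of the output so old prompts do not re-trigger.
  const tail = recentOutput.slice(-500);
  return action.pattern.test(tail) ? action : null;
}
```

A caller would send `approveKeys` or `denyKeys` to the session only after the operator taps the matching quick action.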
Server control layer
  • Express, WebSocket, SSE, SQLite ledgers, structured pino logging, and graceful shutdown.
  • Approvals, tasks, relay search, metrics, session pause/resume, and bridge status APIs.
  • MCP relay tools let agents query approvals, tasks, stuck signals, and team relay state.
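
For the SSE side, the wire format is fixed by the Server-Sent Events standard: `id`, `event`, and `data` fields, terminated by a blank line. The sketch below shows that framing plus a replay helper for reconnecting clients. The `RelayEvent` type and the in-memory log are assumptions; the source does not describe how IOSTUI stores or replays events.

```typescript
// Hypothetical event record; ids are assumed to increase monotonically.
interface RelayEvent {
  id: number;
  event: string;
  data: unknown;
}

// JSON.stringify never emits raw newlines, so one data line is enough.
function formatSse(e: RelayEvent): string {
  const payload = JSON.stringify(e.data);
  return `id: ${e.id}\nevent: ${e.event}\ndata: ${payload}\n\n`;
}

// On reconnect, replay only the events the client has not seen yet.
function replayAfter(log: RelayEvent[], lastEventId: number): string[] {
  return log.filter((e) => e.id > lastEventId).map((e) => formatSse(e));
}
```

Browsers send the last received id back in the `Last-Event-ID` request header when an `EventSource` reconnects, which is what makes this kind of replay possible.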
Agent daemon
  • Rust portable-pty daemon with persistent named sessions and 120KB scrollback.
  • Project-specific cwd restoration and singleton cleanup before respawn.
  • Systemd, Docker, and private K3s deployment paths with readiness checks.
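
A byte-capped scrollback is easiest to reason about as a queue that trims from the oldest end. The daemon itself is Rust; this TypeScript sketch only illustrates the bounding logic, and the `Scrollback` class is hypothetical. The 120KB default is the one figure taken from the list above.

```typescript
const SCROLLBACK_LIMIT = 120 * 1024; // 120KB, the figure quoted above

class Scrollback {
  private chunks: Uint8Array[] = [];
  private size = 0;
  private readonly limit: number;

  constructor(limit: number = SCROLLBACK_LIMIT) {
    this.limit = limit;
  }

  append(chunk: Uint8Array): void {
    // A single oversized chunk keeps only its newest `limit` bytes.
    const data = chunk.length > this.limit ? chunk.subarray(chunk.length - this.limit) : chunk;
    this.chunks.push(data);
    this.size += data.length;
    // Evict from the oldest end until the buffer fits again.
    while (this.size > this.limit) {
      const oldest = this.chunks[0];
      if (oldest === undefined) break;
      const excess = this.size - this.limit;
      if (oldest.length <= excess) {
        this.chunks.shift();
        this.size -= oldest.length;
      } else {
        this.chunks[0] = oldest.subarray(excess);
        this.size -= excess;
      }
    }
  }

  get byteLength(): number {
    return this.size;
  }

  // Concatenate the retained bytes, oldest first, for replay to a new client.
  snapshot(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const c of this.chunks) {
      out.set(c, offset);
      offset += c.length;
    }
    return out;
  }
}
```

Because the cap is enforced on every append, a reconnecting phone client can be handed `snapshot()` without the server ever holding more than the limit per session.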

Operator Control

Approvals, tasks, stuck signals, and bridge write-back.

Approval and task control
  • Pending approval requests are captured, persisted, grouped, and actionable from a dedicated panel.
  • Task ledger and kanban board track backlog, in-progress, done, and failed states, plus assignment rationale.
  • Pause/resume endpoints let external lifecycle sources stop sessions without buffering unsafe input.
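
One reading of "without buffering unsafe input" is that a paused session rejects writes outright instead of queuing them for later. The sketch below assumes that interpretation. `SessionGate` and its method names are illustrative, not the actual pause/resume API.

```typescript
type SessionState = "running" | "paused";

interface WriteResult {
  accepted: boolean;
  reason?: string;
}

class SessionGate {
  private state: SessionState = "running";
  private delivered: string[] = [];

  pause(): void {
    this.state = "paused";
  }

  resume(): void {
    this.state = "running";
  }

  // While paused, input is rejected rather than queued, so nothing stale
  // is replayed into the PTY when the session resumes.
  write(input: string): WriteResult {
    if (this.state === "paused") {
      return { accepted: false, reason: "session paused" };
    }
    this.delivered.push(input);
    return { accepted: true };
  }

  get deliveredInputs(): readonly string[] {
    return this.delivered;
  }
}
```

Rejecting is the conservative choice: a keystroke typed against a stale prompt never reaches the agent after the pause lifts.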
Relay and recovery
  • Team relay chat has SSE, filtering, collapsed output, noise suppression, and MCP self-service tools.
  • Bridge write-back syncs relay messages into federation notes with persistent dedup and circuit breaking.
  • Operational hardening adds prompt readiness detection, bounded queues, external noise filters, and leak samples.
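
Dedup plus circuit breaking on the write-back path can be sketched as follows. Everything here is an assumption for illustration: an in-memory `Set` stands in for the persistent dedup store, and the failure threshold, cooldown, and `send` callback are hypothetical parameters.

```typescript
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  canAttempt(now: number): boolean {
    if (this.openedAt === null) return true;
    // After the cooldown, allow a trial call (the "half-open" state).
    return now - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}

type WriteBackResult = "sent" | "duplicate" | "skipped" | "failed";

function writeBack(
  messageId: string,
  seen: Set<string>, // stand-in for a persistent dedup table
  breaker: CircuitBreaker,
  send: (id: string) => boolean,
  now: number,
): WriteBackResult {
  if (seen.has(messageId)) return "duplicate";
  if (!breaker.canAttempt(now)) return "skipped";
  if (send(messageId)) {
    seen.add(messageId);
    breaker.recordSuccess();
    return "sent";
  }
  breaker.recordFailure(now);
  return "failed";
}
```

A message is marked as seen only after a successful send, so a failed or skipped write can be retried later without creating a duplicate note.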

Deployment

Local, container, and private-cluster ready.

Services
  • Node app plus Rust agentd under user systemd services.
  • Docker image includes the app, compiled daemon, and pinned agent CLIs.
  • K3s overlay schedules privately on node01 with PVC-backed data and NodePort access.
Security
  • HttpOnly cookie auth, security headers, and no tokens in SSE URLs.
  • Path traversal protections and relay input sanitization.
  • Localhost default with LAN, Tailscale, Cloudflare, and SSH tunnel options.
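
A path traversal guard usually reduces to one rule: resolve the requested path against a root and refuse anything that climbs above it. The function below is a simplified, POSIX-only sketch with a hypothetical name. It does not follow symlinks or decode URL escapes, both of which a real file API would also need to handle.

```typescript
// Resolve a client-supplied path against a root directory and return null
// if it would escape that root. Leading slashes are treated as root-relative.
function resolveInsideRoot(root: string, requested: string): string | null {
  if (requested.includes("\0")) return null; // reject embedded NUL bytes
  const stack: string[] = [];
  for (const part of requested.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (stack.length === 0) return null; // would climb above the root
      stack.pop();
    } else {
      stack.push(part);
    }
  }
  const base = root.replace(/\/+$/, "");
  if (stack.length === 0) return base || "/";
  return `${base}/${stack.join("/")}`;
}
```

Returning `null` rather than clamping to the root keeps the failure visible, so the caller can log the attempt and respond with an error.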
Testing
  • Node built-in tests, jsdom multipane tests, SSE reconnection coverage, and Playwright E2E.
  • c8 thresholds target 80% statements, 80% functions, 80% lines, and 70% branches.
  • Rust daemon build and Kustomize validation sit in CI/CD.
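
The coverage gate itself is simple arithmetic: every metric must meet its threshold or the run fails. c8 can enforce this on its own through its check-coverage option, so the sketch below only illustrates the gate logic with the thresholds listed above. The `failingMetrics` helper and the sample numbers in the usage note are hypothetical.

```typescript
type Metric = "statements" | "functions" | "lines" | "branches";

// Thresholds as stated in the testing list above.
const thresholds: Record<Metric, number> = {
  statements: 80,
  functions: 80,
  lines: 80,
  branches: 70,
};

// Return every metric whose measured percentage falls below its threshold.
function failingMetrics(actual: Record<Metric, number>): Metric[] {
  return (Object.keys(thresholds) as Metric[]).filter((m) => actual[m] < thresholds[m]);
}
```

For example, a run measuring 85% statements, 80% functions, 79.5% lines, and 70% branches would fail on `lines` alone, since values equal to the threshold pass.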

Stack

Browser-native control over real local processes.

  • Node.js
  • Express
  • xterm.js
  • WebSocket
  • SSE
  • SQLite
  • Rust
  • portable-pty
  • MCP
  • SortableJS
  • Playwright
  • Kustomize

Need a multi-agent interface operators can actually run?

That means sessions, approvals, recovery, observability, and relay state all need first-class design.
