From e37faf49d3489cab26d3d8258d334977f5d8572c Mon Sep 17 00:00:00 2001
From: Vassiliy Yegorov
Date: Mon, 15 Jun 2026 15:20:02 +0700
Subject: [PATCH] docs: sync session-persistence spec to leaner RestartSurface-based design

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../2026-06-15-session-persistence-design.md | 90 +++++++++----------
 1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/DOCS/superpowers/specs/2026-06-15-session-persistence-design.md b/DOCS/superpowers/specs/2026-06-15-session-persistence-design.md
index bfa2413..4ce52bc 100644
--- a/DOCS/superpowers/specs/2026-06-15-session-persistence-design.md
+++ b/DOCS/superpowers/specs/2026-06-15-session-persistence-design.md
@@ -119,48 +119,49 @@ table is a `const`/static, not inline literals in branching logic.
 
 ### 5. Protocol — `crates/spacesh-proto/src/message.rs`
 
-- `Cmd::StartSurface { surface_id: SurfaceId, resume: bool }` — start a stopped
-  surface. `resume = true` builds `command + resume_args(command)` (falling
-  back to the original args when no resume mapping exists); `resume = false`
-  builds the original `command + args`. cwd and geometry come from the spec.
-- `Cmd::GetSnapshot { surface_id: SurfaceId }` → response carries
-  `Option`.
-- `SnapshotView { ansi, cols, rows, cursor_row, cursor_col }` — a proto-level
-  mirror of core `Snapshot`, so `spacesh-proto` does not depend on
-  `spacesh-core`. The daemon converts core `Snapshot` into `SnapshotView` at
-  the protocol boundary.
+The codebase already has `Cmd::RestartSurface { surface_id }` (starts a stopped
+surface from its spec, guarded by `is_running`) and an `Attach` response that
+already carries `{ snapshot, cols, rows, cursor_row, cursor_col, stopped }`.
+So no new command or wire type is needed beyond one field:
+
+- Extend `Cmd::RestartSurface` with `#[serde(default)] resume: bool`.
+  `resume = true` builds `command + resume_args(command)` (falling back to the original
+  args when no mapping exists); `resume = false` keeps the original
+  `command + args` (today's behavior). The `#[serde(default)]` keeps old frames
+  decoding to `resume = false`.
+- No `GetSnapshot`, no `StartSurface`, no `SnapshotView`: a stopped-panel
+  `Attach` returns the **disk** snapshot (see §6) using the existing response
+  shape.
 
 `spacesh-core::snapshot::Snapshot` gains `Deserialize` (alongside `Serialize`)
-so it can be loaded back from disk into `SnapshotRecord` conversions in tests
-and the store.
+so the store can load it back from disk.
 
 ### 6. Server handlers — `crates/spaceshd/src/server.rs`
 
-- `StartSurface`: look up the spec; if missing → error response. Build a
-  `SpawnSpec` with resume-or-plain args, the spec's cwd, and current geometry;
-  `spawn_surface_deferred(...)`; `registry.set_live(handle)`; broadcast
-  `workspace_changed` so all clients flip `running` to true.
-- `GetSnapshot`: read from the snapshot store, convert to `SnapshotView`,
-  return `Option`.
-- On surface close/remove: call `snapshot_store.remove(sid)` (via the writer or
-  a direct store handle) so stale files do not accumulate.
+- `RestartSurface { surface_id, resume }`: unchanged flow (spec lookup,
+  `spawn_from_spec`, `set_live`, `SurfaceRestarted` broadcast). When `resume`,
+  spawn with a spec whose `args` are replaced by `config.resume_args(command)`
+  (when present); otherwise spawn the original spec.
+- `Attach` for a **stopped** surface: instead of returning the empty
+  `{ snapshot: "", stopped: true }`, load the disk snapshot via the snapshot
+  store and return `{ snapshot: , cols, rows, cursor_row, cursor_col,
+  stopped: true }`. Missing file → empty snapshot, still `stopped: true`.
+- Surface close/remove (`Close`, `CloseWorkspace`, `remove_surface` paths):
+  send a remove to the snapshot writer so stale `.json` files do not
+  accumulate.
 
 ### 7. App — `app/src` and `app/src-tauri`
 
-- `socketBridge.ts`: `startSurface(id, resume)`, `getSnapshot(id)`, and a
-  `SnapshotView` type.
-- `app/src-tauri/src/bridge.rs`: `start_surface` and `get_snapshot` invoke
-  handlers forwarding to the daemon, wired into the Tauri `invoke_handler` and
-  the JS bridge.
-- `LayoutEngine.tsx` / `TerminalView.tsx`: when a surface's `running === false`,
-  render a stopped overlay instead of a live terminal:
-  - fetch `getSnapshot(id)` and paint the ANSI into a read-only, dimmed
-    `xterm` instance for visual context;
-  - centered controls: **Resume** → `startSurface(id, true)`,
-    **Restart fresh** → `startSurface(id, false)`;
-  - on success the daemon's `workspace_changed` sets `running = true`, the
-    overlay unmounts, and the normal live `TerminalView` mounts.
-  - a small "stopped" indicator in the panel header.
+- `socketBridge.ts`: `restartSurface(id, resume = false)` gains the `resume`
+  arg; `AttachResult` gains optional `cursor_row`/`cursor_col`/`stopped`.
+- `app/src-tauri/src/bridge.rs`: `restart_surface` forwards a `resume: bool`
+  arg into `Cmd::RestartSurface`.
+- `LayoutEngine.tsx` stopped branch (`running[id] === false`): paint the disk
+  snapshot into a dimmed, read-only `xterm` behind the controls, and offer two
+  buttons — **Resume** → `restartSurface(id, true)` and **Restart fresh** →
+  `restartSurface(id, false)`. On success the daemon's `workspace_changed`
+  flips `running` to true, the overlay unmounts, and the live `TerminalView`
+  mounts.
 
 ## Data flow
 
@@ -170,18 +171,17 @@ running surface ──(on exit)────────────────
 daemon shutdown ──(final pass over live)────────────▶ writer task ──▶ .json
 
 reboot ▶ daemon cold start ▶ Registry::restore(state.json) ▶ all surfaces stopped
-client ▶ GetSnapshot(sid) ▶ paint dimmed read-only screen + Resume/Restart
-user clicks Resume ▶ StartSurface{resume:true} ▶ spawn(command + resume_args, cwd)
-                   ▶ workspace_changed(running=true) ▶ live TerminalView mounts
+client ▶ Attach(sid) [stopped] ▶ disk snapshot ▶ paint dimmed read-only screen
+user clicks Resume ▶ RestartSurface{resume:true} ▶ spawn(command + resume_args, cwd)
+                   ▶ SurfaceRestarted + running=true ▶ live TerminalView mounts
 ```
 
 ## Error handling
 
-- Missing/corrupt snapshot file → `GetSnapshot` returns `None`; the overlay
-  shows an empty dimmed panel with the Resume/Restart controls (still usable).
-- `StartSurface` on an unknown/already-running surface → error response; client
-  ignores or surfaces a toast. No duplicate actor: guard on
-  `registry.is_running(sid)`.
+- Missing/corrupt snapshot file → stopped `Attach` returns an empty snapshot;
+  the overlay shows an empty dimmed panel with the Resume/Restart controls.
+- `RestartSurface` on an already-running surface → no-op ok (existing
+  `is_running` guard); unknown surface → `NOT_FOUND`.
 - Resume command for an agent without a mapping → falls back to the original
   spec args (plain restart), never fails the spawn.
 - Writer task failure to write one file is logged and dropped; it must not stall
@@ -202,9 +202,9 @@ user clicks Resume ▶ RestartSurface{resume:true} ▶ spawn(command + resume_args
   built-in default, then `None`; missing section defaults cleanly.
 - **surface actor:** `SurfaceMsg::Snapshot` returns the current grid contents;
   `dirty` is true after output and false immediately after a snapshot.
-- **server:** `StartSurface{resume:true}` builds `command + resume_args`;
-  `{resume:false}` builds `command + args`; `GetSnapshot` returns the saved
-  view; `is_running` guard prevents a second actor.
+- **server:** `RestartSurface{resume:true}` spawns with `command + resume_args`;
+  `{resume:false}` spawns with `command + args`; stopped `Attach` returns the
+  saved disk snapshot; `is_running` guard prevents a second actor.
 - **registry:** starting a stopped surface re-populates the live map and the
   view flips `running` to true.
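
Reviewer sketch (not part of the patch): the server test cases above can be modeled with a small std-only function. This is a minimal illustration under stated assumptions, not the daemon's code. `SurfaceSpec`, `RestartOutcome`, `RESUME_ARGS`, the `u32` surface ids, and the `agent-a` / `--resume` table entry are all hypothetical names chosen for the example; only the behavior (resume args replace the spec args, fall back when no mapping exists, no-op when already running, not-found for an unknown id) is taken from the spec text.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the persisted surface spec.
#[derive(Clone, Debug, PartialEq)]
struct SurfaceSpec {
    command: String,
    args: Vec<String>,
}

/// Hypothetical result of handling `RestartSurface`.
#[derive(Debug, PartialEq)]
enum RestartOutcome {
    /// Surface is already live: no-op, reported as ok.
    AlreadyRunning,
    /// A process would be spawned with this argv (command first).
    Spawn(Vec<String>),
    /// No spec is known for the surface id.
    NotFound,
}

/// Resume mapping kept as a const table rather than inline literals.
/// The single entry is illustrative only.
const RESUME_ARGS: &[(&str, &[&str])] = &[("agent-a", &["--resume"])];

/// Look up the resume args for a command, if a mapping exists.
fn resume_args(command: &str) -> Option<Vec<String>> {
    RESUME_ARGS
        .iter()
        .find(|entry| entry.0 == command)
        .map(|entry| entry.1.iter().map(|s| s.to_string()).collect())
}

/// Build the argv: resume args replace the spec args when requested and
/// mapped; otherwise the original spec args are used (plain restart).
fn spawn_argv(spec: &SurfaceSpec, resume: bool) -> Vec<String> {
    let args = if resume {
        resume_args(&spec.command).unwrap_or_else(|| spec.args.clone())
    } else {
        spec.args.clone()
    };
    let mut argv = vec![spec.command.clone()];
    argv.extend(args);
    argv
}

/// Guarded restart: unknown id is NotFound, a running surface is a no-op,
/// and only a stopped surface yields a spawn.
fn restart_surface(
    specs: &HashMap<u32, SurfaceSpec>,
    running: &HashSet<u32>,
    id: u32,
    resume: bool,
) -> RestartOutcome {
    let Some(spec) = specs.get(&id) else {
        return RestartOutcome::NotFound;
    };
    if running.contains(&id) {
        return RestartOutcome::AlreadyRunning;
    }
    RestartOutcome::Spawn(spawn_argv(spec, resume))
}
```

Keeping the fallback inside `spawn_argv` means a missing mapping can never fail the spawn, which matches the error-handling rule for agents without a resume mapping.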