Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Session Persistence (resurrect + resume) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** After a daemon restart (reboot / battery / `kill -9`) the user can bring each panel back: it shows its last on-screen state and offers a one-click **Resume** that respawns the agent with its session-continue flag (e.g. `claude --continue`).

**Architecture:** The daemon already persists structure (`state.json`) and already shows stopped panels with a restart overlay; `RestartSurface` already respawns a stopped surface from its spec. This plan adds (1) periodic on-disk snapshots of each surface's visible screen, (2) a `[resume]` config map producing resume args, (3) a `resume` flag on `RestartSurface`, and (4) painting the saved screen behind the overlay plus a Resume button. We reuse `spacesh_core::snapshot::snapshot_ansi` (the live-reattach serializer) for the on-disk snapshot.

**Tech Stack:** Rust (tokio actors, serde, alacritty_terminal grid), Tauri 2 bridge, React/TS + xterm.js.

**Spec:** `docs/superpowers/specs/2026-06-15-session-persistence-design.md`

---

## Orientation (read before starting)

Key existing code this plan builds on:

- `crates/spacesh-core/src/snapshot.rs`: `Snapshot { ansi, cols, rows, cursor_row, cursor_col }` (derives `Serialize` only) and `snapshot_ansi(&GridSurface) -> Snapshot`.
- `crates/spaceshd/src/state_store.rs`: the `JsonStateStore` pattern, i.e. atomic write (temp → `sync_all` → rename) and corrupt-file tolerance. Mirror this for snapshots.
- `crates/spaceshd/src/surface.rs`: the surface actor. `spawn_from_spec` → `spawn_surface_deferred` → `run_actor`; eager `spawn_surface` for tests. `SurfaceMsg` enum. `run_actor` owns `grid: GridSurface` and exits via `exit_tx.send((id, code))` after `pty.wait()`.
- `crates/spaceshd/src/server.rs`: `serve(socket, store, event_store)`, the `router` single-task loop over `ServerMsg`, `handle_request`, the `RestartSurface` handler, the stopped-`Attach` branch, and ~12 `serve(...)` callsites in `#[cfg(test)]`.
- `crates/spaceshd/src/config.rs`: `Config` with `#[serde(default)]` sub-tables.
- `crates/spacesh-proto/src/message.rs`: `Cmd::RestartSurface { surface_id }`.
- `app/src/LayoutEngine.tsx`: `Leaf` renders the `running[id] === false` overlay ("Process exited" + Restart button).
- `app/src/socketBridge.ts`: `restartSurface`, `AttachResult`. `app/src-tauri/src/bridge.rs`: `restart_surface`, `attach` invoke handlers.
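
The atomic-write pattern noted for `JsonStateStore` (temp → `sync_all` → rename) can be sketched on its own with the standard library. This is a minimal illustration, not the code in `state_store.rs`: `atomic_write` and the `.tmp` suffix are assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: write `bytes` to `path` so that a reader sees either
/// the previous file or the complete new one, never a half-written file.
fn atomic_write(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Assumed naming: a sibling file with `.tmp` appended to the full name.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    // Flush file contents to disk before the rename makes them visible.
    file.sync_all()?;
    drop(file);

    // The rename replaces any existing file at `path` in one step
    // (atomic on POSIX when both paths are on the same filesystem).
    fs::rename(&tmp, path)
}
```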

Build/test commands: `cargo test -p spacesh-core`, `cargo test -p spacesh-proto`, `cargo test -p spaceshd`, and `cd app && npx tsc --noEmit`.

---

## Task 1: `Snapshot` gains `Deserialize`

**Files:**
- Modify: `crates/spacesh-core/src/snapshot.rs`
- Test: same file (`#[cfg(test)]` module)

- [ ] **Step 1: Write the failing test**

Add to the `tests` module in `crates/spacesh-core/src/snapshot.rs`:

```rust
#[test]
fn snapshot_round_trips_through_json() {
    let mut g = GridSurface::new(20, 4);
    g.feed(b"hello");
    let snap = snapshot_ansi(&g);
    let json = serde_json::to_string(&snap).unwrap();
    let back: Snapshot = serde_json::from_str(&json).unwrap();
    assert_eq!(back.ansi, snap.ansi);
    assert_eq!((back.cols, back.rows), (snap.cols, snap.rows));
    assert_eq!((back.cursor_row, back.cursor_col), (snap.cursor_row, snap.cursor_col));
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `cargo test -p spacesh-core snapshot_round_trips_through_json`
Expected: FAIL. `Snapshot` does not implement `Deserialize` (compile error `the trait bound Snapshot: Deserialize<'_> is not satisfied`).

- [ ] **Step 3: Add the derive**

In `crates/spacesh-core/src/snapshot.rs`, change the `Snapshot` derive and the `serde` import:

```rust
use serde::{Deserialize, Serialize};
```

```rust
/// Serializable snapshot returned by `attach` and persisted to disk.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Snapshot {
    /// ANSI byte dump suitable for `xterm.write()`.
    pub ansi: String,
    pub cols: u16,
    pub rows: u16,
    /// 1-based cursor position.
    pub cursor_row: u16,
    pub cursor_col: u16,
}
```

(`PartialEq` is added so tests can compare snapshots directly.)

- [ ] **Step 4: Run test to verify it passes**

Run: `cargo test -p spacesh-core snapshot_round_trips_through_json`
Expected: PASS. Also run `cargo test -p spacesh-core`; everything should be green.

- [ ] **Step 5: Commit**

```bash
git add crates/spacesh-core/src/snapshot.rs
git commit -m "feat(core): Snapshot derives Deserialize + PartialEq for disk persistence"
```

---

## Task 2: `snapshot_store` (per-surface disk store)

**Files:**
- Create: `crates/spaceshd/src/snapshot_store.rs`
- Modify: `crates/spaceshd/src/main.rs` (add `mod snapshot_store;`)
- Test: in the new file's `#[cfg(test)]` module

- [ ] **Step 1: Register the module**

In `crates/spaceshd/src/main.rs`, add to the module list (keep it alphabetical, near `state_store`):

```rust
mod snapshot_store;
```

- [ ] **Step 2: Write the store and its tests**

Create `crates/spaceshd/src/snapshot_store.rs` with the implementation and its test module together:

```rust
use std::path::PathBuf;
use spacesh_core::snapshot::Snapshot;
use spacesh_proto::SurfaceId;

/// Stores one visible-screen snapshot per surface as `<dir>/<surface_id>.json`.
pub trait SnapshotStore: Send + Sync {
    fn save(&self, sid: &SurfaceId, snap: &Snapshot);
    fn load(&self, sid: &SurfaceId) -> Option<Snapshot>;
    fn remove(&self, sid: &SurfaceId);
}

/// Writer command: persist or delete a surface's snapshot. Shared by the
/// router ticker, the close/remove paths, and each actor's on-exit dump, so a
/// single channel type flows everywhere.
pub enum SnapshotMsg {
    Save(SurfaceId, Snapshot),
    Remove(SurfaceId),
}

/// A no-op store for tests and contexts that do not persist snapshots.
pub struct NullSnapshotStore;
impl SnapshotStore for NullSnapshotStore {
    fn save(&self, _sid: &SurfaceId, _snap: &Snapshot) {}
    fn load(&self, _sid: &SurfaceId) -> Option<Snapshot> { None }
    fn remove(&self, _sid: &SurfaceId) {}
}

/// JSON file store. Filenames are the surface id (e.g. `s_1f.json`); ids are
/// `^[a-z]_[0-9a-f]+$` so they are always safe path components.
pub struct JsonSnapshotStore {
    dir: PathBuf,
}

impl JsonSnapshotStore {
    pub fn new(dir: PathBuf) -> Self {
        let _ = std::fs::create_dir_all(&dir);
        Self { dir }
    }
    fn path(&self, sid: &SurfaceId) -> PathBuf {
        self.dir.join(format!("{}.json", sid.0))
    }
}

impl SnapshotStore for JsonSnapshotStore {
    fn save(&self, sid: &SurfaceId, snap: &Snapshot) {
        let path = self.path(sid);
        let tmp = path.with_extension("json.tmp");
        let Ok(bytes) = serde_json::to_vec(snap) else { return };
        if std::fs::write(&tmp, &bytes).is_err() { return; }
        if let Ok(f) = std::fs::File::open(&tmp) { let _ = f.sync_all(); }
        let _ = std::fs::rename(&tmp, &path);
    }
    fn load(&self, sid: &SurfaceId) -> Option<Snapshot> {
        let bytes = std::fs::read(self.path(sid)).ok()?;
        serde_json::from_slice(&bytes).ok()
    }
    fn remove(&self, sid: &SurfaceId) {
        let _ = std::fs::remove_file(self.path(sid));
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    fn tmp_dir(name: &str) -> PathBuf {
        let n = std::time::SystemTime::now().duration_since(std::time::UNIX_EPOCH).unwrap().as_nanos();
        let p = std::env::temp_dir().join(format!("spacesh-snap-{name}-{n}"));
        std::fs::create_dir_all(&p).unwrap();
        p
    }

    fn sample() -> Snapshot {
        Snapshot { ansi: "\u{1b}[mhello".into(), cols: 80, rows: 24, cursor_row: 1, cursor_col: 6 }
    }

    #[test]
    fn save_then_load_round_trips() {
        let dir = tmp_dir("roundtrip");
        let store = JsonSnapshotStore::new(dir.clone());
        let sid = SurfaceId("s_1".into());
        store.save(&sid, &sample());
        assert_eq!(store.load(&sid), Some(sample()));
        let _ = std::fs::remove_dir_all(dir);
    }

    #[test]
    fn missing_loads_none() {
        let store = JsonSnapshotStore::new(tmp_dir("missing"));
        assert_eq!(store.load(&SurfaceId("s_none".into())), None);
    }

    #[test]
    fn corrupt_loads_none() {
        let dir = tmp_dir("corrupt");
        let store = JsonSnapshotStore::new(dir.clone());
        let sid = SurfaceId("s_2".into());
        std::fs::write(dir.join("s_2.json"), b"{ not json").unwrap();
        assert_eq!(store.load(&sid), None);
        let _ = std::fs::remove_dir_all(dir);
    }

    #[test]
    fn remove_deletes_file() {
        let dir = tmp_dir("remove");
        let store = JsonSnapshotStore::new(dir.clone());
        let sid = SurfaceId("s_3".into());
        store.save(&sid, &sample());
        assert!(store.load(&sid).is_some());
        store.remove(&sid);
        assert_eq!(store.load(&sid), None);
        let _ = std::fs::remove_dir_all(dir);
    }

    #[test]
    fn null_store_is_inert() {
        let store = NullSnapshotStore;
        let sid = SurfaceId("s_4".into());
        store.save(&sid, &sample());
        assert_eq!(store.load(&sid), None);
        store.remove(&sid);
    }
}
```
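
The store joins the surface id straight into a file path, relying on ids always matching `^[a-z]_[0-9a-f]+$`. If ids could ever arrive from a less trusted source, a guard along the following lines could run before the join. This is only a sketch: `is_safe_surface_id` is a hypothetical helper, not something the plan or the existing codebase defines.

```rust
/// Hypothetical guard: true only for ids shaped like `s_1f`, i.e. one
/// lowercase ASCII letter, an underscore, then one or more lowercase hex digits.
fn is_safe_surface_id(id: &str) -> bool {
    let mut chars = id.chars();
    let prefix_ok = matches!(chars.next(), Some('a'..='z'));
    let sep_ok = chars.next() == Some('_');
    // Everything after the first two characters must be lowercase hex.
    let rest = chars.as_str();
    prefix_ok
        && sep_ok
        && !rest.is_empty()
        && rest.chars().all(|c| matches!(c, '0'..='9' | 'a'..='f'))
}
```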

- [ ] **Step 3: Run tests to verify they pass**

The module body above already contains the implementation, so this task writes test + impl together (the store is pure I/O with no logic worth a red-then-green split beyond compilation).

Run: `cargo test -p spaceshd snapshot_store`
Expected: PASS, 5 tests (`save_then_load_round_trips`, `missing_loads_none`, `corrupt_loads_none`, `remove_deletes_file`, `null_store_is_inert`).

- [ ] **Step 4: Commit**

```bash
git add crates/spaceshd/src/snapshot_store.rs crates/spaceshd/src/main.rs
git commit -m "feat(daemon): per-surface JSON snapshot store (atomic write, corrupt-tolerant)"
```

---

## Task 3: Resume config + snapshot interval

**Files:**
- Modify: `crates/spaceshd/src/config.rs`
- Test: same file (`#[cfg(test)]` module)

- [ ] **Step 1: Write the failing test**

Add to the `tests` module in `crates/spaceshd/src/config.rs`:

```rust
#[test]
fn resume_args_user_then_default_then_none() {
    let mut c = Config::default();
    // built-in defaults present without any config
    assert_eq!(c.resume_args("claude").as_deref(), Some(&["--continue".to_string()][..]));
    assert_eq!(c.resume_args("codex").as_deref(), Some(&["resume".to_string()][..]));
    // a path is reduced to its basename before lookup
    assert_eq!(c.resume_args("/usr/local/bin/claude").as_deref(), Some(&["--continue".to_string()][..]));
    // unknown command → None
    assert_eq!(c.resume_args("bash"), None);
    // user override wins over the default
    c.resume.commands.insert("claude".into(), vec!["--resume".into(), "last".into()]);
    assert_eq!(c.resume_args("claude"), Some(vec!["--resume".into(), "last".into()]));
}

#[test]
fn snapshot_interval_defaults_to_5s() {
    let c = Config::default();
    assert_eq!(c.snapshot_interval_secs(), 5);
}

#[test]
fn parses_resume_table_and_interval() {
    let dir = std::env::temp_dir().join(format!("spacesh-cfg-resume-{}", std::process::id()));
    std::fs::create_dir_all(&dir).unwrap();
    let path = dir.join("config.toml");
    std::fs::write(&path,
        "snapshot_interval_secs = 10\n[resume.commands]\ngemini = [\"--resume\"]\n").unwrap();
    let c = Config::from_path(&path);
    assert_eq!(c.snapshot_interval_secs(), 10);
    assert_eq!(c.resume_args("gemini"), Some(vec!["--resume".into()]));
    let _ = std::fs::remove_file(&path);
}
```

- [ ] **Step 2: Run tests to verify they fail**

Run: `cargo test -p spaceshd resume_args_user_then_default_then_none`
Expected: FAIL with a compile error: no field `resume`, no method `resume_args`/`snapshot_interval_secs`.

- [ ] **Step 3: Implement config additions**

In `crates/spaceshd/src/config.rs`, add the struct and a default table, and extend `Config`:

```rust
/// Built-in resume args for known agents, used when config has no override.
/// (command basename, resume args)
const DEFAULT_RESUME: &[(&str, &[&str])] = &[
    ("claude", &["--continue"]),
    ("codex", &["resume"]),
];

#[derive(Debug, Clone, Default, Deserialize, Serialize)]
pub struct ResumeConfig {
    /// command basename -> args that continue its previous session.
    #[serde(default)]
    pub commands: std::collections::HashMap<String, Vec<String>>,
}
```

Add the fields to `Config`:

```rust
#[derive(Debug, Clone, Default, Deserialize, Serialize)]
pub struct Config {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub default_shell: Option<String>,
    #[serde(default)]
    pub terminal: TerminalConfig,
    #[serde(default)]
    pub appearance: AppearanceConfig,
    #[serde(default)]
    pub resume: ResumeConfig,
    /// How often (seconds) the daemon dumps changed grids to disk.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub snapshot_interval_secs: Option<u64>,
}
```

Add the resolver methods in the `impl Config` block:

```rust
/// Resume args for a command, by basename: user map → built-in default → None.
pub fn resume_args(&self, command: &str) -> Option<Vec<String>> {
    let base = std::path::Path::new(command)
        .file_name()
        .map(|s| s.to_string_lossy().to_string())
        .unwrap_or_else(|| command.to_string());
    if let Some(args) = self.resume.commands.get(&base) {
        return Some(args.clone());
    }
    DEFAULT_RESUME.iter()
        .find(|(name, _)| *name == base)
        .map(|(_, args)| args.iter().map(|s| s.to_string()).collect())
}

/// Snapshot dump cadence in seconds (config → default 5, clamped to [1, 3600]).
pub fn snapshot_interval_secs(&self) -> u64 {
    self.snapshot_interval_secs.unwrap_or(5).clamp(1, 3600)
}
```
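
The two resolver rules (basename reduction before lookup, and the default-then-clamp interval) can be checked in isolation. The sketch below restates them as free functions; `basename` and `effective_interval` are names chosen for this illustration and are not part of `Config`.

```rust
use std::path::Path;

/// Illustrative copy of the basename step: a path is reduced to its final
/// component; input with no final component is returned unchanged.
fn basename(command: &str) -> String {
    Path::new(command)
        .file_name()
        .map(|s| s.to_string_lossy().into_owned())
        .unwrap_or_else(|| command.to_string())
}

/// Illustrative copy of the interval rule: a missing value falls back to 5,
/// and the result is clamped to the range [1, 3600].
fn effective_interval(configured: Option<u64>) -> u64 {
    configured.unwrap_or(5).clamp(1, 3600)
}
```

Under these rules a configured interval of `0` becomes `1` and anything above an hour becomes `3600`, so a typo in `config.toml` cannot turn the ticker into a busy loop or effectively disable it.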

- [ ] **Step 4: Run tests to verify they pass**

Run: `cargo test -p spaceshd config`
Expected: PASS, including the three new tests and the existing config tests.

- [ ] **Step 5: Commit**

```bash
git add crates/spaceshd/src/config.rs
git commit -m "feat(daemon): [resume] config map + snapshot_interval_secs with built-in defaults"
```

---

## Task 4: Actor `Snapshot` message + dirty flag + on-exit dump

**Files:**
- Modify: `crates/spaceshd/src/surface.rs`
- Test: same file (`#[cfg(test)]` module)

This adds a snapshot channel threaded through every spawn entry point. The channel carries `SnapshotMsg` (defined in Task 2) to the writer (Task 5); here the actor only ever sends `SnapshotMsg::Save(id, snap)` on exit and answers on-demand `SurfaceMsg::Snapshot` requests. Add the import at the top of `surface.rs`: `use crate::snapshot_store::SnapshotMsg;`.

- [ ] **Step 1: Write the failing tests**

Add to the `tests` module in `crates/spaceshd/src/surface.rs`. Note the existing test helper `spawn_surface(...)` signature gains a trailing `snapshot_tx`; these tests use it.

```rust
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn snapshot_msg_returns_grid_and_tracks_dirty() {
    let _serial = crate::test_support::serial();
    let pty = PtyHandle::spawn(spec("printf DIRTYME; sleep 0.4")).unwrap();
    let (state_tx, _s) = mpsc::unbounded_channel();
    let (exit_tx, _e) = mpsc::unbounded_channel();
    let (snap_tx, _snap_rx) = mpsc::unbounded_channel();
    let handle = spawn_surface(SurfaceId("s_1".into()), WorkspaceId("w_1".into()), pty, 80, 24, false, state_tx, exit_tx, snap_tx);

    // Give the child time to print.
    tokio::time::sleep(Duration::from_millis(150)).await;
    let (reply_tx, reply_rx) = oneshot::channel();
    handle.tx.send(SurfaceMsg::Snapshot { reply: reply_tx }).await.unwrap();
    let (snap, dirty) = reply_rx.await.unwrap();
    assert!(snap.ansi.contains("DIRTYME"), "snapshot: {:?}", snap.ansi);
    assert!(dirty, "first snapshot after output should be dirty");

    // Immediately snapshot again with no new output → not dirty.
    let (reply_tx, reply_rx) = oneshot::channel();
    handle.tx.send(SurfaceMsg::Snapshot { reply: reply_tx }).await.unwrap();
    let (_snap2, dirty2) = reply_rx.await.unwrap();
    assert!(!dirty2, "second snapshot with no new output should be clean");
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn final_snapshot_sent_on_exit() {
    let _serial = crate::test_support::serial();
    let pty = PtyHandle::spawn(spec("printf BYE")).unwrap(); // exits immediately
    let (state_tx, _s) = mpsc::unbounded_channel();
    let (exit_tx, _e) = mpsc::unbounded_channel();
    let (snap_tx, mut snap_rx) = mpsc::unbounded_channel();
    let _handle = spawn_surface(SurfaceId("s_x".into()), WorkspaceId("w_1".into()), pty, 80, 24, false, state_tx, exit_tx, snap_tx);

    let msg = tokio::time::timeout(Duration::from_secs(2), snap_rx.recv()).await.unwrap().unwrap();
    match msg {
        crate::snapshot_store::SnapshotMsg::Save(sid, snap) => {
            assert_eq!(sid.0, "s_x");
            assert!(snap.ansi.contains("BYE"), "final snapshot: {:?}", snap.ansi);
        }
        _ => panic!("expected a Save message on exit"),
    }
}
```

- [ ] **Step 2: Run tests to verify they fail**

Run: `cargo test -p spaceshd snapshot_msg_returns_grid_and_tracks_dirty`
Expected: FAIL with a compile error: `SurfaceMsg::Snapshot` variant missing and `spawn_surface` takes too few arguments.

- [ ] **Step 3: Add the message variant and snapshot channel**

In `crates/spaceshd/src/surface.rs`:

Add the variant to `SurfaceMsg`:

```rust
pub enum SurfaceMsg {
    Input(Vec<u8>),
    Resize { cols: u16, rows: u16 },
    Attach { reply: oneshot::Sender<broadcast::Receiver<Vec<u8>>> },
    /// Attach with snapshot: subscribe AND capture the grid in one actor turn.
    AttachSnapshot { reply: oneshot::Sender<(Snapshot, broadcast::Receiver<Vec<u8>>)> },
    /// On-demand snapshot without subscribing; bool = dirty since last snapshot.
    Snapshot { reply: oneshot::Sender<(Snapshot, bool)> },
    Close,
}
```

Thread a `snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>` parameter through `spawn_from_spec`, `spawn_surface`, `spawn_surface_deferred`, and `run_actor`. For each, add the parameter (last position) and pass it down.

`spawn_from_spec` signature + body:

```rust
#[allow(clippy::too_many_arguments)]
pub fn spawn_from_spec(
    id: SurfaceId,
    workspace_id: WorkspaceId,
    spec: &SurfaceSpec,
    extra_env: Vec<(String, String)>,
    hooks_active: bool,
    state_tx: mpsc::UnboundedSender<(SurfaceId, SurfaceState)>,
    exit_tx: mpsc::UnboundedSender<(SurfaceId, i32)>,
    snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>,
) -> std::io::Result<SurfaceHandle> {
    let mut env = vec![("SPACESH_SURFACE_ID".to_string(), id.0.clone())];
    env.extend(extra_env);
    let spawn_spec = SpawnSpec {
        command: spec.command.clone(),
        args: spec.args.clone(),
        cwd: std::path::PathBuf::from(&spec.cwd),
        cols: spec.cols,
        rows: spec.rows,
        env,
    };
    Ok(spawn_surface_deferred(id, workspace_id, spawn_spec, spec.cols, spec.rows, hooks_active, state_tx, exit_tx, snapshot_tx))
}
```

`spawn_surface` (eager, test path):

```rust
#[allow(clippy::too_many_arguments)]
pub fn spawn_surface(
    id: SurfaceId,
    workspace_id: WorkspaceId,
    pty: PtyHandle,
    cols: u16,
    rows: u16,
    hooks_active: bool,
    state_tx: mpsc::UnboundedSender<(SurfaceId, SurfaceState)>,
    exit_tx: mpsc::UnboundedSender<(SurfaceId, i32)>,
    snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>,
) -> SurfaceHandle {
    let (tx, rx) = mpsc::channel::<SurfaceMsg>(64);
    let (bcast, _) = broadcast::channel::<Vec<u8>>(BROADCAST_CAP);
    tokio::spawn(run_actor(id.clone(), pty, cols, rows, hooks_active, bcast, rx, state_tx, exit_tx, Vec::new(), snapshot_tx));
    SurfaceHandle { id, workspace_id, tx }
}
```

`spawn_surface_deferred`: add `snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>` as the final parameter; inside the pre-spawn loop, answer the new message with the empty grid; and pass `snapshot_tx` into `run_actor`. In the pre-spawn `select!`, add:

```rust
Some(SurfaceMsg::Snapshot { reply }) => {
    let snap = snapshot_ansi(&GridSurface::new(cols, rows));
    let _ = reply.send((snap, false));
}
```

and change the spawn call:

```rust
Ok(pty) => run_actor(actor_id, pty, cols, rows, hooks_active, bcast, rx, state_tx, exit_tx, prebuf, snapshot_tx).await,
```

`run_actor`: add `snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>` as the final parameter. Introduce a `dirty` flag, set it when output arrives, clear it on a snapshot, answer the new message, and send the final snapshot on exit. The relevant edits inside `run_actor`'s grid block:

Declare alongside the other loop locals:

```rust
let mut dirty = false;
```

In the `SurfaceMsg::AttachSnapshot` arm, after building `snap`, also clear dirty (the screen has just been handed out fresh):

```rust
Some(SurfaceMsg::AttachSnapshot { reply }) => {
    let sub = bcast.subscribe();
    let snap = snapshot_ansi(&grid);
    dirty = false;
    let _ = reply.send((snap, sub));
}
```

Add the new arm next to it:

```rust
Some(SurfaceMsg::Snapshot { reply }) => {
    let snap = snapshot_ansi(&grid);
    let was_dirty = dirty;
    dirty = false;
    let _ = reply.send((snap, was_dirty));
}
```
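
The `Snapshot` arm is a read-and-clear: it reports whether output arrived since the previous snapshot and resets the flag in the same actor turn. A minimal standalone model of that handshake is sketched below; `DirtyFlag` is a hypothetical stand-in for the actor's `dirty` local, introduced only for illustration.

```rust
/// Hypothetical stand-in for the actor's `dirty` local.
#[derive(Default)]
struct DirtyFlag(bool);

impl DirtyFlag {
    /// Called when PTY output arrives.
    fn mark(&mut self) {
        self.0 = true;
    }

    /// Called when a snapshot is taken: returns the old value and resets
    /// the flag to `false` in one step.
    fn take(&mut self) -> bool {
        std::mem::take(&mut self.0)
    }
}
```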

In the PTY output arm, when bytes arrive (the `Some(bytes) =>` branch), set `dirty = true;` after extending `pending`:

```rust
Some(bytes) => {
    pending.extend_from_slice(&bytes);
    dirty = true;
    if flush_deadline.is_none() {
        flush_deadline = Some(Instant::now() + FLUSH_INTERVAL);
    }
    if pending.len() >= FLUSH_BYTES {
        flush(&mut pending, &mut grid, &mut osc, &mut deterministic, &mut last_state, &detect_id, &bcast, &state_tx);
        flush_deadline = None;
    }
}
```

Replace the exit tail of the block (currently `let code = pty.wait(); let _ = exit_tx.send((actor_id, code));`) so that a final snapshot is sent first:

```rust
        let final_snap = snapshot_ansi(&grid);
        let _ = snapshot_tx.send(SnapshotMsg::Save(actor_id.clone(), final_snap));
        let code = pty.wait();
        let _ = exit_tx.send((actor_id, code));
    }
}
```

> Note: `actor_id` is now needed twice, once for the snapshot send and once for `exit_tx`, so the snapshot send uses `actor_id.clone()`. If the compiler still reports a move, adjust the earlier `let detect_id = id;` / `let actor_id = id.clone();` setup so that both `actor_id` (cloneable) and `detect_id` exist.

Update the existing in-file tests `attach_receives_output` and `attach_snapshot_reflects_prior_output` (and any other `spawn_surface(...)` callers in this file's tests) to pass a snapshot sender. Add `let (snap_tx, _snap_rx) = mpsc::unbounded_channel();` before each `spawn_surface` call and append `, snap_tx` to the call.

- [ ] **Step 4: Run tests to verify they pass**

Run: `cargo test -p spaceshd -- surface`
Expected: PASS, covering the two new tests plus the pre-existing surface tests (now passing the extra arg).

- [ ] **Step 5: Commit**

```bash
git add crates/spaceshd/src/surface.rs
git commit -m "feat(daemon): actor Snapshot message + dirty tracking + final snapshot on exit"
```

---

## Task 5: Snapshot writer task

**Files:**
- Modify: `crates/spaceshd/src/snapshot_store.rs`
- Test: same file (`#[cfg(test)]` module)

The writer owns the store and serializes all disk writes off the router/actor hot paths. It accepts saves and removes over one channel.

- [ ] **Step 1: Add the writer and its test**

Add to `crates/spaceshd/src/snapshot_store.rs` (`SnapshotMsg` was already defined in Task 2; this task adds only the writer and its test). The test needs tokio:

```rust
/// Spawn the writer task; returns the sender used by the router and actors.
pub fn spawn_writer(store: std::sync::Arc<dyn SnapshotStore>) -> tokio::sync::mpsc::UnboundedSender<SnapshotMsg> {
    let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel::<SnapshotMsg>();
    tokio::spawn(async move {
        while let Some(msg) = rx.recv().await {
            match msg {
                SnapshotMsg::Save(sid, snap) => store.save(&sid, &snap),
                SnapshotMsg::Remove(sid) => store.remove(&sid),
            }
        }
    });
    tx
}
```
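
Routing saves and removes through one channel means a `Remove` sent after a `Save` for the same surface is always applied after it. The sketch below illustrates that ordering with only the standard library (a thread and `std::sync::mpsc` instead of tokio) and an in-memory map instead of the JSON store. `Msg` and this `spawn_writer` are simplified stand-ins written for the example, not the plan's types.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Same two commands as the plan's `SnapshotMsg`, with `String` standing in
/// for both `SurfaceId` and `Snapshot`.
enum Msg {
    Save(String, String),
    Remove(String),
}

/// Spawn a writer thread that owns the store. Returns the sender plus a join
/// handle that yields the final store contents once every sender is dropped.
fn spawn_writer() -> (mpsc::Sender<Msg>, thread::JoinHandle<HashMap<String, String>>) {
    let (tx, rx) = mpsc::channel::<Msg>();
    let handle = thread::spawn(move || {
        let mut store = HashMap::new();
        // Messages are processed one at a time, in the order they were sent;
        // the loop ends when all senders have been dropped.
        for msg in rx {
            match msg {
                Msg::Save(id, snap) => {
                    store.insert(id, snap);
                }
                Msg::Remove(id) => {
                    store.remove(&id);
                }
            }
        }
        store
    });
    (tx, handle)
}
```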

Test:

```rust
#[tokio::test]
async fn writer_saves_and_removes() {
    let dir = tmp_dir("writer");
    let store: std::sync::Arc<dyn SnapshotStore> = std::sync::Arc::new(JsonSnapshotStore::new(dir.clone()));
    let tx = spawn_writer(store.clone());
    let sid = SurfaceId("s_w".into());

    tx.send(SnapshotMsg::Save(sid.clone(), sample())).unwrap();
    // Poll until the writer has flushed (bounded).
    let mut saved = None;
    for _ in 0..50 {
        if let Some(s) = store.load(&sid) { saved = Some(s); break; }
        tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    }
    assert_eq!(saved, Some(sample()));

    tx.send(SnapshotMsg::Remove(sid.clone())).unwrap();
    let mut gone = false;
    for _ in 0..50 {
        if store.load(&sid).is_none() { gone = true; break; }
        tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    }
    assert!(gone, "writer should have removed the snapshot file");
    let _ = std::fs::remove_dir_all(dir);
}
```

- [ ] **Step 2: Run test to verify it passes**

Implementation is included above (the writer is a thin loop). Run:
`cargo test -p spaceshd writer_saves_and_removes`
Expected: PASS.

- [ ] **Step 3: Commit**

```bash
git add crates/spaceshd/src/snapshot_store.rs
git commit -m "feat(daemon): snapshot writer task (Save/Remove over one channel)"
```

---

## Task 6: Server wiring (store param, ticker, stopped-Attach reads disk, remove on close)

**Files:**
- Modify: `crates/spaceshd/src/server.rs`
- Modify: `crates/spaceshd/src/main.rs`
- Test: `crates/spaceshd/src/server.rs` (`#[cfg(test)]`)

- [ ] **Step 1: Thread the snapshot store into `serve` and `router`**

In `crates/spaceshd/src/server.rs`:

Add imports near the other `use crate::...` lines:

```rust
use crate::snapshot_store::{SnapshotStore, SnapshotMsg, spawn_writer};
```

Change `serve` to accept the store, build the writer + ticker, and pass both the writer sender and an `Arc` clone (for reads) into `router`:

```rust
pub async fn serve(
    socket: &Path,
    store: Arc<dyn StateStore>,
    event_store: Arc<dyn EventStore>,
    snapshot_store: Arc<dyn SnapshotStore>,
) -> Result<()> {
    let listener = UnixListener::bind(socket)?;
    let (router_tx, router_rx) = mpsc::channel::<ServerMsg>(256);

    // ... existing exit_tx / state_tx bridges unchanged ...

    let snapshot_tx = spawn_writer(snapshot_store.clone());

    // Periodic snapshot tick → router.
    let tick_router = router_tx.clone();
    let interval_secs = crate::config::Config::load().snapshot_interval_secs();
    tokio::spawn(async move {
        let mut tick = tokio::time::interval(Duration::from_secs(interval_secs));
        tick.tick().await; // consume the immediate first tick
        loop {
            tick.tick().await;
            if tick_router.send(ServerMsg::SnapshotTick).await.is_err() { break; }
        }
    });

    let persister = persist::spawn(store.clone(), Duration::from_millis(500));
    let initial = store.load().unwrap_or_default();
    let event_persister = event_store::spawn(event_store.clone(), Duration::from_millis(500));
    let event_initial = event_store.load().unwrap_or_default();
    let started_at_ms = now_millis();
    let shutdown = tokio::spawn(router(
        router_rx, router_tx.clone(), exit_tx, state_tx,
        persister, initial, event_persister, event_initial,
        started_at_ms, snapshot_store, snapshot_tx,
    ));

    // ... existing accept loop unchanged ...
}
```

Add `SnapshotTick` to the `ServerMsg` enum (around line 23):

```rust
enum ServerMsg {
    // ... existing variants ...
    SnapshotTick,
}
```

Change `router`'s signature to take the two new params (final positions):

```rust
async fn router(
    mut rx: mpsc::Receiver<ServerMsg>,
    router_tx: mpsc::Sender<ServerMsg>,
    exit_tx: mpsc::UnboundedSender<(SurfaceId, i32)>,
    state_tx: mpsc::UnboundedSender<(SurfaceId, SurfaceState)>,
    persister: Persister,
    initial: crate::state_store::PersistState,
    event_persister: EventPersister,
    event_initial: crate::event_log::EventLogState,
    started_at_ms: u64,
    snapshot_store: Arc<dyn SnapshotStore>,
    snapshot_tx: mpsc::UnboundedSender<SnapshotMsg>,
) {
```

- [ ] **Step 2: Handle `SnapshotTick` and thread the snapshot sender to spawns**

In the `router` match loop, add the tick arm. It snapshots each live surface and forwards dirty ones to the writer:

```rust
ServerMsg::SnapshotTick => {
    let ids: Vec<SurfaceId> = reg.live_ids();
    for sid in ids {
        let Some(handle) = reg.live(&sid) else { continue };
        let (reply_tx, reply_rx) = oneshot::channel();
        if handle.tx.send(SurfaceMsg::Snapshot { reply: reply_tx }).await.is_err() { continue; }
        if let Ok((snap, dirty)) = reply_rx.await {
            if dirty {
                let _ = snapshot_tx.send(SnapshotMsg::Save(sid.clone(), snap));
            }
        }
    }
}
```

This needs a `live_ids()` accessor on `Registry`. In `crates/spaceshd/src/registry.rs` add:

```rust
/// Ids of all currently-live surfaces.
pub fn live_ids(&self) -> Vec<SurfaceId> {
    self.live.keys().cloned().collect()
}
```

Pass `snapshot_tx.clone()` into every `spawn_from_spec(...)` call inside `handle_request`. There are four callsites (NewSurface, SplitSurface, ApplyPreset, RestartSurface). Each currently ends `..., state_tx.clone(), exit_tx.clone())`; change to `..., state_tx.clone(), exit_tx.clone(), snapshot_tx.clone())`. To make `snapshot_tx` reachable inside `handle_request`, add it as a parameter to `handle_request` and pass it from the `ServerMsg::Request` arm:

```rust
ServerMsg::Request { id, cmd, client, out } => {
    handle_request(id, cmd, client, out, &mut reg, &mut subs, &clients,
        &router_tx, &exit_tx, &state_tx, &persister,
        &mut event_log, &event_persister, started_at_ms, &mut config,
        &snapshot_store, &snapshot_tx).await;
}
```

and in `handle_request`'s signature add the two trailing params:

```rust
snapshot_store: &Arc<dyn SnapshotStore>,
snapshot_tx: &mpsc::UnboundedSender<SnapshotMsg>,
```

- [ ] **Step 3: Stopped-`Attach` returns the disk snapshot; close/remove deletes it**

In the `Cmd::Attach` handler, replace the stopped-panel branch (the `else` that returns the empty snapshot) with a disk read:

```rust
} else {
    // stopped panel: no live stream. Paint the last on-disk screen if we have one.
    match snapshot_store.load(&surface_id) {
        Some(snap) => {
            let _ = out.send(ok(id, serde_json::json!({
                "snapshot": snap.ansi, "cols": snap.cols, "rows": snap.rows,
                "cursor_row": snap.cursor_row, "cursor_col": snap.cursor_col, "stopped": true,
            }))).await;
        }
        None => {
            let _ = out.send(ok(id, serde_json::json!({ "snapshot": "", "cols": 0, "rows": 0, "stopped": true }))).await;
        }
    }
}
```

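The two reply shapes can be stated as a pure function, which makes the fallback easy to unit-test without a daemon. This is a sketch under assumptions: `StoppedAttach` and `stopped_attach_reply` are hypothetical names, and the integer widths of the `Snapshot` fields are guesses, since the plan does not state them here.

```rust
/// Hypothetical stand-in for the persisted snapshot; field types are assumed.
#[derive(Clone, Debug, PartialEq)]
pub struct Snapshot {
    pub ansi: String,
    pub cols: u16,
    pub rows: u16,
    pub cursor_row: u16,
    pub cursor_col: u16,
}

/// Typed view of the stopped-`Attach` reply payload.
#[derive(Debug, PartialEq)]
pub struct StoppedAttach {
    pub snapshot: String,
    pub cols: u16,
    pub rows: u16,
    /// `(row, col)`; `None` mirrors the JSON reply that omits both cursor keys.
    pub cursor: Option<(u16, u16)>,
    pub stopped: bool,
}

/// Map "what the store returned" to "what the client is told".
pub fn stopped_attach_reply(saved: Option<Snapshot>) -> StoppedAttach {
    match saved {
        // A saved screen exists: return it with its geometry and cursor.
        Some(s) => StoppedAttach {
            snapshot: s.ansi,
            cols: s.cols,
            rows: s.rows,
            cursor: Some((s.cursor_row, s.cursor_col)),
            stopped: true,
        },
        // Nothing on disk: an empty screen, still flagged as stopped.
        None => StoppedAttach {
            snapshot: String::new(),
            cols: 0,
            rows: 0,
            cursor: None,
            stopped: true,
        },
    }
}
```

Either way the client sees `stopped == true`, so the overlay logic does not need to distinguish "no snapshot yet" from "snapshot available" beyond checking for an empty string.
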
In the `Cmd::Close` handler and `Cmd::CloseWorkspace` handler, after the surface(s) are removed, drop their snapshot files. For `Close { surface_id }` add, right after `reg.remove_surface(&surface_id)` (or wherever the removal happens):

```rust
let _ = snapshot_tx.send(SnapshotMsg::Remove(surface_id.clone()));
```

For `CloseWorkspace { workspace_id }`, the handler already collects `let ids = reg.close_workspace(&workspace_id);`. After the existing cleanup loop, add:

```rust
for sid in &ids { let _ = snapshot_tx.send(SnapshotMsg::Remove(sid.clone())); }
```

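One plausible benefit of sending `Remove` over the same channel as `Save` is ordering: a removal queued after a save for the same surface is processed after it, so a closed surface's file is not recreated by a save that was already in flight. The sketch below models the writer with an in-memory map instead of files; `drain_writer` and the `String` payloads are hypothetical simplifications.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

/// Simplified writer messages: surface id plus the ANSI text to persist.
pub enum SnapshotMsg {
    Save(String, String),
    Remove(String),
}

/// Process messages strictly in arrival order until all senders are dropped,
/// then return what would be left on disk.
pub fn drain_writer(rx: mpsc::Receiver<SnapshotMsg>) -> HashMap<String, String> {
    let mut store = HashMap::new();
    for msg in rx {
        match msg {
            // The latest save for an id overwrites any earlier one.
            SnapshotMsg::Save(id, ansi) => {
                store.insert(id, ansi);
            }
            // Removing an id that was never saved is a harmless no-op.
            SnapshotMsg::Remove(id) => {
                store.remove(&id);
            }
        }
    }
    store
}
```
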
- [ ] **Step 4: Update `main.rs` to build and pass the store**

In `crates/spaceshd/src/main.rs`, in `run_daemon`, after the event store is built:

```rust
let snapshots_dir = lifecycle::spacesh_dir()?.join("snapshots");
let snapshot_store: std::sync::Arc<dyn snapshot_store::SnapshotStore> =
    std::sync::Arc::new(snapshot_store::JsonSnapshotStore::new(snapshots_dir));
eprintln!("spaceshd listening on {}", sock.display());
server::serve(&sock, store, event_store, snapshot_store).await
```

- [ ] **Step 5: Fix all `serve(...)` test callsites**

In `crates/spaceshd/src/server.rs`'s `#[cfg(test)]` module there are ~12 calls of the form `serve(&sockX, store, event_store)` (and `..._b` variants). Append an `Arc`-wrapped `NullSnapshotStore` as the final argument to each. Add this import inside the test module:

```rust
use crate::snapshot_store::NullSnapshotStore;
```

and change each call, e.g.:

```rust
tokio::spawn(async move {
    let _ = serve(&sock_for_task, store2, event_store, std::sync::Arc::new(NullSnapshotStore)).await;
});
```

Add the same trailing argument, `std::sync::Arc::new(NullSnapshotStore)`, to **every** `serve(...)` call in the test module (~12 sites, including the `_b` second-daemon ones). Compilation fails until all are updated, so use the compiler errors as the checklist.

- [ ] **Step 6: Write the stopped-Attach integration test**

Add a new test in the `server.rs` test module. It starts a daemon with a real `JsonSnapshotStore` over a temp dir, opens a workspace and a surface, lets the surface print and exit so the actor's on-exit final snapshot lands on disk, then attaches a fresh client and asserts that the disk snapshot comes back for the stopped surface. Do not close the surface to force the snapshot: per Step 3, `Close` deletes the snapshot file.

```rust
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn stopped_attach_returns_disk_snapshot() {
    let _serial = crate::test_support::serial();
    let dir = unique_tmp_dir("stopped-snap"); // use the module's existing temp-dir helper
    let sock = dir.join("sock");
    let store: std::sync::Arc<dyn crate::state_store::StateStore> =
        std::sync::Arc::new(crate::state_store::JsonStateStore::new(dir.join("state.json")));
    let event_store: std::sync::Arc<dyn crate::event_store::EventStore> =
        std::sync::Arc::new(crate::event_store::JsonEventStore::new(dir.join("events.json")));
    let snap_store: std::sync::Arc<dyn crate::snapshot_store::SnapshotStore> =
        std::sync::Arc::new(crate::snapshot_store::JsonSnapshotStore::new(dir.join("snapshots")));
    let sock2 = sock.clone();
    tokio::spawn(async move { let _ = serve(&sock2, store, event_store, snap_store).await; });
    wait_for_socket(&sock).await; // module helper

    let mut c = connect(&sock).await; // module helper
    let ws = open_workspace(&mut c, dir.to_str().unwrap()).await; // adapt to existing helpers
    let sid = new_surface(&mut c, &ws, Some("/bin/sh"), vec!["-c".into(), "printf SNAPDISK; sleep 0.2".into()]).await;

    // Let it print and exit; the actor sends a final snapshot on exit.
    tokio::time::sleep(Duration::from_millis(500)).await;

    // Fresh client attaches to the now-stopped surface.
    let mut c2 = connect(&sock).await;
    let r = req(&mut c2, 99, Cmd::Attach { surface_id: spacesh_proto::SurfaceId(sid.clone()) }).await;
    let data = res_data(&r);
    assert_eq!(data["stopped"], serde_json::json!(true));
    assert!(data["snapshot"].as_str().unwrap().contains("SNAPDISK"), "snapshot: {:?}", data["snapshot"]);
    let _ = std::fs::remove_dir_all(dir);
}
```

> Adapt the helper calls (`unique_tmp_dir`, `wait_for_socket`, `connect`, `open_workspace`/`new_surface`, `req`, `res_data`) to the exact helpers already used by the neighbouring tests (see `reattach_returns_snapshot_with_prior_output` for the established pattern). The assertions are the contract: `stopped == true` and the ANSI contains the printed marker.

- [ ] **Step 7: Run tests**

Run: `cargo test -p spaceshd`
Expected: PASS for all daemon tests, including the new `stopped_attach_returns_disk_snapshot`. Watch for any missed `serve(...)` callsite (compile error) and fix.

- [ ] **Step 8: Commit**

```bash
git add crates/spaceshd/src/server.rs crates/spaceshd/src/main.rs crates/spaceshd/src/registry.rs
git commit -m "feat(daemon): periodic snapshot ticker + stopped-attach reads disk snapshot + cleanup on close"
```

---

## Task 7: Protocol — `RestartSurface` gains `resume`

**Files:**
- Modify: `crates/spacesh-proto/src/message.rs`
- Test: same file (`#[cfg(test)]`)

- [ ] **Step 1: Write the failing test**

Add to the `tests` module in `crates/spacesh-proto/src/message.rs`:

```rust
#[test]
fn restart_surface_resume_defaults_false_and_round_trips() {
    // Legacy frame without `resume` decodes to false.
    let legacy = r#"{"kind":"req","id":5,"cmd":{"cmd":"restart_surface","args":{"surface_id":"s_1"}}}"#;
    let env: Envelope = serde_json::from_str(legacy).unwrap();
    match env {
        Envelope::Req { cmd: Cmd::RestartSurface { resume, .. }, .. } => assert!(!resume),
        _ => panic!("wrong variant"),
    }
    // resume=true round-trips.
    let e = Envelope::Req { id: 6, cmd: Cmd::RestartSurface { surface_id: SurfaceId("s_1".into()), resume: true } };
    let back: Envelope = serde_json::from_str(&serde_json::to_string(&e).unwrap()).unwrap();
    assert_eq!(back, e);
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `cargo test -p spacesh-proto restart_surface_resume`
Expected: FAIL, because `Cmd::RestartSurface` has no `resume` field.

- [ ] **Step 3: Add the field**

In `crates/spacesh-proto/src/message.rs`, change the variant:

```rust
    RestartSurface {
        surface_id: SurfaceId,
        // Absent in legacy frames; serde fills in `false`.
        #[serde(default)]
        resume: bool,
    },
```

- [ ] **Step 4: Run test to verify it passes**

Run: `cargo test -p spacesh-proto`
Expected: all green.

> This breaks the daemon and Tauri callers that construct `Cmd::RestartSurface`. They are fixed in Tasks 8 and 9. A whole-workspace build at this point fails to compile in those crates; that is expected and resolved by the next tasks.

- [ ] **Step 5: Commit**

```bash
git add crates/spacesh-proto/src/message.rs
git commit -m "feat(proto): RestartSurface gains resume flag (defaults false)"
```

---

## Task 8: Server honors `resume`

**Files:**
- Modify: `crates/spaceshd/src/server.rs`
- Test: same file (`#[cfg(test)]`)

- [ ] **Step 1: Write the failing test for the pure helper**

Add a unit test (no process spawn) for a helper that swaps args when resuming:

```rust
#[test]
fn resume_spec_swaps_args_when_mapped() {
    use spacesh_proto::workspace::SurfaceSpec;
    let spec = SurfaceSpec {
        command: "claude".into(), args: vec!["--foo".into()], cwd: "/tmp".into(),
        agent_label: Some("claude".into()), cols: 80, rows: 24, autostart: false,
    };
    let cfg = crate::config::Config::default();
    // resume=false → original args
    let plain = resume_spec(&spec, false, &cfg);
    assert_eq!(plain.args, vec!["--foo".to_string()]);
    // resume=true with a default mapping → resume args
    let resumed = resume_spec(&spec, true, &cfg);
    assert_eq!(resumed.args, vec!["--continue".to_string()]);
    // resume=true for an unmapped command → original args (graceful fallback)
    let mut shell = spec.clone();
    shell.command = "bash".into();
    let resumed_shell = resume_spec(&shell, true, &cfg);
    assert_eq!(resumed_shell.args, shell.args);
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `cargo test -p spaceshd resume_spec_swaps_args_when_mapped`
Expected: FAIL, because `resume_spec` is not defined.

- [ ] **Step 3: Implement the helper and use it in the handler**

Add the helper near `spawn_env` in `crates/spaceshd/src/server.rs`:

```rust
/// Build the spawn spec for a (re)start. When `resume` and the command has a
/// resume mapping, its args are replaced with the resume args; otherwise the
/// original spec args are kept.
fn resume_spec(
    spec: &spacesh_proto::workspace::SurfaceSpec,
    resume: bool,
    cfg: &crate::config::Config,
) -> spacesh_proto::workspace::SurfaceSpec {
    let mut out = spec.clone();
    if resume {
        if let Some(args) = cfg.resume_args(&spec.command) {
            out.args = args;
        }
    }
    out
}
```

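`Config::resume_args` comes from the earlier `[resume]` config task and is not shown in this section. As a self-contained sketch of how the two pieces might fit together, the block below assumes the config holds a command-to-args map; the `resume` field, the `with_mapping` builder, and the trimmed-down `SurfaceSpec` are hypothetical and exist only to make the example compile on its own.

```rust
use std::collections::HashMap;

/// Trimmed-down spec: only the fields `resume_spec` reads or writes.
#[derive(Clone, Debug, PartialEq)]
pub struct SurfaceSpec {
    pub command: String,
    pub args: Vec<String>,
}

/// Assumed config shape: command name -> args that continue its last session.
#[derive(Default)]
pub struct Config {
    pub resume: HashMap<String, Vec<String>>,
}

impl Config {
    /// Hypothetical builder used to set up a mapping in examples and tests.
    pub fn with_mapping(mut self, command: &str, args: &[&str]) -> Self {
        let args = args.iter().map(|a| a.to_string()).collect();
        self.resume.insert(command.to_string(), args);
        self
    }

    /// `None` means "no resume mapping", which callers treat as a plain restart.
    pub fn resume_args(&self, command: &str) -> Option<Vec<String>> {
        self.resume.get(command).cloned()
    }
}

/// Same logic as the helper above, against the simplified types.
pub fn resume_spec(spec: &SurfaceSpec, resume: bool, cfg: &Config) -> SurfaceSpec {
    let mut out = spec.clone();
    if resume {
        if let Some(args) = cfg.resume_args(&spec.command) {
            // Resume args replace the original args; they are not appended.
            out.args = args;
        }
    }
    out
}
```

Note that the original args are dropped entirely on resume, so a flag the user passed at first launch (`--foo` in the test) is not carried into the resumed process unless the mapping includes it.
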
Update the `Cmd::RestartSurface` handler to destructure `resume` and spawn from the resume spec:

```rust
Cmd::RestartSurface { surface_id, resume } => {
    if reg.is_running(&surface_id) {
        let _ = out.send(ok(id, serde_json::Value::Null)).await; return; // already running
    }
    let Some(spec) = reg.surface_spec(&surface_id) else {
        let _ = out.send(err(id, "NOT_FOUND", "surface")).await; return;
    };
    // Swap in the resume args (if requested and mapped) before building the env.
    let spec = resume_spec(&spec, resume, config);
    let ws_id = reg.workspace_of(&surface_id).unwrap();
    let (env, hooks_active) = spawn_env(&surface_id, &spec);
    match crate::surface::spawn_from_spec(surface_id.clone(), ws_id.clone(), &spec, env, hooks_active, state_tx.clone(), exit_tx.clone(), snapshot_tx.clone()) {
        Ok(handle) => {
            spawn_output_bridge(surface_id.clone(), &handle, router_tx.clone());
            reg.set_live(handle);
            reg.set_state(&surface_id, spacesh_proto::SurfaceState::Idle);
            broadcast_evt(clients, &Envelope::Evt(Evt::SurfaceRestarted { surface_id: surface_id.clone() }));
            let _ = out.send(ok(id, serde_json::Value::Null)).await;
        }
        Err(e) => { let _ = out.send(err(id, "SPAWN_FAILED", &e.to_string())).await; }
    }
}
```

> `config` is the `&mut Config` already in scope in `handle_request`. `resume_spec` takes `&Config`, so passing `config` directly works (a `&mut Config` reborrows as `&Config`); write `&*config` if the compiler asks for an explicit reborrow.

Note: the `snapshot_tx.clone()` added to this `spawn_from_spec` call is the same one threaded in Task 6 Step 2. Ensure all four spawn callsites carry it.

- [ ] **Step 4: Run tests to verify they pass**

Run: `cargo test -p spaceshd resume_spec_swaps_args_when_mapped`
Expected: PASS. Then run `cargo test -p spaceshd`; all tests should be green.

- [ ] **Step 5: Commit**

```bash
git add crates/spaceshd/src/server.rs
git commit -m "feat(daemon): RestartSurface honors resume: swap to resume_args when mapped"
```

---

## Task 9: Tauri bridge + socketBridge resume arg

**Files:**
- Modify: `app/src-tauri/src/bridge.rs`
- Modify: `app/src/socketBridge.ts`
- Test: `cd app && npx tsc --noEmit`

- [ ] **Step 1: Update the Tauri command**

In `app/src-tauri/src/bridge.rs`, change `restart_surface` to accept and forward `resume`:

```rust
#[tauri::command]
pub async fn restart_surface(state: BridgeState<'_>, surface_id: String, resume: bool) -> Result<Value, String> {
    data_of(state.request(Cmd::RestartSurface { surface_id: SurfaceId(surface_id), resume }).await.map_err(|e| e.to_string())?)
}
```

(Any other place in `bridge.rs` that constructs `Cmd::RestartSurface` must also pass `resume`. The version-handshake and attach code do not construct it; only this handler does.)

- [ ] **Step 2: Update the JS binding and AttachResult**

In `app/src/socketBridge.ts`:

```ts
export interface AttachResult {
  snapshot: string;
  cols: number;
  rows: number;
  cursor_row?: number;
  cursor_col?: number;
  stopped?: boolean;
}

export async function restartSurface(surfaceId: string, resume = false): Promise<void> {
  await invoke("restart_surface", { surfaceId, resume });
}
```

- [ ] **Step 3: Verify types compile**

Run: `cd app && npx tsc --noEmit`
Expected: PASS (no type errors). Note: existing callers of `restartSurface(id)` remain valid because `resume` defaults to `false`.

Also build the Rust side: `cargo check -p spaceshd` and `cargo check --manifest-path app/src-tauri/Cargo.toml` (or `cargo check` in `app/src-tauri`).
Expected: clean.

- [ ] **Step 4: Commit**

```bash
git add app/src-tauri/src/bridge.rs app/src/socketBridge.ts
git commit -m "feat(app): plumb resume flag through restart_surface bridge + binding"
```

---

## Task 10: Stopped overlay — paint last screen + Resume button

**Files:**
- Modify: `app/src/LayoutEngine.tsx`
- Test: `cd app && npx tsc --noEmit` + manual check

- [ ] **Step 1: Add a read-only snapshot painter component**

In `app/src/LayoutEngine.tsx`, add a small component that fetches the stopped surface's disk snapshot via `attachSurface` and paints it into a dimmed, read-only xterm. Import what is needed at the top of the file:

```tsx
import { useEffect, useRef } from "react";
import { Terminal } from "@xterm/xterm";
import { attachSurface } from "./socketBridge";
```

(Confirm against `TerminalView.tsx` for the exact xterm import path and theme/font options it uses; mirror them so the dimmed preview matches the live terminal's look. Reuse the same `font`/`palette` props already threaded into `Leaf`.)

```tsx
function StoppedSnapshot({ surfaceId, font, palette }: { surfaceId: string; font: TermFont; palette: TermPalette }) {
  const hostRef = useRef<HTMLDivElement | null>(null);
  useEffect(() => {
    const host = hostRef.current;
    if (!host) return;
    const term = new Terminal({
      fontFamily: font.family,
      fontSize: font.size,
      theme: palette,
      cursorBlink: false,
      disableStdin: true,
      convertEol: false,
      scrollback: 0,
    });
    term.open(host);
    let disposed = false;
    void attachSurface(surfaceId, () => {})
      .then((res) => {
        // Skip the write if the component unmounted while the attach was in flight.
        if (!disposed && res.snapshot) term.write(res.snapshot);
      })
      .catch(() => { /* attach failed: leave the preview blank */ });
    return () => { disposed = true; term.dispose(); };
  }, [surfaceId, font, palette]);
  return <div ref={hostRef} style={{ position: "absolute", inset: 0, opacity: 0.45, pointerEvents: "none" }} />;
}
```

> Use the exact `TermFont`/`TermPalette` types already defined/imported in this file for the `font`/`palette` props (see `Leaf`'s props). If `TerminalView` wraps `Terminal` construction in a helper, prefer reusing that helper instead of constructing `Terminal` directly.

- [ ] **Step 2: Render the snapshot + Resume button in the stopped branch**

Replace the `if (running[id] === false) { ... }` block in `Leaf` with one that layers the snapshot behind centered controls and adds a Resume button. Keep the existing `RotateCw`/`Minimize2` imports; add `Play` from lucide-react at the file's icon import.

```tsx
if (running[id] === false) {
  return card(
    <div style={{ position: "relative", height: "100%", width: "100%" }}>
      <StoppedSnapshot surfaceId={id} font={font} palette={palette} />
      <div style={{ position: "absolute", inset: 0, display: "flex", alignItems: "center", justifyContent: "center", flexDirection: "column", gap: 10, color: COLORS.textSecondary, background: "rgba(0,0,0,0.35)" }}>
        <div style={{ fontFamily: FONT.mono, fontSize: 13 }}>Stopped</div>
        <div style={{ display: "flex", gap: 8 }}>
          <button onClick={() => void restartSurface(id, true)}
            style={{ display: "flex", alignItems: "center", gap: 6, padding: "6px 14px", background: COLORS.accent, color: COLORS.bgApp, border: "none", borderRadius: 7, fontSize: 12, fontWeight: 600 }}>
            <Play size={13} /> Resume
          </button>
          <button onClick={() => void restartSurface(id, false)}
            style={{ display: "flex", alignItems: "center", gap: 6, padding: "6px 14px", background: COLORS.bgElevated, color: COLORS.textPrimary, border: `1px solid ${COLORS.borderStrong}`, borderRadius: 7, fontSize: 12 }}>
            <RotateCw size={13} /> Restart fresh
          </button>
          {zoomed === id && (
            <button onClick={() => void setZoom(workspaceId, null)}
              style={{ display: "flex", alignItems: "center", gap: 6, padding: "6px 14px", background: "transparent", color: COLORS.textSecondary, border: `1px solid ${COLORS.borderStrong}`, borderRadius: 7, fontSize: 12 }}>
              <Minimize2 size={13} /> Exit zoom
            </button>
          )}
        </div>
      </div>
    </div>
  );
}
```

- [ ] **Step 3: Verify types compile**

Run: `cd app && npx tsc --noEmit`
Expected: PASS.

- [ ] **Step 4: Manual verification**

Build and run (`make reinstall` then launch, or `make dev`). Steps:
1. Open a workspace, add a `claude` (or shell) panel, let it print output, and wait a few seconds so at least one snapshot tick (5s cadence) has been written.
2. Quit the GUI and `pkill -x spaceshd` (simulate reboot), then relaunch the app.
3. The panel shows its **last screen dimmed** with **Resume** + **Restart fresh**.
4. Click **Resume** → the agent relaunches (for claude/codex with its continue flag) and the live terminal returns.

Confirm that keypress→echo still feels instant and that there is no prompt-duplication regression on focus switches.

- [ ] **Step 5: Commit**

```bash
git add app/src/LayoutEngine.tsx
git commit -m "feat(app): stopped panel paints last screen + Resume/Restart fresh controls"
```

---

## Final verification

- [ ] Run the full suite:

```bash
cargo test
cd app && npx tsc --noEmit
```

Expected: all Rust tests pass; tsc clean.

- [ ] Dispatch a final code review over the whole branch, then use **superpowers:finishing-a-development-branch** to merge.

## Notes / gotchas

- **Snapshot tick blocks the router briefly** while it awaits each live actor's reply. Visible-screen snapshots are tiny and the awaits are sequential, one per surface; at a 5s cadence the cost is negligible. Do not move the disk write into the router; it stays in the writer task.
- **Resume is best-effort.** A new process is started; the literal in-flight process cannot survive a daemon death. For agents without a resume mapping, Resume == Restart fresh (original args).
- **`actor_id` move in `run_actor`:** the final-snapshot send needs `actor_id` before `exit_tx.send(...)` consumes it, so clone it for the snapshot message as the compiler directs.
- **Do not silently skip any `serve(...)` test callsite** (Task 6 Step 5): the compiler enumerates them; fix every one.

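The `actor_id` gotcha is an ordinary ownership issue: the id is needed in two messages, and the second send takes it by value. A minimal sketch of the clone-then-move order, assuming simplified tuple payloads; `finish_actor` and both channel shapes are hypothetical and stand in for the tail of `run_actor`.

```rust
use std::sync::mpsc;

/// Tail of a hypothetical actor: persist the last screen, then report the exit.
pub fn finish_actor(
    actor_id: String,
    code: i32,
    final_screen: String,
    snapshot_tx: &mpsc::Sender<(String, String)>,
    exit_tx: &mpsc::Sender<(String, i32)>,
) {
    // Clone for the snapshot message; `actor_id` itself is still needed below.
    let _ = snapshot_tx.send((actor_id.clone(), final_screen));
    // The exit message takes ownership of the original id (moved here).
    let _ = exit_tx.send((actor_id, code));
}
```

Sending the snapshot first also means the final screen is queued for the writer before anything reacts to the exit notification.
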