# oh-my-pi Distillation for pi-crew Date: 2026-05-05 Source repo: `Source/oh-my-pi` at `1d898a7fe chore: bump version to 14.5.3`. ## Scope Read Read-only exploration covered four source areas: - Agent/provider runtime: `packages/agent`, `packages/ai`. - Main CLI/session/task implementation: `packages/coding-agent`. - TUI, extensions, hooks, skills, marketplace, rulebook docs and implementation. - Native/Rust reliability/performance/release docs and implementation. Representative files and docs inspected: - `packages/agent/src/agent-loop.ts`, `packages/agent/src/agent.ts`, `packages/agent/src/types.ts`. - `packages/ai/src/stream.ts`, `packages/ai/src/model-manager.ts`, `packages/ai/src/utils/{abort,retry,event-stream,overflow}.ts`, provider adapters. - `packages/coding-agent/src/session/*`, `src/extensibility/{hooks,slash-commands,skills,plugins}/*`, `src/task/*`, `src/edit/*`, prompts. - `packages/tui/src/tui.ts`, `docs/tui*.md`, `docs/extensions.md`, `docs/hooks.md`, `docs/skills.md`, `docs/marketplace.md`, `docs/rulebook-matching-pipeline.md`. - `crates/pi-natives/src/{task,shell,pty,fs_cache,glob,fd,grep}.rs`, natives docs, install/release scripts. This document rewrites the useful ideas as pi-crew-native patterns. It does not vendor or copy source code. ## High-Value Patterns to Adopt ### 1. Separate durable run history from provider/model context oh-my-pi keeps rich internal session messages separate from LLM-compatible provider messages. Custom events, UI messages, hook entries, and branch/compaction entries can live in durable history, while a conversion layer decides what reaches the model. pi-crew application: - Keep `TeamRunManifest`, task records, mailbox messages, artifacts, worker events, and review/verification notes as durable run history. - Add a projection/conversion step before worker prompt/model invocation: - `transformRunContextBeforeWorkerStart(...)` for pruning/context injection. - `convertRunHistoryToWorkerPrompt(...)` for provider/child-Pi compatible text. - Avoid treating UI/runtime events as prompt text by default. Benefit: safer compaction, mailbox summarization, and artifact hygiene without losing durable audit history. ### 2. Distinguish steering from follow-up oh-my-pi's agent runtime distinguishes interrupting current work (`steer`) from continuing after the agent would otherwise stop (`followUp`). pi-crew application: - Model leader/operator messages as two queues: - `steeringQueue`: urgent cancellation, nudge, priority change, user answer while worker is active. - `followUpQueue`: review/verification/documentation after a task reaches a natural stop. - Default to one-at-a-time delivery to reduce context shock. - Persist queue entries and delivery status in task mailbox/state. Benefit: clearer interactive semantics than a single generic respond/resume path. ### 3. Preserve invariants on cancellation and abort oh-my-pi propagates `AbortSignal` through model streaming and tool execution, distinguishes caller abort from provider-local watchdog abort, and emits synthetic tool results when abort happens after tool calls were started. pi-crew application: - Use structured cancel reasons: - `caller_cancelled` - `leader_interrupted` - `provider_timeout` - `worker_timeout` - `tool_timeout` - `shutdown` - If a worker/tool/action has started but is cancelled, emit a terminal synthetic event/result so task history has no dangling operation. - Add non-abortable cleanup/finalize phases for artifact preservation and state unlock. Benefit: fewer stuck `running` tasks and clearer recovery after cancellation. ### 4. Batch-aware execution with shared vs exclusive operations oh-my-pi marks tools with concurrency semantics: shared tools can run concurrently, exclusive tools serialize around shared/exclusive peers, and queued tools can be skipped when steering arrives. pi-crew application: - Classify worker subtasks or internal operations: - shared: read-only exploration, status, grep, artifact reads. - exclusive: edits, package manifests, lockfiles, migration/schema updates, worktree merge. - Attach `batchId`, `index`, `total`, and `conflictKey` metadata to task execution. - On new steering, skip not-yet-started low-priority operations with explicit skip reason. Benefit: safer parallelism and more auditable conflict handling. ### 5. Intent tracing for destructive/tool actions oh-my-pi optionally injects an intent field into tool schemas, strips it before execution, and keeps it for auditability. pi-crew application: - Add optional `_intent`/`intent` metadata to worker tool/action events. - Require intent for destructive actions: cancel, delete, prune, force cleanup, edits, package publish, worktree removal. - Store intent in events/artifacts but never pass it to low-level execution APIs if not needed. Benefit: reviewable why/what for high-risk actions without changing execution payloads. ### 6. Event-first UI with tiny component contract and coalesced rendering oh-my-pi TUI uses small components (`render(width)`, `handleInput`, `invalidate`) and event-driven, coalesced rendering. Components must be width-safe and lifecycle-clean. pi-crew application: - Keep dashboards/widgets as projections from snapshot/event state, not direct filesystem scanners. - Continue using render scheduler/coalescing; add width-safety tests for all dashboard panes/widgets. - Components should expose `dispose()` for timers/theme subscriptions. - UI event stream should be semantic (`task_started`, `worker_status`, `mailbox_updated`) rather than raw file polling. Benefit: avoids UI freezes and makes live views predictable. ### 7. Two-phase extension lifecycle oh-my-pi extensions have a registration phase where side-effecting runtime methods are unavailable, followed by an initialized phase with real context/actions. pi-crew application: - If pi-crew grows plugin/extension support, split APIs into: - `registerCrewExtension(api)`: declare teams, workflows, hooks, commands, renderers. - `initializeCrewExtension(context)`: subscribe to events, perform side effects. - In headless mode, UI APIs should be explicit no-ops or unavailable via `hasUI`. - Loader should collect extension errors without breaking builtin teams. Benefit: fewer load-time side effects and safer third-party extensibility. ### 8. Unified capability inventory/control center oh-my-pi normalizes extensions, skills, rules, tools, hooks, MCPs, prompts, and slash commands into a shared dashboard model with active/disabled/shadowed states. pi-crew application: - Extend `/team-settings` or add `/team-control` to show a unified inventory: - teams, workflows, agents, skills, hooks/policies, tools, runtime providers. - Normalize each item to: - `id`, `kind`, `name`, `description`, `source`, `path`, `state`, `disabledReason`, `shadowedBy`, `raw`. - Persist disables by stable capability ID, not file path. Benefit: better operator experience for complex multi-resource setups. ### 9. Hooks as typed lifecycle gates, not ad-hoc shell glue oh-my-pi hooks cover session lifecycle, before-agent-start, tool-call gates, tool-result transforms, and compaction events. Blocking hooks are scoped; non-blocking hook errors are captured but do not crash streaming. pi-crew application: - Define typed crew hooks: - `before_run_start` - `before_task_start` - `task_result` - `before_cancel` - `before_publish` - `session_before_switch` - `run_recovery` - Mark hooks as blocking or non-blocking. - Capture hook errors into diagnostics/status, not uncontrolled exceptions. Benefit: safer customization for policy/security/release gates. ### 10. Prompt pipeline should be explicit oh-my-pi applies slash/custom commands, templates, compaction, file mentions, hook injection, and model validation in a clear order before calling the agent. pi-crew application: Define a worker prompt pipeline: 1. Parse orchestration command/control intent. 2. Expand prompt templates/task packet. 3. Attach selected context/artifact/mailbox summaries. 4. Run `before_worker_start` hooks. 5. Persist exact task packet/artifacts. 6. Launch worker. Benefit: reproducible worker prompts and easier debugging of context injection. ### 11. Session/run history as append-only tree oh-my-pi persists session entries with parent relationships. Branching/forking moves the current leaf rather than rewriting past history. pi-crew application: - Keep `events.jsonl` append-only and add optional `parentEventId` / `attemptId` / `branchId` fields for retries/forks. - Represent retry attempts as child branches from the original task prompt/result. - Preserve old failed attempts instead of overwriting task state only. Benefit: better auditability and replay/debug of retries. ### 12. Cooperative cancellation token for long loops oh-my-pi native code uses cancel tokens with deadlines, abort signals, `heartbeat()`, and async wait. Long loops over external-size input must heartbeat at bounded cadence. pi-crew application: - Add a TS `CancellationToken` utility for internal long-running loops: - `heartbeat(stage?: string)` - `throwIfCancelled()` - `wait()` - `abort(reason)` - Require it in scanners over runs, artifacts, mailboxes, worktrees, and event logs. Benefit: bounded shutdown/cancel latency and easier stuck-loop diagnostics. ### 13. Process lifecycle: graceful cancel, forced kill, then non-reuse oh-my-pi shell/PTY runtime cancels gracefully, waits a grace window, forces abort/kill, drains output for bounded windows, and discards persistent sessions after cancellation/errors. pi-crew application: - For child Pi workers: - send graceful abort/TERM; - wait `graceMs`; - force-kill process tree; - drain stdout/stderr for bounded time; - mark session non-reusable after timeout/protocol error/cancel. - Return typed status `{ exitCode, cancelled, timedOut, killed, cleanupErrors }`. Benefit: more deterministic worker cleanup and fewer zombie/stale runs. ### 14. Reserve control channel before async worker start oh-my-pi PTY reserves its control channel before async process start, rejects duplicate starts, and always clears state in completion. pi-crew application: - Install a `WorkerRunCore`/controller synchronously before spawn returns. - Expose cancel/steer immediately, even while startup is still in progress. - Clear controller in `finally` and persist terminal state. Benefit: closes race windows where operator cannot cancel a starting worker. ### 15. Cache scan entries, not final query results oh-my-pi native search caches directory entries and applies query-specific filters/scoring later. Empty stale caches trigger rescan; ordering is deterministic. pi-crew application: - For run/artifact/mailbox discovery, cache raw entries/stats rather than final UI results. - Apply active-status/mailbox/health filters after cache retrieval. - Invalidate cache after state mutation. - Use deterministic sort keys for dashboards and summaries. Benefit: faster UI/status with fewer stale semantic bugs. ### 16. Blob artifacts and bounded file access oh-my-pi blob-artifact design uses content addressing, metadata sidecars, streaming writes, size budgets, manifest GC, and path whitelisting. pi-crew application: - Introduce content-addressed large artifacts for worker transcripts/screenshots/log chunks. - Persist metadata sidecars with MIME, source, redaction, run/task IDs, size, hash. - Keep task prompts/results small by referencing artifact IDs. - Add GC tied to run retention. Benefit: avoids bloating task JSON/events and improves artifact security. ### 17. Native/release verification checklist mindset oh-my-pi release scripts emphasize multi-platform build artifacts, install smoke tests, spoofed-version checks, and runtime loader fallback diagnostics. pi-crew application: - For npm releases, keep a release checklist with: - typecheck; - unit/integration tests; - `npm pack --dry-run`; - install from packed tarball in temp project; - Pi extension load smoke; - version/tag/npm consistency check. Benefit: fewer broken published packages. ## Skill/Rulebook Ideas to Port oh-my-pi's skills/rulebook ecosystem suggests additional pi-crew resources: 1. `worker-prompt-pipeline` skill: prompt assembly, context projection, before-worker hooks, artifact references. 2. `typed-hook-design` skill: lifecycle gates, blocking vs non-blocking hooks, diagnostics. 3. `process-cancellation-contract` skill: graceful/force kill, synthetic terminal results, non-reuse. 4. `capability-inventory-ux` skill: normalized resource inventory and disable/shadow semantics. 5. `append-only-run-history` skill: event tree, branch/retry provenance. ## Prioritized Backlog for pi-crew ### P0 / High confidence - Fix current runtime review findings first: waiting final status, respond semantics, no-registry model routing. - Add structured cancellation reason and terminal synthetic result/event for cancelled workers. - Centralize worker prompt pipeline and persist exact prompt packets. - Add width-safety tests for dashboard/widget lines. ### P1 / Medium-term architecture - Add steering vs follow-up mailbox queues. - Add typed hook lifecycle for `before_task_start`, `task_result`, `before_cancel`, `session_before_switch`. - Add capability inventory model for teams/workflows/agents/skills/hooks/tools. - Add `CancellationToken` for long internal loops and scans. ### P2 / Larger subsystem work - Append-only run-history tree with attempt/branch parentage. - Content-addressed blob artifact store with metadata sidecars and GC. - Worker process controller installed before spawn; process non-reuse after cancel/protocol error. - Raw scan-entry cache shared by dashboard/status/artifact lookup. ## Anti-Patterns to Avoid - Building prompts from scattered inline string concatenation without a traceable pipeline. - Treating UI render as a place to perform heavy filesystem scans. - Auto-opening modal/right-sidebar UI by default when a compact widget/status line would suffice. - Dropping queued user-facing results just because session generation changed. - Cancelling a task without writing a terminal event/result. - Caching semantic query results that should be recomputed from raw state. - Letting one bad extension/resource prevent builtin operation. ## Immediate Review Questions for Future Implementation - Should pi-crew project-local skills be allowed to shadow builtin safety skills by default, or require explicit `project:` namespace? - Should `respond` enqueue durable work or only deliver to live workers? Current semantics need to become explicit. - What is the stable capability ID scheme for teams/workflows/agents/skills/hooks? - Which hook events should be blocking by default and which should be diagnostic-only? - What artifact size threshold should trigger blob storage instead of embedding content in task/events JSON?