oh-my-pi Distillation for pi-crew
Date: 2026-05-05
Source repo: Source/oh-my-pi at commit 1d898a7fe (chore: bump version to 14.5.3).
Scope Read
Read-only exploration covered four source areas:
- Agent/provider runtime: packages/agent, packages/ai.
- Main CLI/session/task implementation: packages/coding-agent.
- TUI, extensions, hooks, skills, marketplace, rulebook docs and implementation.
- Native/Rust reliability/performance/release docs and implementation.
Representative files and docs inspected:
- packages/agent/src/agent-loop.ts, packages/agent/src/agent.ts, packages/agent/src/types.ts.
- packages/ai/src/stream.ts, packages/ai/src/model-manager.ts, packages/ai/src/utils/{abort,retry,event-stream,overflow}.ts, provider adapters.
- packages/coding-agent/src/session/*, src/extensibility/{hooks,slash-commands,skills,plugins}/*, src/task/*, src/edit/*, prompts.
- packages/tui/src/tui.ts, docs/tui*.md, docs/extensions.md, docs/hooks.md, docs/skills.md, docs/marketplace.md, docs/rulebook-matching-pipeline.md.
- crates/pi-natives/src/{task,shell,pty,fs_cache,glob,fd,grep}.rs, natives docs, install/release scripts.
This document rewrites the useful ideas as pi-crew-native patterns. It does not vendor or copy source code.
High-Value Patterns to Adopt
1. Separate durable run history from provider/model context
oh-my-pi keeps rich internal session messages separate from LLM-compatible provider messages. Custom events, UI messages, hook entries, and branch/compaction entries can live in durable history, while a conversion layer decides what reaches the model.
pi-crew application:
- Keep TeamRunManifest, task records, mailbox messages, artifacts, worker events, and review/verification notes as durable run history.
- Add a projection/conversion step before worker prompt/model invocation:
  - transformRunContextBeforeWorkerStart(...) for pruning/context injection.
  - convertRunHistoryToWorkerPrompt(...) for provider/child-Pi compatible text.
- Avoid treating UI/runtime events as prompt text by default.
Benefit: safer compaction, mailbox summarization, and artifact hygiene without losing durable audit history.
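A minimal sketch of the projection step, assuming simplified entry shapes. The HistoryEntry and PromptMessage types are hypothetical and only illustrate the idea; the two function names come from the bullets above, but their signatures here are assumptions.

```typescript
// Hypothetical entry shapes; pi-crew's real record types may differ.
type HistoryEntry =
  | { kind: "task_result"; taskId: string; text: string }
  | { kind: "mailbox"; from: string; text: string }
  | { kind: "ui_event"; name: string }
  | { kind: "hook_note"; hook: string; text: string };

interface PromptMessage {
  role: "user" | "assistant";
  content: string;
}

// Pruning step: keep only the most recent `maxEntries` entries.
function transformRunContextBeforeWorkerStart(
  history: readonly HistoryEntry[],
  maxEntries: number,
): HistoryEntry[] {
  return maxEntries <= 0 ? [] : history.slice(-maxEntries);
}

// Conversion step: only task results and mailbox messages become prompt text.
// UI events and hook notes stay in durable history and are skipped here.
function convertRunHistoryToWorkerPrompt(entries: readonly HistoryEntry[]): PromptMessage[] {
  const messages: PromptMessage[] = [];
  for (const entry of entries) {
    if (entry.kind === "task_result") {
      messages.push({ role: "assistant", content: `[${entry.taskId}] ${entry.text}` });
    } else if (entry.kind === "mailbox") {
      messages.push({ role: "user", content: `${entry.from}: ${entry.text}` });
    }
  }
  return messages;
}

const runHistory: HistoryEntry[] = [
  { kind: "task_result", taskId: "t1", text: "explored packages/agent" },
  { kind: "ui_event", name: "pane_resized" },
  { kind: "hook_note", hook: "before_task_start", text: "policy ok" },
  { kind: "mailbox", from: "leader", text: "focus on cancellation paths" },
];
const workerPrompt = convertRunHistoryToWorkerPrompt(
  transformRunContextBeforeWorkerStart(runHistory, 10),
);
```

The durable history is never mutated: the projection returns new arrays, so compaction or summarization can change what the worker sees without touching the audit trail.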
2. Distinguish steering from follow-up
oh-my-pi's agent runtime distinguishes interrupting current work (steer) from continuing after the agent would otherwise stop (followUp).
pi-crew application:
- Model leader/operator messages as two queues:
  - steeringQueue: urgent cancellation, nudge, priority change, user answer while worker is active.
  - followUpQueue: review/verification/documentation after a task reaches a natural stop.
- Default to one-at-a-time delivery to reduce context shock.
- Persist queue entries and delivery status in task mailbox/state.
Benefit: clearer interactive semantics than a single generic respond/resume path.
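The two-queue model could look roughly like the sketch below. WorkerInbox and QueuedMessage are hypothetical names, and persistence of queue entries is left out; the sketch only shows the delivery rule.

```typescript
interface QueuedMessage {
  id: string;
  text: string;
  status: "queued" | "delivered";
}

class WorkerInbox {
  private readonly steeringQueue: QueuedMessage[] = [];
  private readonly followUpQueue: QueuedMessage[] = [];

  steer(id: string, text: string): void {
    this.steeringQueue.push({ id, text, status: "queued" });
  }

  followUp(id: string, text: string): void {
    this.followUpQueue.push({ id, text, status: "queued" });
  }

  // Returns at most one message per call (one-at-a-time delivery).
  // Steering is eligible at any time; follow-ups only once the worker is idle.
  next(workerIdle: boolean): QueuedMessage | undefined {
    const message =
      this.steeringQueue.shift() ?? (workerIdle ? this.followUpQueue.shift() : undefined);
    if (message !== undefined) message.status = "delivered";
    return message;
  }
}
```

A caller would invoke next(false) between worker turns and next(true) when the worker reaches a natural stop, writing the returned entry and its delivered status back to the task mailbox.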
3. Preserve invariants on cancellation and abort
oh-my-pi propagates AbortSignal through model streaming and tool execution, distinguishes caller abort from provider-local watchdog abort, and emits synthetic tool results when abort happens after tool calls were started.
pi-crew application:
- Use structured cancel reasons:
  - caller_cancelled
  - leader_interrupted
  - provider_timeout
  - worker_timeout
  - tool_timeout
  - shutdown
- If a worker/tool/action has started but is cancelled, emit a terminal synthetic event/result so task history has no dangling operation.
- Add non-abortable cleanup/finalize phases for artifact preservation and state unlock.
Benefit: fewer stuck running tasks and clearer recovery after cancellation.
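One way to enforce the no-dangling-operation invariant is to scan a task's events on cancellation and synthesize a terminal event for every operation that started but never finished. The TaskEvent shape and closeDanglingOperations are assumptions for illustration; only the cancel reason names come from the list above.

```typescript
type CancelReason =
  | "caller_cancelled"
  | "leader_interrupted"
  | "provider_timeout"
  | "worker_timeout"
  | "tool_timeout"
  | "shutdown";

// Hypothetical event shape for a single task's history.
type TaskEvent =
  | { type: "op_started"; opId: string }
  | { type: "op_finished"; opId: string; synthetic: boolean; cancelReason?: CancelReason };

// Returns one synthetic terminal event per operation that is still open.
function closeDanglingOperations(
  events: readonly TaskEvent[],
  reason: CancelReason,
): TaskEvent[] {
  const stillOpen = new Set<string>();
  for (const event of events) {
    if (event.type === "op_started") stillOpen.add(event.opId);
    else stillOpen.delete(event.opId);
  }
  return Array.from(stillOpen).map(
    (opId): TaskEvent => ({ type: "op_finished", opId, synthetic: true, cancelReason: reason }),
  );
}
```

The returned events would be appended to the task history inside the non-abortable finalize phase, so that they are written even when the cancel itself arrives mid-write.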
4. Batch-aware execution with shared vs exclusive operations
oh-my-pi marks tools with concurrency semantics: shared tools can run concurrently, exclusive tools serialize around shared/exclusive peers, and queued tools can be skipped when steering arrives.
pi-crew application:
- Classify worker subtasks or internal operations:
  - shared: read-only exploration, status, grep, artifact reads.
  - exclusive: edits, package manifests, lockfiles, migration/schema updates, worktree merge.
- Attach batchId, index, total, and conflictKey metadata to task execution.
- On new steering, skip not-yet-started low-priority operations with an explicit skip reason.
Benefit: safer parallelism and more auditable conflict handling.
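As a rough illustration, a batch can be planned into waves: consecutive shared operations run together, and each exclusive operation gets a wave of its own. This sketch is deliberately conservative and is not how oh-my-pi schedules tools; it carries conflictKey as audit metadata only, and all type and function names are hypothetical.

```typescript
interface Operation {
  id: string;
  mode: "shared" | "exclusive";
  conflictKey?: string;
}

interface PlannedOp extends Operation {
  batchId: string;
  index: number;
  total: number;
}

// Groups operations into waves; ops inside one wave may run concurrently.
function planWaves(batchId: string, ops: readonly Operation[]): PlannedOp[][] {
  const waves: PlannedOp[][] = [];
  let sharedWave: PlannedOp[] = [];
  ops.forEach((op, index) => {
    const planned: PlannedOp = { ...op, batchId, index, total: ops.length };
    if (op.mode === "exclusive") {
      if (sharedWave.length > 0) waves.push(sharedWave);
      waves.push([planned]); // exclusive ops never share a wave
      sharedWave = [];
    } else {
      sharedWave.push(planned);
    }
  });
  if (sharedWave.length > 0) waves.push(sharedWave);
  return waves;
}

// On new steering: everything that has not started yet is skipped with a reason.
function skipNotStarted(
  planned: readonly PlannedOp[],
  startedIds: ReadonlySet<string>,
  reason: string,
): { opId: string; skipped: true; reason: string }[] {
  return planned
    .filter((op) => !startedIds.has(op.id))
    .map((op) => ({ opId: op.id, skipped: true as const, reason }));
}
```

A finer-grained planner could let exclusive operations with different conflictKey values run in parallel; that refinement is left out here.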
5. Intent tracing for destructive/tool actions
oh-my-pi optionally injects an intent field into tool schemas, strips it before execution, and keeps it for auditability.
pi-crew application:
- Add optional _intent/intent metadata to worker tool/action events.
- Require intent for destructive actions: cancel, delete, prune, force cleanup, edits, package publish, worktree removal.
- Store intent in events/artifacts but never pass it to low-level execution APIs if not needed.
Benefit: reviewable why/what for high-risk actions without changing execution payloads.
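A small sketch of the strip-and-record step, assuming actions arrive as a name plus an argument object. The action names in DESTRUCTIVE_ACTIONS and the splitIntent helper are illustrative assumptions, not an existing API.

```typescript
type ActionArgs = Record<string, unknown>;

// Hypothetical action names for the destructive set listed above.
const DESTRUCTIVE_ACTIONS = new Set([
  "cancel", "delete", "prune", "force_cleanup", "edit", "publish", "worktree_remove",
]);

// Separates the audit-only `_intent` field from the arguments passed to execution.
function splitIntent(
  action: string,
  args: ActionArgs,
): { intent: string | undefined; execArgs: ActionArgs } {
  const { _intent, ...execArgs } = args;
  const intent =
    typeof _intent === "string" && _intent.trim() !== "" ? _intent : undefined;
  if (DESTRUCTIVE_ACTIONS.has(action) && intent === undefined) {
    throw new Error(`intent is required for destructive action "${action}"`);
  }
  return { intent, execArgs };
}
```

The caller would write intent into the action event and pass only execArgs to the low-level API, so the execution payload is identical with or without intent tracing.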
6. Event-first UI with tiny component contract and coalesced rendering
oh-my-pi TUI uses small components (render(width), handleInput, invalidate) and event-driven, coalesced rendering. Components must be width-safe and lifecycle-clean.
pi-crew application:
- Keep dashboards/widgets as projections from snapshot/event state, not direct filesystem scanners.
- Continue using render scheduler/coalescing; add width-safety tests for all dashboard panes/widgets.
- Components should expose dispose() for timers/theme subscriptions.
- UI event stream should be semantic (task_started, worker_status, mailbox_updated) rather than raw file polling.
Benefit: avoids UI freezes and makes live views predictable.
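The component contract and coalescing can be sketched as follows. The Component interface mirrors the render(width) idea described above, but its exact shape here, along with StatusLine and RenderScheduler, is an assumption. The scheduler takes an injectable schedule function so tests can drive it deterministically.

```typescript
interface Component {
  render(width: number): string[];
  dispose?(): void;
}

// Truncates by UTF-16 length; a real TUI would measure display width instead.
function clampLine(line: string, width: number): string {
  return line.length <= width ? line : line.slice(0, Math.max(0, width));
}

class StatusLine implements Component {
  text = "";
  render(width: number): string[] {
    return [clampLine(this.text, width)];
  }
}

class RenderScheduler {
  private pending = false;
  private readonly paint: () => void;
  private readonly schedule: (callback: () => void) => void;

  constructor(
    paint: () => void,
    schedule: (callback: () => void) => void = (cb) => queueMicrotask(cb),
  ) {
    this.paint = paint;
    this.schedule = schedule;
  }

  // Many events between two frames collapse into a single paint.
  requestRender(): void {
    if (this.pending) return;
    this.pending = true;
    this.schedule(() => {
      this.pending = false;
      this.paint();
    });
  }
}
```

A width-safety test then reduces to rendering each pane at several widths and asserting that no returned line exceeds the width.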
7. Two-phase extension lifecycle
oh-my-pi extensions have a registration phase where side-effecting runtime methods are unavailable, followed by an initialized phase with real context/actions.
pi-crew application:
- If pi-crew grows plugin/extension support, split APIs into:
  - registerCrewExtension(api): declare teams, workflows, hooks, commands, renderers.
  - initializeCrewExtension(context): subscribe to events, perform side effects.
- In headless mode, UI APIs should be explicit no-ops or unavailable via hasUI.
- Loader should collect extension errors without breaking builtin teams.
Benefit: fewer load-time side effects and safer third-party extensibility.
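A possible shape for such a loader is sketched below. All interfaces are hypothetical and reduced to a single registerCommand declaration; the point is the phase split and the error collection, not the API surface.

```typescript
interface RegistrationApi {
  registerCommand(name: string): void;
}

interface RuntimeContext {
  hasUI: boolean;
  notify(message: string): void; // expected to be a no-op when hasUI is false
}

interface CrewExtension {
  name: string;
  register(api: RegistrationApi): void;
  initialize?(context: RuntimeContext): void;
}

interface ExtensionError {
  extension: string;
  phase: "register" | "initialize";
  message: string;
}

function loadExtensions(extensions: readonly CrewExtension[], context: RuntimeContext) {
  const commands: string[] = [];
  const errors: ExtensionError[] = [];
  const registered: CrewExtension[] = [];

  // Phase 1: declarations only; no runtime context is reachable from here.
  for (const ext of extensions) {
    const declared: string[] = [];
    try {
      ext.register({ registerCommand: (name) => { declared.push(name); } });
      commands.push(...declared); // commit only if registration succeeded
      registered.push(ext);
    } catch (err) {
      errors.push({ extension: ext.name, phase: "register", message: String(err) });
    }
  }

  // Phase 2: side effects with the real context.
  for (const ext of registered) {
    try {
      ext.initialize?.(context);
    } catch (err) {
      errors.push({ extension: ext.name, phase: "initialize", message: String(err) });
    }
  }
  return { commands, errors };
}
```

Because a failing extension only adds an entry to errors, builtin teams registered through the same loader keep working.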
8. Unified capability inventory/control center
oh-my-pi normalizes extensions, skills, rules, tools, hooks, MCPs, prompts, and slash commands into a shared dashboard model with active/disabled/shadowed states.
pi-crew application:
- Extend /team-settings or add /team-control to show a unified inventory:
  - teams, workflows, agents, skills, hooks/policies, tools, runtime providers.
- Normalize each item to:
  - id, kind, name, description, source, path, state, disabledReason, shadowedBy, raw.
- Persist disables by stable capability ID, not file path.
Benefit: better operator experience for complex multi-resource setups.
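The normalization could be sketched like this. The field names follow the list above; the ID scheme (kind:source/name) and the precedence order project > user > builtin are assumptions made for the example, since the document leaves both open.

```typescript
type CapabilityKind = "team" | "workflow" | "agent" | "skill" | "hook" | "tool" | "provider";
type CapabilitySource = "builtin" | "user" | "project";

interface Discovered {
  kind: CapabilityKind;
  name: string;
  description: string;
  source: CapabilitySource;
  path: string;
  raw: unknown;
}

interface Capability extends Discovered {
  id: string;
  state: "active" | "disabled" | "shadowed";
  disabledReason?: string;
  shadowedBy?: string;
}

// Assumed precedence: a higher number wins over a lower one with the same kind and name.
const PRECEDENCE: Record<CapabilitySource, number> = { builtin: 0, user: 1, project: 2 };

function buildInventory(found: readonly Discovered[], disabledIds: ReadonlySet<string>): Capability[] {
  const items: Capability[] = [];
  for (const d of found) {
    const item: Capability = { ...d, id: `${d.kind}:${d.source}/${d.name}`, state: "active" };
    if (disabledIds.has(item.id)) {
      item.state = "disabled";
      item.disabledReason = "disabled by operator";
    }
    items.push(item);
  }
  // Pick one winner per kind+name among items that are still active.
  const winners = new Map<string, Capability>();
  for (const item of items) {
    if (item.state !== "active") continue;
    const key = `${item.kind}:${item.name}`;
    const current = winners.get(key);
    if (!current || PRECEDENCE[item.source] > PRECEDENCE[current.source]) winners.set(key, item);
  }
  for (const item of items) {
    const winner = winners.get(`${item.kind}:${item.name}`);
    if (item.state === "active" && winner && winner !== item) {
      item.state = "shadowed";
      item.shadowedBy = winner.id;
    }
  }
  return items;
}
```

Disables are keyed by id rather than path, so moving a resource file does not silently re-enable it.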
9. Hooks as typed lifecycle gates, not ad-hoc shell glue
oh-my-pi hooks cover session lifecycle, before-agent-start, tool-call gates, tool-result transforms, and compaction events. Blocking hooks are scoped; non-blocking hook errors are captured but do not crash streaming.
pi-crew application:
- Define typed crew hooks:
  - before_run_start
  - before_task_start
  - task_result
  - before_cancel
  - before_publish
  - session_before_switch
  - run_recovery
- Mark hooks as blocking or non-blocking.
- Capture hook errors into diagnostics/status, not uncontrolled exceptions.
Benefit: safer customization for policy/security/release gates.
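A synchronous sketch of a typed hook runner follows. The event names come from the list above; CrewHook, HookDecision, and runHooks are hypothetical, real hooks would likely be async, and failing closed when a blocking hook throws is an assumption of this sketch rather than documented oh-my-pi behavior.

```typescript
type HookEvent =
  | "before_run_start"
  | "before_task_start"
  | "task_result"
  | "before_cancel"
  | "before_publish"
  | "session_before_switch"
  | "run_recovery";

type HookDecision = { allow: true } | { allow: false; reason: string };

interface CrewHook {
  name: string;
  event: HookEvent;
  blocking: boolean;
  run(payload: unknown): HookDecision | undefined;
}

interface HookOutcome {
  allowed: boolean;
  blockedBy?: string;
  diagnostics: string[];
}

function runHooks(hooks: readonly CrewHook[], event: HookEvent, payload: unknown): HookOutcome {
  const diagnostics: string[] = [];
  for (const hook of hooks) {
    if (hook.event !== event) continue;
    try {
      const decision = hook.run(payload);
      // Only blocking hooks may veto; a non-blocking hook's decision is ignored.
      if (hook.blocking && decision !== undefined && decision.allow === false) {
        diagnostics.push(`${hook.name}: ${decision.reason}`);
        return { allowed: false, blockedBy: hook.name, diagnostics };
      }
    } catch (err) {
      diagnostics.push(`${hook.name}: ${err instanceof Error ? err.message : String(err)}`);
      // Assumption: a blocking hook that throws fails closed.
      if (hook.blocking) return { allowed: false, blockedBy: hook.name, diagnostics };
    }
  }
  return { allowed: true, diagnostics };
}
```

Errors never escape runHooks; they surface through diagnostics, which a status view can display.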
10. Prompt pipeline should be explicit
oh-my-pi applies slash/custom commands, templates, compaction, file mentions, hook injection, and model validation in a clear order before calling the agent.
pi-crew application:
Define a worker prompt pipeline:
- Parse orchestration command/control intent.
- Expand prompt templates/task packet.
- Attach selected context/artifact/mailbox summaries.
- Run before_worker_start hooks.
- Persist exact task packet/artifacts.
- Launch worker.
Benefit: reproducible worker prompts and easier debugging of context injection.
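The ordered steps above can be sketched as a list of named stages applied in sequence, with a trace that records which stages ran. TaskPacket, PipelineStage, and the stage bodies are illustrative assumptions; the hook stage is a placeholder, and persistence and launch are left to the caller.

```typescript
interface TaskPacket {
  taskId: string;
  prompt: string;
  contextRefs: string[];
  control: string | undefined; // parsed orchestration command, if any
}

interface PipelineStage {
  name: string;
  apply(packet: TaskPacket): TaskPacket;
}

const workerPromptStages: PipelineStage[] = [
  {
    name: "parse_control",
    apply: (p) => {
      const match = /^\/(\w+)\s+/.exec(p.prompt);
      return match ? { ...p, control: match[1], prompt: p.prompt.slice(match[0].length) } : p;
    },
  },
  { name: "expand_template", apply: (p) => ({ ...p, prompt: `Task ${p.taskId}: ${p.prompt}` }) },
  { name: "attach_context", apply: (p) => ({ ...p, contextRefs: [...p.contextRefs, "mailbox-summary"] }) },
  { name: "before_worker_start", apply: (p) => p }, // hooks would run here
];

function buildTaskPacket(initial: TaskPacket, pipeline: readonly PipelineStage[]) {
  const trace: string[] = [];
  let packet = initial;
  for (const stage of pipeline) {
    packet = stage.apply(packet);
    trace.push(stage.name);
  }
  // The caller would persist `serialized` verbatim, then launch the worker with it.
  return { packet, trace, serialized: JSON.stringify(packet) };
}
```

Keeping the trace next to the persisted packet makes it possible to answer which stage injected a given piece of context.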
11. Session/run history as append-only tree
oh-my-pi persists session entries with parent relationships. Branching/forking moves the current leaf rather than rewriting past history.
pi-crew application:
- Keep events.jsonl append-only and add optional parentEventId/attemptId/branchId fields for retries/forks.
- Represent retry attempts as child branches from the original task prompt/result.
- Preserve old failed attempts instead of only overwriting task state.
Benefit: better auditability and replay/debug of retries.
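A minimal in-memory sketch of an append-only event tree is shown below. RunEventLog is a hypothetical class; the parentEventId and attemptId field names come from the bullets above, and the JSONL serialization is only indicative of how events.jsonl lines might look.

```typescript
interface RunEvent {
  id: string;
  type: string;
  parentEventId?: string;
  attemptId?: string;
}

class RunEventLog {
  private readonly byId = new Map<string, RunEvent>();
  private readonly ordered: RunEvent[] = [];

  // Appends only; existing events are never rewritten or removed.
  append(event: RunEvent): void {
    if (this.byId.has(event.id)) throw new Error(`duplicate event id ${event.id}`);
    if (event.parentEventId !== undefined && !this.byId.has(event.parentEventId)) {
      throw new Error(`unknown parent ${event.parentEventId}`);
    }
    this.byId.set(event.id, event);
    this.ordered.push(event);
  }

  // Walks parent links from a leaf back to the root; returns root-first order.
  pathTo(leafId: string): RunEvent[] {
    const path: RunEvent[] = [];
    let current = this.byId.get(leafId);
    while (current !== undefined) {
      path.unshift(current);
      current =
        current.parentEventId === undefined ? undefined : this.byId.get(current.parentEventId);
    }
    return path;
  }

  toJsonl(): string {
    return this.ordered.map((e) => JSON.stringify(e)).join("\n");
  }
}
```

A retry becomes a second child of the original prompt event: the failed attempt stays in the log, and the current state is whichever leaf the task points at.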
12. Cooperative cancellation token for long loops
oh-my-pi native code uses cancel tokens with deadlines, abort signals, heartbeat(), and async wait. Long loops over external-size input must heartbeat at bounded cadence.
pi-crew application:
- Add a TS CancellationToken utility for internal long-running loops:
  - heartbeat(stage?: string)
  - throwIfCancelled()
  - wait()
  - abort(reason)
- Require it in scanners over runs, artifacts, mailboxes, worktrees, and event logs.
Benefit: bounded shutdown/cancel latency and easier stuck-loop diagnostics.
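A sketch of such a token is shown below, using the four method names from the list above. The deadline handling, the status() helper, CancelledError, and the heartbeat cadence of 256 entries are assumptions. Note that in this simple version a deadline is only noticed at the next heartbeat or throwIfCancelled call.

```typescript
class CancelledError extends Error {
  readonly reason: string;
  constructor(reason: string) {
    super(`cancelled: ${reason}`);
    this.name = "CancelledError";
    this.reason = reason;
  }
}

class CancellationToken {
  private reason: string | undefined;
  private lastStage = "start";
  private lastBeat = Date.now();
  private readonly waiters: Array<(reason: string) => void> = [];
  private readonly deadlineAt: number | undefined;

  constructor(timeoutMs?: number) {
    this.deadlineAt = timeoutMs === undefined ? undefined : Date.now() + timeoutMs;
  }

  abort(reason: string): void {
    if (this.reason !== undefined) return; // first reason wins
    this.reason = reason;
    for (const wake of this.waiters.splice(0)) wake(reason);
  }

  throwIfCancelled(): void {
    if (this.reason === undefined && this.deadlineAt !== undefined && Date.now() >= this.deadlineAt) {
      this.abort("deadline_exceeded");
    }
    if (this.reason !== undefined) throw new CancelledError(this.reason);
  }

  // Records progress for stuck-loop diagnostics, then checks for cancellation.
  heartbeat(stage?: string): void {
    if (stage !== undefined) this.lastStage = stage;
    this.lastBeat = Date.now();
    this.throwIfCancelled();
  }

  // Resolves with the cancel reason once the token is aborted.
  wait(): Promise<string> {
    if (this.reason !== undefined) return Promise.resolve(this.reason);
    return new Promise((resolve) => { this.waiters.push(resolve); });
  }

  status(): { stage: string; msSinceBeat: number; cancelled: boolean } {
    return { stage: this.lastStage, msSinceBeat: Date.now() - this.lastBeat, cancelled: this.reason !== undefined };
  }
}

// Example scanner: heartbeats every 256 lines so cancel latency stays bounded.
function countNonEmptyLines(lines: readonly string[], token: CancellationToken): number {
  let count = 0;
  for (let i = 0; i < lines.length; i++) {
    if (i % 256 === 0) token.heartbeat("scan-event-log");
    if (lines[i] !== "") count++;
  }
  return count;
}
```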
13. Process lifecycle: graceful cancel, forced kill, then non-reuse
oh-my-pi shell/PTY runtime cancels gracefully, waits a grace window, forces abort/kill, drains output for bounded windows, and discards persistent sessions after cancellation/errors.
pi-crew application:
- For child Pi workers:
  - send graceful abort/TERM;
  - wait graceMs;
  - force-kill process tree;
  - drain stdout/stderr for bounded time;
  - mark session non-reusable after timeout/protocol error/cancel.
- Return typed status { exitCode, cancelled, timedOut, killed, cleanupErrors }.
Benefit: more deterministic worker cleanup and fewer zombie/stale runs.
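The stop sequence above might be sketched against an abstract process handle. WorkerHandle is hypothetical (a real one would wrap a child-process API and kill the whole process tree), and stopWorker is an assumed name; only the result fields come from the typed status above.

```typescript
// Hypothetical handle over a child worker; not an existing pi-crew interface.
interface WorkerHandle {
  terminate(force: boolean): void;          // force=false: graceful abort/TERM
  readonly exited: Promise<number | null>;  // resolves with the exit code
  drainOutput(maxMs: number): Promise<void>;
}

interface StopResult {
  exitCode: number | null;
  cancelled: boolean;
  timedOut: boolean;
  killed: boolean;
  cleanupErrors: string[];
}

// Resolves with { ok: false } if `p` has not settled within `ms`.
function raceTimeout<T>(p: Promise<T>, ms: number): Promise<{ ok: true; value: T } | { ok: false }> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve({ ok: false }), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve({ ok: true, value }); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function stopWorker(
  handle: WorkerHandle,
  cause: "cancel" | "timeout",
  graceMs: number,
  drainMs: number,
): Promise<StopResult> {
  const cleanupErrors: string[] = [];
  let killed = false;

  handle.terminate(false);                                // 1. graceful request
  let outcome = await raceTimeout(handle.exited, graceMs); // 2. grace window
  if (!outcome.ok) {
    killed = true;
    handle.terminate(true);                               // 3. forced kill
    outcome = await raceTimeout(handle.exited, graceMs);
  }
  try {
    await handle.drainOutput(drainMs);                    // 4. bounded drain
  } catch (err) {
    cleanupErrors.push(err instanceof Error ? err.message : String(err));
  }
  // 5. The caller should treat the session as non-reusable from here on.
  return {
    exitCode: outcome.ok ? outcome.value : null,
    cancelled: cause === "cancel",
    timedOut: cause === "timeout",
    killed,
    cleanupErrors,
  };
}
```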
14. Reserve control channel before async worker start
oh-my-pi PTY reserves its control channel before async process start, rejects duplicate starts, and always clears state in completion.
pi-crew application:
- Install a WorkerRunCore/controller synchronously before spawn returns.
- Expose cancel/steer immediately, even while startup is still in progress.
- Clear controller in finally and persist terminal state.
Benefit: closes race windows where operator cannot cancel a starting worker.
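The reservation idea can be sketched with a registry whose start method is deliberately not async, so the controller exists before any awaiting happens. WorkerRegistry and the SpawnFn signature are assumptions; persisting the terminal state is reduced to a comment.

```typescript
type SpawnFn = (signal: AbortSignal) => Promise<number>;

class WorkerRegistry {
  private readonly controllers = new Map<string, AbortController>();

  // Not `async`: the controller is installed before this method returns.
  start(workerId: string, spawn: SpawnFn): Promise<number> {
    if (this.controllers.has(workerId)) {
      throw new Error(`worker ${workerId} is already starting or running`);
    }
    const controller = new AbortController();
    this.controllers.set(workerId, controller); // reserve the control channel first

    // Wrapping in a Promise turns a synchronous throw from `spawn` into a rejection,
    // so the cleanup in `finally` runs on every path.
    return new Promise<number>((resolve) => resolve(spawn(controller.signal))).finally(() => {
      this.controllers.delete(workerId);
      // A real implementation would persist the terminal state here.
    });
  }

  isActive(workerId: string): boolean {
    return this.controllers.has(workerId);
  }

  cancel(workerId: string, reason: string): boolean {
    const controller = this.controllers.get(workerId);
    if (controller === undefined) return false;
    controller.abort(new Error(reason));
    return true;
  }
}
```

Since cancel only needs the map entry, an operator can cancel a worker whose spawn promise has not resolved yet.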
15. Cache scan entries, not final query results
oh-my-pi native search caches directory entries and applies query-specific filters/scoring later. Empty stale caches trigger rescan; ordering is deterministic.
pi-crew application:
- For run/artifact/mailbox discovery, cache raw entries/stats rather than final UI results.
- Apply active-status/mailbox/health filters after cache retrieval.
- Invalidate cache after state mutation.
- Use deterministic sort keys for dashboards and summaries.
Benefit: faster UI/status with fewer stale semantic bugs.
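A small sketch of caching raw entries and filtering at query time follows. RunEntry, RunEntryCache, and the injected scan function are hypothetical; the scan counter exists only to make cache behavior observable.

```typescript
interface RunEntry {
  runId: string;
  status: "running" | "done" | "failed";
  updatedAt: number;
}

class RunEntryCache {
  private entries: RunEntry[] | undefined;
  private readonly scan: () => RunEntry[];
  scanCount = 0;

  constructor(scan: () => RunEntry[]) {
    this.scan = scan;
  }

  // Call after any state mutation (new run, status change, deletion).
  invalidate(): void {
    this.entries = undefined;
  }

  private raw(): readonly RunEntry[] {
    // A missing or empty cache triggers a rescan.
    if (this.entries === undefined || this.entries.length === 0) {
      this.entries = this.scan();
      this.scanCount++;
    }
    return this.entries;
  }

  // Filters are applied per query; only raw entries are cached.
  query(filter: (entry: RunEntry) => boolean): RunEntry[] {
    return this.raw()
      .filter(filter)
      .sort((a, b) =>
        // Deterministic order: newest first, ties broken by runId.
        b.updatedAt - a.updatedAt || (a.runId < b.runId ? -1 : a.runId > b.runId ? 1 : 0),
      );
  }
}
```

Two views with different filters (for example active runs and failed runs) then share one scan instead of each caching its own, possibly stale, result list.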
16. Blob artifacts and bounded file access
oh-my-pi blob-artifact design uses content addressing, metadata sidecars, streaming writes, size budgets, manifest GC, and path whitelisting.
pi-crew application:
- Introduce content-addressed large artifacts for worker transcripts/screenshots/log chunks.
- Persist metadata sidecars with MIME, source, redaction, run/task IDs, size, hash.
- Keep task prompts/results small by referencing artifact IDs.
- Add GC tied to run retention.
Benefit: avoids bloating task JSON/events and improves artifact security.
17. Native/release verification checklist mindset
oh-my-pi release scripts emphasize multi-platform build artifacts, install smoke tests, spoofed-version checks, and runtime loader fallback diagnostics.
pi-crew application:
- For npm releases, keep a release checklist with:
  - typecheck;
  - unit/integration tests;
  - npm pack --dry-run;
  - install from packed tarball in temp project;
  - Pi extension load smoke;
  - version/tag/npm consistency check.
Benefit: fewer broken published packages.
Skill/Rulebook Ideas to Port
oh-my-pi's skills/rulebook ecosystem suggests additional pi-crew resources:
- worker-prompt-pipeline skill: prompt assembly, context projection, before-worker hooks, artifact references.
- typed-hook-design skill: lifecycle gates, blocking vs non-blocking hooks, diagnostics.
- process-cancellation-contract skill: graceful/force kill, synthetic terminal results, non-reuse.
- capability-inventory-ux skill: normalized resource inventory and disable/shadow semantics.
- append-only-run-history skill: event tree, branch/retry provenance.
Prioritized Backlog for pi-crew
P0 / High confidence
- Fix current runtime review findings first: waiting final status, respond semantics, no-registry model routing.
- Add structured cancellation reason and terminal synthetic result/event for cancelled workers.
- Centralize worker prompt pipeline and persist exact prompt packets.
- Add width-safety tests for dashboard/widget lines.
P1 / Medium-term architecture
- Add steering vs follow-up mailbox queues.
- Add typed hook lifecycle for before_task_start, task_result, before_cancel, session_before_switch.
- Add capability inventory model for teams/workflows/agents/skills/hooks/tools.
- Add CancellationToken for long internal loops and scans.
P2 / Larger subsystem work
- Append-only run-history tree with attempt/branch parentage.
- Content-addressed blob artifact store with metadata sidecars and GC.
- Worker process controller installed before spawn; process non-reuse after cancel/protocol error.
- Raw scan-entry cache shared by dashboard/status/artifact lookup.
Anti-Patterns to Avoid
- Building prompts from scattered inline string concatenation without a traceable pipeline.
- Treating UI render as a place to perform heavy filesystem scans.
- Auto-opening modal/right-sidebar UI by default when a compact widget/status line would suffice.
- Dropping queued user-facing results just because session generation changed.
- Cancelling a task without writing a terminal event/result.
- Caching semantic query results that should be recomputed from raw state.
- Letting one bad extension/resource prevent builtin operation.
Immediate Review Questions for Future Implementation
- Should pi-crew project-local skills be allowed to shadow builtin safety skills by default, or require explicit project: namespace?
- Should respond enqueue durable work or only deliver to live workers? Current semantics need to become explicit.
- What is the stable capability ID scheme for teams/workflows/agents/skills/hooks?
- Which hook events should be blocking by default and which should be diagnostic-only?
- What artifact size threshold should trigger blob storage instead of embedding content in task/events JSON?