Add 5 pi extensions: pi-subagents, pi-crew, rpiv-pi, pi-interactive-shell, pi-intercom

This commit is contained in:
2026-05-08 15:59:25 +10:00
parent d0d1d9b045
commit 31b4110c87
457 changed files with 85157 additions and 0 deletions

View File

@@ -0,0 +1,180 @@
# pi-crew Architecture
`pi-crew` is a Pi package for coordinated multi-agent work. It is intentionally durable-first: every run is represented on disk, every task has a state record, and child workers stream progress into JSONL/status files so foreground sessions, background jobs, dashboards, and later restarts all read the same source of truth.
## Layers
```text
Pi extension layer
register tools, slash commands, widget/dashboard, notifier, lifecycle cleanup
Runtime layer
team runner, task graph scheduler, child Pi process runner, async runner,
model fallback, policy engine, worktree manager, live-session experimental path
State layer (project root resolves to <crewRoot>:
- .crew/ when no .pi/ exists in the repo (default)
- .pi/teams/ when the repo already has .pi/ (legacy reuse))
<crewRoot>/state/runs/{runId}/manifest.json
<crewRoot>/state/runs/{runId}/tasks.json
<crewRoot>/state/runs/{runId}/events.jsonl
<crewRoot>/state/runs/{runId}/agents/{taskId}/status.json
<crewRoot>/artifacts/{runId}/...
```
## Run flow
```text
user/team tool
handleTeamTool(action=run)
├─ discover agents/teams/workflows
├─ validate team/workflow refs
├─ create run manifest + task graph
├─ write goal artifact
└─ choose foreground/session-bound or async/background mode
├─ foreground: startForegroundRun() schedules executeTeamRun()
└─ async: spawnBackgroundTeamRun()
├─ node --import jiti-register.mjs background-runner.ts
├─ background-runner writes async.started + async.pid marker
└─ executeTeamRun()
├─ resolve ready task batch
├─ resolveBatchConcurrency() with hard cap
├─ runTeamTask() per task
│ ├─ build prompt + dependency context
│ ├─ choose configured Pi model candidates
│ ├─ spawn child `pi` worker
│ ├─ observe JSONL/stdout progress
│ ├─ persist agent status/events/output
│ └─ write result/log/transcript artifacts
├─ merge task updates monotonically
├─ write progress artifacts
└─ synthesize policy closeout
```
## Extension layer
`src/extension/register.ts` wires the package into Pi:
- `team` tool and management actions.
- Conflict-safe subagent tools: `crew_agent`, `crew_agent_result`, `crew_agent_steer`.
- Claude-style aliases: `Agent`, `get_subagent_result`, `steer_subagent` when available.
- Slash commands including `/team-run`, `/team-status`, `/team-dashboard`, `/team-doctor`, `/team-config`, `/team-summary`.
- Active-only widget and optional dashboard/sidebar UI.
- Foreground run scheduling and shutdown cleanup.
- Async completion notifier and session-start active-run summary.
The extension layer should remain thin: user input is normalized into tool parameters, then delegated to runtime/state modules.
## Runtime layer
### Team runner
`src/runtime/team-runner.ts` drives workflow execution. It reads queued tasks, computes the ready set from the task graph, applies concurrency limits, runs a batch, then merges results back into the latest task state. Terminal task states are monotonic: stale parallel snapshots must not regress completed/failed/cancelled/skipped tasks back to queued/running.
### Task runner
`src/runtime/task-runner.ts` executes one task. It prepares workspace/worktree context, renders a task prompt, chooses model candidates from Pi configuration, launches a child Pi process by default, and writes result artifacts. Scaffold mode is explicit dry-run only.
### Child Pi runtime
`src/runtime/child-pi.ts` is the default worker runtime. It:
- launches real `pi` child processes,
- hides Windows console windows with `windowsHide: true`,
- streams JSONL output into transcripts,
- compacts noisy message updates,
- isolates observer callback failures so progress persistence cannot kill orchestration,
- applies post-exit stdio guards for late output.
### Async background runner
`src/runtime/async-runner.ts` spawns detached background runs. Installed packages use an absolute `jiti-register.mjs` loader path because Node strip-types refuses TypeScript under `node_modules`. The runner fail-fasts if jiti is missing, and writes `async.pid` once startup begins so the parent can distinguish a healthy start from an early import crash.
### Concurrency and policy
`src/runtime/concurrency.ts` picks batch size from explicit limits, team settings, workflow settings, or built-in defaults. User-provided `limits.maxConcurrentWorkers` is hard-capped by default to prevent local DoS; `limits.allowUnboundedConcurrency=true` is an explicit opt-out and emits an observability event.
`src/runtime/policy-engine.ts` applies closeout and safety policy decisions such as limit exceeded, failed task blocking, stale workers, and green-contract failures.
### Model routing
Model choice is based on Pi's current configuration/model registry, not hardcoded providers. Task and agent records persist model attempts and routing metadata so dashboards/status can show requested model, selected model, fallback chain, and fallback reason.
## State layer
Run state is under `<crewRoot>` (`.crew/` for new projects, or `.pi/teams/` when the repo already has `.pi/`):
```text
<crewRoot>/state/runs/{runId}/
manifest.json run metadata/status/artifacts/async pid
tasks.json task graph and per-task status
events.jsonl append-only run events
events.jsonl.seq event sequence cache
agents.json aggregate agent cache
async.pid background startup marker
agents/{taskId}/
status.json per-agent status source
events.jsonl per-agent event stream
output.log compact worker output
sidechain.output.jsonl
live-control.jsonl
```
Artifacts are under:
```text
<crewRoot>/artifacts/{runId}/
goal.md
prompts/{taskId}.md
results/{taskId}.txt
logs/{taskId}.log
transcripts/{taskId}.jsonl
metadata/*.json
progress.md
summary.md
```
`<crewRoot>` resolution is centralised in `src/utils/paths.ts#projectCrewRoot()`:
- if `<repoRoot>/.pi/` already exists, return `<repoRoot>/.pi/teams/` (legacy reuse, no parallel `.crew/`)
- otherwise return `<repoRoot>/.crew/` (default for fresh projects)
User-global fallback (when no project root is detected) lives under `~/.pi/agent/extensions/pi-crew/`.
Atomic writes use temp-file replace with retry for transient Windows `EPERM`/`EBUSY`/`EACCES`. JSONL append paths are best-effort where used for observers/progress; write failures must not crash child output parsing.
## UI and observability
- The persistent widget shows active runs only.
- Stale async runs with dead background pids are hidden from the active widget.
- `/team-status` is the canonical detailed state view and can mark stale active async runs failed.
- `/team-dashboard` provides live history/details from `RunSnapshotCache`, with panes for agents, progress/events, mailbox attention, recent output, health, and metrics.
- Phase 9 observability uses a per-session `MetricRegistry` (`Counter`, `Gauge`, `Histogram`) wired to `crew.*` events via unsubscribe-returning `events.on()` handlers. The registry is disposed on session shutdown/reload; no global metric singleton is used.
- Metrics can be inspected with `/team-metrics` or `team api metrics-snapshot`, exported as redacted daily JSONL under `<crewRoot>/state/metrics/` when telemetry is enabled, formatted for Prometheus, or pushed to an opt-in OTLP HTTP endpoint.
- Heartbeat observability is split between dashboard summaries and a background `HeartbeatWatcher`: healthy/warn/stale/dead gradient metrics are emitted, first-dead detections notify operators, and consecutive dead ticks can append deadletter entries.
- Powerbar publishing is optional and event-compatible: pi-crew emits `powerbar:register-segment` for `pi-crew-active` / `pi-crew-progress`, emits `powerbar:update` payloads (`id`, `text`, optional `suffix`, `bar`, `color`), and mirrors status through `ctx.ui.setStatus("pi-crew", ...)` when no powerbar listener is detected.
- Transcript viewer is file-backed so it works for foreground and async runs; it defaults to bounded tail reads and can load full content on demand.
## Lifecycle and cleanup
Foreground runs are session-bound and should be interrupted on session shutdown or session switch. Only explicit `async: true` runs are allowed to survive the Pi session. Runtime cleanup is registered through Pi lifecycle hooks and a global reload cleanup guard.
## Configuration
Key config sections:
- `runtime`: `auto`, `child-process`, `scaffold`, experimental `live-session`.
- `limits`: concurrency/task/depth safety controls.
- `ui`: widget/dashboard/powerbar/model-token display settings.
- `observability`: in-memory metrics, heartbeat watcher interval, metric file retention.
- `telemetry`: opt-out switch for local telemetry sinks.
- `reliability`: opt-in auto-retry/auto-recover defaults and deadletter threshold.
- `otlp`: opt-in OTLP HTTP metric export.
- `agents`: builtin overrides for models/fallbacks/tools.
- `autonomous`: policy injection/profile for proactive team delegation.
See `usage.md`, `resource-formats.md`, `runtime-flow.md`, and `live-mailbox-runtime.md` for operational details.