Add 5 pi extensions: pi-subagents, pi-crew, rpiv-pi, pi-interactive-shell, pi-intercom
733
extensions/pi-crew/docs/next-upgrade-roadmap.md
Normal file
@@ -0,0 +1,733 @@
# pi-crew Next Upgrade Roadmap

Date: 2026-05-05

Source inputs:

- `docs/research-oh-my-pi-distillation.md`
- `docs/source-runtime-refactor-map.md`
- Recent runtime hardening commits through `f5d47aa feat: surface run effectiveness evidence`

This document tracks the next practical upgrades after the current scaffold/no-op subagent fix, runtime safety classification, cancellation provenance, intent audit trail, prompt pipeline artifacts, capability inventory artifacts, and run effectiveness reporting.

## Current Baseline

Already implemented and pushed:

- Real child worker execution is the default.
- Implicit scaffold/no-op runs are blocked when worker execution is disabled by config/env.
- Explicit `runtime.mode=scaffold` remains available for dry-run prompt/artifact generation.
- Run `summary.md`, `progress.md`, and `status` now expose effectiveness evidence.
- Structured cancellation reasons flow through retry/cancel/team-runner/run events/metrics/UI snapshot.
- `cancel`, `cleanup`, `forget`, and `prune` accept audit intent metadata.
- Live-agent control distinguishes `steer` from `follow-up` at live-control/API level.
- Retry attempts have `attemptId`; max-retry deadletters link to the final `attemptId`.
- Worker prompt pipeline and capability inventory metadata artifacts are written per task.

## Priority Legend

- **P0**: correctness/safety issue; should be addressed before next release if feasible.
- **P1**: high user-visible value or reliability gain; good patch-release candidates.
- **P2**: larger subsystem work; should be planned and sequenced.
- **P3**: polish/UX/longer-term architecture.

## P0 — Prevent Ineffective Completed Runs

### P0.1 Enforce effectiveness policy for non-scaffold workers

**Problem**

`summary/status` now surface effectiveness evidence, but non-scaffold `child-process`/`live-session` runs can still end `completed` when task evidence is weak unless the existing mutation guard fires.

**Target behavior**

- For real workers, a run with completed tasks but no observable worker activity should be `blocked` or `failed`, not silently `completed`.
- Keep explicit scaffold dry-runs allowed, but label them as dry-runs.
- Policy should be configurable:
  - `runtime.effectivenessGuard = "off" | "warn" | "block" | "fail"`
  - default candidate: `warn` for read-only roles, `block` for mutating roles.

**Suggested files**

- `src/runtime/team-runner.ts`
- `src/runtime/completion-guard.ts`
- `src/state/types.ts` if storing guard result on manifest/tasks
- `src/schema/config-schema.ts`
- `src/config/config.ts`
- `test/unit/summary.test.ts`
- `test/unit/team-runner-merge.test.ts` or new `test/unit/effectiveness-guard.test.ts`

**Implementation sketch**

1. Extract run effectiveness calculation into a reusable exported helper, e.g.:

   ```ts
   export interface RunEffectivenessSummary {
     completed: number;
     observable: number;
     noObservedWorkTaskIds: string[];
     needsAttentionTaskIds: string[];
     workerExecution: "enabled" | "disabled/scaffold";
     severity: "ok" | "warning" | "blocked" | "failed";
   }
   ```

2. Use this helper for:
   - `progress.md`
   - `summary.md`
   - `status`
   - policy enforcement before `run.completed`.

3. For non-scaffold runs, if mutating tasks have no mutation/tool/model/transcript evidence:
   - append `policy.action` with `reason: "ineffective_worker"`;
   - set run `blocked` or `failed` depending on config;
   - include task IDs in `data`.

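The guard decision in step 3 can be sketched as a pure function. This is a minimal illustration, not the pi-crew implementation: `TaskEvidence`, `GuardResult`, and `evaluateEffectiveness` are hypothetical names, and the evidence counters are assumed stand-ins for whatever the runtime actually records.

```ts
// Hypothetical shapes for illustration only; not existing pi-crew types.
type GuardMode = "off" | "warn" | "block" | "fail";
type Severity = "ok" | "warning" | "blocked" | "failed";

interface TaskEvidence {
  taskId: string;
  status: "completed" | "failed" | "cancelled";
  toolCalls: number;
  modelTurns: number;
  transcriptBytes: number;
  mutations: number;
}

interface GuardResult {
  severity: Severity;
  noObservedWorkTaskIds: string[];
}

function evaluateEffectiveness(
  tasks: TaskEvidence[],
  mode: GuardMode,
  scaffold: boolean,
): GuardResult {
  // A completed task with zero evidence of any kind counts as "no observed work".
  const noObservedWorkTaskIds = tasks
    .filter((t) => t.status === "completed")
    .filter((t) => t.toolCalls + t.modelTurns + t.transcriptBytes + t.mutations === 0)
    .map((t) => t.taskId);

  // Explicit scaffold dry-runs stay allowed; "off" disables enforcement.
  if (scaffold || mode === "off" || noObservedWorkTaskIds.length === 0) {
    return { severity: "ok", noObservedWorkTaskIds };
  }
  const severity: Severity =
    mode === "warn" ? "warning" : mode === "block" ? "blocked" : "failed";
  return { severity, noObservedWorkTaskIds };
}
```

Keeping the decision free of I/O would let `progress.md`, `summary.md`, `status`, and the pre-`run.completed` check share one code path and be unit-tested per mode.
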
**Acceptance criteria**

- A mocked child-process run with no tool/model/transcript evidence does not report clean `completed` by default.
- Scaffold run still completes as explicit dry-run and displays `Worker execution: disabled/scaffold`.
- `status` clearly lists `noObservedWork` and `needsAttention` task IDs.
- Unit tests cover warn/block/fail modes.

**Verification**

```bash
npx tsc --noEmit
node --experimental-strip-types --test --test-concurrency=1 --test-timeout=30000 test/unit/effectiveness-guard.test.ts test/unit/summary.test.ts
npm run test:unit
```

### P0.2 Make runtime safety visible in manifest and run events

**Problem**

`runtime.safety` exists in runtime resolution, but it is not persisted as first-class run metadata. Debugging currently requires reading events or inferred artifacts.

**Target behavior**

- Manifest records resolved runtime:

```json
{
  "runtimeResolution": {
    "kind": "child-process",
    "requestedMode": "auto",
    "safety": "trusted",
    "fallback": "child-process",
    "reason": "..."
  }
}
```

- `run.running` or `run.blocked` event includes the same resolution.

**Suggested files**

- `src/state/types.ts`
- `src/extension/team-tool/run.ts`
- `src/runtime/background-runner.ts`
- `src/extension/team-tool/status.ts`
- `test/unit/team-run.test.ts`
- `test/unit/runtime-resolver.test.ts`

**Acceptance criteria**

- `status` shows `Runtime safety: trusted|explicit_dry_run|blocked`.
- Blocked disabled-worker runs persist enough evidence to explain why no subagents spawned.
- Existing manifest schema remains backward compatible.

## P1 — Steering/Follow-up Semantics Beyond Live Control

### P1.1 Persist separate steering and follow-up queues in mailbox state

**Current state**

`follow-up-agent` exists in live-control, but the durable mailbox is still a generic inbox/outbox and `respond` still has waiting-task semantics.

**Target behavior**

- Mailbox messages can carry semantic kind:

```ts
kind?: "message" | "steer" | "follow-up" | "response" | "group_join";
priority?: "urgent" | "normal" | "low";
deliveryMode?: "interrupt" | "next_turn";
```

- `steer-agent` appends durable steering queue entry when no live session is present.
- `follow-up-agent` appends durable follow-up queue entry, deliverable after task stop/resume.
- UI/status separates urgent steering from follow-up backlog.

**Suggested files**

- `src/state/mailbox.ts`
- `src/runtime/live-agent-control.ts`
- `src/runtime/live-agent-manager.ts`
- `src/extension/team-tool/api.ts`
- `src/extension/team-tool/respond.ts`
- `src/ui/dashboard-panes/mailbox-pane.ts`
- `test/unit/mailbox-api.test.ts`
- `test/unit/live-agent-control.test.ts`
- `test/unit/respond-tool.test.ts`

**Acceptance criteria**

- Steering and follow-up can be inspected separately.
- Existing inbox/outbox JSONL remains readable.
- Durable queue survives process/session switch.
- Realtime live delivery dedupes against durable replay.

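One way to read the separation and dedupe criteria is sketched here. It assumes each mailbox entry has a unique `id` and that the live path records which IDs it already delivered; `MailboxMessage`, `splitQueues`, and `deliveredLive` are hypothetical names, not the current mailbox API.

```ts
// Illustrative types; the `id` and `body` fields are assumptions.
type MessageKind = "message" | "steer" | "follow-up" | "response" | "group_join";

interface MailboxMessage {
  id: string;
  body: string;
  kind?: MessageKind; // absent on legacy JSONL entries
  priority?: "urgent" | "normal" | "low";
  deliveryMode?: "interrupt" | "next_turn";
}

interface SplitQueues {
  steering: MailboxMessage[];
  followUps: MailboxMessage[];
}

function splitQueues(
  messages: MailboxMessage[],
  deliveredLive: ReadonlySet<string>,
): SplitQueues {
  const steering: MailboxMessage[] = [];
  const followUps: MailboxMessage[] = [];
  const seen = new Set<string>();
  for (const m of messages) {
    // Skip entries already delivered in realtime, and duplicate log lines.
    if (deliveredLive.has(m.id) || seen.has(m.id)) continue;
    seen.add(m.id);
    if (m.kind === "steer") steering.push(m);
    else if (m.kind === "follow-up") followUps.push(m);
    // Legacy entries without `kind` stay readable but join neither queue.
  }
  return { steering, followUps };
}
```
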
### P1.2 Clarify `respond` vs `follow-up` UX

**Problem**

`respond` is currently a waiting-task resume primitive. Users may expect it to send a general follow-up.

**Target behavior**

- `/team-respond` remains only for `waiting` tasks.
- `/team-follow-up` or `api operation=follow-up-agent` is documented as continuation prompt.
- Error messages recommend the correct command.

**Suggested files**

- `src/extension/registration/commands.ts`
- `src/extension/help.ts`
- `docs/usage.md`
- `test/unit/registration-commands-coverage.test.ts`
- `test/unit/respond-tool.test.ts`

## P1 — Worker Lifecycle and Process Reliability

### P1.3 Two-phase child process teardown

**Current state**

Child workers have improved post-exit stdio guards and bounded drains, but cancellation semantics can be made more deterministic.

**Target behavior**

Worker process cancellation returns structured status:

```ts
interface WorkerExitStatus {
  exitCode: number | null;
  cancelled: boolean;
  timedOut: boolean;
  killed: boolean;
  signal?: string;
  cleanupErrors: string[];
  finalDrainMs: number;
}
```

Process lifecycle:

1. graceful cancel/TERM;
2. wait grace window;
3. hard kill process tree;
4. bounded stdout/stderr drain;
5. mark session non-reusable.

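The lifecycle can be sketched against an injectable handle, so the sequencing is testable without spawning a real process. `WorkerHandle` and `teardownWorker` are hypothetical names; how TERM, tree kill, and drain are actually performed is left to the handle, and marking the session non-reusable (step 5) is not modelled.

```ts
interface WorkerExitStatus {
  exitCode: number | null;
  cancelled: boolean;
  timedOut: boolean;
  killed: boolean;
  signal?: string;
  cleanupErrors: string[];
  finalDrainMs: number;
}

// Hypothetical abstraction over a child worker; not an existing pi-crew type.
interface WorkerHandle {
  terminate(): void; // graceful cancel / TERM
  killTree(): void; // hard kill of the process tree
  // Resolves to the exit code (null if signalled), or undefined if still running after `ms`.
  waitForExit(ms: number): Promise<number | null | undefined>;
  drain(ms: number): Promise<void>; // bounded stdout/stderr drain
}

async function teardownWorker(
  handle: WorkerHandle,
  graceMs: number,
  drainMs: number,
): Promise<WorkerExitStatus> {
  const cleanupErrors: string[] = [];
  let killed = false;

  handle.terminate(); // phase 1: ask nicely
  let exitCode = await handle.waitForExit(graceMs);

  if (exitCode === undefined) {
    // phase 2: grace window elapsed, escalate to a hard kill
    try {
      handle.killTree();
      killed = true;
    } catch (err) {
      cleanupErrors.push(String(err));
    }
    exitCode = await handle.waitForExit(graceMs);
    if (exitCode === undefined) cleanupErrors.push("process did not exit after hard kill");
  }

  const drainStart = Date.now();
  try {
    await handle.drain(drainMs);
  } catch (err) {
    cleanupErrors.push(String(err));
  }

  return {
    exitCode: exitCode ?? null,
    cancelled: true,
    timedOut: false,
    killed,
    cleanupErrors,
    finalDrainMs: Date.now() - drainStart,
  };
}
```
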
**Suggested files**

- `src/runtime/child-pi.ts`
- `src/runtime/pi-spawn.ts`
- `src/runtime/post-exit-stdio-guard.ts`
- `src/runtime/task-runner.ts`
- `src/runtime/cancellation.ts`
- `test/unit/child-pi*.test.ts`
- `test/integration/mock-child-run.test.ts`

**Acceptance criteria**

- Cancelled worker always produces terminal task event.
- Output drains are bounded.
- Status includes `cancelled/timedOut/killed`.
- No zombie/stale running task after cancellation.

### P1.4 Reserve worker control channel before spawn

**Problem**

There can be a short window where a task is logically starting but cancel/steer cannot target a controller yet.

**Target behavior**

- Synchronously create a `WorkerRunCore`/controller before async spawn.
- Persist controller metadata in agent status.
- Cancel/steer requests can be queued immediately while startup is in progress.
- Controller is cleared in `finally`.

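A minimal sketch of the queue-before-spawn idea follows. `WorkerController` and `ControlRequest` are hypothetical names (the roadmap only names `WorkerRunCore`), and persisting controller metadata is not modelled.

```ts
// Hypothetical request shape for illustration.
interface ControlRequest {
  type: "cancel" | "steer";
  payload?: string;
}

class WorkerController {
  private pending: ControlRequest[] = [];
  private sink: ((req: ControlRequest) => void) | undefined;

  // Safe to call from the moment the controller exists, before the child is spawned.
  send(req: ControlRequest): void {
    if (this.sink) this.sink(req);
    else this.pending.push(req);
  }

  // Called once the child is ready; flushes queued requests in arrival order.
  attach(sink: (req: ControlRequest) => void): void {
    this.sink = sink;
    for (const req of this.pending.splice(0)) sink(req);
  }

  // Called from `finally`; returns requests that never reached a worker so the
  // caller can record them as terminal no-ops with a reason.
  detach(): ControlRequest[] {
    this.sink = undefined;
    return this.pending.splice(0);
  }
}
```

Creating the controller synchronously and attaching the real sink later closes the window in which a cancel or steer request has no target.
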
**Suggested files**

- `src/runtime/task-runner.ts`
- `src/runtime/agent-control.ts`
- `src/runtime/live-agent-control.ts`
- `src/runtime/crew-agent-records.ts`
- `src/extension/team-tool/api.ts`

**Acceptance criteria**

- Starting worker can be cancelled immediately.
- Durable control request written during startup is applied or recorded as terminal no-op with reason.
- Tests simulate control request before child process emits first output.

## P1 — Cancellation and Attempt History

### P1.5 Add event-tree provenance: `parentEventId`, `attemptId`, `branchId`

**Current state**

Retry attempts have `attemptId`, and deadletters link to final attempt. Event log has sequence and terminal fingerprints but no general event tree.

**Target behavior**

- `TeamEvent.metadata` supports:

```ts
parentEventId?: string;
attemptId?: string;
branchId?: string;
causationId?: string;
correlationId?: string;
```

- Retry events, task started/completed/failed, deadletter, recovery events link by `attemptId`.
- UI/status can show attempt timeline.

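As an illustration of how an attempt timeline could be derived from these fields, the sketch below groups events by `attemptId` in sequence order. The `id`, `seq`, and `type` fields on `TeamEvent` are assumptions about the event shape, and `attemptTimeline` is a hypothetical helper.

```ts
interface EventMetadata {
  parentEventId?: string;
  attemptId?: string;
  branchId?: string;
  causationId?: string;
  correlationId?: string;
}

// Assumed minimal event shape; the real TeamEvent may differ.
interface TeamEvent {
  id: string;
  seq: number;
  type: string;
  metadata?: EventMetadata;
}

function attemptTimeline(events: TeamEvent[]): Map<string, TeamEvent[]> {
  const byAttempt = new Map<string, TeamEvent[]>();
  // Copy before sorting so the caller's array is not mutated.
  for (const event of [...events].sort((a, b) => a.seq - b.seq)) {
    const attemptId = event.metadata?.attemptId;
    // Older events without provenance are skipped rather than rejected.
    if (!attemptId) continue;
    const list = byAttempt.get(attemptId) ?? [];
    list.push(event);
    byAttempt.set(attemptId, list);
  }
  return byAttempt;
}
```
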
**Suggested files**

- `src/state/event-log.ts`
- `src/state/types.ts`
- `src/runtime/team-runner.ts`
- `src/runtime/retry-executor.ts`
- `src/runtime/recovery-recipes.ts`
- `src/extension/team-tool/status.ts`
- `test/unit/event-metadata.test.ts`
- `test/unit/retry-executor.test.ts`

**Acceptance criteria**

- Retry attempt events and terminal task events share attempt provenance.
- Deadletter records can be traced back to event sequence.
- Existing JSONL readers ignore missing provenance fields.

### P1.6 Synthetic terminal results for cancelled in-flight operations

**Problem**

Run/task cancellation events are now structured, but worker/tool sub-operations can still lack synthetic terminal records if cancelled mid-operation.

**Target behavior**

- If a task started a worker/tool/model call and cancellation occurs, append a synthetic terminal record:
  - `tool.cancelled` or `worker.cancelled`
  - reason code/message
  - startedAt/finishedAt
  - attemptId if available

**Suggested files**

- `src/runtime/task-runner.ts`
- `src/runtime/task-runner/progress.ts`
- `src/runtime/child-pi.ts`
- `src/runtime/cancellation.ts`
- `src/state/contracts.ts`
- `test/unit/cancellation.test.ts`

**Acceptance criteria**

- No started tool/model operation is left without terminal evidence after cancellation.
- Status/diagnostics can distinguish user cancel vs timeout vs shutdown.

## P1 — Capability Inventory and Control Center

### P1.7 Build run/project capability inventory view

**Current state**

Per-task capability artifacts exist. There is no unified project/run inventory UI/API yet.

**Target behavior**

`/team-settings` or new `/team-control` shows normalized inventory:

```ts
interface CapabilityItem {
  id: string;
  kind: "team" | "workflow" | "agent" | "skill" | "tool" | "hook" | "runtime" | "provider";
  name: string;
  source: "builtin" | "project" | "user" | "runtime";
  path?: string;
  state: "active" | "disabled" | "shadowed" | "missing";
  disabledReason?: string;
  shadowedBy?: string;
}
```

**Suggested files**

- `src/extension/team-tool/handle-settings.ts`
- `src/extension/management.ts`
- `src/agents/discover-agents.ts`
- `src/teams/discover-teams.ts`
- `src/workflows/discover-workflows.ts`
- `src/runtime/skill-instructions.ts`
- `docs/resource-formats.md`
- `test/unit/management.test.ts`

**Acceptance criteria**

- Inventory is stable and sorted.
- Shadowed project/user/builtin resources are visible.
- Skill disabled/budget state is visible.
- No file path is used as the only stable ID.

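The sorting and shadowing criteria could be met along these lines. The precedence order (project over user over builtin over runtime) and the `source:kind:name` ID scheme are assumptions made for this sketch, as are the names `DiscoveredResource` and `buildInventory`; the roadmap does not fix either choice.

```ts
type Kind = "team" | "workflow" | "agent" | "skill" | "tool" | "hook" | "runtime" | "provider";
type Source = "builtin" | "project" | "user" | "runtime";

// Hypothetical raw discovery result, before normalization.
interface DiscoveredResource {
  kind: Kind;
  name: string;
  source: Source;
  path?: string;
}

interface CapabilityItem extends DiscoveredResource {
  id: string;
  state: "active" | "disabled" | "shadowed" | "missing";
  disabledReason?: string;
  shadowedBy?: string;
}

// Assumed precedence: lower rank wins when kind and name collide.
const SOURCE_RANK: Record<Source, number> = { project: 0, user: 1, builtin: 2, runtime: 3 };

const cmp = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

function buildInventory(found: DiscoveredResource[]): CapabilityItem[] {
  // Locale-independent ordering keeps the output deterministic across machines.
  const sorted = [...found].sort(
    (a, b) =>
      cmp(a.kind, b.kind) || cmp(a.name, b.name) || SOURCE_RANK[a.source] - SOURCE_RANK[b.source],
  );
  const winners = new Map<string, string>();
  return sorted.map((r): CapabilityItem => {
    const id = `${r.source}:${r.kind}:${r.name}`; // identity does not depend on the path
    const key = `${r.kind}:${r.name}`;
    const winner = winners.get(key);
    if (winner === undefined) {
      winners.set(key, id);
      return { ...r, id, state: "active" };
    }
    return { ...r, id, state: "shadowed", shadowedBy: winner };
  });
}
```
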
### P1.8 Persist capability disables by stable ID

**Target behavior**

- Operator can disable a skill/agent/team by capability ID.
- Disable config survives path relocation when resource identity remains stable.
- Status explains disabled reason.

**Suggested files**

- `src/config/config.ts`
- `src/schema/config-schema.ts`
- discovery modules
- `test/unit/config-schema-validation.test.ts`

## P2 — Typed Hook Lifecycle

### P2.1 Introduce typed hook contract

**Target behavior**

Define typed lifecycle gates:

- `before_run_start`
- `before_task_start`
- `task_result`
- `before_cancel`
- `before_forget`
- `before_cleanup`
- `before_publish`
- `session_before_switch`
- `run_recovery`

Each hook declares:

```ts
type HookMode = "blocking" | "non_blocking";
type HookOutcome = "allow" | "block" | "modify" | "diagnostic";
```

Errors are recorded in diagnostics/events, not uncontrolled exceptions.

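A gate runner honouring these two types might look like the following sketch. It assumes synchronous hooks for brevity and assumes that a throwing blocking hook fails closed; `Hook`, `GateResult`, and `runGate` are hypothetical names, and the `modify` outcome is not modelled.

```ts
type HookMode = "blocking" | "non_blocking";
type HookOutcome = "allow" | "block" | "modify" | "diagnostic";

// Hypothetical hook shape; real hooks would likely be async and receive a typed context.
interface Hook {
  name: string;
  mode: HookMode;
  run(context: unknown): HookOutcome;
}

interface GateResult {
  allowed: boolean;
  blockedBy?: string;
  diagnostics: string[];
}

function runGate(hooks: Hook[], context: unknown): GateResult {
  const diagnostics: string[] = [];
  for (const hook of hooks) {
    try {
      const outcome = hook.run(context);
      if (outcome === "block" && hook.mode === "blocking") {
        return { allowed: false, blockedBy: hook.name, diagnostics };
      }
      if (outcome === "diagnostic") diagnostics.push(`${hook.name}: diagnostic`);
    } catch (err) {
      // Errors become diagnostics instead of escaping as uncontrolled exceptions.
      diagnostics.push(`${hook.name}: ${String(err)}`);
      // Assumption: a failing blocking hook fails closed.
      if (hook.mode === "blocking") {
        return { allowed: false, blockedBy: hook.name, diagnostics };
      }
    }
  }
  return { allowed: true, diagnostics };
}
```
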
**Suggested files**

- new `src/hooks/*`
- `src/extension/register.ts`
- `src/runtime/team-runner.ts`
- `src/extension/team-tool/cancel.ts`
- `src/extension/team-tool/lifecycle-actions.ts`
- `docs/resource-formats.md`
- `test/unit/hooks*.test.ts`

**Acceptance criteria**

- Blocking hook can stop a run before worker start with clear event and status.
- Non-blocking hook failure records diagnostic and does not crash run.
- Hook context is redacted and bounded.

### P2.2 Require intent via policy/hook for destructive actions

**Current state**

Intent is optional for cancel/cleanup/forget/prune.

**Target behavior**

- Optional config:

```json
{
  "policy": {
    "requireIntentForDestructiveActions": true
  }
}
```

- Actions requiring intent:
  - cancel
  - forget
  - prune
  - cleanup with force
  - publish/release helpers if added
  - worktree removal

**Acceptance criteria**

- Missing intent blocks action with actionable error.
- Existing tests can opt out or provide intent.
- Audit trail includes intent after approval.

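The policy check itself is small. The sketch below assumes intent arrives as a free-text string and that plain (non-forced) cleanup stays exempt, matching the list above; `ActionRequest`, `IntentCheck`, and `checkIntent` are hypothetical names, and the error wording is illustrative.

```ts
type DestructiveAction = "cancel" | "forget" | "prune" | "cleanup" | "worktree-removal";

interface PolicyConfig {
  requireIntentForDestructiveActions?: boolean;
}

// Hypothetical request shape for illustration.
interface ActionRequest {
  action: DestructiveAction;
  force?: boolean;
  intent?: string;
}

type IntentCheck = { ok: true } | { ok: false; error: string };

function checkIntent(req: ActionRequest, policy: PolicyConfig): IntentCheck {
  // The policy is opt-in, so existing callers keep working when it is unset.
  if (!policy.requireIntentForDestructiveActions) return { ok: true };
  // Only forced cleanup is treated as destructive.
  if (req.action === "cleanup" && !req.force) return { ok: true };
  if (req.intent !== undefined && req.intent.trim().length > 0) return { ok: true };
  return {
    ok: false,
    error: `Action "${req.action}" requires intent; retry with an intent explaining why it is needed.`,
  };
}
```
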
## P2 — Durable History vs Prompt Projection

### P2.3 Separate durable run history projection from worker prompt text

**Current state**

Prompt pipeline artifacts exist, but context projection logic is still coupled to prompt construction in multiple places.

**Target behavior**

Introduce explicit projection functions:

```ts
transformRunContextBeforeWorkerStart(...)
convertRunHistoryToWorkerPrompt(...)
```

Rules:

- Durable history retains events, mailbox, artifacts, UI/runtime metadata.
- Worker prompt gets a bounded projection.
- UI/runtime events are not prompt text unless explicitly selected.

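To make the rules concrete, here is a rough sketch of a bounded projection. It is not the signature of either function named above: `HistoryEntry`, `PromptProjection`, and `projectHistoryForPrompt` are hypothetical, and a character budget stands in for whatever bound the pipeline actually uses.

```ts
type HistorySource = "event" | "mailbox" | "artifact" | "ui" | "runtime";

// Hypothetical durable-history entry; `ref` is an assumed stable reference.
interface HistoryEntry {
  source: HistorySource;
  ref: string;
  text: string;
}

interface PromptProjection {
  text: string;
  sources: string[]; // refs that were embedded, for the prompt pipeline artifact
  omittedRefs: string[]; // refs left out because of the budget
}

function projectHistoryForPrompt(
  history: HistoryEntry[],
  maxChars: number,
  allowed: ReadonlySet<HistorySource> = new Set<HistorySource>(["event", "mailbox", "artifact"]),
): PromptProjection {
  const lines: string[] = [];
  const sources: string[] = [];
  const omittedRefs: string[] = [];
  let used = 0;
  for (const entry of history) {
    // UI/runtime entries stay in durable history unless explicitly selected.
    if (!allowed.has(entry.source)) continue;
    if (used + entry.text.length > maxChars) {
      omittedRefs.push(entry.ref); // referenced, not embedded
      continue;
    }
    lines.push(entry.text);
    sources.push(entry.ref);
    used += entry.text.length;
  }
  return { text: lines.join("\n"), sources, omittedRefs };
}
```
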
**Suggested files**

- `src/runtime/task-runner/prompt-pipeline.ts`
- `src/runtime/task-runner/prompt-builder.ts`
- `src/runtime/task-output-context.ts`
- `src/runtime/task-runner.ts`
- `test/unit/task-runner-prompt-pipeline.test.ts`

**Acceptance criteria**

- Prompt pipeline artifact identifies every projection source.
- Large event/mailbox history is summarized or referenced, not blindly embedded.
- Tests verify UI/runtime events are not injected as instructions.

## P2 — Cooperative Cancellation for Internal Scans

### P2.4 Add internal `CancellationToken`

**Target behavior**

A utility for long internal loops:

```ts
interface CancellationToken {
  readonly aborted: boolean;
  readonly reason?: CancellationReason;
  heartbeat(stage?: string): void;
  throwIfCancelled(): void;
  wait(ms: number): Promise<void>;
}
```

Use it in:

- run index scans
- artifact cleanup
- mailbox validation/replay
- worktree cleanup
- diagnostic export
- large transcript/event reads

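One possible implementation, with a scan loop that checks the token on a bounded cadence, is sketched below. `CancellationReason` is given an assumed `{ code, message }` shape, and `CancellationSource`, `scanEntries`, and `checkEvery` are hypothetical names.

```ts
// Assumed shape; the real CancellationReason may carry more fields.
interface CancellationReason {
  code: string;
  message?: string;
}

interface CancellationToken {
  readonly aborted: boolean;
  readonly reason?: CancellationReason;
  heartbeat(stage?: string): void;
  throwIfCancelled(): void;
  wait(ms: number): Promise<void>;
}

class CancellationSource implements CancellationToken {
  aborted = false;
  reason: CancellationReason | undefined;
  lastStage: string | undefined;

  cancel(reason: CancellationReason): void {
    this.aborted = true;
    this.reason = reason;
  }

  heartbeat(stage?: string): void {
    if (stage !== undefined) this.lastStage = stage;
  }

  throwIfCancelled(): void {
    if (this.aborted) throw new Error(`cancelled: ${this.reason?.code ?? "unknown"}`);
  }

  // Resolves after `ms`, or rejects if cancellation happened while waiting.
  wait(ms: number): Promise<void> {
    return new Promise<void>((resolve, reject) => {
      setTimeout(() => {
        if (this.aborted) reject(new Error(`cancelled: ${this.reason?.code ?? "unknown"}`));
        else resolve();
      }, ms);
    });
  }
}

// Callers that pass no token keep their current behaviour.
function scanEntries(entries: string[], token?: CancellationToken, checkEvery = 100): number {
  let scanned = 0;
  for (const _entry of entries) {
    if (token && scanned % checkEvery === 0) {
      token.throwIfCancelled();
      token.heartbeat("scan");
    }
    scanned += 1;
  }
  return scanned;
}
```
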
**Suggested files**

- new `src/runtime/cancellation-token.ts`
- `src/extension/run-index.ts`
- `src/extension/registration/artifact-cleanup.ts`
- `src/state/mailbox.ts`
- `src/ui/run-snapshot-cache.ts`
- `test/unit/cancellation-token.test.ts`

**Acceptance criteria**

- Long scan can abort within bounded cadence.
- Heartbeat stage appears in diagnostics/logs.
- Existing APIs can pass no token and keep current behavior.

## P2 — Artifact Store Improvements

### P2.5 Content-addressed blob artifacts

**Target behavior**

Large logs/transcripts/results are stored as blobs:

```text
artifacts/blobs/sha256/<hash>
artifacts/blob-metadata/<hash>.json
```

Metadata includes:

- runId/taskId
- MIME/type
- producer
- original path/name
- size/hash
- redaction status
- retention policy

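Deriving the two paths from content is straightforward with Node's built-in `node:crypto`. In this sketch `BlobRef` and `blobRefFor` are hypothetical names, and only the path layout comes from the roadmap.

```ts
import { createHash } from "node:crypto";

// Hypothetical reference record; the real metadata file would carry the fields listed above.
interface BlobRef {
  hash: string;
  size: number;
  blobPath: string;
  metadataPath: string;
}

function blobRefFor(content: string | Uint8Array): BlobRef {
  const bytes = typeof content === "string" ? Buffer.from(content, "utf8") : Buffer.from(content);
  // Identical content always maps to the same hash, so duplicates share one blob.
  const hash = createHash("sha256").update(bytes).digest("hex");
  return {
    hash,
    size: bytes.byteLength,
    blobPath: `artifacts/blobs/sha256/${hash}`,
    metadataPath: `artifacts/blob-metadata/${hash}.json`,
  };
}
```

Because the hash is lowercase hex, the derived path cannot contain separators or `..` segments, which may help keep the existing path traversal protections simple.
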
**Suggested files**

- `src/state/artifact-store.ts`
- `src/runtime/task-runner.ts`
- `src/ui/transcript-viewer.ts`
- `src/extension/run-export.ts`
- `src/extension/run-import.ts`
- `test/unit/artifact-store*.test.ts`

**Acceptance criteria**

- Artifacts above threshold are blob-referenced.
- Run export/import preserves blobs.
- GC removes unreferenced blobs after retention.
- Path traversal protections remain intact.

## P2 — UI and Dashboard Upgrades

### P2.6 Show capability/effectiveness/cancellation panels in dashboard

**Target behavior**

Dashboard panes expose:

- run effectiveness score and no-observed-work tasks;
- cancellation reason and intent;
- capability inventory for selected task;
- attempt/deadletter timeline.

**Suggested files**

- `src/ui/run-dashboard.ts`
- `src/ui/dashboard-panes/*`
- `src/ui/snapshot-types.ts`
- `src/ui/run-snapshot-cache.ts`
- `test/unit/run-dashboard.test.ts`
- new pane tests

**Acceptance criteria**

- No heavy synchronous scans in render path.
- Pane output is width-safe.
- Snapshot cache provides precomputed compact data.

### P2.7 Event-first UI stream

**Target behavior**

Move more live UI updates from file polling to semantic events:

- `task_started`
- `task_completed`
- `worker_status`
- `mailbox_updated`
- `effectiveness_changed`

**Acceptance criteria**

- Render scheduler remains coalesced and overlap-safe.
- UI still recovers from durable files after restart.
- File polling is fallback, not the hot path.

## P2 — Raw Scan Entry Cache

### P2.8 Cache raw entries, not final semantic query results

**Target behavior**

Shared raw scan cache for:

- runs
- artifacts
- mailbox files
- transcript chunks
- worktree roots

Then apply filters/sorts after retrieval.

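The distinction between caching raw entries and caching query results can be sketched as follows. `RawRunEntry`, `RawEntryCache`, and `queryRuns` are hypothetical names, and the loader stands in for whatever reads a manifest from disk.

```ts
// Hypothetical raw entry; real entries would mirror the on-disk manifest.
interface RawRunEntry {
  runId: string;
  status: string;
  updatedAt: number;
}

class RawEntryCache<T> {
  private readonly entries = new Map<string, T>();
  private readonly load: (key: string) => T;
  loads = 0; // exposed so tests can assert that rescans are avoided

  constructor(load: (key: string) => T) {
    this.load = load;
  }

  get(key: string): T {
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;
    const fresh = this.load(key);
    this.loads += 1;
    this.entries.set(key, fresh);
    return fresh;
  }

  // Called when a state mutation touches this key.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Filters and sorts run after retrieval, so differing queries share the same raw entries.
function queryRuns(
  cache: RawEntryCache<RawRunEntry>,
  runIds: string[],
  predicate: (entry: RawRunEntry) => boolean,
): RawRunEntry[] {
  return runIds
    .map((id) => cache.get(id))
    .filter(predicate)
    .sort(
      (a, b) =>
        b.updatedAt - a.updatedAt || (a.runId < b.runId ? -1 : a.runId > b.runId ? 1 : 0),
    );
}
```
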
**Suggested files**

- `src/runtime/manifest-cache.ts`
- `src/ui/run-snapshot-cache.ts`
- `src/extension/run-index.ts`
- `src/utils/file-coalescer.ts`

**Acceptance criteria**

- Deterministic sort order.
- State mutation invalidates relevant raw entries.
- Large workspaces do not trigger full rescans on every render/status.

## P3 — Release/Install Hardening

### P3.1 Tarball install smoke before publish

**Target behavior**

Release workflow requires:

```bash
npm run ci
npm pack --dry-run
npm pack
# install tarball in temp project
# verify pi extension load smoke
# verify npm package files and version/tag consistency
```

**Suggested files**

- `docs/publishing.md`
- `package.json` scripts
- `.github/workflows/*` if CI is added
- optional `scripts/release-smoke.mjs`

**Acceptance criteria**

- Packed tarball loads extension in temp Pi home.
- Versions in the package, changelog, tag, and `npm view` output are consistent.
- Release instructions include rollback notes.

## Suggested Implementation Order

1. **P0.1 Effectiveness policy enforcement** — prevents misleading completed runs.
2. **P0.2 Persist runtime safety** — improves debugging for worker spawn issues.
3. **P1.3 Two-phase worker teardown** — reduces stale/zombie worker risk.
4. **P1.1 Durable steering/follow-up queues** — completes semantic split started at live-control level.
5. **P1.5 Event-tree provenance** — builds on current `attemptId` work.
6. **P1.7 Capability inventory view** — turns existing per-task artifacts into operator UX.
7. **P2.3 Durable history projection** — reduces prompt/context risks.
8. **P2.4 CancellationToken** — improves responsiveness of internal scans.
9. **P2.5 Blob artifacts** — prevents log/transcript bloat.
10. **P2.6 Dashboard panels** — surfaces all new evidence in UI.

## Release Guidance

Before publishing a patch with these upgrades:

```bash
npx tsc --noEmit
npm run test:unit
npm run test:integration
npm pack --dry-run
```

For runtime/process changes also run targeted child-worker integration tests:

```bash
node --experimental-strip-types --test --test-concurrency=1 --test-timeout=60000 \
  test/integration/mock-child-run.test.ts \
  test/integration/mock-child-json-run.test.ts \
  test/integration/phase6-runtime-hardening.test.ts
```

Do not publish without explicit user confirmation and a green verification pass.