Add 5 pi extensions: pi-subagents, pi-crew, rpiv-pi, pi-interactive-shell, pi-intercom
548
extensions/pi-crew/docs/research-optimization-plan.md
Normal file
@@ -0,0 +1,548 @@
# Plan: pi-crew Optimization Opportunities

> Date: 2026-04-29 | Revised: 2026-04-29 (after design review)
> Based on: research-pi-coding-agent.md, research-extension-system.md, research-extension-examples.md

## Overview

After a deep read of the pi-mono extension system and all 60+ example extensions, this plan lists
the optimization opportunities for pi-crew, classified by effort and impact.

**Revision notes (2026-04-29):**
- Re-ordered Phase 1 so the compliance-required task (permission gate) comes before the optimization tasks.
- Split `terminate: true` into two sub-tasks because their UX risks differ.
- Moved "custom compaction model" from Phase 2 down to Phase 3 (risk vs. ROI).
- Changed cancel-compaction to **defer + retry** (avoids context overflow).
- Made the compaction threshold dynamic, based on `contextWindow`, instead of hardcoding 150k.
- Added a roadmap-level rollback strategy and supplementary research on gaps.

## Priority Matrix

```
Impact
  ↑
  │  HIGH    │  HIGH    │
  │  Effort  │  Effort  │
  │  LOW     │  MEDIUM  │
  │ ─────────┼───────── │
  │  MEDIUM  │  LOW     │
  │  Effort  │  Effort  │
  │  LOW     │  MEDIUM  │
  └──────────────────────→ Effort
```

## Implementation Status (2026-04-29)

Implemented in code:

- Phase 1.4 permission gate for destructive `team` tool calls.
- Phase 1.6 telemetry baseline fields for subagent completion (`turnCount`, `terminated`, `durationMs`).
- Phase 1.2 compaction guard as defer + retry, moved into `src/extension/registration/compaction-guard.ts`.
- Phase 1.1a `terminate: true` for background/queued subagent launches.
- Phase 1.3 public event bus events (`crew.subagent.completed`, `crew.run.completed`, `crew.run.failed`, `crew.run.cancelled`).
- Phase 1.5 auto session naming for new team runs when no custom session name exists.
- Phase 2.1 proactive compaction with dynamic context-window threshold.
- Phase 2.3 Pi session entries for run start/completion (`crew:run-started`, `crew:run-completed`).
- Phase 2.4 config-driven subagent tool aliases via `config.tools`.
- Phase 2.5 foreground working indicator, using an optional API compatibility shim because older `pi-coding-agent` type surfaces may not expose `ctx.ui.setWorkingIndicator`.
- Phase 3.3 safe mailbox event bus publication (`crew.mailbox.message`, `crew.mailbox.acknowledged`).

Deferred by design:

- Phase 1.1b foreground `terminate: true` is implemented as opt-in via `config.tools.terminateOnForeground=true`; default remains safe/off pending telemetry.
- Phase 3.4 structured artifact index is implemented for pi-crew-triggered compactions via `crew:artifact-index` session entries plus compaction custom instructions. Direct `CompactionEntry.details` augmentation is not available through the current upstream extension API without replacing default compaction.
- Phase 3.1, 3.3b, 3.5, and 4.2 are now marked won't-do/research-only after deeper risk/ROI analysis.
- Phase 3.2 remains conditional on agent-level opt-in design. Phase 4.1 remains deferred pending format-compat research.

Validation:

- `npm run typecheck` passes.
- `npm test` passes: 283 unit tests + 26 integration tests.

## Roadmap-level Rollback Strategy

- **1 sub-task = 1 commit**, independently revertible. Do NOT fold all of Phase 1 into a single commit.
- Every commit must ship its own tests. If one fails in production, `git revert <sha>` must not drag other tasks with it.
- Phase 1.6 (telemetry) lands before Phase 1.1 so there is a measurement baseline.

---

## Phase 1: Quick Wins & Compliance (HIGH impact, LOW effort)

Estimated time: 2-3 sessions. **The order has been changed from the original research.**

### 1.4 (FIRST) Permission gate for destructive team actions

**Why first:** AGENTS.md states *"Management deletes must require confirm: true; referenced
resources blocked unless force: true"*. This is a **mandatory rule**, not an optimization.

**Files to change:** `src/extension/registration/team-tool.ts` (or a new file)

**Current state:** The handler already performs a check, but there is no `tool_call` hook, so error messages are inconsistent.

**Optimization:**
```typescript
pi.on("tool_call", async (event, ctx) => {
  // Only gate the pi-crew `team` tool.
  if (event.toolName !== "team") return;
  const input = event.input as Record<string, unknown>;
  const destructiveActions = ["delete", "forget", "prune", "cleanup"];

  if (destructiveActions.includes(input.action as string)) {
    // Block unless the caller explicitly confirmed or forced the action.
    if (!input.confirm && !input.force) {
      return {
        block: true,
        reason: `Destructive action '${input.action}' requires confirm=true (or force=true to bypass)`,
      };
    }
  }
});
```

**Note on precedence:** If schema validation already checks `confirm`, **pick exactly one place**:
- Option A: Let the schema validate and drop the hook (simpler).
- Option B: Let the hook validate and remove the check from the handler (consistent error message).

→ Option B is recommended because the hook gates every entry point (including any future entry point that bypasses the schema).

**Expected benefit:** Compliance with AGENTS.md and a production safety net.

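The confirm/force matrix enforced by the hook can also be expressed as a pure function, which makes it unit-testable without a running Pi session. The following is a minimal sketch, assuming the same four destructive actions as above; `gateTeamAction` and `GateDecision` are hypothetical names and not part of pi-crew or the Pi extension API. Unlike the hook, it uses strict boolean checks, so a truthy string such as `"yes"` does not count as confirmation.

```typescript
// Hypothetical helper for illustration; not part of pi-crew or the Pi extension API.
type GateDecision = { block: false } | { block: true; reason: string };

const DESTRUCTIVE_ACTIONS = ["delete", "forget", "prune", "cleanup"];

function gateTeamAction(input: Record<string, unknown>): GateDecision {
  const action = typeof input.action === "string" ? input.action : "";

  // Non-destructive (or missing) actions are never blocked by this gate.
  if (DESTRUCTIVE_ACTIONS.indexOf(action) === -1) return { block: false };

  // Same rule as the hook, but strict: only a literal `true` confirms or forces.
  if (input.confirm === true || input.force === true) return { block: false };

  return {
    block: true,
    reason: `Destructive action '${action}' requires confirm=true (or force=true to bypass)`,
  };
}
```

A `tool_call` hook could then return the decision only when `block` is `true`, keeping the error message in one place.
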
---

### 1.6 (NEW) Telemetry baseline for terminate impact

**Why before 1.1:** The original plan claims "30-50% fewer LLM turns", which is only a guess. A real measurement baseline is needed.

**Files to change:** `src/runtime/subagent-manager.ts`, `src/extension/register.ts`

**Optimization:** Log `turnCount`, `terminated: boolean`, and `durationMs` in the `crew.subagent.completed` event:
```typescript
pi.events.emit("crew.subagent.completed", {
  id: record.id,
  runId: record.runId,
  type: record.type,
  status: record.status,
  usage: record.usage,
  turnCount: record.turnCount,      // ← NEW
  terminated: record.terminated,    // ← NEW (false before Phase 1.1)
  durationMs: record.durationMs,    // ← NEW
});
```

**Expected benefit:** Measure before and after Phase 1.1 to determine the actual ROI. If the turn saving is below 10%, the decision may be not to deploy 1.1b.

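One possible way to turn the collected samples into the 10% decision is sketched below. It assumes `turnCount` is the metric being compared and that runs with `terminated: false` form the baseline; `CompletionSample`, `turnSaving`, and `meetsSavingBar` are hypothetical names introduced for illustration only.

```typescript
// Hypothetical shape: only the telemetry fields relevant to the comparison.
interface CompletionSample {
  turnCount: number;
  terminated: boolean;
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Relative saving of terminated runs versus the non-terminated baseline.
// Returns null when either group is empty, because no comparison is possible.
function turnSaving(samples: CompletionSample[]): number | null {
  const baseline = samples.filter((s) => !s.terminated).map((s) => s.turnCount);
  const treated = samples.filter((s) => s.terminated).map((s) => s.turnCount);
  if (baseline.length === 0 || treated.length === 0) return null;

  const baselineMean = mean(baseline);
  if (baselineMean === 0) return null;
  return (baselineMean - mean(treated)) / baselineMean;
}

// Decision rule from the plan: below a 10% saving, 1.1b may not be worth deploying.
function meetsSavingBar(samples: CompletionSample[], minSaving = 0.1): boolean {
  const saving = turnSaving(samples);
  return saving !== null && saving >= minSaving;
}
```

Returning `null` for an empty group keeps "no data yet" distinct from "no saving", which matters while the baseline is still being collected.
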
---

### 1.2 `session_before_compact` guard for foreground runs (DEFER, not CANCEL)

**Files to change:** `src/extension/register.ts`

**Current state:** Compaction is not hooked, so it can run in the middle of a foreground run.

**Optimization (revised):** Defer + retry instead of a hard cancel (avoids context overflow):
```typescript
let pendingCompactReason: string | null = null;

pi.on("session_before_compact", async (event, ctx) => {
  if (foregroundControllers.size > 0) {
    // Remember that a compaction was requested so it can be retried later.
    pendingCompactReason = "deferred-during-foreground-run";
    ctx.ui.notify("Compaction deferred until foreground run completes", "info");
    return { cancel: true };
  }
});

// Retry once the run has finished:
pi.on("turn_end", (_event, ctx) => {
  if (foregroundControllers.size === 0 && pendingCompactReason) {
    pendingCompactReason = null;
    ctx.compact({
      onComplete: () => ctx.ui.notify("Deferred compaction completed", "info"),
    });
  }
});
```

**Expected benefit:** Prevents context loss during a foreground run while still guaranteeing that compaction eventually runs.

**Risk:** If a run is extremely long and `foregroundControllers` never drops to 0, the context can still overflow. Mitigation: a hard threshold (e.g. 95% of the context window) bypasses the deferral and forces compaction.

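The 95% mitigation is described above but not shown in the hook. A minimal sketch of the decision, kept as a pure function, might look like this; `decideCompaction`, `CompactGuardInput`, and `HARD_BYPASS_RATIO` are hypothetical names, and how the token count and context window are obtained inside the hook is left as an assumption.

```typescript
type CompactDecision = "proceed" | "defer";

// Hypothetical inputs; in the extension they would come from
// `foregroundControllers.size` and the session's context usage.
interface CompactGuardInput {
  foregroundRuns: number;
  usedTokens: number;
  contextWindow: number;
}

// Above this share of the context window, deferral is no longer safe.
const HARD_BYPASS_RATIO = 0.95;

function decideCompaction(input: CompactGuardInput): CompactDecision {
  // No foreground run: nothing to protect, let compaction run.
  if (input.foregroundRuns === 0) return "proceed";

  // Unknown or invalid window: treat as full so we never defer blindly.
  const ratio = input.contextWindow > 0 ? input.usedTokens / input.contextWindow : 1;

  // Foreground run active: defer unless the context is about to overflow.
  return ratio >= HARD_BYPASS_RATIO ? "proceed" : "defer";
}
```

In the hook, `"defer"` would map to `return { cancel: true }` and `"proceed"` to returning nothing.
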
---

### 1.1a `terminate: true` for **background queued** results (SAFE)

**Why split:** The background queue carries no UX risk; foreground completion does (see 1.1b).

**Files to change:** `src/extension/registration/subagent-tools.ts`

**Optimization:**
```typescript
// Agent tool, background mode: terminate right after the run is queued
if (params.run_in_background) {
  return {
    ...subagentToolResult(...),
    terminate: true,  // ← Saves one LLM turn with no UX risk
  };
}
```

**Expected benefit:** One fewer LLM turn for every background spawn. Verify with the telemetry from 1.6.

---

### 1.3 Public events via `pi.events`

**Files to change:** `src/extension/register.ts`

**Current state:** The event bus is only used for the internal `subagent.stuck-blocked` event.

**Naming convention (revised):** Align with the upstream `dot.kebab` pattern (already used for `subagent.stuck-blocked`):
```typescript
// Documented in the README as PUBLIC API:
pi.events.emit("crew.subagent.completed", { ... });
pi.events.emit("crew.run.completed", { runId, team, workflow, status, taskCount, totalUsage });
pi.events.emit("crew.run.failed", { runId, team, workflow, error, failedTaskId });
pi.events.emit("crew.run.cancelled", { runId, team, workflow, status, taskCount });
```

**Versioning:** Note in the README that event payloads are semver-stable from pi-crew 0.2.0.

**Expected benefit:** Other extensions (logging, notification, metrics) can subscribe.

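To illustrate what a subscriber might look like, the sketch below validates a `crew.run.failed` payload before using it. It deliberately uses a tiny stand-in bus (`TinyBus`) rather than the real `pi.events` object, whose subscription API is not shown in this plan; the field names come from the emit calls above, while the field types and all helper names are assumptions.

```typescript
// Field names come from the emit calls above; the field types are assumptions.
interface CrewRunFailed {
  runId: string;
  team: string;
  workflow: string;
  error: string;
  failedTaskId?: string;
}

type Handler = (payload: unknown) => void;

// Stand-in bus for illustration only; the real `pi.events` API may differ.
class TinyBus {
  private handlers: Record<string, Handler[]> = {};

  on(name: string, handler: Handler): void {
    const list = this.handlers[name] || [];
    list.push(handler);
    this.handlers[name] = list;
  }

  emit(name: string, payload: unknown): void {
    for (const handler of this.handlers[name] || []) handler(payload);
  }
}

// Runtime guard so a subscriber never trusts an untyped payload blindly.
function isCrewRunFailed(payload: unknown): payload is CrewRunFailed {
  if (typeof payload !== "object" || payload === null) return false;
  const record = payload as Record<string, unknown>;
  return (
    typeof record.runId === "string" &&
    typeof record.team === "string" &&
    typeof record.workflow === "string" &&
    typeof record.error === "string"
  );
}

// Example subscriber: collect failed run IDs for a metrics or alerting extension.
function collectFailures(bus: TinyBus): string[] {
  const failedRunIds: string[] = [];
  bus.on("crew.run.failed", (payload) => {
    if (isCrewRunFailed(payload)) failedRunIds.push(payload.runId);
  });
  return failedRunIds;
}
```

Guarding at the subscriber is a defensive choice: it keeps third-party extensions working even if a payload gains or loses optional fields between releases.
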
---

### 1.5 Auto session name from team run context

**Files to change:** `src/extension/registration/team-tool.ts`

**Optimization:**
```typescript
// In the team tool's execute, before starting the run.
// Only set a name when the user has not already chosen one.
if (!pi.getSessionName()) {
  pi.setSessionName(`pi-crew: ${team}/${workflow} — ${goal.slice(0, 60)}`);
}
```

**Expected benefit:** Better organization when browsing the session list.

---

### 1.1b (OPT-IN DONE, DEFAULT OFF) `terminate: true` for **foreground completed** results

**Why default off:** UX risk. If the LLM gets no turn to summarize the result, the user may not understand the output.

**Implementation:** opt-in flag, safe by default:

```json
{
  "tools": {
    "terminateOnForeground": true
  }
}
```

When enabled, foreground `Agent`/`crew_agent` completed results set `terminate: true` and persist `record.terminated=true` for telemetry. Decision to make this default-on still requires telemetry evidence:

- Average turn count after a foreground Agent completion is ≥ 2.
- The output is already self-explanatory (measured via user feedback or retry rate).

---

## Phase 2: Medium Effort Optimizations

Estimated time: 2-3 sessions. (One task fewer than in the original plan.)

### 2.1 Proactive compaction monitoring (DYNAMIC threshold)

**Files to change:** New file `src/extension/registration/compaction-guard.ts`

**Current state:** Relies only on built-in auto-compaction (which can kick in late).

**Optimization (revised):** Dynamic threshold based on `contextWindow`:
```typescript
export function registerCompactionGuard(pi: ExtensionAPI) {
  const TRIGGER_RATIO = 0.75; // 75% of the context window → trigger

  pi.on("turn_end", (_event, ctx) => {
    const usage = ctx.getContextUsage();
    // Fall back to 200k when the model does not report a context window.
    const ctxWindow = ctx.model?.contextWindow ?? 200_000;
    const threshold = ctxWindow * TRIGGER_RATIO;

    if (usage?.tokens && usage.tokens > threshold) {
      // The foreground guard from Phase 1.2 defers this if needed
      ctx.compact({
        customInstructions: "Prioritize keeping team run state, task results, and artifact references. Keep the conversation context brief.",
        onComplete: () => ctx.ui.notify("Auto-compacted context during team run", "info"),
        onError: (err) => ctx.ui.notify(`Compaction failed: ${err.message}`, "error"),
      });
    }
  });
}
```

**Why a ratio instead of a hardcoded value:** Claude Haiku has 200k, Gemini Pro 2M, GPT-4o 128k, and small models 32k. A hardcoded 150k is wrong for roughly 90% of cases.

**Expected benefit:** Avoids context overflow errors when a foreground run gets too long.

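The threshold arithmetic can be isolated for unit testing. A minimal sketch under the same assumptions as the guard above (75% ratio, 200k fallback); `compactionThreshold` and `shouldCompact` are hypothetical helper names:

```typescript
const TRIGGER_RATIO = 0.75;
const FALLBACK_CONTEXT_WINDOW = 200_000;

// Token count above which proactive compaction should trigger.
function compactionThreshold(contextWindow?: number): number {
  const size = contextWindow && contextWindow > 0 ? contextWindow : FALLBACK_CONTEXT_WINDOW;
  return Math.floor(size * TRIGGER_RATIO);
}

// True when usage is known and strictly above the threshold.
function shouldCompact(usedTokens: number | undefined, contextWindow?: number): boolean {
  return usedTokens !== undefined && usedTokens > compactionThreshold(contextWindow);
}
```

For the window sizes listed above this gives 24,000 tokens for a 32k model, 96,000 for 128k, 150,000 for 200k, and 1,500,000 for 2M. The old 150k constant matches the ratio only for a 200k window, and for a 128k model it could never trigger because it exceeds the window itself.
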
---

### 2.3 `pi.appendEntry` for cross-session run awareness

**Files to change:** `src/extension/register.ts`

**Optimization:**
```typescript
// When a run starts:
pi.appendEntry("crew:run-started", {
  runId, team, workflow, goal, timestamp: Date.now(),
});

// When a run completes:
pi.appendEntry("crew:run-completed", {
  runId, status, taskCount, totalUsage, timestamp: Date.now(),
});
```

**Expected benefit:**
- After a session reload, the related runs are known.
- Session exports include run context.
- History is easy to track.

---

### 2.4 Config-driven tool registration

**Files to change:** `src/extension/registration/subagent-tools.ts`

**Current state:** Always registers 6 tool variants (Agent, crew_agent, plus their result and steer tools).

**Optimization:**
```typescript
export function registerSubagentTools(pi: ExtensionAPI, subagentManager: SubagentManager) {
  const cfg = loadConfig((pi.getFlag("cwd") as string) || process.cwd());

  // Conflict-safe tools (always registered)
  pi.registerTool(crewAgentTool);
  pi.registerTool(crewAgentResultTool);

  // Claude-style aliases: only if not disabled.
  // Registration errors (e.g. a name conflict) are deliberately ignored.
  if (cfg.config.tools?.enableClaudeStyleAliases !== false) {
    try { pi.registerTool(agentTool); } catch {}
    try { pi.registerTool(getSubagentResultTool); } catch {}
  }

  // Steer tools: only if not disabled
  if (cfg.config.tools?.enableSteer !== false) {
    try { pi.registerTool(crewAgentSteerTool); } catch {}
    try { pi.registerTool(steerSubagentTool); } catch {}
  }
}
```

**Expected benefit:** Avoids polluting the tool namespace and gives users fine-grained control.

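The selection logic can be separated from the registration side effects so it is testable on its own. The sketch below mirrors the two flags from the code above; only `Agent` and `crew_agent` are names that appear in this plan, so the other tool names, as well as `ToolsConfig` and `resolveToolNames`, are placeholders.

```typescript
// Mirrors the two flags read from `config.tools` above.
interface ToolsConfig {
  enableClaudeStyleAliases?: boolean;
  enableSteer?: boolean;
}

// Returns the tool names that would be registered for a given config.
// Names other than "Agent" and "crew_agent" are placeholders.
function resolveToolNames(tools: ToolsConfig = {}): string[] {
  // Conflict-safe tools are always present.
  const names = ["crew_agent", "crew_agent_result"];

  // Flags default to enabled: only an explicit `false` turns a group off.
  if (tools.enableClaudeStyleAliases !== false) {
    names.push("Agent", "get_subagent_result");
  }
  if (tools.enableSteer !== false) {
    names.push("crew_agent_steer", "steer_subagent");
  }
  return names;
}
```

With an empty config this yields all six variants, which matches the current behavior described above; setting both flags to `false` leaves only the two conflict-safe tools.
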
---

### 2.5 Custom working indicator in foreground runs

**Files to change:** `src/extension/register.ts`

**Optimization:**
```typescript
// While a foreground run is active:
ctx.ui.setWorkingIndicator({
  frames: ["⣾", "⣽", "⣻", "⢿", "⡿", "⣟", "⣯", "⣷"],
  intervalMs: 80,
});
ctx.ui.setWorkingMessage(
  `Team run: ${completedTasks}/${totalTasks} tasks done...`
);

// When it ends:
ctx.ui.setWorkingIndicator();  // Restore default
ctx.ui.setWorkingMessage();    // Clear
```

**Compat shim note:** The implementation uses an optional API compatibility shim:

```typescript
// `indicatorOptions` stands for the frames/interval object shown above.
type IndicatorUI = { setWorkingIndicator?: (options?: unknown) => void };
(ctx.ui as IndicatorUI).setWorkingIndicator?.(indicatorOptions);
```

Reason: some versions/type surfaces of `@mariozechner/pi-coding-agent` do not yet expose
`setWorkingIndicator` on `ExtensionUIContext`. The optional shim keeps backward compatibility and
avoids crashes or runtime type mismatches; if the API is missing, the custom spinner is simply
skipped and `setWorkingMessage()` is still used.

**Expected benefit:** Better UX; the user can see that a team run is in progress.

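The shim can also be wrapped in a small helper that reports whether the host supports the call, which is handy for tests and for logging a one-time fallback notice. This is a sketch only; `IndicatorOptions`, `MaybeIndicatorUI`, and `trySetWorkingIndicator` are hypothetical names, and the options shape is inferred from the snippet above rather than from the upstream type definitions.

```typescript
// Shape inferred from the usage above; the upstream type may differ.
interface IndicatorOptions {
  frames: string[];
  intervalMs: number;
}

type MaybeIndicatorUI = {
  setWorkingIndicator?: (options?: IndicatorOptions) => void;
};

// Calls `setWorkingIndicator` when the host UI provides it.
// Returns false (and does nothing) on older hosts that lack the method.
function trySetWorkingIndicator(ui: object, options?: IndicatorOptions): boolean {
  const candidate = ui as MaybeIndicatorUI;
  if (typeof candidate.setWorkingIndicator !== "function") return false;

  // Invoke as a method so `this` stays bound to the UI object.
  candidate.setWorkingIndicator(options);
  return true;
}
```

Calling it without `options` corresponds to the "restore default" call in the teardown path.
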
---

## Phase 3: Future Considerations (HIGH effort or risky)

### 3.1 (WON'T DO unless concrete pain point appears) Branch-level task isolation

Use `ctx.fork()` to create a new branch for each task in a team run.

**Decision:** not implemented by default. Worktree isolation already covers the most important part (file-system/task isolation). Branch-level isolation causes branch explosion, complicates navigation UX, and adds state-sync risk between the flat run manifest/tasks/events and the Pi session tree. Reconsider only if users report specific context contamination that worktree/dependency-context controls cannot solve.

### 3.2 Session handoff for long-running tasks

When a task runs too long, hand off to a new session (pattern from `handoff.ts`) to isolate context.

**Conditional trigger:** enable only for agents/tasks that opt in, for example via agent frontmatter `handoff: true`, or via a heuristic such as a token estimate > 30% of the context window.

**Result transport:** the child session returns an artifact reference or a mailbox message, so the parent session can still aggregate results without importing the whole transcript.

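Since 3.2 is still at the design stage, the following is only a sketch of how the conditional trigger could be evaluated, assuming the opt-in flag and the 30% heuristic are combined with a logical OR. `HandoffInput`, `HANDOFF_RATIO`, and `shouldHandoff` are hypothetical names.

```typescript
// Hypothetical inputs for the handoff decision.
interface HandoffInput {
  optIn: boolean;          // e.g. agent frontmatter `handoff: true`
  estimatedTokens: number; // token estimate for the task
  contextWindow: number;   // context window of the model in use
}

// Share of the context window above which a task counts as "long".
const HANDOFF_RATIO = 0.3;

function shouldHandoff(input: HandoffInput): boolean {
  // Explicit opt-in always wins.
  if (input.optIn) return true;

  // Without a known window the heuristic cannot be evaluated.
  if (input.contextWindow <= 0) return false;

  return input.estimatedTokens > input.contextWindow * HANDOFF_RATIO;
}
```

Whether the heuristic should apply to agents that did not opt in at all is an open design question; a stricter variant would require both conditions.
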
### 3.3 Mailbox via `pi.events`

#### 3.3a (DONE) Publish mailbox lifecycle events while preserving file-backed mailbox

Implementation publishes safe public events without changing the durable mailbox source of truth:

```typescript
pi.events.emit("crew.mailbox.message", { runId, id, direction, from, to, taskId, source });
pi.events.emit("crew.mailbox.acknowledged", { runId, messageId, delivery });
```

This keeps file-backed mailbox semantics intact while enabling observers/notification extensions.

#### 3.3b (WON'T DO) Replace file-backed mailbox with pure event-bus mailbox

Instead of a file-based mailbox, use the event bus as the primary transport for real-time communication between tasks.

**Decision:** won't do. Latency gain is marginal; durability/restart/replay loss is catastrophic for long-running pi-crew runs. 3.3a gives best-of-both-worlds: durable file-backed mailbox remains source of truth, event bus is an observer/notification layer.

### 3.4 (PROMOTED + DONE) Compaction with structured artifact index

Preserve pi-crew artifact references across compaction.

**Implementation:** `compaction-guard.ts` collects recent run artifacts and:

- appends a structured `crew:artifact-index` session entry for machine-readable continuity;
- adds a markdown artifact index to pi-crew-triggered compaction `customInstructions` so the compaction summary preserves run IDs and artifact paths.

**Note:** Directly augmenting `CompactionEntry.details` is not supported by the current upstream `session_before_compact` result contract unless pi-crew replaces default compaction entirely. We intentionally avoid full custom compaction because summary quality/regression risk is higher.

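The Risk Assessment caps the index at 10 runs / 80 artifacts. One way such a cap could be applied is sketched below; this is not the actual `compaction-guard.ts` code. It assumes runs arrive oldest-first and that the newest runs should keep their artifacts when the cap bites; `RunArtifacts`, `capArtifactIndex`, and `renderArtifactIndex` are hypothetical names.

```typescript
// Hypothetical shape of one run's entry in the artifact index.
interface RunArtifacts {
  runId: string;
  artifacts: string[];
}

const MAX_RUNS = 10;
const MAX_ARTIFACTS = 80;

// Keeps at most MAX_RUNS recent runs and MAX_ARTIFACTS artifact paths in total.
// Input is assumed to be ordered oldest-first; output keeps that order.
function capArtifactIndex(runs: RunArtifacts[]): RunArtifacts[] {
  const recent = runs.slice(-MAX_RUNS);
  const capped: RunArtifacts[] = [];
  let remaining = MAX_ARTIFACTS;

  // Walk newest-first so the newest runs are served before the budget runs out.
  for (let i = recent.length - 1; i >= 0 && remaining > 0; i--) {
    const run = recent[i];
    const kept = run.artifacts.slice(0, remaining);
    remaining -= kept.length;
    capped.unshift({ runId: run.runId, artifacts: kept });
  }
  return capped;
}

// Renders the capped index as a nested markdown list for `customInstructions`.
function renderArtifactIndex(runs: RunArtifacts[]): string {
  return capArtifactIndex(runs)
    .map((run) => [`- ${run.runId}`].concat(run.artifacts.map((a) => `  - ${a}`)).join("\n"))
    .join("\n");
}
```

The same capped structure could back both the `crew:artifact-index` session entry and the markdown added to the compaction instructions, so the two never drift apart.
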
### 3.5 (WON'T DO unless cost telemetry shows pain) Custom compaction with a lightweight model

**Decision:** won't do by default.

- Depends on the user's auth setup for Gemini Flash / Haiku, which pi-crew cannot control.
- A bad summary loses context and affects the whole run.
- ROI is unclear: compaction runs infrequently.

Reconsider only if telemetry/user feedback shows compaction cost is a real pain point. Reference remains `examples/extensions/custom-compaction.ts` upstream.

---

## Phase 4 (NEW): Supplementary research

Two upstream patterns were not explored in the original plan:

### 4.1 (DEFER: research format compat first) `resources_discover` event integration

pi-crew could inject its builtin agents/teams as native Pi resources (skills/prompts):
```typescript
import * as path from "node:path";

pi.on("resources_discover", () => ({
  skillPaths: [path.join(__dirname, "..", "agents")],
  promptPaths: [path.join(__dirname, "..", "workflows")],
}));
```

**Decision:** defer. Format compatibility between pi-crew agent markdown and the Pi skill/prompt format needs research before implementation. Key risk: dual exposure UX confusion (same capability reachable via `Agent` tool and native skill/prompt) plus loss of pi-crew durable run semantics if exposed as stateless skills.

### 4.2 (RESEARCH-ONLY) `pi.registerProvider` for a virtual "team" model

Register a team as a virtual provider so the user can call:
```bash
pi --model crew/researcher
```
instead of using the `Agent` tool.

**Decision:** research-only / not an implementation target. Provider API semantics (single LLM stream, context window, thinking levels, token pricing) do not map cleanly to orchestrator semantics (multi-agent task events, aggregate usage/cost, per-worker contexts). Likely requires upstream provider API changes.

---

## Implementation Order (REVISED)

```
Phase 1 (Quick Wins & Compliance):
  [x] 1.4  permission gate destructive team actions   ← FIRST (compliance)
  [x] 1.6  telemetry baseline                         ← SECOND (measure first)
  [x] 1.2  session_before_compact defer (not cancel)
  [x] 1.1a terminate: true on background queued (safe)
  [x] 1.3  public crew.* events
  [x] 1.5  auto session name
  [x] 1.1b terminate: true on foreground (OPT-IN, default off; default-on conditional on telemetry)

Phase 2 (Medium):
  [x] 2.1  proactive compaction (dynamic threshold)
  [x] 2.3  pi.appendEntry cross-session awareness
  [x] 2.4  config-driven tool registration
  [x] 2.5  custom working indicator

Phase 3 (Future / Risky):
  [-] 3.1  branch-level task isolation (WON'T DO unless concrete pain point appears)
  [ ] 3.2  session handoff for long tasks (CONDITIONAL on agent opt-in)
  [x] 3.3a publish mailbox lifecycle events (safe subset)
  [-] 3.3b replace file-backed mailbox with pure event bus (WON'T DO)
  [x] 3.4  structured artifact index in compaction (promoted/done)
  [-] 3.5  custom compaction with cheap model (WON'T DO unless cost telemetry shows pain)

Phase 4 (Research):
  [ ] 4.1  resources_discover integration (DEFER; format compat research first)
  [-] 4.2  virtual team provider (RESEARCH-ONLY)
```

## Files affected

```
PHASE 1:
  src/extension/registration/team-tool.ts         ← 1.4 permission gate
  src/extension/registration/subagent-tools.ts    ← 1.1a terminate + 1.1b opt-in terminate
  src/extension/register.ts                       ← 1.2 defer guard (later moved to compaction-guard.ts), 1.3 events, 1.5 session name
  src/runtime/subagent-manager.ts                 ← 1.6 telemetry fields

PHASE 2:
  src/extension/registration/compaction-guard.ts  ← NEW: 1.2 defer guard + 2.1 proactive + 3.4 artifact index
  src/extension/register.ts                       ← 2.3 appendEntry, 2.5 working indicator
  src/extension/registration/subagent-tools.ts    ← 2.4 config-driven

PHASE 3:
  src/extension/team-tool/api.ts                  ← 3.3a mailbox lifecycle events
```

## Risk Assessment (REVISED)

| Change | Risk | Mitigation |
|---|---|---|
| Permission gate (1.4) | Blocks legitimate use | Allow `force=true` bypass, documented in the README |
| Telemetry (1.6) | Privacy / log size | No PII in subagent telemetry payload; opt-out applied via `config.telemetry.enabled=false`; no sampling currently because payload is small/local event-bus data |
| Defer compaction (1.2) | An endlessly long run leads to overflow | Hard threshold at 95% bypasses deferral |
| `terminate: true` background (1.1a) | None significant | Background runs need no LLM follow-up by design |
| Public events (1.3) | Event storm, breaking change | Rate limit, semver document |
| Auto session name (1.5) | Overrides user-set name | Applied: only set when no custom name exists (`!pi.getSessionName()`) |
| `terminate: true` foreground (1.1b) | LLM does not summarize when enabled | OPT-IN flag (`config.tools.terminateOnForeground`, default off); default-on requires telemetry evidence |
| Dynamic threshold (2.1) | contextWindow undefined | Default 200_000 fallback |
| Artifact index in compaction (3.4) | Index size bloat / format drift | Cap recent index (10 runs / 80 artifacts), structured `crew:artifact-index` session entry, non-replacing default compaction |
| appendEntry (2.3) | Session bloat | TTL/cleanup strategy |
| Config-driven tools (2.4) | User confused | Default = current behavior, opt-in change |
| Working indicator (2.5) | Conflict with other extensions / older Pi UI type surface | Applied: restore default on finally; compat shim makes `setWorkingIndicator` optional |
| Custom compaction model (3.5) | Bad summary, auth missing | Fall back to default, multi-model retry |

## Testing Strategy

- **Unit tests:**
  - `terminate: true` flag in tool results (1.1a/b).
  - Permission gate blocks/allows correctly across the confirm/force matrix (1.4).
  - Threshold calculation from contextWindow (2.1).
  - Telemetry payload schema (1.6).
  - Artifact index payload structure + cap behavior (3.4).
- **Integration tests:**
  - Foreground run + compaction interaction (1.2 defer + 2.1 trigger).
  - Multiple concurrent runs + permission gate (1.4).
  - Event publish/subscribe round-trip (1.3).
  - Compaction with N artifacts includes artifact index in custom instructions (3.4).
- **Manual:**
  - UI behavior with working indicator + session name (1.5, 2.5).
  - Real LLM turn count before/after 1.1b using telemetry data (1.6 → 1.1b decision).
- **Regression:**
  - Run the full suite (`npm test`) after every commit; do not batch a whole Phase.
  - Doctor tests must use `--test-timeout=90000` on Windows.