Add 5 pi extensions: pi-subagents, pi-crew, rpiv-pi, pi-interactive-shell, pi-intercom

2026-05-08 15:59:25 +10:00
parent d0d1d9b045
commit 31b4110c87
457 changed files with 85157 additions and 0 deletions

---
name: codex-5-3-prompting
description: How to write system prompts and instructions for GPT-5.3-Codex. Use when constructing or tuning prompts targeting Codex 5.3.
---
# GPT-5.3-Codex Prompting Guide
GPT-5.3-Codex is fast, capable, and eager. It moves quickly and will skip reading, over-refactor, and let scope drift if prompts aren't tight. Explicit constraints matter more than with GPT-5.2-Codex. Include the following blocks as needed when constructing system prompts.
## Output shape
Always include. Controls verbosity and response structure.
```
<output_verbosity_spec>
- Default: 3-6 sentences or <=5 bullets for typical answers.
- Simple yes/no questions: <=2 sentences.
- Complex multi-step or multi-file tasks:
- 1 short overview paragraph
- then <=5 bullets tagged: What changed, Where, Risks, Next steps, Open questions.
- Avoid long narrative paragraphs; prefer compact bullets and short sections.
- Do not rephrase the user's request unless it changes semantics.
</output_verbosity_spec>
```
## Scope constraints
Always include. GPT-5.3-Codex will add features, refactor adjacent code, and invent UI elements if you don't fence it in.
```
<design_and_scope_constraints>
- Explore any existing design systems and understand them deeply.
- Implement EXACTLY and ONLY what the user requests.
- No extra features, no added components, no UX embellishments.
- Style aligned to the design system at hand.
- Do NOT invent colors, shadows, tokens, animations, or new UI elements unless requested or necessary.
- If any instruction is ambiguous, choose the simplest valid interpretation.
</design_and_scope_constraints>
```
## Context loading
Always include. GPT-5.3-Codex skips reading and starts writing if you don't force it.
```
<context_loading>
- Read ALL files that will be modified -- in full, not just the sections mentioned in the task.
- Also read key files they import from or that depend on them.
- Absorb surrounding patterns, naming conventions, error handling style, and architecture before writing any code.
- Do not ask clarifying questions about things that are answerable by reading the codebase.
</context_loading>
```
## Plan-first mode
Include for multi-file work, large refactors, or any task with ordering dependencies.
```
<plan_first>
- Before writing any code, produce a brief implementation plan:
- Files to create vs. modify
- Implementation order and prerequisites
- Key design decisions and edge cases
- Acceptance criteria for "done"
- Get the plan right first. Then implement step by step following the plan.
- If the plan is provided externally, follow it faithfully -- the job is execution, not second-guessing the design.
</plan_first>
```
## Long-context handling
Include when inputs exceed ~10k tokens (multi-chapter docs, long threads, multiple PDFs).
```
<long_context_handling>
- For inputs longer than ~10k tokens:
- First, produce a short internal outline of the key sections relevant to the task.
- Re-state the constraints explicitly before answering.
- Anchor claims to sections ("In the 'Data Retention' section...") rather than speaking generically.
- If the answer depends on fine details (dates, thresholds, clauses), quote or paraphrase them.
</long_context_handling>
```
## Uncertainty and ambiguity
Include when the task involves underspecified requirements or hallucination-prone domains.
```
<uncertainty_and_ambiguity>
- If the question is ambiguous or underspecified:
- Ask up to 1-3 precise clarifying questions, OR
- Present 2-3 plausible interpretations with clearly labeled assumptions.
- Never fabricate exact figures, line numbers, or external references when uncertain.
- When unsure, prefer "Based on the provided context..." over absolute claims.
</uncertainty_and_ambiguity>
```
## User updates
Include for agentic / long-running tasks.
```
<user_updates_spec>
- Send brief updates (1-2 sentences) only when:
- You start a new major phase of work, or
- You discover something that changes the plan.
- Avoid narrating routine tool calls ("reading file...", "running tests...").
- Each update must include at least one concrete outcome ("Found X", "Confirmed Y", "Updated Z").
- Do not expand the task beyond what was asked; if you notice new work, call it out as optional.
</user_updates_spec>
```
## Tool usage
Include when the prompt involves tool-calling agents.
```
<tool_usage_rules>
- Prefer tools over internal knowledge whenever:
- You need fresh or user-specific data (tickets, orders, configs, logs).
- You reference specific IDs, URLs, or document titles.
- Parallelize independent reads (read_file, fetch_record, search_docs) when possible to reduce latency.
- After any write/update tool call, briefly restate:
- What changed
- Where (ID or path)
- Any follow-up validation performed
</tool_usage_rules>
```
## Reasoning effort
Set `model_reasoning_effort` via Codex CLI: `-c model_reasoning_effort="high"`
| Task type | Effort |
|---|---|
| Simple code generation, formatting | `low` or `medium` |
| Standard implementation from clear specs | `high` |
| Complex refactors, plan review, architecture | `xhigh` |
| Code review (thorough) | `high` or `xhigh` |
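The tiers above can be wired into a small wrapper. A minimal sketch, assuming illustrative task-type names (the effort values come from the table; the task categories themselves are not a Codex CLI concept):

```shell
# Map a task type to a reasoning effort tier (tiers from the table above;
# the task-type names here are illustrative, not a Codex CLI concept).
effort_for() {
  case "$1" in
    format|scaffold)        echo "medium" ;;
    implement|review)       echo "high"   ;;
    refactor|architecture)  echo "xhigh"  ;;
    *)                      echo "medium" ;;
  esac
}

# Usage: codex -c model_reasoning_effort="$(effort_for refactor)" "Plan the refactor"
```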
## Backwards compatibility hedging
GPT-5.3-Codex has a strong tendency to preserve old patterns, add compatibility shims, and provide fallback code "just in case" -- even when explicitly told not to worry about backwards compatibility. Vague instructions like "don't worry about backwards compatibility" get interpreted weakly; the model may still hedge.
Use **"cutover"** to signal a clean, irreversible break. It's a precise industry term that conveys finality and intentional deprecation -- no dual-support phase, no gradual migration, no preserving old behavior.
Instead of:
> "Rewrite this and don't worry about backwards compatibility"
Say:
> "This is a cutover. No backwards compatibility. Rewrite using only Python 3.12+ features and current best practices. Do not preserve legacy code, polyfills, or deprecated patterns."
## Quick reference
- **Force reading first.** "Read all necessary files before you ask any dumb question."
- **Use plan mode.** Draft the full task with acceptance criteria before implementing.
- **Steer aggressively mid-task.** GPT-5.3-Codex handles redirects without losing context. Be direct: "Stop. Fix the actual cause." / "Simplest valid implementation only."
- **Constrain scope hard.** GPT-5.3-Codex will refactor aggressively if you don't fence it in.
- **Watch context burn.** Faster model = faster context consumption. Start fresh at ~40%.
- **Use domain jargon.** "Cutover," "golden-path," "no fallbacks," "domain split" get cleaner, faster responses.
- **Download libraries locally.** Tell it to read them for better context than relying on training data.
---
name: codex-cli
description: OpenAI Codex CLI reference. Use when running codex in interactive_shell overlay or when user asks about codex CLI options.
---
# Codex CLI (OpenAI)
## Commands
| Command | Description |
|---------|-------------|
| `codex` | Start interactive TUI |
| `codex "prompt"` | TUI with initial prompt |
| `codex exec "prompt"` | Non-interactive (headless), streams to stdout. Supports `--output-schema <file>` for structured JSON output |
| `codex e "prompt"` | Shorthand for exec |
| `codex login` | Authenticate (OAuth, device auth, or API key) |
| `codex login status` | Show auth mode |
| `codex logout` | Remove credentials |
| `codex mcp` | Manage MCP servers |
| `codex completion` | Generate shell completions |
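For `codex exec --output-schema`, the schema file is plain JSON Schema. A minimal sketch -- the field names are illustrative, not a required shape:

```json
{
  "type": "object",
  "properties": {
    "summary": { "type": "string" },
    "files_changed": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["summary"]
}
```

Run it with `codex exec "summarize the repo" --output-schema schema.json`; the structured output should then conform to this schema.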
## Key Flags
| Flag | Description |
|------|-------------|
| `-m, --model <model>` | Switch model (prefer `gpt-5.5`) |
| `-c <key=value>` | Override config.toml values (dotted paths, parsed as TOML) |
| `-p, --profile <name>` | Use config profile from config.toml |
| `-s, --sandbox <mode>` | Sandbox policy: `read-only`, `workspace-write`, `danger-full-access` |
| `-a, --ask-for-approval <policy>` | `untrusted`, `on-failure`, `on-request`, `never` |
| `--full-auto` | Alias for `-a on-request --sandbox workspace-write` |
| `--search` | Enable live web search tool |
| `-i, --image <file>` | Attach image(s) to initial prompt |
| `--add-dir <dir>` | Additional writable directories |
| `-C, --cd <dir>` | Set working root directory |
| `--no-alt-screen` | Inline mode (preserve terminal scrollback) |
## Sandbox Modes
- `read-only` - Can only read files
- `workspace-write` - Can write to workspace
- `danger-full-access` - Full system access (use with caution)
## Features
- **Image inputs** - Accepts screenshots and design specs
- **Image generation (gpt-image-2)** - Generate images via natural language or explicit invocation
- **Code review** - Reviews changes before commit
- **Web search** - Can search for information
- **MCP integration** - Third-party tool support
## Image Generation (gpt-image-2)
Codex CLI can generate images using OpenAI's **gpt-image-2**, its latest image model, with improved realism, prompt adherence, and accurate text rendering inside images. It can produce high-fidelity design mockups for web pages and apps.
### How to Invoke
#### Natural Language (Recommended)
Just describe what you want naturally:
```bash
codex "Generate a clean app icon for a fitness tracker, flat design, 512x512"
codex "Create a hero banner for a SaaS landing page showing a dashboard with dark mode"
codex -i screenshot.png "Edit this screenshot to make the button green and add a tooltip"
```
#### Explicit Skill Invocation
Include `$imagegen` anywhere in your prompt to force the image-generation tool. This is a Codex keyword, not a shell variable, so shell examples use single quotes to keep it literal.
```bash
codex 'Make a pixel-art sprite sheet for a platformer game $imagegen'
codex 'Generate a logo for my coffee shop $imagegen'
```
Codex will generate the image(s) and display them inline in the terminal (or save them locally). You can iterate on them, attach them to future prompts, or use them in your codebase.
### Tips
- **Image editing / iteration**: Attach a reference image (screenshot, wireframe, mockup) to your prompt. Codex handles multimodal input natively.
```bash
codex -i wireframe.png "Turn this wireframe into a polished UI mockup"
codex -i design.png "Generate code for this design"
```
- **Usage & limits**: Images count against your regular Codex usage quota and consume it 3-5x faster than text-only turns (depending on size/quality).
- **Heavy/batch work**: For production pipelines, set `OPENAI_API_KEY` in your shell and tell Codex to call the OpenAI Images API directly. It will then use `gpt-image-2` with full API pricing and options.
- **No config needed**: Image generation is enabled by default. Older experimental flags like `codex features enable image_generation` are no longer required.
## Config
Config file: `~/.codex/config.toml`
Key config values (set in file or override with `-c`):
- `model` -- model name (prefer `gpt-5.5`)
- `model_reasoning_effort` -- `low`, `medium`, `high`, `xhigh`
- `model_reasoning_summary` -- `detailed`, `concise`, `none`
- `model_verbosity` -- `low`, `medium`, `high`
- `profile` -- default profile name
- `tool_output_token_limit` -- max tokens per tool output
Define profiles for different projects/modes with `[profiles.<name>]` sections. Override at runtime with `-p <name>` or `-c model_reasoning_effort="high"`.
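A minimal sketch of a profile setup, assuming an invented profile name (`deep-review`); the keys are the ones listed above:

```toml
# ~/.codex/config.toml
model = "gpt-5.5"
model_reasoning_effort = "medium"

[profiles.deep-review]
model_reasoning_effort = "xhigh"
model_reasoning_summary = "concise"
model_verbosity = "low"
```

Then `codex -p deep-review "Review this PR"` applies the profile, and any `-c key=value` flag still overrides individual values at runtime.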
## In interactive_shell
Do NOT pass `-s` / `--sandbox` flags. Codex's `read-only` and `workspace-write` sandbox modes apply OS-level filesystem restrictions that break basic shell operations inside the PTY -- zsh can't even create temp files for here-documents, so every write attempt fails with "operation not permitted." The interactive shell overlay already provides supervision (user watches in real-time, Ctrl+Q to kill, Ctrl+T to transfer output), making Codex's sandbox redundant.
Prefer `gpt-5.5` for Codex CLI work. For users with a default profile configured to `gpt-5.5`, just run `codex "prompt"` to use those defaults -- no model or profile flags needed.
For delegated fire-and-forget runs, prefer `mode: "dispatch"` so the agent is notified automatically when Codex completes.
```typescript
// Delegated run with completion notification (recommended default)
interactive_shell({
command: 'codex "Review this codebase for security issues"',
mode: "dispatch"
})
// Override reasoning effort for a single delegated run
interactive_shell({
command: 'codex -c model_reasoning_effort="xhigh" "Complex refactor task"',
mode: "dispatch"
})
// Headless - use bash instead
bash({ command: 'codex exec "summarize the repo"' })
```

---
name: cursor-cli
description: Cursor CLI reference. Use when running Cursor in interactive_shell overlay or when user asks about Cursor CLI options.
---
# Cursor CLI
## Commands
| Command | Description |
|---------|-------------|
| `agent` | Start interactive Cursor session |
| `agent "prompt"` | Interactive session with initial prompt |
| `agent -p "prompt"` | Non-interactive print mode |
| `agent ls` | List previous chats |
| `agent resume` | Resume latest chat |
| `agent --continue` | Continue previous session |
| `agent --resume "chat-id"` | Resume a specific chat |
## Key Flags
| Flag | Description |
|------|-------------|
| `--mode plan` / `--plan` | Plan mode (clarify before coding) |
| `--mode ask` | Ask mode (read-only exploration) |
| `--model <model>` | Model override |
| `--sandbox <enabled|disabled>` | Toggle sandbox behavior |
| `--output-format text` | Output format for print mode workflows |
## Mode Notes
- **Interactive mode** (`agent`, `agent "prompt"`) is the right fit for `interactive_shell` overlays.
- **Print mode** (`agent -p`) is non-interactive and better suited to direct shell/batch usage.
## In interactive_shell
Use structured spawn when you want the extension's shared spawn resolver/defaults/worktree support:
```typescript
interactive_shell({ spawn: { agent: "cursor" }, mode: "interactive" })
interactive_shell({ spawn: { agent: "cursor", prompt: "Review the diffs" }, mode: "dispatch" })
interactive_shell({ spawn: { agent: "cursor", worktree: true }, mode: "hands-free" })
```
Structured spawn launches Cursor via the configured `spawn.commands.cursor` executable (default: `agent`) and passes the prompt using Cursor's native interactive startup form (`agent "prompt"`). By default, spawn args include `--model composer-2-fast`, which explicitly selects Cursor's Composer 2 Fast model.
Cursor remains **fresh/worktree only** in structured spawn. `fork` is Pi-only.
For non-interactive print-mode tasks, prefer direct shell usage:
```typescript
bash({ command: 'agent -p "review these changes for security issues" --output-format text' })
```

---
name: gpt-5-4-prompting
description: How to write system prompts and instructions for GPT-5.4. Use when constructing or tuning prompts targeting GPT-5.4.
---
# GPT-5.4 Prompting Guide
GPT-5.4 unifies reasoning, coding, and agentic capabilities into a single frontier model. It's extremely persistent, highly token-efficient, and delivers more human-like outputs than its predecessors. However, it has new failure modes: it moves fast without solid plans, expands scope aggressively, and can prematurely declare tasks complete—sometimes falsely claiming success. Prompts must account for these behaviors.
## Output shape
Always include.
```
<output_verbosity_spec>
- Default: 3-6 sentences or <=5 bullets for typical answers.
- Simple yes/no questions: <=2 sentences.
- Complex multi-step or multi-file tasks:
- 1 short overview paragraph
- then <=5 bullets tagged: What changed, Where, Risks, Next steps, Open questions.
- Avoid long narrative paragraphs; prefer compact bullets and short sections.
- Do not rephrase the user's request unless it changes semantics.
</output_verbosity_spec>
```
## Scope constraints
Critical. GPT-5.4's primary failure mode is scope expansion—it adds features, refactors beyond the ask, and "helpfully" extends tasks. Fence it in hard.
```
<design_and_scope_constraints>
- Implement EXACTLY and ONLY what the user requests. Nothing more.
- No extra features, no "while I'm here" improvements, no UX embellishments.
- Do NOT expand the task scope under any circumstances.
- If you notice adjacent issues or opportunities, note them in your summary but DO NOT act on them.
- If any instruction is ambiguous, choose the simplest valid interpretation.
- Style aligned to the existing design system. Do not invent new patterns.
- Do NOT invent colors, shadows, tokens, animations, or new UI elements unless explicitly requested.
</design_and_scope_constraints>
```
## Verification requirements
Critical. GPT-5.4 can declare tasks complete prematurely or claim success when the implementation is incorrect. Force explicit verification.
```
<verification_requirements>
- Before declaring any task complete, perform explicit verification:
- Re-read the original requirements
- Check that every requirement is addressed in the actual code
- Run tests or validation steps if available
- Confirm the implementation actually works, don't assume
- Do NOT claim success based on intent—verify actual outcomes.
- If you cannot verify (no tests, can't run code), say so explicitly.
- When reporting completion, include concrete evidence: test results, verified file contents, or explicit acknowledgment of what couldn't be verified.
- If something failed or was skipped, say so clearly. Do not obscure failures.
</verification_requirements>
```
## Context loading
Always include. GPT-5.4 is faster and may skip reading in favor of acting. Force thoroughness.
```
<context_loading>
- Read ALL files that will be modified—in full, not just the sections mentioned in the task.
- Also read key files they import from or that depend on them.
- Absorb surrounding patterns, naming conventions, error handling style, and architecture before writing any code.
- Do not ask clarifying questions about things that are answerable by reading the codebase.
- If modifying existing code, understand the full context before making changes.
</context_loading>
```
## Plan-first mode
Include for multi-file work, refactors, or tasks with ordering dependencies. GPT-5.4 produces good natural-language plans but may skip validation steps.
```
<plan_first>
- Before writing any code, produce a brief implementation plan:
- Files to create vs. modify
- Implementation order and prerequisites
- Key design decisions and edge cases
- Acceptance criteria for "done"
- How you will verify each step
- Execute the plan step by step. After each step, verify it worked before proceeding.
- If the plan is provided externally, follow it faithfully—the job is execution, not second-guessing.
- Do NOT skip verification steps even if you're confident.
</plan_first>
```
## Long-context handling
GPT-5.4 supports up to 1M tokens, but accuracy degrades beyond ~512K. Handle long inputs carefully.
```
<long_context_handling>
- For inputs longer than ~10k tokens:
- First, produce a short internal outline of the key sections relevant to the task.
- Re-state the constraints explicitly before answering.
- Anchor claims to sections ("In the 'Data Retention' section...") rather than speaking generically.
- If the answer depends on fine details (dates, thresholds, clauses), quote or paraphrase them.
- For very long contexts (200K+ tokens):
- Be extra vigilant about accuracy—retrieval quality degrades.
- Cross-reference claims against multiple sections.
- Prefer citing specific locations over making sweeping statements.
</long_context_handling>
```
## Tool usage
```
<tool_usage_rules>
- Prefer tools over internal knowledge whenever:
- You need fresh or user-specific data (tickets, orders, configs, logs).
- You reference specific IDs, URLs, or document titles.
- Parallelize independent tool calls when possible to reduce latency.
- After any write/update tool call, verify the outcome—do not assume success.
- After any write/update tool call, briefly restate:
- What changed
- Where (ID or path)
- Verification performed or why verification was skipped
</tool_usage_rules>
```
## Backwards compatibility hedging
GPT-5.4 tends to preserve old patterns and add compatibility shims. Use **"cutover"** to signal a clean break.
Instead of:
> "Rewrite this and don't worry about backwards compatibility"
Say:
> "This is a cutover. No backwards compatibility. Rewrite using only Python 3.12+ features and current best practices. Do not preserve legacy code, polyfills, or deprecated patterns."
## Quick reference
- **Constrain scope aggressively.** GPT-5.4 expands tasks beyond the ask. "ONLY what is requested, nothing more."
- **Force verification.** Don't trust "done"—require evidence. "Verify before claiming complete."
- **Use cutover language.** "Cutover," "no fallbacks," "exactly as specified" get cleaner results.
- **Plan mode helps.** Explicit plan-first prompts ensure verification steps.
- **Watch for false success claims.** In agent harnesses, add explicit validation steps. Don't let it self-report completion.
- **Steer mid-task.** GPT-5.4 handles redirects well. Be direct: "Stop. That's out of scope." / "Verify that actually worked."
- **Use domain jargon.** "Cutover," "golden-path," "no fallbacks," "domain split," "exactly as specified" trigger precise behavior.
- **Long context degrades.** Above ~512K tokens, cross-reference claims and cite specific sections.
- **Token efficiency is real.** 5.4 uses fewer tokens per problem—but verify it didn't skip steps to get there.
## Example: implementation task prompt
```
<system>
You are implementing a feature in an existing codebase. Follow these rules strictly.
<design_and_scope_constraints>
- Implement EXACTLY and ONLY what the user requests. Nothing more.
- No extra features, no "while I'm here" improvements.
- If you notice adjacent issues, note them in your summary but DO NOT act on them.
</design_and_scope_constraints>
<context_loading>
- Read ALL files that will be modified—in full.
- Also read key files they import from or depend on.
- Absorb patterns before writing any code.
</context_loading>
<verification_requirements>
- Before declaring complete, verify each requirement is addressed in actual code.
- Run tests if available. If not, state what couldn't be verified.
- Include concrete evidence of completion in your summary.
</verification_requirements>
<output_verbosity_spec>
- Brief updates only on major phases or blockers.
- Final summary: What changed, Where, Risks, Next steps.
</output_verbosity_spec>
</system>
```
## Example: code review prompt
```
<system>
You are reviewing code changes. Be thorough but stay in scope.
<context_loading>
- Read every changed file in full, not just the diff hunks.
- Also read files they import from and key dependents.
</context_loading>
<review_scope>
- Review for: bugs, logic errors, race conditions, resource leaks, null hazards, error handling gaps, type mismatches, dead code, unused imports, pattern inconsistencies.
- Fix issues you find with direct code edits.
- Do NOT refactor or restructure code that wasn't flagged in the review.
- If adjacent code looks problematic, note it but don't touch it.
</review_scope>
<verification_requirements>
- After fixes, verify the code still works. Run tests if available.
- In your summary, list what was found, what was fixed, and what couldn't be verified.
</verification_requirements>
</system>
```