---
created: 2026-06-15
modified: 2026-06-15
type: note
tags:
  - ai
  - pi
  - subagent
  - pi-subagents
aliases: []
---

# Pi Subagent System

Persistent named subagents with provider-pinned KV cache continuity.

## What It Is

pi-subagents (`@tintinweb/pi-subagents` v0.10.2) lets you spawn named, persistent subagents that:

- **Run in the background**: they don't block the main session
- **Keep their context**: resume later with full conversation history
- **Pin to a provider**: the KV cache stays warm across resume calls, which is where most of the token savings come from
- **Have persistent memory**: `memory: project` gives each agent its own memory directory
- **Survive session restarts**: patched to keep agents alive for 24 hours instead of 10 minutes

---

## Key Concepts

### Named Agents

Every subagent gets a **human-readable name** so you can reference it later:

- `coder-login`: coder-pro working on auth
- `searcher-1`: chat-search for web lookups
- `obsidian-docs`: obsidian agent for note editing

Multiple instances of the same type can run simultaneously:

- `coder-login` + `coder-nav-bar`: two coder-pro agents, different tasks
- `searcher-1` + `searcher-2`: parallel web searches

### Provider Pinning

Each agent is pinned to a specific OpenRouter provider via `only: [...]` in `models.json`. Pinning gives you:

- **KV cache continuity**: same provider, so resume calls can hit the cache
- **Predictable latency**: requests always route through the same serving host
- **Cost control**: token savings whenever the context prefix is cached

See [[Pi Agent Extensions & Skills#models.json]] for the full routing table.

### Persistent Memory

Agents with `memory: project` in their frontmatter get a persistent directory:

```
.pi/agent-memory//          ← per-project, committed to git
.pi/agent-memory-local//    ← per-project, gitignored
~/.pi/agent-memory//        ← global, across all projects
```

Memory survives across resume calls, so agents build up knowledge over time.

### Session-Scoped Registry

Agents belong to a **pi session** (per-project directory).
When you run `/new` or `/resume`, the session's agents are available.

---

## Creating Agents

### Ask the LLM

Simply describe what you need. The LLM will ask for a name if you don't provide one:

```
You: "Find all authentication files"
LLM: "What should I name this agent?"
You: "auth-finder"
LLM: *spawns agent, saves to registry*
     "Created 'auth-finder' (chat-search, Google). ID: 7efad0d8"
```

### Agent Types Available

| Type | Model | Provider | Memory | Tools | Use For |
|------|-------|----------|--------|-------|---------|
| `chat-search` | gemini-2.5-flash:free | Google | project | read, bash, grep, find | Web search, quick lookups |
| `coder-basic` | deepseek-chat | DeepInfra | project | read, bash, write, grep, find | Simple code edits |
| `coder-pro` | deepseek-v4-pro | DeepInfra | project | read, bash, write, grep, find, edit | Complex architecture |
| `code-analysis` | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, grep, find | Security review, analysis |
| `code-ingest` | gemini-2.5-flash:free | Google | project | read, bash, grep, find | Scan/ingest docs, GitHub |
| `database` | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write, grep | SQL, schema design |
| `devops-basic` | deepseek-chat | DeepInfra | project | read, bash, write, grep, find | Docker, YAML, NixOS |
| `devops-pro` | deepseek-v4-pro | DeepInfra | project | read, bash, write, grep, find, edit | Complex infrastructure |
| `document-writer` | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write | Documents, letters |
| `file-ops` | qwen-coder-32b-instruct | opencode-go | project | read, bash, grep, find, write | Filesystem, drives |
| `home-automation` | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write, grep | MQTT, Home Assistant |
| `image-maker` | flux-1-schnell | Venice | project | read, bash, write | Image generation |
| `iot-coder` | qwen-coder-32b-instruct | opencode-go | project | read, bash, write, grep, find | Arduino, ESP32 |
| `iot-hardware` | kimi-k2.6 | Moonshot AI | project | read, bash, grep, find | Hardware specs, boards |
| `obsidian` | deepseek-r1-distill-qwen-14b | opencode-go | project | read, bash, write, grep, find | Obsidian notes |
| `research` | gemini-2.5-flash:free | Google | project | read, bash, grep, find | Web search, docs |
| `video-analyze` | qwen-2.5-vl | Alibaba | project | read, bash, grep, find | Video/image analysis |
| `vscode-setup` | qwen-3-coder-next | StreamLake | project | read, bash, grep, find | VS Code, AI tooling |
| `Explore` | haiku/inherit | default | - | read, bash, grep, find, ls | Fast codebase search (built-in) |
| `Plan` | inherit | default | - | read, bash, grep, find, ls | Implementation planning (built-in) |
| `general-purpose` | inherit | default | - | all tools | General tasks (built-in) |

---

## Managing Agents

### List Agents

Type `/agents` to open the interactive menu:

```
Running agents (2) — 1 running, 1 done
Agent types (21)
Create new agent
Settings
```

Select **"Running agents"** to see active agents with status, tool uses, and duration.

### View Conversation

From `/agents` → "Running agents" → select an agent:

- **Live scrolling overlay** of the agent's full conversation
- **Auto-follows** new output
- **Press `x`** to stop a running agent
- **Scroll up** to pause auto-follow

### Resume an Agent

```
You: "Resume auth-finder and also check the logout module"
LLM: *looks up agent ID from context*
     Agent({ subagent_type: "chat-search", resume: "7efad0d8-...", prompt: "Also check the logout module" })
```

The agent continues with its full context preserved, and the pinned provider's KV cache is still warm.

### Steer a Running Agent

```
You: "Tell auth-finder to focus on the API routes only"
LLM: steer_subagent({ agent_id: "7efad0d8-...", message: "Focus on API routes only, ignore UI" })
```

The agent receives the message after its current tool execution and adjusts course.
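
Both the resume and steer flows depend on mapping a human-readable name back to an agent ID. The following is a minimal sketch of that lookup, assuming a hypothetical in-memory registry: the `AgentRecord` shape and the `resolveAgent`, `buildResumeCall`, and `buildSteerCall` helpers are illustrative names, not part of pi-subagents. Only the argument names (`subagent_type`, `resume`, `prompt`, `agent_id`, `message`) are taken from the calls in this note.

```typescript
// Hypothetical registry entry; the real pi-subagents storage format may differ.
interface AgentRecord {
  name: string;          // human-readable name, e.g. "auth-finder"
  id: string;            // agent ID from the spawn notification
  subagentType: string;  // agent type, e.g. "chat-search"
  running: boolean;      // true while the agent is still executing
}

// Find an agent by name; throws if the name is unknown or ambiguous.
function resolveAgent(registry: AgentRecord[], name: string): AgentRecord {
  const matches = registry.filter((a) => a.name === name);
  if (matches.length === 0) {
    throw new Error("No agent named '" + name + "'");
  }
  if (matches.length > 1) {
    throw new Error("Agent name '" + name + "' is ambiguous");
  }
  return matches[0];
}

// Build the arguments for a resume call, mirroring the Agent({...}) example.
function buildResumeCall(registry: AgentRecord[], name: string, prompt: string) {
  const agent = resolveAgent(registry, name);
  return { subagent_type: agent.subagentType, resume: agent.id, prompt: prompt };
}

// Build the arguments for a steer call; steering targets a running agent.
function buildSteerCall(registry: AgentRecord[], name: string, message: string) {
  const agent = resolveAgent(registry, name);
  if (!agent.running) {
    throw new Error("Agent '" + name + "' is not running; resume it instead");
  }
  return { agent_id: agent.id, message: message };
}

const registry: AgentRecord[] = [
  { name: "auth-finder", id: "7efad0d8", subagentType: "chat-search", running: true },
];

const resumeArgs = buildResumeCall(registry, "auth-finder", "Also check the logout module");
const steerArgs = buildSteerCall(registry, "auth-finder", "Focus on API routes only, ignore UI");
console.log(resumeArgs, steerArgs);
```

Rejecting ambiguous names keeps a resume from landing on the wrong agent when several instances of the same type are alive. How pi-subagents itself resolves names is not covered here.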

### Delete an Agent

```
You: "Delete auth-finder"
LLM: *removes from registry, aborts if running*
     "Deleted 'auth-finder'"
```

Or delete all agents for a session:

```
You: "Clear all subagents for computer-software"
LLM: *removes all matching entries*
     "Deleted 4 subagents from session 'computer-software'"
```

---

## Function Calls

### Create

```
Agent({
  subagent_type,       // string: agent type (e.g. "coder-pro")
  prompt,              // string: the task
  description,         // string: 3-5 word summary
  model?,              // string: provider/model override
  thinking?,           // string: off|minimal|low|medium|high|xhigh
  max_turns?,          // number: max agentic turns (default: unlimited)
  run_in_background?,  // boolean: run async (default: false)
  resume?,             // string: agent ID to resume a previous session
  isolated?,           // boolean: no extension/MCP tools
  isolation?,          // "worktree": run in isolated git worktree
  inherit_context?     // boolean: fork parent conversation into agent
})
```

### Retrieve

```
get_subagent_result({
  agent_id,  // string: the agent ID from spawn notification
  wait?,     // boolean: block until complete (default: false)
  verbose?   // boolean: include full conversation (default: false)
})
```

### Steer

```
steer_subagent({
  agent_id,  // string: running agent ID
  message    // string: message injected after current tool execution
})
```

### Compress

```
compress_for_agent({
  content  // string: content >20K chars to compress via Headroom
})
// Returns: { compressed, originalLength, compressedLength, tokensBefore, tokensAfter, savingsPercent }
```

---

## Token Savings

The main goal is to **minimize context tokens** on the API provider:

| Scenario | Context Uploaded | Cost |
|----------|-----------------|------|
| Fresh agent every request | Full system prompt + context (~10K tokens) | ~$0.50-1.00 per request |
| Resume same agent | Only the new prompt (~100 tokens) is new; the cached prefix hits the KV cache | ~$0.01-0.05 per request |
| **Savings** | **90-95% reduction** | **~$0.50 vs $5.00 for 5 questions** |

---

## Files

| File | Purpose |
|------|---------|
| `~/.local/share/npm-global/lib/node_modules/@tintinweb/pi-subagents/` | Extension source |
| `~/.pi/agent/agents/*.md` | Agent type definitions (18 custom + 3 built-in) |
| `~/.pi/agent/models.json` | OpenRouter provider pinning (only: provider) |
| `.pi/agent-memory//` | Per-agent persistent memory (project scope) |
| `.pi/agent-memory-local//` | Per-agent persistent memory (gitignored) |
| `~/.pi/agent/subagents.json` | pi-subagents settings (maxConcurrent, etc.) |

## Related

- [[Pi Agent Extensions & Skills]]: full extensions/skills reference
- [[Engram Memory]]: persistent memory service on .13
- [[Headroom Compression]]: context compression via .13:8787
- [[OpenRouter Provider Routing]]: provider pinning for cache continuity