| created | modified | type | tags | aliases |
|---|---|---|---|---|
| 2026-06-15 | 2026-06-15 | note | | |
Pi Subagent System
Persistent named subagents with provider-pinned KV cache continuity.
What It Is
Pi-subagents (@tintinweb v0.10.2) lets you spawn named, persistent subagents that:
- Run in the background: they don't block the main session
- Keep their context: resume later with full conversation history
- Pin to a provider: the KV cache stays warm across resume calls, which is where most of the token savings come from
- Have persistent memory: the frontmatter setting memory: project gives each agent its own memory directory
- Survive session restarts: patched to keep agents alive for 24 hours instead of 10 minutes
Key Concepts
Named Agents
Every subagent gets a human-readable name so you can reference it later:
- coder-login: coder-pro working on auth
- searcher-1: chat-search for web lookups
- obsidian-docs: obsidian agent for note editing
Multiple instances of the same type can run simultaneously:
- coder-login + coder-nav-bar: two coder-pro agents, different tasks
- searcher-1 + searcher-2: parallel web searches
Provider Pinning
Each agent is pinned to a specific OpenRouter provider via only: [...] in models.json. This ensures:
- KV cache continuity — same provider = cache hits on resume
- Predictable latency — always route through the same serving host
- Cost control — token savings when context prefix is cached
See Pi Agent Extensions & Skills#models.json for the full routing table.
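As a rough illustration of what pinning amounts to at the request level: OpenRouter's provider routing accepts a `provider.only` list and an `allow_fallbacks` flag. The helper name `pinnedRequest`, the model slug, and the provider slug below are illustrative, and the actual layout of models.json is not shown in this note, so treat the shape as an assumption rather than the real file format.

```typescript
// Provider preferences as they appear in an OpenRouter request body.
interface ProviderPrefs {
  only: string[];           // restrict routing to these provider slugs
  allow_fallbacks: boolean; // false = never fall back to another host
}

interface PinnedRequest {
  model: string;
  provider: ProviderPrefs;
}

// Hypothetical helper: build the routing part of a request for one pinned provider.
function pinnedRequest(model: string, providerSlug: string): PinnedRequest {
  return {
    model,
    provider: { only: [providerSlug], allow_fallbacks: false },
  };
}

// Illustrative slugs; check the routing table for the real values.
const pinned = pinnedRequest("deepseek/deepseek-chat", "deepinfra");
console.log(JSON.stringify(pinned));
```

Disabling fallbacks is the design choice that matters here: a silent fallback to another host would answer the request but miss the warm cache.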
Persistent Memory
Agents with memory: project in their frontmatter get a persistent directory:
.pi/agent-memory/<agent-name>/ ← per-project, committed to git
.pi/agent-memory-local/<agent-name>/ ← per-project, gitignored
~/.pi/agent-memory/<agent-name>/ ← global, across all projects
Memory survives across resume calls. Agents build up knowledge over time.
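The three locations above can be read as a scope-to-path mapping. A minimal sketch, assuming scope names `local` and `user` for the second and third directories (only `project` is confirmed by this note) and a hypothetical helper `memoryDir`:

```typescript
// Only "project" appears in this note; "local" and "user" are assumed names.
type MemoryScope = "project" | "local" | "user";

// Hypothetical helper: resolve the memory directory for an agent.
function memoryDir(
  scope: MemoryScope,
  agentName: string,
  projectRoot: string,
  homeDir: string
): string {
  switch (scope) {
    case "project": // per-project, committed to git
      return `${projectRoot}/.pi/agent-memory/${agentName}`;
    case "local": // per-project, gitignored
      return `${projectRoot}/.pi/agent-memory-local/${agentName}`;
    case "user": // global, across all projects
      return `${homeDir}/.pi/agent-memory/${agentName}`;
  }
}

console.log(memoryDir("project", "obsidian", "/repo", "/home/me"));
```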
Session-Scoped Registry
Agents belong to a pi session, and each session is tied to a project directory. When you run /new or /resume, that session's agents are available.
Creating Agents
Ask the LLM
Simply describe what you need. The LLM will ask for a name if you don't provide one:
You: "Find all authentication files"
LLM: "What should I name this agent?"
You: "auth-finder"
LLM: *spawns agent, saves to registry*
"Created 'auth-finder' (chat-search, Google). ID: 7efad0d8"
Agent Types Available
| Type | Model | Provider | Memory | Tools | Use For |
|---|---|---|---|---|---|
| chat-search | gemini-2.5-flash:free | | project | read, bash, grep, find | Web search, quick lookups |
| coder-basic | deepseek-chat | DeepInfra | project | read, bash, write, grep, find | Simple code edits |
| coder-pro | deepseek-v4-pro | DeepInfra | project | read, bash, write, grep, find, edit | Complex architecture |
| code-analysis | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, grep, find | Security review, analysis |
| code-ingest | gemini-2.5-flash:free | | project | read, bash, grep, find | Scan/ingest docs, GitHub |
| database | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write, grep | SQL, schema design |
| devops-basic | deepseek-chat | DeepInfra | project | read, bash, write, grep, find | Docker, YAML, NixOS |
| devops-pro | deepseek-v4-pro | DeepInfra | project | read, bash, write, grep, find, edit | Complex infrastructure |
| document-writer | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write | Documents, letters |
| file-ops | qwen-coder-32b-instruct | opencode-go | project | read, bash, grep, find, write | Filesystem, drives |
| home-automation | deepseek-r1-distill-qwen-32b | opencode-go | project | read, bash, write, grep | MQTT, Home Assistant |
| image-maker | flux-1-schnell | Venice | project | read, bash, write | Image generation |
| iot-coder | qwen-coder-32b-instruct | opencode-go | project | read, bash, write, grep, find | Arduino, ESP32 |
| iot-hardware | kimi-k2.6 | Moonshot AI | project | read, bash, grep, find | Hardware specs, boards |
| obsidian | deepseek-r1-distill-qwen-14b | opencode-go | project | read, bash, write, grep, find | Obsidian notes |
| research | gemini-2.5-flash:free | | project | read, bash, grep, find | Web search, docs |
| video-analyze | qwen-2.5-vl | Alibaba | project | read, bash, grep, find | Video/image analysis |
| vscode-setup | qwen-3-coder-next | StreamLake | project | read, bash, grep, find | VS Code, AI tooling |
| Explore | haiku/inherit | default | - | read, bash, grep, find, ls | Fast codebase search (built-in) |
| Plan | inherit | default | - | read, bash, grep, find, ls | Implementation planning (built-in) |
| general-purpose | inherit | default | - | all tools | General tasks (built-in) |
Managing Agents
List Agents
Type /agents to open the interactive menu:
Running agents (2) — 1 running, 1 done
Agent types (21)
Create new agent
Settings
Select "Running agents" to see active agents with status, tool uses, and duration.
View Conversation
From /agents → "Running agents" → select an agent:
- Live scrolling overlay of the agent's full conversation
- Auto-follows new output
- Press x to stop a running agent
- Scroll up to pause auto-follow
Resume an Agent
You: "Resume auth-finder and also check the logout module"
LLM: *looks up agent ID from context*
Agent({ subagent_type: "chat-search", resume: "7efad0d8-...", prompt: "Also check the logout module" })
The agent continues with its full context preserved. Provider KV cache is warm.
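The lookup step can be pictured as a name-to-ID search followed by building the `resume` arguments. This is a sketch only: the `RegistryEntry` shape and the `buildResumeArgs` helper are hypothetical, since the registry's real storage format is not described in this note.

```typescript
// Hypothetical registry entry; the real on-disk format is not documented here.
interface RegistryEntry {
  agentName: string;    // human-readable name, e.g. "auth-finder"
  agentId: string;      // ID reported when the agent was spawned
  subagentType: string; // agent type, e.g. "chat-search"
}

interface ResumeArgs {
  subagent_type: string;
  resume: string;
  prompt: string;
}

// Resolve a name to its ID and build the arguments for a resume call.
function buildResumeArgs(
  registry: RegistryEntry[],
  agentName: string,
  prompt: string
): ResumeArgs {
  const entry = registry.find((e) => e.agentName === agentName);
  if (!entry) {
    throw new Error(`No agent named '${agentName}' in this session`);
  }
  return { subagent_type: entry.subagentType, resume: entry.agentId, prompt };
}

const sessionAgents: RegistryEntry[] = [
  { agentName: "auth-finder", agentId: "7efad0d8", subagentType: "chat-search" },
];
console.log(buildResumeArgs(sessionAgents, "auth-finder", "Also check the logout module"));
```

Resuming with the same `subagent_type` keeps the request on the same pinned provider, which is what makes the cached prefix reusable.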
Steer a Running Agent
You: "Tell auth-finder to focus on the API routes only"
LLM: steer_subagent({ agent_id: "7efad0d8-...", message: "Focus on API routes only, ignore UI" })
The agent receives the message after its current tool execution and adjusts course.
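One way to picture the "after its current tool execution" timing is a queue that the agent loop drains between tool calls. The `SteerQueue` class below is a hypothetical model of that behavior, not the extension's actual implementation.

```typescript
// Hypothetical model: steering messages wait until the current tool call ends.
class SteerQueue {
  private pending: string[] = [];

  // Called when the parent session steers the agent.
  steer(message: string): void {
    this.pending.push(message);
  }

  // Called by the agent loop after each tool execution.
  // Returns the queued messages in arrival order and clears the queue.
  drainAfterTool(): string[] {
    const delivered = this.pending;
    this.pending = [];
    return delivered;
  }
}

const steerQueue = new SteerQueue();
steerQueue.steer("Focus on API routes only, ignore UI");
console.log(steerQueue.drainAfterTool()); // delivered once the tool call finishes
console.log(steerQueue.drainAfterTool()); // nothing left to deliver
```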
Delete an Agent
You: "Delete auth-finder"
LLM: *removes from registry, aborts if running*
"Deleted 'auth-finder'"
Or delete all agents for a session:
You: "Clear all subagents for computer-software"
LLM: *removes all matching entries*
"Deleted 4 subagents from session 'computer-software'"
Function Calls
Create
Agent({
  subagent_type,      // string: agent type (e.g. "coder-pro")
  prompt,             // string: the task
  description,        // string: 3-5 word summary
  model?,             // string: provider/model override
  thinking?,          // string: off|minimal|low|medium|high|xhigh
  max_turns?,         // number: max agentic turns (default: unlimited)
  run_in_background?, // boolean: run async (default: false)
  resume?,            // string: agent ID to resume a previous session
  isolated?,          // boolean: no extension/MCP tools
  isolation?,         // "worktree": run in isolated git worktree
  inherit_context?    // boolean: fork parent conversation into agent
})
Retrieve
get_subagent_result({
  agent_id, // string: the agent ID from spawn notification
  wait?,    // boolean: block until complete (default: false)
  verbose?  // boolean: include full conversation (default: false)
})
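The Create and Retrieve parameter lists above combine into a common pattern: spawn in the background, then fetch the result by ID. The sketch below only builds typed argument objects. The interfaces cover a subset of the parameters, and `resultArgsFor` is a hypothetical helper, since these are tool calls made by the LLM rather than a documented TypeScript API.

```typescript
// Argument shapes transcribed from the parameter lists above (subset only).
interface AgentArgs {
  subagent_type: string;
  prompt: string;
  description: string;
  run_in_background?: boolean;
  max_turns?: number;
}

interface GetResultArgs {
  agent_id: string;
  wait?: boolean;
  verbose?: boolean;
}

// Step 1: spawn without blocking the main session.
const spawnArgs: AgentArgs = {
  subagent_type: "coder-pro",
  prompt: "Refactor the login module",
  description: "Refactor login module",
  run_in_background: true,
};

// Step 2 (hypothetical helper): build the retrieval call once the
// spawn notification has supplied an agent ID.
function resultArgsFor(agentId: string, blockUntilDone: boolean): GetResultArgs {
  return { agent_id: agentId, wait: blockUntilDone, verbose: false };
}

console.log(spawnArgs, resultArgsFor("7efad0d8", true));
```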
Steer
steer_subagent({
  agent_id, // string: running agent ID
  message   // string: message injected after current tool execution
})
Compress
compress_for_agent({
  content // string: content >20K chars to compress via Headroom
})
// Returns: { compressed, originalLength, compressedLength, tokensBefore, tokensAfter, savingsPercent }
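The ">20K chars" note suggests a simple length gate before calling the tool. A minimal sketch, assuming the threshold means exactly 20,000 characters and using a hypothetical helper `compressArgsIfLarge`:

```typescript
// Assumed reading of ">20K chars": strictly more than 20,000 characters.
const COMPRESS_THRESHOLD_CHARS = 20000;

interface CompressArgs {
  content: string;
}

// Hypothetical gate: return call arguments only when the content is
// long enough to be worth compressing, otherwise null.
function compressArgsIfLarge(content: string): CompressArgs | null {
  return content.length > COMPRESS_THRESHOLD_CHARS ? { content } : null;
}

console.log(compressArgsIfLarge("short text"));                  // not worth compressing
console.log(compressArgsIfLarge("x".repeat(25000)) !== null);    // large enough
```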
Token Savings
The point of resuming a pinned agent is to minimize the context tokens uploaded to the API provider on each request:
| Scenario | Context Uploaded | Cost |
|---|---|---|
| Fresh agent every request | Full system prompt + context (~10K tokens) | ~$0.50-1.00 per request |
| Resume same agent | Only new prompt (~100 tokens) + cached prefix hits KV cache | ~$0.01-0.05 per request |
| Savings | 90-95% reduction | ~$0.50 vs $5.00 for 5 questions |
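The percentage in the table can be checked with one line of arithmetic. The sketch below uses the table's own rough per-request figures (estimates, not measured prices), and `costReductionPercent` is a name made up for this example.

```typescript
// Percentage saved per request when a resumed call replaces a fresh one.
function costReductionPercent(freshCost: number, resumedCost: number): number {
  return Math.round((1 - resumedCost / freshCost) * 100);
}

// Upper resumed estimate ($0.05) against the fresh range ($0.50 to $1.00).
console.log(costReductionPercent(0.5, 0.05)); // 90
console.log(costReductionPercent(1.0, 0.05)); // 95
```

Under these estimates the 90-95% range comes from comparing the top of the resumed range against the bottom and top of the fresh range.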
Files
| File | Purpose |
|---|---|
| ~/.local/share/npm-global/lib/node_modules/@tintinweb/pi-subagents/ | Extension source |
| ~/.pi/agent/agents/*.md | Agent type definitions (18 custom + 3 built-in) |
| ~/.pi/agent/models.json | OpenRouter provider pinning (only: provider) |
| .pi/agent-memory/<name>/ | Per-agent persistent memory (project scope) |
| .pi/agent-memory-local/<name>/ | Per-agent persistent memory (gitignored) |
| ~/.pi/agent/subagents.json | pi-subagents settings (maxConcurrent, etc.) |
Related
- Pi Agent Extensions & Skills — full extensions/skills reference
- Engram Memory — persistent memory service on .13
- Headroom Compression — context compression via .13:8787
- OpenRouter Provider Routing — provider pinning for cache continuity