---
created: 2026-05-16
modified: 2026-05-16
type: note
tags:
- ai
- dev-ops
aliases: []
---

# Pi Agent Extensions & Skills

## Source Repositories

| Source | Location |
|---|---|
| Gitea (package) | `git:https://gitea.lab.audasmedia.com.au/sam/pi-config` |
| Local filesystem | `~/.agents/` |
| Project settings | `sys_config/.pi/settings.json`, `ai_setup/.pi/settings.json` |

---

## Extensions

| Extension | Source | Purpose |
|---|---|---|
| **pi-config** | `~/.agents` | `/config-add`, `/config-remove`, `/config-show`, `/config-setup` — manage which extensions/skills are active in a project |
| **tavily-search** | Gitea | `tavily_search` — web search via Tavily API (AI-optimized) |
| **web-fetch** | `~/.agents` | `web_fetch` — fetch any URL and return clean markdown (HTML, PDF, JS-rendered with Jina fallback) |
| **ask-user-question** | `~/.agents` | `ask_user_question` — LLM presents structured multiple-choice / text questions with keyboard UI |
| **video-extract** | `~/.agents` | `video_extract` — extract frames from YouTube/local video + full Gemini analysis (requires ffmpeg + yt-dlp + GEMINI_API_KEY) |
| **filechanges** | `~/.agents` | `/filechanges`, `/filechanges-accept`, `/filechanges-decline` — tracks every file the LLM edits or writes, with diff review and revert |
| **pi-prompt-template-model** | npm (global) | Model-switching prompt templates with frontmatter. See [[#Prompt Templates]] section below |
| **pi-mcp-adapter** | npm (global) | Single proxy tool (~200 tokens) replaces hundreds of MCP tool definitions. `/mcp` command for management. Lazy server connections |
| **pi-graphify** | `~/.agents` | Knowledge graph tools: build, query, path tracing, explain, watch, add, update |
| **plannotator** | `~/.agents` | Interactive plan review with browser UI, annotations, code review |
| **caveman** | `~/.agents` | Ultra-compressed communication mode |
| **markitdown** | `~/.agents` | Convert files (PDF, Word, Excel, PPTX, images, HTML, etc.) to Markdown. Image analysis via Qwen 2.5 VL 72B on OpenRouter |
| **@tintinweb/pi-subagents** | npm (global) | 18 custom agent types, background agents, mid-run steering, session resume, worktree isolation, scheduling, cross-extension RPC. See [[Pi Subagent]] for full documentation |
| **@tintinweb/pi-tasks** | npm (global) | Task management with dependency tracking, auto-cascade, background process tracking. `TaskExecute` spawns subagents via RPC |
| **gentle-engram** | `~/.agents` | Memory service connecting to engram via `ENGRAM_URL`. Replaces memory-vault. Session capture, compaction recovery |
| **headroom-bridge** | `~/.agents` | `compress_for_agent` tool — compresses >20K chars via Headroom Docker on .13:8787. 60-95% token reduction |

### pi-subagents (@tintinweb)

v0.10.2, installed globally. 18 custom agent types in `~/.pi/agent/agents/` → `~/.agents/agents/`. Tools: `Agent()`, `get_subagent_result()`, `steer_subagent()`. `/agents` command for interactive management. Features: background agents with concurrency (4 default), mid-run steering, session resume, worktree isolation, scheduling, persistent widget showing live agent status. Cross-extension RPC event bus. Patched: 24h agent survival (was 10min), clearDisabled on session start. See [[Pi Subagent]] for full documentation.

### pi-tasks (@tintinweb)

v0.7.0, installed globally. 7 task tools: `TaskCreate`, `TaskList`, `TaskGet`, `TaskUpdate`, `TaskOutput`, `TaskStop`, `TaskExecute`. `/tasks` command. Features: dependency tracking (blocks/blockedBy), auto-cascade, background process tracking, persistent widget. `TaskExecute` spawns subagents via RPC.

---

## Skills

| Skill | Purpose |
|---|---|
| **nixos-workflow** | STRICT workflow for managing Pi assets via Gitea on NixOS |
| **system-architect** | Multi-machine NixOS infrastructure (Snapcast, MQTT, Docker, Nvim) |
| **obsidian-cli** | Interact with Obsidian vault (notes, search, plugin dev, theme dev) |
| **graphify** | Full-pipeline knowledge graph orchestration |
| **caveman** | Caveman communication mode |
| **openspec-propose** | Propose new changes with design docs, specs, tasks |
| **openspec-apply-change** | Implement tasks from an OpenSpec change |
| **openspec-archive-change** | Archive completed changes |
| **openspec-explore** | Explore ideas and clarify requirements |
| **npm-security** | Scan packages with SafeDep Vet, check typosquatting with npq, wrap installs with Socket Firewall |
| **markitdown** | Convert files (PDF, Word, Excel, PowerPoint, images, HTML, CSV, JSON, XML, ZIP, EPubs, YouTube) to Markdown for LLM consumption. Image analysis via Qwen 2.5 VL 72B on OpenRouter |

---

## markitdown

Convert various file formats to Markdown. Useful for feeding documents and images into LLMs.

### What it converts

| Format | Input | Notes |
|--------|-------|-------|
| PDF | `.pdf` | Preserves structure (headings, lists, tables) |
| Word | `.docx` | mammoth + lxml |
| PowerPoint | `.pptx` | python-pptx |
| Excel | `.xlsx`, `.xls` | openpyxl + pandas |
| Images | `.jpg`, `.png`, etc. | EXIF metadata (free) + LLM vision description (via OpenRouter) |
| HTML | `.html` | beautifulsoup4 |
| CSV / JSON / XML | `.csv`, `.json`, `.xml` | Structured data → Markdown tables |
| ZIP | `.zip` | Iterates contents, converts each file |
| EPubs | `.epub` | |
| YouTube | URLs | Transcript extraction |

### CLI usage

```bash
# Convert file to Markdown (stdout)
markitdown document.pdf

# Write to file
markitdown document.pdf -o document.md

# Image with LLM vision description
markitdown-vision photo.jpg
```
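
For batch work, the `-o` flag shown above can be wrapped in a small loop. The following is a minimal sketch, not part of the markitdown package: `convert_pdfs` is a hypothetical helper name, and it assumes each output should sit next to its source file with a `.md` extension.

```bash
# Hypothetical helper: convert every PDF in a directory, writing <name>.md beside it.
# Only `markitdown <file> -o <out>` comes from the usage above; the rest is illustrative.
convert_pdfs() {
  local dir="${1:-.}" f
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue              # no PDFs: the glob stays literal, so skip it
    markitdown "$f" -o "${f%.pdf}.md"    # strip .pdf, append .md
  done
}
```

Calling `convert_pdfs some/dir` would then write one `.md` file per PDF found in that directory; with no argument it works on the current directory.
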
### Image analysis

Two levels:

1. **EXIF metadata only** (free, no API key): `markitdown photo.jpg`
2. **LLM vision description** (via OpenRouter, requires API key): `markitdown-vision photo.jpg`

The `markitdown-vision` wrapper auto-sources `OPENROUTER_API_KEY` from `~/.config/environment.d/10-secrets.conf` and uses `qwen/qwen2.5-vl-72b-instruct`.

### Missing / can be added

| Feature | What's needed |
|---------|--------------|
| Audio transcription | `pip install markitdown[audio-transcription]` (pydub + speechrecognition) |
| Azure AI Document Intelligence | `pip install markitdown[az-doc-intel]` + Azure credentials |
| Azure Content Understanding | `pip install markitdown[az-content-understanding]` + Azure credentials |
| markitdown-ocr plugin | Installed but needs OpenRouter key enabled to activate |

---

## Security Tools (npm Global)

Three tools installed globally at `~/.local/share/npm-global/bin/` to guard package installs.

### SafeDep Vet (`vet`)

Scans local directories for multi-language malware signatures. Catches obfuscated code, suspicious imports, base64 payloads.

```bash
# Scan a cloned repo before touching it
vet scan -D . --format json --filter "package.malware == true"

# Scan package metadata from npm registry
vet scan package <name> --format json
```

### Socket Firewall (`socket`)

Wraps npm/pip installs with real-time scanning. Blocks malicious packages at install time.

```bash
# Safe npm install
socket npm install <package>

# Safe pip install
socket pip install -r requirements.txt
```

### npq

Checks package names against typosquatting lists before install. Lightweight, local, no phoning home.

```bash
npq check <package> --json
```

### Workflow

```
1. vet scan       → checks for malware in the code/package
2. npq check      → checks the package name for typosquatting
3. socket install → wraps the actual install with runtime scanning
```
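
The three steps can be chained so that a failed check stops the install. This is a sketch under assumptions: `safe_npm_install` is a hypothetical wrapper name, the command lines are copied from the examples above, and it assumes `vet` and `npq` exit non-zero when they flag a package, which should be verified against each tool before relying on it.

```bash
# Hypothetical wrapper: run the checks in workflow order and stop at the first failure.
# Assumes each tool signals a finding through a non-zero exit status.
safe_npm_install() {
  local pkg="$1"
  [ -n "$pkg" ] || { echo "usage: safe_npm_install <package>" >&2; return 2; }
  vet scan package "$pkg" --format json || return 1   # 1. malware scan
  npq check "$pkg" --json || return 1                 # 2. typosquatting check
  socket npm install "$pkg"                           # 3. install behind Socket Firewall
}
```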

The **npm-security** skill instructs the Pi agent to follow this workflow before any install.

## Headroom

Headroom is a **context compression layer** that reduces prompt token usage by 60-95% for heavy analysis/code/devops workloads. It runs as a Docker container on the server (192.168.20.13) and is triggered by the headroom-bridge extension.

### How it works

1. headroom-bridge detects analysis/code/devops contexts
2. Tags `read`, `discuss`, `search` **never** trigger compression (these are fast paths)
3. For all other tags, if accumulated context exceeds ~5K tokens, headroom-bridge calls `compress()`
4. Messages are sent to the Headroom proxy at `192.168.20.13:8787`
5. Headroom compresses the context (using SmartCrusher for JSON, CodeCompressor for AST, Kompress-base ML for text)
6. Compressed messages are returned and forwarded to the LLM
7. If the proxy is down, messages pass through unchanged (graceful fallback)

### Architecture

```
Desktop (.27)                           Server (.13)
─────────────                           ────────────
headroom-bridge                         headroom proxy (Docker)
    │                                       │
    │  if compress needed:                  │
    │  compress_for_agent(content) ────────►│
    │  HTTP POST 192.168.20.13:8787         │
    │  ◄────────── compressed content       │
    │                                       │
    │  return to agent                      │
    │                                       │
    │  if proxy down: pass through          │
```
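
The decision flow described in this section (skip fast-path tags, skip small contexts, fall back when the proxy is down) could be sketched as follows. This is an illustration, not the headroom-bridge source: `should_compress` is a hypothetical name, the cutoff of exactly 5000 tokens stands in for the "~5K" figure, and token counting and proxy health checks are left to the caller.

```bash
# Hypothetical decision helper mirroring the flow above.
# Usage: should_compress <tag> <context_tokens> [proxy_up: yes|no]
should_compress() {
  local tag="$1" tokens="$2" proxy_up="${3:-yes}"
  case "$tag" in
    read|discuss|search) echo "skip"; return 0 ;;   # fast paths never compress
  esac
  if [ "$tokens" -lt 5000 ]; then
    echo "skip"          # too small to benefit (assumed cutoff)
  elif [ "$proxy_up" != "yes" ]; then
    echo "passthrough"   # graceful fallback: send messages unchanged
  else
    echo "compress"
  fi
}
```

For instance, `should_compress code 12000 yes` prints `compress`, while `should_compress read 12000 yes` prints `skip`.
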

### Compression thresholds

| Condition | Action |
|---|---|
| Tag is `read`/`discuss`/`search` | Skip — no compression |
| Context < 5K tokens | Skip — too small to benefit |
| Context ≥ 5K tokens + analysis/code/devops tag | Compress |
| Proxy unreachable | Pass through unchanged |

### headroom-bridge

Path: `~/.agents/extensions/headroom-bridge/index.ts`. Tool: `compress_for_agent({ content })` — compresses content >20K chars via Headroom Docker on .13:8787. 60-95% token reduction. See [[#Function Calls]] for signature.

### Management

```bash
# Check status
ssh 192.168.20.13 "docker ps --filter name=headroom"

# View logs
ssh 192.168.20.13 "docker logs --tail 20 headroom"

# Restart
ssh 192.168.20.13 "docker restart headroom"

# Update image
ssh 192.168.20.13 "cd /home/sam/Docker/Containers/headroom && docker compose pull && docker compose up -d"
```

### Files

| File | Purpose |
|------|---------|
| `~/.agents/extensions/headroom-bridge/index.ts` | `compress_for_agent` tool implementation |
| `/home/sam/Docker/Containers/headroom/docker-compose.yml` | Docker service definition (on .13) |
| `/home/sam/Docker/Containers/headroom/.env` | Environment file (on .13) |

## engram + gentle-engram

engram v1.16.1 runs on nixos-desktop (.13) at `~/.local/bin/engram`. Systemd user service auto-starts. Binds `127.0.0.1:7437`, accessed via SSH tunnel: `ssh -fNL 7437:127.0.0.1:7437 192.168.20.13`.
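
Because `ssh -f` backgrounds a new tunnel on every call, it can help to open it only when nothing is listening locally yet. A minimal sketch, assuming bash (for `/dev/tcp`); `port_open` and `ensure_engram_tunnel` are hypothetical helper names, and only the `ssh` command line itself comes from this note.

```bash
# Hypothetical helpers; requires bash for the /dev/tcp pseudo-device.
port_open() {              # port_open <host> <port>: succeeds if a TCP connect works
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

ensure_engram_tunnel() {   # open the tunnel only if the local port is not answering
  port_open 127.0.0.1 7437 && return 0
  ssh -fNL 7437:127.0.0.1:7437 192.168.20.13
}
```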

**gentle-engram** connects via `ENGRAM_URL=http://127.0.0.1:7437`. Replaces memory-vault. Features: session capture, compaction recovery, private block redaction.

---

## MCP Servers

pi-mcp-adapter connects Pi to external services via the Model Context Protocol.

**Config file:** `~/.config/mcp/mcp.json`

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/sam"]
    }
  }
}
```

**Find MCP servers at:**
- [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers)
- [smithery.ai](https://smithery.ai) — community registry

**Usage:**
- `/mcp` — interactive panel to manage servers
- `mcp({ search: "..." })` — search available tools
- `mcp({ tool: "tool_name", args: '{}' })` — call a tool
- Servers are lazy (connect on first use, disconnect after 10 min idle)

---

## Configuration Files

### Global (`~/.pi/agent/settings.json`)
- Nix store symlink — managed via `/etc/nixos/home/sam/home.nix`
- Contains: providers (opencode-go, openrouter, google), packages (pi-memctx, pi-prompt-template-model, Gitea)
- **Read-only** — cannot be modified by `pi install` or `/config-add`

### Project (`<project-dir>/.pi/settings.json`)
- Overrides global settings (arrays replace, not merge)
- Contains: `~/.agents` package (extensions + skills), Gitea package (tavily-search)
- Modified via `/config-add` / `/config-remove` commands

### Per-folder Memory (via pi-memctx)
- Memory stored in `<chat-folder>/.pi/memory-vault/packs/`
- Workspace map at `~/.pi/agent/memory-vault/00-system/workspace-map.json`
- Each chat folder has isolated memory (prevents sibling directory contamination)

### models.json (Provider Routing)

Path: `~/.pi/agent/models.json` (symlinked from `~/.agents/models.json` for Gitea). Injects OpenRouter provider routing for KV cache sharing. DeepSeek cache discount: 91.7% ($0.145/M vs $1.74/M input). Configured: deepseek, qwen, minimax, moonshotai. Mechanism: pi-core reads `model.compat.openRouterRouting` → injects `provider: { order: [...] }`.
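
As a quick check on the quoted discount, the percentage follows directly from the two prices. This sketch assumes $0.145/M is the cached-input price and $1.74/M the uncached one; `cache_discount` is a hypothetical helper name.

```bash
# cache_discount <cached_price> <full_price>: percentage saved on cached input tokens.
cache_discount() {
  LC_ALL=C awk -v cached="$1" -v full="$2" \
    'BEGIN { printf "%.1f\n", (1 - cached / full) * 100 }'
}

cache_discount 0.145 1.74   # prints 91.7
```
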

### subagents.json

Path: `~/.pi/agent/subagents.json`. Configuration for pi-subagents agent types, default models, and providers.

---

## Useful Commands

| Command | What it does |
|---|---|
| `/config-setup` | One-shot: creates `.pi/`, `settings.json`, memory vault in current folder |
| `/config-add ext <name>` | Activate an extension from `~/.agents` |
| `/config-add skill <name>` | Activate a skill from `~/.agents` |
| `/config-show` | Show active extensions and skills |
| `/agents` | Interactive agent management (pi-subagents) |
| `/tasks` | Interactive task management (pi-tasks) |
| `/memctx-init` | Scan folder, build initial memory pack |
| `/memctx-status` | Show memory status |
| `/memctx-refresh` | Re-scan and enrich memory |
| `/filechanges` | Review changed files, diffs, accept/decline |
| `/filechanges-accept` | Accept all changes |
| `/filechanges-decline` | Revert all changes |
| `markitdown <file>` | Convert file to Markdown (PDF, Word, Excel, PPTX, images, HTML, etc.) |
| `markitdown-vision <file>` | Describe image using Qwen 2.5 VL 72B via OpenRouter |
| `Agent()` | Spawn subagent (pi-subagents) |
| `get_subagent_result()` | Retrieve results from background agent |
| `steer_subagent()` | Send mid-run steering message |
| `compress_for_agent()` | Compress large content via Headroom |

## Agent Roster (18)

| Agent | Model | Provider |
|-------|-------|----------|
| chat-search | google/gemini-2.5-flash:free | openrouter |
| code-analysis | deepseek-r1-distill-qwen-32b | opencode-go |
| code-ingest | google/gemini-2.5-flash:free | openrouter |
| coder-basic | deepseek/deepseek-chat | openrouter |
| coder-pro | deepseek/deepseek-v4-pro | openrouter |
| database | deepseek-r1-distill-qwen-32b | opencode-go |
| devops-basic | deepseek/deepseek-chat | openrouter |
| devops-pro | deepseek/deepseek-v4-pro | openrouter |
| document-writer | deepseek-r1-distill-qwen-32b | opencode-go |
| file-ops | qwen-coder-32b-instruct | opencode-go |
| home-automation | deepseek-r1-distill-qwen-32b | opencode-go |
| image-maker | black-forest-labs/flux-1-schnell | openrouter |
| iot-coder | qwen-coder-32b-instruct | opencode-go |
| iot-hardware | moonshotai/kimi-k2.6 | openrouter |
| obsidian | deepseek-r1-distill-qwen-14b | opencode-go |
| research | google/gemini-2.5-flash:free | openrouter |
| video-analyze | qwen/qwen-2.5-vl | openrouter |
| vscode-setup | qwen/qwen-3-coder-next | openrouter |

## Function Calls

```
Agent({ subagent_type, prompt, description, model?, thinking?, max_turns?, run_in_background?, resume?, isolated?, isolation?, inherit_context? })
get_subagent_result({ agent_id, wait?, verbose? })
steer_subagent({ agent_id, message })
compress_for_agent({ content })
```

---

## Skipped / Bookmarked

| Extension/Skill | Reason |
|---|---|
| **web-search** (amosblomqvist) | ❌ Redundant — Tavily does this |
| **subagents** (amosblomqvist) | ❌ Redundant — pi-subagents already installed |
| **bash-guard** (amosblomqvist) | ❌ Too aggressive — would interrupt flow |
| **google-image-search** (amosblomqvist) | ❌ Would need Google Search API + CSE setup |
| **pdf-reader** (amosblomqvist) | ⏳ Bookmarked — Python + pymupdf setup needed |
| **notify** (mitsuhiko) | ⏳ Minor QoL — desktop notifications on task complete |
| **audio/voice** | ⏳ Not practical: Pi TUI has no mic access or audio playback — fundamental platform limitation |

---

## Tasks

- [x] Rebuild NixOS to activate new packages ✅ 2026-06-11
- [x] Migrate to @tintinweb/pi-subagents with 18 agent types ✅ 2026-06-14
- [x] Deploy models.json with OpenRouter provider pinning (only: provider) ✅ 2026-06-14
- [x] Deploy engram + gentle-engram memory service on .13 ✅ 2026-06-13
- [x] Install pi-tasks (@tintinweb/pi-tasks) v0.7.0 ✅ 2026-06-13
- [x] Patch pi-subagents: 24h cleanup, disable clearCompleted ✅ 2026-06-14
- [x] Add `memory: project` to the frontmatter of all 18 agents ✅ 2026-06-14
- [x] Remove custom subagent-registry skill (built-in /agents menu is sufficient) ✅ 2026-06-14
- [x] Update Pi Subagent.md and Pi Agent Extensions & Skills.md documentation ✅ 2026-06-14
- [x] Update pi-subagents to v0.10.3 ✅ 2026-06-14
- [x] Update pi-tasks to v0.7.0 ✅ 2026-06-13
- [ ] Verify video-extract works with Gemini
- [ ] Clean up workspace-map.json entries for any stale memory packs