docs: add markitdown to Pi Agent Extensions & Skills

2026-06-08 12:12:21 +10:00
parent f25d78de0a
commit 3fa4903fc4


@@ -36,6 +36,7 @@ aliases: []
| **pi-graphify** | `~/.agents` | Knowledge graph tools: build, query, path tracing, explain, watch, add, update |
| **plannotator** | `~/.agents` | Interactive plan review with browser UI, annotations, code review |
| **caveman** | `~/.agents` | Ultra-compressed communication mode |
| **markitdown** | `~/.agents` | Convert files (PDF, Word, Excel, PPTX, images, HTML, etc.) to Markdown. Image analysis via Qwen 2.5 VL 72B on OpenRouter. |
---
@@ -53,6 +54,59 @@ aliases: []
| **openspec-archive-change** | Archive completed changes |
| **openspec-explore** | Explore ideas and clarify requirements |
| **npm-security** | Scan packages with SafeDep Vet, check typosquatting with npq, wrap installs with Socket Firewall |
| **markitdown** | Convert files (PDF, Word, Excel, PowerPoint, images, HTML, CSV, JSON, XML, ZIP, EPubs, YouTube) to Markdown for LLM consumption. Image analysis via Qwen 2.5 VL 72B on OpenRouter. |
---
## markitdown
Convert various file formats to Markdown. Useful for feeding documents and images into LLMs.
### What it converts
| Format | Input | Notes |
|--------|-------|-------|
| PDF | `.pdf` | Preserves structure (headings, lists, tables) |
| Word | `.docx` | mammoth + lxml |
| PowerPoint | `.pptx` | python-pptx |
| Excel | `.xlsx`, `.xls` | openpyxl + pandas |
| Images | `.jpg`, `.png`, etc. | EXIF metadata (free) + LLM vision description (via OpenRouter) |
| HTML | `.html` | beautifulsoup4 |
| CSV / JSON / XML | `.csv`, `.json`, `.xml` | Structured data → Markdown tables |
| ZIP | `.zip` | Iterates contents, converts each file |
| EPubs | `.epub` | |
| YouTube | URLs | Transcript extraction |
### CLI usage
```bash
# Convert file to Markdown (stdout)
markitdown document.pdf
# Write to file
markitdown document.pdf -o document.md
# Image with LLM vision description
markitdown-vision photo.jpg
```
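For several files at once, the commands above can be wrapped in a loop. This is a minimal sketch that assumes only the `markitdown <file> -o <out>` form shown above; the `convert_dir` function and its two directory arguments are hypothetical and not part of markitdown.

```bash
# Hypothetical helper (not part of markitdown): convert every regular file
# in a source directory to Markdown in an output directory.
convert_dir() {
  local src_dir="$1" out_dir="$2" file base
  mkdir -p "$out_dir"
  for file in "$src_dir"/*; do
    [ -f "$file" ] || continue   # skip subdirectories and an unmatched glob
    base="$(basename "$file")"
    # report.pdf -> report.md (files sharing a stem overwrite each other)
    if markitdown "$file" -o "$out_dir/${base%.*}.md"; then
      echo "converted: $base"
    else
      echo "failed: $base" >&2
    fi
  done
}
```

Calling it as, for example, `convert_dir ./inbox ./converted` would write one `.md` file per input and report failures on stderr without stopping the loop.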
### Image analysis
Image conversion works at two levels:
1. **EXIF metadata only** (free, no API key): `markitdown photo.jpg`
2. **LLM vision description** (via OpenRouter, requires API key): `markitdown-vision photo.jpg`
The `markitdown-vision` wrapper auto-sources `OPENROUTER_API_KEY` from `~/.config/environment.d/10-secrets.conf` and uses `qwen/qwen2.5-vl-72b-instruct`.
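The wrapper's source is not shown here, so the following is only a sketch of how the key-loading step could work, assuming the secrets file uses the plain `KEY=VALUE` lines of the systemd `environment.d` format. The function names `read_conf_value` and `ensure_openrouter_key` are hypothetical, and the step that passes the key and model to markitdown is deliberately left out.

```bash
# Hypothetical helpers, not the actual markitdown-vision wrapper.

# Print the value of KEY from a KEY=VALUE file; fail if missing or empty.
read_conf_value() {
  local key="$1" conf_file="$2" value
  # Emit the right-hand side of every "KEY=" line and keep the last one.
  value="$(sed -n "s/^${key}=//p" "$conf_file" 2>/dev/null | tail -n 1)"
  # Strip one pair of surrounding double quotes, if present.
  value="${value%\"}"
  value="${value#\"}"
  [ -n "$value" ] || return 1
  printf '%s\n' "$value"
}

# Export OPENROUTER_API_KEY from the secrets file unless it is already set.
ensure_openrouter_key() {
  local conf_file="${1:-$HOME/.config/environment.d/10-secrets.conf}"
  [ -n "${OPENROUTER_API_KEY:-}" ] && return 0
  OPENROUTER_API_KEY="$(read_conf_value OPENROUTER_API_KEY "$conf_file")" || return 1
  export OPENROUTER_API_KEY
}
```

A key already present in the environment takes precedence here, which keeps one-off overrides simple; whether the real wrapper behaves the same way is an assumption.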
### Missing / can be added
| Feature | What's needed |
|---------|--------------|
| Audio transcription | `pip install 'markitdown[audio-transcription]'` (pydub + speechrecognition) |
| Azure AI Document Intelligence | `pip install 'markitdown[az-doc-intel]'` + Azure credentials |
| Azure Content Understanding | `pip install 'markitdown[az-content-understanding]'` + Azure credentials |
| markitdown-ocr plugin | Installed, but inactive until an OpenRouter API key is configured |
---
@@ -242,6 +296,8 @@ pi-mcp-adapter connects Pi to external services via the Model Context Protocol.
| `/filechanges` | Review changed files, diffs, accept/decline |
| `/filechanges-accept` | Accept all changes |
| `/filechanges-decline` | Revert all changes |
| `markitdown <file>` | Convert file to Markdown (PDF, Word, Excel, PPTX, images, HTML, etc.) |
| `markitdown-vision <file>` | Describe image using Qwen 2.5 VL 72B via OpenRouter |
---
@@ -266,4 +322,5 @@ pi-mcp-adapter connects Pi to external services via the Model Context Protocol.
- [ ] Run `/reload` in Pi to activate filechanges and new templates
- [ ] Add more prompt templates to `~/.pi/agent/prompts/` as needed
- [ ] Verify video-extract works with Gemini
- [x] Add markitdown skill to Obsidian skills page
- [ ] Clean up workspace-map.json entries for any stale memory packs