pi-config/skills/markitdown/SKILL.md

---
name: markitdown
description: Convert various file formats to Markdown for use with LLMs and text analysis. Supports PDF, Word, Excel, PowerPoint, images, HTML, CSV, JSON, XML, ZIP, EPUB, and YouTube URLs.
---

# MarkItDown

Convert files to Markdown for LLM consumption and text analysis. A lightweight Python utility by Microsoft.

## Installation

Installed in a Python venv at `/tmp/markitdown-env/` with a wrapper at `~/.local/bin/markitdown`.

The wrapper handles `LD_LIBRARY_PATH` for numpy's C extensions on NixOS.

If the venv is missing (e.g., after a rebuild), recreate it:
```bash
nix-shell -p python3 python3.pkgs.pip python3.pkgs.virtualenv gcc stdenv.cc.cc.lib --run "
  python3 -m venv /tmp/markitdown-env
  source /tmp/markitdown-env/bin/activate
  pip install 'markitdown[pdf,docx,pptx,xlsx]'
"
```
Then recreate the wrapper at `~/.local/bin/markitdown`.

## Supported Formats

| Format | Extension | Dependencies |
|--------|-----------|-------------|
| PDF | `.pdf` | pdfminer-six, pdfplumber |
| Word | `.docx` | lxml, mammoth |
| PowerPoint | `.pptx` | python-pptx |
| Excel | `.xlsx`, `.xls` | openpyxl, pandas (`xlsx` extra); xlrd for `.xls` (`xls` extra) |
| Images | `.jpg`, `.png`, etc. | EXIF metadata (core); LLM vision via `llm_client`/`llm_model`; OCR via `markitdown-ocr` plugin (installed) |
| HTML | `.html`, `.htm` | beautifulsoup4 (core) |
| CSV | `.csv` | (core) |
| JSON | `.json` | (core) |
| XML | `.xml` | (core) |
| ZIP | `.zip` | (core, iterates contents) |
| EPUB | `.epub` | (core) |
| YouTube | URLs | youtube-transcript-api (`youtube-transcription` extra, needed for transcripts) |
| Text | `.txt`, `.md`, etc. | (core) |

## CLI Usage

```bash
# Convert a file to Markdown (stdout)
markitdown path/to/file.pdf

# Write to file
markitdown path/to/file.pdf -o output.md

# Pipe content
cat file.pdf | markitdown
```

## Python API

```python
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
```
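
For more than one file, the same call can be wrapped in a small loop. The sketch below is only an illustration: `convert_tree`, `markitdown_converter`, the default suffix list, and the `report.pdf.md` naming scheme are assumptions, not part of MarkItDown. The converter is passed in as a plain function so the loop can be tried without the venv.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def markitdown_converter() -> Callable[[Path], str]:
    """Return a function that converts one file to Markdown text."""
    # Imported lazily so this module still loads outside the venv.
    from markitdown import MarkItDown

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def convert_tree(
    src_dir: Path,
    out_dir: Path,
    convert: Callable[[Path], str],
    suffixes: Iterable[str] = (".pdf", ".docx", ".pptx", ".xlsx"),
) -> List[Path]:
    """Convert every matching file under src_dir; return the .md paths written."""
    wanted = {s.lower() for s in suffixes}
    written = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in wanted:
            continue
        # Mirror the source layout and append ".md" so that report.pdf and
        # report.docx do not overwrite each other (report.pdf -> report.pdf.md).
        dest = out_dir / src.relative_to(src_dir).with_name(src.name + ".md")
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(convert(src), encoding="utf-8")
        written.append(dest)
    return written


# Usage (inside the venv; the directory names are placeholders):
#   convert_tree(Path("docs"), Path("/tmp/docs-md"), markitdown_converter())
```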

## Integration with Pi

Use `markitdown` to convert files before reading them with the `read` tool:

```bash
# Convert to a temporary Markdown file
markitdown report.pdf -o /tmp/report.md
# Then open /tmp/report.md with Pi's `read` tool (a Pi tool, not a shell command)
```

This is especially useful for:
- PDFs that need structure preserved (headings, lists, tables)
- Office documents (Word, Excel, PowerPoint)
- Images with EXIF metadata
- Any file format not directly readable by the `read` tool
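
The decision of whether to convert first can be sketched as a lookup on the file extension. This is a minimal illustration under assumptions: `plan_read` is a hypothetical helper, the extension set is copied from the Supported Formats table (with `.jpeg` added as a guess at what "etc." covers), and files outside that set are assumed to be readable directly.

```python
from pathlib import PurePosixPath
from typing import List, Optional, Tuple

# Extensions that go through markitdown before reading (from the table above;
# ".jpeg" is an assumed addition).
CONVERTIBLE = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".xls",
    ".jpg", ".jpeg", ".png", ".html", ".htm", ".epub", ".zip",
}


def plan_read(path: str, tmp_dir: str = "/tmp") -> Tuple[Optional[List[str]], str]:
    """Return (command, target).

    command is the markitdown argv to run first, or None when no conversion
    is needed; target is the file to hand to the `read` tool afterwards.
    """
    src = PurePosixPath(path)
    if src.suffix.lower() in CONVERTIBLE:
        target = str(PurePosixPath(tmp_dir) / (src.stem + ".md"))
        return ["markitdown", str(src), "-o", target], target
    return None, str(src)
```

For example, `plan_read("report.pdf")` yields the same `markitdown report.pdf -o /tmp/report.md` invocation shown above, while `plan_read("notes.txt")` returns no command and the original path.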

## Image Analysis (LLM Vision)

For images, markitdown extracts EXIF metadata (free, no API key) and can also describe the image content using an LLM vision model.

**EXIF only (already works):**
```bash
markitdown photo.jpg
```

**With LLM vision (requires an OpenRouter API key):**

The wrapper `markitdown-vision` auto-sources the key from `~/.config/environment.d/10-secrets.conf`. Just run:
```bash
markitdown-vision photo.jpg
```

Or use the Python API directly:
```python
from markitdown import MarkItDown
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

md = MarkItDown(
    llm_client=client,
    llm_model="qwen/qwen2.5-vl-72b-instruct",
)

result = md.convert("photo.jpg")
print(result.text_content)
```
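
The example above raises `KeyError` when `OPENROUTER_API_KEY` is not exported. The key lookup that `markitdown-vision` performs can be approximated in Python as follows. This is a sketch under assumptions: it presumes the secrets file contains a plain `OPENROUTER_API_KEY=...` line, and `read_env_value` and `openrouter_key` are hypothetical helper names, not part of the wrapper.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_value(conf_path: Path, name: str) -> Optional[str]:
    """Return the value of `name` from a KEY=VALUE file, or None if absent."""
    if not conf_path.is_file():
        return None
    for raw in conf_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes.
            return value.strip().strip("\"'")
    return None


def openrouter_key(conf_path: Optional[Path] = None) -> str:
    """Prefer the environment; fall back to the secrets file."""
    if conf_path is None:
        conf_path = Path.home() / ".config/environment.d/10-secrets.conf"
    key = os.environ.get("OPENROUTER_API_KEY") or read_env_value(
        conf_path, "OPENROUTER_API_KEY"
    )
    if not key:
        raise RuntimeError(f"OPENROUTER_API_KEY not set and not found in {conf_path}")
    return key
```

Passing `api_key=openrouter_key()` to `OpenAI(...)` then works whether or not the variable is exported in the current shell.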

**Why Qwen 2.5 VL 72B?**
- Excellent vision understanding
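- Affordable: roughly $0.25/M input and $0.75/M output tokens at the time of writing (check OpenRouter for current pricing)
- 131K context window (as listed; may vary by provider)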
- Available on OpenRouter
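
To get a feel for what those prices mean per batch, the arithmetic can be written out. The token counts below are made-up placeholders, since the real numbers depend on image size and on how long the generated description is; only the two per-million prices come from the list above.

```python
# Approximate prices from the list above, in USD per million tokens.
INPUT_USD_PER_M = 0.25
OUTPUT_USD_PER_M = 0.75


def estimate_cost_usd(images: int, input_tokens_per_image: int,
                      output_tokens_per_image: int) -> float:
    """Estimate the total cost of describing `images` images."""
    input_cost = images * input_tokens_per_image * INPUT_USD_PER_M / 1_000_000
    output_cost = images * output_tokens_per_image * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical workload: 100 images, 1,500 input and 300 output tokens each.
print(f"${estimate_cost_usd(100, 1500, 300):.4f}")  # prints $0.0600
```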

**OCR inside documents (installed):**
The `markitdown-ocr` plugin is installed. Enable it with:
```python
# `client` is the OpenAI client created in the example above.
md = MarkItDown(
    enable_plugins=True,
    llm_client=client,
    llm_model="qwen/qwen2.5-vl-72b-instruct",
)
```
This extracts text from images embedded in PDF, Word, PowerPoint, and Excel files.

## Security Notes

- MarkItDown performs I/O with the privileges of the current process, so it can read any file or URL the process can reach
- Sanitize inputs in untrusted environments
- Only convert files from trusted sources
- The `[pdf,docx,pptx,xlsx]` extras are installed; audio transcription and Azure AI are NOT installed
- Image analysis requires an OpenRouter API key and costs tokens per image
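
One way to act on the "sanitize inputs" advice is to check a user-supplied path before handing it to MarkItDown. The following is a minimal sketch, not a complete defense: `checked_path`, the size cap, and the allowed suffix set are all assumptions chosen for illustration.

```python
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024  # hypothetical size cap (50 MiB)
ALLOWED_SUFFIXES = {".pdf", ".docx", ".pptx", ".xlsx"}  # matches the installed extras


def checked_path(user_path: str, allowed_root: Path) -> Path:
    """Return a resolved path inside allowed_root, or raise ValueError."""
    root = allowed_root.resolve()
    # resolve() collapses ".." and follows symlinks, so the containment check
    # below also rejects absolute paths and symlink escapes.
    path = (root / user_path).resolve()
    if root not in path.parents:
        raise ValueError(f"{user_path!r} is outside {root}")
    if not path.is_file():
        raise ValueError(f"{user_path!r} is not a regular file")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {path.suffix!r}")
    if path.stat().st_size > MAX_BYTES:
        raise ValueError(f"{user_path!r} exceeds {MAX_BYTES} bytes")
    return path


# Usage (inside the venv; the upload directory is a placeholder). convert_local()
# restricts MarkItDown to local files, so a URL is never fetched:
#   from markitdown import MarkItDown
#   safe = checked_path("report.pdf", Path("/srv/uploads"))
#   print(MarkItDown().convert_local(str(safe)).text_content)
```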