Add plannotator extension v0.19.10

2026-05-07 11:38:14 +10:00
parent e914bc59c9
commit f37e4565ff
91 changed files with 35103 additions and 0 deletions


@@ -0,0 +1,574 @@
---
name: plannotator-compound
disable-model-invocation: true
description: >
Analyze a user's Plannotator plan archive to extract denial patterns, feedback
taxonomy, evolution over time, and actionable prompt improvements — then produce
a polished HTML dashboard report. Falls back to Claude Code ExitPlanMode denial
reasons when Plannotator data is unavailable.
---
# Compound Planning Analysis
You are conducting a comprehensive research analysis of a user's Plannotator plan
archive. The goal: extract patterns from their denied plans, reduce
them into actionable insights, and produce an elegant HTML dashboard report.
This is a multi-phase process. Each phase must complete fully before the next begins.
Research integrity is paramount — every file must be read, no skipping.
## Source Selection
Before starting the analysis, determine which data source is available.
1. **Plannotator mode (first-class)** — Check `~/.plannotator/plans/`. If it
exists and contains `*-denied.md` files, use this mode. The entire workflow
below is written for Plannotator data.
2. **Claude Code fallback mode** — If the Plannotator archive is absent or
contains no denied plans, check `~/.claude/projects/`. If present, read
[references/claude-code-fallback.md](references/claude-code-fallback.md)
before continuing. That reference explains how to use the bundled parser at
[scripts/extract_exit_plan_mode_outcomes.py](scripts/extract_exit_plan_mode_outcomes.py)
to extract denial reasons from Claude Code JSONL transcripts. Every phase
below has a short note explaining what changes in fallback mode — the
reference file has the details.
3. **Neither available** — Ask the user for their Plannotator plans directory or
Claude Code projects directory. Do not guess.
## Phase 0: Locate Plans & Check for Previous Reports
Use the mode chosen in Source Selection above.
**Plannotator mode:** Verify the plans directory contains `*-denied.md` files. If
none exist, switch to the Claude Code fallback mode before giving up.
**Claude Code fallback mode:** Run the bundled parser per the fallback reference to
build the denial-reason dataset. Create `/tmp/compound-planning/` if needed.
In either mode, proceed to Previous Report Detection below.
### Previous Report Detection
After locating the plans directory, check for existing reports:
```
ls ~/.plannotator/plans/compound-planning-report*.html
```
Reports follow a versioned naming scheme:
- First report: `compound-planning-report.html`
- Subsequent reports: `compound-planning-report-v2.html`, `compound-planning-report-v3.html`, etc.
If one or more reports exist, determine the **latest** one (highest version number).
Get its filesystem modification date using `stat` (macOS: `stat -f %Sm -t %Y-%m-%d <file>`;
Linux: `stat -c %y <file> | cut -d' ' -f1`). This is the **cutoff date**.
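For example, a minimal sketch (the `-v2` filename is illustrative):
```bash
# Highest existing version number (no output means only the unversioned first report exists)
ls ~/.plannotator/plans/compound-planning-report*.html 2>/dev/null \
  | grep -oE 'v[0-9]+' | tr -d 'v' | sort -n | tail -1

# Modification date of the latest report: this is the cutoff date
stat -f %Sm -t %Y-%m-%d ~/.plannotator/plans/compound-planning-report-v2.html    # macOS
stat -c %y ~/.plannotator/plans/compound-planning-report-v2.html | cut -d' ' -f1 # Linux
```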
Present the user with a choice:
> "I found a previous report (`compound-planning-report-v{N}.html`) last updated
> on {CUTOFF_DATE}. I can either:
>
> 1. **Incremental** — Only analyze files dated after {CUTOFF_DATE}, saving tokens
> and building on previous findings
> 2. **Full** — Re-analyze the entire archive from scratch
>
> Which would you prefer?"
Wait for the user's response before proceeding.
**If incremental:** Filter all subsequent phases to only process files with dates
after the cutoff date. The new report version will note in its header narrative that
it covers the period from {CUTOFF_DATE} to present, and reference the previous
report for earlier findings. The inventory (Phase 1) should still count ALL files
for overall stats, but clearly separate "new since last report" counts.
**If full:** Proceed normally with all files, but still use the next version number
for the output filename.
**If no previous report exists:** Proceed normally. The output filename will be
`compound-planning-report.html` (no version suffix for the first report).
## Phase 1: Inventory
Count and report the dataset. **Always count ALL files** for overall stats,
regardless of whether this is an incremental or full run:
```
- *-approved.md files (count)
- *-denied.md files (count)
- Date range (earliest to latest date found in filenames)
- Total days spanned
- Revision rate: denied / (approved + denied) — this is the "X% of plans
revised before coding" stat used in dashboard section 1
```
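A minimal sketch of the count and rate computation, assuming the standard
Plannotator plans directory:
```bash
approved=$(ls ~/.plannotator/plans/*-approved.md 2>/dev/null | wc -l)
denied=$(ls ~/.plannotator/plans/*-denied.md 2>/dev/null | wc -l)
# Revision rate = denied / (approved + denied)
awk -v a="$approved" -v d="$denied" \
  'BEGIN { if (a + d > 0) printf "%d approved, %d denied: %.1f%% revised before coding\n", a, d, 100 * d / (a + d) }'
```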
**Note:** Ignore `*.annotations.md` files entirely. Denied files already contain
the full plan text plus all reviewer feedback appended after a `---` separator.
Annotation files are redundant subsets of this content — reading both would
double-count feedback.
**If incremental mode:** After the total counts, separately report the counts for
files dated after the cutoff date only:
```
New since {CUTOFF_DATE}:
- *-denied.md files: X (of Y total)
- New date range: {CUTOFF_DATE} to {LATEST_DATE}
- New days spanned: N
```
If fewer than 3 new denied files exist since the cutoff, warn the user:
> "Only {N} new denied plans since the last report. The incremental analysis may
> be thin. Would you like to proceed or switch to a full analysis?"
Also run `wc -l` across all `*-approved.md` files to get average lines per
approved plan. This tells the user whether their plans are staying lightweight
or bloating over time. You do not need to read approved plan contents — just
their line counts. If possible, break this down by time period (e.g., monthly)
to show whether plan size changed.
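For instance, a sketch of the line-count pass (no plan contents are read):
```bash
# Per-file line counts plus an average; wc's summary row is labeled "total"
wc -l ~/.plannotator/plans/*-approved.md \
  | awk '$2 != "total" { sum += $1; n++ } END { if (n) printf "avg %.0f lines across %d approved plans\n", sum / n, n }'
```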
Dates appear in filenames in YYYY-MM-DD format, sometimes as a prefix
(2026-01-07-name-approved.md) and sometimes embedded (name-2026-03-15-approved.md).
Extract dates from all filenames.
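One pattern handles both shapes, as in this sketch:
```bash
# First YYYY-MM-DD found anywhere in each denied-plan filename, sorted chronologically
for f in ~/.plannotator/plans/*-denied.md; do
  basename "$f" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1
done | sort
```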
Tell the user what you found and that you're beginning the extraction.
**Claude Code fallback mode:** The Plannotator inventory fields above do not apply.
Follow the inventory instructions in
[references/claude-code-fallback.md](references/claude-code-fallback.md) instead —
report the denial-reason dataset assembled by the parser.
## Phase 2: Map — Parallel Extraction
This is the most time-intensive phase. You must read EVERY `*-denied.md` file
**in scope**. Do not skip files. Do not summarize early.
**In scope** means: all denied files for a full analysis, or, for an incremental run,
only denied files whose embedded YYYY-MM-DD date is strictly after the cutoff date.
**Claude Code fallback mode:** The parser output is the clean source dataset. Read
the fallback reference for the extraction prompt and batching strategy specific to
JSON part files. Do not go back to raw `.jsonl` logs unless the parser fails or the
user asks for audit-level verification.
**Important:** Only read `*-denied.md` files. Do NOT read approved plans,
annotation files, or diff files. Each denied file contains the full plan text
followed by a `---` separator and the reviewer's feedback — everything needed
for analysis is in one file.
### Batching Strategy
All extraction agents should use `model: "haiku"` — they're doing straightforward
file reading and structured extraction, not reasoning. Haiku is faster and cheaper
for this work.
The approach depends on dataset size:
**Tiny datasets (≤ 10 total files):** Read all files directly in the main agent —
no need for sub-agents. Just read them sequentially and proceed to Phase 3.
**Small datasets (11-30 files):** Launch 2-3 parallel Haiku agents, splitting
files roughly evenly.
**Medium datasets (31-80 files):** Launch 4-6 parallel Haiku agents (~10-15 files
each). Split by file type and/or time period.
**Large datasets (81+ files):** Launch as many parallel Haiku agents as needed to
keep each batch around 10-15 files. Split by the natural time boundaries in the
data (months, quarters, or whatever groupings produce balanced batches). If one
time period dominates (e.g., the most recent month has 3x the files), split that
period into multiple batches.
Launch all extraction agents in parallel using the Agent tool with
`run_in_background: true` and `model: "haiku"`.
### Output Files
Each extraction agent must write its results to a clean output file rather than
relying on the agent task output (which contains interleaved JSONL framework
logs that are difficult to parse). Instruct each agent to write to:
```
/tmp/compound-planning/extraction-{batch-name}.md
```
Create the `/tmp/compound-planning/` directory before launching agents. The
reduce agent in Phase 3 will read these clean files directly.
### Extraction Prompt
Each agent receives this instruction (adapt the time period, file list, and
output path):
```
You are extracting structured data from denied plan files for a pattern analysis.
Directory: [PLANS DIRECTORY]
Files to read: [LIST OF SPECIFIC *-denied.md FILES]
Output: Write your complete results to [OUTPUT FILE PATH]
Each denied file contains two parts separated by a --- line:
1. The plan text (above the ---)
2. The reviewer's feedback and annotations (below the ---)
Read EVERY file in your list. For EACH file, extract:
- The plan name/topic (from the plan text above the ---)
- The denial reason or feedback given (from below the --- — capture the actual
words used)
- What was specifically asked to change
- The type of feedback (let the content determine the category — don't force-fit
into predefined types. Common types include things like: scope concerns,
approach disagreements, missing information, process requirements, quality
concerns, UX/design issues, naming disputes, clarification requests,
testing/procedural denials — but the user's actual patterns may differ)
- Any specific phrases or recurring language from the reviewer
- Individual annotations if present (numbered feedback items with quoted text
and reviewer comments)
- The date (extracted from the filename)
Do NOT skip any files. One entry per file.
Format each entry as:
**[filename]**
- Date: ...
- Topic: ...
- Denial reason: ...
- Feedback type: ...
- Specific asks: ...
- Notable phrases: ...
- Annotations: [count, with brief summary of each]
---
After processing all files, write the complete results to [OUTPUT FILE PATH].
State the total file count at the end of the file.
```
### While Agents Run
Track completion. As each agent finishes, note the count of files it processed.
Verify the total matches the inventory from Phase 1. If any agent's count is
short, flag it and consider re-launching for the missing files.
If an agent times out (possible with large batches — a batch of 128 files can
take 8+ minutes), re-launch it for just the unprocessed files. Check the output
file to see how far it got before timing out.
## Phase 3: Reduce — Pattern Analysis
Once ALL extraction agents have completed (or all files have been read for tiny
datasets), proceed with the reduction. Reduction agents should use `model: "sonnet"`
— this phase requires real analytical reasoning, not just file reading.
### Reduction Strategy
The approach depends on how many extraction files were produced:
**Standard (≤ 20 extraction files):** Launch a single Sonnet agent to read all
extraction files and produce the full analysis. This covers most datasets.
**Large (21+ extraction files):** Use a two-stage reduce:
1. **Stage 1 — Partial reduces:** Split the extraction files into groups of 4-6.
Launch parallel Sonnet agents, each reading one group and producing a partial
analysis with the same sections listed below. Each writes to
`/tmp/compound-planning/partial-reduce-{N}.md`.
2. **Stage 2 — Final reduce:** A single Sonnet agent reads all partial reduce
files and synthesizes them into the final comprehensive analysis. This agent
merges taxonomies, combines counts, deduplicates patterns, and reconciles any
conflicting categorizations across partials.
**Claude Code fallback mode:** The reduction phase is the same. The only upstream
difference is that extraction files were derived from normalized denial-reason JSON
instead of Plannotator markdown files.
### Reduction Prompt
Give each reduction agent this prompt (adapt file paths for single vs multi-stage):
```
You are a data scientist conducting the reduction phase of a map-reduce analysis
across a user's denied plan archive.
Read ALL extraction files at [FILE PATHS]
These files contain structured extractions from every denied plan file. Each
extraction includes the plan topic, denial feedback, annotations, and reviewer
language. Your job: aggregate everything, find patterns, cluster into a taxonomy,
and produce a comprehensive analysis.
Be exhaustive. Use real counts. Quote real phrases from the data. This is
research — no hand-waving, no fabrication.
Write your complete results to [OUTPUT FILE PATH].
Produce the following sections:
[... sections listed below ...]
```
The reduction agent's job is to let the data speak. Do not impose a predetermined
framework — discover what's actually there. The analysis must produce:
### 1. Denial Reason Taxonomy
Categorize every denial into a finite set of types that emerge from the data. Count
occurrences. Show percentages. Include real example quotes for each type. Aim for
8-15 categories — enough to be specific, few enough to be scannable. Let the user's
actual feedback determine what the categories are.
### 2. Top Feedback Patterns (ranked by frequency)
The 5-10 most recurring patterns. For each: what the reviewer consistently asks for,
3+ example quotes from different files, and whether the pattern changed over time.
### 3. Recurring Phrases
Exact phrases the reviewer uses repeatedly, with counts and what they signal. These
are the reviewer's vocabulary — their shorthand for what they care about.
### 4. What the Reviewer Values (implicit preferences)
Derived from patterns — what does this specific person care about most? Quality?
Speed? Narrative? Architecture? Process? Simplicity? Rank by evidence strength.
This section should feel like a personality profile of the reviewer's standards.
### 5. What Agents Consistently Get Wrong
The flip side — what recurring mistakes trigger denials? What should agents stop
doing for this reviewer?
### 6. Structural Requests
What plan structure does the reviewer consistently demand? Required sections,
ordering, format preferences, level of detail expected.
### 7. Evolution Over Time
How feedback patterns changed across the time span. Group by whatever natural time
boundaries exist in the data (weeks for short spans, months for longer ones). Did
expectations mature? Did new patterns emerge? What shifted? If the dataset spans
less than a month, note that evolution analysis is limited but still look for any
progression from early to late files.
### 8. Actionable Prompt Instructions
The most important output. Based on all patterns: specific numbered instructions
that could be embedded in a planning prompt to prevent the most common denial
reasons. Write these as actual directives an agent could follow. Be specific to
this user's patterns — generic advice like "write good plans" is worthless. Each
instruction should trace back to a real, frequent denial pattern.
After writing the instructions, calculate what percentage of denials they would
address (count how many denials fall into categories covered by the instructions
vs total denials). Report this percentage — it will be different for every user.
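The arithmetic itself is simple; the numbers below are purely illustrative
placeholders, not expected values:
```bash
# e.g., if the covered categories account for 52 + 34 + 28 of 202 total denials
awk 'BEGIN { covered = 52 + 34 + 28; total = 202; printf "%.0f%% of denials addressed\n", 100 * covered / total }'
```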
## Phase 4: Generate the HTML Dashboard
Build a single, self-contained HTML file as the final deliverable. Save it to
the user's plans directory with a versioned filename:
- First ever report: `compound-planning-report.html`
- Second report: `compound-planning-report-v2.html`
- Third report: `compound-planning-report-v3.html`
- And so on.
The version number was determined in Phase 0 based on existing reports found.
**If this is an incremental report**, the header should indicate the analysis
period (e.g., "March 15 to March 31, 2026") and include a subtitle noting
"Incremental analysis — see v{N-1} for earlier findings." The narrative in
section 1 should frame findings as what's new or changed since the last report,
not as a complete picture. Overall stats in the header (file counts, revision
rate) should still reflect the full archive for context.
Read the template at `assets/report-template.html` for the **design language
only**. The template contains example data from a previous analysis — ignore all
data values, quotes, and percentages in the template. Use only its visual design:
colors, typography, spacing, component styles, and layout patterns.
### Design Language (from template)
- **Palette:** Light mode, warm off-white (#FDFCFB), text in slate scale, amber
for highlights/accents, emerald for positive, rose for negative, indigo for
action elements
- **Typography:** Playfair Display (serif, for narrative headings), Inter (sans,
for body/data), JetBrains Mono (mono, for code/phrases) — Google Fonts CDN
- **Layout:** Single-column, max-width 1024px, generous vertical whitespace (128px
between major sections), editorial/narrative-first aesthetic
- **Tone:** Calm, reflective, authoritative. Like a personal retrospective journal,
not a monitoring dashboard.
### Page Frame (header + footer)
Before the 7 sections, the page has:
- **Header:** Report title on the left (Playfair Display, ~36px), project name +
date range below it in light meta text. On the right: file counts in mono
(e.g., "223 denials · 71 days"). Separated from content by
a bottom border. Generous bottom padding before section 1.
- **Footer:** After section 7. Top border, centered italic Playfair Display tagline
summarizing the corpus (e.g., "Analysis of X denied plans from the Plannotator
archive.").
### Dashboard Section Order (7 sections)
The report follows this exact section order. Each section builds on the previous
one — the flow moves from "what happened" through "why" to "what to do about it":
1. **The story in the data** — An editorial narrative paragraph (Playfair Display
serif, ~26px) that tells the headline finding in prose. Not bullet points — a
real paragraph that reads like the opening of an article. Alongside it, a KPI
sidebar with 3 key metrics (the top denial percentage, the overall revision
rate, and the number of distinct denial categories found). Use an amber inline
highlight on the most striking number in the narrative.
2. **Why plans get denied** — The taxonomy as a ranked list. Each row: rank number
(mono), category label, a thin 4px progress bar (top item in amber-500, rest
in slate-300), percentage (mono), and for the top entries, a real italic quote
from the data below the label. Show the top 10 categories or however many the
data supports (minimum 5).
3. **How expectations evolved** — One card per natural time period. Each card has:
the period name in serif, a theme phrase in colored uppercase (different color
per period to show progression), a description paragraph, and a stat line at
the bottom (e.g., "X denials · Y narrative requests"). If the data spans less
than 3 distinct periods, use 2 cards or even a single card with internal
progression noted.
4. **What works vs what doesn't** — Two side-by-side cards. Left: green-tinted
(emerald-50/50 bg, emerald-100 border) with traits of plans that succeed for
this reviewer. Right: red-tinted (rose-50/50 bg, rose-100 border) with what
agents keep getting wrong. Both derived from the reduction analysis. Bulleted
with small colored dots. 5-8 items per card.
5. **The actionable output** — The diagnostic payoff. Opens with a Playfair
Display narrative sentence stating how many prompt instructions were derived
and what estimated percentage of denials they address (use the real calculated
percentage from Phase 3, not a generic number). Then the top 3 most impactful
improvements as numbered items, each with an amber number, bold title, and
one-line description. This section bridges the analysis and the full prompt
that follows.
6. **Your most-used phrases** — Grid of chips (2-col mobile, 3-col desktop). Each
chip: monospace quoted phrase on the left, frequency count on the right. White
bg, slate-200 border, rounded-12px. Show 9-12 of the most recurring phrases
found. These should be the reviewer's actual words — their verbal fingerprint.
7. **The corrective prompt** — Dark panel (slate-900 bg, white text, rounded-3xl,
shadow-xl). Opens with a Playfair intro sentence about the instructions. Then
a dark code block (slate-800/80 bg, amber-200 monospace text) containing the
full numbered prompt instructions from Phase 3. Include a copy-to-clipboard
button that works (JS included). Below the code block: a gradient glow card
(indigo-to-purple blurred halo behind a white card) with a closing message
that these instructions are personal — derived from the user's own feedback,
their own language, their own standards.
### Adaptation Rules
- If the user has < 3 months of data, reduce the evolution section to fewer cards
- If most denied files lack feedback below the `---` (bare denials with no
annotations), note this in the narrative — the analysis will be thinner
- **Claude Code fallback mode:** Explicitly label the report source as Claude Code
`ExitPlanMode` denial reasons. Do not fabricate Plannotator-only fields such as
annotation counts or approved-plan line counts. See the fallback reference for
KPI substitutes and footer/provenance guidance.
- If fewer than 5 denial categories emerge, combine the taxonomy and patterns
sections into one
- If the dataset is very small (< 20 files), the narrative should acknowledge the
limited sample size and frame findings as preliminary
- The number of prompt instructions will vary per user — could be 8 or 20. Don't
force exactly 17. Let the data determine the count.
- The top 3 actionable items in section 5 must be the 3 that cover the largest
share of denials, not the 3 that sound most impressive
### Key Rules
1. Every number must come from the real analysis — no fabricated data
2. Every quote must be a real quote from a real file
3. The taxonomy percentages must be calculated from real counts
4. The prompt instructions must trace back to actual denial patterns
5. The copy button on the prompt block must work (include the JS)
After generating, open the file in the user's browser.
## Phase 5: Summary
Tell the user:
- How many denied files were analyzed
- If incremental: how many were new since the last report
- The top 3 denial patterns found
- The estimated percentage of denials the prompt instructions would address
- The single most impactful prompt improvement
- Where the report was saved (including version number)
- If incremental: remind the user that earlier findings are in the previous report
**Claude Code fallback mode:** Adapt the summary per the fallback reference —
report human denial reasons analyzed and total `ExitPlanMode` attempts scanned
instead of Plannotator file counts.
## Phase 6: Improvement Hook
After presenting the summary, ask the user if they want to enable an **improvement
hook** — this takes the corrective prompt instructions from section 7 of the report
and writes them to a file that Plannotator's `EnterPlanMode` hook can inject into
every future planning session automatically.
> "Would you like to enable the improvement hook? This will save the corrective
> prompt instructions to a file that gets automatically injected into all future
> planning sessions — so Claude sees your feedback patterns before writing any plan."
**If yes:**
The hook file lives at:
```
~/.plannotator/hooks/compound/enterplanmode-improve-hook.txt
```
Create the `~/.plannotator/hooks/compound/` directory if it doesn't exist.
The file contents should be the corrective prompt instructions from Phase 3 —
the same numbered list that appears in section 7 of the HTML report. Write them
as plain text, one instruction per line, prefixed with their number. No HTML, no
markdown fences, no preamble — just the instructions themselves. The hook system
will inject this file's contents as-is into the planning context.
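A minimal sketch of the write (the two instruction lines stand in for the real
Phase 3 output):
```bash
mkdir -p ~/.plannotator/hooks/compound
cat > ~/.plannotator/hooks/compound/enterplanmode-improve-hook.txt <<'EOF'
1. STRUCTURE: Every plan MUST begin with a "Solution Overview" section.
2. CONFIDENCE: End every plan with a confidence assessment.
EOF
```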
**If the file already exists:**
Read the existing file and present the user with a choice:
> "An improvement hook already exists from a previous analysis. I can:
>
> 1. **Replace** — Overwrite with the new instructions (the old ones are gone)
> 2. **Merge** — Combine both, deduplicating overlapping instructions and
> keeping the best version of each
> 3. **Keep existing** — Leave the current hook as-is, skip this step
>
> Which would you prefer?"
- **Replace:** Overwrite the file with the new instructions.
- **Merge:** Read the existing instructions, compare with the new ones, and
produce a merged set. Remove duplicates (same intent even if worded differently).
When two instructions cover the same pattern, keep the more specific or
actionable version. Re-number the final list sequentially. Write the merged
result to the file. Show the user what changed (added N new, removed N
redundant, kept N existing).
- **Keep existing:** Do nothing, move on.
**If no:** Skip this phase entirely.
## Important Notes
- **Data source priority:** Plannotator is the first-class path. Claude Code log
analysis is the secondary path for users without Plannotator archives.
- **Research integrity:** Every file must be read. The value of this analysis comes
from completeness. Sampling or skipping undermines the findings.
- **Real data only:** Never fabricate quotes, percentages, or patterns. If the data
doesn't show a clear pattern, say so honestly rather than inventing one.
- **Let the data lead:** The taxonomy, patterns, and instructions should emerge from
what's actually in the files. Different users will have completely different
denial patterns. A user building mobile apps will have different feedback than
one building APIs. Don't assume what the patterns will be.
- **Agent parallelization:** For large datasets, maximize parallel agents to reduce
wall-clock time. The bottleneck is the largest batch — split it.
- **Structured extraction format:** Ask extraction agents to return structured text
with consistent delimiters so the reduce agent can parse reliably.
- **The report is the artifact:** The HTML dashboard is what the user keeps. It
should be beautiful, honest, and useful. Every section should feel like it was
written about them specifically, because it was.


@@ -0,0 +1,795 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Compound Planning — What 370 Files Reveal</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link href="https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@0,400;0,500;1,400&family=Inter:wght@300;400;500;600&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
<style>
*, *::before, *::after { margin: 0; padding: 0; box-sizing: border-box; }
:root {
--bg: #FDFCFB;
--slate-900: #0f172a;
--slate-800: #1e293b;
--slate-700: #334155;
--slate-600: #475569;
--slate-500: #64748b;
--slate-400: #94a3b8;
--slate-300: #cbd5e1;
--slate-200: #e2e8f0;
--slate-100: #f1f5f9;
--slate-50: #f8fafc;
--amber-500: #f59e0b;
--amber-600: #d97706;
--amber-700: #b45309;
--amber-200: #fde68a;
--amber-50: #fffbeb;
--emerald-500: #10b981;
--emerald-600: #059669;
--emerald-400: #34d399;
--emerald-900: #064e3b;
--emerald-800: #065f46;
--emerald-100: #d1fae5;
--emerald-50: #ecfdf5;
--rose-500: #f43f5e;
--rose-600: #e11d48;
--rose-400: #fb7185;
--rose-900: #881337;
--rose-800: #9f1239;
--rose-100: #ffe4e6;
--rose-50: #fff1f2;
--indigo-500: #6366f1;
--indigo-600: #4f46e5;
--purple-600: #9333ea;
}
body {
font-family: 'Inter', ui-sans-serif, system-ui, sans-serif;
background: var(--bg);
color: var(--slate-800);
-webkit-font-smoothing: antialiased;
}
.container {
max-width: 1024px;
margin: 0 auto;
padding: 48px 24px 64px;
}
@media (min-width: 768px) { .container { padding: 96px 24px 80px; } }
/* Typography */
.font-serif { font-family: 'Playfair Display', ui-serif, Georgia, serif; }
.font-mono { font-family: 'JetBrains Mono', ui-monospace, monospace; }
/* Header */
header {
border-bottom: 1px solid var(--slate-200);
padding-bottom: 40px;
margin-bottom: 96px;
display: flex;
justify-content: space-between;
align-items: flex-end;
flex-wrap: wrap;
gap: 16px;
}
header h1 {
font-family: 'Playfair Display', serif;
font-size: 36px;
font-weight: 400;
color: var(--slate-900);
line-height: 1.2;
}
header .meta {
font-size: 15px;
font-weight: 300;
color: var(--slate-500);
letter-spacing: 0.04em;
}
/* Sections */
.section { margin-bottom: 128px; }
.section-label {
font-size: 12px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.2em;
color: var(--slate-400);
margin-bottom: 24px;
}
/* Narrative + KPIs */
.summary {
display: grid;
grid-template-columns: 1fr;
gap: 48px;
align-items: start;
}
@media (min-width: 768px) {
.summary { grid-template-columns: 1fr 240px; }
}
.narrative {
font-family: 'Playfair Display', serif;
font-size: 26px;
line-height: 1.45;
color: var(--slate-900);
}
.narrative .highlight {
background: var(--amber-50);
color: var(--amber-700);
padding: 1px 6px;
border-radius: 3px;
}
.kpi-stack {
display: flex;
flex-direction: column;
gap: 32px;
}
@media (min-width: 768px) {
.kpi-stack { border-left: 1px solid var(--slate-200); padding-left: 32px; }
}
.kpi-item .kpi-value {
font-size: 36px;
font-weight: 300;
color: var(--slate-900);
letter-spacing: -0.02em;
}
.kpi-item .kpi-label {
font-size: 10px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.15em;
color: var(--slate-500);
margin-top: 2px;
}
/* Taxonomy bars */
.taxonomy-list { display: flex; flex-direction: column; gap: 20px; }
.tax-row { display: grid; grid-template-columns: 24px 1fr 52px; gap: 12px; align-items: center; }
.tax-rank {
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-400);
text-align: right;
}
.tax-body { display: flex; flex-direction: column; gap: 6px; }
.tax-label { font-size: 14px; font-weight: 500; color: var(--slate-800); }
.tax-bar-track { height: 4px; background: var(--slate-100); border-radius: 100px; overflow: hidden; }
.tax-bar-fill { height: 100%; border-radius: 100px; transition: width 0.6s ease; }
.tax-bar-fill.top { background: var(--amber-500); }
.tax-bar-fill.rest { background: var(--slate-300); }
.tax-pct {
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-500);
text-align: right;
}
.tax-quote {
font-size: 12px;
font-style: italic;
color: var(--slate-500);
margin-top: 2px;
}
/* Evolution timeline */
.evolution-grid {
display: grid;
grid-template-columns: 1fr;
gap: 24px;
}
@media (min-width: 768px) { .evolution-grid { grid-template-columns: repeat(3, 1fr); } }
.evo-card {
background: white;
border: 1px solid var(--slate-200);
border-radius: 16px;
padding: 28px;
}
.evo-card .evo-month {
font-family: 'Playfair Display', serif;
font-size: 20px;
color: var(--slate-900);
margin-bottom: 4px;
}
.evo-card .evo-theme {
font-size: 12px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.12em;
margin-bottom: 16px;
}
.evo-card .evo-desc {
font-size: 14px;
color: var(--slate-600);
line-height: 1.6;
}
.evo-card .evo-stat {
margin-top: 16px;
padding-top: 16px;
border-top: 1px solid var(--slate-100);
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-500);
}
.evo-jan .evo-theme { color: var(--slate-600); }
.evo-feb .evo-theme { color: var(--amber-600); }
.evo-mar .evo-theme { color: var(--indigo-600); }
/* Quality comparison */
.quality-grid {
display: grid;
grid-template-columns: 1fr;
gap: 24px;
}
@media (min-width: 768px) { .quality-grid { grid-template-columns: 1fr 1fr; } }
.q-card {
border-radius: 24px;
padding: 36px;
}
.q-card.good {
background: color-mix(in srgb, var(--emerald-50) 50%, transparent);
border: 1px solid var(--emerald-100);
}
.q-card.bad {
background: color-mix(in srgb, var(--rose-50) 50%, transparent);
border: 1px solid var(--rose-100);
}
.q-card .q-icon { font-size: 20px; margin-bottom: 12px; }
.q-card .q-title {
font-family: 'Playfair Display', serif;
font-size: 22px;
margin-bottom: 20px;
}
.q-card.good .q-title { color: var(--emerald-900); }
.q-card.bad .q-title { color: var(--rose-900); }
.q-list { list-style: none; display: flex; flex-direction: column; gap: 14px; }
.q-list li {
display: flex;
align-items: flex-start;
gap: 10px;
font-size: 14px;
line-height: 1.6;
}
.q-card.good .q-list li { color: color-mix(in srgb, var(--emerald-800) 90%, transparent); }
.q-card.bad .q-list li { color: color-mix(in srgb, var(--rose-800) 90%, transparent); }
.q-dot {
width: 6px;
height: 6px;
border-radius: 50%;
flex-shrink: 0;
margin-top: 7px;
}
.q-card.good .q-dot { background: var(--emerald-400); }
.q-card.bad .q-dot { background: var(--rose-400); }
/* Phrases */
.phrases-grid {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 12px;
}
@media (min-width: 768px) { .phrases-grid { grid-template-columns: repeat(3, 1fr); } }
.phrase-chip {
background: white;
border: 1px solid var(--slate-200);
border-radius: 12px;
padding: 14px 16px;
display: flex;
justify-content: space-between;
align-items: center;
gap: 8px;
}
.phrase-text {
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-700);
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.phrase-count {
font-family: 'JetBrains Mono', monospace;
font-size: 11px;
color: var(--slate-400);
flex-shrink: 0;
}
/* Dark action panel */
.action-panel {
background: var(--slate-900);
color: white;
border-radius: 24px;
padding: 40px;
box-shadow: 0 20px 25px -5px rgb(0 0 0 / 0.1), 0 8px 10px -6px rgb(0 0 0 / 0.1);
}
@media (min-width: 768px) { .action-panel { padding: 56px; } }
.action-panel .section-label { color: var(--slate-500); }
.action-panel .ap-intro {
font-family: 'Playfair Display', serif;
font-size: 22px;
color: white;
line-height: 1.4;
margin-bottom: 32px;
max-width: 640px;
}
.prompt-block {
background: color-mix(in srgb, var(--slate-800) 80%, transparent);
border: 1px solid color-mix(in srgb, var(--slate-700) 50%, transparent);
border-radius: 16px;
overflow: hidden;
}
.prompt-header {
display: flex;
justify-content: space-between;
align-items: center;
padding: 12px 20px;
border-bottom: 1px solid color-mix(in srgb, var(--slate-700) 30%, transparent);
}
.prompt-header-label {
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-400);
display: flex;
align-items: center;
gap: 8px;
}
.prompt-header-label svg { width: 14px; height: 14px; }
.copy-btn {
background: none;
border: none;
font-family: 'JetBrains Mono', monospace;
font-size: 12px;
color: var(--slate-400);
cursor: pointer;
display: flex;
align-items: center;
gap: 6px;
transition: color 0.2s;
}
.copy-btn:hover { color: white; }
.copy-btn.copied { color: var(--emerald-400); }
.prompt-body {
padding: 20px;
max-height: 480px;
overflow-y: auto;
}
.prompt-body pre {
font-family: 'JetBrains Mono', monospace;
font-size: 13px;
line-height: 1.7;
color: color-mix(in srgb, var(--amber-200) 90%, transparent);
white-space: pre-wrap;
word-break: break-word;
}
.prompt-body pre .comment {
color: var(--slate-500);
}
/* Glow card */
.glow-wrap {
position: relative;
margin-top: 48px;
}
.glow-bg {
position: absolute;
inset: -2px;
background: linear-gradient(135deg, var(--indigo-500), var(--purple-600));
border-radius: 26px;
opacity: 0.15;
filter: blur(16px);
transition: opacity 0.5s;
}
.glow-wrap:hover .glow-bg { opacity: 0.25; }
.glow-card {
position: relative;
background: white;
border: 1px solid var(--slate-200);
border-radius: 24px;
padding: 32px 36px;
display: flex;
justify-content: space-between;
align-items: center;
flex-wrap: wrap;
gap: 20px;
}
.glow-card .gc-text {
font-family: 'Playfair Display', serif;
font-size: 18px;
font-weight: 500;
color: var(--slate-900);
line-height: 1.5;
max-width: 640px;
}
.glow-card .gc-text em {
font-style: italic;
color: var(--indigo-600);
}
/* Footer */
footer {
border-top: 1px solid var(--slate-200);
padding-top: 48px;
margin-top: 0;
text-align: center;
}
footer p {
font-family: 'Playfair Display', serif;
font-style: italic;
font-size: 15px;
color: var(--slate-400);
}
/* Scrollbar in dark code block */
.prompt-body::-webkit-scrollbar { width: 6px; }
.prompt-body::-webkit-scrollbar-track { background: transparent; }
.prompt-body::-webkit-scrollbar-thumb { background: var(--slate-700); border-radius: 3px; }
</style>
</head>
<body>
<div class="container">
<header>
<div>
<h1>What 370 Files Reveal About<br>How You Plan</h1>
<div class="meta" style="margin-top: 8px;">backnotprop/plannotator &middot; Jan 7 &ndash; Mar 18, 2026</div>
</div>
<div class="meta" style="text-align: right;">
<span class="font-mono" style="font-size: 12px;">202 denials &middot; 168 annotations &middot; 71 days</span>
</div>
</header>
<!-- 1. Narrative + KPIs -->
<div class="section">
<div class="section-label">1. The story in the data</div>
<div class="summary">
<div class="narrative">
Across 71 days you denied or revised <span class="highlight">202 plans</span> before any code was written. The single most common reason&mdash;appearing in 1 out of 4 denials&mdash;was the same: the agent jumped to implementation without telling you <em>what</em> it was building, <em>why</em>, or <em>how</em>. Missing narrative. Missing context. Missing the story. Your expectations evolved from &ldquo;does it work?&rdquo; in January to &ldquo;tell me the story and be confident&rdquo; by March.
</div>
<div class="kpi-stack">
<div class="kpi-item">
<div class="kpi-value">25.7%</div>
<div class="kpi-label">Denials for missing narrative</div>
</div>
<div class="kpi-item">
<div class="kpi-value">50%</div>
<div class="kpi-label">Plans revised before coding</div>
</div>
<div class="kpi-item">
<div class="kpi-value">12</div>
<div class="kpi-label">Distinct denial categories</div>
</div>
</div>
</div>
</div>
<!-- 2. Denial Taxonomy -->
<div class="section">
<div class="section-label">2. Why plans get denied</div>
<div class="taxonomy-list">
<div class="tax-row">
<span class="tax-rank">1</span>
<div class="tax-body">
<span class="tax-label">Missing Narrative / Overview</span>
<div class="tax-bar-track"><div class="tax-bar-fill top" style="width: 100%"></div></div>
<span class="tax-quote">"This plan is denied without narrative detail and rationales."</span>
</div>
<span class="tax-pct">25.7%</span>
</div>
<div class="tax-row">
<span class="tax-rank">2</span>
<div class="tax-body">
<span class="tax-label">Clarification Needed</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 65%"></div></div>
<span class="tax-quote">"What does this Mean???"</span>
</div>
<span class="tax-pct">16.8%</span>
</div>
<div class="tax-row">
<span class="tax-rank">3</span>
<div class="tax-body">
<span class="tax-label">Testing / Procedural</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 54%"></div></div>
<span class="tax-quote">"I'm denying so you can create a diff."</span>
</div>
<span class="tax-pct">13.9%</span>
</div>
<div class="tax-row">
<span class="tax-rank">4</span>
<div class="tax-body">
<span class="tax-label">Wrong Approach / Over-Engineered</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 37%"></div></div>
<span class="tax-quote">"Why are we doing difficult shit here? I want a hover experience."</span>
</div>
<span class="tax-pct">9.4%</span>
</div>
<div class="tax-row">
<span class="tax-rank">5</span>
<div class="tax-body">
<span class="tax-label">Process Requirement</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 31%"></div></div>
<span class="tax-quote">"Make sure you feature branch."</span>
</div>
<span class="tax-pct">7.9%</span>
</div>
<div class="tax-row">
<span class="tax-rank">6</span>
<div class="tax-body">
<span class="tax-label">Confidence / Risk Check</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 29%"></div></div>
<span class="tax-quote">"Take a step back, breathe, make sure we're not being irrational."</span>
</div>
<span class="tax-pct">7.4%</span>
</div>
<div class="tax-row">
<span class="tax-rank">7</span>
<div class="tax-body">
<span class="tax-label">Content Removal</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 27%"></div></div>
<span class="tax-quote">"I don't want this in the plan."</span>
</div>
<span class="tax-pct">6.9%</span>
</div>
<div class="tax-row">
<span class="tax-rank">8</span>
<div class="tax-body">
<span class="tax-label">Implementation Bug Found</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 23%"></div></div>
</div>
<span class="tax-pct">5.9%</span>
</div>
<div class="tax-row">
<span class="tax-rank">9</span>
<div class="tax-body">
<span class="tax-label">Design / UX Issue</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 21%"></div></div>
</div>
<span class="tax-pct">5.4%</span>
</div>
<div class="tax-row">
<span class="tax-rank">10</span>
<div class="tax-body">
<span class="tax-label">Naming / Terminology</span>
<div class="tax-bar-track"><div class="tax-bar-fill rest" style="width: 16%"></div></div>
<span class="tax-quote">"Why do you keep calling it Simplified????"</span>
</div>
<span class="tax-pct">4.0%</span>
</div>
</div>
</div>
<!-- 3. Evolution -->
<div class="section">
<div class="section-label">3. How your expectations evolved</div>
<div class="evolution-grid">
<div class="evo-card evo-jan">
<div class="evo-month">January</div>
<div class="evo-theme">"Does it work?"</div>
<div class="evo-desc">Bug-hunting phase. You were hands-on testing View Logs, iterating on session scoping heuristics. 60% of denials were implementation bugs and verification failures. No mention of &ldquo;narrative&rdquo; or &ldquo;overview&rdquo; yet.</div>
<div class="evo-stat">26 denials &middot; 0 narrative requests</div>
</div>
<div class="evo-card evo-feb">
<div class="evo-month">February</div>
<div class="evo-theme">"Follow the process"</div>
<div class="evo-desc">Process gates emerged: feature branches, Linear tickets, pull main. 40% of denials were procedural (diff testing). UX polish intensified. The first narrative demands appeared: &ldquo;I want a narrative under each section.&rdquo;</div>
<div class="evo-stat">48 denials &middot; 6 narrative requests</div>
</div>
<div class="evo-card evo-mar">
<div class="evo-month">March</div>
<div class="evo-theme">"Tell me the story"</div>
<div class="evo-desc">Narrative became the #1 gate. You created a &ldquo;Missing overview&rdquo; label and applied it systematically. Confidence checks became standard. You began telling agents to &ldquo;take a step back, breathe, and analyze.&rdquo;</div>
<div class="evo-stat">128 denials &middot; 25+ narrative requests</div>
</div>
</div>
</div>
<!-- 4. Quality comparison -->
<div class="section">
<div class="section-label">4. What works vs. what doesn't</div>
<div class="quality-grid">
<div class="q-card good">
<div class="q-icon">&#10003;</div>
<div class="q-title">What approved plans do</div>
<ul class="q-list">
<li><span class="q-dot"></span>Lead with a narrative overview: what exists, what changes, why</li>
<li><span class="q-dot"></span>State confidence and identify risks proactively</li>
<li><span class="q-dot"></span>Reference existing codebase patterns before proposing new code</li>
<li><span class="q-dot"></span>Use explicit, transparent naming (not euphemisms)</li>
<li><span class="q-dot"></span>Break large work into phases with evaluation gates</li>
<li><span class="q-dot"></span>Include example output for user-facing changes</li>
<li><span class="q-dot"></span>Specify feature branch and ticket creation steps</li>
</ul>
</div>
<div class="q-card bad">
<div class="q-icon">&#10007;</div>
<div class="q-title">What agents keep getting wrong</div>
<ul class="q-list">
<li><span class="q-dot"></span>Jump to implementation steps without narrative context</li>
<li><span class="q-dot"></span>Over-engineer: Shift+Click when hover works, MCP tool when a README suffices</li>
<li><span class="q-dot"></span>Introduce new code for things the codebase already solves</li>
<li><span class="q-dot"></span>Propose work on top of failing lint/type checks</li>
<li><span class="q-dot"></span>Use vague or euphemistic naming (&ldquo;Accept&rdquo; instead of &ldquo;Git Add&rdquo;)</li>
<li><span class="q-dot"></span>Wait to be asked for confidence instead of stating it</li>
<li><span class="q-dot"></span>Rush to modify instead of reporting what they see</li>
</ul>
</div>
</div>
</div>
<!-- 5. The actionable output -->
<div class="section">
<div class="section-label">5. The actionable output</div>
<div class="narrative" style="margin-bottom: 32px;">
The analysis produced <span class="highlight">17 specific prompt instructions</span> that, if embedded in a planning prompt, would address ~70% of all denial reasons. The biggest three:
</div>
<div style="display: flex; flex-direction: column; gap: 20px;">
<div style="display: flex; gap: 16px; align-items: flex-start;">
<span class="font-mono" style="font-size: 24px; font-weight: 300; color: var(--amber-500); flex-shrink: 0; width: 32px; text-align: right;">1</span>
<div>
<div style="font-size: 17px; font-weight: 500; color: var(--slate-900); margin-bottom: 4px;">Every plan MUST start with a Solution Overview</div>
<div style="font-size: 14px; color: var(--slate-600); line-height: 1.5;">What exists, what changes, why, how. This alone addresses 1 in 4 denials.</div>
</div>
</div>
<div style="display: flex; gap: 16px; align-items: flex-start;">
<span class="font-mono" style="font-size: 24px; font-weight: 300; color: var(--amber-500); flex-shrink: 0; width: 32px; text-align: right;">2</span>
<div>
<div style="font-size: 17px; font-weight: 500; color: var(--slate-900); margin-bottom: 4px;">End every plan with a Confidence Assessment</div>
<div style="font-size: 14px; color: var(--slate-600); line-height: 1.5;">Don&rsquo;t wait to be asked. State your confidence, identify risks, flag uncertainties.</div>
</div>
</div>
<div style="display: flex; gap: 16px; align-items: flex-start;">
<span class="font-mono" style="font-size: 24px; font-weight: 300; color: var(--amber-500); flex-shrink: 0; width: 32px; text-align: right;">3</span>
<div>
<div style="font-size: 17px; font-weight: 500; color: var(--slate-900); margin-bottom: 4px;">Search for existing patterns before proposing new code</div>
<div style="font-size: 14px; color: var(--slate-600); line-height: 1.5;">Explicitly state what you found in the codebase. Prefer reuse over new implementation.</div>
</div>
</div>
</div>
</div>
<!-- 6. Recurring phrases -->
<div class="section">
<div class="section-label">6. Your most-used phrases</div>
<div class="phrases-grid">
<div class="phrase-chip"><span class="phrase-text">"narrative"</span><span class="phrase-count">50+</span></div>
<div class="phrase-chip"><span class="phrase-text">"I don't want this in the plan"</span><span class="phrase-count">10</span></div>
<div class="phrase-chip"><span class="phrase-text">"feature branch"</span><span class="phrase-count">8+</span></div>
<div class="phrase-chip"><span class="phrase-text">"confidence"</span><span class="phrase-count">8+</span></div>
<div class="phrase-chip"><span class="phrase-text">"Missing overview"</span><span class="phrase-count">14</span></div>
<div class="phrase-chip"><span class="phrase-text">"front-end design skill"</span><span class="phrase-count">16</span></div>
<div class="phrase-chip"><span class="phrase-text">"separation of concerns"</span><span class="phrase-count">6</span></div>
<div class="phrase-chip"><span class="phrase-text">"Take a step back, breathe"</span><span class="phrase-count">6</span></div>
<div class="phrase-chip"><span class="phrase-text">"how does this work"</span><span class="phrase-count">5+</span></div>
<div class="phrase-chip"><span class="phrase-text">"what the fuck"</span><span class="phrase-count">4</span></div>
<div class="phrase-chip"><span class="phrase-text">"create a ticket"</span><span class="phrase-count">4+</span></div>
<div class="phrase-chip"><span class="phrase-text">"reusable"</span><span class="phrase-count">19+</span></div>
</div>
</div>
<!-- 7. Corrective Prompt -->
<div class="section" style="margin-bottom: 64px;">
<div class="action-panel">
<div class="section-label">7. The corrective prompt</div>
<div class="ap-intro">
These 17 instructions were extracted directly from your denial patterns. Embedding them in a planning prompt would address approximately 70% of all denial reasons.
</div>
<div class="prompt-block">
<div class="prompt-header">
<span class="prompt-header-label">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="4 17 10 11 4 5"></polyline><line x1="12" y1="19" x2="20" y2="19"></line></svg>
planning-instructions.md
</span>
<button class="copy-btn" onclick="copyPrompt(this)">
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2" ry="2"></rect><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"></path></svg>
Copy
</button>
</div>
<div class="prompt-body">
<pre id="prompt-content"><span class="comment"># Planning Instructions
# Derived from 370 files of denial & annotation analysis</span>
1. STRUCTURE: Every plan MUST begin with a "Solution Overview"
containing 2-3 paragraphs of narrative prose explaining:
- What exists today (current state)
- What will change and why
- How it will be built (approach summary)
Do NOT skip this. Do NOT replace it with bullet points.
2. NARRATIVE: Every major section must include a rationale
paragraph — not just what will be done, but WHY this
approach was chosen over alternatives.
3. FEATURE BRANCH: Always specify implementation will occur
on a feature branch. State the branch name. Never plan
to work directly on main.
4. EXISTING PATTERNS: Before proposing any new implementation,
search the codebase for existing patterns that solve the
same problem. Explicitly state what you found and whether
you will reuse it. Prefer reuse over new code.
5. CONFIDENCE STATEMENT: End the plan with a "Confidence
Assessment" section. State your confidence level, identify
risks or edge cases, and note uncertainties. Do not wait
to be asked.
6. PHASING: For plans with more than 3 steps, break them into
numbered phases. After each phase, note "Pause for
evaluation" so the reviewer can assess before proceeding.
7. ISSUE TRACKING: If the project uses Linear or GitHub Issues,
include a step to create relevant tickets BEFORE
implementation. Backlog items should be separate tickets.
8. SIMPLICITY: Choose the simplest approach that meets
requirements. Do not introduce modifier keys when hover
works. Do not build a framework when a README suffices.
9. NAMING: Use explicit, transparent names for user-facing
features. Do not euphemize Git operations ("Git Add"
not "Accept"). Match existing product naming conventions.
10. CODE QUALITY: State that implementation will follow clean
code principles: modular architecture, separation of
concerns, no circumventing lint or type checks.
11. CLEAN FOUNDATION: If the codebase has failing lint or type
checks, address these BEFORE proposing new features. State
the current CI/CD state.
12. PRIVACY: For features involving data storage or sharing,
explicitly state privacy guarantees. Require user
confirmation before storing data.
13. EXAMPLES: When the plan involves user-facing output or UI,
include an example of what it will look like.
14. FOCUSED SCOPE: Do not include sections that are obvious,
boilerplate, or previously asked to be removed. Keep the
plan focused rather than comprehensive.
15. DESIGN SKILL: For any frontend/UI work, invoke the
front-end design skill to validate the approach. Note
this invocation explicitly in the plan.
16. VERIFICATION STEP: For refactors or multi-file changes,
include a verification step with line-by-line comparison
of affected code paths.
17. DELIBERATION: If the plan involves a dramatic shift, state
that you have re-evaluated the approach, traced through
affected files mentally, and are confident in the plan.
Do not rush.</pre>
</div>
</div>
<div class="glow-wrap">
<div class="glow-bg"></div>
<div class="glow-card">
<div class="gc-text">
These instructions are yours &mdash; derived from <em>your feedback, your language, your standards</em>. Copy them into your planning prompt and watch the deny rate drop.
</div>
</div>
</div>
</div>
</div>
<footer>
<p>Analysis of 202 denied plans and 168 annotation files from the Plannotator archive.</p>
</footer>
</div>
<script>
function copyPrompt(btn) {
const text = document.getElementById('prompt-content').textContent;
navigator.clipboard.writeText(text).then(() => {
btn.innerHTML = '<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M22 11.08V12a10 10 0 1 1-5.93-9.14"></path><polyline points="22 4 12 14.01 9 11.01"></polyline></svg> Copied';
btn.classList.add('copied');
setTimeout(() => {
btn.innerHTML = '<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2" ry="2"></rect><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"></path></svg> Copy';
btn.classList.remove('copied');
}, 2000);
});
}
</script>
</body>
</html>


@@ -0,0 +1,282 @@
# Claude Code Fallback
Read this file only when the user does **not** have a usable Plannotator archive.
This is the secondary path for ordinary Claude Code users whose denial history
exists in `~/.claude/projects/` rather than `~/.plannotator/plans/`.
The goal is the same as the main skill:
- extract the user's real denial reasons
- reduce them into a taxonomy and prompt corrections
- produce the same HTML report design and section flow
## Source of Truth
Use the bundled parser at:
- [scripts/extract_exit_plan_mode_outcomes.py](../scripts/extract_exit_plan_mode_outcomes.py)
Resolve that script path relative to this skill directory before running it.
This script normalizes `ExitPlanMode` outcomes from Claude Code JSONL transcripts
and emits clean JSON parts containing only human-authored denial reasons by default.
Do **not** read raw `~/.claude/projects/**/*.jsonl` directly unless:
- the parser fails
- the user asks for audit-level verification
- you need to inspect one or two suspicious records by hand
The parser exists specifically to strip transcript noise such as generic native
reject strings and wrapper boilerplate.
## Run the Parser
Create the working directory first:
```bash
mkdir -p /tmp/compound-planning
```
Then run the bundled parser. Prefer `python3`; if unavailable, use `python`.
Use a resolved absolute script path, not a repo-local copy.
```bash
python3 [RESOLVED SKILL PATH]/scripts/extract_exit_plan_mode_outcomes.py \
--projects-dir ~/.claude/projects \
--json-out /tmp/compound-planning/claude-code-human-reasons.json \
--show-samples 0
```
Expected output:
- manifest:
`/tmp/compound-planning/claude-code-human-reasons/claude-code-human-reasons.manifest.json`
- part files:
`/tmp/compound-planning/claude-code-human-reasons/claude-code-human-reasons.part-XXXX-of-XXXX.json`
The script prints how many records were detected and how many JSON part files were emitted.
## What To Read First
Read the manifest before reading any part file.
The manifest gives you:
- total filtered record count
- total `ExitPlanMode` attempts
- native approval / denial counts
- non-native denial counts
- part file list
Use the part files only after you understand the overall dataset shape.
## Inventory In Fallback Mode
In Claude Code fallback mode, report this dataset instead of the Plannotator file counts:
- human denial reasons found
- total `ExitPlanMode` attempts scanned
- native approvals
- native denials with extractable inline reason
- native denials without recoverable reason
- non-native denials with recoverable payload
- number of emitted JSON parts
- date range from the records
- total days spanned
- distinct sessions
- distinct project roots / `cwd` values
Also calculate (a `jq` sketch follows this list):
- average `plan_length_chars` where present
- percentage of all denials that contain a recoverable human reason
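A sketch of the average, assuming `jq` is available and each part file is a JSON
array of records (both assumptions; verify against the manifest and one part file
first):
```bash
# Average plan_length_chars across all parts, skipping records without the field
jq -s 'flatten | map(.plan_length_chars // empty) | if length > 0 then add / length else "none present" end' \
  /tmp/compound-planning/claude-code-human-reasons/claude-code-human-reasons.part-*.json
```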
Do **not** fabricate Plannotator-only inventory fields in fallback mode:
- no `*-approved.md` counts
- no `*.annotations.md` counts
- no `*.diff.md` counts
- no approved-plan line-count analysis
If the user asks for those specifically, state that Claude Code log fallback mode
does not contain those artifacts.
### Previous Report Detection In Fallback Mode
Previous report detection still applies. Check the user's home directory or
`~/.plannotator/plans/` for existing `compound-planning-report*.html` files. If
found, offer the same incremental vs full choice as Plannotator mode. In
incremental mode, filter the parser output by timestamp rather than by filename
date — use the `timestamp` field in each JSON record.
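A sketch of that timestamp filter; the cutoff constant below is a hypothetical value taken from the previous report's modification date:
```python
import json
from pathlib import Path

CUTOFF = "2026-05-01"  # hypothetical: the previous report's modification date

out_dir = Path("/tmp/compound-planning/claude-code-human-reasons")
manifest = json.loads(
    (out_dir / "claude-code-human-reasons.manifest.json").read_text(encoding="utf-8")
)

fresh = []
for part in manifest["parts"]:
    part_payload = json.loads((out_dir / part["file_name"]).read_text(encoding="utf-8"))
    for record in part_payload["records"]:
        ts = record.get("timestamp") or record.get("result_timestamp") or ""
        # ISO-8601 timestamps compare correctly as strings at day granularity.
        if ts[:10] > CUTOFF:
            fresh.append(record)

print(f"{len(fresh)} records dated after {CUTOFF}")
```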
If no previous report exists, use the first-report naming convention
(`compound-planning-report.html`). Otherwise use the next version number.
## Extraction In Fallback Mode
Treat the emitted JSON part files as the clean source dataset.
### Batching
- **Small datasets (< 200 records):** read the part files directly without extra agents
- **Medium datasets (200-800 records):** split by part file or time range into 2-4 agents
- **Large datasets (800+ records):** split by part file groups or balanced time ranges
All extraction agents should use `model: "haiku"` — they're doing straightforward
file reading and structured extraction, not reasoning.
Each extraction agent should read every record in its assigned part files and write
clean markdown output to:
```text
/tmp/compound-planning/extraction-{batch-name}.md
```
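A sketch of the split for medium and large datasets, assuming the manifest paths above; the agent count is a hypothetical choice:
```python
import json
from pathlib import Path

N_AGENTS = 3  # hypothetical: pick 2-4 based on total record count

out_dir = Path("/tmp/compound-planning/claude-code-human-reasons")
manifest = json.loads(
    (out_dir / "claude-code-human-reasons.manifest.json").read_text(encoding="utf-8")
)

batches = [[] for _ in range(N_AGENTS)]
counts = [0] * N_AGENTS
# Assign each part file to the currently lightest batch so record counts stay balanced.
for part in sorted(manifest["parts"], key=lambda p: -p["record_count"]):
    lightest = counts.index(min(counts))
    batches[lightest].append(str(out_dir / part["file_name"]))
    counts[lightest] += part["record_count"]

for i, (files, total) in enumerate(zip(batches, counts), start=1):
    print(f"extraction-batch-{i}: {total} records -> {files}")
```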
### Extraction Prompt For Claude Code Denial Records
Use this prompt for each fallback extraction batch (substitute the part file list and the output path):
```text
You are extracting structured data from Claude Code ExitPlanMode denial records.
Files to read: [JSON PART FILES]
Output: Write your complete results to [OUTPUT FILE PATH]
Read EVERY record in the assigned files. Each record already contains a cleaned
human_reason field. Use that as the primary source text.
For EACH record, extract:
- Date
- Session ID
- Project / cwd
- Topic (only if inferable from the reason or plan path; otherwise say "Unknown from logs")
- Human denial reason
- What was specifically asked to change
- Feedback type (let the content determine the category)
- Notable phrases
- Reason source (`native_inline_reason`, `non_native_freeform_payload`, or `structured_quote_extraction`)
- Plan path if present
- Plan length in chars if present
Do NOT skip any records. One entry per record.
Format each entry as:
**[session_id :: tool_use_id]**
- Date: ...
- Project: ...
- Topic: ...
- Human denial reason: ...
- Feedback type: ...
- Specific asks: ...
- Notable phrases: ...
- Reason source: ...
- Plan path: ...
- Plan length chars: ...
---
After processing all records, write the complete results to [OUTPUT FILE PATH].
State the total record count at the end of the file.
```
## Reduction In Fallback Mode
The reduction step stays conceptually the same:
- taxonomy
- top patterns
- recurring phrases
- reviewer values
- recurring agent mistakes
- structural requests
- evolution over time
- corrective prompt instructions
Use `model: "sonnet"` for reduction agents, just as in Plannotator mode. The
two-stage reduce (partial reduces once there are 21+ extraction files) also
applies when there are many part files.
But interpret the dataset correctly:
- this is denial-reason evidence from Claude Code logs
- not every denial has a recoverable human reason
- annotations may be absent entirely
- success traits are often inferred from the inverse of repeated denial feedback
If the evidence for "what works" is weaker than the evidence for "what fails",
say that explicitly.
## HTML Report Adaptation
Use the same template and the same section order as the main skill.
In fallback mode:
- explicitly state in the header/meta that the source is Claude Code `ExitPlanMode`
denial reasons
- keep the same narrative-first editorial style
- keep the same 7 major sections
- use real denial-reason counts, dates, phrases, and percentages only
### KPI Sidebar Substitutes
The Plannotator version uses a revision-rate KPI that may not exist here.
In fallback mode, prefer this KPI trio:
1. top denial category percentage
2. total human denial reasons recovered
3. number of distinct denial categories
If a better third metric emerges from the data, use it, but do not invent one.
### Footer / Provenance
The footer tagline should mention that the report was derived from Claude Code
denial reasons rather than Plannotator markdown archives.
### Important Limitation To State
If `human_reasons_total < total denials`, mention in the narrative or footer note
that some denials in the transcript did not contain recoverable human-authored
feedback and therefore could not contribute to the pattern analysis.
### Versioned Report Naming
Versioned naming (`v2`, `v3`, etc.) applies to fallback mode too. Save reports
to `~/.plannotator/plans/` (create the directory if it doesn't exist) so that
all compound planning reports live in the same location regardless of data source.
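A sketch of the version bump, following the naming convention above:
```python
import re
from pathlib import Path

plans_dir = Path.home() / ".plannotator" / "plans"
plans_dir.mkdir(parents=True, exist_ok=True)

existing = list(plans_dir.glob("compound-planning-report*.html"))
if not existing:
    next_name = "compound-planning-report.html"
else:
    latest = 1  # the unsuffixed first report counts as v1
    for path in existing:
        match = re.fullmatch(r"compound-planning-report-v(\d+)\.html", path.name)
        if match:
            latest = max(latest, int(match.group(1)))
    next_name = f"compound-planning-report-v{latest + 1}.html"

print(plans_dir / next_name)
```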
## Summary In Fallback Mode
At the end, tell the user:
- how many human denial reasons were analyzed
- how many total `ExitPlanMode` attempts were scanned
- the top 3 denial patterns found
- the estimated percentage of denial reasons the corrective instructions address
- the single most impactful prompt improvement
- where the report was saved (including version number)
- if incremental: note that earlier findings are in the previous report
## Improvement Hook In Fallback Mode
The Phase 6 improvement hook applies to fallback mode too. The corrective prompt
instructions derived from Claude Code denial reasons are just as useful for
injection into future planning sessions. Follow the same flow as the main skill.
## Audit Mode
Use this only when the user asks for raw denial records or transcript noise:
```bash
python3 [RESOLVED SKILL PATH]/scripts/extract_exit_plan_mode_outcomes.py \
--projects-dir ~/.claude/projects \
--records-filter denials \
--json-out /tmp/compound-planning/claude-code-all-denials.json \
--show-samples 0
```
Do not use this audit-mode output for the normal report unless the user asks for it.


@@ -0,0 +1,820 @@
#!/usr/bin/env python3
"""Extract ExitPlanMode outcomes from Claude Code JSONL session logs.
This parser keeps three views of the same data:
1. Strict native Claude Code classification
- native approval:
"User has approved your plan."
- native denial:
"The user doesn't want to proceed with this tool use. The tool use was rejected"
2. General denial capture
- any matching ExitPlanMode tool_result with is_error=true and non-empty text
is captured as a denial/error payload, even when it is custom hook output
or some other non-native integration.
3. Human-reason extraction
- native inline reasons are preserved as-is
- freeform non-native error payloads are treated as human reasons
- structured non-native payloads are reduced to quoted feedback where possible
This means the script does not depend on hook-specific strings to capture custom
denials, but it also does not dump wrapper boilerplate into the human-reason
output.
The script streams JSONL line-by-line and uses only the Python standard library.
"""
from __future__ import annotations
import argparse
import json
import os
import sys
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Dict, Iterable, Iterator, List, Optional, Tuple
APPROVE_PREFIX = "User has approved your plan."
REJECT_PREFIX = (
"The user doesn't want to proceed with this tool use. "
"The tool use was rejected"
)
REASON_MARKER = "To tell you how to proceed, the user said:\n"
NOTE_MARKER = (
"\n\nNote: The user's next message may contain a correction or preference."
)
@dataclass
class AttemptRecord:
session_id: str
tool_use_id: str
file_path: str
line_number: int
timestamp: Optional[str]
cwd: Optional[str]
plan_file_path: Optional[str]
plan_length_chars: Optional[int]
outcome: str = "pending"
native_reason: Optional[str] = None
native_reason_style: Optional[str] = None
captured_reason: Optional[str] = None
captured_reason_style: Optional[str] = None
captured_reason_source: Optional[str] = None
human_reason: Optional[str] = None
human_reason_style: Optional[str] = None
human_reason_source: Optional[str] = None
result_is_error: Optional[bool] = None
result_file_path: Optional[str] = None
result_line_number: Optional[int] = None
result_timestamp: Optional[str] = None
result_preview: Optional[str] = None
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Extract ExitPlanMode approvals/denials from Claude Code logs."
)
parser.add_argument(
"--projects-dir",
default="~/.claude/projects",
help="Root Claude projects directory. Default: %(default)s",
)
parser.add_argument(
"--include-subagents",
action="store_true",
help="Include /subagents/ JSONL files. Default is to skip them.",
)
parser.add_argument(
"--records-filter",
choices=("all", "native", "native-denials", "denials", "human-reasons"),
default="human-reasons",
help=(
"Which records to write to JSON/CSV outputs. "
"Default: %(default)s"
),
)
parser.add_argument(
"--include-non-native-denials",
action="store_true",
help=(
"Include non-native denial/error payloads in sample output. "
"Default sample output shows only native denials."
),
)
parser.add_argument(
"--show-samples",
type=int,
default=5,
help="How many denial samples to print in the text summary.",
)
parser.add_argument(
"--json-out",
help="Optional path to write a JSON report.",
)
parser.add_argument(
"--max-output-tokens-per-file",
type=int,
default=50000,
help=(
"Approximate max token budget per JSON file when writing --json-out. "
"Default: %(default)s"
),
)
return parser.parse_args()
def iter_jsonl_files(root: Path, include_subagents: bool) -> Iterator[Path]:
for dirpath, dirnames, filenames in os.walk(root):
if not include_subagents and "subagents" in dirnames:
dirnames.remove("subagents")
dirnames.sort()
for filename in sorted(filenames):
if filename.endswith(".jsonl"):
yield Path(dirpath) / filename
def make_attempt_key(session_id: str, tool_use_id: str) -> str:
return session_id + "::" + tool_use_id
def preview(text: str, limit: int = 220) -> str:
compact = " ".join(text.split())
if len(compact) <= limit:
return compact
return compact[: limit - 3] + "..."
def estimate_tokens(text: str) -> int:
# Rough enough for output chunking. We intentionally bias slightly high.
return max(1, (len(text) + 3) // 4)
def iter_blocks(message_content: object) -> Iterator[dict]:
if not isinstance(message_content, list):
return
for block in message_content:
if isinstance(block, dict):
yield block
def extract_text(content: object) -> str:
if isinstance(content, str):
return content
if not isinstance(content, list):
return ""
parts: List[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
continue
if not isinstance(item, dict):
continue
if isinstance(item.get("text"), str):
parts.append(item["text"])
elif isinstance(item.get("content"), str):
parts.append(item["content"])
return "\n".join(part for part in parts if part)
def classify_reason_style(reason: Optional[str]) -> Optional[str]:
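    # Heuristic: markdown headings, horizontal rules, or the templated
    # "YOUR PLAN WAS NOT APPROVED." banner mark a payload as structured
    # (likely tool-assembled); anything else is treated as freeform human text.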
if not reason:
return None
stripped = reason.lstrip()
if (
stripped.startswith("#")
or stripped.startswith("YOUR PLAN WAS NOT APPROVED.")
or "\n## " in reason
or "\n---" in reason
):
return "structured"
return "freeform"
def extract_blockquote_feedback(text: str) -> List[str]:
quotes: List[str] = []
current: List[str] = []
for raw_line in text.splitlines():
stripped = raw_line.strip()
if stripped.startswith(">"):
current.append(stripped[1:].lstrip())
continue
if current:
if not stripped or stripped.startswith("## ") or stripped == "---":
quote = "\n".join(line for line in current if line).strip()
if quote:
quotes.append(quote)
current = []
continue
# Preserve wrapped continuation lines that belong to the same quote.
current.append(stripped)
if current:
quote = "\n".join(line for line in current if line).strip()
if quote:
quotes.append(quote)
return quotes
def extract_human_reason(
native_reason: Optional[str],
captured_reason: Optional[str],
captured_reason_style: Optional[str],
) -> Tuple[Optional[str], Optional[str], Optional[str]]:
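    # Preference order: a native inline reason wins outright; a freeform
    # non-native payload passes through as-is; a structured payload contributes
    # only its blockquoted feedback. Returns (reason, style, source).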
if native_reason:
return (
native_reason,
classify_reason_style(native_reason),
"native_inline_reason",
)
if not captured_reason:
return (None, None, None)
if captured_reason_style == "freeform":
return (
captured_reason,
classify_reason_style(captured_reason),
"non_native_freeform_payload",
)
quote_feedback = extract_blockquote_feedback(captured_reason)
if quote_feedback:
reason = "\n\n".join(quote_feedback)
return (
reason,
classify_reason_style(reason),
"structured_quote_extraction",
)
return (None, None, None)
def classify_result(
text: str,
is_error: bool,
) -> Tuple[str, Optional[str], Optional[str], Optional[str], Optional[str]]:
stripped = text.strip()
if not stripped:
if is_error:
return (
"denied_non_native_no_payload",
None,
None,
None,
None,
)
return ("pending", None, None, None, None)
if stripped.startswith(APPROVE_PREFIX):
return ("approved_native", None, None, None, None)
if stripped.startswith(REJECT_PREFIX):
marker_index = stripped.find(REASON_MARKER)
if marker_index < 0:
return ("denied_native_no_reason", None, None, None, None)
reason = stripped[marker_index + len(REASON_MARKER) :]
note_index = reason.find(NOTE_MARKER)
if note_index >= 0:
reason = reason[:note_index]
reason = reason.strip()
if reason:
style = classify_reason_style(reason)
return (
"denied_native_with_reason",
reason,
reason,
"native_inline_reason",
style,
)
return ("denied_native_no_reason", None, None, None, None)
if is_error:
style = classify_reason_style(stripped)
return (
"denied_non_native_with_payload",
None,
stripped,
"non_native_error_payload",
style,
)
return ("non_native_other", None, None, None, None)
def outcome_rank(outcome: str) -> int:
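    # Rank outcomes by information content so that, when several tool_results
    # reference the same tool_use, update_attempt_from_result keeps the most
    # informative classification rather than the first one seen.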
ranks = {
"pending": 0,
"non_native_other": 1,
"approved_native": 2,
"denied_native_no_reason": 3,
"denied_native_with_reason": 4,
"denied_non_native_no_payload": 5,
"denied_non_native_with_payload": 6,
}
return ranks.get(outcome, 0)
def update_attempt_from_result(
attempt: AttemptRecord,
file_path: Path,
line_number: int,
timestamp: Optional[str],
text: str,
is_error: bool,
) -> None:
(
outcome,
native_reason,
captured_reason,
captured_reason_source,
captured_reason_style,
) = classify_result(text=text, is_error=is_error)
if outcome_rank(outcome) < outcome_rank(attempt.outcome):
return
attempt.outcome = outcome
attempt.native_reason = native_reason
attempt.native_reason_style = classify_reason_style(native_reason)
attempt.captured_reason = captured_reason
attempt.captured_reason_source = captured_reason_source
attempt.captured_reason_style = captured_reason_style
(
attempt.human_reason,
attempt.human_reason_style,
attempt.human_reason_source,
) = extract_human_reason(
native_reason=native_reason,
captured_reason=captured_reason,
captured_reason_style=captured_reason_style,
)
attempt.result_is_error = is_error
attempt.result_file_path = str(file_path)
attempt.result_line_number = line_number
attempt.result_timestamp = timestamp
attempt.result_preview = preview(text)
def scan_projects(
projects_dir: Path,
include_subagents: bool,
) -> Tuple[Dict[str, int], List[AttemptRecord]]:
stats = {
"files_scanned": 0,
"lines_scanned": 0,
"json_errors": 0,
}
attempts: Dict[str, AttemptRecord] = {}
for file_path in iter_jsonl_files(projects_dir, include_subagents):
stats["files_scanned"] += 1
try:
handle = file_path.open("r", encoding="utf-8", errors="replace")
except OSError:
continue
with handle:
for line_number, raw_line in enumerate(handle, start=1):
if not raw_line.strip():
continue
stats["lines_scanned"] += 1
try:
obj = json.loads(raw_line)
except json.JSONDecodeError:
stats["json_errors"] += 1
continue
session_id = str(obj.get("sessionId") or str(file_path))
timestamp = obj.get("timestamp")
cwd = obj.get("cwd")
message = obj.get("message")
if not isinstance(message, dict):
continue
content = message.get("content")
for block in iter_blocks(content):
if (
block.get("type") == "tool_use"
and block.get("name") == "ExitPlanMode"
and isinstance(block.get("id"), str)
):
tool_use_id = block["id"]
key = make_attempt_key(session_id, tool_use_id)
if key in attempts:
continue
input_data = block.get("input")
plan = None
plan_file_path = None
if isinstance(input_data, dict):
if isinstance(input_data.get("plan"), str):
plan = input_data["plan"]
if isinstance(input_data.get("planFilePath"), str):
plan_file_path = input_data["planFilePath"]
attempts[key] = AttemptRecord(
session_id=session_id,
tool_use_id=tool_use_id,
file_path=str(file_path),
line_number=line_number,
timestamp=timestamp if isinstance(timestamp, str) else None,
cwd=cwd if isinstance(cwd, str) else None,
plan_file_path=plan_file_path,
plan_length_chars=len(plan) if isinstance(plan, str) else None,
)
if message.get("role") != "user":
continue
for block in iter_blocks(content):
if (
block.get("type") != "tool_result"
or not isinstance(block.get("tool_use_id"), str)
):
continue
key = make_attempt_key(session_id, block["tool_use_id"])
attempt = attempts.get(key)
if attempt is None:
continue
text = extract_text(block.get("content"))
update_attempt_from_result(
attempt=attempt,
file_path=file_path,
line_number=line_number,
timestamp=timestamp if isinstance(timestamp, str) else None,
text=text,
is_error=bool(block.get("is_error")),
)
return stats, list(attempts.values())
def summarize(attempts: Iterable[AttemptRecord]) -> Dict[str, int]:
summary = {
"total_exit_plan_attempts": 0,
"approved_native": 0,
"denied_native_with_reason": 0,
"denied_native_no_reason": 0,
"denied_native_with_freeform_reason": 0,
"denied_native_with_structured_reason": 0,
"denied_non_native_with_payload": 0,
"denied_non_native_no_payload": 0,
"captured_denial_reasons_total": 0,
"captured_freeform_reasons": 0,
"captured_structured_reasons": 0,
"human_reasons_total": 0,
"human_reasons_native": 0,
"human_reasons_non_native": 0,
"human_reasons_freeform": 0,
"human_reasons_structured": 0,
"non_native_other": 0,
"pending": 0,
}
for attempt in attempts:
summary["total_exit_plan_attempts"] += 1
summary[attempt.outcome] = summary.get(attempt.outcome, 0) + 1
if attempt.outcome == "denied_native_with_reason":
if attempt.native_reason_style == "freeform":
summary["denied_native_with_freeform_reason"] += 1
elif attempt.native_reason_style == "structured":
summary["denied_native_with_structured_reason"] += 1
if attempt.captured_reason:
summary["captured_denial_reasons_total"] += 1
if attempt.captured_reason_style == "freeform":
summary["captured_freeform_reasons"] += 1
elif attempt.captured_reason_style == "structured":
summary["captured_structured_reasons"] += 1
if attempt.human_reason:
summary["human_reasons_total"] += 1
if attempt.human_reason_source == "native_inline_reason":
summary["human_reasons_native"] += 1
else:
summary["human_reasons_non_native"] += 1
if attempt.human_reason_style == "freeform":
summary["human_reasons_freeform"] += 1
elif attempt.human_reason_style == "structured":
summary["human_reasons_structured"] += 1
return summary
def filter_records(
attempts: List[AttemptRecord],
records_filter: str,
) -> List[AttemptRecord]:
if records_filter == "all":
return attempts
if records_filter == "native":
return [
attempt
for attempt in attempts
if attempt.outcome.startswith("approved_native")
or attempt.outcome.startswith("denied_native")
]
if records_filter == "native-denials":
return [
attempt
for attempt in attempts
if attempt.outcome.startswith("denied_native")
]
if records_filter == "human-reasons":
return [attempt for attempt in attempts if attempt.human_reason]
return [
attempt
for attempt in attempts
if attempt.outcome.startswith("denied_native")
or attempt.outcome.startswith("denied_non_native")
]
def build_json_chunks(
records: List[AttemptRecord],
max_output_tokens_per_file: int,
) -> List[List[AttemptRecord]]:
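    # Greedy chunking: append records in order until the estimated token budget
    # for the current part file would be exceeded, then start a new chunk.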
if not records:
return [[]]
chunks: List[List[AttemptRecord]] = []
current_chunk: List[AttemptRecord] = []
current_tokens = 0
for record in records:
record_dict = asdict(record)
record_json = json.dumps(record_dict, ensure_ascii=False)
record_tokens = estimate_tokens(record_json)
if current_chunk and current_tokens + record_tokens > max_output_tokens_per_file:
chunks.append(current_chunk)
current_chunk = []
current_tokens = 0
current_chunk.append(record)
current_tokens += record_tokens
if current_chunk:
chunks.append(current_chunk)
return chunks
def print_summary(
projects_dir: Path,
include_subagents: bool,
stats: Dict[str, int],
attempts: List[AttemptRecord],
summary: Dict[str, int],
show_samples: int,
include_non_native_denials: bool,
) -> None:
native_denials = (
summary["denied_native_with_reason"] + summary["denied_native_no_reason"]
)
total_denials = (
native_denials
+ summary["denied_non_native_with_payload"]
+ summary["denied_non_native_no_payload"]
)
native_extractable_ratio = (
(summary["denied_native_with_reason"] / native_denials) * 100.0
if native_denials
else 0.0
)
all_capture_ratio = (
(summary["captured_denial_reasons_total"] / total_denials) * 100.0
if total_denials
else 0.0
)
print(f"Projects dir: {projects_dir}")
print(f"Included subagents: {'yes' if include_subagents else 'no'}")
print(f"JSONL files scanned: {stats['files_scanned']}")
print(f"JSON lines scanned: {stats['lines_scanned']}")
print(f"JSON parse errors: {stats['json_errors']}")
print()
print(f"ExitPlanMode attempts: {summary['total_exit_plan_attempts']}")
print(f"Native approvals: {summary['approved_native']}")
print(
"Native denials with extractable reason: "
f"{summary['denied_native_with_reason']}"
)
print(
"Native denials without reason: "
f"{summary['denied_native_no_reason']}"
)
print(
"Freeform native reasons: "
f"{summary['denied_native_with_freeform_reason']}"
)
print(
"Structured native reasons: "
f"{summary['denied_native_with_structured_reason']}"
)
print(
"Non-native denials with payload: "
f"{summary['denied_non_native_with_payload']}"
)
print(
"Non-native denials without payload: "
f"{summary['denied_non_native_no_payload']}"
)
print(
"Captured denial reasons total: "
f"{summary['captured_denial_reasons_total']}"
)
print(
"Captured freeform reasons: "
f"{summary['captured_freeform_reasons']}"
)
print(
"Captured structured reasons: "
f"{summary['captured_structured_reasons']}"
)
print(f"Human reasons total: {summary['human_reasons_total']}")
print(f"Human reasons from native denials: {summary['human_reasons_native']}")
print(
"Human reasons from non-native denials: "
f"{summary['human_reasons_non_native']}"
)
print(
"Non-native / non-denial outcomes: "
f"{summary['non_native_other']}"
)
print(f"Pending / unmatched attempts: {summary['pending']}")
print()
print(
"Extractable native denial reasons: "
f"{summary['denied_native_with_reason']}/{native_denials} "
f"({native_extractable_ratio:.1f}%)"
)
print(
"Captured denial payloads across all denial types: "
f"{summary['captured_denial_reasons_total']}/{total_denials} "
f"({all_capture_ratio:.1f}%)"
)
print(
"Human reasons across all denial types: "
f"{summary['human_reasons_total']}/{total_denials} "
f"({((summary['human_reasons_total'] / total_denials) * 100.0 if total_denials else 0.0):.1f}%)"
)
if include_non_native_denials:
samples = [attempt for attempt in attempts if attempt.human_reason]
else:
samples = [
attempt
for attempt in attempts
if attempt.outcome == "denied_native_with_reason" and attempt.human_reason
]
samples = samples[: max(show_samples, 0)]
if not samples:
return
print()
print(
"Sample denial reasons:"
if include_non_native_denials
else "Sample native denial reasons:"
)
for attempt in samples:
style = attempt.human_reason_style or "unknown"
source = attempt.human_reason_source or "unknown"
reason = attempt.human_reason or ""
print(
"- "
f"[{attempt.outcome} / {source} / {style}] "
f"{reason!r} "
f"({attempt.file_path}:{attempt.result_line_number})"
)
def write_json_report(
output_path: Path,
projects_dir: Path,
include_subagents: bool,
stats: Dict[str, int],
summary: Dict[str, int],
records: List[AttemptRecord],
max_output_tokens_per_file: int,
) -> List[Path]:
output_path.parent.mkdir(parents=True, exist_ok=True)
chunks = build_json_chunks(records, max_output_tokens_per_file)
base_name = output_path.stem
output_dir = output_path.with_suffix("")
output_dir.mkdir(parents=True, exist_ok=True)
written_files: List[Path] = []
part_summaries = []
for index, chunk in enumerate(chunks, start=1):
chunk_records = [asdict(record) for record in chunk]
chunk_payload = {
"projects_dir": str(projects_dir),
"include_subagents": include_subagents,
"stats": stats,
"summary": summary,
"part_index": index,
"part_count": len(chunks),
"record_count": len(chunk_records),
"records": chunk_records,
}
part_name = f"{base_name}.part-{index:04d}-of-{len(chunks):04d}.json"
part_path = output_dir / part_name
part_path.write_text(
json.dumps(chunk_payload, indent=2, ensure_ascii=False),
encoding="utf-8",
)
written_files.append(part_path)
part_summaries.append(
{
"part_index": index,
"file_name": part_name,
"record_count": len(chunk_records),
}
)
manifest_payload = {
"projects_dir": str(projects_dir),
"include_subagents": include_subagents,
"stats": stats,
"summary": summary,
"records_filter_record_count": len(records),
"part_count": len(chunks),
"max_output_tokens_per_file": max_output_tokens_per_file,
"parts": part_summaries,
}
manifest_path = output_dir / f"{base_name}.manifest.json"
manifest_path.write_text(
json.dumps(manifest_payload, indent=2, ensure_ascii=False),
encoding="utf-8",
)
written_files.insert(0, manifest_path)
return written_files
def main() -> int:
args = parse_args()
projects_dir = Path(args.projects_dir).expanduser()
if not projects_dir.exists():
print(f"Projects dir does not exist: {projects_dir}", file=sys.stderr)
return 1
stats, attempts = scan_projects(
projects_dir=projects_dir,
include_subagents=args.include_subagents,
)
attempts.sort(
key=lambda attempt: (
attempt.file_path,
attempt.line_number,
attempt.tool_use_id,
)
)
summary = summarize(attempts)
records = filter_records(attempts, args.records_filter)
print_summary(
projects_dir=projects_dir,
include_subagents=args.include_subagents,
stats=stats,
attempts=attempts,
summary=summary,
show_samples=args.show_samples,
include_non_native_denials=args.include_non_native_denials,
)
if args.json_out:
written_files = write_json_report(
output_path=Path(args.json_out).expanduser(),
projects_dir=projects_dir,
include_subagents=args.include_subagents,
stats=stats,
summary=summary,
records=records,
max_output_tokens_per_file=args.max_output_tokens_per_file,
)
part_count = max(len(written_files) - 1, 0)
print()
print(
"Wrote JSON output: "
f"detected {len(records)} records for filter '{args.records_filter}' "
f"and emitted {part_count} part file(s) plus a manifest."
)
return 0
if __name__ == "__main__":
sys.exit(main())