#+TITLE: Phase 3: Knowledge Engine & Agent Orchestration
#+AUTHOR: Giordano (via opencode)
#+OPTIONS: toc:2

* GOAL
Build a "Deep Knowledge Agent" (DKA) that acts as a secure, quarantined bridge between the Chat Gateway and private data sources.

* ARCHITECTURE OVERVIEW
** Layers
1. Public Gateway: FastAPI (The "Voice").
2. Orchestration Layer: LangGraph Supervisor (The "Router").
3. Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
   - Strictly Read-Only.
   - Accesses ChromaDB and Media stores.
4. Specialist Agent: Opencode (The "Engineer").
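
The layering above can be sketched in plain Python to show how a request might flow from the gateway through the router to an agent. This is a minimal sketch, not the LangGraph Supervisor itself: the keyword heuristic, the ~Reply~ type, and the function names ~route~ and ~handle~ are all hypothetical, and a real supervisor would likely let an LLM decide the route.

#+begin_src python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Reply:
    agent: str  # which layer produced the answer
    text: str


def librarian(query: str) -> Reply:
    # Quarantined, read-only agent: would query ChromaDB and media stores.
    return Reply("librarian", f"[retrieved context for: {query}]")


def engineer(query: str) -> Reply:
    # Specialist agent: would delegate the task to Opencode.
    return Reply("engineer", f"[engineering task accepted: {query}]")


AGENTS: Dict[str, Callable[[str], Reply]] = {
    "librarian": librarian,
    "engineer": engineer,
}

# Hypothetical keyword heuristic (crude substring match), for illustration only.
ENGINEERING_HINTS = ("implement", "refactor", "fix", "deploy")


def route(query: str) -> str:
    """Pick the agent name for a query (the "Router" role)."""
    lowered = query.lower()
    if any(hint in lowered for hint in ENGINEERING_HINTS):
        return "engineer"
    return "librarian"


def handle(query: str) -> Reply:
    """Entry point the gateway (the "Voice") would call."""
    return AGENTS[route(query)](query)
#+end_src

Keeping ~route~ separate from ~handle~ means the routing decision can be tested, or later replaced by a LangGraph node, without touching the agents.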

** Data Sources (The "Knowledge Mesh")
- [ ] *Code*: Gitea (Repos, Markdown docs).
- [ ] *Notes*: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- [ ] *Wiki*: DokuWiki.
- [ ] *Inventory*: HomeBox (Physical gear, photos).
- [ ] *Tasks*: Vikunja.
- [ ] *Media*: Immich (Photo/video metadata; text descriptions via Gemini Vision).

** Agent Tooling & Orchestration
- [ ] *Orchestrators*: CAO CLI, Agent Pipe.
- [ ] *External Agents*: Goose, Aider, Opencode (Specialist).

* COMPONENT DETAILS
** The Librarian (DKA - LangGraph)
- Purpose: Semantic retrieval over the vector store and synthesis of the retrieved content.
- Tools:
  - ~query_chroma~: Searches the vector database.
  - ~fetch_media_link~: Returns a signed URL or path for Immich/HomeBox images.
- Constraints:
  - NO ~bash~ or ~write~ tools.
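
One way to enforce the tool constraints is an explicit allowlist, with ~fetch_media_link~ producing an expiring HMAC-signed link. The sketch below uses only the standard library and rests on several assumptions: the shared ~SIGNING_KEY~, the ~expires~ and ~sig~ query parameters, the ~verify_media_link~ and ~call_tool~ helpers, and the example path are all hypothetical, and ~query_chroma~ is a placeholder that does not talk to ChromaDB.

#+begin_src python
import hashlib
import hmac
import time
from typing import Any, Callable, Dict, List, Optional
from urllib.parse import urlencode

# Assumption: a secret shared between the Librarian and whatever proxy
# serves Immich/HomeBox media. Load it from configuration in practice.
SIGNING_KEY = b"change-me"


def _signature(path: str, expires: int) -> str:
    payload = f"{path}:{expires}".encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def query_chroma(query: str, n_results: int = 3) -> List[str]:
    # Placeholder: a real implementation would search ChromaDB here.
    return []


def fetch_media_link(path: str, ttl_seconds: int = 300,
                     now: Optional[float] = None) -> str:
    """Return path plus an expiry timestamp and an HMAC-SHA256 signature."""
    issued = time.time() if now is None else now
    expires = int(issued + ttl_seconds)
    params = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{params}"


def verify_media_link(path: str, expires: int, sig: str,
                      now: Optional[float] = None) -> bool:
    """Serving-side check: signature must match and link must not be expired."""
    current = time.time() if now is None else now
    return current <= expires and hmac.compare_digest(_signature(path, expires), sig)


# The allowlist is the whole tool surface: no bash, no write.
READ_ONLY_TOOLS: Dict[str, Callable[..., Any]] = {
    "query_chroma": query_chroma,
    "fetch_media_link": fetch_media_link,
}


def call_tool(name: str, **kwargs: Any) -> Any:
    """Dispatch a tool call, refusing anything outside the allowlist."""
    if name not in READ_ONLY_TOOLS:
        raise PermissionError(f"tool not available to the Librarian: {name}")
    return READ_ONLY_TOOLS[name](**kwargs)
#+end_src

For example, ~call_tool("fetch_media_link", path="/immich/asset/123")~ returns a link that stops verifying after five minutes, while ~call_tool("bash", cmd="ls")~ raises ~PermissionError~.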

** The Ingestion Pipeline (Airflow/Custom Python)
- [ ] *Multi-Source Scrapers*: API-based (Gitea, Immich) and File-based (Obsidian).
- [ ] *Vision Integration*: Gemini analyzes Immich photos to create searchable text descriptions.
- [ ] *Storage*: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).
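
The hash column in the metadata store suggests a change-detection step: re-embed a document only when its content hash differs from the stored one. The following is a sketch under stated assumptions, using an in-memory SQLite database as a stand-in for PostgreSQL; the ~documents~ table, its columns, and the helper names are hypothetical.

#+begin_src python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    """SHA-256 of the document body, used to detect changes between runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_store() -> sqlite3.Connection:
    # Stand-in for the PostgreSQL metadata database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " source TEXT NOT NULL,"
        " doc_id TEXT NOT NULL,"
        " sha256 TEXT NOT NULL,"
        " PRIMARY KEY (source, doc_id))"
    )
    return conn


def needs_reindex(conn: sqlite3.Connection, source: str,
                  doc_id: str, text: str) -> bool:
    """Return True (and record the new hash) if the document is new or changed."""
    digest = content_hash(text)
    row = conn.execute(
        "SELECT sha256 FROM documents WHERE source = ? AND doc_id = ?",
        (source, doc_id),
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # unchanged: skip embedding and the ChromaDB write
    conn.execute(
        "INSERT OR REPLACE INTO documents (source, doc_id, sha256)"
        " VALUES (?, ?, ?)",
        (source, doc_id, digest),
    )
    conn.commit()
    return True
#+end_src

A scraper would call ~needs_reindex~ once per fetched document and send only the documents that return ~True~ on to embedding, which keeps repeated runs cheap.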

* TODO LIST [0/4]
- [ ] Create the ~knowledge_service~ directory.
- [ ] Implement ~test_rag.py~ (Hello World retrieval).
- [ ] Build a basic scraper for ~hobbies.org~.
- [ ] Integrate DKA logic into the FastAPI Gateway.
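
As a starting point for ~test_rag.py~, a "Hello World" retrieval test can be written before ChromaDB is wired in. This sketch swaps in a bag-of-words cosine similarity for real embeddings, so it only illustrates the shape of the test; the ~embed~, ~cosine~, and ~retrieve~ helpers and the three sample documents are made up for illustration.

#+begin_src python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    # Stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: Dict[str, str], k: int = 1) -> List[str]:
    """Return the ids of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc_id: cosine(q, embed(corpus[doc_id])),
                    reverse=True)
    return ranked[:k]


# Made-up sample documents, keyed by a hypothetical "source/id" scheme.
CORPUS = {
    "homebox/soldering-iron": "Soldering iron stored in the garage workbench drawer",
    "vikunja/task-12": "Renew the domain before month end",
    "dokuwiki/backup": "Backup procedure for the home server",
}


def test_hello_world_retrieval() -> None:
    assert retrieve("where is the soldering iron", CORPUS) == ["homebox/soldering-iron"]
#+end_src

Once the vector store exists, ~retrieve~ could be replaced by a call into ~query_chroma~ while the test body stays the same.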