#+TITLE: Phase 3: Knowledge Engine & Agent Orchestration
#+AUTHOR: Giordano (via opencode)
#+OPTIONS: toc:2

* GOAL
Build a "Deep Knowledge Agent" (DKA) that acts as a secure, quarantined bridge between the Chat Gateway and private data sources.

* ARCHITECTURE OVERVIEW
** Layers
1. Public Gateway: FastAPI (The "Voice").
2. Orchestration Layer: LangGraph Supervisor (The "Router").
3. Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
   - Strictly Read-Only.
   - Accesses ChromaDB and Media stores.
4. Specialist Agent: Opencode (The "Engineer").

** Data Sources (The "Knowledge Mesh")
- [ ] *Code*: Gitea (Repos, Markdown docs).
- [ ] *Notes*: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- [ ] *Wiki*: DokuWiki.
- [ ] *Inventory*: HomeBox (Physical gear, photos).
- [ ] *Tasks*: Vikunja.
- [ ] *Media*: Immich (Photos/Videos metadata via Gemini Vision).

** Agent Tooling & Orchestration
- [ ] *Orchestrators*: CAO CLI, Agent Pipe.
- [ ] *External Agents*: Goose, Aider, Opencode (Specialist).

* COMPONENT DETAILS
** The Librarian (DKA - LangGraph)
- Purpose: Semantic retrieval and data synthesis from vectors.
- Tools:
  - ~query_chroma~: Search the vector database.
  - ~fetch_media_link~: Returns a signed URL/path for Immich/HomeBox images.
- Constraints:
  - NO ~bash~ or ~write~ tools.

** The Ingestion Pipeline (Airflow/Custom Python)
- [ ] *Multi-Source Scrapers*: API-based (Gitea, Immich) and File-based (Obsidian).
- [ ] *Vision Integration*: Gemini analyzes Immich photos to create searchable text descriptions.
- [ ] *Storage*: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).

* TODO LIST [0/4]
- [ ] Create 'knowledge_service' directory.
- [ ] Implement ~test_rag.py~ (Hello World retrieval).
- [ ] Build basic scraper for ~hobbies.org~.
- [ ] Integrate DKA logic into the FastAPI Gateway.
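
As a starting point for the "Hello World retrieval" item, the following is a minimal sketch of the retrieval idea using only the Python standard library. It is an illustration under stated assumptions, not the planned implementation: ~ToyVectorStore~, ~embed~, ~cosine~ and ~librarian_may_call~ are hypothetical names, the bag-of-words "embedding" merely stands in for real vectors held in ChromaDB, and the in-memory hash check stands in for the hashes planned for PostgreSQL. The allowlist mirrors the Librarian constraint that only ~query_chroma~ and ~fetch_media_link~ are available.

```python
import hashlib
import math
import re
from collections import Counter

# Tools the Librarian is allowed to use, taken from the component notes.
# Anything else (for example bash or write) is rejected.
LIBRARIAN_TOOLS = frozenset({"query_chroma", "fetch_media_link"})


def librarian_may_call(tool_name: str) -> bool:
    """Return True only for tools on the read-only allowlist."""
    return tool_name in LIBRARIAN_TOOLS


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for the vector store plus the hash table."""

    def __init__(self) -> None:
        # content hash -> (source name, original text, toy vector)
        self._docs: dict[str, tuple[str, str, Counter]] = {}

    def add(self, source: str, text: str) -> bool:
        """Index a document; return False if identical content was seen."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in self._docs:
            return False
        self._docs[digest] = (source, text, embed(text))
        return True

    def query(self, question: str, k: int = 1) -> list[tuple[str, str]]:
        """Return up to k (source, text) pairs, most similar first."""
        q = embed(question)
        scored = [
            (cosine(q, vec), source, text)
            for source, text, vec in self._docs.values()
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return [(source, text) for score, source, text in scored[:k] if score > 0]


if __name__ == "__main__":
    store = ToyVectorStore()
    # Sample records are invented for the demo; source names follow the mesh.
    store.add("homebox", "Soldering iron stored in the garage on the top shelf")
    store.add("vikunja", "Renew the domain name before March")
    store.add("dokuwiki", "Backup procedure for the home server uses nightly snapshots")
    for source, text in store.query("where is the soldering iron"):
        print(f"{source}: {text}")
```

Keeping ~add~ and ~query~ as the only entry points means the toy store could later be swapped for a real vector database client without changing the calling code, and the duplicate check in ~add~ shows why content hashes are worth storing: unchanged documents can be skipped on re-ingestion.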