- Frontend: Vite + React + TypeScript chat interface
- Backend: FastAPI gateway with LangGraph routing
- Knowledge Service: ChromaDB RAG with Gitea scraper
- LangGraph Service: Multi-agent orchestration
- Airflow: Scheduled Gitea ingestion DAG
- Documentation: Complete plan and implementation guides

Architecture:
- Modular Docker Compose per service
- External ai-mesh network for communication
- Fast rebuilds with /app/packages pattern
- Intelligent agent routing (no hardcoded keywords)

Services:
- Frontend (5173): React chat UI
- Chat Gateway (8000): FastAPI entry point
- LangGraph (8090): Agent orchestration
- Knowledge (8080): ChromaDB RAG
- Airflow (8081): Scheduled ingestion
- PostgreSQL (5432): Chat history

Excludes: node_modules, .venv, chroma_db, logs, .env files
Includes: All source code, configs, docs, docker files
Phase 3: Knowledge Engine & Agent Orchestration
GOAL
Build a "Deep Knowledge Agent" (DKA) that acts as a secure, quarantined bridge between the Chat Gateway and private data sources.
ARCHITECTURE OVERVIEW
Layers
- Public Gateway: FastAPI (The "Voice").
- Orchestration Layer: LangGraph Supervisor (The "Router").
- Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
  - Strictly Read-Only.
  - Accesses ChromaDB and Media stores.
- Specialist Agent: Opencode (The "Engineer").
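The layering above could be sketched roughly as follows. This is plain Python, not the LangGraph API, and every name in it (Supervisor, classify, the agent labels, the two agent functions) is a hypothetical stand-in. The routing decision is injected as a callable so that it can be backed by a model call rather than a keyword table.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical labels; the plan only names the roles, not identifiers.
LIBRARIAN = "librarian"
ENGINEER = "engineer"

Agent = Callable[[str], str]
Classifier = Callable[[str], str]


@dataclass
class Supervisor:
    """Routes one message to the agent chosen by an injected classifier."""

    classify: Classifier  # stand-in for an LLM routing call (assumption)
    agents: Dict[str, Agent]
    fallback: str = LIBRARIAN

    def route(self, message: str) -> str:
        label = self.classify(message)
        # Unknown labels fall back to the read-only agent instead of raising.
        agent = self.agents.get(label, self.agents[self.fallback])
        return agent(message)


def librarian(message: str) -> str:
    # Placeholder for the quarantined, read-only retrieval agent.
    return f"[librarian] retrieved context for: {message}"


def engineer(message: str) -> str:
    # Placeholder for the specialist agent.
    return f"[engineer] drafted a change for: {message}"
```

The gateway would hold a single Supervisor and call route() per request; falling back to the read-only agent keeps an unrecognised label from reaching anything that can modify state.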
Data Sources (The "Knowledge Mesh")
- Code: Gitea (Repos, Markdown docs).
- Notes: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- Wiki: DokuWiki.
- Inventory: HomeBox (Physical gear, photos).
- Tasks: Vikunja.
- Media: Immich (Photos/Videos metadata via Gemini Vision).
Agent Tooling & Orchestration
- Orchestrators: CAO CLI, Agent Pipe.
- External Agents: Goose, Aider, Opencode (Specialist).
COMPONENT DETAILS
The Librarian (DKA - LangGraph)
- Purpose: Semantic retrieval and data synthesis from vectors.
- Tools:
  - query_chroma: Search the vector database.
  - fetch_media_link: Returns a signed URL/path for Immich/HomeBox images.
- Constraints:
  - NO bash or write tools.
  - NO
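A minimal sketch of how the two tools and the read-only constraint might fit together, using only the standard library. Only the tool names query_chroma and fetch_media_link come from the plan. The signing scheme (HMAC-SHA256 over path and expiry), SIGNING_KEY, the in-memory index, verify_media_link, call_tool, and the example asset path are all assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Any, Callable, Dict, List, Optional

# Assumption: a secret shared with whatever serves the media; load from env in practice.
SIGNING_KEY = b"replace-me"

# In-memory stand-in for the ChromaDB collection, for illustration only.
_FAKE_INDEX: Dict[str, str] = {
    "doc-1": "Backup procedure for the Gitea server",
    "doc-2": "Tent and stove inventory stored in HomeBox",
}


def query_chroma(query: str, n_results: int = 3) -> List[str]:
    """Placeholder search: term matching instead of real vector similarity."""
    terms = query.lower().split()
    hits = [doc_id for doc_id, text in _FAKE_INDEX.items()
            if any(term in text.lower() for term in terms)]
    return hits[:n_results]


def _sign(asset_path: str, expires: int) -> str:
    payload = f"{asset_path}:{expires}".encode("utf-8")
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def fetch_media_link(asset_path: str, ttl_seconds: int = 300,
                     now: Optional[float] = None) -> str:
    """Return the path with an expiry timestamp and an HMAC signature appended."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    return f"{asset_path}?expires={expires}&sig={_sign(asset_path, expires)}"


def verify_media_link(asset_path: str, expires: int, sig: str,
                      now: Optional[float] = None) -> bool:
    """Check expiry first, then compare signatures in constant time."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(_sign(asset_path, expires), sig)


# Allowlist: anything not registered here (bash, write, ...) cannot be called.
READ_ONLY_TOOLS: Dict[str, Callable[..., Any]] = {
    "query_chroma": query_chroma,
    "fetch_media_link": fetch_media_link,
}


def call_tool(name: str, *args: Any, **kwargs: Any) -> Any:
    if name not in READ_ONLY_TOOLS:
        raise PermissionError(f"tool {name!r} is not available to the Librarian")
    return READ_ONLY_TOOLS[name](*args, **kwargs)
```

An allowlist is used instead of a blocklist so that a newly added tool stays unavailable to the Librarian until it is registered explicitly.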
The Ingestion Pipeline (Airflow/Custom Python)
- Multi-Source Scrapers: API-based (Gitea, Immich) and File-based (Obsidian).
- Vision Integration: Gemini analyzes Immich photos to create searchable text descriptions.
- Storage: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).
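One plausible use of the stored hashes is to skip documents that have not changed between runs. The sketch below assumes that reading of "Metadata/Hashes"; the function names are hypothetical, and a plain dict stands in for the PostgreSQL table.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def content_hash(text: str) -> str:
    """SHA-256 of the document body, used as a change fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_changed(documents: Iterable[Tuple[str, str]],
                   known_hashes: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (doc_id, text) pairs that are new or modified.

    known_hashes is updated in place; in the real pipeline this would be
    a read and an upsert against the metadata table (assumption).
    """
    changed: List[Tuple[str, str]] = []
    for doc_id, text in documents:
        digest = content_hash(text)
        if known_hashes.get(doc_id) != digest:
            changed.append((doc_id, text))
            known_hashes[doc_id] = digest
    return changed
```

Only the pairs returned by select_changed would then be embedded and written to ChromaDB, which keeps a scheduled run cheap when most sources are unchanged.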
TODO LIST [0/4]
- Create 'knowledge_service' directory.
- Implement test_rag.py (Hello World retrieval).
- Build basic scraper for hobbies.org.
- Integrate DKA logic into the FastAPI Gateway.
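As a starting point for the test_rag.py item, a "Hello World" retrieval check might look like the sketch below. It deliberately avoids ChromaDB and uses a bag-of-words cosine similarity as a stand-in for real embeddings, so that the test shape can be settled first; embed, cosine, retrieve, and the sample corpus are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: Dict[str, str], k: int = 1) -> List[Tuple[str, float]]:
    """Return the k most similar (doc_id, score) pairs, best first."""
    q = embed(query)
    scored = [(doc_id, cosine(q, embed(text))) for doc_id, text in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def test_hello_world_retrieval() -> None:
    corpus = {
        "greeting": "hello world from the knowledge service",
        "inventory": "camping stove stored in the garage",
    }
    top_id, score = retrieve("hello world", corpus)[0]
    assert top_id == "greeting"
    assert score > 0.0
```

Once the ChromaDB collection exists, retrieve could be replaced by a call into it while the assertions stay the same.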