Initial commit: Multi-service AI agent system

- Frontend: Vite + React + TypeScript chat interface
- Backend: FastAPI gateway with LangGraph routing
- Knowledge Service: ChromaDB RAG with Gitea scraper
- LangGraph Service: Multi-agent orchestration
- Airflow: Scheduled Gitea ingestion DAG
- Documentation: Complete plan and implementation guides

Architecture:
- Modular Docker Compose per service
- External ai-mesh network for communication
- Fast rebuilds with /app/packages pattern
- Intelligent agent routing (no hardcoded keywords)

Services:
- Frontend (5173): React chat UI
- Chat Gateway (8000): FastAPI entry point
- LangGraph (8090): Agent orchestration
- Knowledge (8080): ChromaDB RAG
- Airflow (8081): Scheduled ingestion
- PostgreSQL (5432): Chat history

Excludes: node_modules, .venv, chroma_db, logs, .env files
Includes: All source code, configs, docs, docker files
56
knowledge_service/knowledge_agent_plan.md
Normal file
@@ -0,0 +1,56 @@
# GOAL

Build a "Deep Knowledge Agent" (DKA) that acts as a secure,
quarantined bridge between the Chat Gateway and private data sources.

# ARCHITECTURE OVERVIEW

## Layers

1. Public Gateway: FastAPI (The "Voice").
2. Orchestration Layer: LangGraph Supervisor (The "Router").
3. Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
   - Strictly Read-Only.
   - Accesses ChromaDB and Media stores.
4. Specialist Agent: Opencode (The "Engineer").
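
The layering above can be sketched as a gateway that hands every message to a supervisor, which then delegates to exactly one agent. This is a minimal illustration and not the project's LangGraph code: the `Supervisor` class, the agent names, and `stub_classifier` are all hypothetical, and the stub only stands in for whatever model-based routing the real supervisor uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical agent names; the plan only names the roles.
LIBRARIAN = "librarian"
ENGINEER = "engineer"


@dataclass
class Supervisor:
    """Orchestration layer: picks an agent, then delegates to it."""

    # Assumption: in a real system this would be a model-based classifier.
    classify: Callable[[str], str]
    agents: Dict[str, Callable[[str], str]]

    def route(self, message: str) -> str:
        target = self.classify(message)
        if target not in self.agents:
            raise ValueError(f"unknown agent: {target!r}")
        return self.agents[target](message)


def gateway(supervisor: Supervisor, message: str) -> str:
    """Public gateway: the only entry point a client talks to."""
    return supervisor.route(message)


def stub_classifier(message: str) -> str:
    # Placeholder decision so the sketch runs offline; it is not meant
    # to model how routing decisions should actually be made.
    return ENGINEER if message.endswith("!") else LIBRARIAN


supervisor = Supervisor(
    classify=stub_classifier,
    agents={
        LIBRARIAN: lambda m: f"[librarian] looked up: {m}",
        ENGINEER: lambda m: f"[engineer] working on: {m}",
    },
)
```

Keeping the classifier as an injected callable means the routing policy can be swapped without touching the gateway or the agents.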

## Data Sources (The "Knowledge Mesh")

- [ ] **Code**: Gitea (Repos, Markdown docs).
- [ ] **Notes**: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- [ ] **Wiki**: DokuWiki.
- [ ] **Inventory**: HomeBox (Physical gear, photos).
- [ ] **Tasks**: Vikunja.
- [ ] **Media**: Immich (Photos/Videos metadata via Gemini Vision).

## Agent Tooling & Orchestration

- [ ] **Orchestrators**: CAO CLI, Agent Pipe.
- [ ] **External Agents**: Goose, Aider, Opencode (Specialist).

# COMPONENT DETAILS

## The Librarian (DKA - LangGraph)

- Purpose: Semantic retrieval and data synthesis from vectors.
- Tools:
  - `query_chroma`: Search the vector database.
  - `fetch_media_link`: Returns a signed URL/path for Immich/HomeBox
    images.
- Constraints:
  - NO `bash` or `write` tools.
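
One way to enforce the constraint above is an allowlist that is checked when a tool is registered, so a `bash` or `write` tool can never reach the agent. The sketch below is an assumption about how that could look, not the plan's implementation: the tool names `query_chroma` and `fetch_media_link` come from the plan, while `ReadOnlyToolbox`, `ToolNotAllowed`, and the stub tool bodies are hypothetical.

```python
from typing import Callable, Dict

# Tool names taken from the plan; everything else here is hypothetical.
ALLOWED_TOOLS = frozenset({"query_chroma", "fetch_media_link"})


class ToolNotAllowed(Exception):
    """Raised for any tool outside the Librarian's allowlist."""


class ReadOnlyToolbox:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        # Reject at registration time, before the agent ever sees the tool.
        if name not in ALLOWED_TOOLS:
            raise ToolNotAllowed(f"{name!r} is not permitted for the Librarian")
        self._tools[name] = fn

    def call(self, name: str, **kwargs: object) -> object:
        if name not in self._tools:
            raise ToolNotAllowed(f"{name!r} is not registered")
        return self._tools[name](**kwargs)


toolbox = ReadOnlyToolbox()
# Stub bodies (assumptions); real versions would query ChromaDB and
# build links for Immich/HomeBox images.
toolbox.register("query_chroma", lambda query: [f"stub hit for {query}"])
toolbox.register("fetch_media_link", lambda asset_id: f"/media/{asset_id}")
```

An allowlist fails closed: a newly added tool is rejected until someone deliberately adds its name, which fits a quarantined agent better than a blocklist of known-dangerous tools.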

## The Ingestion Pipeline (Airflow/Custom Python)

- [ ] **Multi-Source Scrapers**: API-based (Gitea, Immich) and
  File-based (Obsidian).
- [ ] **Vision Integration**: Gemini analyzes Immich photos to create
  searchable text descriptions.
- [ ] **Storage**: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).
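
The plan stores hashes in PostgreSQL but does not say what they are for. Assuming they exist for change detection, a scraper run could hash each document and re-embed only the ones whose hash differs from the stored value. The sketch below uses a plain dict in place of the PostgreSQL table, and `content_hash` and `select_changed` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def content_hash(text: str) -> str:
    """SHA-256 hex digest of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_changed(
    documents: Dict[str, str],
    known_hashes: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (doc_id, digest) pairs for new or modified documents.

    `known_hashes` stands in for the metadata table (assumption);
    unchanged documents are skipped so they are not re-embedded.
    """
    changed = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if known_hashes.get(doc_id) != digest:
            changed.append((doc_id, digest))
    return changed
```

After the changed documents have been embedded, writing their new digests back to the store makes the next run skip them.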

# TODO LIST [0/4]

- [ ] Create 'knowledge_service' directory.
- [ ] Implement `test_rag.py` (Hello World retrieval).
- [ ] Build basic scraper for `hobbies.org`.
- [ ] Integrate DKA logic into the FastAPI Gateway.