Initial commit: Multi-service AI agent system

- Frontend: Vite + React + TypeScript chat interface
- Backend: FastAPI gateway with LangGraph routing
- Knowledge Service: ChromaDB RAG with Gitea scraper
- LangGraph Service: Multi-agent orchestration
- Airflow: Scheduled Gitea ingestion DAG
- Documentation: Complete plan and implementation guides

Architecture:
- Modular Docker Compose per service
- External ai-mesh network for communication
- Fast rebuilds with /app/packages pattern
- Intelligent agent routing (no hardcoded keywords)

Services:
- Frontend (5173): React chat UI
- Chat Gateway (8000): FastAPI entry point
- LangGraph (8090): Agent orchestration
- Knowledge (8080): ChromaDB RAG
- Airflow (8081): Scheduled ingestion
- PostgreSQL (5432): Chat history

Excludes: node_modules, .venv, chroma_db, logs, .env files
Includes: All source code, configs, docs, docker files
2026-02-27 19:51:06 +11:00
commit 628ba96998
44 changed files with 7177 additions and 0 deletions

52
.gitignore vendored Normal file

@@ -0,0 +1,52 @@
# Dependencies
node_modules/
.venv/
__pycache__/
*.pyc
# Build outputs
dist/
dist-ssr/
# Databases and vector stores
chroma_db/
*.sqlite3
*.db
# Logs
logs/
*.log
npm-debug.log*
yarn-debug.log*
pnpm-debug.log*
# Environment variables (secrets!)
.env
.env.local
.env.*.local
# IDE
.vscode/*
!.vscode/extensions.json
.idea/
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
# Airflow runtime
airflow/logs/
airflow/config/
airflow/plugins/
# Testing
.coverage
htmlcov/
.pytest_cache/
# Project management files (not code)
action.md
ideas.org
project_journal.org

107
README.md Normal file

@@ -0,0 +1,107 @@
# AboutMe AI Chat Demo
A comprehensive AI agent system with multi-service architecture for personal knowledge management and intelligent query responses.
## Architecture Overview
```
User Query → Chat Gateway → LangGraph Supervisor → [Librarian | Opencode | Brain]
Knowledge Service (ChromaDB) ← Airflow ← Gitea API
```
## Services
| Service | Port | Technology | Purpose |
|---------|------|------------|---------|
| Frontend | 5173 | Vite + React + TS | Chat UI |
| Chat Gateway | 8000 | FastAPI | API entry point |
| LangGraph | 8090 | FastAPI + LangGraph | Agent orchestration |
| Knowledge | 8080 | FastAPI + ChromaDB | RAG / Vector search |
| Airflow | 8081 | Apache Airflow | Scheduled ingestion |
| PostgreSQL | 5432 | Postgres 15 | Chat history |
## Quick Start
```bash
# 1. Ensure Docker network exists
docker network create ai-mesh
# 2. Start Knowledge Service
cd knowledge_service && docker-compose up -d
# 3. Start LangGraph Service
cd ../langgraph_service && docker-compose up -d
# 4. Start Chat Demo
cd ../aboutme_chat_demo && docker-compose up -d
# 5. Start Airflow (optional)
cd ../airflow && docker-compose up -d
```
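Once the stack is up, a quick smoke test confirms the services are wired together. A minimal sketch, assuming the default ports above (the gateway `/health` route ships in `backend/main.py.new`):

```bash
# Health checks for the orchestration and knowledge layers
curl http://localhost:8090/health   # LangGraph supervisor
curl http://localhost:8080/health   # Knowledge service

# Send a message through the chat gateway
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me about the Gitea repos"}'
```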
## Environment Variables
Create `.env` files in each service directory:
**knowledge_service/.env:**
```
OPENROUTER_API_KEY=your_key_here
GITEA_URL=https://gitea.lab.audasmedia.com.au
GITEA_TOKEN=your_token
GITEA_USERNAME=sam
```
**langgraph_service/.env:**
```
OPENCODE_PASSWORD=sam4jo
```
**airflow/.env:**
```
AIRFLOW_UID=1000
GITEA_TOKEN=your_token
```
## Project Structure
```
aboutme_chat_demo/
├── frontend/ # React chat interface
├── backend/ # FastAPI gateway (routes to LangGraph)
├── plan.md # Full project roadmap
└── code_1.md # Implementation guide
knowledge_service/
├── main.py # FastAPI + ChromaDB
├── gitea_scraper.py # Gitea API integration
└── docker-compose.yml
langgraph_service/
├── main.py # FastAPI entry point
├── supervisor_agent.py # LangGraph orchestration
└── docker-compose.yml
airflow/
├── dags/ # Workflow definitions
│ └── gitea_ingestion_dag.py
└── docker-compose.yml
```
## Technologies
- **Frontend:** Vite, React 19, TypeScript, Tailwind CSS, TanStack Query
- **Backend:** FastAPI, Python 3.11, httpx
- **AI/ML:** LangGraph, LangChain, ChromaDB, OpenRouter API
- **Orchestration:** Apache Airflow (CeleryExecutor)
- **Infrastructure:** Docker, Docker Compose
## Documentation
- `plan.md` - Complete project roadmap (7 phases)
- `code_1.md` - Modular implementation guide
- `code.md` - Legacy implementation reference
## License
MIT


@@ -0,0 +1,144 @@
"""
Airflow DAG for scheduled Gitea repository ingestion.
Runs daily to fetch new/updated repos and ingest into ChromaDB.
"""
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.http.operators.http import SimpleHttpOperator
import os
import sys
import json
# Add knowledge_service to path for imports
sys.path.insert(0, '/opt/airflow/dags/repo')
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'email_on_failure': False,
'email_on_retry': False,
'retries': 1,
'retry_delay': timedelta(minutes=5),
}
def fetch_gitea_repos(**context):
"""Task: Fetch all repositories from Gitea."""
from gitea_scraper import GiteaScraper
scraper = GiteaScraper(
base_url=os.getenv("GITEA_URL", "https://gitea.lab.audasmedia.com.au"),
token=os.getenv("GITEA_TOKEN", ""),
username=os.getenv("GITEA_USERNAME", "sam")
)
repos = scraper.get_user_repos()
# Push to XCom for downstream tasks
context['ti'].xcom_push(key='repo_count', value=len(repos))
context['ti'].xcom_push(key='repos', value=[
{
'name': r.name,
'description': r.description,
'url': r.url,
'updated_at': r.updated_at
}
for r in repos
])
return f"Fetched {len(repos)} repositories"
def fetch_readmes(**context):
"""Task: Fetch READMEs for all repositories."""
from gitea_scraper import GiteaScraper
ti = context['ti']
repos = ti.xcom_pull(task_ids='fetch_repos', key='repos')
scraper = GiteaScraper(
base_url=os.getenv("GITEA_URL", "https://gitea.lab.audasmedia.com.au"),
token=os.getenv("GITEA_TOKEN", ""),
username=os.getenv("GITEA_USERNAME", "sam")
)
readme_data = []
for repo in repos[:10]: # Limit to 10 repos per run for testing
readme = scraper.get_readme(repo['name'])
if readme:
readme_data.append({
'repo': repo['name'],
'content': readme[:5000], # First 5000 chars
'url': repo['url']
})
ti.xcom_push(key='readme_data', value=readme_data)
return f"Fetched {len(readme_data)} READMEs"
def ingest_to_chroma(**context):
"""Task: Ingest fetched data into ChromaDB via knowledge service."""
import httpx
ti = context['ti']
readme_data = ti.xcom_pull(task_ids='fetch_readmes', key='readme_data')
knowledge_service_url = os.getenv("KNOWLEDGE_SERVICE_URL", "http://knowledge-service:8080")
documents_ingested = 0
for item in readme_data:
try:
# Call knowledge service ingest endpoint
response = httpx.post(
f"{knowledge_service_url}/ingest",
json={
'source': f"gitea:{item['repo']}",
'content': item['content'],
'metadata': {
'repo': item['repo'],
'url': item['url'],
'type': 'readme'
}
},
timeout=30.0
)
if response.status_code == 200:
documents_ingested += 1
except Exception as e:
print(f"Error ingesting {item['repo']}: {e}")
return f"Ingested {documents_ingested} documents into ChromaDB"
# Define the DAG
with DAG(
'gitea_daily_ingestion',
default_args=default_args,
description='Daily ingestion of Gitea repositories into knowledge base',
schedule_interval=timedelta(days=1), # Run daily
start_date=datetime(2024, 1, 1),
catchup=False,
tags=['gitea', 'ingestion', 'knowledge'],
) as dag:
# Task 1: Fetch repository list
fetch_repos_task = PythonOperator(
task_id='fetch_repos',
python_callable=fetch_gitea_repos,
)
# Task 2: Fetch README content
fetch_readmes_task = PythonOperator(
task_id='fetch_readmes',
python_callable=fetch_readmes,
)
# Task 3: Ingest into ChromaDB
ingest_task = PythonOperator(
task_id='ingest_to_chroma',
python_callable=ingest_to_chroma,
)
# Define task dependencies
fetch_repos_task >> fetch_readmes_task >> ingest_task
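The tasks can also be exercised ad hoc rather than waiting for the daily schedule. A minimal sketch using the standard Airflow CLI, run from the `airflow/` directory against the scheduler container defined in its docker-compose.yml:

```bash
# Run a single task in isolation (no scheduler run required)
docker-compose exec airflow-scheduler \
  airflow tasks test gitea_daily_ingestion fetch_repos 2024-01-01

# Queue a full manual run of the whole DAG
docker-compose exec airflow-scheduler \
  airflow dags trigger gitea_daily_ingestion
```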


@@ -0,0 +1,121 @@
import os
import httpx
import logging
from typing import List, Dict, Optional
from dataclasses import dataclass
from datetime import datetime
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@dataclass
class RepoMetadata:
name: str
description: str
url: str
default_branch: str
updated_at: str
language: Optional[str]
class GiteaScraper:
def __init__(self, base_url: str, token: str, username: str = "sam"):
self.base_url = base_url.rstrip("/")
self.token = token
self.username = username
self.headers = {"Authorization": f"token {token}"}
def get_user_repos(self) -> List[RepoMetadata]:
"""Fetch all repositories for the user."""
repos = []
page = 1
while True:
url = f"{self.base_url}/api/v1/users/{self.username}/repos?page={page}&limit=50"
try:
response = httpx.get(url, headers=self.headers, timeout=30.0)
response.raise_for_status()
data = response.json()
if not data:
break
for repo in data:
repos.append(RepoMetadata(
name=repo["name"],
description=repo.get("description", ""),
url=repo["html_url"],
default_branch=repo["default_branch"],
updated_at=repo["updated_at"],
language=repo.get("language")
))
logger.info(f"Fetched page {page}, got {len(data)} repos")
page += 1
except Exception as e:
logger.error(f"Error fetching repos: {e}")
break
return repos
def get_readme(self, repo_name: str) -> str:
"""Fetch README content for a repository."""
# Try common README filenames
readme_names = ["README.md", "readme.md", "Readme.md", "README.rst"]
for readme_name in readme_names:
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/raw/{readme_name}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
if response.status_code == 200:
return response.text
except Exception as e:
logger.warning(f"Failed to fetch {readme_name}: {e}")
continue
return ""
def get_repo_files(self, repo_name: str, path: str = "") -> List[Dict]:
"""List files in a repository directory."""
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/contents/{path}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
response.raise_for_status()
return response.json()
except Exception as e:
logger.error(f"Error listing files in {repo_name}/{path}: {e}")
return []
def get_file_content(self, repo_name: str, filepath: str) -> str:
"""Fetch content of a specific file."""
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/raw/{filepath}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
if response.status_code == 200:
return response.text
except Exception as e:
logger.error(f"Error fetching file {filepath}: {e}")
return ""
# Test function
if __name__ == "__main__":
scraper = GiteaScraper(
base_url=os.getenv("GITEA_URL", "https://gitea.lab.audasmedia.com.au"),
token=os.getenv("GITEA_TOKEN", ""),
username=os.getenv("GITEA_USERNAME", "sam")
)
repos = scraper.get_user_repos()
print(f"Found {len(repos)} repositories")
for repo in repos[:3]: # Test with first 3
print(f"\nRepo: {repo.name}")
readme = scraper.get_readme(repo.name)
if readme:
print(f"README preview: {readme[:200]}...")

181
airflow/docker-compose.yml Normal file

@@ -0,0 +1,181 @@
version: '3.8'
x-airflow-common:
&airflow-common
image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.8.1}
environment:
&airflow-common-env
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
volumes:
- ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
- ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
- ${AIRFLOW_PROJ_DIR:-.}/config:/opt/airflow/config
- ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
user: "${AIRFLOW_UID:-50000}:0"
depends_on:
&airflow-common-depends-on
redis:
condition: service_healthy
postgres:
condition: service_healthy
services:
postgres:
image: postgres:13
environment:
POSTGRES_USER: airflow
POSTGRES_PASSWORD: airflow
POSTGRES_DB: airflow
volumes:
- postgres-db-volume:/var/lib/postgresql/data
healthcheck:
test: ["CMD", "pg_isready", "-U", "airflow"]
interval: 10s
retries: 5
start_period: 5s
restart: always
networks:
- ai-mesh
redis:
image: redis:latest
expose:
- 6379
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 30s
retries: 50
start_period: 30s
restart: always
networks:
- ai-mesh
airflow-webserver:
<<: *airflow-common
command: webserver
ports:
- "8081:8080"
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 5
start_period: 30s
restart: always
depends_on:
<<: *airflow-common-depends-on
airflow-init:
condition: service_completed_successfully
networks:
- ai-mesh
airflow-scheduler:
<<: *airflow-common
command: scheduler
healthcheck:
test: ["CMD", "curl", "--fail", "http://localhost:8974/health"]
interval: 30s
timeout: 10s
retries: 5
start_period: 30s
restart: always
depends_on:
<<: *airflow-common-depends-on
airflow-init:
condition: service_completed_successfully
networks:
- ai-mesh
airflow-worker:
<<: *airflow-common
command: celery worker
healthcheck:
test:
- "CMD-SHELL"
- 'celery --app airflow.providers.celery.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}" || celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
interval: 30s
timeout: 10s
retries: 5
start_period: 30s
restart: always
depends_on:
<<: *airflow-common-depends-on
airflow-init:
condition: service_completed_successfully
networks:
- ai-mesh
airflow-triggerer:
<<: *airflow-common
command: triggerer
healthcheck:
test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
interval: 30s
timeout: 10s
retries: 5
start_period: 30s
restart: always
depends_on:
<<: *airflow-common-depends-on
airflow-init:
condition: service_completed_successfully
networks:
- ai-mesh
airflow-init:
<<: *airflow-common
entrypoint: /bin/bash
command:
- -c
- |
if [[ -z "${AIRFLOW_UID}" ]]; then
echo "WARNING!!!: AIRFLOW_UID not set!"
echo "Using default UID: 50000"
export AIRFLOW_UID=50000
fi
mkdir -p /sources/logs /sources/dags /sources/plugins
chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
exec /entrypoint airflow version
environment:
<<: *airflow-common-env
_AIRFLOW_DB_MIGRATE: 'true'
_AIRFLOW_WWW_USER_CREATE: 'true'
_AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
_AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}
user: "0:0"
volumes:
- ${AIRFLOW_PROJ_DIR:-.}:/sources
networks:
- ai-mesh
airflow-cli:
<<: *airflow-common
profiles:
- debug
environment:
<<: *airflow-common-env
CONNECTION_CHECK_MAX_COUNT: "0"
command:
- bash
- -c
- airflow
networks:
- ai-mesh
volumes:
postgres-db-volume:
networks:
ai-mesh:
external: true
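A typical first run with this file follows the stock Airflow Compose pattern: create the external network if needed, set `AIRFLOW_UID` in `.env` (see the `airflow/.env` example in the README), run the one-off init container, then start the long-running services. A sketch:

```bash
docker network create ai-mesh || true   # no-op if the network already exists
docker-compose up airflow-init          # one-off: DB migration + default admin user
docker-compose up -d                    # webserver (8081), scheduler, worker, triggerer
```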

8
backend/Dockerfile Normal file

@@ -0,0 +1,8 @@
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y libpq-dev gcc
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]

58
backend/main.py Normal file

@@ -0,0 +1,58 @@
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import httpx
import logging
import sys
import traceback
logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s", handlers=[logging.StreamHandler(sys.stdout)])
logger = logging.getLogger(__name__)
app = FastAPI()
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"])
class MessageRequest(BaseModel):
message: str
BRAIN_URL = "http://opencode-brain:5000"
KNOWLEDGE_URL = "http://knowledge-service:8080/query"
AUTH = httpx.BasicAuth("opencode", "sam4jo")
@app.post("/chat")
async def chat(request: MessageRequest):
user_msg = request.message.lower()
timeout_long = httpx.Timeout(180.0, connect=10.0)
timeout_short = httpx.Timeout(5.0, connect=2.0)
context = ""
# Check for keywords to trigger Librarian (DB) lookup
if any(kw in user_msg for kw in ["sam", "hobby", "music", "guitar", "skiing", "experience"]):
logger.info("Gateway: Consulting Librarian (DB)...")
async with httpx.AsyncClient(timeout=timeout_short) as client:
try:
k_res = await client.post(KNOWLEDGE_URL, json={"question": request.message})
if k_res.status_code == 200:
context = k_res.json().get("context", "")
except Exception as e:
logger.warning(f"Gateway: Librarian offline/slow: {str(e)}")
# Forward to Brain (LLM)
async with httpx.AsyncClient(auth=AUTH, timeout=timeout_long) as brain_client:
try:
session_res = await brain_client.post(f"{BRAIN_URL}/session", json={"title": "Demo"})
session_id = session_res.json()["id"]
final_prompt = f"CONTEXT:\n{context}\n\nUSER: {request.message}" if context else request.message
response = await brain_client.post(f"{BRAIN_URL}/session/{session_id}/message", json={"parts": [{"type": "text", "text": final_prompt}]})
# FIX: Iterate through parts array to find text response
data = response.json()
if "parts" in data:
for part in data["parts"]:
if part.get("type") == "text" and "text" in part:
return {"response": part["text"]}
return {"response": "AI responded but no text found in expected format."}
except Exception:
logger.error(f"Gateway: Brain failure: {traceback.format_exc()}")
return {"response": "Error: The Brain is taking too long or is disconnected."}

49
backend/main.py.new Normal file

@@ -0,0 +1,49 @@
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import httpx
import logging
import sys
import traceback
import os
logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s", handlers=[logging.StreamHandler(sys.stdout)])
logger = logging.getLogger(__name__)
app = FastAPI()
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"])
class MessageRequest(BaseModel):
message: str
LANGGRAPH_URL = os.getenv("LANGGRAPH_URL", "http://langgraph-service:8090")
@app.post("/chat")
async def chat(request: MessageRequest):
"""Updated chat endpoint that routes through LangGraph Supervisor."""
logger.info(f"Gateway: Received message: {request.message}")
try:
# Call LangGraph Supervisor instead of direct brain
async with httpx.AsyncClient(timeout=httpx.Timeout(60.0, connect=10.0)) as client:
response = await client.post(
f"{LANGGRAPH_URL}/query",
json={"query": request.message}
)
if response.status_code == 200:
result = response.json()
logger.info(f"Gateway: Response from {result.get('agent_used', 'unknown')} agent")
return {"response": result["response"]}
else:
logger.error(f"Gateway: LangGraph error {response.status_code}")
return {"response": "Error: Orchestration service unavailable"}
except Exception as e:
logger.error(f"Gateway: Error routing through LangGraph: {traceback.format_exc()}")
return {"response": "Error: Unable to process your request at this time."}
@app.get("/health")
async def health():
return {"status": "healthy", "service": "chat-gateway"}

8
backend/requirements.txt Normal file

@@ -0,0 +1,8 @@
fastapi
uvicorn
sqlalchemy
psycopg2-binary
pydantic
httpx
pytest
pytest-asyncio


@@ -0,0 +1,79 @@
import pytest
from fastapi.testclient import TestClient
from main import app
import httpx
from unittest.mock import AsyncMock, patch
client = TestClient(app)
@pytest.mark.asyncio
async def test_chat_general_query():
"""Test that a general query (no personal keywords) skips the Librarian."""
with patch("httpx.AsyncClient.post", new_callable=AsyncMock) as mock_post:
# Mock Brain response
mock_response = AsyncMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"info": {"id": "msg_123"},
"parts": [{"type": "text", "text": "I am a general AI."}]
}
# First call is for session creation, second for message
mock_post.side_effect = [AsyncMock(status_code=200, json=lambda: {"id": "ses_123"}), mock_response]
response = client.post("/chat", json={"message": "What is 2+2?"})
assert response.status_code == 200
assert response.json()["response"] == "I am a general AI."
# Verify Librarian (knowledge-service) was NOT called
# The knowledge service URL is http://knowledge-service:8080/query
calls = [call.args[0] for call in mock_post.call_args_list]
assert not any("knowledge-service" in url for url in calls)
@pytest.mark.asyncio
async def test_chat_personal_query_success():
"""Test that a personal query calls the Librarian and injects context."""
with patch("httpx.AsyncClient.post", new_callable=AsyncMock) as mock_post:
# 1. Mock Librarian Response
mock_k_res = AsyncMock()
mock_k_res.status_code = 200
mock_k_res.json.return_value = {"context": "Sam likes red guitars."}
# 2. Mock Brain Session Response
mock_s_res = AsyncMock()
mock_s_res.status_code = 200
mock_s_res.json.return_value = {"id": "ses_123"}
# 3. Mock Brain Message Response
mock_b_res = AsyncMock()
mock_b_res.status_code = 200
mock_b_res.json.return_value = {
"parts": [{"type": "text", "text": "I see Sam likes red guitars."}]
}
mock_post.side_effect = [mock_k_res, mock_s_res, mock_b_res]
response = client.post("/chat", json={"message": "Tell me about Sam's music"})
assert response.status_code == 200
assert "red guitars" in response.json()["response"]
# Verify Librarian was called
calls = [call.args[0] for call in mock_post.call_args_list]
assert any("knowledge-service" in url for url in calls)
@pytest.mark.asyncio
async def test_chat_librarian_timeout_failover():
"""Test that the gateway fails over instantly (5s) if Librarian is slow."""
with patch("httpx.AsyncClient.post", new_callable=AsyncMock) as mock_post:
# Mock Librarian Timeout
mock_post.side_effect = [
httpx.TimeoutException("Timeout"), # Librarian call
AsyncMock(status_code=200, json=lambda: {"id": "ses_123"}), # Brain Session
AsyncMock(status_code=200, json=lambda: {"parts": [{"type": "text", "text": "Direct Brain Response"}]}) # Brain Msg
]
response = client.post("/chat", json={"message": "Sam's hobbies?"})
assert response.status_code == 200
assert response.json()["response"] == "Direct Brain Response"
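These tests target the original keyword-routing gateway in `backend/main.py`. A sketch of running them locally, using the pytest and pytest-asyncio dependencies already listed in `backend/requirements.txt`:

```bash
cd backend
pip install -r requirements.txt
pytest -v
```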

1107
code.md Normal file

File diff suppressed because it is too large

1128
code_1.md Normal file

File diff suppressed because it is too large

41
docker-compose.yml Normal file

@@ -0,0 +1,41 @@
services:
db:
image: postgres:15-alpine
environment:
POSTGRES_USER: sam
POSTGRES_PASSWORD: sam4jo
POSTGRES_DB: chat_demo
ports:
- "5432:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
networks:
- ai-mesh
backend:
build: ./backend
ports:
- "8000:8000"
environment:
DATABASE_URL: postgresql://sam:sam4jo@db:5432/chat_demo
volumes:
- ./backend:/app
depends_on:
- db
networks:
- ai-mesh
frontend:
build: ./frontend
ports:
- "5173:5173"
volumes:
- ./frontend:/app
- /app/node_modules
environment:
- CHOKIDAR_USEPOLLING=true
networks:
- ai-mesh
volumes:
postgres_data:
networks:
ai-mesh:
external: true

24
frontend/.gitignore vendored Normal file

@@ -0,0 +1,24 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
dist-ssr
*.local
# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?

7
frontend/Dockerfile Normal file

@@ -0,0 +1,7 @@
FROM node:20-alpine
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install
COPY . .
CMD ["pnpm", "run", "dev", "--host", "0.0.0.0"]

73
frontend/README.md Normal file

@@ -0,0 +1,73 @@
# React + TypeScript + Vite
This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
Currently, two official plugins are available:
- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Babel](https://babeljs.io/) (or [oxc](https://oxc.rs) when used in [rolldown-vite](https://vite.dev/guide/rolldown)) for Fast Refresh
- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/) for Fast Refresh
## React Compiler
The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
## Expanding the ESLint configuration
If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:
```js
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Remove tseslint.configs.recommended and replace with this
tseslint.configs.recommendedTypeChecked,
// Alternatively, use this for stricter rules
tseslint.configs.strictTypeChecked,
// Optionally, add this for stylistic rules
tseslint.configs.stylisticTypeChecked,
// Other configs...
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```
You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:
```js
// eslint.config.js
import reactX from 'eslint-plugin-react-x'
import reactDom from 'eslint-plugin-react-dom'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Enable lint rules for React
reactX.configs['recommended-typescript'],
// Enable lint rules for React DOM
reactDom.configs.recommended,
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```

23
frontend/eslint.config.js Normal file

@@ -0,0 +1,23 @@
import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
js.configs.recommended,
tseslint.configs.recommended,
reactHooks.configs.flat.recommended,
reactRefresh.configs.vite,
],
languageOptions: {
ecmaVersion: 2020,
globals: globals.browser,
},
},
])

13
frontend/index.html Normal file

@@ -0,0 +1,13 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>frontend</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>

36
frontend/package.json Normal file

@@ -0,0 +1,36 @@
{
"name": "frontend",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"preview": "vite preview"
},
"dependencies": {
"@tanstack/react-query": "^5.90.21",
"axios": "^1.13.5",
"react": "^19.2.0",
"react-dom": "^19.2.0"
},
"devDependencies": {
"@eslint/js": "^9.39.1",
"@tailwindcss/vite": "^4.2.0",
"@types/node": "^24.10.1",
"@types/react": "^19.2.7",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^5.1.1",
"autoprefixer": "^10.4.24",
"eslint": "^9.39.1",
"eslint-plugin-react-hooks": "^7.0.1",
"eslint-plugin-react-refresh": "^0.4.24",
"globals": "^16.5.0",
"postcss": "^8.5.6",
"tailwindcss": "^4.2.0",
"typescript": "~5.9.3",
"typescript-eslint": "^8.48.0",
"vite": "^7.3.1"
}
}

2634
frontend/pnpm-lock.yaml generated Normal file

File diff suppressed because it is too large

1
frontend/public/vite.svg Normal file

@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="31.88" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 257"><defs><linearGradient id="IconifyId1813088fe1fbc01fb466" x1="-.828%" x2="57.636%" y1="7.652%" y2="78.411%"><stop offset="0%" stop-color="#41D1FF"></stop><stop offset="100%" stop-color="#BD34FE"></stop></linearGradient><linearGradient id="IconifyId1813088fe1fbc01fb467" x1="43.376%" x2="50.316%" y1="2.242%" y2="89.03%"><stop offset="0%" stop-color="#FFEA83"></stop><stop offset="8.333%" stop-color="#FFDD35"></stop><stop offset="100%" stop-color="#FFA800"></stop></linearGradient></defs><path fill="url(#IconifyId1813088fe1fbc01fb466)" d="M255.153 37.938L134.897 252.976c-2.483 4.44-8.862 4.466-11.382.048L.875 37.958c-2.746-4.814 1.371-10.646 6.827-9.67l120.385 21.517a6.537 6.537 0 0 0 2.322-.004l117.867-21.483c5.438-.991 9.574 4.796 6.877 9.62Z"></path><path fill="url(#IconifyId1813088fe1fbc01fb467)" d="M185.432.063L96.44 17.501a3.268 3.268 0 0 0-2.634 3.014l-5.474 92.456a3.268 3.268 0 0 0 3.997 3.378l24.777-5.718c2.318-.535 4.413 1.507 3.936 3.838l-7.361 36.047c-.495 2.426 1.782 4.5 4.151 3.78l15.304-4.649c2.372-.72 4.652 1.36 4.15 3.788l-11.698 56.621c-.732 3.542 3.979 5.473 5.943 2.437l1.313-2.028l72.516-144.72c1.215-2.423-.88-5.186-3.54-4.672l-25.505 4.922c-2.396.462-4.435-1.77-3.759-4.114l16.646-57.705c.677-2.35-1.37-4.583-3.769-4.113Z"></path></svg>


42
frontend/src/App.css Normal file

@@ -0,0 +1,42 @@
#root {
max-width: 1280px;
margin: 0 auto;
padding: 2rem;
text-align: center;
}
.logo {
height: 6em;
padding: 1.5em;
will-change: filter;
transition: filter 300ms;
}
.logo:hover {
filter: drop-shadow(0 0 2em #646cffaa);
}
.logo.react:hover {
filter: drop-shadow(0 0 2em #61dafbaa);
}
@keyframes logo-spin {
from {
transform: rotate(0deg);
}
to {
transform: rotate(360deg);
}
}
@media (prefers-reduced-motion: no-preference) {
a:nth-of-type(2) .logo {
animation: logo-spin infinite 20s linear;
}
}
.card {
padding: 2em;
}
.read-the-docs {
color: #888;
}

17
frontend/src/App.tsx Normal file

@@ -0,0 +1,17 @@
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import ChatInterface from './components/ChatInterface'
const queryClient = new QueryClient()
function App() {
return (
<QueryClientProvider client={queryClient}>
<div className="min-h-screen bg-gray-950 p-4 md:p-8">
<ChatInterface />
</div>
</QueryClientProvider>
)
}
export default App


@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="35.93" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 228"><path fill="#00D8FF" d="M210.483 73.824a171.49 171.49 0 0 0-8.24-2.597c.465-1.9.893-3.777 1.273-5.621c6.238-30.281 2.16-54.676-11.769-62.708c-13.355-7.7-35.196.329-57.254 19.526a171.23 171.23 0 0 0-6.375 5.848a155.866 155.866 0 0 0-4.241-3.917C100.759 3.829 77.587-4.822 63.673 3.233C50.33 10.957 46.379 33.89 51.995 62.588a170.974 170.974 0 0 0 1.892 8.48c-3.28.932-6.445 1.924-9.474 2.98C17.309 83.498 0 98.307 0 113.668c0 15.865 18.582 31.778 46.812 41.427a145.52 145.52 0 0 0 6.921 2.165a167.467 167.467 0 0 0-2.01 9.138c-5.354 28.2-1.173 50.591 12.134 58.266c13.744 7.926 36.812-.22 59.273-19.855a145.567 145.567 0 0 0 5.342-4.923a168.064 168.064 0 0 0 6.92 6.314c21.758 18.722 43.246 26.282 56.54 18.586c13.731-7.949 18.194-32.003 12.4-61.268a145.016 145.016 0 0 0-1.535-6.842c1.62-.48 3.21-.974 4.76-1.488c29.348-9.723 48.443-25.443 48.443-41.52c0-15.417-17.868-30.326-45.517-39.844Zm-6.365 70.984c-1.4.463-2.836.91-4.3 1.345c-3.24-10.257-7.612-21.163-12.963-32.432c5.106-11 9.31-21.767 12.459-31.957c2.619.758 5.16 1.557 7.61 2.4c23.69 8.156 38.14 20.213 38.14 29.504c0 9.896-15.606 22.743-40.946 31.14Zm-10.514 20.834c2.562 12.94 2.927 24.64 1.23 33.787c-1.524 8.219-4.59 13.698-8.382 15.893c-8.067 4.67-25.32-1.4-43.927-17.412a156.726 156.726 0 0 1-6.437-5.87c7.214-7.889 14.423-17.06 21.459-27.246c12.376-1.098 24.068-2.894 34.671-5.345a134.17 134.17 0 0 1 1.386 6.193ZM87.276 214.515c-7.882 2.783-14.16 2.863-17.955.675c-8.075-4.657-11.432-22.636-6.853-46.752a156.923 156.923 0 0 1 1.869-8.499c10.486 2.32 22.093 3.988 34.498 4.994c7.084 9.967 14.501 19.128 21.976 27.15a134.668 134.668 0 0 1-4.877 4.492c-9.933 8.682-19.886 14.842-28.658 17.94ZM50.35 144.747c-12.483-4.267-22.792-9.812-29.858-15.863c-6.35-5.437-9.555-10.836-9.555-15.216c0-9.322 13.897-21.212 37.076-29.293c2.813-.98 5.757-1.905 8.812-2.773c3.204 10.42 7.406 21.315 12.477 32.332c-5.137 11.18-9.399 22.249-12.634 32.792a134.718 134.718 0 0 1-6.318-1.979Zm12.378-84.26c-4.811-24.587-1.616-43.134 6.425-47.789c8.564-4.958 27.502 2.111 47.463 19.835a144.318 144.318 0 0 1 3.841 3.545c-7.438 7.987-14.787 17.08-21.808 26.988c-12.04 1.116-23.565 2.908-34.161 5.309a160.342 160.342 0 0 1-1.76-7.887Zm110.427 27.268a347.8 347.8 0 0 0-7.785-12.803c8.168 1.033 15.994 2.404 23.343 4.08c-2.206 7.072-4.956 14.465-8.193 22.045a381.151 381.151 0 0 0-7.365-13.322Zm-45.032-43.861c5.044 5.465 10.096 11.566 15.065 18.186a322.04 322.04 0 0 0-30.257-.006c4.974-6.559 10.069-12.652 15.192-18.18ZM82.802 87.83a323.167 323.167 0 0 0-7.227 13.238c-3.184-7.553-5.909-14.98-8.134-22.152c7.304-1.634 15.093-2.97 23.209-3.984a321.524 321.524 0 0 0-7.848 12.897Zm8.081 65.352c-8.385-.936-16.291-2.203-23.593-3.793c2.26-7.3 5.045-14.885 8.298-22.6a321.187 321.187 0 0 0 7.257 13.246c2.594 4.48 5.28 8.868 8.038 13.147Zm37.542 31.03c-5.184-5.592-10.354-11.779-15.403-18.433c4.902.192 9.899.29 14.978.29c5.218 0 10.376-.117 15.453-.343c-4.985 6.774-10.018 12.97-15.028 18.486Zm52.198-57.817c3.422 7.8 6.306 15.345 8.596 22.52c-7.422 1.694-15.436 3.058-23.88 4.071a382.417 382.417 0 0 0 7.859-13.026a347.403 347.403 0 0 0 7.425-13.565Zm-16.898 8.101a358.557 358.557 0 0 1-12.281 19.815a329.4 329.4 0 0 1-23.444.823c-7.967 0-15.716-.248-23.178-.732a310.202 310.202 0 0 1-12.513-19.846h.001a307.41 307.41 0 0 1-10.923-20.627a310.278 310.278 0 0 1 10.89-20.637l-.001.001a307.318 
307.318 0 0 1 12.413-19.761c7.613-.576 15.42-.876 23.31-.876H128c7.926 0 15.743.303 23.354.883a329.357 329.357 0 0 1 12.335 19.695a358.489 358.489 0 0 1 11.036 20.54a329.472 329.472 0 0 1-11 20.722Zm22.56-122.124c8.572 4.944 11.906 24.881 6.52 51.026c-.344 1.668-.73 3.367-1.15 5.09c-10.622-2.452-22.155-4.275-34.23-5.408c-7.034-10.017-14.323-19.124-21.64-27.008a160.789 160.789 0 0 1 5.888-5.4c18.9-16.447 36.564-22.941 44.612-18.3ZM128 90.808c12.625 0 22.86 10.235 22.86 22.86s-10.235 22.86-22.86 22.86s-22.86-10.235-22.86-22.86s10.235-22.86 22.86-22.86Z"></path></svg>



@@ -0,0 +1,125 @@
import { useState, useRef, useEffect } from 'react';
import { useMutation } from '@tanstack/react-query';
import axios from 'axios';
type Message = {
id: string;
text: string;
sender: 'user' | 'ai';
};
export default function ChatInterface() {
const [messages, setMessages] = useState<Message[]>([]);
const [input, setInput] = useState('');
const messagesEndRef = useRef<HTMLDivElement>(null);
const scrollToBottom = () => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
};
useEffect(() => {
scrollToBottom();
}, [messages]);
const chatMutation = useMutation({
mutationFn: async (messageText: string) => {
const response = await axios.post('http://localhost:8000/chat', {
message: messageText,
});
return response.data;
},
onSuccess: (data) => {
setMessages((prev) => [
...prev,
{ id: Date.now().toString(), text: data.response, sender: 'ai' },
]);
},
onError: () => {
setMessages((prev) => [
...prev,
{ id: Date.now().toString(), text: "Error: Could not connect to the backend.", sender: 'ai' },
]);
},
});
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
if (!input.trim() || chatMutation.isPending) return;
const userMessage = input.trim();
setInput('');
setMessages((prev) => [
...prev,
{ id: Date.now().toString(), text: userMessage, sender: 'user' },
]);
chatMutation.mutate(userMessage);
};
return (
<div className="flex flex-col h-full max-w-2xl mx-auto border border-gray-700 rounded-lg overflow-hidden bg-gray-800 shadow-xl mt-8">
{/* Header */}
<div className="bg-gray-900 p-4 border-b border-gray-700">
<h2 className="text-xl font-semibold text-white">Sam Rolfe - AI</h2>
<p className="text-sm text-gray-400">Ask about skills, experience, hobbies</p>
</div>
{/* Message Area */}
<div className="flex-1 overflow-y-auto p-4 space-y-4 min-h-[400px]">
{messages.length === 0 && (
<div className="text-center text-gray-500 mt-10">
Send a message to start the conversation!
</div>
)}
{messages.map((msg) => (
<div
key={msg.id}
className={`flex ${msg.sender === 'user' ? 'justify-end' : 'justify-start'}`}
>
<div
className={`max-w-[80%] rounded-2xl px-4 py-2 ${
msg.sender === 'user'
? 'bg-blue-600 text-white rounded-tr-none'
: 'bg-gray-700 text-gray-100 rounded-tl-none'
}`}
>
{msg.text}
</div>
</div>
))}
{chatMutation.isPending && (
<div className="flex justify-start">
<div className="bg-gray-700 text-gray-400 rounded-2xl rounded-tl-none px-4 py-2 flex space-x-2 items-center">
<div className="w-2 h-2 bg-gray-500 rounded-full animate-bounce" />
<div className="w-2 h-2 bg-gray-500 rounded-full animate-bounce [animation-delay:0.2s]" />
<div className="w-2 h-2 bg-gray-500 rounded-full animate-bounce [animation-delay:0.4s]" />
</div>
</div>
)}
<div ref={messagesEndRef} />
</div>
{/* Input Area */}
<form onSubmit={handleSubmit} className="p-4 bg-gray-900 border-t border-gray-700">
<div className="flex space-x-2">
<input
type="text"
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Type your message..."
className="flex-1 bg-gray-800 text-white border border-gray-700 rounded-lg px-4 py-2 focus:outline-none focus:border-blue-500 transition-colors"
disabled={chatMutation.isPending}
/>
<button
type="submit"
disabled={!input.trim() || chatMutation.isPending}
className="bg-blue-600 hover:bg-blue-700 disabled:bg-blue-800 disabled:text-gray-400 text-white px-6 py-2 rounded-lg font-medium transition-colors"
>
Send
</button>
</div>
</form>
</div>
);
}

5
frontend/src/index.css Normal file

@@ -0,0 +1,5 @@
@import "tailwindcss";
@theme {
--font-sans: system-ui, Avenir, Helvetica, Arial, sans-serif;
}

10
frontend/src/main.tsx Normal file

@@ -0,0 +1,10 @@
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App.tsx'
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
)


@@ -0,0 +1,28 @@
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
"target": "ES2022",
"useDefineForClassFields": true,
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"module": "ESNext",
"types": ["vite/client"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
"jsx": "react-jsx",
/* Linting */
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true,
"noUncheckedSideEffectImports": true
},
"include": ["src"]
}

7
frontend/tsconfig.json Normal file

@@ -0,0 +1,7 @@
{
"files": [],
"references": [
{ "path": "./tsconfig.app.json" },
{ "path": "./tsconfig.node.json" }
]
}


@@ -0,0 +1,26 @@
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
"target": "ES2023",
"lib": ["ES2023"],
"module": "ESNext",
"types": ["node"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
/* Linting */
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true,
"noUncheckedSideEffectImports": true
},
"include": ["vite.config.ts"]
}

11
frontend/vite.config.ts Normal file

@@ -0,0 +1,11 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
// https://vite.dev/config/
export default defineConfig({
plugins: [
tailwindcss(),
react()
],
})


@@ -0,0 +1,29 @@
FROM python:3.11-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
libstdc++6 \
gcc \
g++ \
&& rm -rf /var/lib/apt/lists/*
# Create directories
RUN mkdir -p /app/packages /app/code
# Install Python packages to a specific location
WORKDIR /app
COPY requirements.txt .
RUN pip install --target=/app/packages -r requirements.txt
# Copy initial code (will be overridden by volume mount in dev)
COPY . /app/code/
# Set Python to find packages in /app/packages
ENV PYTHONPATH=/app/packages
ENV PYTHONUNBUFFERED=1
WORKDIR /app/code
EXPOSE 8080
CMD ["python3", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]


@@ -0,0 +1,15 @@
# Sam's Hobbies
## Music
- Enjoys playing guitar and synthesizers.
- Collects vintage vinyl.
## Gardening
- Maintains a local vegetable patch.
- Focuses on organic heirloom tomatoes.
## Skiing
- Advanced skier, prefers off-piste and backcountry in the Alps.
## Art
- Digital illustration and oil painting.


@@ -0,0 +1,24 @@
services:
knowledge-service:
build: .
image: sam/knowledge-service:latest
container_name: knowledge-service
ports:
- "8080:8080"
volumes:
# Only mount the code directory, not packages
- ./data:/app/code/data
- ./chroma_db:/app/code/chroma_db
- ./main.py:/app/code/main.py:ro # Read-only mount for safety
environment:
- PYTHONUNBUFFERED=1
- OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
- PYTHONPATH=/app/packages
networks:
- ai-mesh
restart: unless-stopped
networks:
ai-mesh:
external: true


@@ -0,0 +1,121 @@
import os
import httpx
import logging
from typing import List, Dict, Optional
from dataclasses import dataclass
from datetime import datetime
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@dataclass
class RepoMetadata:
name: str
description: str
url: str
default_branch: str
updated_at: str
language: Optional[str]
class GiteaScraper:
def __init__(self, base_url: str, token: str, username: str = "sam"):
self.base_url = base_url.rstrip("/")
self.token = token
self.username = username
self.headers = {"Authorization": f"token {token}"}
def get_user_repos(self) -> List[RepoMetadata]:
"""Fetch all repositories for the user."""
repos = []
page = 1
while True:
url = f"{self.base_url}/api/v1/users/{self.username}/repos?page={page}&limit=50"
try:
response = httpx.get(url, headers=self.headers, timeout=30.0)
response.raise_for_status()
data = response.json()
if not data:
break
for repo in data:
repos.append(RepoMetadata(
name=repo["name"],
description=repo.get("description", ""),
url=repo["html_url"],
default_branch=repo["default_branch"],
updated_at=repo["updated_at"],
language=repo.get("language")
))
logger.info(f"Fetched page {page}, got {len(data)} repos")
page += 1
except Exception as e:
logger.error(f"Error fetching repos: {e}")
break
return repos
def get_readme(self, repo_name: str) -> str:
"""Fetch README content for a repository."""
# Try common README filenames
readme_names = ["README.md", "readme.md", "Readme.md", "README.rst"]
for readme_name in readme_names:
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/raw/{readme_name}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
if response.status_code == 200:
return response.text
except Exception as e:
logger.warning(f"Failed to fetch {readme_name}: {e}")
continue
return ""
def get_repo_files(self, repo_name: str, path: str = "") -> List[Dict]:
"""List files in a repository directory."""
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/contents/{path}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
response.raise_for_status()
return response.json()
except Exception as e:
logger.error(f"Error listing files in {repo_name}/{path}: {e}")
return []
def get_file_content(self, repo_name: str, filepath: str) -> str:
"""Fetch content of a specific file."""
url = f"{self.base_url}/api/v1/repos/{self.username}/{repo_name}/raw/{filepath}"
try:
response = httpx.get(url, headers=self.headers, timeout=10.0)
if response.status_code == 200:
return response.text
except Exception as e:
logger.error(f"Error fetching file {filepath}: {e}")
return ""
# Test function
if __name__ == "__main__":
scraper = GiteaScraper(
base_url=os.getenv("GITEA_URL", "https://gitea.lab.audasmedia.com.au"),
token=os.getenv("GITEA_TOKEN", ""),
username=os.getenv("GITEA_USERNAME", "sam")
)
repos = scraper.get_user_repos()
print(f"Found {len(repos)} repositories")
for repo in repos[:3]: # Test with first 3
print(f"\nRepo: {repo.name}")
readme = scraper.get_readme(repo.name)
if readme:
print(f"README preview: {readme[:200]}...")


@@ -0,0 +1,56 @@
# GOAL
Build a "Deep Knowledge Agent" (DKA) that acts as a secure,
quarantined bridge between the Chat Gateway and private data sources.
# ARCHITECTURE OVERVIEW
## Layers
1. Public Gateway: FastAPI (The "Voice").
2. Orchestration Layer: LangGraph Supervisor (The "Router").
3. Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
- Strictly Read-Only.
- Accesses ChromaDB and Media stores.
4. Specialist Agent: Opencode (The "Engineer").
## Data Sources (The "Knowledge Mesh")
- [ ] **Code**: Gitea (Repos, Markdown docs).
- [ ] **Notes**: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- [ ] **Wiki**: DokuWiki.
- [ ] **Inventory**: HomeBox (Physical gear, photos).
- [ ] **Tasks**: Vikunja.
- [ ] **Media**: Immich (Photos/Videos metadata via Gemini Vision).
## Agent Tooling & Orchestration
- [ ] **Orchestrators**: CAO CLI, Agent Pipe.
- [ ] **External Agents**: Goose, Aider, Opencode (Specialist).
# COMPONENT DETAILS
## The Librarian (DKA - LangGraph)
- Purpose: Semantic retrieval and data synthesis from vectors.
- Tools:
- `query_chroma`: Search the vector database.
- `fetch_media_link`: Returns a signed URL/path for Immich/HomeBox
images.
- Constraints:
- NO `bash` or `write` tools.
## The Ingestion Pipeline (Airflow/Custom Python)
- [ ] **Multi-Source Scrapers**: API-based (Gitea, Immich) and
File-based (Obsidian).
- [ ] **Vision Integration**: Gemini analyzes Immich photos to create
searchable text descriptions.
- [ ] **Storage**: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).
# TODO LIST [0/4]
- [ ] Create 'knowledge_service' directory.
- [ ] Implement `test_rag.py` (Hello World retrieval).
- [ ] Build basic scraper for `hobbies.org`.
- [ ] Integrate DKA logic into the FastAPI Gateway.


@@ -0,0 +1,47 @@
#+TITLE: Phase 3: Knowledge Engine & Agent Orchestration
#+AUTHOR: Giordano (via opencode)
#+OPTIONS: toc:2
* GOAL
Build a "Deep Knowledge Agent" (DKA) that acts as a secure, quarantined bridge between the Chat Gateway and private data sources.
* ARCHITECTURE OVERVIEW
** Layers
1. Public Gateway: FastAPI (The "Voice").
2. Orchestration Layer: LangGraph Supervisor (The "Router").
3. Quarantined Agent: DKA / Librarian (The "Keeper of Secrets").
- Strictly Read-Only.
- Accesses ChromaDB and Media stores.
4. Specialist Agent: Opencode (The "Engineer").
** Data Sources (The "Knowledge Mesh")
- [ ] *Code*: Gitea (Repos, Markdown docs).
- [ ] *Notes*: Trilium Next, Obsidian, Flatnotes, HedgeDoc.
- [ ] *Wiki*: DokuWiki.
- [ ] *Inventory*: HomeBox (Physical gear, photos).
- [ ] *Tasks*: Vikunja.
- [ ] *Media*: Immich (Photos/Videos metadata via Gemini Vision).
** Agent Tooling & Orchestration
- [ ] *Orchestrators*: CAO CLI, Agent Pipe.
- [ ] *External Agents*: Goose, Aider, Opencode (Specialist).
* COMPONENT DETAILS
** The Librarian (DKA - LangGraph)
- Purpose: Semantic retrieval and data synthesis from vectors.
- Tools:
- ~query_chroma~: Search the vector database.
- ~fetch_media_link~: Returns a signed URL/path for Immich/HomeBox images.
- Constraints:
- NO ~bash~ or ~write~ tools.
** The Ingestion Pipeline (Airflow/Custom Python)
- [ ] *Multi-Source Scrapers*: API-based (Gitea, Immich) and File-based (Obsidian).
- [ ] *Vision Integration*: Gemini analyzes Immich photos to create searchable text descriptions.
- [ ] *Storage*: ChromaDB (Vectors) + PostgreSQL (Metadata/Hashes).
* TODO LIST [0/4]
- [ ] Create 'knowledge_service' directory.
- [ ] Implement ~test_rag.py~ (Hello World retrieval).
- [ ] Build basic scraper for ~hobbies.org~.
- [ ] Integrate DKA logic into the FastAPI Gateway.

52
knowledge_service/main.py Normal file

@@ -0,0 +1,52 @@
from fastapi import FastAPI
from pydantic import BaseModel
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger(__name__)
app = FastAPI()
vector_db = None
# OpenAI text-embedding-3-small embeddings, requested via the OpenRouter API
embeddings = OpenAIEmbeddings(
model="openai/text-embedding-3-small",
openai_api_base="https://openrouter.ai/api/v1",
openai_api_key=os.getenv("OPENROUTER_API_KEY")
)
@app.on_event("startup")
async def startup_event():
global vector_db
data_path = "./data/hobbies.md"
if os.path.exists(data_path):
try:
loader = TextLoader(data_path)
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
vector_db = Chroma.from_documents(documents=chunks, embedding=embeddings, persist_directory="./chroma_db")
logger.info("Librarian: ChromaDB is loaded with OpenAI embeddings.")
except Exception as e:
logger.error(f"Librarian: DB error: {str(e)}")
else:
logger.warning(f"Librarian: Missing data file at {data_path}")
@app.get("/health")
async def health():
return {"status": "ready", "vectors_loaded": vector_db is not None}
class QueryRequest(BaseModel):
question: str
@app.post("/query")
async def query_knowledge(request: QueryRequest):
if not vector_db: return {"context": ""}
results = vector_db.similarity_search(request.question, k=2)
return {"context": "\n".join([res.page_content for res in results])}
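With the service running (port 8080, as published in its docker-compose.yml), the retrieval endpoint can be hit directly. A sketch:

```bash
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What instruments does Sam play?"}'
# -> {"context": "..."} containing the top-2 matching chunks from hobbies.md
```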


@@ -0,0 +1,7 @@
fastapi
uvicorn
langchain
langchain-community
langchain-openai
langchain-text-splitters
chromadb


@@ -0,0 +1,22 @@
FROM python:3.11-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
gcc \
g++ \
&& rm -rf /var/lib/apt/lists/*
# Create app directory
WORKDIR /app
# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy code
COPY . .
EXPOSE 8090
CMD ["python3", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8090"]

80
langgraph_service/main.py Normal file

@@ -0,0 +1,80 @@
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from supervisor_agent import process_query
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger(__name__)
app = FastAPI(title="LangGraph Supervisor Service")
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
class QueryRequest(BaseModel):
query: str
class QueryResponse(BaseModel):
response: str
agent_used: str
context: dict
@app.get("/health")
async def health():
return {"status": "healthy", "service": "langgraph-supervisor"}
@app.post("/query", response_model=QueryResponse)
async def query_supervisor(request: QueryRequest):
"""Main entry point for agent orchestration."""
logger.info(f"Received query: {request.query}")
try:
result = await process_query(request.query)
return QueryResponse(
response=result["response"],
agent_used=result["context"].get("source", "unknown"),
context=result["context"]
)
except Exception as e:
logger.error(f"Error processing query: {e}")
return QueryResponse(
response="Error processing your request",
agent_used="error",
context={"error": str(e)}
)
@app.get("/agents")
async def list_agents():
"""List available specialist agents."""
return {
"agents": [
{
"name": "librarian",
"description": "Queries the knowledge base for semantic information",
"triggers": ["repo", "code", "git", "hobby", "about", "skill"]
},
{
"name": "opencode",
"description": "Handles coding tasks and file modifications",
"triggers": ["write", "edit", "create", "fix", "implement"]
},
{
"name": "brain",
"description": "General LLM for reasoning and generation",
"triggers": ["default", "general questions"]
}
]
}
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8090)
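Assuming port 8090 is published on the host as listed in the top-level README, the service's health and routing table can be inspected directly. A sketch:

```bash
curl http://localhost:8090/health
curl http://localhost:8090/agents   # lists the librarian / opencode / brain triggers
```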


@@ -0,0 +1,9 @@
fastapi
uvicorn
langgraph
langchain
langchain-community
langchain-openai
httpx
pydantic


@@ -0,0 +1,153 @@
from typing import TypedDict, Annotated, Sequence
from langgraph.graph import StateGraph, END
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
import operator
import httpx
import os
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# State definition
class AgentState(TypedDict):
messages: Annotated[Sequence[BaseMessage], operator.add]
next_agent: str
context: dict
# Agent routing logic
def supervisor_node(state: AgentState):
"""Supervisor decides which specialist agent to call."""
last_message = state["messages"][-1].content.lower()
# Simple routing logic based on keywords
if any(kw in last_message for kw in ["repo", "code", "git", "github", "gitea", "project", "development"]):
return {"next_agent": "librarian"}
elif any(kw in last_message for kw in ["write", "edit", "create", "fix", "bug", "implement", "code change"]):
return {"next_agent": "opencode"}
elif any(kw in last_message for kw in ["sam", "hobby", "music", "experience", "skill", "about"]):
return {"next_agent": "librarian"}
else:
return {"next_agent": "brain"} # Default to general LLM
def librarian_agent(state: AgentState):
"""Librarian agent - queries knowledge base (ChromaDB)."""
last_message = state["messages"][-1].content
try:
# Call knowledge service
response = httpx.post(
"http://knowledge-service:8080/query",
json={"question": last_message},
timeout=10.0
)
if response.status_code == 200:
context = response.json().get("context", "")
return {
"messages": [AIMessage(content=f"Based on my knowledge base:\n\n{context}")],
"context": {"source": "librarian", "context": context}
}
except Exception as e:
logger.error(f"Librarian error: {e}")
return {
"messages": [AIMessage(content="I couldn't find relevant information in the knowledge base.")],
"context": {"source": "librarian", "error": str(e)}
}
def opencode_agent(state: AgentState):
"""Opencode agent - handles coding tasks via MCP."""
last_message = state["messages"][-1].content
# Placeholder - would integrate with opencode-brain
return {
"messages": [AIMessage(content=f"I'm the coding agent. I would help you with: {last_message}")],
"context": {"source": "opencode", "action": "coding_task"}
}
def brain_agent(state: AgentState):
"""Brain agent - general LLM fallback."""
last_message = state["messages"][-1].content
try:
# Call opencode-brain service
auth = httpx.BasicAuth("opencode", os.getenv("OPENCODE_PASSWORD", "sam4jo"))
timeout_long = httpx.Timeout(180.0, connect=10.0)
with httpx.Client(auth=auth, timeout=timeout_long) as client:  # sync client: this node is a plain (non-async) function
# Create session
session_res = client.post("http://opencode-brain:5000/session", json={"title": "Supervisor Query"})
session_id = session_res.json()["id"]
# Send message
response = client.post(
f"http://opencode-brain:5000/session/{session_id}/message",
json={"parts": [{"type": "text", "text": last_message}]}
)
data = response.json()
if "parts" in data:
for part in data["parts"]:
if part.get("type") == "text":
return {
"messages": [AIMessage(content=part["text"])],
"context": {"source": "brain"}
}
except Exception as e:
logger.error(f"Brain error: {e}")
return {
"messages": [AIMessage(content="I'm thinking about this...")],
"context": {"source": "brain"}
}
def route_decision(state: AgentState):
"""Routing function based on supervisor decision."""
return state["next_agent"]
# Build the graph
workflow = StateGraph(AgentState)
# Add nodes
workflow.add_node("supervisor", supervisor_node)
workflow.add_node("librarian", librarian_agent)
workflow.add_node("opencode", opencode_agent)
workflow.add_node("brain", brain_agent)
# Add edges
workflow.set_entry_point("supervisor")
# Conditional routing from supervisor
workflow.add_conditional_edges(
"supervisor",
route_decision,
{
"librarian": "librarian",
"opencode": "opencode",
"brain": "brain"
}
)
# All specialist agents end
workflow.add_edge("librarian", END)
workflow.add_edge("opencode", END)
workflow.add_edge("brain", END)
# Compile the graph
supervisor_graph = workflow.compile()
# Main entry point for queries
async def process_query(query: str) -> dict:
"""Process a query through the supervisor graph."""
result = await supervisor_graph.ainvoke({
"messages": [HumanMessage(content=query)],
"next_agent": "",
"context": {}
})
return {
"response": result["messages"][-1].content,
"context": result.get("context", {})
}

396
plan.md Normal file
View File

@@ -0,0 +1,396 @@
# Project Plan: aboutme_chat_demo
## Goal
Build a comprehensive AI agent system that ingests data from self-hosted services (Gitea, notes, wikis), stores it in a vector database, and provides intelligent responses through a multi-agent orchestration layer. The system emphasizes modular containerized architecture, industry-standard tools, and employment-relevant skills.
---
## Phase 1: Foundation & Core Infrastructure (COMPLETED)
### Phase 1.1: Frontend Application
**Location:** `/home/sam/development/aboutme_chat_demo/frontend/`
**Stack & Tools:**
- **Framework:** Vite 6.2.0 + React 19.0.0 + TypeScript
- **Styling:** Tailwind CSS 4.0.0
- **State Management:** TanStack Query (React Query) 5.67.0
- **Build Tool:** Vite with React plugin
- **Linting:** ESLint 9.21.0 + typescript-eslint 8.24.0
**Components Implemented:**
- `ChatInterface.tsx` - Auto-expanding text input with scrolling message list
- `App.tsx` - Main application container
- Real-time chat UI with message history
- HTTP client integration to backend gateway
**Docker Configuration:**
- Hot-reload development setup
- Volume mounting for instant code changes
- Node modules isolation (`/app/node_modules`)
### Phase 1.2: Chat Gateway (Orchestration Entry Point)
**Location:** `/home/sam/development/aboutme_chat_demo/backend/`
**Stack & Tools:**
- **Framework:** FastAPI (Python 3.11)
- **HTTP Client:** httpx 0.28.1
- **CORS:** Configured for all origins (development)
**Architecture Changes:**
- **OLD:** Hardcoded keyword matching (`["sam", "hobby", "music", "guitar", "skiing", "experience"]`) to trigger knowledge lookup
- **NEW:** Thin routing layer - all queries passed to LangGraph Supervisor for intelligent agent selection
- Removed direct Brain (LLM) integration
- Removed direct Knowledge Service calls
- Now acts as a stateless entry point to the LangGraph orchestration layer
**Endpoints:**
- `POST /chat` - Routes queries to LangGraph Supervisor
- `GET /health` - Service health check
- `GET /agents` - Lists available agents from LangGraph
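A minimal sketch of the thin `/chat` route, assuming the LangGraph Supervisor exposes a `POST /query` endpoint on port 8090 (the exact path and payload shape are assumptions, not confirmed API):
```python
# Hypothetical sketch of the gateway's /chat route; the LangGraph service's
# /query path and payload shape are assumptions for illustration.
from fastapi import FastAPI
from pydantic import BaseModel
import httpx

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    # Forward the raw query to the LangGraph Supervisor and return its answer.
    async with httpx.AsyncClient(timeout=60.0) as client:
        res = await client.post(
            "http://langgraph-service:8090/query",
            json={"query": req.message},
        )
        res.raise_for_status()
        return res.json()
```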
### Phase 1.3: Knowledge Service (Librarian Agent)
**Location:** `/home/sam/development/knowledge_service/`
**Stack & Tools:**
- **Framework:** FastAPI + Uvicorn
- **Vector Database:** ChromaDB 1.5.1
- **Embeddings:** OpenAI via OpenRouter API (text-embedding-3-small)
- **LLM Framework:** LangChain ecosystem
- langchain 1.2.10
- langchain-community 0.4.1
- langchain-core 1.2.15
- langchain-text-splitters 1.1.1
- langchain-openai
- **Document Processing:** RecursiveCharacterTextSplitter
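A minimal ingestion sketch under these tools; Chroma's built-in default embedding function stands in here for the OpenRouter `text-embedding-3-small` setup used by the real service:
```python
# Minimal ingestion sketch: chunk a markdown file and store it in ChromaDB.
# The default Chroma embedding function stands in for the OpenRouter embeddings.
import chromadb
from langchain_text_splitters import RecursiveCharacterTextSplitter

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("knowledge")

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
text = open("data/hobbies.md").read()
chunks = splitter.split_text(text)

collection.add(
    documents=chunks,
    ids=[f"hobbies-{i}" for i in range(len(chunks))],
    metadatas=[{"source": "data/hobbies.md"}] * len(chunks),
)
```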
**Key Files:**
- `main.py` - FastAPI endpoints for /query and /health
- `gitea_scraper.py` - Gitea API integration module (NEW)
- `data/hobbies.md` - Sample knowledge base content
- `chroma_db/` - Persistent vector storage
**Docker Architecture (Optimized):**
- **Pattern:** Separate `/app/packages` (cached) from `/app/code` (volume-mounted)
- **Benefits:**
- Code changes apply instantly without rebuild
- Package installation happens once during image build
- PYTHONPATH=/app/packages ensures imports work
- **Volumes:**
- `./data:/app/code/data` - Knowledge documents
- `./chroma_db:/app/code/chroma_db` - Vector database persistence
- `./main.py:/app/code/main.py:ro` - Read-only code mount
### Phase 1.4: LangGraph Supervisor Service (NEW)
**Location:** `/home/sam/development/langgraph_service/`
**Stack & Tools:**
- **Framework:** FastAPI + Uvicorn
- **Orchestration:** LangGraph 1.0.9
- langgraph-checkpoint 4.0.0
- langgraph-prebuilt 1.0.8
- langgraph-sdk 0.3.9
- **State Management:** TypedDict with Annotated operators
- **Message Types:** LangChain Core Messages (HumanMessage, AIMessage)
**Architecture:**
- **Supervisor Node:** Analyzes queries and routes to specialist agents
- **Agent Graph:** StateGraph with conditional edges
- **Three Specialist Agents:**
1. **Librarian Agent** - Queries ChromaDB via knowledge-service:8080
2. **Opencode Agent** - Placeholder for coding tasks (MCP integration ready)
3. **Brain Agent** - Fallback to OpenCode Brain LLM (opencode-brain:5000)
**Routing Logic:**
```
Query → Supervisor → [Librarian | Opencode | Brain]
- "repo/code/git/project" → Librarian (RAG)
- "write/edit/create/fix" → Opencode (Coding)
- "sam/hobby/music/about" → Librarian (RAG)
- Default → Brain (General LLM)
```
**Docker Configuration:**
- Self-contained with own `/app/packages`
- No package sharing with other services (modular)
- Port 8090 exposed
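A hypothetical sketch of the FastAPI layer around the supervisor graph; the `/query` path and request shape are assumptions, and `process_query` is the async entry point shown in `main.py` above:
```python
# Hypothetical FastAPI wrapper around the supervisor graph; the /query path
# and request shape are assumptions, not the confirmed API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    query: str

@app.post("/query")
async def handle_query(req: QueryRequest):
    # process_query is the async entry point defined alongside the graph in
    # main.py; it runs the StateGraph and returns {"response", "context"}.
    return await process_query(req.query)

@app.get("/health")
def health():
    return {"status": "ok"}
```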
### Phase 1.5: Apache Airflow (Scheduled Ingestion)
**Location:** `/home/sam/development/airflow/`
**Stack & Tools:**
- **Orchestration:** Apache Airflow 2.8.1
- **Executor:** CeleryExecutor (distributed task processing)
- **Database:** PostgreSQL 13 (metadata)
- **Message Queue:** Redis (Celery broker)
- **Services:**
- airflow-webserver (UI + API)
- airflow-scheduler (DAG scheduling)
- airflow-worker (task execution)
- airflow-triggerer (deferrable operators)
**DAG: gitea_daily_ingestion**
- **Schedule:** Daily
- **Tasks:**
1. `fetch_repos` - Get all user repos from Gitea API
2. `fetch_readmes` - Download README files
3. `ingest_to_chroma` - Store in Knowledge Service
**Integration:**
- Mounts `knowledge_service/gitea_scraper.py` into DAGs folder
- Environment variables for Gitea API token
- Network: ai-mesh (communicates with knowledge-service)
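A sketch of what the DAG's shape could look like; task bodies are placeholders and the real DAG file in the Airflow `dags/` folder may differ:
```python
# Hypothetical shape of the gitea_daily_ingestion DAG; task bodies are
# placeholders and the production implementation may differ.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def fetch_repos(**context):
    ...  # call GiteaScraper.get_user_repos() and push results via XCom

def fetch_readmes(**context):
    ...  # pull the repo list from XCom and download each README

def ingest_to_chroma(**context):
    ...  # POST the documents to the Knowledge Service for embedding

with DAG(
    dag_id="gitea_daily_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="fetch_repos", python_callable=fetch_repos)
    t2 = PythonOperator(task_id="fetch_readmes", python_callable=fetch_readmes)
    t3 = PythonOperator(task_id="ingest_to_chroma", python_callable=ingest_to_chroma)
    t1 >> t2 >> t3
```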
### Phase 1.6: Gitea Scraper Module
**Location:** `/home/sam/development/knowledge_service/gitea_scraper.py`
**Functionality:**
- **API Integration:** Gitea REST API v1
- **Authentication:** Token-based (Authorization header)
- **Methods:**
- `get_user_repos()` - Paginated repo listing
- `get_readme(repo_name)` - README content with fallback names
- `get_repo_files(repo_name, path)` - Directory listing
- `get_file_content(repo_name, filepath)` - File download
**Data Model:**
- `RepoMetadata` dataclass (name, description, url, branch, updated_at, language)
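A sketch of the scraper's shape based on the methods and data model listed above; the exact signatures and error handling in `gitea_scraper.py` may differ:
```python
# Sketch of the scraper's shape; signatures are inferred from the method
# list above and may not match gitea_scraper.py exactly.
from dataclasses import dataclass
import httpx

@dataclass
class RepoMetadata:
    name: str
    description: str
    url: str
    branch: str
    updated_at: str
    language: str

class GiteaScraper:
    def __init__(self, base_url: str, token: str, username: str):
        self.api = f"{base_url}/api/v1"
        self.client = httpx.Client(headers={"Authorization": f"token {token}"})
        self.username = username

    def get_user_repos(self, page_size: int = 50) -> list[RepoMetadata]:
        repos, page = [], 1
        while True:
            res = self.client.get(
                f"{self.api}/user/repos", params={"page": page, "limit": page_size}
            )
            batch = res.json()
            if not batch:
                return repos
            repos += [
                RepoMetadata(
                    name=r["name"],
                    description=r.get("description", ""),
                    url=r["html_url"],
                    branch=r.get("default_branch", "main"),
                    updated_at=r["updated_at"],
                    language=r.get("language", ""),
                )
                for r in batch
            ]
            page += 1

    def get_readme(self, repo_name: str) -> str | None:
        # Try common README filenames and return the first hit.
        for candidate in ("README.md", "README.org", "readme.md"):
            res = self.client.get(
                f"{self.api}/repos/{self.username}/{repo_name}/raw/{candidate}"
            )
            if res.status_code == 200:
                return res.text
        return None
```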
### Phase 1.7: Docker Infrastructure
**Network:**
- `ai-mesh` (external) - Shared bridge network for all services
**Services Overview:**
| Service | Port | Purpose | Dependencies |
|---------|------|---------|--------------|
| frontend | 5173 | React UI | backend |
| backend | 8000 | Chat Gateway | langgraph-service, db |
| db | 5432 | PostgreSQL (chat history) | - |
| knowledge-service | 8080 | RAG / Vector DB | - |
| langgraph-service | 8090 | Agent Orchestration | knowledge-service |
| airflow-webserver | 8081 | Workflow UI | postgres, redis |
| airflow-scheduler | - | DAG scheduling | postgres, redis |
| airflow-worker | - | Task execution | postgres, redis |
| redis | 6379 | Message broker | - |
| postgres (airflow) | - | Airflow metadata | - |
**Container Patterns:**
- All Python services use `/app/packages` + `/app/code` separation
- Node.js services use volume mounting for hot reload
- PostgreSQL uses named volumes for persistence
- External network (`ai-mesh`) for cross-service communication
---
## Phase 2: Multi-Source Knowledge Ingestion (IN PROGRESS)
### Goal
Expand beyond Gitea to ingest data from all self-hosted knowledge sources.
### Data Sources to Integrate:
1. **Notes & Documentation**
- **Trilium Next** - Hierarchical note-taking (tree structure)
- **Obsidian** - Markdown vault with backlinks
- **Flatnotes** - Flat file markdown notes
- **HedgeDoc** - Collaborative markdown editor
2. **Wiki**
- **DokuWiki** - Structured wiki content
3. **Project Management**
- **Vikunja** - Task lists and project tracking
4. **Media & Assets**
- **Immich** - Photo/video metadata + Gemini Vision API for content description
- **HomeBox** - Physical inventory with images
### Technical Approach:
- **Crawling:** Selenium/Playwright for JavaScript-heavy UIs
- **Extraction:** Firecrawl or LangChain loaders for structured content
- **Vision:** Gemini Vision API for image-to-text conversion
- **Storage:** ChromaDB (vectors) + PostgreSQL (metadata, hashes for deduplication)
- **Scheduling:** Additional Airflow DAGs per source
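A small illustration of the hash-based deduplication idea; an in-memory set stands in here for the PostgreSQL table described above:
```python
# Illustrative deduplication check: hash each document's content and skip
# re-ingestion when the hash is unchanged. A set stands in for PostgreSQL.
import hashlib

seen_hashes: set[str] = set()  # in practice: a PostgreSQL table keyed by source

def should_ingest(content: str) -> bool:
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        return False  # unchanged since the last crawl
    seen_hashes.add(digest)
    return True
```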
---
## Phase 3: Advanced Agent Capabilities
### Goal
Integrate external AI tools and expand agent capabilities.
### Agent Tooling:
1. **MCP (Model Context Protocol) Servers**
- Git MCP - Local repository operations
- Filesystem MCP - Secure file access
- Memory MCP - Knowledge graph persistence
- Custom Gitea MCP (if/when available)
2. **External Agents**
- **Goose** - CLI-based agent for local task execution
- **Aider** - AI pair programming
- **Opencode** - Already integrated (Brain Agent)
- **Automaker** - Workflow automation
- **Autocoder** - Code generation
3. **Orchestration Tools**
- **CAO CLI** - Agent orchestrator
- **Agent Pipe** - Pipeline management
### Integration Pattern:
- Each external tool wrapped as LangGraph node
- Supervisor routes to appropriate specialist
- State management for multi-turn interactions
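An illustrative wrapper showing the pattern; the `goose run` invocation is a hypothetical CLI call, not a confirmed interface:
```python
# Sketch of wrapping an external CLI agent as a LangGraph node. The
# "goose run" command is illustrative only; real flags would need checking.
import subprocess
from langchain_core.messages import AIMessage

def goose_agent(state: dict) -> dict:
    task = state["messages"][-1].content
    result = subprocess.run(
        ["goose", "run", task],           # hypothetical invocation
        capture_output=True, text=True, timeout=300
    )
    return {
        "messages": [AIMessage(content=result.stdout or result.stderr)],
        "context": {"source": "goose"},
    }

# Registered like the other specialists:
# workflow.add_node("goose", goose_agent)
```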
---
## Phase 4: Production Hardening
### Goal
Prepare system for production deployment.
### Authentication & Security:
- **Laravel** - User authentication service (from the original Phase 4 plan)
- **JWT tokens** - Session management
- **API key management** - Secure credential storage
- **Network policies** - Inter-service communication restrictions
### Monitoring & Observability:
- **LangSmith** - LLM tracing and debugging
- **Langfuse** - LLM observability (note: currently in per-project install list)
- **Prometheus/Grafana** - Metrics and dashboards
- **Airflow monitoring** - DAG success/failure alerting
### Scaling:
- **ChromaDB** - Migration to server mode for concurrent access (see the sketch after this list)
- **Airflow** - Multiple Celery workers
- **Load balancing** - Nginx reverse proxy
- **Backup strategies** - Vector DB snapshots, PostgreSQL dumps
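A sketch of the client-side change when ChromaDB moves to server mode; host and port values are illustrative:
```python
# Client-side change for ChromaDB server mode: swap the embedded
# PersistentClient for an HttpClient pointed at the Chroma server.
import chromadb

# Embedded (current): chromadb.PersistentClient(path="./chroma_db")
client = chromadb.HttpClient(host="chromadb", port=8000)  # illustrative host/port
collection = client.get_or_create_collection("knowledge")
results = collection.query(query_texts=["What are Sam's hobbies?"], n_results=3)
```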
---
## Phase 5: Workflow Automation & Visual Tools
### Goal
Add visual prototyping and automation capabilities.
### Tools to Integrate:
1. **Flowise** - Visual LangChain builder
- Prototype agent flows without coding
- Export to Python code
- Debug RAG pipelines visually
2. **Windmill** - Turn scripts into workflows
- Schedule Python/LangChain scripts
- Reactive triggers (e.g., on-commit)
- Low-code workflow builder
3. **Activepieces** - Event-driven automation
- Webhook triggers from Gitea
- Integration with external APIs
- Visual workflow designer
4. **N8N** - Alternative workflow automation
- Consider if Activepieces doesn't meet needs
### Use Cases:
- **On-commit triggers:** Gitea push → immediate re-scan → notification
- **Scheduled reports:** Weekly summary of new/updated projects
- **Reactive workflows:** New photo uploaded → Gemini Vision → update knowledge base
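A hypothetical receiver for the on-commit trigger; the Knowledge Service `/ingest/repo` endpoint is an assumption used only to show the flow:
```python
# Hypothetical webhook receiver: Gitea calls this endpoint on push, and the
# handler asks the Knowledge Service to re-scan the affected repo.
from fastapi import FastAPI, Request
import httpx

app = FastAPI()

@app.post("/webhooks/gitea")
async def gitea_push(request: Request):
    payload = await request.json()
    repo = payload.get("repository", {}).get("name", "")
    async with httpx.AsyncClient(timeout=30.0) as client:
        await client.post(
            "http://knowledge-service:8080/ingest/repo",  # assumed endpoint
            json={"repo": repo},
        )
    return {"status": "queued", "repo": repo}
```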
---
## Phase 6: Knowledge Library Options & RAG Enhancement
### Goal
Advanced retrieval and knowledge organization.
### RAG Pipeline Improvements:
1. **Hybrid Search** (see the sketch after this list)
- Semantic search (ChromaDB) + Keyword search (PostgreSQL)
- Re-ranking with cross-encoders
- Query expansion and decomposition
2. **Multi-Modal RAG**
- Image retrieval (Immich + CLIP embeddings)
- Document parsing (PDFs, code files)
- Structured data (tables, lists)
3. **Knowledge Organization**
- Entity extraction and linking
- Knowledge graph construction
- Hierarchical chunking strategies
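A sketch of the hybrid-search merge step using reciprocal rank fusion over two ranked id lists; this is one possible re-ranking strategy, not necessarily the one that will be chosen:
```python
# Reciprocal rank fusion: merge a semantic result list (ChromaDB) with a
# keyword result list (PostgreSQL full-text search) into one ranking.
def reciprocal_rank_fusion(semantic: list[str], keyword: list[str], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in (semantic, keyword):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Example: doc "b" ranks high in both lists, so it wins the fused ranking.
print(reciprocal_rank_fusion(["a", "b", "c"], ["b", "d", "a"]))
```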
### Alternative Vector Stores (Evaluation):
- **pgvector** - PostgreSQL native (if ChromaDB's limitations are hit)
- **Weaviate** - GraphQL interface, hybrid search
- **Qdrant** - Rust-based, high performance
- **Milvus** - Enterprise-grade, distributed
---
## Phase 7: User Experience & Interface
### Goal
Enhanced frontend and interaction patterns.
### Frontend Enhancements:
1. **Chat Interface Improvements**
- Streaming responses (Server-Sent Events; see the sketch after this list)
- Message threading and context
- File upload for document ingestion
- Image display (for Immich integration)
2. **Knowledge Browser**
- View ingested documents
- Search knowledge base directly
- See confidence scores and sources
- Manual document upload/ingestion trigger
3. **Agent Management**
- View active agents
- Configure agent behavior
- Monitor agent performance
- Override routing decisions
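A minimal server-side sketch of the streaming item above; the token source is faked here, whereas the real gateway would stream from the LLM:
```python
# SSE streaming sketch: FastAPI returns Server-Sent Events that the React
# client can consume with EventSource. Token generation is faked here.
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def fake_token_stream(prompt: str):
    for token in f"Echoing: {prompt}".split():
        yield f"data: {token}\n\n"          # SSE frame format
        await asyncio.sleep(0.05)
    yield "data: [DONE]\n\n"

@app.get("/chat/stream")
async def chat_stream(q: str):
    return StreamingResponse(fake_token_stream(q), media_type="text/event-stream")
```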
### Mobile & Accessibility:
- Responsive design improvements
- Mobile app (React Native or PWA)
- Accessibility compliance (WCAG)
---
## Technology Stack Summary
### Core Frameworks:
- **Backend:** FastAPI (Python 3.11)
- **Frontend:** Vite + React 19 + TypeScript
- **Styling:** Tailwind CSS
- **Database:** PostgreSQL 15
- **Vector DB:** ChromaDB 1.5.1
### AI/ML Stack:
- **LLM Orchestration:** LangGraph 1.0.9 + LangChain
- **Embeddings:** OpenAI via OpenRouter (text-embedding-3-small)
- **LLM:** OpenCode Brain (opencode-brain:5000)
- **Vision:** Gemini Vision API (Phase 2)
### Workflow & Scheduling:
- **Orchestration:** Apache Airflow 2.8.1 (CeleryExecutor)
- **Message Queue:** Redis
- **External Tools:** Flowise, Windmill, Activepieces
### Development Tools:
- **Containers:** Docker + Docker Compose
- **Networking:** Bridge network (ai-mesh)
- **Testing:** curl/httpx for API testing
- **Version Control:** Gitea (self-hosted)
### Skills Demonstrated:
- Containerized microservices architecture
- Multi-agent AI orchestration (LangGraph)
- Vector database implementation (RAG)
- ETL pipeline development (Airflow)
- API integration and web scraping
- Modular, maintainable code organization
- Industry-standard AI tooling (LangChain ecosystem)
- Workflow automation and scheduling