---
created: 2026-05-16 17:02
modified: 2026-05-16 17:02
type: note
tags:
  - ai
  - dev-ops
  - website
  - iframe
aliases: []
id: 1778914902-WMFA
---
# [[Local Hybrid Vector + Graph RAG Setup]]

# Local Hybrid Vector + Graph RAG Setup via Caddy & Docker

Taken from a Google Gemini AI chat.

This document outlines the architecture and configuration files needed to run a single, unified local RAG system (vector search for static files plus graph search for Obsidian notes). The chat UI is served from `ai.local` and embedded in an iframe on three context-specific showcase websites (`devops.local`, `coding.local`, `ai-site.local`).

---

## 1. Network Routing (`Caddyfile`)

This `Caddyfile` proxies the backend container and sets a `Content-Security-Policy` header whose `frame-ancestors` directive limits which sites may embed the chat UI in an iframe.

```caddy
# Core AI RAG Application Backend
ai.local {
    reverse_proxy localhost:8000

    header {
        # Restrict iframe rendering specifically to your 3 interest domains
        Content-Security-Policy "frame-ancestors 'self' https://devops.local https://coding.local https://ai-site.local"

        # Standard security hardening
        X-Content-Type-Options "nosniff"
        Referrer-Policy "strict-origin-when-cross-origin"
    }
}

# Example Configuration Blocks for Frontend Sites
devops.local {
    root * /var/www/devops_site
    file_server
}

coding.local {
    root * /var/www/coding_site
    file_server
}

ai-site.local {
    root * /var/www/ai_site
    file_server
}
```
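
To make the effect of the `frame-ancestors` directive concrete, here is a minimal Python sketch that builds the same header value and approximates the browser's decision for a given parent page. The origins are the ones in the Caddyfile; the helper names `build_frame_ancestors` and `may_embed` are hypothetical, and the check only handles exact `scheme://host` sources, not wildcards or `'self'`. It is useful mainly as a sanity test when the list of showcase sites changes.

```python
from urllib.parse import urlsplit

# Origins taken from the Caddyfile above; the helper names are hypothetical.
ALLOWED_PARENTS = [
    "https://devops.local",
    "https://coding.local",
    "https://ai-site.local",
]


def build_frame_ancestors(origins):
    """Return a Content-Security-Policy value permitting only the given parent origins."""
    return "frame-ancestors 'self' " + " ".join(origins)


def may_embed(policy, parent_url):
    """Approximate the browser check: is parent_url's origin listed in the policy?

    Simplification: only exact scheme://host[:port] sources are compared.
    """
    directive, _, sources = policy.partition(" ")
    if directive != "frame-ancestors":
        return False
    parts = urlsplit(parent_url)
    origin = f"{parts.scheme}://{parts.netloc}"
    return origin in sources.split()
```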

---

## 2. Infrastructure Layer (`docker-compose.yml`)

The app runs in a slim Python container (`python:3.11-slim`). The vector store and the graph file are mounted **read-only** (`:ro`), so the application cannot modify the indexed data even if a prompt manipulates its behavior.

```yaml
version: '3.8'

services:
  unified-ai-rag:
    image: python:3.11-slim
    container_name: local_ai_rag
    working_dir: /app
    volumes:
      # Mount application scripts
      - ./app:/app
      # Mount databases and notes securely as READ-ONLY
      - ./db/semantic_rag:/app/db/semantic_rag:ro
      - ./db/obsidian_graph.gpickle:/app/db/obsidian_graph.gpickle:ro
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=your_openai_api_key_here
      - ANTHROPIC_API_KEY=your_anthropic_api_key_here
    # "&&" is a shell operator, so the two commands must run through a shell;
    # in exec (list) form it would be passed to pip as a literal argument.
    command: ["sh", "-c", "pip install -r requirements.txt && chainlit run app.py --host 0.0.0.0 --port 8000"]
    restart: unless-stopped
```
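
The two assumptions this compose file relies on, real API keys in the environment and data mounts that are actually read-only, can be checked at startup. The sketch below is one hedged way to do that; the function names and the `your_` placeholder convention are assumptions based on the sample values above, not part of the original setup. Calling `missing_api_keys(os.environ)` and `writable_paths(["/app/db/semantic_rag", "/app/db/obsidian_graph.gpickle"])` before loading the databases should return two empty lists in a correctly configured container.

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
PLACEHOLDER_PREFIX = "your_"  # matches the sample values in docker-compose.yml


def missing_api_keys(env):
    """Return the required keys that are unset, empty, or still placeholders."""
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "")
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            missing.append(key)
    return missing


def writable_paths(paths):
    """Return the existing paths this process could write to.

    For mounts declared with :ro the result should be an empty list.
    """
    return [p for p in paths if os.path.exists(p) and os.access(p, os.W_OK)]
```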

---

## 3. Web UI Application & Context Controller (`app.py`)

This Chainlit script reads the referrer of the incoming request to detect which website the recruiter came from, then switches its persona and welcome message to match.
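
The referrer-to-persona mapping can be isolated as a small pure function, which makes it testable without Chainlit. The sketch below is an assumption-laden variant, not the note's own code: it parses the hostname and matches it exactly, so a URL such as `https://evil.example/?next=devops.local` does not trigger the DevOps persona the way a substring test would. The hostnames come from the Caddyfile; the persona labels and the function name are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PERSONA = "general"

# Hostnames come from the Caddyfile; the persona labels are hypothetical.
PERSONA_BY_HOST = {
    "devops.local": "devops",
    "coding.local": "software",
    "ai-site.local": "ai",
}


def persona_for_referer(referer):
    """Map a Referer header value to a persona label by exact hostname match."""
    host = urlsplit(referer or "").hostname
    return PERSONA_BY_HOST.get(host, DEFAULT_PERSONA)
```

With `Referrer-Policy "strict-origin-when-cross-origin"`, a cross-origin parent sends only its origin (for example `https://devops.local/`), which is enough for this lookup.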

```python
import pickle

import chainlit as cl
import networkx as nx
from chromadb import PersistentClient


# 1. Initialization hooks for Vector and Graph layers (Read-Only)
def load_databases():
    vector_client = PersistentClient(path="/app/db/semantic_rag")
    vector_layer = vector_client.get_collection(name="resume_and_docs")
    # nx.read_gpickle was removed in NetworkX 3.0; a .gpickle file is a plain
    # pickle, so load it directly. Only unpickle files you created yourself.
    with open("/app/db/obsidian_graph.gpickle", "rb") as f:
        graph_layer: nx.Graph = pickle.load(f)
    return vector_layer, graph_layer


vector_db, graph_db = load_databases()


@cl.on_chat_start
async def start():
    # Detect the site referring the frame to adjust persona.
    # NOTE: the "http_headers" session key is kept from the original chat output;
    # check which key your Chainlit version exposes for the referrer.
    http_headers = cl.user_session.get("http_headers") or {}
    referer = http_headers.get("referer", "")

    system_prompt = "You are a helpful AI assistant reviewing my portfolio data."
    welcome_msg = "Hello! Ask me any questions about my profile or experience."

    if "devops.local" in referer:
        system_prompt = "Persona: DevOps Engineer. Focus heavily on infrastructure, IoT architecture, CI/CD, and server logs."
        welcome_msg = "Welcome Recruiter! Ask me anything about my DevOps automation and IoT infrastructure."
    elif "coding.local" in referer:
        system_prompt = "Persona: Software Engineer. Emphasize backend code, software design patterns, and clean programming methodologies."
        welcome_msg = "Hello! Let's talk about my development portfolio and technical code paradigms."
    elif "ai-site.local" in referer:
        system_prompt = "Persona: AI/RAG Specialist. Discuss custom embedding techniques, semantic lookups, and graph networks."
        welcome_msg = "Greetings! Feel free to pick my brain about graph-based indexing and large language models."

    cl.user_session.set("system_prompt", system_prompt)
    await cl.Message(content=welcome_msg).send()


@cl.on_message
async def main(message: cl.Message):
    query = message.content
    sys_prompt = cl.user_session.get("system_prompt")

    # Executing the dual pass strategy
    # (Extract semantic chunks + pull Obsidian link neighbors from the graph)
    # Synthesize outputs down through the LLM context window here...

    # Placeholder reply until the retrieval and LLM call are implemented.
    response_text = "Processed query using profile persona contextualized framework rules."
    await cl.Message(content=response_text).send()
```
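
The "dual pass" step that `main` leaves as comments could look roughly like the sketch below. It assumes `vector_layer` behaves like a Chroma collection (its `query` method returns lists of lists, one inner list per query text) and `graph_layer` behaves like a NetworkX graph (`in` and `neighbors`). It also assumes, hypothetically, that each chunk's metadata stores its source note title under a `"note"` key that doubles as the graph node id; the original note does not specify how chunks map to graph nodes. The function names are illustrative, and the LLM call itself is left out.

```python
def hybrid_retrieve(query, vector_layer, graph_layer, k=3):
    """Pass 1: semantic search. Pass 2: expand each hit through the note graph."""
    # Chroma's Collection.query returns one inner list per query text.
    result = vector_layer.query(query_texts=[query], n_results=k)
    documents = result["documents"][0]
    metadatas = result["metadatas"][0]

    related = []
    for meta in metadatas:
        # ASSUMPTION: metadata["note"] holds the source note title, and that
        # title is also the node id in the Obsidian link graph.
        note = (meta or {}).get("note")
        if note is not None and note in graph_layer:
            for neighbor in graph_layer.neighbors(note):
                if neighbor not in related:
                    related.append(neighbor)
    return documents, related


def build_context(system_prompt, documents, related_notes):
    """Assemble a plain-text prompt preamble from both retrieval passes."""
    lines = [system_prompt, "", "Relevant excerpts:"]
    lines += [f"- {doc}" for doc in documents]
    lines += ["", "Linked notes: " + ", ".join(related_notes)]
    return "\n".join(lines)
```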

---

## 4. Frontend Integration (`iframe`)

Embed this snippet in the HTML of each showcase site:

```html
<iframe
  src="https://ai.local"
  style="width: 100%; height: 650px; border: 1px solid #ccc; border-radius: 8px;"
  allow="clipboard-read; clipboard-write">
</iframe>
```