---
created: 2026-05-16 17:02
modified: 2026-05-16 17:02
type: note
tags:
  - ai
  - dev-ops
  - website
  - iframe
  - ai-resume
aliases: []
id: 1778914902-WMFA
---

# [[Local Hybrid Vector + Graph RAG Setup]]

# Local Hybrid Vector + Graph RAG Setup via Caddy & Docker

Taken from a Google Gemini AI chat.

This document outlines the architecture and configuration files required to run a single, unified local RAG system (vector search for static files plus graph search for Obsidian notes). The system is served at `ai.local` and embedded in an iframe on three context-specific showcase websites: `devops.local`, `coding.local`, and `ai-site.local`.

---

## 1. Network Routing (`Caddyfile`)

This `Caddyfile` proxies the backend container and sets a `Content-Security-Policy` header whose `frame-ancestors` directive limits which sites may embed the app in an iframe.

```caddy
# Core AI RAG Application Backend
ai.local {
    reverse_proxy localhost:8000

    header {
        # Restrict iframe rendering specifically to your 3 interest domains
        Content-Security-Policy "frame-ancestors 'self' https://devops.local https://coding.local https://ai-site.local"

        # Standard security hardening
        X-Content-Type-Options "nosniff"
        Referrer-Policy "strict-origin-when-cross-origin"
    }
}

# Example Configuration Blocks for Frontend Sites
devops.local {
    root * /var/www/devops_site
    file_server
}

coding.local {
    root * /var/www/coding_site
    file_server
}

ai-site.local {
    root * /var/www/ai_site
    file_server
}
```

---

## 2. Infrastructure Layer (`docker-compose.yml`)

The app runs in a slim Python container. The vector store and the graph file are mounted **read-only** (`:ro`), so the app cannot modify them even if a manipulated prompt steers it into trying.

```yaml
version: '3.8'

services:
  unified-ai-rag:
    image: python:3.11-slim
    container_name: local_ai_rag
    working_dir: /app
    volumes:
      # Mount application scripts
      - ./app:/app
      # Mount databases and notes securely as READ-ONLY
      - ./db/semantic_rag:/app/db/semantic_rag:ro
      - ./db/obsidian_graph.gpickle:/app/db/obsidian_graph.gpickle:ro
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=your_openai_api_key_here
      - ANTHROPIC_API_KEY=your_anthropic_api_key_here
    # Run through a shell so that `&&` chains the two commands; in exec (list)
    # form, "&&" would be passed to pip as a literal argument.
    # --host 0.0.0.0 binds to all interfaces so the published port is reachable.
    command: sh -c "pip install -r requirements.txt && chainlit run app.py --host 0.0.0.0 --port 8000"
    restart: unless-stopped
```

---

## 3. Web UI Application & Context Controller (`app.py`)

This Chainlit script reads the referrer when a chat starts and picks a persona (system prompt and welcome message) based on which showcase site the recruiter came from. The message handler is left as a placeholder for the retrieval and LLM call.

```python
import pickle

import chainlit as cl
from chromadb import PersistentClient


# 1. Initialization hooks for Vector and Graph layers (Read-Only)
def load_databases():
    vector_client = PersistentClient(path="/app/db/semantic_rag")
    vector_layer = vector_client.get_collection(name="resume_and_docs")
    # nx.read_gpickle was removed in NetworkX 3.0. A .gpickle file is a plain
    # pickle, so load it directly (networkx must still be installed to unpickle it).
    with open("/app/db/obsidian_graph.gpickle", "rb") as fh:
        graph_layer = pickle.load(fh)
    return vector_layer, graph_layer


vector_db, graph_db = load_databases()


@cl.on_chat_start
async def start():
    # Detect the site referring the frame to adjust persona.
    # Chainlit exposes the Referer header under the reserved "http_referer" session key.
    # Caveat: inside an iframe this value may be the Chainlit page's own URL rather
    # than the embedding site, so verify what your deployment actually receives.
    referer = cl.user_session.get("http_referer") or ""

    system_prompt = "You are a helpful AI assistant reviewing my portfolio data."
    welcome_msg = "Hello! Ask me any questions about my profile or experience."

    if "devops.local" in referer:
        system_prompt = "Persona: DevOps Engineer. Focus heavily on infrastructure, IoT architecture, CI/CD, and server logs."
        welcome_msg = "Welcome Recruiter! Ask me anything about my DevOps automation and IoT infrastructure."
    elif "coding.local" in referer:
        system_prompt = "Persona: Software Engineer. Emphasize backend code, software design patterns, and clean programming methodologies."
        welcome_msg = "Hello! Let's talk about my development portfolio and technical code paradigms."
    elif "ai-site.local" in referer:
        system_prompt = "Persona: AI/RAG Specialist. Discuss custom embedding techniques, semantic lookups, and graph networks."
        welcome_msg = "Greetings! Feel free to pick my brain about graph-based indexing and large language models."

    cl.user_session.set("system_prompt", system_prompt)
    await cl.Message(content=welcome_msg).send()


@cl.on_message
async def main(message: cl.Message):
    query = message.content
    sys_prompt = cl.user_session.get("system_prompt")

    # Executing the dual pass strategy
    # (Extract semantic chunks + pull Obsidian link neighbors from the graph)
    # Synthesize outputs down through the LLM context window here...

    response_text = "Processed query using profile persona contextualized framework rules."
    await cl.Message(content=response_text).send()
```

---

## 4. Frontend Integration (`iframe`)

Embed the app in each showcase site with an iframe that points at the backend host from section 1:

```html
<iframe src="https://ai.local"></iframe>
```
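
Section 3 leaves the dual-pass step in `main` as a placeholder. The sketch below shows one way it might be filled in; it is not the original author's implementation. The helper names `vector_pass`, `graph_pass`, and `build_prompt` are hypothetical. It also assumes that graph nodes are note titles and that a note is relevant when its title appears in the query, which is a deliberately crude way to pick seeds. The only library call used is Chroma's `collection.query(query_texts=..., n_results=...)`.

```python
def vector_pass(collection, query: str, k: int = 3) -> list[str]:
    """Pass 1: fetch the top-k semantic chunks from a Chroma collection."""
    result = collection.query(query_texts=[query], n_results=k)
    # Chroma returns one inner list per query text; exactly one was sent.
    return list(result["documents"][0])


def graph_pass(graph, query: str, max_neighbors: int = 5) -> list[str]:
    """Pass 2: notes named in the query, followed by the notes they link to.

    Accepts a networkx graph or a plain dict adjacency map: both support
    iterating over nodes and `graph[node]` to reach a node's neighbors.
    """
    lowered = query.lower()
    # Assumption: node names are note titles; seed on titles found in the query.
    seeds = [node for node in graph if str(node).lower() in lowered]
    ordered = list(seeds)
    for seed in seeds:
        for neighbor in list(graph[seed])[:max_neighbors]:
            if neighbor not in ordered:
                ordered.append(neighbor)
    return ordered


def build_prompt(system_prompt: str, query: str, chunks: list[str], notes: list[str]) -> str:
    """Combine persona, retrieved context, and the question into one prompt."""
    sections = [
        system_prompt,
        "Relevant document excerpts:\n" + "\n".join(f"- {c}" for c in chunks),
        "Related notes:\n" + "\n".join(f"- {n}" for n in notes),
        f"Question: {query}",
    ]
    return "\n\n".join(sections)
```

Inside `main`, this could be wired up as `build_prompt(sys_prompt, query, vector_pass(vector_db, query), graph_pass(graph_db, query))`, with the resulting string sent to whichever LLM client the app uses. Title matching is only a starting point; a real deployment would likely need a better way to map a query onto graph nodes.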