sam/obsidian-vault

Files

Sam Rolfe bfc7089dc5 sam-4screen-desktop 2026-6-29:13:13:53

2026-06-29 13:13:53 +10:00

4.2 KiB

Raw Blame History

created, modified, type, tags, aliases

created

modified

type

tags

aliases

2026-06-27 12:17

2026-06-27 12:17

note

ai

ai-agents

tool

tools

Tools to try with AI

Repository / Project Name	GitHub Repository URL	Description
OpenMontage	calesthio/OpenMontage	Text-to-video AI editor
codebase-memory-mcp	DeusData/codebase-memory-mcp	Agent context memory
timesfm	google-research/timesfm	Time-series forecasting model
Zapier MCP	zapier/zapier-mcp	App integration gateway
peerd	notasithlord/peerd	Local browser agent
FluidVoice	altic-dev/FluidVoice	Local dictation tool
birdclaw	steipete/birdclaw	Clean X reader
worldmonitor	koala73/worldmonitor	Global event dashboard
penpot	penpot/penpot	Open-source Figma alternative
voicebox	jamiepine/voicebox	Local voice cloner
system_prompts_leaks	asgeirtj/system_prompts_leaks	AI prompt repository
Agent-Reach	Panniantong/agent-reach	Social media connector

AI Engineering Tools Overview

🛠️ Structured Data & Model Control

Instructor (567-labs)

What it is: A library that patches LLM clients (OpenAI, Anthropic, Gemini) to return strict Python data structures.
Why you need it: Instead of asking an LLM for JSON and hoping it formats it correctly, you define a Pydantic schema. Instructor guarantees the model's output will match that schema perfectly, automatically retrying and feeding errors back to the LLM if validation fails.
URL: https://useinstructor.com

Outlines (dottxt-ai)

What it is: A library that enforces strict structure at the token generation level for local and API-based LLMs.
Why you need it: Unlike Instructor (which validates data after or during generation), Outlines guides the LLM dynamically. It alters the model's math so it physically cannot choose a token that violates your regex, schema, or choice list, resulting in 100% reliable structure with zero parsing errors.
URL: https://github.com

🔌 Universal Integration & Routing

LiteLLM

What it is: A lightweight proxy and SDK that acts as a universal translator for over 100 different LLM APIs.
Why you need it: Every AI provider has a slightly different code structure for API calls. LiteLLM lets you use the standard OpenAI format (openai.chat.completions.create) to talk to Anthropic, Bedrock, Cohere, or HuggingFace, while also handling fallbacks, load balancing, and spend tracking.
URL: https://github.com

🧠 Programming & Optimizing Prompts

DSPy (Stanford NLP)

What it is: A framework that treats prompt engineering like programming rather than manual "prompt hacking."
Why you need it: Instead of spending hours tweaking strings like "You are an expert...", you write Python modules (e.g., Chain-of-Thought, RAG). DSPy then compiles and automatically optimizes the prompts and few-shot examples based on your evaluation data, similar to how neural networks learn weights.
URL: https://github.com

🕸️ Data Scraping & Preparation

Crawl4AI

What it is: An open-source web crawler designed specifically to scrape websites and convert them into data optimized for LLMs.
Why you need it: Standard web scrapers pull raw, messy HTML full of ads and scripts. Crawl4AI extracts the core text, strips the noise, and outputs clean Markdown or structured JSON, making it perfect for feeding live web data into a RAG pipeline.
URL: https://github.com

Chonkie

What it is: A highly optimized, lightweight text-chunking library designed for Retrieval-Augmented Generation (RAG).
Why you need it: LLMs cannot ingest massive documents all at once; text must be broken down first. Chonkie focuses on speed and accuracy, splitting text intelligently (by tokens, sentences, or semantics) so context is never cut in half mid-sentence before being embedded.
URL: https://github.com