75 lines
4.2 KiB
Markdown
75 lines
4.2 KiB
Markdown
---
|
|
created: 2026-06-27 12:17
|
|
modified: 2026-06-27 12:17
|
|
type: note
|
|
tags:
|
|
- ai
|
|
- ai-agents
|
|
- tool
|
|
- tools
|
|
aliases: []
|
|
---
|
|
# [[Tools to try with AI]]
|
|
|
|
|
|
| Repository / Project Name | GitHub Repository URL | Description |
|
|
| :--- | :--- | :--- |
|
|
| **OpenMontage** | [calesthio/OpenMontage](https://github.com) | Text-to-video AI editor |
|
|
| **codebase-memory-mcp** | [DeusData/codebase-memory-mcp](https://github.com) | Agent context memory |
|
|
| **timesfm** | [google-research/timesfm](https://github.com) | Time-series forecasting model |
|
|
| **Zapier MCP** | [zapier/zapier-mcp](https://github.com) | App integration gateway |
|
|
| **peerd** | [notasithlord/peerd](https://github.com) | Local browser agent |
|
|
| **FluidVoice** | [altic-dev/FluidVoice](https://github.com) | Local dictation tool |
|
|
| **birdclaw** | [steipete/birdclaw](https://github.com) | Clean X reader |
|
|
| **worldmonitor** | [koala73/worldmonitor](https://github.com) | Global event dashboard |
|
|
| **penpot** | [penpot/penpot](https://github.com) | Open-source Figma alternative |
|
|
| **voicebox** | [jamiepine/voicebox](https://github.com) | Local voice cloner |
|
|
| **system_prompts_leaks** | [asgeirtj/system_prompts_leaks](https://github.com) | AI prompt repository |
|
|
| **Agent-Reach** | [Panniantong/agent-reach](https://github.com) | Social media connector |
|
|
|
|
|
|
# AI Engineering Tools Overview
|
|
|
|
## 🛠️ Structured Data & Model Control
|
|
|
|
### Instructor (567-labs)
|
|
* **What it is:** A library that patches LLM clients (OpenAI, Anthropic, Gemini) to return strict Python data structures.
|
|
* **Why you need it:** Instead of asking an LLM for JSON and hoping it formats it correctly, you define a Pydantic schema. Instructor guarantees the model's output will match that schema perfectly, automatically retrying and feeding errors back to the LLM if validation fails.
|
|
* **URL:** https://useinstructor.com
|
|
|
|
### Outlines (dottxt-ai)
|
|
* **What it is:** A library that enforces strict structure at the token generation level for local and API-based LLMs.
|
|
* **Why you need it:** Unlike Instructor (which validates data after or during generation), Outlines guides the LLM dynamically. It alters the model's math so it physically cannot choose a token that violates your regex, schema, or choice list, resulting in 100% reliable structure with zero parsing errors.
|
|
* **URL:** https://github.com
|
|
|
|
---
|
|
|
|
## 🔌 Universal Integration & Routing
|
|
|
|
### LiteLLM
|
|
* **What it is:** A lightweight proxy and SDK that acts as a universal translator for over 100 different LLM APIs.
|
|
* **Why you need it:** Every AI provider has a slightly different code structure for API calls. LiteLLM lets you use the standard OpenAI format (`openai.chat.completions.create`) to talk to Anthropic, Bedrock, Cohere, or HuggingFace, while also handling fallbacks, load balancing, and spend tracking.
|
|
* **URL:** https://github.com
|
|
|
|
---
|
|
|
|
## 🧠 Programming & Optimizing Prompts
|
|
|
|
### DSPy (Stanford NLP)
|
|
* **What it is:** A framework that treats prompt engineering like programming rather than manual "prompt hacking."
|
|
* **Why you need it:** Instead of spending hours tweaking strings like "You are an expert...", you write Python modules (e.g., Chain-of-Thought, RAG). DSPy then compiles and automatically optimizes the prompts and few-shot examples based on your evaluation data, similar to how neural networks learn weights.
|
|
* **URL:** https://github.com
|
|
|
|
---
|
|
|
|
## 🕸️ Data Scraping & Preparation
|
|
|
|
### Crawl4AI
|
|
* **What it is:** An open-source web crawler designed specifically to scrape websites and convert them into data optimized for LLMs.
|
|
* **Why you need it:** Standard web scrapers pull raw, messy HTML full of ads and scripts. Crawl4AI extracts the core text, strips the noise, and outputs clean Markdown or structured JSON, making it perfect for feeding live web data into a RAG pipeline.
|
|
* **URL:** https://github.com
|
|
|
|
### Chonkie
|
|
* **What it is:** A highly optimized, lightweight text-chunking library designed for Retrieval-Augmented Generation (RAG).
|
|
* **Why you need it:** LLMs cannot ingest massive documents all at once; text must be broken down first. Chonkie focuses on speed and accuracy, splitting text intelligently (by tokens, sentences, or semantics) so context is never cut in half mid-sentence before being embedded.
|
|
* **URL:** https://github.com |