ContextCut PRO sits between your AI tools and your local LLM. It injects relevant context from your knowledge base into every prompt — all on your machine. Zero data ever leaves your network.
Run this on any machine with Docker + Ollama.
Add .md files to the knowledge folder. Auto-indexed.
Change your LLM endpoint to localhost:18788. No code changes.
Only relevant chunks pass the threshold. Irrelevant queries cost ~5 tokens.
Drop-in replacement for any OpenAI-compatible LLM. Change one URL, get context injection instantly.
Real-time CTX% bar shows exactly how much of your window is used. Trimmed responses never exceed the limit.
Split-panel: streaming chat + per-request token analytics, source hits, and relevance scores.
Drop a .md file in the knowledge folder — it's ingested, chunked, and searchable in seconds.
Ollama, OpenAI, OpenRouter, Anthropic, xAI — any OpenAI-compatible endpoint works out of the box.
Every request logged to SQLite. Searchable archive with per-request breakdowns and token savings.
Expose your knowledge base via Model Context Protocol — usable from Claude Desktop, Cursor, VS Code, and more.
Local (Ollama nomic-embed-text) or cloud (Voyage AI). Your data, your embedding, your call.
Litigation, real estate, SMB, contracts
Corporate, personal, SMB tax & books
Clinical, billing, practice, research
Listings, contracts, disclosures
Estate, investment, retirement
Contracts, regulatory compliance
Contracts, privacy compliance
Methodology, deliverables, engagement
Designed for professionals who cannot send client data to cloud AI services.
Dashboard (:18787), proxy (:18788), and Qdrant (:6333) all bind to 127.0.0.1. Not reachable from LAN or WAN.
Zero outbound data. No document content, queries, or metadata is ever transmitted. The only outbound traffic is a lightweight heartbeat every 15 minutes to verify your license.
Only /v1/chat/completions, /api/chat, /api/generate, and /v1/completions are forwarded. All other paths return 404.
Gumroad webhooks are verified via HMAC-SHA256 signature before processing. License check is stateless and minimal.
Management endpoints (/api/pull, /api/push, /api/delete, /api/copy, /api/create) return 403 through the proxy.
Full source available on GitHub. No binary blobs. Every security claim is verifiable by inspection. MIT-licensed free edition available.
One purchase. No subscriptions. No surprises.
3 seats · Lifetime · No subscription