● 100% Local · No Cloud · No Telemetry

The RAG proxy that
respects your privacy.

ContextCut PRO sits between your AI tools and your local LLM. It injects relevant context from your knowledge base into every prompt — all on your machine. Zero data ever leaves your network.

Buy Now — $99.88 Lifetime → ▽ Install Guide
Your AI App
ContextCut PRO
Local LLM
☁ Cloud LLM
Knowledge Base
◆ 100% LOCAL
☁ BEFORE
◆ AFTER WITH CONTEXTCUT-PRO
60+
Starter Knowledge Packs
3 seats
One-time purchase
50–90%
Token savings
Zero
Data leaving your network

One command. Done.

Run this on any machine with Docker + Ollama.

# Install ContextCut PRO → proxy on :18788, dashboard on :18787
$ curl -fsSL "https://api.contextcut-pro.com/install/CC-PRO-your-license-key" | bash 
macOS Linux Windows

How It Works

1

Drop Your Docs In

Add .md files to the knowledge folder. Auto-indexed.

2

Point Your Client

Change your LLM endpoint to localhost:18788. No code changes.

3

Smart Injection

Only relevant chunks pass the threshold. Irrelevant queries cost ~5 tokens.

What You Get

Transparent RAG Proxy

Drop-in replacement for any OpenAI-compatible LLM. Change one URL, get context injection instantly.

Smart Context Window

Real-time CTX% bar shows exactly how much of your window is used. Trimmed responses never exceed the limit.

Live Dashboard

Split-panel: streaming chat + per-request token analytics, source hits, and relevance scores.

File Watcher

Drop a .md file in the knowledge folder — it's ingested, chunked, and searchable in seconds.

Multi-Provider

Ollama, OpenAI, OpenRouter, Anthropic, xAI — any OpenAI-compatible endpoint works out of the box.

Session History

Every request logged to SQLite. Searchable archive with per-request breakdowns and token savings.

MCP Knowledge Server

Expose your knowledge base via Model Context Protocol — usable from Claude Desktop, Cursor, VS Code, and more.

Embedding Choice

Local (Ollama nomic-embed-text) or cloud (Voyage AI). Your data, your embedding, your call.

Starter Packs for Professionals

Lawyer

Litigation, real estate, SMB, contracts

📊

CPA

Corporate, personal, SMB tax & books

Doctor

Clinical, billing, practice, research

🏠

Realtor

Listings, contracts, disclosures

💼

Advisor

Estate, investment, retirement

🏗

Architect

Contracts, regulatory compliance

💻

Tech

Contracts, privacy compliance

📈

Consultant

Methodology, deliverables, engagement

Privacy by Architecture

Designed for professionals who cannot send client data to cloud AI services.

Localhost-only binding

Dashboard (:18787), proxy (:18788), and Qdrant (:6333) all bind to 127.0.0.1. Not reachable from LAN or WAN.

No telemetry

Zero outbound data. No document content, queries, or metadata is ever transmitted. The only outbound traffic is a lightweight heartbeat every 15 minutes to verify your license.

Open proxy prevention

Only /v1/chat/completions, /api/chat, /api/generate, and /v1/completions are forwarded. All other paths return 404.

HMAC-verified licensing

Gumroad webhooks are verified via HMAC-SHA256 signature before processing. License check is stateless and minimal.

Ollama hardening

Management endpoints (/api/pull, /api/push, /api/delete, /api/copy, /api/create) return 403 through the proxy.

Open source core

Full source available on GitHub. No binary blobs. Every security claim is verifiable by inspection. MIT-licensed free edition available.

Simple Pricing

One purchase. No subscriptions. No surprises.

ContextCut PRO

$99.88 one-time

3 seats · Lifetime · No subscription

Buy PRO →