Route any model
through any provider
Stop burning expensive tokens on simple tasks. Refract sits between your Anthropic-speaking tools and every LLM provider — translating, routing, and hot-swapping models mid-conversation. OpenAI incoming support coming in v1.x.
$ refract init
Created ~/.refract/config.json
$ refract start
refract service started.
$ refract claude
Claude Code connected via OpenRouter
$ /model deepseek-r1
Routed to deepseek/deepseek-r1 via openrouter
$ /model gemini-2.5-pro
Routed to google/gemini-2.5-pro via openrouter
The problem
Locked in
Your coding tool speaks one API format. Your models speak another. You pick one provider and stick with it — or write your own adapter.
Burning tokens
Not every request needs a frontier model. But there's no easy way to route simple tasks to cheap models and complex ones to capable models.
Rate limited
One provider, one rate limit. When it hits, you're stuck. No fallback, no redistribution. Just wait.
How it works
Refract is a local reverse proxy. It sits between your AI coding tools and your LLM providers, handling translation and routing so nothing else has to change.
Your tool sends a request
Claude Code, Continue — any tool that speaks the Anthropic API sends a request as usual.
Refract translates and routes
Translates Anthropic format to OpenAI format. Evaluates your routing rules. Picks the right model and provider for that specific request.
Provider responds
OpenRouter, Anthropic, OpenAI, Ollama — the response streams back and Refract translates it into Anthropic format your tool expects.
Intelligent routing
Define rules that match the right model to the right request. First match wins. No when clause = catch-all default.
{
"routes": [
{
"name": "big-context",
"when": { "token_estimate": { "gt": 60000 } },
"model": "google/gemini-2.5-pro"
},
{
"name": "deep-thinking",
"when": { "thinking": { "enabled": true } },
"model": "deepseek/deepseek-r1"
},
{
"name": "default",
"model": "google/gemini-2.0-flash-lite-001"
}
]
}
Every provider, one config
Declare your providers once. Route to any of them by name. Mix cloud and local in the same setup.
More providers coming soon.
Built for how you actually work
Hot-swap mid-session
Switch models without restarting. Your tool's /model command routes through your rules on every request. Gemini to DeepSeek to Llama — change whenever you want.
Full streaming translation
Anthropic SSE in, OpenAI SSE out. Streaming, non-streaming, thinking blocks, images, tool use. All translated faithfully. No data loss.
Zero dependencies
Go stdlib only. No Node.js, no Python, no Docker requirement. Download a binary and run refract init to get started. That's the entire install process.
Built-in dashboard
See every request that passes through. Which rule matched, which provider handled it, how long it took. Live feed, route table, hit counts.
Runs as a service
refract start installs it as a system service. Runs in the background, restarts on failure, starts at login. One command to manage it all.
Env var interpolation
Reference $API_KEY in your config. No secrets in JSON files. Works with .env files, Vault, whatever you already use.
Pricing
Refract
one-time payment. No subscription.
- Full API translation — Anthropic to OpenAI format and back
- 7 routing matchers including exec and http
- Multi-provider support (OpenRouter, Anthropic, OpenAI, Ollama)
- Hot-swapping mid-session
- CLI with service management —
refract init,start,stop,claude - Built-in dashboard
- Pre-built binaries for macOS, Linux, Windows
- Free updates through v1.x
- OpenAI incoming request support (coming in v1.x)
Early access pricing. Will increase as features land.