Kronaxis Router

An OpenAI-compatible proxy for the LLMs you already use, plus the CLI agents you wish you could call the same way. It cost-routes across local and cloud models, compresses prompts with pgvector RAG before they hit any backend, and -- uniquely -- wraps the actual claude CLI in headless mode, so Claude Code's full agentic loop is available behind /v1/chat/completions.

Pipeline: your service → /v1/chat/completions → Cache (SHA-256 key) → Graphify RAG (compress / augment via pgvector) → Classifier (auto-tiers the prompt) → Cost router (YAML rules, failover, batch) → one of: local vLLM (cheapest), Gemini / OpenAI / Anthropic, async batch APIs (50% off, 7 providers), or, uniquely, the claude CLI in headless mode (skills · MCP · file edit).
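The cache stage keys responses on a SHA-256 digest of the request body. A minimal sketch of how such a key might be derived; the canonicalization choices here are illustrative, not Kronaxis's actual scheme:

```python
import hashlib
import json

def cache_key(body: dict) -> str:
    """Derive a stable SHA-256 cache key from an OpenAI-style request body.

    Keys are sorted so semantically identical requests hash identically
    regardless of field order; the stream flag is dropped because the
    cached payload is the same either way.
    """
    material = {k: v for k, v in body.items() if k != "stream"}
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = cache_key({"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}], "stream": True})
b = cache_key({"stream": False, "messages": [{"role": "user", "content": "hi"}], "model": "gpt-4o"})
assert a == b  # field order and streaming flag do not change the key
```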

Four real differentiators

Claude Code as an OpenAI endpoint

The agent-gateway sub-service spawns the actual claude binary in stream-json mode in an isolated git worktree, returning SSE chunks with proper tool_calls deltas plus a final git diff. Skills, MCP servers, hooks, the whole Claude Code surface, behind a vanilla OpenAI URL.

Token-saving RAG pre-stage

A pgvector + sentence-transformers retrieval step runs before the classifier. It augments thin prompts with project context and compresses fat prompts down to retrieved chunks, verified to save 3,945 tokens on a single 10 KB request in a production smoke test.
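The compress half of that trade-off amounts to selecting the most relevant retrieved chunks under a token budget. A sketch of that selection, assuming chunks arrive already scored by pgvector similarity; the function name and the word-count token estimate are illustrative:

```python
def compress_context(chunks: list[tuple[float, str]], budget_tokens: int) -> list[str]:
    """Pick the highest-similarity chunks that fit within a token budget.

    chunks: (similarity, text) pairs, e.g. rows from a pgvector distance query.
    Tokens are approximated as whitespace-split words for this sketch.
    """
    selected, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())
        if used + cost > budget_tokens:
            continue  # skip chunks that would blow the budget
        selected.append(text)
        used += cost
    return selected

chunks = [(0.91, "router config reference"), (0.40, "unrelated changelog"), (0.85, "pgvector schema notes")]
assert compress_context(chunks, budget_tokens=6) == ["router config reference", "pgvector schema notes"]
```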

Cost routing by rule

Send simple JSON extraction to a local 7B; send hard reasoning to Gemini 2.5 Pro; let the router pick. YAML rules, hot-reloadable, no restart. Routes by service, tier, priority, and content type.
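A rules file in that spirit might look like the following; the keys shown are illustrative, not Kronaxis's documented schema:

```yaml
rules:
  - match:
      service: etl
      content_type: json_extraction
      tier: simple
    route: local-vllm-7b          # cheapest backend wins for easy work
  - match:
      tier: hard_reasoning
      priority: high
    route: gemini-2.5-pro
    failover: [claude-sonnet, local-vllm-7b]
default: local-vllm-7b
```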

Multi-account auth pool

Pool API keys (Anthropic / OpenAI / Gemini) and -- for personal use -- Claude Code OAuth subscriptions. Round-robin, pin by account_id, auto-disable on 429 with provider-aware cooldowns (5 min for keys, 5 hours for OAuth).
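That 429 handling can be sketched as a round-robin pool with per-account cooldowns. The class below is an illustrative model using the cooldowns stated above, not the shipped implementation:

```python
import itertools
import time

COOLDOWN = {"api_key": 5 * 60, "oauth": 5 * 3600}  # seconds: 5 min keys, 5 h OAuth

class AccountPool:
    def __init__(self, accounts):
        # accounts: list of {"id": ..., "type": "api_key" | "oauth"}
        self.accounts = accounts
        self.disabled_until = {}  # account id -> unix time it becomes usable again
        self._rr = itertools.cycle(accounts)

    def pick(self, account_id=None, now=None):
        now = time.time() if now is None else now
        if account_id is not None:  # pinned selection by account_id
            acct = next(a for a in self.accounts if a["id"] == account_id)
            return acct if self.disabled_until.get(account_id, 0) <= now else None
        for _ in range(len(self.accounts)):  # round-robin, skipping cooling accounts
            acct = next(self._rr)
            if self.disabled_until.get(acct["id"], 0) <= now:
                return acct
        return None  # every account is cooling down

    def report_429(self, account_id, now=None):
        now = time.time() if now is None else now
        acct = next(a for a in self.accounts if a["id"] == account_id)
        self.disabled_until[account_id] = now + COOLDOWN[acct["type"]]

pool = AccountPool([{"id": "key-1", "type": "api_key"}, {"id": "sub-1", "type": "oauth"}])
pool.report_429("key-1", now=0)  # key-1 now cooling until t=300
```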

How it compares

If you want a hosted "credits + many providers" experience, OpenRouter wins. If you want the best logging dashboard, Helicone. If you want a Python-native multi-provider client, LiteLLM. If you want to run the proxy yourself, route by cost, and treat Claude Code as an API, Kronaxis Router is the only option that ships those things in one binary.

| Capability | Kronaxis Router | LiteLLM | OpenRouter | Helicone |
|---|---|---|---|---|
| OpenAI-compatible proxy | ✓ | ✓ | ✓ | logs only |
| Self-hosted single binary | ✓ (Go, 9.9 MB) | Python | hosted only | hosted only |
| Cost routing by rule | ✓ | ✓ | ✓ | × |
| Multi-backend failover | ✓ | ✓ | ✓ | × |
| CLI agent wrapping (Claude Code, Gemini CLI) | ✓ | × | × | × |
| Built-in RAG pre-stage (pgvector) | ✓ | × | × | × |
| Multi-account OAuth subscription pool | ✓ (ToS-gated) | × | × | × |
| Async batch API (50% off) | ✓ (7 providers) | ✓ | × | × |
| Response caching | ✓ | ✓ | × | ✓ |
| Hot-reloadable config | ✓ | × | n/a | n/a |
| Embedded web UI | ✓ | ✓ | ✓ | ✓ (best in class) |

60-second start (no Postgres needed)

This quickstart runs only the agent-gateway sub-service, which wraps Claude Code as an OpenAI endpoint; nothing else is required.

# 1. Install
go install github.com/kronaxis/agent-gateway@latest
claude auth login # if you haven't already

# 2. Start the gateway
agent-gateway -config /dev/stdin <<EOF
  port: 8055
  claude_binary: "claude"
EOF

# 3. Send a request
curl -N http://localhost:8055/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-code-agent","stream":true,
       "messages":[{"role":"user","content":"write hello.go"}]}'

The final SSE chunk carries a kronaxis extras object with the file diff the agent produced. For the full cost-routing proxy with cloud + local backends, see the examples directory.
