Automatic LLM routing that selects the cheapest capable model per request using complexity scoring.

AI Smart Routing

ArchonHQ routes every AI request to the right model automatically. You don't pick a model per call. AiPipe scores each request on complexity and picks the cheapest model that can handle it well.

Why routing matters

Most teams pick one LLM and use it for everything. The problem:

A cheap model (gpt-4o-mini) handles greetings and lookups well but makes mistakes on architecture decisions and formal reasoning
A frontier model (Claude Sonnet, GPT-4o) handles everything correctly but costs 15–30× more per call than necessary for simple work

AiPipe solves this by scoring each request and routing it to the cheapest model that can handle it at sufficient quality.

Result: you typically save ~50% on LLM costs without changing your workflow or giving up quality on important tasks.

How routing decisions are made

AiPipe scores each request on five signals:

Length: longer prompts generally require more capable models
Code: code blocks increase complexity; complex languages (Rust, Go, TypeScript) increase it further
Keywords: "prove by induction", "architecture tradeoffs", "security vulnerability" map to high complexity; "translate", "summarise", "hello" map to low
Structure: multi-part questions, numbered lists, and headers indicate higher complexity
Depth: multi-turn conversations carry accumulated context and route higher

The five signals combine into a single score in [0.05, 1.0]. This score drives model selection.

At low complexity, the cheapest capable model wins. As complexity rises, a quality-weighted formula kicks in: models with higher benchmark scores earn a cost discount, making them more competitive despite higher nominal prices.

Routing in practice:

Complexity	Typical request	Model selected
0.05–0.20	Greeting, simple question, status check	gpt-4o-mini
0.20–0.55	Translation, summarisation, basic code review	gpt-4o-mini
0.55–0.70	Multi-step explanation, medium code task	gpt-4o-mini or gpt-4o
0.68–0.90	Architecture analysis, complex reasoning	claude-sonnet-4-5
0.80–1.00	Formal proofs, security audits, research	claude-sonnet-4-5 or claude-opus

Supported providers

AiPipe routes across all providers you've configured. Enable any provider by adding its API key in the Connection Wizard →.

Provider	Models	Best for
OpenAI	gpt-4o-mini, gpt-4o	Cost-efficient default; strong general performance
Anthropic	Haiku, Sonnet, Opus	Top quality on complex reasoning and coding
Google Gemini	Flash, Pro	Ultra-cheap for simple tasks; enormous context window
xAI (Grok)	Grok-4, Grok-4-fast	Fast reasoning at competitive cost
OpenRouter	auto	Routes across 200+ models via unified API
MiniMax	abab6.5s	Cost-competitive for medium tasks
Kimi (Moonshot)	v1-8k, v1-32k	Long-context tasks at low cost

Request counts by provider and model
Success rate per provider
Total cost and cost breakdown by model
Queue depth and queue capacity

Use this weekly to verify routing is working as expected and identify any providers with elevated error rates.

For more detail

AiPipe Benchmark: full routing analysis with cost tables →

AI Smart Routing

AI Smart Routing

Why routing matters

How routing decisions are made

Which model gets picked

Supported providers

Per-tenant isolation

Caching

Reliability and fallback

Monitoring routing in the dashboard

For more detail

On this page