Routing
Intent
Classify an input and direct it to a specialized handler optimized for that type of request.
Problem
A prompt optimized for one type of input tends to perform poorly on others. A customer-service prompt tuned for refund requests will struggle with technical questions. A one-size-fits-all prompt cannot serve diverse inputs well, because quality degrades as more cases are packed into it.
Solution
Add a classification step at the beginning that routes inputs to specialized downstream handlers. Each handler has its own optimized prompt, tools, or even model. The router can be an LLM call, a traditional classifier, or simple keyword matching — whatever is most reliable for your domain. This is separation of concerns applied to LLM systems. Each handler can be independently optimized and tested.
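To make the cheapest router option concrete, here is a minimal sketch of keyword matching feeding a handler table. The keyword lists, category names, and stand-in handlers are all assumptions chosen for illustration; a real system would plug in its own prompts, tools, or models.

```python
from typing import Callable, Dict, Tuple

# Hypothetical keyword lists; tune these for your own domain.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "refunds": ("refund", "money back", "chargeback"),
    "technical": ("error", "crash", "bug", "not working"),
}


def route(text: str) -> str:
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for category, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return category
    return "general"  # nothing matched: fall through to the generic handler


# Stand-in handlers; each would normally call its own prompt, tools, or model.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "refunds": lambda text: f"[refunds handler] {text}",
    "technical": lambda text: f"[technical handler] {text}",
    "general": lambda text: f"[general handler] {text}",
}


def dispatch(text: str) -> str:
    """Classify the input, then hand it to the matching handler."""
    return HANDLERS[route(text)](text)
```

Substring matching is crude (for example, "bug" also matches "debug"), which is one reason an LLM or trained classifier may be more reliable once categories overlap.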
Diagram
                  ┌→ [Handler A: Refunds]
                  │
Input → [Router] ─┼→ [Handler B: Technical Support]
                  │
                  └→ [Handler C: General Questions]
When to Use
- Complex tasks with distinct categories that need different handling
- When you want to route easy queries to cheaper/faster models
- Customer service systems with multiple intents
- Multi-domain systems where each domain has specialized requirements
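Routing easy queries to a cheaper model can be sketched as a lookup from a difficulty label to a model configuration. The model identifiers below are placeholders, not real model names, and the difficulty heuristic is an assumption made only to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str       # placeholder identifier, not a real model name
    max_tokens: int


ROUTES = {
    "easy": Route(model="small-fast-model", max_tokens=200),
    "hard": Route(model="large-capable-model", max_tokens=1000),
}


def estimate_difficulty(query: str) -> str:
    """Crude assumed heuristic: long or multi-question queries count as hard."""
    if len(query.split()) > 40 or query.count("?") > 1:
        return "hard"
    return "easy"


def pick_route(query: str) -> Route:
    """Select the model configuration for this query."""
    return ROUTES[estimate_difficulty(query)]
```

In practice the difficulty estimate would likely come from the same classifier that assigns the category, rather than from word counts.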
When NOT to Use
- Tasks where all inputs need the same treatment
- When classification accuracy is low — misrouting is worse than a generic handler
- Simple applications where a single prompt handles everything well
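One way to limit the damage from low classification accuracy is to route only when the router is confident and otherwise use a generic handler. The sketch below assumes the classifier returns a score per category; the 0.7 threshold and the "generic" label are illustrative choices, not recommended values.

```python
from typing import Dict


def choose_handler(scores: Dict[str, float], threshold: float = 0.7) -> str:
    """Pick the top-scoring category, or 'generic' when the router is unsure."""
    if not scores:
        return "generic"
    category, confidence = max(scores.items(), key=lambda item: item[1])
    return category if confidence >= threshold else "generic"
```

With this guard, an ambiguous ticket scored 0.48 billing and 0.45 technical goes to the generic handler instead of being misrouted to a specialist.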
Pros & Cons
Pros
- Each handler can be independently optimized
- Cost optimization — route easy tasks to smaller models
- Clean separation of concerns
- Easy to add new categories without affecting existing handlers
Cons
- Classification errors cause misrouting
- Added latency from the classification step
- Requires maintaining multiple prompts/handlers
- Edge cases between categories can be hard to handle
Implementation Steps
1. Define clear categories for your inputs
2. Build a classifier — LLM-based, traditional ML, or rule-based
3. Create specialized handlers for each category
4. Design fallback handling for unclassified or ambiguous inputs
5. Monitor classification accuracy and misrouting rates
6. Iterate on classifier and handlers based on real traffic patterns
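The monitoring step can be sketched as a small report over hand-labeled traffic. The function name, the report fields, and the sample data are hypothetical; the idea is to compare the router's label against a reviewed label and see which confusions dominate.

```python
from collections import Counter
from typing import Iterable, Tuple


def routing_report(pairs: Iterable[Tuple[str, str]]) -> dict:
    """Summarize (expected, predicted) label pairs from a hand-labeled sample."""
    pairs = list(pairs)
    # Count each distinct (expected, predicted) mismatch.
    misroutes = Counter((exp, pred) for exp, pred in pairs if exp != pred)
    total = len(pairs)
    wrong = sum(misroutes.values())
    return {
        "accuracy": (total - wrong) / total if total else 0.0,
        "misroute_rate": wrong / total if total else 0.0,
        "top_confusions": misroutes.most_common(3),
    }


# Made-up sample: one billing ticket was sent to the technical handler.
sample = [
    ("billing", "billing"),
    ("technical", "technical"),
    ("billing", "technical"),
    ("sales", "sales"),
]
report = routing_report(sample)
```

The most frequent confusions are usually the best place to start when iterating: they point to categories whose definitions overlap or whose examples in the classifier prompt are unclear.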
Real-World Example
Customer Service Triage
An LLM classifies incoming support tickets into categories such as billing, technical, feature-request, or complaint. Each category routes to a handler with a specialized prompt, access to the relevant knowledge base, and appropriate tools (e.g., the billing handler can look up invoices; the technical handler can search documentation). The code below is a reduced version: it uses three categories (billing, technical, sales) and varies only the system prompt.
import anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

# One specialized system prompt per category.
HANDLERS = {
    "billing": "You are a billing specialist. Help resolve payment and invoice issues.",
    "technical": "You are a senior engineer. Debug and resolve technical problems.",
    "sales": "You are a sales consultant. Help with product and pricing inquiries.",
}


def classify(ticket: str) -> str:
    """Ask the model for a one-word category label."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=20,
        messages=[{"role": "user", "content": f"Classify this ticket as billing, technical, or sales. Reply with one word only.\n\n{ticket}"}],
    )
    # Normalize the reply so that "Billing." still matches the "billing" key.
    return response.content[0].text.strip().lower().rstrip(".")


def handle_ticket(ticket: str) -> str:
    """Route the ticket to the system prompt for its category and answer it."""
    category = classify(ticket)
    # Unrecognized labels fall back to the technical handler.
    system_prompt = HANDLERS.get(category, HANDLERS["technical"])
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        system=system_prompt,
        messages=[{"role": "user", "content": ticket}],
    )
    return response.content[0].text