How model selection works
The LLM Router (packages/core/src/ai/router.ts) resolves the provider and model for each call from TenantTaskModelConfig, keyed by task type:
| Task type | Typical default |
|---|---|
| chat_tutor | Sonnet-class |
| chat_evaluation / summarization / supervisor | Haiku-class |
| content_generation | Haiku-class |
| course_review / course_agent / gamification_agent | Sonnet-class |
An institution admin can override any task to a different provider/model. No model IDs are hardcoded in agents — the router is the single decision point.
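The lookup described above can be sketched as a defaults table with a per-tenant override layer on top. This is a minimal illustration, not the code in router.ts: the `ModelChoice` shape, the `resolveModel` helper, and the placeholder model names are all assumptions, since the actual TenantTaskModelConfig schema is not shown here.

```typescript
// Hypothetical shapes: the real TenantTaskModelConfig schema is not shown in this document.
type TaskType =
  | "chat_tutor"
  | "chat_evaluation"
  | "summarization"
  | "supervisor"
  | "content_generation"
  | "course_review"
  | "course_agent"
  | "gamification_agent";

interface ModelChoice {
  provider: string;
  model: string;
}

// Placeholder names standing in for "Sonnet-class" and "Haiku-class" models.
const SONNET_CLASS: ModelChoice = { provider: "anthropic", model: "sonnet-class" };
const HAIKU_CLASS: ModelChoice = { provider: "anthropic", model: "haiku-class" };

// Typical defaults, mirroring the table above.
const DEFAULTS: Record<TaskType, ModelChoice> = {
  chat_tutor: SONNET_CLASS,
  chat_evaluation: HAIKU_CLASS,
  summarization: HAIKU_CLASS,
  supervisor: HAIKU_CLASS,
  content_generation: HAIKU_CLASS,
  course_review: SONNET_CLASS,
  course_agent: SONNET_CLASS,
  gamification_agent: SONNET_CLASS,
};

// Overrides an institution admin has saved for one tenant (assumed shape).
type TenantOverrides = Partial<Record<TaskType, ModelChoice>>;

// Single decision point: a tenant override wins, otherwise the default applies.
function resolveModel(task: TaskType, overrides: TenantOverrides = {}): ModelChoice {
  return overrides[task] ?? DEFAULTS[task];
}
```

Because agents only ever pass a task type, swapping the model behind `course_agent` for one tenant would be a data change in this sketch, with no agent code touched.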
Router methods
- stream(): streaming chat (metered).
- generate(): agents with tools (metered); tools are converted from JSON Schema to the AI SDK format so they work on any provider.
- generateDirect(): background, fire-and-forget calls (evaluation, content, summarizer, supervisor) that run in Next.js after() without the full metering pipeline.
Fallback chain + circuit breaker
Providers are tried in order by tier: Claude → GPT → Grok → Gemini for LLMs, and Voyage → OpenAI → Cohere for embeddings. A circuit breaker stored in Redis trips once a provider reaches a failure threshold. Because the breaker state is shared across instances, every instance skips a failing provider immediately instead of waiting for a request to time out. Incidents are recorded for observability.
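The interaction between the fallback chain and the breaker can be sketched as follows. This is an illustration under stated assumptions: an in-memory `Map` stands in for Redis, the `CircuitBreaker` class and `callWithFallback` helper are hypothetical names, and the calls are synchronous to keep the example short (real provider calls would be awaited, with the same control flow).

```typescript
// In-memory stand-in for the Redis-backed breaker state. The real key layout,
// threshold, and any cooldown/half-open behaviour are not described here.
class CircuitBreaker {
  private failures = new Map<string, number>();
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // The breaker is open once a provider has failed `threshold` times in a row.
  isOpen(provider: string): boolean {
    return (this.failures.get(provider) ?? 0) >= this.threshold;
  }

  recordFailure(provider: string): void {
    this.failures.set(provider, (this.failures.get(provider) ?? 0) + 1);
  }

  // A success clears the consecutive-failure count.
  recordSuccess(provider: string): void {
    this.failures.delete(provider);
  }
}

// LLM tier order taken from the text above; the lowercase ids are placeholders.
const LLM_CHAIN: readonly string[] = ["claude", "gpt", "grok", "gemini"];

// Try each provider in order, skipping any whose breaker is open.
function callWithFallback<T>(
  chain: readonly string[],
  breaker: CircuitBreaker,
  attempt: (provider: string) => T,
): { provider: string; result: T } {
  const errors: string[] = [];
  for (const provider of chain) {
    if (breaker.isOpen(provider)) {
      // No request is sent, so nothing waits on a timeout.
      errors.push(`${provider}: skipped (breaker open)`);
      continue;
    }
    try {
      const result = attempt(provider);
      breaker.recordSuccess(provider);
      return { provider, result };
    } catch (err) {
      breaker.recordFailure(provider);
      errors.push(`${provider}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Keeping the counters in a shared store rather than in process memory is what lets one instance's failures protect every other instance; the `Map` here only demonstrates the logic.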
Bring-your-own keys + metering
- Per-tenant keys (TenantApiKey) are encrypted with AES-256-GCM. Resolution order: tenant key → global provider key → environment variable.
- Every LLM call passes through the metering middleware (rate limit → credit check → cost calculation), recording input and output tokens separately. Prices come from a database table, never hardcoded.
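The key resolution order and the metering steps can be sketched like this. Everything named below (`KeySources`, `resolveApiKey`, `Price`, `meter`, the per-million-token price unit, and the per-minute rate limit) is an assumption made for illustration; the actual TenantApiKey model, pricing table, and middleware are not shown in this document.

```typescript
// Hypothetical inputs: keys are assumed to be already decrypted at this point.
interface KeySources {
  tenantKey?: string; // per-tenant key, if the tenant configured one
  globalKey?: string; // platform-wide key for the provider
  envKey?: string; // value read from an environment variable
}

// Resolution order from the text: tenant key, then global key, then environment.
function resolveApiKey(sources: KeySources): string {
  const key = sources.tenantKey || sources.globalKey || sources.envKey;
  if (!key) throw new Error("No API key available for provider");
  return key;
}

// Assumed unit: currency per million tokens, loaded from a database row.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Input and output tokens are priced separately.
function costOf(usage: Usage, price: Price): number {
  return (
    (usage.inputTokens * price.inputPerMTok + usage.outputTokens * price.outputPerMTok) /
    1_000_000
  );
}

interface TenantState {
  requestsThisMinute: number;
  rateLimit: number;
  credits: number;
}

// Rate limit, then credit check, then cost calculation. In a real pipeline the
// cost step would run after the provider returns its token usage.
function meter(tenant: TenantState, usage: Usage, price: Price): number {
  if (tenant.requestsThisMinute >= tenant.rateLimit) throw new Error("Rate limit exceeded");
  if (tenant.credits <= 0) throw new Error("Insufficient credits");
  tenant.requestsThisMinute += 1;
  const cost = costOf(usage, price);
  tenant.credits -= cost;
  return cost;
}
```

Passing `price` in as data rather than importing a constant reflects the point above that prices live in a table and can change without a deploy.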
Why it matters
You are not betting your platform on one lab's pricing, latency or availability. As models improve, you switch a dropdown — not your codebase.