Multi-provider AI: pick the model, never get locked in

How model selection works

The LLM Router (packages/core/src/ai/router.ts) resolves the provider and model for each call from TenantTaskModelConfig, keyed by task type:

Task type	Typical default
chat_tutor	Sonnet-class
chat_evaluation / summarization / supervisor	Haiku-class
content_generation	Haiku-class
course_review / course_agent / gamification_agent	Sonnet-class

An institution admin can override any task to a different provider/model. No model IDs are hardcoded in agents — the router is the single decision point.
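The lookup described above can be sketched as follows. This is a minimal illustration, not the contents of router.ts: the TaskType union, ModelChoice shape, DEFAULTS map, TenantOverrides type, and resolveModel function are all hypothetical names, and the model strings are tier placeholders rather than real model IDs.

```typescript
// Hypothetical shapes; the real TenantTaskModelConfig schema is not shown here.
type TaskType =
  | "chat_tutor"
  | "chat_evaluation"
  | "summarization"
  | "supervisor"
  | "content_generation"
  | "course_review"
  | "course_agent"
  | "gamification_agent";

interface ModelChoice {
  provider: string; // placeholder provider name
  model: string; // placeholder tier label, not a real model ID
}

const SONNET: ModelChoice = { provider: "anthropic", model: "sonnet-class" };
const HAIKU: ModelChoice = { provider: "anthropic", model: "haiku-class" };

// Platform defaults per task type, mirroring the table above.
const DEFAULTS: Record<TaskType, ModelChoice> = {
  chat_tutor: SONNET,
  chat_evaluation: HAIKU,
  summarization: HAIKU,
  supervisor: HAIKU,
  content_generation: HAIKU,
  course_review: SONNET,
  course_agent: SONNET,
  gamification_agent: SONNET,
};

// Admin overrides: tenantId -> (task type -> choice). Stands in for a DB lookup.
type TenantOverrides = Map<string, Partial<Record<TaskType, ModelChoice>>>;

// Single decision point: a tenant override wins, otherwise the default applies.
function resolveModel(
  tenantId: string,
  task: TaskType,
  overrides: TenantOverrides,
): ModelChoice {
  return overrides.get(tenantId)?.[task] ?? DEFAULTS[task];
}
```

Because agents only ever pass a task type, changing a tenant's model is a data change in the override map and needs no agent code change.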

Router methods

stream() — streaming chat (metered).
generate() — agents with tools (metered); tools are converted from JSON Schema to the AI SDK format so they work on any provider.
generateDirect() — background, fire-and-forget calls (evaluation, content, summarizer, supervisor) that run in Next.js after() without the full metering pipeline.

Fallback chain + circuit breaker

Fallback order, by tier: Claude → GPT → Grok → Gemini for LLMs, and Voyage → OpenAI → Cohere for embeddings. A circuit breaker stored in Redis trips after a failure threshold and is shared across instances, so once it opens, every instance skips the failing provider immediately instead of waiting for a request to time out. Incidents are recorded for observability.
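A rough sketch of how a fallback chain and a shared breaker can interact is shown below. It is an assumption-laden illustration: BreakerStore, InMemoryBreakerStore, FAILURE_THRESHOLD, and callWithFallback are hypothetical names, the in-memory map stands in for Redis, and the threshold value is invented. Recovery (a cooldown or half-open state that lets a provider back in) is omitted for brevity.

```typescript
// Hypothetical storage interface; in production this would be backed by Redis
// so that every instance sees the same failure counts.
interface BreakerStore {
  getFailures(provider: string): Promise<number>;
  recordFailure(provider: string): Promise<void>;
  reset(provider: string): Promise<void>;
}

// In-memory stand-in used only for this sketch.
class InMemoryBreakerStore implements BreakerStore {
  private failures = new Map<string, number>();
  async getFailures(provider: string): Promise<number> {
    return this.failures.get(provider) ?? 0;
  }
  async recordFailure(provider: string): Promise<void> {
    this.failures.set(provider, (this.failures.get(provider) ?? 0) + 1);
  }
  async reset(provider: string): Promise<void> {
    this.failures.delete(provider);
  }
}

const FAILURE_THRESHOLD = 3; // assumed value, not taken from the platform

type ProviderCall<T> = (provider: string) => Promise<T>;

// Walk the chain in order, skipping any provider whose breaker is open.
async function callWithFallback<T>(
  chain: string[],
  store: BreakerStore,
  call: ProviderCall<T>,
): Promise<{ provider: string; result: T }> {
  const errors: string[] = [];
  for (const provider of chain) {
    if ((await store.getFailures(provider)) >= FAILURE_THRESHOLD) {
      errors.push(`${provider}: breaker open, skipped`);
      continue; // no request is sent, so there is no timeout to wait for
    }
    try {
      const result = await call(provider);
      await store.reset(provider); // a success clears that provider's count
      return { provider, result };
    } catch (err) {
      await store.recordFailure(provider);
      errors.push(`${provider}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

The errors array is where incident recording would hook in; here it only feeds the final error message.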

Bring-your-own keys + metering

Per-tenant keys (TenantApiKey) are encrypted with AES-256-GCM. Resolution order: tenant key → global provider key → environment variable.
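The resolution order can be illustrated with a small lookup. This is a sketch under assumptions: KeySources, resolveApiKey, the `tenantId:provider` map key, and the `<PROVIDER>_API_KEY` environment variable naming are all hypothetical, and tenant keys are assumed to be already decrypted before they reach this function.

```typescript
// Hypothetical inputs; the maps stand in for database lookups.
interface KeySources {
  tenantKeys: Map<string, string>; // `${tenantId}:${provider}` -> decrypted key
  globalKeys: Map<string, string>; // provider -> key
  env: Record<string, string | undefined>; // e.g. a copy of process.env
}

type KeySource = "tenant" | "global" | "env";

// Tries tenant key, then global provider key, then environment variable.
function resolveApiKey(
  tenantId: string,
  provider: string,
  sources: KeySources,
): { key: string; source: KeySource } | undefined {
  const tenantKey = sources.tenantKeys.get(`${tenantId}:${provider}`);
  if (tenantKey) return { key: tenantKey, source: "tenant" };

  const globalKey = sources.globalKeys.get(provider);
  if (globalKey) return { key: globalKey, source: "global" };

  // Assumed naming convention for the environment fallback.
  const envKey = sources.env[`${provider.toUpperCase()}_API_KEY`];
  if (envKey) return { key: envKey, source: "env" };

  return undefined; // no key configured anywhere for this provider
}
```

Returning the source alongside the key makes it easy to tell whether a call was billed against the tenant's own key or a platform key.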
Every metered call (stream() and generate()) passes through the metering middleware (rate limit → credit check → cost calculation), which records input and output tokens separately. Prices come from a database table and are never hardcoded.
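The pipeline order can be sketched like this. Everything named here is illustrative: Price, Usage, TenantState, calculateCost, and meteredCall are hypothetical, the price map stands in for the database table, and the units (USD per million tokens, credits held in USD) are assumptions.

```typescript
// Assumed pricing unit: USD per million tokens, with separate input and output rates.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical per-tenant state; credits are assumed to be held in USD.
interface TenantState {
  credits: number;
  requestsThisWindow: number;
  maxRequestsPerWindow: number;
}

// Input and output tokens are priced separately, then summed.
function calculateCost(price: Price, usage: Usage): number {
  return (
    (usage.inputTokens * price.inputPerMTok +
      usage.outputTokens * price.outputPerMTok) /
    1_000_000
  );
}

// Rate limit -> credit check -> provider call -> cost calculation.
async function meteredCall(
  tenant: TenantState,
  model: string,
  prices: Map<string, Price>, // stands in for the prices table
  call: () => Promise<Usage>,
): Promise<number> {
  if (tenant.requestsThisWindow >= tenant.maxRequestsPerWindow) {
    throw new Error("rate limit exceeded");
  }
  if (tenant.credits <= 0) {
    throw new Error("insufficient credits");
  }
  const price = prices.get(model);
  if (!price) throw new Error(`no price row for model ${model}`);

  tenant.requestsThisWindow += 1;
  const usage = await call(); // token counts are only known after the call
  const cost = calculateCost(price, usage);
  tenant.credits -= cost;
  return cost;
}
```

Cost is computed after the provider responds because output token counts are not known in advance; the credit check beforehand only guards against starting a call with an empty balance.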

Why it matters

You are not betting your platform on one lab's pricing, latency or availability. As models improve, you switch a dropdown — not your codebase.