# Models & Providers
ContextRouter provides a unified interface for working with LLMs and embedding models. Whether you’re using commercial APIs, self-hosted models, or local inference, the same code works everywhere.
## Universal Model Support
One of ContextRouter’s key strengths is provider flexibility. You can switch between providers with a configuration change — no code modifications required.
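For instance, pointing the same call at a different provider is just a different key string. A minimal sketch, using the `model_registry` API covered under Using Models below:

```python
from contextrouter.core import get_core_config
from contextrouter.modules.models import model_registry

config = get_core_config()

# Identical call path; only the model key selects the provider
llm = model_registry.create_llm("vertex/gemini-2.0-flash", config=config)
# llm = model_registry.create_llm("local/llama3.2", config=config)  # swap to local Ollama
```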
### Commercial APIs
| Provider | LLM | Embeddings | Example Key | Best For |
|---|---|---|---|---|
| Google Vertex AI | ✅ | ✅ | `vertex/gemini-2.0-flash` | Production, multimodal |
| OpenAI | ✅ | ✅ | `openai/gpt-4o` | Quality, ecosystem |
| Anthropic | ✅ | ❌ | `anthropic/claude-sonnet-4` | Long context, safety |
| Groq | ✅ | ❌ | `groq/llama-3.3-70b` | Ultra-fast inference |
### Aggregators & Self-Hosted
| Provider | LLM | Embeddings | Example Key | Best For |
|---|---|---|---|---|
| OpenRouter | ✅ | ❌ | `openrouter/deepseek/deepseek-r1` | Model variety |
| Ollama | ✅ | ✅ | `local/llama3.2` | Privacy, local dev |
| vLLM | ✅ | ❌ | `local-vllm/meta-llama/Llama-3.1-8B` | High-throughput serving |
| HuggingFace | ✅ | ✅ | `hf/distilgpt2` | Custom models, STT |
## Quick Configuration
### Option 1: Settings File
```toml
[models]
default_llm = "vertex/gemini-2.0-flash"
default_embeddings = "vertex/text-embedding-004"

[vertex]
project_id = "my-gcp-project"
location = "us-central1"
```

### Option 2: Environment Variables
```bash
export VERTEX_PROJECT_ID=my-gcp-project
export VERTEX_LOCATION=us-central1
# or
export OPENAI_API_KEY=sk-...
```

## Using Models
### Basic Usage
```python
from contextrouter.modules.models import model_registry
from contextrouter.modules.models.types import ModelRequest, TextPart
from contextrouter.core import get_core_config

config = get_core_config()

# Get an LLM
llm = model_registry.create_llm("vertex/gemini-2.0-flash", config=config)

# Generate a response
request = ModelRequest(
    parts=[TextPart(text="Explain quantum computing simply")],
    temperature=0.7,
    max_output_tokens=1024,
)
response = await llm.generate(request)
print(response.text)
```

### Streaming Responses
```python
# Stream for real-time output
async for event in llm.stream(request):
    if event.event_type == "text_delta":
        print(event.delta, end="", flush=True)
    elif event.event_type == "final_text":
        print()  # New line at end
```

### Embeddings
```python
# Get embedding model
embeddings = model_registry.create_embeddings(
    "vertex/text-embedding-004",
    config=config,
)

# Embed a query
vector = await embeddings.embed_query("What is machine learning?")
print(f"Dimensions: {len(vector)}")

# Embed multiple documents
vectors = await embeddings.embed_documents([
    "First document text...",
    "Second document text...",
])
```
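What you do with the vectors is up to you; a common pattern is ranking documents by cosine similarity against the query. The sketch below is plain Python and assumes the embeddings come back as `list[float]` values, as in the example above:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

# Rank the documents embedded above against the query vector
scores = [cosine(vector, doc_vec) for doc_vec in vectors]
best = max(range(len(scores)), key=scores.__getitem__)
print(f"Best match: document {best} (score {scores[best]:.3f})")
```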
## Fallback Strategies

Production systems need resilience. ContextRouter supports automatic fallback between providers:
```python
model = model_registry.get_llm_with_fallback(
    key="vertex/gemini-2.0-flash",  # Primary
    fallback_keys=[
        "openai/gpt-4o-mini",       # First fallback
        "local/llama3.2",           # Last resort (local)
    ],
    strategy="fallback",
)

# Uses primary, falls back automatically on failure
response = await model.generate(request)
```

### Available Strategies
| Strategy | Behavior | Use Case |
|---|---|---|
| `fallback` | Try sequentially until one succeeds | Maximum reliability |
| `parallel` | Race all, return first success | Minimum latency |
| `cost-priority` | Prefer cheaper models first | Budget optimization |
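For latency-sensitive paths, the `parallel` strategy races the configured models and returns the first success. A minimal sketch, assuming the same `get_llm_with_fallback` signature shown above:

```python
# Race both providers; the first successful response wins
model = model_registry.get_llm_with_fallback(
    key="groq/llama-3.3-70b",                   # Fast primary
    fallback_keys=["vertex/gemini-2.0-flash"],  # Raced alongside, not sequential
    strategy="parallel",
)
response = await model.generate(request)
```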
```python
# Cost optimization: try cheap first
model = model_registry.get_llm_with_fallback(
    key="local/llama3.2",               # Free, local
    fallback_keys=[
        "groq/llama-3.3-70b",           # Fast, cheap
        "vertex/gemini-2.0-flash",      # Reliable, moderate
        "openai/gpt-4o",                # Premium
    ],
    strategy="cost-priority",
)
```

## Multimodal Support
ContextRouter’s model interface supports multiple modalities:
```python
from contextrouter.modules.models.types import (
    ModelRequest,
    TextPart,
    ImagePart,
    AudioPart,
)

# Text + image request
request = ModelRequest(
    parts=[
        TextPart(text="What's in this image?"),
        ImagePart(
            mime="image/jpeg",
            data_b64="...",  # Base64 encoded
            # or uri="https://example.com/image.jpg"
        ),
    ]
)

# Audio transcription
request = ModelRequest(
    parts=[
        AudioPart(
            mime="audio/wav",
            data_b64="...",
            sample_rate_hz=16000,
        ),
    ]
)
```
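Multimodal requests go through the same `generate`/`stream` calls as text-only ones. A short sketch, assuming a multimodal-capable model such as `vertex/gemini-2.0-flash` (flagged for multimodal use in the table above):

```python
# Same unified call path as text-only requests
llm = model_registry.create_llm("vertex/gemini-2.0-flash", config=config)
response = await llm.generate(request)  # request may carry text, image, or audio parts
print(response.text)
```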
## Learn More

- LLM Providers — Detailed setup for each provider
- Embeddings — Embedding model configuration
- Configuration Reference — All model settings