CLI Reference
ContextRouter includes a powerful CLI built with Click and Rich for beautiful, colorized output.
Installation
The CLI is included with the main package:
```bash
pip install contextrouter
contextrouter --help
```

Global Options

```
contextrouter [OPTIONS] COMMAND [ARGS]

Options:
  --config PATH    Path to settings.toml
  --env PATH       Path to .env file
  -v, --verbose    Enable debug logging
  --version        Show version
  --help           Show help
```

RAG Commands
validate - Configuration Validation
Validate your ContextRouter configuration and check for common issues.
Usage:
```bash
contextrouter rag validate [OPTIONS]
```

Options:
- `--config PATH` - Path to settings file (default: auto-detect)
- `--check-providers` - Test provider connections
- `--check-models` - Test model availability
- `--verbose` - Show detailed validation results
Examples:
```bash
# Basic validation
contextrouter rag validate

# Full system check
contextrouter rag validate --check-providers --check-models --verbose
```

chat - Interactive RAG Chat
Start an interactive chat session with RAG capabilities.
Usage:
```bash
contextrouter rag chat [OPTIONS]
```

Options:
- `--web / --no-web` - Enable/disable web search (default: enabled)
- `--rerank / --no-rerank` - Enable/disable reranking (default: enabled)
- `--citations / --no-citations` - Show/hide citations (default: enabled)
- `--provider STR` - Override default provider
- `--model STR` - Override default LLM
- `--style STR` - Response style (concise, detailed, technical)
- `--max-results INT` - Maximum search results (default: 10)
- `--temperature FLOAT` - LLM temperature (default: 0.7)
- `--stream / --no-stream` - Enable/disable streaming responses
Examples:
```bash
# Basic chat
contextrouter rag chat

# Technical Q&A with web search
contextrouter rag chat --web --style technical --max-results 15

# Fast responses without reranking
contextrouter rag chat --no-rerank --temperature 0.1

# Use specific model
contextrouter rag chat --model vertex/gemini-2.0-flash --provider postgres
```

Interactive Commands:
- `quit` or `exit` - End session
- `clear` - Clear conversation history
- `history` - Show conversation history
- `config` - Show current configuration
- `help` - Show available commands
query - Single Query Execution
Execute a single RAG query without interactive mode.
Usage:
```bash
contextrouter rag query [OPTIONS] QUERY
```

Options:
- `--json` - Output results as JSON
- `--output PATH` - Save results to file
- `--web / --no-web` - Enable/disable web search
- `--rerank / --no-rerank` - Enable/disable reranking
- `--citations / --no-citations` - Include/exclude citations
- `--max-results INT` - Maximum results to return
- `--provider STR` - Search provider to use
- `--model STR` - LLM model for generation
Examples:
```bash
# Simple query
contextrouter rag query "What is machine learning?"

# JSON output for scripting
contextrouter rag query "Latest AI developments" --web --json

# Save detailed results
contextrouter rag query "RAG architecture" --output ./rag-explanation.json --max-results 20

# Technical query with citations
contextrouter rag query "How does vector search work?" --citations --provider vertex
```

JSON Output Format:
{ "query": "What is machine learning?", "response": "Machine learning is a subset of AI...", "citations": [ { "text": "...machine learning definition...", "source": {"type": "book", "title": "AI Handbook", "page": 45}, "confidence": 0.92 } ], "metadata": { "execution_time": 1.2, "results_count": 5, "model_used": "vertex/gemini-2.0-flash" }}Ingestion Commands
Ingestion Commands
run - Complete Pipeline Execution
Run the full ingestion pipeline from raw content to deployed knowledge base.
Usage:
```bash
contextrouter ingest run [OPTIONS]
```

Options:
- `--type TYPE` - Content type: book, video, qa, web, knowledge
- `--input PATH` - Input file or directory path
- `--output PATH` - Output directory (default: ./ingestion_output)
- `--overwrite` - Overwrite existing artifacts
- `--skip-preprocess` - Skip preprocessing stage
- `--skip-structure` - Skip taxonomy/ontology building
- `--skip-index` - Skip indexing stage
- `--skip-deploy` - Skip deployment stage
- `--workers INT` - Number of parallel workers (default: CPU cores / 2)
- `--dry-run` - Show what would be done without executing
Examples:
```bash
# Full book ingestion
contextrouter ingest run --type book --input ./my-book.pdf

# Q&A transcripts with custom output
contextrouter ingest run --type qa --input ./transcripts/ --output ./qa-knowledge

# Resume after preprocessing
contextrouter ingest run --type book --skip-preprocess

# Parallel processing
contextrouter ingest run --type video --input ./videos/ --workers 8

# Dry run to preview
contextrouter ingest run --type web --input ./articles/ --dry-run
```

preprocess - Text Extraction & Chunking
Clean raw content and prepare it for analysis by converting to structured text chunks.
Usage:
```bash
contextrouter ingest preprocess [OPTIONS]
```

Options:
- `--type TYPE` - Content type (book, video, qa, web, knowledge)
- `--input PATH` - Input file or directory
- `--chunk-size INT` - Target chunk size in characters (default: 1000)
- `--chunk-overlap INT` - Overlap between chunks (default: 200; see the sketch after this list)
- `--min-chunk-size INT` - Minimum chunk size (default: 100)
- `--max-chunk-size INT` - Maximum chunk size (default: 2000)
- `--encoding STR` - Text encoding (default: utf-8)
- `--preserve-formatting` - Keep markdown formatting in chunks
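The chunking options describe a sliding window: each chunk starts roughly `chunk-size - chunk-overlap` characters after the previous one, so consecutive chunks share the overlap region. The sketch below only illustrates that arithmetic; it is not ContextRouter's actual chunker, and its handling of undersized trailing chunks may differ.

```python
# Generic sliding-window chunking sketch (illustrative only, not ContextRouter internals).
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200,
               min_chunk_size: int = 100) -> list[str]:
    stride = chunk_size - chunk_overlap  # each new chunk starts this far into the text
    chunks = []
    for start in range(0, len(text), stride):
        chunk = text[start:start + chunk_size]
        if len(chunk) >= min_chunk_size:  # drop tiny trailing fragments
            chunks.append(chunk)
    return chunks

# With the defaults, a 2,600-character document yields chunks starting at 0, 800, 1600, and 2400.
```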
Examples:
```bash
# Basic preprocessing
contextrouter ingest preprocess --type book --input ./document.pdf

# Custom chunking for long documents
contextrouter ingest preprocess --type book --chunk-size 1500 --chunk-overlap 300

# Video transcripts with smaller chunks
contextrouter ingest preprocess --type video --chunk-size 500 --min-chunk-size 50

# Preserve code formatting
contextrouter ingest preprocess --type knowledge --preserve-formatting
```

structure - Taxonomy & Ontology Building
Analyze content to build hierarchical categories and entity relationship schemas.
Usage:
```bash
contextrouter ingest structure [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--model STR` - LLM model for analysis (default: from config)
- `--max-samples INT` - Maximum content samples to analyze (default: 100)
- `--max-depth INT` - Maximum taxonomy depth (default: 3)
- `--categories LIST` - Custom category hints
- `--philosophy-focus STR` - Custom analysis focus prompt
Examples:
```bash
# Build taxonomy for books
contextrouter ingest structure --type book

# Custom analysis with specific categories
contextrouter ingest structure --type qa --categories "AI,Machine Learning,Deep Learning"

# Use specific model
contextrouter ingest structure --type knowledge --model vertex/gemini-2.0-flash

# Deep taxonomy for complex domains
contextrouter ingest structure --type book --max-depth 5 --max-samples 200
```

index - Knowledge Graph & Shadow Records
Extract entities, build relationships, and create optimized search metadata.
Usage:
```bash
contextrouter ingest index [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--incremental` - Build on existing knowledge graph
- `--no-graph` - Skip knowledge graph construction
- `--max-entities-per-chunk INT` - Entity extraction limit (default: 10)
- `--confidence-threshold FLOAT` - Minimum confidence for relationships (default: 0.3)
- `--builders LIST` - Graph builders to use (llm, local, hybrid)
Examples:
```bash
# Full indexing with knowledge graph
contextrouter ingest index --type book

# Incremental indexing (preserve existing)
contextrouter ingest index --type qa --incremental

# Skip graph for simple search-only indexing
contextrouter ingest index --type web --no-graph

# High-precision entity extraction
contextrouter ingest index --type knowledge --max-entities-per-chunk 20 --confidence-threshold 0.5
```

export - Data Export for Indexing
Convert processed data to formats suitable for search index population.
Usage:
```bash
contextrouter ingest export [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--format STR` - Export format: jsonl, sql, csv (default: jsonl)
- `--include-metadata` - Include full metadata in export
- `--compress` - Compress output files
- `--batch-size INT` - Records per output file (default: 1000)
Examples:
```bash
# Export for Vertex AI Search
contextrouter ingest export --type book --format jsonl

# Export SQL for Postgres
contextrouter ingest export --type qa --format sql

# Compressed export for large datasets
contextrouter ingest export --type web --compress --batch-size 5000
```

deploy - Index Population & Upload
Upload processed data to search indexes and knowledge graphs.
Usage:
```bash
contextrouter ingest deploy [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--provider STR` - Target provider (postgres, vertex, gcs)
- `--target STR` - Deployment target for blue/green (blue, green)
- `--batch-size INT` - Upload batch size (default: 1000)
- `--workers INT` - Parallel upload workers (default: 4)
- `--validate` - Run validation after deployment
- `--dry-run` - Show what would be deployed
Examples:
```bash
# Deploy to default provider
contextrouter ingest deploy --type book

# Deploy to specific provider
contextrouter ingest deploy --type qa --provider vertex

# Blue/green deployment
contextrouter ingest deploy --type book --target green

# Large dataset deployment with validation
contextrouter ingest deploy --type knowledge --workers 8 --validate
```

report - Ingestion Analytics & Reporting
Generate detailed reports on ingestion quality and performance.
Usage:
```bash
contextrouter ingest report [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--output PATH` - Report output path (default: ./report.html)
- `--format STR` - Report format: html, json, markdown (default: html)
- `--include-charts` - Include charts in HTML report
- `--include-samples` - Include data samples in report
Examples:
```bash
# HTML report with charts
contextrouter ingest report --type book

# JSON report for automation
contextrouter ingest report --type qa --format json --output ./qa-report.json

# Markdown report for documentation
contextrouter ingest report --type knowledge --format markdown --include-samples
```

persona - Assistant Persona Generation
Generate personalized assistant configurations based on ingested content.
Usage:
```bash
contextrouter ingest persona [OPTIONS]
```

Options:
- `--type TYPE` - Content type
- `--traits LIST` - Personality traits to emphasize
- `--expertise LIST` - Expertise areas to focus on
- `--style STR` - Response style (formal, casual, technical)
- `--output PATH` - Persona configuration output path
Examples:
```bash
# Generate persona for Q&A content
contextrouter ingest persona --type qa

# Technical expert persona
contextrouter ingest persona --type knowledge --style technical --expertise "AI,Machine Learning"

# Custom personality traits
contextrouter ingest persona --type book --traits "helpful,concise,accurate"
```

Registry Commands
list - Component Discovery
List all registered components in the system.
Usage:
```bash
contextrouter registry list [OPTIONS]
```

Options:
- `--type STR` - Filter by component type (connectors, providers, transformers, agents, graphs)
- `--pattern STR` - Filter by name pattern (supports wildcards)
- `--json` - Output as JSON
- `--verbose` - Show detailed information
Examples:
```bash
# List all components
contextrouter registry list

# List only connectors
contextrouter registry list --type connectors

# Find components by pattern
contextrouter registry list --pattern "*web*"

# JSON output for automation
contextrouter registry list --json --type providers
```

Sample Output:
```
Registered Components
══════════════════════

Connectors (5)
├── web   - Google Custom Search integration
├── file  - Local file ingestion (PDF, TXT, MD, JSON)
├── rss   - RSS/Atom feed monitoring
├── api   - Generic REST API connector
└── slack - Slack messages connector

Providers (3)
├── postgres - PostgreSQL with pgvector
├── vertex   - Vertex AI Search
└── gcs      - Google Cloud Storage

Transformers (6)
├── ner           - Named Entity Recognition
├── taxonomy      - Category classification
├── summarization - Text summarization
├── keyphrases    - Key phrase extraction
└── shadow        - Shadow record generation
```

show - Component Details
Show detailed information about a specific component.
Usage:
```bash
contextrouter registry show [OPTIONS] NAME
```

Options:
- `--type STR` - Component type (if name conflicts)
- `--config` - Show configuration schema
- `--examples` - Show usage examples
- `--verbose` - Show full implementation details
Examples:
```bash
# Show provider details
contextrouter registry show postgres

# Show with configuration
contextrouter registry show vertex --config

# Verbose output
contextrouter registry show ner --verbose

# Show examples
contextrouter registry show web --examples
```

Sample Output:
```
Component: postgres
Type: provider
Description: PostgreSQL with pgvector for hybrid search
Class: PostgresProvider

Configuration Schema:
├── host (str): Database host (default: localhost)
├── port (int): Database port (default: 5432)├── database (str): Database name (required)
├── user (str): Database user (required)
├── password (str): Database password (from env)
└── search_path (str): Schema search path (default: public)
```

Usage Example:

```python
from contextrouter.core.registry import select_provider

provider = select_provider("postgres")
results = await provider.read("machine learning", limit=10)
```

Advanced CLI Features
Configuration Management
Configuration Hierarchy
ContextRouter uses a layered configuration system:
- Default values (built into code)
- Environment variables (highest priority except runtime overrides)
- Settings file (`settings.toml`)
- Runtime overrides (CLI flags, API parameters)
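A minimal sketch of how a single key might be resolved across these layers. The helper, the built-in defaults, and the `CR_` environment prefix (borrowed from the Environment Variables examples below) are illustrative assumptions, not ContextRouter's actual loader:

```python
# Illustration of the precedence described above: defaults, then settings.toml,
# then environment variables, then a runtime override such as a CLI flag.
import os
import tomllib  # Python 3.11+

def resolve_setting(key: str, cli_value=None, settings_path="settings.toml"):
    defaults = {"workers": 4, "log_level": "INFO"}   # assumed built-in defaults
    value = defaults.get(key)

    try:
        with open(settings_path, "rb") as f:         # settings file layer
            value = tomllib.load(f).get(key, value)
    except FileNotFoundError:
        pass

    env_value = os.getenv(f"CR_{key.upper()}")       # environment variable layer
    if env_value is not None:
        value = env_value

    if cli_value is not None:                        # runtime override (CLI flag)
        value = cli_value
    return value
```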
Configuration Commands
```bash
# Show current configuration
contextrouter config show

# Show configuration schema
contextrouter config schema

# Validate configuration
contextrouter config validate

# Generate sample configuration
contextrouter config sample > settings.toml
```

Environment Variables

```bash
# Provider credentials
export VERTEX_PROJECT_ID=my-project
export VERTEX_LOCATION=us-central1
export OPENAI_API_KEY=sk-...
export POSTGRES_PASSWORD=mysecret

# Application settings
export CR_CONFIG_PATH=./settings.toml
export CR_LOG_LEVEL=DEBUG
export CR_WORKERS=4
```

Parallel Processing & Performance
Worker Management
```bash
# Use all CPU cores
contextrouter ingest run --workers 0  # 0 = auto-detect

# Limit parallelism for memory-constrained systems
contextrouter ingest run --workers 2

# Check optimal worker count
contextrouter ingest run --dry-run --workers 8
```

Memory Management
```bash
# Process large files in chunks
contextrouter ingest preprocess --chunk-size 500 --max-chunk-size 1000

# Batch processing for large datasets
contextrouter ingest deploy --batch-size 500 --workers 2

# Monitor memory usage
contextrouter ingest run --verbose  # Shows memory stats
```

Pipeline Orchestration
Selective Execution
```bash
# Run only specific stages
contextrouter ingest preprocess --type book --input ./book.pdf
contextrouter ingest structure --type book
contextrouter ingest index --type book
contextrouter ingest deploy --type book

# Skip completed stages
contextrouter ingest run --type book --skip-preprocess --skip-structure

# Run stages for multiple types
contextrouter ingest index --type book,qa,web
```

Conditional Processing
```bash
# Only process if source changed (requires checksums)
contextrouter ingest run --type book --if-changed

# Force reprocessing
contextrouter ingest run --type book --overwrite

# Continue after failure
contextrouter ingest run --type book --continue-on-error
```

Output & Logging
Enhanced Logging
```bash
# Debug mode
contextrouter --verbose ingest run --type book

# Structured JSON logging
contextrouter --log-format json ingest run --type book

# Log to file
contextrouter --log-file ./ingestion.log ingest run --type book

# Filter log levels
contextrouter --log-level INFO ingest run --type book
```

Progress Monitoring
```bash
# Show progress bars
contextrouter ingest run --type book --progress

# Show detailed timing
contextrouter ingest run --type book --timing

# Show resource usage
contextrouter ingest run --type book --profile
```

Common Options Reference
| Option | Commands | Description |
|---|---|---|
| --type TYPE | ingest * | Content type: book, video, qa, web, knowledge |
| --input PATH | preprocess | Input file or directory path |
| --output PATH | export, report | Output path (default: ./ingestion_output) |
| --overwrite | run, * | Overwrite existing artifacts |
| --skip-* | run | Skip specific pipeline stages |
| --workers INT | run, deploy | Number of parallel workers |
| --batch-size INT | export, deploy | Records per batch |
| --json | query, list | Output as JSON |
| --verbose | all | Enable debug logging |
| --dry-run | run, deploy | Show what would be done |
| --config PATH | all | Path to settings file |
| --env PATH | all | Path to .env file |
Examples
Full Book Ingestion
```bash
# Run complete pipeline
contextrouter ingest run --type book --input ./my-book.pdf

# Or step by step
contextrouter ingest preprocess --type book --input ./my-book.pdf
contextrouter ingest structure --type book
contextrouter ingest index --type book
contextrouter ingest deploy --type book
```

Resume After Failure

```bash
# Skip already-completed stages
contextrouter ingest run --type book --skip-preprocess --skip-structure
```

Multiple Content Types

```bash
# Process books and Q&A together
contextrouter ingest run --type book,qa --input ./content/
```

Blue/Green Deployment

```bash
# Deploy to staging (green)
contextrouter ingest deploy --type book --target green

# Verify staging
contextrouter rag chat --dataset green

# Switch to production (manual config change)
```

Scripting & Automation
Bash Scripting Examples
Automated Ingestion Pipeline
```bash
#!/bin/bash
# auto_ingest.sh - Automated ingestion for multiple content types

set -e  # Exit on any error

CONTENT_DIR="./content"
OUTPUT_DIR="./knowledge-base"
LOG_FILE="./ingestion.log"

# Function to log with timestamp
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

# Validate environment
log "Validating configuration..."
contextrouter rag validate --quiet

# Process different content types
for content_type in book qa web; do
    input_dir="$CONTENT_DIR/$content_type"
    if [ -d "$input_dir" ]; then
        log "Processing $content_type content from $input_dir"

        # Run full pipeline
        contextrouter ingest run \
            --type "$content_type" \
            --input "$input_dir" \
            --output "$OUTPUT_DIR" \
            --overwrite \
            --workers 4 \
            >> "$LOG_FILE" 2>&1

        # Generate report
        contextrouter ingest report \
            --type "$content_type" \
            --output "$OUTPUT_DIR/reports/$content_type.html" \
            >> "$LOG_FILE" 2>&1

        log "Completed $content_type processing"
    fi
done

# Validate deployment
log "Validating deployment..."
contextrouter rag query "test query" --json --quiet > /dev/null

log "Ingestion pipeline completed successfully"
```

Continuous Integration
```yaml
name: Knowledge Base Ingestion

on:
  push:
    paths:
      - 'content/**'
      - 'settings.toml'

jobs:
  ingest:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install ContextRouter
        run: pip install contextrouter[vertex,postgres]

      - name: Validate Configuration
        run: contextrouter rag validate

      - name: Run Ingestion
        run: |
          contextrouter ingest run \
            --type book,qa \
            --input ./content \
            --output ./kb-output \
            --workers 2

      - name: Generate Report
        run: |
          contextrouter ingest report \
            --type book \
            --output ./kb-output/report.html

      - name: Test Search
        run: |
          contextrouter rag query "test query" --json --quiet

      - name: Upload Report
        uses: actions/upload-artifact@v3
        with:
          name: ingestion-report
          path: ./kb-output/report.html
```

Python Scripting
Batch Processing Script
```python
#!/usr/bin/env python3
"""batch_ingest.py - Batch ingestion with progress monitoring"""

import subprocess
import sys
from pathlib import Path
from typing import List


def run_command(cmd: List[str], description: str) -> bool:
    """Run command with error handling."""
    print(f"🔄 {description}")
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(f"✅ {description} completed")
        return True
    except subprocess.CalledProcessError as e:
        print(f"❌ {description} failed: {e}")
        print(f"Error output: {e.stderr}")
        return False


def main():
    content_types = ["book", "qa", "web"]
    base_dir = Path("./content")

    for content_type in content_types:
        content_dir = base_dir / content_type
        if not content_dir.exists():
            print(f"⚠️ Skipping {content_type}: directory {content_dir} not found")
            continue

        # Preprocess
        if not run_command([
            "contextrouter", "ingest", "preprocess",
            "--type", content_type,
            "--input", str(content_dir)
        ], f"Preprocessing {content_type}"):
            sys.exit(1)

        # Structure
        if not run_command([
            "contextrouter", "ingest", "structure",
            "--type", content_type
        ], f"Building structure for {content_type}"):
            sys.exit(1)

        # Index
        if not run_command([
            "contextrouter", "ingest", "index",
            "--type", content_type
        ], f"Indexing {content_type}"):
            sys.exit(1)

        # Deploy
        if not run_command([
            "contextrouter", "ingest", "deploy",
            "--type", content_type
        ], f"Deploying {content_type}"):
            sys.exit(1)

        print(f"🎉 {content_type} ingestion completed successfully")

    # Final validation
    if run_command([
        "contextrouter", "rag", "query", "test query", "--json", "--quiet"
    ], "Validating search functionality"):
        print("🎉 All content types ingested successfully!")
    else:
        print("❌ Search validation failed")
        sys.exit(1)


if __name__ == "__main__":
    main()
```

Monitoring & Health Checks
```python
#!/usr/bin/env python3
"""health_check.py - Monitor ContextRouter health"""

import subprocess
import json
import time
from typing import Dict, Any


def run_query(query: str) -> Dict[str, Any]:
    """Run a test query and return metrics."""
    try:
        result = subprocess.run([
            "contextrouter", "rag", "query", query, "--json", "--quiet"
        ], capture_output=True, text=True, check=True, timeout=30)

        return json.loads(result.stdout)
    except (subprocess.CalledProcessError, json.JSONDecodeError, subprocess.TimeoutExpired) as e:
        return {"error": str(e), "success": False}


def check_configuration() -> bool:
    """Check if configuration is valid."""
    try:
        subprocess.run([
            "contextrouter", "rag", "validate", "--quiet"
        ], check=True, capture_output=True)
        return True
    except subprocess.CalledProcessError:
        return False


def main():
    print("🔍 ContextRouter Health Check")
    print("=" * 40)

    # Configuration check
    config_ok = check_configuration()
    print(f"Configuration: {'✅ Valid' if config_ok else '❌ Invalid'}")

    if not config_ok:
        print("❌ Health check failed: invalid configuration")
        return 1

    # Query performance tests
    test_queries = [
        "What is machine learning?",
        "artificial intelligence applications",
        "deep learning neural networks"
    ]

    total_time = 0
    successful_queries = 0

    for query in test_queries:
        print(f"\nTesting query: '{query}'")
        start_time = time.time()

        result = run_query(query)
        query_time = time.time() - start_time

        if "error" not in result:
            successful_queries += 1
            print(f"✅ Succeeded in {query_time:.2f}s")
        else:
            print(f"❌ Failed: {result.get('error', 'Unknown error')}")

        total_time += query_time

    # Results summary
    success_rate = successful_queries / len(test_queries) * 100
    avg_time = total_time / len(test_queries)

    print("\n📊 Results Summary")
    print(f"Success rate: {success_rate:.1f}%")
    print(f"Average query time: {avg_time:.2f}s")
    print(f"Queries tested: {len(test_queries)}")

    if success_rate >= 90 and avg_time < 5.0:
        print("✅ System is healthy")
        return 0
    else:
        print("⚠️ System may need attention")
        return 1


if __name__ == "__main__":
    exit(main())
```

Troubleshooting
Common Issues & Solutions
Configuration Errors
Issue: Configuration file not found
Solution: Specify the config path explicitly:

```bash
contextrouter --config ./settings.toml rag validate
```

Issue: Provider connection failed

Solution: Check credentials and network:

```bash
export VERTEX_PROJECT_ID=your-project
contextrouter rag validate --check-providers
```

Memory Issues
Issue: Out of memory during ingestion
Solutions:
- Reduce batch size: `--batch-size 500`
- Use fewer workers: `--workers 2`
- Process in chunks: `--chunk-size 500`
- Add swap space or increase RAM

Performance Problems
Issue: Ingestion is too slow
Solutions:
- Increase workers: `--workers 8`
- Use faster storage (SSD vs HDD)
- Disable unnecessary features: `--no-graph` for simple search
- Use GPU-enabled models for transformers

Search Issues
Issue: Queries return no results
Solutions:
- Check deployment: `contextrouter ingest report`
- Validate provider: `contextrouter rag validate --check-providers`
- Test a simple query: `contextrouter rag query "test"`
- Check index population: verify data in Postgres/Vertex

Import Errors
Issue: Module not found
Solutions:
- Install missing packages: `pip install missing-package`
- Check the Python path: `python -c "import contextrouter"`
- Reinstall: `pip uninstall contextrouter && pip install contextrouter`

Debug Mode
Enable detailed debugging for troubleshooting:
```bash
# Maximum verbosity
contextrouter --verbose rag chat

# Log everything
contextrouter --log-level DEBUG --log-file debug.log ingest run --type book

# Show stack traces
export PYTHONUNBUFFERED=1
contextrouter --verbose rag query "test"
```

Getting Help
```bash
# Show all commands
contextrouter --help

# Command-specific help
contextrouter ingest run --help

# Show version and system info
contextrouter --version

# Report issues
# GitHub: https://github.com/ContextRouter/contextrouter/issues
```

Exit Codes
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | - |
| 1 | General error | Check error message |
| 2 | Configuration error | Validate settings.toml |
| 3 | Provider error | Check credentials/network |
| 4 | Validation error | Fix data quality issues |
| 5 | Timeout error | Increase timeouts or reduce load |
| 130 | Interrupted (Ctrl+C) | Safe to retry |
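These codes make it straightforward to script retries. A minimal sketch, assuming the ingestion command from the earlier examples; the policy of retrying only timeouts and interruptions is an assumption, not built-in behavior:

```python
# Retry transient failures (timeout, Ctrl+C) and stop on anything that needs a human.
import subprocess
import sys
import time

RETRYABLE = {5, 130}  # timeout error, interrupted

for attempt in range(3):
    code = subprocess.run(
        ["contextrouter", "ingest", "run", "--type", "book", "--input", "./my-book.pdf"]
    ).returncode
    if code == 0:
        break
    if code in RETRYABLE:
        time.sleep(30)   # back off before retrying
        continue
    sys.exit(code)       # configuration/provider/validation errors: fail fast
```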
Error Handling
The CLI uses Rich for enhanced error output:
- Colorized tracebacks with syntax highlighting
- Local variables shown in stack traces
- Clear error messages with suggested fixes
- Contextual help for common issues
All errors go to stderr, keeping stdout clean for scripting.
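In practice this means a wrapper can capture stdout alone and leave stderr attached to the terminal, so Rich error output stays visible while the machine-readable result stays clean. A minimal sketch using the query command shown earlier:

```python
# Capture only stdout (the JSON result); stderr is left alone so errors remain visible.
import json
import subprocess

proc = subprocess.run(
    ["contextrouter", "rag", "query", "What is machine learning?", "--json"],
    stdout=subprocess.PIPE, text=True,
)
if proc.returncode == 0:
    print(json.loads(proc.stdout)["response"])
```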