LLM Providers

🤖 Configure and optimize different LLM providers with ReasonKit.

Universal Compatibility: ReasonKit integrates seamlessly with Claude, Gemini, OpenAI, Cursor, VS Code, and any LLM provider. The same structured reasoning protocols work across all platforms, giving you flexibility without vendor lock-in.

ReasonKit supports multiple LLM providers, each with different strengths, pricing, and capabilities.

Supported Providers

| Provider | Models | Best For | Pricing |
|-----------|--------|----------|---------|
| Anthropic | Claude 4, Sonnet, Haiku | Best quality, safety | $$$ |
| OpenAI | GPT-4, GPT-4 Turbo | Broad compatibility | $$$ |
| OpenRouter | 300+ models | Variety, cost optimization | $ - $$$ |
| Ollama | Llama, Mistral, etc. | Privacy | Free |
| Google | Gemini Pro, Flash | Long context | $$ |

Provider Configuration

Anthropic

Claude models provide the best reasoning quality for ThinkTools.

# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Use explicitly
rk think "question" --provider anthropic --model claude-sonnet-4-20250514

Config file:

[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"  # Use env var
model = "claude-sonnet-4-20250514"
max_tokens = 4096

Available models:

| Model | Context | Speed | Quality |
|-------|---------|-------|---------|
| claude-opus-4-20250514 | 200K | Slow | Best |
| claude-sonnet-4-20250514 | 200K | Fast | Excellent |
| claude-3-5-haiku-20241022 | 200K | Fastest | Good |
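
For example, you might reserve Opus for thorough analyses and use Haiku for quick iterations; only the --model value changes (same flags as shown above):

# Fast, inexpensive pass
rk think "question" --provider anthropic --model claude-3-5-haiku-20241022

# Thorough pass with the strongest model
rk think "question" --provider anthropic --model claude-opus-4-20250514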

OpenAI

export OPENAI_API_KEY="sk-..."

rk think "question" --provider openai --model gpt-4-turbo

Config file:

[providers.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4-turbo"
organization_id = "org-..."  # Optional
base_url = "https://api.openai.com/v1"  # For proxies

Available models:

| Model | Context | Speed | Quality |
|-------|---------|-------|---------|
| gpt-4-turbo | 128K | Fast | Excellent |
| gpt-4 | 8K | Medium | Excellent |
| gpt-3.5-turbo | 16K | Fastest | Good |

OpenRouter

Access 300+ models through a single API. Great for cost optimization and experimentation.

export OPENROUTER_API_KEY="sk-or-..."

rk think "question" --provider openrouter --model anthropic/claude-sonnet-4

Config file:

[providers.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
site_url = "https://yourapp.com"  # For rankings
site_name = "Your App"

Popular models:

| Model | Provider | Quality | Price |
|-------|----------|---------|-------|
| anthropic/claude-sonnet-4 | Anthropic | Excellent | $$ |
| openai/gpt-4-turbo | OpenAI | Excellent | $$ |
| google/gemini-pro | Google | Good | $ |
| mistralai/mistral-large | Mistral | Good | $ |
| meta-llama/llama-3-70b | Meta | Good | $ |
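
To narrow down candidates, one approach is to run the same prompt against several OpenRouter models and keep each result for comparison. This is a minimal sketch using only the flags shown above; the prompt and output file names are illustrative:

# Run one prompt against several OpenRouter models, saving JSON output per model
for m in "anthropic/claude-sonnet-4" "mistralai/mistral-large" "meta-llama/llama-3-70b"; do
    rk think "question" --provider openrouter --model "$m" --output json > "${m//\//_}.json"
done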

Ollama (Local)

Run models locally for privacy and zero API costs.

# Start Ollama
ollama serve

# Pull a model
ollama pull llama3.2

# Use with ReasonKit
rk think "question" --provider ollama --model llama3.2

Config file:

[providers.ollama]
host = "http://localhost:11434"
model = "llama3.2"

Recommended models:

| Model | Size | Quality | RAM Required |
|-------|------|---------|--------------|
| llama3.2 | 3B | Good | 8GB |
| llama3.1:70b | 70B | Excellent | 48GB |
| mistral | 7B | Good | 8GB |
| mixtral | 8x7B | Excellent | 32GB |
| deepseek-coder | 33B | Good (code) | 24GB |
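
If a run fails with a connection error, the Ollama server is usually not running. A quick check is to query Ollama's tags endpoint, which lists the models you have pulled (default port assumed):

# Lists locally available models; fails immediately if the server is down
curl -s http://localhost:11434/api/tags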

Google Gemini

export GOOGLE_API_KEY="..."

rk think "question" --provider google --model gemini-pro

Config file:

[providers.google]
api_key = "${GOOGLE_API_KEY}"
model = "gemini-pro"

Provider Selection

Automatic Selection

By default, ReasonKit auto-selects a provider based on which API keys are available:

# Priority order:
# 1. ANTHROPIC_API_KEY
# 2. OPENAI_API_KEY
# 3. OPENROUTER_API_KEY
# 4. GOOGLE_API_KEY
# 5. Ollama (if running)

rk think "question"  # Uses first available

Per-Profile Provider

Configure different providers for different profiles:

[profiles.quick]
provider = "ollama"
model = "llama3.2"

[profiles.balanced]
provider = "anthropic"
model = "claude-sonnet-4-20250514"

[profiles.deep]
provider = "anthropic"
model = "claude-opus-4-20250514"

Cost Optimization

# Use cheaper models for simple tasks
[profiles.quick]
provider = "openrouter"
model = "mistralai/mistral-7b-instruct"  # Very cheap

[profiles.balanced]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"  # Good balance

[profiles.paranoid]
provider = "anthropic"
model = "claude-opus-4-20250514"  # Best quality

Advanced Configuration

Timeouts

[providers.anthropic]
timeout_secs = 120
connect_timeout_secs = 10

Retries

[providers.anthropic]
max_retries = 3
retry_delay_ms = 1000
retry_multiplier = 2.0  # Exponential backoff
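
With these values, and assuming the delay is multiplied by retry_multiplier after each failed attempt, ReasonKit would wait roughly 1 s, 2 s, and 4 s between retries before surfacing the error.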

Rate Limiting

[providers.anthropic]
requests_per_minute = 50
tokens_per_minute = 100000

Custom Endpoints

For proxies or enterprise deployments:

[providers.openai]
base_url = "https://your-proxy.com/v1"
api_key = "${PROXY_API_KEY}"

Temperature and Sampling

[providers.anthropic]
temperature = 0.7        # 0.0-1.0, lower = more deterministic
top_p = 0.9             # Nucleus sampling
top_k = 40              # Top-k sampling
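
For side-by-side provider comparisons (see Quality Comparison below), lowering temperature toward 0.0 keeps outputs as close to deterministic as the provider allows, which makes diffs between runs more meaningful.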

Provider-Specific Features

Anthropic Extended Thinking

Enable extended thinking for complex analysis:

[providers.anthropic]
extended_thinking = true
thinking_budget = 16000  # Max thinking tokens

OpenAI Function Calling

[providers.openai]
function_calling = true

OpenRouter Fallbacks

[providers.openrouter]
model = "anthropic/claude-sonnet-4"
fallback_models = [
    "openai/gpt-4-turbo",
    "google/gemini-pro",
]

Monitoring and Debugging

Token Usage

# Show token usage after each analysis
rk think "question" --verbose

# Output includes:
# Tokens: 1,234 prompt + 567 completion = 1,801 total
# Cost: ~$0.0054

Request Logging

# Log all API requests (for debugging)
export RK_DEBUG_API=true
rk think "question"

Provider Health Check

# Check if provider is working
rk provider test anthropic
rk provider test openai
rk provider test ollama
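
To check every configured provider in one pass, loop over the same subcommand using the provider names from this page:

# Probe each provider and report which ones respond
for p in anthropic openai openrouter google ollama; do
    rk provider test "$p"
done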

Switching Providers

Migration Checklist

When switching providers:

  1. Test compatibility — Run same prompts, compare quality
  2. Adjust timeouts — Different providers have different latencies
  3. Check token limits — Models have different context windows
  4. Update rate limits — Different quotas per provider
  5. Review costs — Pricing varies significantly

Quality Comparison

# Run same analysis with different providers
rk think "question" --provider anthropic --output json > anthropic.json
rk think "question" --provider openai --output json > openai.json
rk think "question" --provider ollama --output json > ollama.json

# Compare results
diff anthropic.json openai.json
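
Raw diffs can be noisy when key order differs between runs. If jq is installed, normalizing the JSON first (sorted keys, consistent formatting) makes the comparison easier to read:

# Normalize key order before diffing
diff <(jq -S . anthropic.json) <(jq -S . openai.json)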

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| “API key invalid” | Wrong/expired key | Regenerate API key |
| “Rate limited” | Too many requests | Add retry logic, reduce frequency |
| “Model not found” | Wrong model ID | Check provider’s model list |
| “Context too long” | Input exceeds limit | Use model with larger context |
| “Connection refused” | Ollama not running | Run ollama serve |

Error Codes

| Code | Meaning | Action |
|------|---------|--------|
| 401 | Unauthorized | Check API key |
| 429 | Rate limited | Wait and retry |
| 500 | Server error | Retry or switch provider |
| 503 | Service unavailable | Try fallback provider |