LLM Providers

🤖 Configure and optimize different LLM providers with ReasonKit.

Universal Compatibility: ReasonKit integrates seamlessly with Claude, Gemini, OpenAI, Cursor, VS Code, and any LLM provider. The same structured reasoning protocols work across all platforms, giving you flexibility without vendor lock-in.

ReasonKit supports multiple LLM providers, each with different strengths, pricing, and capabilities.

Supported Providers

| Provider | Models | Best For | Pricing |
|-----------|--------|----------|---------|
| Anthropic | Claude 4, Sonnet, Haiku | Best quality, safety | $$$ |
| OpenAI | GPT-4, GPT-4 Turbo | Broad compatibility | $$$ |
| OpenRouter | 300+ models | Variety, cost optimization | $ - $$$ |
| Ollama | Llama, Mistral, etc. | Privacy | Free |
| Google | Gemini Pro, Flash | Long context | $$ |

Provider Configuration

Anthropic

Claude models provide the best reasoning quality for ThinkTools.

# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Use explicitly
rk think "question" --provider anthropic --model claude-sonnet-4-20250514

Config file:

[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"  # Use env var
model = "claude-sonnet-4-20250514"
max_tokens = 4096

Available models:

| Model | Context | Speed | Quality |
|-------|---------|-------|---------|
| claude-opus-4-20250514 | 200K | Slow | Best |
| claude-sonnet-4-20250514 | 200K | Fast | Excellent |
| claude-3-5-haiku-20241022 | 200K | Fastest | Good |
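
For example, you might reserve Opus for thorough analyses and use Haiku for quick iterations; only the --model value changes (same flags as shown above):

# Fast, inexpensive pass
rk think "question" --provider anthropic --model claude-3-5-haiku-20241022

# Thorough pass with the strongest model
rk think "question" --provider anthropic --model claude-opus-4-20250514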

OpenAI

export OPENAI_API_KEY="sk-..."

rk think "question" --provider openai --model gpt-4-turbo

Config file:

[providers.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4-turbo"
organization_id = "org-..."  # Optional
base_url = "https://api.openai.com/v1"  # For proxies

Available models:

| Model | Context | Speed | Quality |
|-------|---------|-------|---------|
| gpt-4-turbo | 128K | Fast | Excellent |
| gpt-4 | 8K | Medium | Excellent |
| gpt-3.5-turbo | 16K | Fastest | Good |

OpenRouter

Access 300+ models through a single API. Great for cost optimization and experimentation.

export OPENROUTER_API_KEY="sk-or-..."

rk think "question" --provider openrouter --model anthropic/claude-sonnet-4

Config file:

[providers.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
site_url = "https://yourapp.com"  # For rankings
site_name = "Your App"

Popular models:

| Model | Provider | Quality | Price |
|-------|----------|---------|-------|
| anthropic/claude-sonnet-4 | Anthropic | Excellent | $$ |
| openai/gpt-4-turbo | OpenAI | Excellent | $$ |
| google/gemini-pro | Google | Good | $ |
| mistralai/mistral-large | Mistral | Good | $ |
| meta-llama/llama-3-70b | Meta | Good | $ |
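
To narrow down candidates, one approach is to run the same prompt against several OpenRouter models and keep each result for comparison. This is a minimal sketch using only the flags shown above; the prompt and output file names are illustrative:

# Run one prompt against several OpenRouter models, saving JSON output per model
for m in "anthropic/claude-sonnet-4" "mistralai/mistral-large" "meta-llama/llama-3-70b"; do
    rk think "question" --provider openrouter --model "$m" --output json > "${m//\//_}.json"
done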

Ollama (Local)

Run models locally for privacy and zero API costs.

# Start Ollama
ollama serve

# Pull a model
ollama pull llama3.2

# Use with ReasonKit
rk think "question" --provider ollama --model llama3.2

Config file:

[providers.ollama]
host = "http://localhost:11434"
model = "llama3.2"

Recommended models:

| Model | Size | Quality | RAM Required |
|-------|------|---------|--------------|
| llama3.2 | 3B | Good | 8GB |
| llama3.1:70b | 70B | Excellent | 48GB |
| mistral | 7B | Good | 8GB |
| mixtral | 8x7B | Excellent | 32GB |
| deepseek-coder | 33B | Good (code) | 24GB |
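
If a run fails with a connection error, the Ollama server is usually not running. A quick check is to query Ollama's tags endpoint, which lists the models you have pulled (default port assumed):

# Lists locally available models; fails immediately if the server is down
curl -s http://localhost:11434/api/tags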

Google Gemini

export GOOGLE_API_KEY="..."

rk think "question" --provider google --model gemini-pro

Config file:

[providers.google]
api_key = "${GOOGLE_API_KEY}"
model = "gemini-pro"

Provider Selection

Automatic Selection

By default, ReasonKit auto-selects a provider based on which API keys are available:

# Priority order:
# 1. ANTHROPIC_API_KEY
# 2. OPENAI_API_KEY
# 3. OPENROUTER_API_KEY
# 4. GOOGLE_API_KEY
# 5. Ollama (if running)

rk think "question"  # Uses first available

Per-Profile Provider

Configure different providers for different profiles:

[profiles.quick]
provider = "ollama"
model = "llama3.2"

[profiles.balanced]
provider = "anthropic"
model = "claude-sonnet-4-20250514"

[profiles.deep]
provider = "anthropic"
model = "claude-opus-4-20250514"

Cost Optimization

# Use cheaper models for simple tasks
[profiles.quick]
provider = "openrouter"
model = "mistralai/mistral-7b-instruct"  # Very cheap

[profiles.balanced]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"  # Good balance

[profiles.paranoid]
provider = "anthropic"
model = "claude-opus-4-20250514"  # Best quality

Advanced Configuration

Timeouts

[providers.anthropic]
timeout_secs = 120
connect_timeout_secs = 10

Retries

[providers.anthropic]
max_retries = 3
retry_delay_ms = 1000
retry_multiplier = 2.0  # Exponential backoff
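
With these values, and assuming the delay is multiplied by retry_multiplier after each failed attempt, ReasonKit would wait roughly 1 s, 2 s, and 4 s between retries before surfacing the error.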

Rate Limiting

[providers.anthropic]
requests_per_minute = 50
tokens_per_minute = 100000

Custom Endpoints

For proxies or enterprise deployments:

[providers.openai]
base_url = "https://your-proxy.com/v1"
api_key = "${PROXY_API_KEY}"

Temperature and Sampling

[providers.anthropic]
temperature = 0.7        # 0.0-1.0, lower = more deterministic
top_p = 0.9             # Nucleus sampling
top_k = 40              # Top-k sampling
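
For side-by-side provider comparisons (see Quality Comparison below), lowering temperature toward 0.0 keeps outputs as close to deterministic as the provider allows, which makes diffs between runs more meaningful.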

Provider-Specific Features

Anthropic Extended Thinking

Enable extended thinking for complex analysis:

[providers.anthropic]
extended_thinking = true
thinking_budget = 16000  # Max thinking tokens

OpenAI Function Calling

[providers.openai]
function_calling = true

OpenRouter Fallbacks

[providers.openrouter]
model = "anthropic/claude-sonnet-4"
fallback_models = [
    "openai/gpt-4-turbo",
    "google/gemini-pro",
]

Monitoring and Debugging

Token Usage

# Show token usage after each analysis
rk think "question" --verbose

# Output includes:
# Tokens: 1,234 prompt + 567 completion = 1,801 total
# Cost: ~$0.0054

Request Logging

# Log all API requests (for debugging)
export RK_DEBUG_API=true
rk think "question"

Provider Health Check

# Check if provider is working
rk provider test anthropic
rk provider test openai
rk provider test ollama
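
To check every configured provider in one pass, loop over the same subcommand using the provider names from this page:

# Probe each provider and report which ones respond
for p in anthropic openai openrouter google ollama; do
    rk provider test "$p"
done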

Switching Providers

Migration Checklist

When switching providers:

  1. Test compatibility — Run same prompts, compare quality
  2. Adjust timeouts — Different providers have different latencies
  3. Check token limits — Models have different context windows
  4. Update rate limits — Different quotas per provider
  5. Review costs — Pricing varies significantly

Quality Comparison

# Run same analysis with different providers
rk think "question" --provider anthropic --output json > anthropic.json
rk think "question" --provider openai --output json > openai.json
rk think "question" --provider ollama --output json > ollama.json

# Compare results
diff anthropic.json openai.json
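
Raw diffs can be noisy when key order differs between runs. If jq is installed, normalizing the JSON first (sorted keys, consistent formatting) makes the comparison easier to read:

# Normalize key order before diffing
diff <(jq -S . anthropic.json) <(jq -S . openai.json)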

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| “API key invalid” | Wrong/expired key | Regenerate API key |
| “Rate limited” | Too many requests | Add retry logic, reduce frequency |
| “Model not found” | Wrong model ID | Check provider’s model list |
| “Context too long” | Input exceeds limit | Use model with larger context |
| “Connection refused” | Ollama not running | Run ollama serve |

Error Codes

| Code | Meaning | Action |
|------|---------|--------|
| 401 | Unauthorized | Check API key |
| 429 | Rate limited | Wait and retry |
| 500 | Server error | Retry or switch provider |
| 503 | Service unavailable | Try fallback provider |