AI Providers
Configure and manage AI model providers for report generation and prompt execution. Leverage OpenRouter for unified access to multiple LLMs with intelligent routing.
AI Providers Overview#
PromptReports uses OpenRouter as a unified gateway to access multiple AI model providers. This architecture gives you flexibility to choose the best model for each use case while maintaining a single, consistent API.
Multi-Provider Access
Access OpenAI, Anthropic, Google, Meta, Mistral, and more through a single API.
Intelligent Routing
Automatic model selection based on task requirements, cost, and availability.
Cost Optimization
Set spending limits, track usage, and optimize model selection for cost efficiency.
Fallback Protection
Automatic failover to alternative models when primary providers experience issues.
OpenRouter Integration#
OpenRouter acts as a unified API layer that routes requests to various AI providers. PromptReports handles the integration, so you can focus on building great prompts and reports without managing multiple API keys or provider-specific implementations.
Your Application
│
▼
PromptReports API
│
▼
OpenRouter Gateway
│
├──► OpenAI (GPT-4, GPT-4o, GPT-3.5)
├──► Anthropic (Claude 3.5, Claude 3)
├──► Google (Gemini Pro, Gemini Flash)
├──► Meta (Llama 3.1, Llama 3)
├──► Mistral (Large, Medium, Small)
└──► 100+ more models...

Bring Your Own Key (BYOK)#
You can optionally connect your own OpenRouter API key or direct provider keys for additional control and potentially lower costs at scale.
curl -X PATCH "https://api.promptreports.ai/v1/settings/ai-providers" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"openRouterKey": "sk-or-xxxxxxxxxxxxxxxxxxxx",
"fallbackToDefault": true
}'

Default vs Custom Keys
With fallbackToDefault set to true, requests fall back to PromptReports' default managed keys when your custom key fails, so executions continue uninterrupted.
Model Selection#
Choose models based on your requirements for quality, speed, and cost. PromptReports supports automatic model selection or explicit model specification.
Automatic Model Selection#
Let PromptReports choose the optimal model based on task characteristics:
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "quarterly sales analysis" },
"modelPreference": {
"priority": "quality", // "quality" | "speed" | "cost"
"maxCost": 0.05, // Max cost per request in USD
"capabilities": ["long-context", "structured-output"]
}
}'

Explicit Model Selection#
Specify exactly which model to use for full control:
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "quarterly sales analysis" },
"model": "anthropic/claude-3.5-sonnet",
"parameters": {
"temperature": 0.7,
"maxTokens": 4096,
"topP": 0.9
}
}'

Available Models#
| Model ID | Provider | Best For | Context Window |
|---|---|---|---|
| openai/gpt-4o | OpenAI | General-purpose, multimodal | 128K |
| openai/gpt-4-turbo | OpenAI | Complex reasoning, long documents | 128K |
| anthropic/claude-3.5-sonnet | Anthropic | Analysis, writing, coding | 200K |
| anthropic/claude-3-opus | Anthropic | Highest quality, complex tasks | 200K |
| google/gemini-pro-1.5 | Google | Long context, multimodal | 1M |
| google/gemini-flash-1.5 | Google | Fast responses, cost-effective | 1M |
| meta-llama/llama-3.1-405b | Meta | Open-source, high quality | 128K |
| mistral/mistral-large | Mistral | European hosting, multilingual | 128K |
curl -X GET "https://api.promptreports.ai/v1/models" \
-H "Authorization: Bearer YOUR_API_KEY"
# Response includes model details, pricing, and capabilities
{
"models": [
{
"id": "anthropic/claude-3.5-sonnet",
"name": "Claude 3.5 Sonnet",
"provider": "Anthropic",
"contextWindow": 200000,
"pricing": {
"input": 0.000003, // per token
"output": 0.000015 // per token
},
"capabilities": ["vision", "function-calling", "streaming"]
}
]
}

Provider Configuration#
Configure default settings and preferences for AI providers at the organization or project level.
curl -X PATCH "https://api.promptreports.ai/v1/organization/settings" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"aiProvider": {
"defaultModel": "anthropic/claude-3.5-sonnet",
"fallbackModels": [
"openai/gpt-4o",
"google/gemini-pro-1.5"
],
"maxCostPerRequest": 0.10,
"maxCostPerDay": 50.00,
"allowedModels": [
"anthropic/*",
"openai/gpt-4*",
"google/gemini*"
],
"blockedModels": []
}
}'

Project-Level Settings#
curl -X PATCH "https://api.promptreports.ai/v1/projects/proj_abc123/settings" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"aiProvider": {
"defaultModel": "openai/gpt-4o",
"defaultParameters": {
"temperature": 0.5,
"maxTokens": 2048
},
"streaming": true,
"caching": {
"enabled": true,
"ttlSeconds": 3600
}
}
}'

Response Caching
When caching is enabled, repeated executions with identical inputs are served from cache for up to ttlSeconds, avoiding duplicate model calls and their cost.
Cost Management#
Monitor and control AI spending with budgets, alerts, and usage tracking.
Spending Limits
Set daily, weekly, or monthly spending caps to prevent unexpected costs.
Usage Analytics
Track token usage, costs, and request patterns across models and projects.
Budget Alerts
Receive notifications when approaching or exceeding spending thresholds.
Cost Estimation
Preview estimated costs before executing prompts with large inputs.
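The alert thresholds in the budget configuration below are fractions of a spending limit (0.50 = 50%). As a minimal sketch of how such thresholds could be evaluated, assuming an `Alert` shape that mirrors the request body (the helper functions themselves are illustrative, not part of the API):

```typescript
interface Alert {
  threshold: number;        // fraction of the limit, e.g. 0.8 = 80%
  email: boolean;
  webhook: boolean;
  pauseRequests?: boolean;
}

// Return the alerts whose thresholds the current spend has crossed.
function triggeredAlerts(spend: number, limit: number, alerts: Alert[]): Alert[] {
  return alerts.filter((a) => spend / limit >= a.threshold);
}

// Whether any crossed alert is configured to pause further requests.
function shouldPause(spend: number, limit: number, alerts: Alert[]): boolean {
  return triggeredAlerts(spend, limit, alerts).some((a) => a.pauseRequests === true);
}
```

With a $100 daily limit and $85 spent, the 50% and 80% alerts fire but requests continue; only once spend reaches the 100% threshold does a `pauseRequests: true` alert stop further execution.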
curl -X PATCH "https://api.promptreports.ai/v1/organization/budget" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"limits": {
"daily": 100.00,
"weekly": 500.00,
"monthly": 1500.00
},
"alerts": [
{ "threshold": 0.50, "email": true, "webhook": true },
{ "threshold": 0.80, "email": true, "webhook": true },
{ "threshold": 0.95, "email": true, "webhook": true, "pauseRequests": false },
{ "threshold": 1.00, "email": true, "webhook": true, "pauseRequests": true }
],
"alertEmail": "billing@yourcompany.com"
}'

curl -X GET "https://api.promptreports.ai/v1/usage?period=month&breakdown=model" \
-H "Authorization: Bearer YOUR_API_KEY"
# Response
{
"period": { "start": "2024-01-01", "end": "2024-01-31" },
"totalCost": 245.67,
"totalRequests": 15420,
"totalTokens": { "input": 12500000, "output": 3200000 },
"breakdown": [
{
"model": "anthropic/claude-3.5-sonnet",
"requests": 8500,
"cost": 142.30,
"tokens": { "input": 7800000, "output": 2100000 }
},
{
"model": "openai/gpt-4o",
"requests": 4200,
"cost": 78.45,
"tokens": { "input": 3500000, "output": 850000 }
}
]
}

Cost Optimization Tips#
- Use cheaper models (GPT-3.5, Claude Haiku, Gemini Flash) for simple tasks like classification or extraction
- Enable response caching for prompts with deterministic outputs
- Set appropriate maxTokens limits to prevent unnecessarily long responses
- Use streaming for better user experience without affecting costs
- Batch similar requests when possible to reduce API overhead
- Monitor usage patterns to identify expensive but underperforming prompts
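Several of these tips come down to arithmetic on the per-token prices returned by the models endpoint. A sketch of estimating a request's worst-case cost before sending it, using the pricing fields from the model listing (token counts here are illustrative):

```typescript
interface Pricing {
  input: number;   // USD per input token
  output: number;  // USD per output token
}

// Worst-case cost: every allowed output token is generated.
function estimateCost(inputTokens: number, maxOutputTokens: number, p: Pricing): number {
  return inputTokens * p.input + maxOutputTokens * p.output;
}

// Claude 3.5 Sonnet pricing from the models listing above.
const sonnet: Pricing = { input: 0.000003, output: 0.000015 };
```

For a 1,500-token prompt capped at 4,096 output tokens, the worst case is roughly $0.066, so a modelPreference maxCost of 0.05 would require a lower maxTokens cap.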
Fallback Strategies#
Configure automatic fallback to alternative models when primary providers experience outages, rate limits, or increased latency.
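Conceptually, the fallback chain is walked in priority order, skipping models that have already failed and entries whose conditions (such as maxCost) are not met. A local sketch of that selection logic, with a `FallbackEntry` shape mirroring the configuration below (the helper is illustrative, the actual routing happens server-side):

```typescript
interface FallbackEntry {
  model: string;
  priority: number;
  conditions?: { maxCost?: number };
}

// Pick the next model to try: lowest priority number first, skipping
// failed models and entries whose cost condition is not satisfied.
function nextModel(
  chain: FallbackEntry[],
  failed: Set<string>,
  estimatedCost: number
): string | null {
  const candidates = chain
    .filter((e) => !failed.has(e.model))
    .filter((e) => e.conditions?.maxCost === undefined || estimatedCost <= e.conditions.maxCost)
    .sort((a, b) => a.priority - b.priority);
  return candidates[0]?.model ?? null;
}
```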
curl -X PATCH "https://api.promptreports.ai/v1/settings/fallback" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"enabled": true,
"triggers": {
"onRateLimit": true,
"onTimeout": true,
"onServerError": true,
"timeoutMs": 30000
},
"fallbackChain": [
{
"model": "anthropic/claude-3.5-sonnet",
"priority": 1
},
{
"model": "openai/gpt-4o",
"priority": 2,
"conditions": { "maxCost": 0.10 }
},
{
"model": "google/gemini-pro-1.5",
"priority": 3
}
],
"notifyOnFallback": true
}'

Fallback Notifications
With notifyOnFallback enabled, each fallback is surfaced in the execution response: the metadata records the model actually used, the originally requested model, and the reason for the switch.
{
"result": "Your generated content...",
"metadata": {
"model": "openai/gpt-4o",
"originalModel": "anthropic/claude-3.5-sonnet",
"fallbackReason": "rate_limit_exceeded",
"latency": 1245,
"tokens": { "input": 1500, "output": 800 },
"cost": 0.023
}
}

Performance Optimization#
Optimize response times and throughput for production workloads.
Streaming Responses#
Enable streaming to receive partial responses as they are generated, improving perceived latency for end users.
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"variables": { "topic": "AI trends" },
"model": "anthropic/claude-3.5-sonnet",
"stream": true
}'
# Response (Server-Sent Events)
data: {"type":"content","delta":"Artificial"}
data: {"type":"content","delta":" intelligence"}
data: {"type":"content","delta":" is transforming"}
...
data: {"type":"done","usage":{"input":150,"output":450},"cost":0.0067}

async function streamPromptExecution(promptId: string, variables: object) {
  const response = await fetch(
    `https://api.promptreports.ai/v1/prompts/${promptId}/execute`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.API_KEY}`,
        'Content-Type': 'application/json',
        'Accept': 'text/event-stream',
      },
      body: JSON.stringify({ variables, stream: true }),
    }
  );

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let content = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Buffer chunks: a single SSE line can be split across reads,
    // so only parse lines once a newline terminates them.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any trailing partial line

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = JSON.parse(line.slice(6));
      if (data.type === 'content') {
        content += data.delta;
        console.log(data.delta); // stream to UI
      } else if (data.type === 'done') {
        console.log('\nComplete! Cost:', data.cost);
      }
    }
  }

  return content;
}

Request Batching#
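Batching submits several prompt executions in one request, reducing per-request overhead. A sketch of assembling the payload for the batch endpoint, where the request shape mirrors the curl example that follows (the helper functions are hypothetical):

```typescript
interface BatchItem {
  promptId: string;
  variables: Record<string, string>;
}

// Build the body for POST /v1/prompts/batch.
function buildBatchPayload(requests: BatchItem[], model: string, parallel = true) {
  return { requests, model, parallel };
}

// Split a large workload into batches of a given size, so each
// batch request stays within reasonable payload limits.
function chunkRequests<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const payload = buildBatchPayload(
  [
    { promptId: 'prm_abc123', variables: { topic: 'AI' } },
    { promptId: 'prm_abc123', variables: { topic: 'Cloud Computing' } },
  ],
  'openai/gpt-4o'
);
```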
curl -X POST "https://api.promptreports.ai/v1/prompts/batch" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"requests": [
{
"promptId": "prm_abc123",
"variables": { "topic": "AI" }
},
{
"promptId": "prm_abc123",
"variables": { "topic": "Cloud Computing" }
},
{
"promptId": "prm_def456",
"variables": { "industry": "Healthcare" }
}
],
"model": "openai/gpt-4o",
"parallel": true
}'

Model Comparison#
Use the model comparison feature to evaluate different models for your use case before committing to production.
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/compare" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "Renewable Energy Market Analysis" },
"models": [
"anthropic/claude-3.5-sonnet",
"openai/gpt-4o",
"google/gemini-pro-1.5"
],
"parameters": {
"temperature": 0.7,
"maxTokens": 2048
}
}'

# Response
{
"results": [
{
"model": "anthropic/claude-3.5-sonnet",
"output": "The renewable energy market is experiencing...",
"metrics": {
"latency": 2340,
"tokens": { "input": 450, "output": 1820 },
"cost": 0.0312
}
},
{
"model": "openai/gpt-4o",
"output": "Global renewable energy trends show...",
"metrics": {
"latency": 1890,
"tokens": { "input": 450, "output": 1650 },
"cost": 0.0285
}
},
{
"model": "google/gemini-pro-1.5",
"output": "The renewable energy sector continues...",
"metrics": {
"latency": 1520,
"tokens": { "input": 450, "output": 1580 },
"cost": 0.0198
}
}
],
"summary": {
"fastestModel": "google/gemini-pro-1.5",
"cheapestModel": "google/gemini-pro-1.5",
"longestOutput": "anthropic/claude-3.5-sonnet"
}
}
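The summary fields are derived from the per-model metrics, and you can compute your own rankings locally, for example a cost-per-output-token view. A sketch using field names that follow the response above (which metric matters most is, of course, your call):

```typescript
interface ComparisonResult {
  model: string;
  metrics: {
    latency: number;                              // milliseconds
    tokens: { input: number; output: number };
    cost: number;                                 // USD
  };
}

// Cheapest model per generated token: a rough value-for-money signal.
function cheapestPerOutputToken(results: ComparisonResult[]): string {
  return results.reduce((best, r) =>
    r.metrics.cost / r.metrics.tokens.output <
    best.metrics.cost / best.metrics.tokens.output ? r : best
  ).model;
}

// Lowest end-to-end latency.
function fastest(results: ComparisonResult[]): string {
  return results.reduce((best, r) =>
    r.metrics.latency < best.metrics.latency ? r : best
  ).model;
}
```

On the sample response above, both helpers select google/gemini-pro-1.5, matching the summary block.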