AI Providers

Configure and manage AI model providers for report generation and prompt execution. Leverage OpenRouter for unified access to multiple LLMs with intelligent routing.

AI Providers Overview#

PromptReports uses OpenRouter as a unified gateway to access multiple AI model providers. This architecture gives you flexibility to choose the best model for each use case while maintaining a single, consistent API.

Multi-Provider Access

Access OpenAI, Anthropic, Google, Meta, Mistral, and more through a single API.

Intelligent Routing

Automatic model selection based on task requirements, cost, and availability.

Cost Optimization

Set spending limits, track usage, and optimize model selection for cost efficiency.

Fallback Protection

Automatic failover to alternative models when primary providers experience issues.

OpenRouter Integration#

OpenRouter acts as a unified API layer that routes requests to various AI providers. PromptReports handles the integration, so you can focus on building great prompts and reports without managing multiple API keys or provider-specific implementations.

Architecture Overview
text
Your Application
      │
      ▼
PromptReports API
      │
      ▼
OpenRouter Gateway
      │
      ├──► OpenAI (GPT-4, GPT-4o, GPT-3.5)
      ├──► Anthropic (Claude 3.5, Claude 3)
      ├──► Google (Gemini Pro, Gemini Flash)
      ├──► Meta (Llama 3.1, Llama 3)
      ├──► Mistral (Large, Medium, Small)
      └──► 100+ more models...

Bring Your Own Key (BYOK)#

You can optionally connect your own OpenRouter API key or direct provider keys for additional control and potentially lower costs at scale.

Configure Custom API Key
bash
curl -X PATCH "https://api.promptreports.ai/v1/settings/ai-providers" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "openRouterKey": "sk-or-xxxxxxxxxxxxxxxxxxxx",
    "fallbackToDefault": true
  }'

Model Selection#

Choose models based on your requirements for quality, speed, and cost. PromptReports supports automatic model selection or explicit model specification.

Automatic Model Selection#

Let PromptReports choose the optimal model based on task characteristics. Set priority to "quality", "speed", or "cost", cap the per-request spend in USD with maxCost, and list any required capabilities:

Automatic Model Selection
bash
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "variables": { "topic": "quarterly sales analysis" },
    "modelPreference": {
      "priority": "quality",    // "quality" | "speed" | "cost"
      "maxCost": 0.05,         // Max cost per request in USD
      "capabilities": ["long-context", "structured-output"]
    }
  }'

Explicit Model Selection#

Specify exactly which model to use for full control:

Explicit Model Selection
bash
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "variables": { "topic": "quarterly sales analysis" },
    "model": "anthropic/claude-3.5-sonnet",
    "parameters": {
      "temperature": 0.7,
      "maxTokens": 4096,
      "topP": 0.9
    }
  }'

Available Models#

| Model ID                    | Provider  | Best For                          | Context Window |
|-----------------------------|-----------|-----------------------------------|----------------|
| openai/gpt-4o               | OpenAI    | General-purpose, multimodal       | 128K           |
| openai/gpt-4-turbo          | OpenAI    | Complex reasoning, long documents | 128K           |
| anthropic/claude-3.5-sonnet | Anthropic | Analysis, writing, coding         | 200K           |
| anthropic/claude-3-opus     | Anthropic | Highest quality, complex tasks    | 200K           |
| google/gemini-pro-1.5       | Google    | Long context, multimodal          | 1M             |
| google/gemini-flash-1.5     | Google    | Fast responses, cost-effective    | 1M             |
| meta-llama/llama-3.1-405b   | Meta      | Open-source, high quality         | 128K           |
| mistral/mistral-large       | Mistral   | European hosting, multilingual    | 128K           |
List Available Models
bash
curl -X GET "https://api.promptreports.ai/v1/models" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response includes model details, per-token pricing (USD), and capabilities
{
  "models": [
    {
      "id": "anthropic/claude-3.5-sonnet",
      "name": "Claude 3.5 Sonnet",
      "provider": "Anthropic",
      "contextWindow": 200000,
      "pricing": {
        "input": 0.000003,   // per token
        "output": 0.000015   // per token
      },
      "capabilities": ["vision", "function-calling", "streaming"]
    }
  ]
}
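
If you select models programmatically, you can filter this response for the cheapest model that supports the capabilities you need. Below is a minimal TypeScript sketch using the response shape shown above (error handling elided):

Select the Cheapest Capable Model
typescript
interface ModelInfo {
  id: string;
  name: string;
  provider: string;
  contextWindow: number;
  pricing: { input: number; output: number };
  capabilities: string[];
}

// Fetch the model catalog and return the cheapest model (by input price)
// that advertises every required capability.
async function cheapestModelWith(required: string[]): Promise<ModelInfo | undefined> {
  const res = await fetch('https://api.promptreports.ai/v1/models', {
    headers: { Authorization: `Bearer ${process.env.API_KEY}` },
  });
  const { models }: { models: ModelInfo[] } = await res.json();

  return models
    .filter((m) => required.every((c) => m.capabilities.includes(c)))
    .sort((a, b) => a.pricing.input - b.pricing.input)[0];
}

// Usage: pick the cheapest streaming-capable vision model.
// const model = await cheapestModelWith(['vision', 'streaming']);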

Provider Configuration#

Configure default settings and preferences for AI providers at the organization or project level.

Organization-Level Configuration
bash
curl -X PATCH "https://api.promptreports.ai/v1/organization/settings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "aiProvider": {
      "defaultModel": "anthropic/claude-3.5-sonnet",
      "fallbackModels": [
        "openai/gpt-4o",
        "google/gemini-pro-1.5"
      ],
      "maxCostPerRequest": 0.10,
      "maxCostPerDay": 50.00,
      "allowedModels": [
        "anthropic/*",
        "openai/gpt-4*",
        "google/gemini*"
      ],
      "blockedModels": []
    }
  }'
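
The allowedModels list above uses trailing-wildcard patterns (anthropic/*, openai/gpt-4*). The sketch below illustrates the intended matching semantics; treat the exact rules as an assumption, since the server-side behavior is not documented here:

Wildcard Pattern Matching (illustrative)
typescript
// Match a model ID against a pattern. A trailing "*" matches any suffix;
// anything else must match exactly. This mirrors the patterns shown above
// ("anthropic/*", "openai/gpt-4*"); the server's exact rules may differ.
function matchesPattern(modelId: string, pattern: string): boolean {
  return pattern.endsWith('*')
    ? modelId.startsWith(pattern.slice(0, -1))
    : modelId === pattern;
}

function isModelAllowed(modelId: string, allowed: string[], blocked: string[]): boolean {
  if (blocked.some((p) => matchesPattern(modelId, p))) return false;
  // Assumption: an empty allow list permits everything.
  return allowed.length === 0 || allowed.some((p) => matchesPattern(modelId, p));
}

// isModelAllowed('anthropic/claude-3.5-sonnet', ['anthropic/*'], []) === true
// isModelAllowed('openai/gpt-3.5-turbo', ['openai/gpt-4*'], [])      === false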

Project-Level Settings#

Project Configuration
bash
curl -X PATCH "https://api.promptreports.ai/v1/projects/proj_abc123/settings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "aiProvider": {
      "defaultModel": "openai/gpt-4o",
      "defaultParameters": {
        "temperature": 0.5,
        "maxTokens": 2048
      },
      "streaming": true,
      "caching": {
        "enabled": true,
        "ttlSeconds": 3600
      }
    }
  }'
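
Project-level settings generally take precedence over organization defaults for requests in that project; the exact precedence rule is an assumption here, not documented API behavior. Conceptually, the effective configuration is a shallow merge:

Effective Settings Merge (illustrative)
typescript
interface ProviderSettings {
  defaultModel?: string;
  defaultParameters?: { temperature?: number; maxTokens?: number };
  streaming?: boolean;
}

// Shallow-merge provider settings, letting project values win over
// organization defaults. Assumes project-level settings take precedence.
function effectiveSettings(org: ProviderSettings, project: ProviderSettings): ProviderSettings {
  return {
    ...org,
    ...project,
    defaultParameters: { ...org.defaultParameters, ...project.defaultParameters },
  };
}

// effectiveSettings(
//   { defaultModel: 'anthropic/claude-3.5-sonnet', defaultParameters: { temperature: 0.7 } },
//   { defaultModel: 'openai/gpt-4o', defaultParameters: { maxTokens: 2048 } },
// )
// → { defaultModel: 'openai/gpt-4o',
//     defaultParameters: { temperature: 0.7, maxTokens: 2048 } }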

Cost Management#

Monitor and control AI spending with budgets, alerts, and usage tracking.

Spending Limits

Set daily, weekly, or monthly spending caps to prevent unexpected costs.

Usage Analytics

Track token usage, costs, and request patterns across models and projects.

Budget Alerts

Receive notifications when approaching or exceeding spending thresholds.

Cost Estimation

Preview estimated costs before executing prompts with large inputs.
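
Cost estimation is simple arithmetic over the per-token prices returned by /v1/models: estimated cost = input tokens × input price + output tokens × output price. A small client-side sketch (token counts are your own estimates):

Estimate Request Cost
typescript
interface TokenPricing {
  input: number;  // USD per input token
  output: number; // USD per output token
}

// Estimate the USD cost of a request from expected token counts and the
// per-token pricing returned by /v1/models.
function estimateCost(inputTokens: number, outputTokens: number, pricing: TokenPricing): number {
  return inputTokens * pricing.input + outputTokens * pricing.output;
}

// Claude 3.5 Sonnet pricing from the models listing above:
// 1,500 input tokens and 800 output tokens →
// 1500 * 0.000003 + 800 * 0.000015 = 0.0045 + 0.012 = 0.0165 USD
const cost = estimateCost(1500, 800, { input: 0.000003, output: 0.000015 });
console.log(cost.toFixed(4)); // "0.0165"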

Set Spending Limits
bash
curl -X PATCH "https://api.promptreports.ai/v1/organization/budget" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "limits": {
      "daily": 100.00,
      "weekly": 500.00,
      "monthly": 1500.00
    },
    "alerts": [
      { "threshold": 0.50, "email": true, "webhook": true },
      { "threshold": 0.80, "email": true, "webhook": true },
      { "threshold": 0.95, "email": true, "webhook": true, "pauseRequests": false },
      { "threshold": 1.00, "email": true, "webhook": true, "pauseRequests": true }
    ],
    "alertEmail": "billing@yourcompany.com"
  }'
Get Usage Report
bash
curl -X GET "https://api.promptreports.ai/v1/usage?period=month&breakdown=model" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response
{
  "period": { "start": "2024-01-01", "end": "2024-01-31" },
  "totalCost": 245.67,
  "totalRequests": 15420,
  "totalTokens": { "input": 12500000, "output": 3200000 },
  "breakdown": [
    {
      "model": "anthropic/claude-3.5-sonnet",
      "requests": 8500,
      "cost": 142.30,
      "tokens": { "input": 7800000, "output": 2100000 }
    },
    {
      "model": "openai/gpt-4o",
      "requests": 4200,
      "cost": 78.45,
      "tokens": { "input": 3500000, "output": 850000 }
    }
  ]
}
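
To act on a usage report, for example to flag expensive models, you can derive the average cost per request from the breakdown. A short sketch over the response shape shown above:

Rank Models by Cost per Request
typescript
interface ModelUsage {
  model: string;
  requests: number;
  cost: number;
  tokens: { input: number; output: number };
}

// Rank models by average cost per request, most expensive first.
function costPerRequest(breakdown: ModelUsage[]): { model: string; avgCost: number }[] {
  return breakdown
    .map((m) => ({ model: m.model, avgCost: m.cost / m.requests }))
    .sort((a, b) => b.avgCost - a.avgCost);
}

// With the example report above:
// anthropic/claude-3.5-sonnet → 142.30 / 8500 ≈ $0.0167 per request
// openai/gpt-4o               →  78.45 / 4200 ≈ $0.0187 per request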

Cost Optimization Tips#

  • Use cheaper models (GPT-3.5, Claude Haiku, Gemini Flash) for simple tasks like classification or extraction
  • Enable response caching for prompts with deterministic outputs
  • Set appropriate maxTokens limits to prevent unnecessarily long responses
  • Use streaming for better user experience without affecting costs
  • Batch similar requests when possible to reduce API overhead
  • Monitor usage patterns to identify expensive but underperforming prompts

Fallback Strategies#

Configure automatic fallback to alternative models when primary providers experience outages, rate limits, or increased latency.

Fallback Configuration
bash
curl -X PATCH "https://api.promptreports.ai/v1/settings/fallback" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "triggers": {
      "onRateLimit": true,
      "onTimeout": true,
      "onServerError": true,
      "timeoutMs": 30000
    },
    "fallbackChain": [
      {
        "model": "anthropic/claude-3.5-sonnet",
        "priority": 1
      },
      {
        "model": "openai/gpt-4o",
        "priority": 2,
        "conditions": { "maxCost": 0.10 }
      },
      {
        "model": "google/gemini-pro-1.5",
        "priority": 3
      }
    ],
    "notifyOnFallback": true
  }'
Response with Fallback Metadata
json
{
  "result": "Your generated content...",
  "metadata": {
    "model": "openai/gpt-4o",
    "originalModel": "anthropic/claude-3.5-sonnet",
    "fallbackReason": "rate_limit_exceeded",
    "latency": 1245,
    "tokens": { "input": 1500, "output": 800 },
    "cost": 0.023
  }
}
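
Because a fallback changes which model actually ran, it is worth inspecting the response metadata and surfacing fallbacks to your monitoring. A minimal sketch over the metadata shape shown above:

Detect Fallbacks in Response Metadata
typescript
interface ExecutionMetadata {
  model: string;
  originalModel?: string;
  fallbackReason?: string;
  latency: number;
  cost: number;
}

// Log a warning whenever a request was served by a fallback model, so
// repeated provider issues show up in your monitoring.
function reportFallback(meta: ExecutionMetadata): void {
  if (meta.originalModel && meta.originalModel !== meta.model) {
    console.warn(
      `Fallback: ${meta.originalModel} → ${meta.model} ` +
      `(reason: ${meta.fallbackReason ?? 'unknown'}, cost: $${meta.cost})`
    );
  }
}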

Performance Optimization#

Optimize response times and throughput for production workloads.

Streaming Responses#

Enable streaming to receive partial responses as they are generated, improving perceived latency for end users.

Streaming Request
bash
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "variables": { "topic": "AI trends" },
    "model": "anthropic/claude-3.5-sonnet",
    "stream": true
  }'

# Response (Server-Sent Events)
data: {"type":"content","delta":"Artificial"}
data: {"type":"content","delta":" intelligence"}
data: {"type":"content","delta":" is transforming"}
...
data: {"type":"done","usage":{"input":150,"output":450},"cost":0.0067}
Handling Streaming in TypeScript
typescript
async function streamPromptExecution(promptId: string, variables: object) {
  const response = await fetch(
    `https://api.promptreports.ai/v1/prompts/${promptId}/execute`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.API_KEY}`,
        'Content-Type': 'application/json',
        'Accept': 'text/event-stream',
      },
      body: JSON.stringify({ variables, stream: true }),
    }
  );

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';  // holds a partial SSE line that spans chunk boundaries
  let content = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // A network chunk can end mid-line, so split on newlines and keep the
    // trailing partial line in the buffer for the next iteration.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = JSON.parse(line.slice(6));
        if (data.type === 'content') {
          content += data.delta;
          console.log(data.delta); // Stream to UI
        } else if (data.type === 'done') {
          console.log('\nComplete! Cost:', data.cost);
        }
      }
    }
  }

  return content;
}
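
Calling the helper from an async context is then straightforward; for example, with the prompt ID used throughout this page:

Calling the Streaming Helper
typescript
const report = await streamPromptExecution('prm_abc123', { topic: 'AI trends' });
console.log(`Received ${report.length} characters`);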

Request Batching#

Batch Multiple Prompts
bash
curl -X POST "https://api.promptreports.ai/v1/prompts/batch" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "requests": [
      {
        "promptId": "prm_abc123",
        "variables": { "topic": "AI" }
      },
      {
        "promptId": "prm_abc123",
        "variables": { "topic": "Cloud Computing" }
      },
      {
        "promptId": "prm_def456",
        "variables": { "industry": "Healthcare" }
      }
    ],
    "model": "openai/gpt-4o",
    "parallel": true
  }'
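
The same batch call can be issued from TypeScript. The request body mirrors the curl example above; the response shape is an assumption (an array of per-request results), since it is not shown here:

Batch Execution in TypeScript (illustrative)
typescript
// Send several prompt executions in one batch request.
async function executeBatch(
  requests: { promptId: string; variables: object }[],
  model: string,
) {
  const res = await fetch('https://api.promptreports.ai/v1/prompts/batch', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ requests, model, parallel: true }),
  });
  return res.json(); // assumed: an array of per-request results
}

// const results = await executeBatch(
//   [
//     { promptId: 'prm_abc123', variables: { topic: 'AI' } },
//     { promptId: 'prm_abc123', variables: { topic: 'Cloud Computing' } },
//   ],
//   'openai/gpt-4o',
// );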

Model Comparison#

Use the model comparison feature to evaluate different models for your use case before committing to production.

Compare Models
bash
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/compare" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "variables": { "topic": "Renewable Energy Market Analysis" },
    "models": [
      "anthropic/claude-3.5-sonnet",
      "openai/gpt-4o",
      "google/gemini-pro-1.5"
    ],
    "parameters": {
      "temperature": 0.7,
      "maxTokens": 2048
    }
  }'
Comparison Response
json
{
  "results": [
    {
      "model": "anthropic/claude-3.5-sonnet",
      "output": "The renewable energy market is experiencing...",
      "metrics": {
        "latency": 2340,
        "tokens": { "input": 450, "output": 1820 },
        "cost": 0.0312
      }
    },
    {
      "model": "openai/gpt-4o",
      "output": "Global renewable energy trends show...",
      "metrics": {
        "latency": 1890,
        "tokens": { "input": 450, "output": 1650 },
        "cost": 0.0285
      }
    },
    {
      "model": "google/gemini-pro-1.5",
      "output": "The renewable energy sector continues...",
      "metrics": {
        "latency": 1520,
        "tokens": { "input": 450, "output": 1580 },
        "cost": 0.0198
      }
    }
  ],
  "summary": {
    "fastestModel": "google/gemini-pro-1.5",
    "cheapestModel": "google/gemini-pro-1.5",
    "longestOutput": "anthropic/claude-3.5-sonnet"
  }
}
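
One way to turn a comparison into a decision is to score each candidate on cost and latency; the weighting is your own choice. A sketch over the response shape above:

Pick a Model from Comparison Results
typescript
interface ComparisonResult {
  model: string;
  output: string;
  metrics: { latency: number; cost: number };
}

// Score candidates by normalized cost and latency (lower is better) and
// return the best model. The 50/50 weighting is arbitrary; tune it to
// your own quality/speed/cost trade-offs.
function pickModel(results: ComparisonResult[]): string {
  const maxCost = Math.max(...results.map((r) => r.metrics.cost));
  const maxLatency = Math.max(...results.map((r) => r.metrics.latency));

  const scored = results.map((r) => ({
    model: r.model,
    score: 0.5 * (r.metrics.cost / maxCost) + 0.5 * (r.metrics.latency / maxLatency),
  }));

  return scored.sort((a, b) => a.score - b.score)[0].model;
}

// With the example comparison above, google/gemini-pro-1.5 wins on both
// cost ($0.0198) and latency (1520 ms).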