AI Providers
Configure and manage AI model providers for report generation and prompt execution. Leverage OpenRouter for unified access to multiple LLMs with intelligent routing.
AI Providers Overview#
PromptReports uses OpenRouter as a unified gateway to access multiple AI model providers. This architecture gives you flexibility to choose the best model for each use case while maintaining a single, consistent API.
Multi-Provider Access
Access OpenAI, Anthropic, Google, Meta, Mistral, and more through a single API.
Intelligent Routing
Automatic model selection based on task requirements, cost, and availability.
Cost Optimization
Set spending limits, track usage, and optimize model selection for cost efficiency.
Fallback Protection
Automatic failover to alternative models when primary providers experience issues.
OpenRouter Integration#
OpenRouter acts as a unified API layer that routes requests to various AI providers. PromptReports handles the integration, so you can focus on building great prompts and reports without managing multiple API keys or provider-specific implementations.
Your Application
│
▼
PromptReports API
│
▼
OpenRouter Gateway
│
├──► OpenAI (GPT-4, GPT-4o, GPT-3.5)
├──► Anthropic (Claude 3.5, Claude 3)
├──► Google (Gemini Pro, Gemini Flash)
├──► Meta (Llama 3.1, Llama 3)
├──► Mistral (Large, Medium, Small)
└──► 100+ more models...

Bring Your Own Key (BYOK)#
You can optionally connect your own OpenRouter API key or direct provider keys for additional control and potentially lower costs at scale.
curl -X PATCH "https://api.promptreports.ai/v1/settings/ai-providers" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"openRouterKey": "sk-or-xxxxxxxxxxxxxxxxxxxx",
"fallbackToDefault": true
}'

Default vs Custom Keys
With fallbackToDefault set to true, requests fall back to PromptReports' default managed keys when your custom key fails, so executions continue uninterrupted.
Model Selection#
Choose models based on your requirements for quality, speed, and cost. PromptReports supports automatic model selection or explicit model specification.
Automatic Model Selection#
Let PromptReports choose the optimal model based on task characteristics:
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "quarterly sales analysis" },
"modelPreference": {
"priority": "quality", // "quality" | "speed" | "cost"
"maxCost": 0.05, // Max cost per request in USD
"capabilities": ["long-context", "structured-output"]
}
}'

Explicit Model Selection#
Specify exactly which model to use for full control:
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "quarterly sales analysis" },
"model": "anthropic/claude-3.5-sonnet",
"parameters": {
"temperature": 0.7,
"maxTokens": 4096,
"topP": 0.9
}
}'

Available Models#
| Model ID | Provider | Best For | Context Window |
|---|---|---|---|
| openai/gpt-4o | OpenAI | General-purpose, multimodal | 128K |
| openai/gpt-4-turbo | OpenAI | Complex reasoning, long documents | 128K |
| anthropic/claude-3.5-sonnet | Anthropic | Analysis, writing, coding | 200K |
| anthropic/claude-3-opus | Anthropic | Highest quality, complex tasks | 200K |
| google/gemini-pro-1.5 | Google | Long context, multimodal | 1M |
| google/gemini-flash-1.5 | Google | Fast responses, cost-effective | 1M |
| meta-llama/llama-3.1-405b | Meta | Open-source, high quality | 128K |
| mistral/mistral-large | Mistral | European hosting, multilingual | 128K |
curl -X GET "https://api.promptreports.ai/v1/models" \
-H "Authorization: Bearer YOUR_API_KEY"
# Response includes model details, pricing, and capabilities
{
"models": [
{
"id": "anthropic/claude-3.5-sonnet",
"name": "Claude 3.5 Sonnet",
"provider": "Anthropic",
"contextWindow": 200000,
"pricing": {
"input": 0.000003, // per token
"output": 0.000015 // per token
},
"capabilities": ["vision", "function-calling", "streaming"]
}
]
}

Provider Configuration#
Configure default settings and preferences for AI providers at the organization or project level.
curl -X PATCH "https://api.promptreports.ai/v1/organization/settings" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"aiProvider": {
"defaultModel": "anthropic/claude-3.5-sonnet",
"fallbackModels": [
"openai/gpt-4o",
"google/gemini-pro-1.5"
],
"maxCostPerRequest": 0.10,
"maxCostPerDay": 50.00,
"allowedModels": [
"anthropic/*",
"openai/gpt-4*",
"google/gemini*"
],
"blockedModels": []
}
}'

Project-Level Settings#
curl -X PATCH "https://api.promptreports.ai/v1/projects/proj_abc123/settings" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"aiProvider": {
"defaultModel": "openai/gpt-4o",
"defaultParameters": {
"temperature": 0.5,
"maxTokens": 2048
},
"streaming": true,
"caching": {
"enabled": true,
"ttlSeconds": 3600
}
}
}'

Response Caching
When caching is enabled, repeated executions with identical inputs are served from cache for up to ttlSeconds, avoiding duplicate model calls and their cost.
Cost Management#
Monitor and control AI spending with budgets, alerts, and usage tracking.
Spending Limits
Set daily, weekly, or monthly spending caps to prevent unexpected costs.
Usage Analytics
Track token usage, costs, and request patterns across models and projects.
Budget Alerts
Receive notifications when approaching or exceeding spending thresholds.
Cost Estimation
Preview estimated costs before executing prompts with large inputs.
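The alert thresholds in the budget configuration below are fractions of a spending limit (0.50 = 50%). As a minimal sketch of how such thresholds could be evaluated, assuming an `Alert` shape that mirrors the request body (the helper functions themselves are illustrative, not part of the API):

```typescript
interface Alert {
  threshold: number;        // fraction of the limit, e.g. 0.8 = 80%
  email: boolean;
  webhook: boolean;
  pauseRequests?: boolean;
}

// Return the alerts whose thresholds the current spend has crossed.
function triggeredAlerts(spend: number, limit: number, alerts: Alert[]): Alert[] {
  return alerts.filter((a) => spend / limit >= a.threshold);
}

// Whether any crossed alert is configured to pause further requests.
function shouldPause(spend: number, limit: number, alerts: Alert[]): boolean {
  return triggeredAlerts(spend, limit, alerts).some((a) => a.pauseRequests === true);
}
```

With a $100 daily limit and $85 spent, the 50% and 80% alerts fire but requests continue; only once spend reaches the 100% threshold does a `pauseRequests: true` alert stop further execution.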
curl -X PATCH "https://api.promptreports.ai/v1/organization/budget" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"limits": {
"daily": 100.00,
"weekly": 500.00,
"monthly": 1500.00
},
"alerts": [
{ "threshold": 0.50, "email": true, "webhook": true },
{ "threshold": 0.80, "email": true, "webhook": true },
{ "threshold": 0.95, "email": true, "webhook": true, "pauseRequests": false },
{ "threshold": 1.00, "email": true, "webhook": true, "pauseRequests": true }
],
"alertEmail": "billing@yourcompany.com"
}'

curl -X GET "https://api.promptreports.ai/v1/usage?period=month&breakdown=model" \
-H "Authorization: Bearer YOUR_API_KEY"
# Response
{
"period": { "start": "2024-01-01", "end": "2024-01-31" },
"totalCost": 245.67,
"totalRequests": 15420,
"totalTokens": { "input": 12500000, "output": 3200000 },
"breakdown": [
{
"model": "anthropic/claude-3.5-sonnet",
"requests": 8500,
"cost": 142.30,
"tokens": { "input": 7800000, "output": 2100000 }
},
{
"model": "openai/gpt-4o",
"requests": 4200,
"cost": 78.45,
"tokens": { "input": 3500000, "output": 850000 }
}
]
}

Cost Optimization Tips#
- Use cheaper models (GPT-3.5, Claude Haiku, Gemini Flash) for simple tasks like classification or extraction
- Enable response caching for prompts with deterministic outputs
- Set appropriate maxTokens limits to prevent unnecessarily long responses
- Use streaming for better user experience without affecting costs
- Batch similar requests when possible to reduce API overhead
- Monitor usage patterns to identify expensive but underperforming prompts
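Several of these tips come down to arithmetic on the per-token prices returned by the models endpoint. A sketch of estimating a request's worst-case cost before sending it, using the pricing fields from the model listing (token counts here are illustrative):

```typescript
interface Pricing {
  input: number;   // USD per input token
  output: number;  // USD per output token
}

// Worst-case cost: every allowed output token is generated.
function estimateCost(inputTokens: number, maxOutputTokens: number, p: Pricing): number {
  return inputTokens * p.input + maxOutputTokens * p.output;
}

// Claude 3.5 Sonnet pricing from the models listing above.
const sonnet: Pricing = { input: 0.000003, output: 0.000015 };
```

For a 1,500-token prompt capped at 4,096 output tokens, the worst case is roughly $0.066, so a modelPreference maxCost of 0.05 would require a lower maxTokens cap.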
Fallback Strategies#
Configure automatic fallback to alternative models when primary providers experience outages, rate limits, or increased latency.
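Conceptually, the fallback chain is walked in priority order, skipping models that have already failed and entries whose conditions (such as maxCost) are not met. A local sketch of that selection logic, with a `FallbackEntry` shape mirroring the configuration below (the helper is illustrative, the actual routing happens server-side):

```typescript
interface FallbackEntry {
  model: string;
  priority: number;
  conditions?: { maxCost?: number };
}

// Pick the next model to try: lowest priority number first, skipping
// failed models and entries whose cost condition is not satisfied.
function nextModel(
  chain: FallbackEntry[],
  failed: Set<string>,
  estimatedCost: number
): string | null {
  const candidates = chain
    .filter((e) => !failed.has(e.model))
    .filter((e) => e.conditions?.maxCost === undefined || estimatedCost <= e.conditions.maxCost)
    .sort((a, b) => a.priority - b.priority);
  return candidates[0]?.model ?? null;
}
```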
curl -X PATCH "https://api.promptreports.ai/v1/settings/fallback" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"enabled": true,
"triggers": {
"onRateLimit": true,
"onTimeout": true,
"onServerError": true,
"timeoutMs": 30000
},
"fallbackChain": [
{
"model": "anthropic/claude-3.5-sonnet",
"priority": 1
},
{
"model": "openai/gpt-4o",
"priority": 2,
"conditions": { "maxCost": 0.10 }
},
{
"model": "google/gemini-pro-1.5",
"priority": 3
}
],
"notifyOnFallback": true
}'

Fallback Notifications
With notifyOnFallback enabled, each fallback is surfaced in the execution response: the metadata records the model actually used, the originally requested model, and the reason for the switch.
{
"result": "Your generated content...",
"metadata": {
"model": "openai/gpt-4o",
"originalModel": "anthropic/claude-3.5-sonnet",
"fallbackReason": "rate_limit_exceeded",
"latency": 1245,
"tokens": { "input": 1500, "output": 800 },
"cost": 0.023
}
}

Performance Optimization#
Optimize response times and throughput for production workloads.
Streaming Responses#
Enable streaming to receive partial responses as they are generated, improving perceived latency for end users.
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"variables": { "topic": "AI trends" },
"model": "anthropic/claude-3.5-sonnet",
"stream": true
}'
# Response (Server-Sent Events)
data: {"type":"content","delta":"Artificial"}
data: {"type":"content","delta":" intelligence"}
data: {"type":"content","delta":" is transforming"}
...
data: {"type":"done","usage":{"input":150,"output":450},"cost":0.0067}

async function streamPromptExecution(promptId: string, variables: object) {
  const response = await fetch(
    `https://api.promptreports.ai/v1/prompts/${promptId}/execute`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.API_KEY}`,
        'Content-Type': 'application/json',
        'Accept': 'text/event-stream',
      },
      body: JSON.stringify({ variables, stream: true }),
    }
  );

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let content = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Buffer chunks: a single SSE line can be split across reads,
    // so only parse lines once a newline terminates them.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any trailing partial line

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = JSON.parse(line.slice(6));
      if (data.type === 'content') {
        content += data.delta;
        console.log(data.delta); // stream to UI
      } else if (data.type === 'done') {
        console.log('\nComplete! Cost:', data.cost);
      }
    }
  }

  return content;
}

Request Batching#
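Batching submits several prompt executions in one request, reducing per-request overhead. A sketch of assembling the payload for the batch endpoint, where the request shape mirrors the curl example that follows (the helper functions are hypothetical):

```typescript
interface BatchItem {
  promptId: string;
  variables: Record<string, string>;
}

// Build the body for POST /v1/prompts/batch.
function buildBatchPayload(requests: BatchItem[], model: string, parallel = true) {
  return { requests, model, parallel };
}

// Split a large workload into batches of a given size, so each
// batch request stays within reasonable payload limits.
function chunkRequests<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const payload = buildBatchPayload(
  [
    { promptId: 'prm_abc123', variables: { topic: 'AI' } },
    { promptId: 'prm_abc123', variables: { topic: 'Cloud Computing' } },
  ],
  'openai/gpt-4o'
);
```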
curl -X POST "https://api.promptreports.ai/v1/prompts/batch" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"requests": [
{
"promptId": "prm_abc123",
"variables": { "topic": "AI" }
},
{
"promptId": "prm_abc123",
"variables": { "topic": "Cloud Computing" }
},
{
"promptId": "prm_def456",
"variables": { "industry": "Healthcare" }
}
],
"model": "openai/gpt-4o",
"parallel": true
}'

Model Comparison#
Use the model comparison feature to evaluate different models for your use case before committing to production.
curl -X POST "https://api.promptreports.ai/v1/prompts/prm_abc123/compare" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"variables": { "topic": "Renewable Energy Market Analysis" },
"models": [
"anthropic/claude-3.5-sonnet",
"openai/gpt-4o",
"google/gemini-pro-1.5"
],
"parameters": {
"temperature": 0.7,
"maxTokens": 2048
}
}'

# Response
{
"results": [
{
"model": "anthropic/claude-3.5-sonnet",
"output": "The renewable energy market is experiencing...",
"metrics": {
"latency": 2340,
"tokens": { "input": 450, "output": 1820 },
"cost": 0.0312
}
},
{
"model": "openai/gpt-4o",
"output": "Global renewable energy trends show...",
"metrics": {
"latency": 1890,
"tokens": { "input": 450, "output": 1650 },
"cost": 0.0285
}
},
{
"model": "google/gemini-pro-1.5",
"output": "The renewable energy sector continues...",
"metrics": {
"latency": 1520,
"tokens": { "input": 450, "output": 1580 },
"cost": 0.0198
}
}
],
"summary": {
"fastestModel": "google/gemini-pro-1.5",
"cheapestModel": "google/gemini-pro-1.5",
"longestOutput": "anthropic/claude-3.5-sonnet"
}
}
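The summary fields are derived from the per-model metrics, and you can compute your own rankings locally, for example a cost-per-output-token view. A sketch using field names that follow the response above (which metric matters most is, of course, your call):

```typescript
interface ComparisonResult {
  model: string;
  metrics: {
    latency: number;                              // milliseconds
    tokens: { input: number; output: number };
    cost: number;                                 // USD
  };
}

// Cheapest model per generated token: a rough value-for-money signal.
function cheapestPerOutputToken(results: ComparisonResult[]): string {
  return results.reduce((best, r) =>
    r.metrics.cost / r.metrics.tokens.output <
    best.metrics.cost / best.metrics.tokens.output ? r : best
  ).model;
}

// Lowest end-to-end latency.
function fastest(results: ComparisonResult[]): string {
  return results.reduce((best, r) =>
    r.metrics.latency < best.metrics.latency ? r : best
  ).model;
}
```

On the sample response above, both helpers select google/gemini-pro-1.5, matching the summary block.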