Core Concepts
Understand the fundamental building blocks of PromptReports and how they work together to create a powerful prompt engineering platform.
Platform Architecture
PromptReports is built around a hierarchical structure that mirrors professional software development workflows. Understanding this architecture helps you organize your work effectively and leverage all platform capabilities.
Prompt Folders
Top-level containers for organizing related prompts. Similar to projects or repositories.
Prompts
Individual prompt templates with variables, stored within folders.
Versions
Immutable snapshots of prompts that enable change tracking and rollback.
Test Datasets
Collections of test cases for systematic prompt evaluation.
Prompt Folders
Prompt folders are the primary organizational unit in PromptReports. Think of them as projects or repositories that contain all related prompts, datasets, and configurations.
Folder Features
- Organization: Group related prompts by project, team, or use case
- Access Control: Manage who can view, edit, or approve prompts
- Shared Datasets: Test datasets are scoped to folders for reuse
- Webhooks: Configure folder-level event notifications
- Tags: Add metadata for filtering and discovery
Each folder can contain multiple prompts, and prompts within a folder share access to the folder's test datasets. This makes it easy to evaluate multiple prompt variations against the same test cases.
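For intuition, here's a minimal Python sketch of that hierarchy. The class and field names are illustrative only, not the PromptReports API; the point is that datasets hang off the folder, so every prompt inside it can reuse them:

```python
from dataclasses import dataclass, field

@dataclass
class Prompt:
    name: str
    template: str  # e.g. "Hello {{name}}!"

@dataclass
class Dataset:
    name: str
    rows: list[dict] = field(default_factory=list)

@dataclass
class PromptFolder:
    name: str
    prompts: list[Prompt] = field(default_factory=list)
    # Datasets are scoped to the folder and shared by all its prompts.
    datasets: list[Dataset] = field(default_factory=list)

support = PromptFolder("support-bot")
support.datasets.append(Dataset("greeting-cases"))
# Every prompt added to `support` can now be evaluated against "greeting-cases".
```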
Prompts & Versions
Prompts in PromptReports are versioned, meaning every change creates a new immutable version. This approach provides several benefits:
| Feature | Description |
|---|---|
| Change History | View complete history of all changes with diffs |
| Rollback | Instantly revert to any previous version |
| A/B Testing | Compare performance between versions |
| Promotion Flow | Move versions through dev/staging/production stages |
| Audit Trail | Track who made what changes and when |
Each prompt version can be in one of several states:
| State | Description | Can Edit? |
|---|---|---|
| Draft | Work in progress, not yet saved as a version | Yes |
| Development | Saved version under active development | No (create new) |
| Staging | Version being tested for production | No |
| Production | Live version serving real traffic | No |
| Archived | Deprecated version kept for reference | No |
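A rough sketch of how these stage rules could be enforced. The transition logic below is inferred from the table above, not taken from PromptReports internals:

```python
from enum import Enum

class Stage(Enum):
    DRAFT = "draft"
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"
    ARCHIVED = "archived"

# Forward path implied by the table; a draft must be saved as a version first.
PROMOTION_PATH = [Stage.DEVELOPMENT, Stage.STAGING, Stage.PRODUCTION]

def promote(current: Stage) -> Stage:
    """Return the next stage, enforcing the table's rules."""
    if current == Stage.DRAFT:
        raise ValueError("Save the draft as a version before promoting.")
    if current in (Stage.PRODUCTION, Stage.ARCHIVED):
        raise ValueError(f"{current.value} versions cannot be promoted further.")
    return PROMOTION_PATH[PROMOTION_PATH.index(current) + 1]

assert promote(Stage.STAGING) is Stage.PRODUCTION
```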
Version Best Practices
- Treat saved versions as immutable: create a new version rather than editing in place.
- Validate a version in Staging before promoting it to Production.
- Archive deprecated versions instead of deleting them, so the audit trail stays intact.
- Use rollback to recover quickly if a promoted version underperforms.
Test Datasets
Test datasets are collections of input-output pairs (or just inputs) used to evaluate prompt quality systematically. They're essential for:
- Regression Testing: Ensure changes don't degrade quality
- Benchmarking: Compare different prompt versions objectively
- Quality Metrics: Track performance over time
- Edge Cases: Document and test known difficult scenarios
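As a concrete picture, a dataset row pairs variable values with an optional expected output. The field names below are assumptions for illustration, not a documented schema:

```python
# Two hypothetical rows: one with an expected output (usable for
# regression testing), one input-only (scored by other metrics).
dataset = [
    {
        "variables": {"name": "Ada", "tone": "formal"},
        "expected_output": "Dear Ada, thank you for reaching out.",
    },
    {
        "variables": {"name": "Bob", "tone": "casual"},
    },
]
```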
Datasets can be created in multiple ways:
Manual Entry
Add test cases one by one through the UI.
CSV Import
Upload test cases from spreadsheets or other sources.
From History
Create datasets from past prompt executions.
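For the CSV route, a loader along these lines would turn a spreadsheet export into rows of the shape sketched above; the `expected_output` column name is an assumption:

```python
import csv

def load_dataset(path: str) -> list[dict]:
    """Read test cases from a CSV export, treating every column
    except `expected_output` as a prompt variable."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            expected = record.pop("expected_output", None)
            row = {"variables": record}
            if expected:
                row["expected_output"] = expected
            rows.append(row)
    return rows
```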
Evaluations
Evaluations run your prompts against test datasets and measure quality. PromptReports supports several evaluation types:
| Type | Purpose | When to Use |
|---|---|---|
| Batch Evaluation | Run prompt against all dataset rows | Regular quality checks, before deployments |
| A/B Testing | Compare two versions with statistical significance | Deciding between prompt variations |
| Pairwise Comparison | Head-to-head comparison on same inputs | Detailed quality analysis |
| Backtesting | Test new version against historical data | Understanding impact of changes |
| Regression Testing | Compare against baseline before promotion | Preventing quality degradation |
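To make "run prompt against all dataset rows" concrete, here is a batch evaluation in miniature. `render` stands in for executing the prompt against a model, and exact match stands in for whichever metric the evaluation is actually configured with:

```python
from typing import Callable, Optional

def batch_evaluate(render: Callable[[dict], str],
                   dataset: list[dict]) -> Optional[float]:
    """Run the prompt over every row and score rows that carry an
    expected output; return the pass rate, or None if nothing was scored."""
    passed = scored = 0
    for row in dataset:
        output = render(row["variables"])
        if "expected_output" in row:
            scored += 1
            passed += output == row["expected_output"]
    return passed / scored if scored else None
```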
Workflows
PromptReports supports collaborative workflows for teams:
Collaboration
Request review and approval before promoting versions to production.
Evaluation & Testing
Run evaluations and regression tests to ensure quality.
Webhooks
Trigger external systems when prompts are created, updated, or promoted.
Version Control
Track changes and manage versions of your prompts.
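As an illustration of the webhook workflow, here is a minimal receiver built on the standard library. The payload shape and event name are assumptions, since the actual schema depends on your folder's webhook configuration:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        event = json.loads(body)
        # "version.promoted" is a hypothetical event name for illustration.
        if event.get("type") == "version.promoted":
            print(f"Version promoted: {event}")
        self.send_response(204)  # acknowledge receipt with no body
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), WebhookHandler).serve_forever()
```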
Key Terminology
Here's a quick reference of important terms used throughout PromptReports:
| Term | Definition |
|---|---|
| Prompt | A template containing text and variables that generates AI responses |
| Version | An immutable snapshot of a prompt at a point in time |
| Variable | A placeholder in a prompt (e.g., {{name}}) that gets replaced at runtime |
| Preset | A saved set of variable values for quick testing |
| Context File | Additional content injected into prompts for reference |
| Dataset | A collection of test cases with variable values and optional expected outputs |
| Evaluation | A run of a prompt against a dataset to measure quality |
| Promotion | Moving a version from one stage to another (e.g., dev → production) |
| Regression | Quality degradation compared to a baseline version |
| Backtest | Testing a new version against historical execution data |
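Tying the terminology together, here is a small sketch of how a `{{name}}`-style variable might be substituted at runtime; PromptReports' actual delimiter and escaping rules may differ:

```python
import re

def render_prompt(template: str, variables: dict) -> str:
    """Replace {{name}}-style placeholders with preset or runtime values."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"missing variable: {key}")
        return str(variables[key])
    # Allow optional whitespace inside the braces, e.g. {{ name }}.
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)

print(render_prompt("Hello {{name}}!", {"name": "Ada"}))  # -> Hello Ada!
```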