PromptReports Quality Index
A single weighted composite score across all four benchmark dimensions. PQI is the only AI quality index that combines citation accuracy, hallucination resistance, source authority, and prose quality.
Which AI model delivers the highest overall research quality? Find out with verified daily benchmarks.
Quality Index Leaderboard
Rankings by PQI Score (0–100, highest = best). Only models with data in all four pipelines appear here.
Performance Insights
Discover how composite scores evolve over time, compare models across all four quality dimensions, and find which AI delivers the most consistent research quality.
How PQI Is Calculated
Four daily pipelines, normalized to 0–100, combined into a single weighted score
Four Daily Benchmarks
Each pipeline runs every 24 hours: CV tests citation accuracy, HD tests hallucination resistance, ST tests source authority, WQ tests prose quality.
Normalize to 0–100
Every benchmark score is normalized to a 0–100 scale where higher is always better. Hallucination Rate is inverted: a lower hallucination rate yields a higher HD score.
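As a rough illustration of the inversion step, here is a minimal Python sketch. It assumes the hallucination rate is reported as a percentage between 0 and 100 and that the inversion is a simple `100 - rate`. The page does not state the exact normalization method, so both the function names and the formula are assumptions, not the benchmark's actual implementation.

```python
def clamp_score(value: float) -> float:
    """Clamp a value to the 0-100 range used by every benchmark score."""
    return max(0.0, min(100.0, value))


def hd_score_from_rate(hallucination_rate_pct: float) -> float:
    """Invert a hallucination rate so that lower rates give higher scores.

    Assumption: the rate is a percentage (0-100) and the inversion is
    linear. The real pipeline may normalize differently.
    """
    return clamp_score(100.0 - hallucination_rate_pct)
```

Under these assumptions, a model that hallucinates in 12.5% of test cases would receive an HD score of 87.5.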
Weighted Composite
PQI = CV×30% + HD×30% + ST×25% + WQ×15%. Only models with data across all four pipelines qualify for a composite ranking.
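The weighted composite and the qualification rule can be sketched in a few lines of Python. The weights come from the formula above. The function names, the dictionary layout, and the example scores are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Weights as stated in the PQI formula; they sum to 1.0.
PQI_WEIGHTS: Dict[str, float] = {"CV": 0.30, "HD": 0.30, "ST": 0.25, "WQ": 0.15}


def pqi_score(scores: Dict[str, float]) -> Optional[float]:
    """Return the weighted composite, or None if any pipeline score is missing."""
    if any(name not in scores for name in PQI_WEIGHTS):
        return None
    return sum(weight * scores[name] for name, weight in PQI_WEIGHTS.items())


def rank_models(all_scores: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Rank qualifying models by PQI, highest first."""
    ranked = []
    for model, scores in all_scores.items():
        pqi = pqi_score(scores)
        if pqi is not None:  # models lacking any pipeline are left out
            ranked.append((model, pqi))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical normalized scores, for illustration only.
example = {
    "model-a": {"CV": 90.0, "HD": 80.0, "ST": 70.0, "WQ": 60.0},
    "model-b": {"CV": 85.0, "HD": 95.0, "ST": 60.0},  # no WQ data: excluded
}
```

For the hypothetical `model-a`, the composite works out to 90×0.30 + 80×0.30 + 70×0.25 + 60×0.15 = 27 + 24 + 17.5 + 9 = 77.5. `model-b` has no WQ score, so it receives no composite and does not appear in the ranking.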
Use the model with the highest overall quality score
PromptReports.ai routes your research through the top-ranked model for your use case. Rankings are verified daily across all four quality dimensions by this benchmark.