Model Optimization
Stop Using Opus for Everything: A Model Routing Guide for Vibe Coders
PromptReports System
March 30, 2026
4 min read
Stop Using Opus for Everything: A Model Routing Guide for Vibe Coders
Opus is the best coding model available. It's also the most expensive. And you're probably using it for tasks that cheaper models handle equally well.
We analyzed token usage across thousands of Claude Code and OpenRouter sessions. The finding: 35-45% of tasks sent to Opus could use a faster or cheaper model with identical output quality. That's $150-300/month in pure savings for the average vibe coder.
The Model Cost Ladder
Here's what you're actually paying (approximate per-million tokens as of April 2026):
Model | Input | Output | Best For
Claude Opus 4.6 | $15 | $75 | Architecture, complex reasoning, multi-file refactors
Claude Opus /fast | $15 | $75 | Same capabilities, faster output for simple tasks
Claude Sonnet 4.6 | $3 | $15 | Solid coding, most feature work
Claude Haiku 4.5 | $0.80 | $4 | Simple transforms, data formatting, test generation
GPT-4o | $2.50 | $10 | Good general coding, alternative perspective
DeepSeek V3 | $0.27 | $1.10 | Bulk processing, simple completions
The price difference between Opus and Haiku is 19x on output tokens. If 30% of your tasks could use Haiku, you're leaving significant money on the table.
Task-to-Model Routing
Here's a practical routing guide based on our analysis:
Opus (Default Claude Code) — 40% of tasks
Keep Opus for anything requiring deep reasoning:
• Designing new architecture
• Debugging complex, multi-file issues
• Security reviews and auth changes
• Large refactors spanning 5+ files
• Plan mode conversations
Opus /fast — 25% of tasks
Same model, faster output, lower effective cost due to speed:
• File reads and searches (grep, glob)
• Git operations (commit, diff, status)
• Simple single-file edits
• Running build commands
• Formatting changes
Sonnet — 20% of tasks (via OpenRouter API)
Great for feature work that doesn't need Opus-level reasoning:
• Building straightforward CRUD endpoints
• Creating UI components from clear specs
• Writing tests for existing code
• Documentation generation
• Simple bug fixes with known causes
Haiku — 15% of tasks (via OpenRouter API)
Fast and cheap for mechanical work:
• Generating seed data and fixtures
• Converting between data formats
• Extracting structured data from text
• Simple string transformations
• Summarizing long outputs
How to Route in Practice
In Claude Code: Toggle /fast mode for simple tasks. This is the lowest-friction optimization — one command toggles it on and off.
In your application code: Use OpenRouter to route API calls to the right model:
For batch operations: Use the cheapest model that produces acceptable output. Test with 10 examples on Haiku before committing to Opus for 1,000.
Measuring Your Model Mix
The PromptReports CLI shows your current model distribution:
npx @promptreports/cli --sessions --models
Output:
If you see Opus at 80%+, there's room to optimize.
Common Objections
"What if the cheaper model gives worse output?"
Test it. For mechanical tasks (formatting, renaming, git ops), the output is identical. For simple coding tasks, Sonnet produces code that's functionally equivalent to Opus 95% of the time. The 5% where it's not is exactly the work that should stay on Opus.
"Switching models is annoying."
In Claude Code, it's one command: /fast. In your API code, it's a one-line model parameter change. The PromptReports dashboard tracks which model each task used, so you can see the impact immediately.
"I'd rather pay more and not think about it."
Fair. But $200/month is $2,400/year. For a team of 5, that's $12,000/year. Enough to fund another service, hire a contractor, or just keep in your pocket.
Start Tracking
See your model mix and savings opportunities:
npx @promptreports/cli