Stop Using Opus for Everything: A Model Routing Guide for Vibe Coders

Opus is the best coding model available. It's also the most expensive. And you're probably using it for tasks that cheaper models handle equally well.

We analyzed token usage across thousands of Claude Code and OpenRouter sessions. The finding: 35-45% of tasks sent to Opus could use a faster or cheaper model with identical output quality. That's $150-300/month in pure savings for the average vibe coder.

The Model Cost Ladder

Here's what you're actually paying (approximate per-million tokens as of April 2026):

Model | Input | Output | Best For

Claude Opus 4.6 | $15 | $75 | Architecture, complex reasoning, multi-file refactors

Claude Opus /fast | $15 | $75 | Same capabilities, faster output for simple tasks

Claude Sonnet 4.6 | $3 | $15 | Solid coding, most feature work

Claude Haiku 4.5 | $0.80 | $4 | Simple transforms, data formatting, test generation

GPT-4o | $2.50 | $10 | Good general coding, alternative perspective

DeepSeek V3 | $0.27 | $1.10 | Bulk processing, simple completions

The price difference between Opus and Haiku is 19x on output tokens. If 30% of your tasks could use Haiku, you're leaving significant money on the table.

Task-to-Model Routing

Here's a practical routing guide based on our analysis:

Opus (Default Claude Code) — 40% of tasks

Keep Opus for anything requiring deep reasoning:

• Designing new architecture

• Debugging complex, multi-file issues

• Security reviews and auth changes

• Large refactors spanning 5+ files

• Plan mode conversations

Opus /fast — 25% of tasks

Same model, faster output, lower effective cost due to speed:

• File reads and searches (grep, glob)

• Git operations (commit, diff, status)

• Simple single-file edits

• Running build commands

• Formatting changes

Sonnet — 20% of tasks (via OpenRouter API)

Great for feature work that doesn't need Opus-level reasoning:

• Building straightforward CRUD endpoints

• Creating UI components from clear specs

• Writing tests for existing code

• Documentation generation

• Simple bug fixes with known causes

Haiku — 15% of tasks (via OpenRouter API)

Fast and cheap for mechanical work:

• Generating seed data and fixtures

• Converting between data formats

• Extracting structured data from text

• Simple string transformations

• Summarizing long outputs

How to Route in Practice

In Claude Code: Toggle /fast mode for simple tasks. This is the lowest-friction optimization — one command toggles it on and off.

In your application code: Use OpenRouter to route API calls to the right model:

For batch operations: Use the cheapest model that produces acceptable output. Test with 10 examples on Haiku before committing to Opus for 1,000.

Measuring Your Model Mix

The PromptReports CLI shows your current model distribution:

npx @promptreports/cli --sessions --models

Output:

If you see Opus at 80%+, there's room to optimize.

Common Objections

"What if the cheaper model gives worse output?"

Test it. For mechanical tasks (formatting, renaming, git ops), the output is identical. For simple coding tasks, Sonnet produces code that's functionally equivalent to Opus 95% of the time. The 5% where it's not is exactly the work that should stay on Opus.

"Switching models is annoying."

In Claude Code, it's one command: /fast. In your API code, it's a one-line model parameter change. The PromptReports dashboard tracks which model each task used, so you can see the impact immediately.

"I'd rather pay more and not think about it."

Fair. But $200/month is $2,400/year. For a team of 5, that's $12,000/year. Enough to fund another service, hire a contractor, or just keep in your pocket.

Start Tracking

See your model mix and savings opportunities:

npx @promptreports/cli

Free at promptreports.ai.