Prompt Optimization
Prompt Engineering That Actually Saves You Money
PromptReports System
April 8, 2026
4 min read
Most prompt engineering advice focuses on getting better outputs. That matters. But nobody talks about the cost side: a badly structured prompt can 3-5x your token spend for the same result.
We analyzed 12,000 Claude Code sessions and identified the prompt patterns that waste the most money. Here's what we found and how to fix each one.
The Lazy Prompt Tax
The most expensive prompt pattern is also the most common: vague instructions that force the AI to guess what you want.
Expensive prompt:
"Fix the authentication bug."
Claude Code doesn't know which bug. It reads every auth-related file, explores multiple theories, tries fixes that don't work, and eventually asks you for clarification. That exploration costs tokens.
Cheap prompt:
"The login form at app/auth/login/page.tsx throws a 401 when the user has a valid session cookie. The issue is likely in the middleware at middleware.ts around line 45 where we check the session. Fix the cookie validation logic."
Same result. Half the tokens. You already know where the bug is, so tell the AI. The 30 seconds you spend writing a specific prompt saves 5 minutes of AI exploration and $2-5 in tokens.
The Context Dump Problem
Some developers go the other direction and paste entire files, logs, and stack traces into their prompts. More context isn't always better. It's often more expensive and less effective.
What to include:
• The specific file and line numbers
• The exact error message (not the full stack trace)
• What you expected vs. what happened
• One or two relevant code snippets
What to leave out:
• Full file contents (Claude Code can read them itself with the Read tool)
• Entire log files (grep for the relevant lines first)
• Background context the AI already has from CLAUDE.md
• Explanations of how the framework works
Rule of thumb: if Claude Code can look it up with a tool call, don't paste it into your prompt.
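As a concrete way to apply the "grep first" advice, here's a minimal sketch (function name and context window are our own, not part of any tool) that keeps only the lines around an error instead of the whole log:

```python
def relevant_lines(log_text: str, needle: str, context: int = 2) -> str:
    """Keep only the lines near the error, instead of pasting the whole log."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            # Include a few lines of surrounding context for each match.
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return "\n".join(lines[i] for i in sorted(keep))

# 200 lines of noise, one line that matters.
log = "\n".join(f"INFO request {n}" for n in range(200))
log += "\nERROR 401 invalid session cookie\nINFO request done"
print(relevant_lines(log, "401"))  # 4 lines instead of 202
```

Pasting those four lines gives the AI everything it needs at a fraction of the input tokens.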
The One-Shot vs. Multi-Turn Decision
Some tasks are cheaper as a single detailed prompt. Others are cheaper as a conversation. Knowing which is which saves real money.
Use one-shot prompts for:
• Bug fixes where you know the location and cause
• Simple feature additions with clear specs
• Refactoring with well-defined scope
• Generating boilerplate from a template
Use multi-turn conversation for:
• Exploratory debugging where you don't know the cause
• Architecture decisions that need discussion
• Complex features that benefit from incremental review
• Learning and understanding unfamiliar code
The key insight: one-shot prompts are cheaper per task, but only if the prompt is specific enough to avoid retries. A vague one-shot prompt that leads to "that's not what I meant" follow-ups is the most expensive pattern of all.
Model-Aware Prompting
Different models have different strengths. Routing the right prompt to the right model can cut costs dramatically.
Claude Opus (default):
• Architecture and design decisions
• Complex multi-file changes
• Security-sensitive code
• Code review
Claude Opus /fast mode:
• Simple file operations
• Formatting and renaming
• Git operations
• Running commands
Claude Haiku (via API):
• Generating test data
• Formatting conversions
• Simple text transformations
• Log parsing
If you're using Opus for everything, you're overpaying for at least 30% of your tasks.
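One way to make this routing a habit is to encode it in a lookup table. This is an illustrative sketch, not part of any SDK; the task categories and tier names are our own shorthand for the lists above:

```python
# Illustrative routing table based on the tiers above.
# Tier names are shorthand, not real model identifiers.
MODEL_FOR_TASK = {
    "architecture": "opus",
    "security_review": "opus",
    "multi_file_change": "opus",
    "formatting": "opus-fast",
    "rename": "opus-fast",
    "git": "opus-fast",
    "test_data": "haiku",
    "log_parsing": "haiku",
}

def pick_model(task_type: str) -> str:
    # Default to the most capable (and most expensive) tier when unsure.
    return MODEL_FOR_TASK.get(task_type, "opus")
```

Even a rough table like this beats defaulting everything to the top tier.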
Measuring Prompt Efficiency
You can't optimize what you don't measure. The PromptReports CLI shows you exactly which prompts cost the most:
npx @promptreports/cli --sessions --details
This breaks down each session into individual turns, showing:
• Input tokens vs. output tokens per turn
• Cache hit rate (higher is better: it means you're reusing cached context efficiently)
• Cost per turn
• Which turns triggered expensive operations (file reads, searches, multi-tool calls)
Look for turns where the input token count spikes without a corresponding increase in output quality. Those are your optimization targets.
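The arithmetic behind these per-turn numbers is simple. Here's a minimal sketch of the two metrics; the rates are placeholder per-million-token prices for illustration, not actual model pricing:

```python
def turn_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0,
              in_rate: float = 15.0, out_rate: float = 75.0,
              cache_rate: float = 1.5) -> float:
    """Rough per-turn cost in dollars. Rates are per million tokens (illustrative)."""
    uncached = input_tokens - cached_tokens  # cached input bills at a discount
    return (uncached * in_rate
            + cached_tokens * cache_rate
            + output_tokens * out_rate) / 1_000_000

def cache_hit_rate(cached_tokens: int, input_tokens: int) -> float:
    """Fraction of input tokens served from cache."""
    return cached_tokens / input_tokens if input_tokens else 0.0
```

Note how much the cache matters: at these rates, a turn with 100k input tokens costs far less when 80% of them are cache hits, which is why restarting a session (and losing the cache) has a real price too.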
The 80/20 of Prompt Optimization
If you only do three things:
1. Be specific. Include file paths, line numbers, and expected behavior. Every second you spend writing a precise prompt saves dollars in tokens.
2. Use /fast for simple tasks. Toggle it on for anything that doesn't require deep reasoning.
3. Restart sessions at 20 messages. Compounding context is the single biggest cost driver in Claude Code.
These three changes save most developers $150-250/month. Track your progress with the PromptReports Ops Intelligence Dashboard.
npx @promptreports/cli