What is the /compact command in Claude Code?

/compact summarizes your conversation history to reduce token usage. When your context gets long, Claude auto-compacts, but you can trigger it manually anytime. Add instructions like '/compact Focus on code samples and API usage' to tell Claude what to preserve during summarization. Use it whenever you feel the conversation getting sluggish.

Is Claude Pro or API cheaper for Claude Code?

It depends on usage volume. If you use fewer than about 50 million tokens per month with Claude Code (light use), API pay-per-token may be cheaper than the $20/month Pro plan. Above that, a subscription is usually much cheaper. One developer tracked 10 billion tokens over 8 months: the API equivalent would have been $15,000, versus $800 on a Max subscription. For most daily users, Pro or Max is the better deal.

How do I check my Claude Code token usage?

Use the /cost command inside Claude Code to see current session token usage (most relevant for API users). For subscription users, check claude.ai/settings/usage in your browser. The /status command shows which model and auth method are active. For detailed tracking, the open-source ccusage tool provides daily breakdowns and cost estimates.

Claude Code Token Saving Tips: Do More with Less

Q: How much does Claude Code cost per month?

The cheapest subscription that includes Claude Code is Claude Pro at $20/month. Claude Max at $100/month provides 5x more usage than Pro, and Max 20x at $200/month provides 20x more. API pay-per-token is also available: Sonnet costs about $3/$15 per million input/output tokens, and Opus about $15/$75. The average developer spends about $6/day on API usage, with 90% staying under $12/day.

Q: What uses the most tokens in Claude Code?

Context re-reading is the biggest token consumer — Claude re-processes your entire conversation history with every new message. Long sessions, large files loaded into context, verbose command output, and bloated CLAUDE.md files are the top four token drains. Prompt caching helps (saving 40-50% on repeated content), but keeping sessions short and context focused is the most effective strategy.

[Image: Claude Code token usage going down after applying optimization tips]

Same productivity, way fewer tokens. Here's how.

✍️ Thirsty Hippo · Tracking token usage obsessively since late 2025 | 📅 March 2026 | ⏱️ 9 min read

📢 Transparency: Self-purchased Claude Pro subscription. No Anthropic sponsorship. Prices verified March 2026. This post contains affiliate links.

📘 Part 3 of the "Claude Code Unlocked" series

🔑 Key Takeaways

Context re-reading consumes the most tokens — Claude re-processes your entire conversation history with every message. Short, focused sessions are cheaper.
/clear between tasks is the single biggest token saver. Stale context from previous work is dead weight on every subsequent message.
Be specific in prompts. "Fix the bug in src/auth/validate.ts line 42" can use up to 10x fewer tokens than "fix the bug in the login page," because Claude doesn't have to search the codebase for the right file.
CLAUDE.md under 100 lines. Every line is re-read every session. Cut anything that doesn't prevent Claude from making mistakes.
Pro at $20/month is the best value for most daily users. API pay-per-token only wins for very light usage (under 50M tokens/month).

How Tokens Work in Claude Code (Simple Explanation)

Here at Thirsty Hippo, we don't do spec-sheet reviews — we live with products for weeks before writing a single word. And after months of tracking my Claude Code token usage, I've figured out exactly where the waste happens — and how to stop it.

First, what is a token? Think of tokens as the "word pieces" that AI reads and writes. One token is roughly 4 English characters — so the word "authentication" is about 4 tokens. Every time you send a message to Claude Code, it reads your entire conversation history (input tokens) and generates a response (output tokens). Both cost money or consume your subscription quota.
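To make the "4 characters per token" rule concrete, here is a minimal sketch of a token estimator. It is a rule-of-thumb calculation only, not the tokenizer Claude actually uses, and the function name estimate_tokens is made up for this example.

```python
import math

# Rough rule of thumb from the text above: ~4 English characters per token.
# This is an approximation, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate how many tokens a piece of English text uses."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


# "authentication" has 14 characters, so the estimate rounds up to 4 tokens.
print(estimate_tokens("authentication"))
```

Real token counts vary with language, code, and punctuation, so treat the result as a ballpark figure.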

Here's the deal: the average Claude Code developer uses about $6 worth of tokens per day, according to Anthropic's official cost documentation, and 90% of users stay under $12/day. The gap between $6 and $12 isn't about how much work you do; it's about how efficiently you do it.

The biggest surprise? Over 90% of all tokens are cache reads, not your actual prompts or Claude's responses. Claude Code constantly caches context about your codebase — file structures, dependencies, previous conversations. Prompt caching saves 40-50% on repeated content automatically. But the rest? That's where your habits make or break your budget.

Why You Can Trust This Review

How tested: 4+ months of daily Claude Code use with token tracking. Compared usage before and after applying each tip over 2-week periods.
Sponsored? No — self-purchased Claude Pro subscription.
Update schedule: Reviewed monthly as pricing and features change.
Limitations: Tested primarily on web development and content tasks. Token savings may vary for heavy enterprise or multi-agent workflows.

10 Token Saving Tips That Actually Work

These tips are ranked by impact — highest savings first. After spending months optimizing my workflow, these 10 practices cut my token usage by roughly 60% without reducing productivity.

[Image: Comparing verbose Claude Code prompts to optimized concise prompts saving tokens]

Left: 3,500 tokens. Right: same task, 1,200 tokens. Prompt specificity is everything.

1. Use /clear between unrelated tasks. This is the single biggest lever. When you switch from debugging auth to refactoring a component, the old context is dead weight. Every subsequent message pays the "tax" of re-reading that irrelevant history. Use /clear when switching projects, and /rename first if you want to find the session later.
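Here is a rough sketch of why that "tax" adds up. It assumes a simplified model in which every turn adds the same number of tokens to the history and the whole history is re-read on each new message. The numbers (40 turns, 500 tokens per turn) are invented for illustration, and the model ignores prompt caching and system prompts.

```python
def total_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every turn re-reads all earlier turns plus itself."""
    history = 0
    total = 0
    for _ in range(turns):
        history += tokens_per_turn  # the new message joins the history
        total += history            # the whole history is read as input
    return total


# One 40-turn session versus the same 40 turns split by a /clear at the midpoint.
one_long_session = total_input_tokens(turns=40, tokens_per_turn=500)
two_short_sessions = 2 * total_input_tokens(turns=20, tokens_per_turn=500)

print(one_long_session)    # 410000
print(two_short_sessions)  # 210000
```

In this toy model, splitting the work in two roughly halves the input tokens, because the cost of re-reading grows with the square of the session length.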

2. Be specific in your prompts. "Fix the JWT validation in src/auth/validate.ts line 42 where expired tokens aren't rejected" costs a fraction of "fix the bug in the login page." Specificity eliminates search operations — Claude doesn't need to scan your entire codebase to find the right file. According to multiple developer reports, this alone can reduce token usage by 10x per task.

3. Keep CLAUDE.md under 100 lines. This file is read every single session. A 400-line CLAUDE.md eats thousands of tokens before you even ask your first question. For each line, ask: "Would removing this cause Claude to make a mistake?" If not, cut it. Honestly speaking, trimming my CLAUDE.md from 400 lines to 60 was the most impactful change I made — I covered this in the Beginner's Guide.
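If you want a quick check on your own file, a small script along these lines can help. It is a sketch under assumptions: the function name audit_instructions is hypothetical, the token estimate uses the rough 4-characters-per-token rule, and the 100-line budget is simply the guideline from this tip.

```python
import math
from pathlib import Path


def audit_instructions(text: str, max_lines: int = 100) -> dict:
    """Report line count, a rough token estimate, and whether the file is over budget."""
    line_count = len(text.splitlines())
    return {
        "lines": line_count,
        "estimated_tokens": math.ceil(len(text) / 4),  # ~4 chars per token, approximate
        "over_budget": line_count > max_lines,
    }


if __name__ == "__main__":
    path = Path("CLAUDE.md")  # assumes you run this from your project root
    if path.exists():
        print(audit_instructions(path.read_text(encoding="utf-8")))
    else:
        print("No CLAUDE.md found in the current directory.")
```

Run it before and after trimming to see roughly how many tokens each session no longer has to load.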

4. Use /compact when sessions get long. This command summarizes your conversation history, freeing up context space. You can customize what gets preserved: /compact Focus on code samples and API patterns. Claude also auto-compacts when approaching context limits, but triggering it manually gives you more control.

5. Switch models for simple tasks. Not every question needs Opus. Quick syntax checks, simple file reads, basic refactors: Sonnet handles these well at about one fifth of Opus's API price ($3/$15 versus $15/$75 per million input/output tokens). Use /model sonnet for routine work and /model opus only for complex architecture decisions. One thing that surprised me was how rarely I actually needed Opus once I started paying attention.

💡 Quick Answer: The three highest-impact token savers are: (1) /clear between tasks, (2) specific prompts with file paths, and (3) keeping CLAUDE.md under 100 lines. These three alone can cut usage by 40-50%.

6. Create a .claudeignore file. Like .gitignore, this tells Claude what not to scan. Add node_modules, build folders, lock files, and any large directories Claude doesn't need. The savings compound — every time Claude would have read a 5,000-line lock file, you save those tokens.
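To get a feel for how much an ignore list could save, here is a rough estimator. Everything in it is illustrative: the file sizes are made up, the function name tokens_skipped is hypothetical, and the fnmatch-style pattern matching is a simplification that may not match how Claude Code actually interprets ignore rules.

```python
import math
from fnmatch import fnmatch


def tokens_skipped(file_sizes: dict[str, int], patterns: list[str]) -> int:
    """Estimate tokens avoided when files matching any pattern are never read."""
    skipped_chars = sum(
        size
        for path, size in file_sizes.items()
        if any(fnmatch(path, pattern) for pattern in patterns)
    )
    return math.ceil(skipped_chars / 4)  # ~4 chars per token, approximate


# Hypothetical project: path -> size in characters.
files = {
    "src/auth/validate.ts": 3_200,
    "package-lock.json": 400_000,
    "node_modules/lodash/lodash.js": 540_000,
    "dist/bundle.js": 120_000,
}
patterns = ["node_modules/*", "dist/*", "package-lock.json"]

print(tokens_skipped(files, patterns))  # 265000
```

In this made-up project, the three ignored paths account for almost all of the readable text, which is typical of dependency folders and lock files.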

7. Use /btw for side questions. The /btw command creates a branch conversation that doesn't add to your main thread. Quick questions like "what does this library do?" or "is this the right syntax?" stay out of your primary context. According to MindStudio's analysis, consistent /btw usage can reduce total session tokens by up to 50%.

8. Pipe command output through filters. When Claude runs shell commands, the full output enters your context. git log with 200 commits? All tokens. Use git log --oneline -10 instead, or add hooks that truncate verbose output automatically.
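The truncation idea can be sketched in a few lines. This is only the filtering logic such a hook or wrapper script might apply, not Claude Code's hook API; the function name truncate_output and the 10-line default are assumptions for this example.

```python
def truncate_output(output: str, max_lines: int = 10) -> str:
    """Keep the first max_lines lines and replace the rest with a one-line note."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output
    omitted = len(lines) - max_lines
    kept = lines[:max_lines]
    kept.append(f"... [{omitted} more lines omitted]")
    return "\n".join(kept)


# Simulate a 200-line log instead of calling git, so the example runs anywhere.
fake_log = "\n".join(f"commit {i}" for i in range(200))
short = truncate_output(fake_log)
print(short)
```

The 200-line log shrinks to 11 lines, and the final line still tells Claude that more output exists if it needs to ask for it.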

9. One task per session. One bug fix. One feature. One refactor. Don't try to fix three bugs and add two features in one conversation. Fresh sessions mean clean context and fewer wasted tokens on irrelevant history.

10. Use /effort to control thinking depth. Extended thinking is powerful but expensive — the default thinking budget can be tens of thousands of tokens per request. For simple tasks, /effort low skips the deep reasoning. Save deep thinking for architecture decisions.

🔧 Getting auth errors while trying to save tokens? Our Claude Code auth error fix checklist covers every solution from /login to the nuclear reset.

🔴 My Failure Moment

Fair warning: I didn't learn about token optimization gently. In my first month with Claude Code, I had a session where I asked Claude to analyze my entire blog codebase. I didn't scope the request. I didn't use .claudeignore. I let Claude read every file, including 15,000 lines of auto-generated CSS and a massive package-lock.json. The session burned through my entire daily usage allocation in about 20 minutes. I hit the rate limit, couldn't do any more work that day, and realized I'd essentially paid $12 to have Claude read a bunch of files it didn't need. The very next day, I created a .claudeignore file and started using /clear. My daily costs dropped from $10+ to under $4. That's the difference between working smart and just working.

Pro vs. Max vs. API: Which Plan Saves You Money?

[Image: Comparing Claude Code Pro subscription cost versus API pay-per-token pricing]

$20/month or pay-per-token? The answer depends on how many tokens you actually use.

Plan	Cost	Best For
Claude Pro	$20/month	Daily use, learning, hobby projects
Claude Max 5x	$100/month	Heavy daily use, Opus access, complex projects
Claude Max 20x	$200/month	Power users, multi-agent workflows
API (Sonnet)	~$3 input / $15 output per million tokens	Light/automated use, predictable low volume
API (Opus)	~$15 input / $75 output per million tokens	High-quality needs, budget to match
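To turn the API rows into dollars, here is a minimal calculator using the approximate per-million prices from the table. It is a sketch only: it ignores prompt caching discounts and any other pricing details, and the token counts in the example are invented.

```python
# Approximate prices from the table above: (input, output) in USD per million tokens.
PRICES_PER_MILLION = {
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD, ignoring caching discounts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: 1M input tokens and 200K output tokens on each model.
print(api_cost("sonnet", 1_000_000, 200_000))  # 6.0
print(api_cost("opus", 1_000_000, 200_000))    # 30.0
```

The same workload costs five times as much on Opus, which is why model choice matters so much for API users.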

The best part? The math is straightforward. One developer who tracked 10 billion tokens over 8 months calculated that API pricing would have cost $15,000, versus $800 on the Max subscription (8 months at $100). That's a saving of roughly 95%. For most people using Claude Code daily, the subscription is dramatically cheaper than pay-per-token.
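The arithmetic behind that comparison is easy to check yourself. The sketch below uses only the figures quoted in this section; the variable names are mine.

```python
# Figures quoted above for one developer's 8 months of usage.
api_cost_usd = 15_000
subscription_cost_usd = 800
months = 8
total_tokens = 10_000_000_000

saving = (api_cost_usd - subscription_cost_usd) / api_cost_usd
monthly_subscription = subscription_cost_usd / months
api_cost_per_million = api_cost_usd / (total_tokens / 1_000_000)

print(f"Saving: {saving:.1%}")                            # Saving: 94.7%
print(f"Subscription per month: ${monthly_subscription:.0f}")  # $100
print(f"Effective API price: ${api_cost_per_million:.2f} per million tokens")  # $1.50
```

The $100 monthly figure lines up with the Max 5x tier, and the effective API price of $1.50 per million tokens is well below the list input price, which fits with most tokens being cheaper cache reads.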

I could be wrong here, but if you're using Claude Code more than 5-10 times per day, the $20/month Pro plan almost certainly saves you money compared to API pricing. The breakeven point is roughly 50 million tokens per month — and most daily users blow past that without realizing it.

The "Do I Need to Pay More?" Decision Guide

From what I've seen so far, here's how to decide which tier makes sense for your workflow:

🟢 Pro ($20/month) is enough if you:

Use Claude Code under 10 times per day
Work on small to medium projects
Primarily use Sonnet (not Opus)
Apply the token-saving tips above
Don't run multi-agent or automated workflows

🟡 Consider Max ($100/month) if you:

Hit rate limits more than 2-3 times per week on Pro
Need Opus for complex architecture or refactoring
Work on multiple large projects daily
Use Claude Code as your primary development tool

🔴 API pay-per-token makes sense if you:

Use Claude Code fewer than 5 times per day
Need it only for specific automated tasks
Want precise cost tracking per project
Don't mind variable monthly costs
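If you prefer rules to bullet points, the guide above can be roughly encoded as a small function. This is a simplified sketch of my rule of thumb, not an official recommendation: the function name suggest_plan is hypothetical, "more than 2-3 rate-limit hits per week" is interpreted as more than 2, and the softer criteria (project size, wanting per-project cost tracking) are left out.

```python
def suggest_plan(
    uses_per_day: int,
    rate_limit_hits_per_week: int = 0,
    needs_opus: bool = False,
    automated_only: bool = False,
) -> str:
    """Suggest a tier from the decision guide; thresholds are rough assumptions."""
    # Very light or purely automated use: pay-per-token is likely cheapest.
    if automated_only or uses_per_day < 5:
        return "API pay-per-token"
    # Frequent rate limits or a real need for Opus: consider Max.
    if needs_opus or rate_limit_hits_per_week > 2:
        return "Max ($100/month)"
    # Everyone else: start on Pro and apply the token-saving tips.
    return "Pro ($20/month)"


print(suggest_plan(uses_per_day=3))                              # API pay-per-token
print(suggest_plan(uses_per_day=8))                              # Pro ($20/month)
print(suggest_plan(uses_per_day=8, rate_limit_hits_per_week=4))  # Max ($100/month)
```

Treat the output as a starting point and adjust once you have a few weeks of your own usage data.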

For budget-conscious users who aren't sure yet, our guide to free AI tools covers alternatives for tasks that don't specifically require Claude Code's terminal capabilities.

Frequently Asked Questions

How much does Claude Code cost per month?

Claude Pro at $20/month is the minimum. Max is $100 or $200/month for heavier use. API pricing varies by model — Sonnet is about $3/$15 per million input/output tokens. The average developer spends roughly $6/day on API usage. For most daily users, the Pro subscription is the cheapest option.

What uses the most tokens in Claude Code?

Context re-reading — Claude processes your full conversation history with every new message. Long sessions, large files in context, verbose command output, and bloated CLAUDE.md files are the top drains. Keeping sessions short and context focused is the most effective saving strategy.

What is the /compact command?

/compact summarizes your conversation history to reduce token load. Use it whenever sessions get long or sluggish. Add instructions like /compact Focus on code samples to control what gets preserved.

Is Pro or API cheaper for Claude Code?

For daily users (50M+ tokens/month), Pro at $20/month is dramatically cheaper. One developer tracked 10B tokens over 8 months — API would have cost $15,000 versus $800 on Max. API only wins for very light usage under 50M tokens/month.

How do I check my token usage?

Use /cost for current session data (mainly for API users). Check claude.ai/settings/usage for subscription usage. /status shows your active model and auth method. For detailed tracking, the open-source ccusage tool provides daily breakdowns.

📅 Last updated: March 25, 2026

March 25, 2026: Original publish. Pricing verified against official Claude Code cost management docs and claude.com/pricing. Token saving techniques tested over 2-week comparison periods. /btw command data from MindStudio analysis (March 2026).

Bottom Line

Token optimization isn't about being stingy — it's about being smart. The same way you wouldn't leave every light in your house on 24/7, you shouldn't let Claude Code read your entire codebase when it only needs one file. The 10 tips above — especially /clear, specific prompts, and a lean CLAUDE.md — can cut your usage by 60%+ without changing what you accomplish.

And for most people, $20/month on Claude Pro covers everything they need, especially with these optimizations applied.

What's your biggest token-saving trick? Share it in the comments — I'll add the best ones to this guide with credit. And if you know someone hitting rate limits every day, this article might just save them real money.

📌 Next in the "Claude Code Unlocked" series: A guide for absolute beginners — no coding experience required. We'll start from "what is a terminal?" and build your first project step by step.

