How Claude API Pricing Works
Anthropic bills per million tokens (MTok), with separate rates for input (prompt) tokens and output (completion) tokens; the rates depend on which model you use.
Cost = ((Input Tokens × Input Price) + (Output Tokens × Output Price)) ÷ 1,000,000
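The formula translates directly into code. A minimal sketch (the function name is illustrative, not part of Anthropic's SDK):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Apply the formula above; prices are dollars per million tokens (MTok)."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: 10,000 input tokens and 1,000 output tokens at $3 / $15 per MTok
# (the Sonnet rates listed below):
print(estimate_cost(10_000, 1_000, 3.00, 15.00))  # → 0.045, i.e. 4.5 cents
```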
Claude Model Pricing (2025)
- Claude Haiku 4.5 — $0.80 input / $4.00 output per MTok (fastest, cheapest)
- Claude Sonnet 4.6 — $3.00 input / $15.00 output per MTok (best balance)
- Claude Opus 4.7 — $15.00 input / $75.00 output per MTok (most capable)
Tips to Reduce Your Claude API Bill
- Use prompt caching for repeated context — saves up to 90% on cached input tokens
- Use Haiku for classification, routing, or simple tasks; Sonnet for complex generation
- Trim system prompts — every token costs money at scale
- Set max_tokens limits to cap runaway output costs
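The last tip can be made concrete: max_tokens puts a hard ceiling on what any single response can cost in output tokens. A quick sketch of that worst-case bound (helper name is illustrative):

```python
def max_output_cost(max_tokens: int, output_price_per_mtok: float) -> float:
    """Worst-case output spend for one request if the model emits every allowed token."""
    return max_tokens * output_price_per_mtok / 1_000_000

# With max_tokens=1024 at Sonnet's $15/MTok output rate, one response can
# never cost more than about 1.5 cents in output tokens.
print(max_output_cost(1024, 15.00))  # → 0.01536
```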
Frequently Asked Questions
How do I count tokens in my prompt?
A rough rule: 1 token ≈ 4 characters or ¾ of a word. "Hello world" is about 2-3 tokens. Use Anthropic's tokenizer tool for exact counts.
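The 4-characters-per-token rule above is easy to turn into a quick estimator. This is only a ballpark figure for budgeting; use Anthropic's tokenizer for billing-accurate counts:

```python
def rough_token_count(text: str) -> int:
    """Approximate token count via the ~4-characters-per-token rule of thumb."""
    return max(1, round(len(text) / 4))

print(rough_token_count("Hello world"))  # 11 chars / 4 → 3 tokens
```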
Does Anthropic charge for failed requests?
No — you're only charged for tokens processed in successful completions. Rate limit errors and server errors are not billed.
What is prompt caching and how does it save money?
Prompt caching lets you cache repeated context (system prompts, long documents) so you pay a lower rate on subsequent requests that reuse that cached content. It can reduce costs dramatically for apps with long, repeated context.
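To see the effect in numbers, here is a sketch comparing repeated requests with and without caching. It assumes cached input tokens are billed at 10% of the base rate (consistent with the "up to 90%" figure above) and, as a simplification, ignores any cache-write premium; check Anthropic's current pricing for the exact per-model cache rates:

```python
def input_cost_with_cache(requests: int, context_tokens: int,
                          input_price: float, cached_rate: float = 0.10) -> float:
    """Pay the full input rate once to populate the cache, then the
    discounted cached rate on every subsequent reuse of that context.
    cached_rate=0.10 assumes a 90% discount on cached input tokens."""
    first = context_tokens * input_price / 1_000_000
    rest = (requests - 1) * context_tokens * input_price * cached_rate / 1_000_000
    return first + rest

# 100 requests, each reusing the same 50,000-token document, at $3/MTok input:
uncached = 100 * 50_000 * 3.00 / 1_000_000      # $15.00 without caching
cached = input_cost_with_cache(100, 50_000, 3.00)  # ≈ $1.64 with caching
print(uncached, cached)
```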