LLM Token Counter & Cost Estimator

Paste any prompt to get a token estimate plus the per-request and monthly cost across 14 LLMs, side by side. Runs in your browser: no API keys, no logging.

Free Tool · 100% Local · 14 models
Privacy: tokenization and cost math happen client-side. Your prompt never leaves the browser.
Tokens (est.): 0
Characters: 0
Output tokens: 0 (20% of input)
Total per request: 0 (input + output)
Output ratio: 20% (slider: 0% to 200%)

How much the model writes back, as a percentage of your prompt. 20% is typical for classification or extraction; 100%+ for chat or long generation.

Total requests in your projection window (e.g. 30 000 = 1 000 / day for 30 days).
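
The arithmetic behind these two controls can be sketched as follows. This is a minimal illustration, not the tool's actual source: the function name `project_cost` and the per-million-token prices in the usage lines are hypothetical placeholders.

```python
def project_cost(input_tokens, output_ratio, requests,
                 input_price_per_m, output_price_per_m):
    """Return (cost_per_request, cost_for_window).

    output_ratio is a fraction of the input (0.20 means 20%).
    Prices are per one million tokens, in any single currency.
    """
    output_tokens = input_tokens * output_ratio
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request, per_request * requests


# Hypothetical prices: 3.00 per 1M input tokens, 15.00 per 1M output tokens.
per_request, window = project_cost(
    input_tokens=1_000, output_ratio=0.20, requests=30_000,
    input_price_per_m=3.00, output_price_per_m=15.00)
print(f"per request: {per_request:.4f}, window: {window:.2f}")
```

With these placeholder numbers, a 1 000-token prompt at a 20% output ratio comes to about 0.006 per request and about 180 over 30 000 requests.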

Cost across 14 models

Paste some text to see cost estimates.

How this counter works

Real tokenization differs per vendor: OpenAI uses cl100k_base / o200k_base, Anthropic ships its own tokenizer, and Llama has used both SentencePiece (Llama 2) and a tiktoken-style BPE vocabulary (Llama 3 onward). Bundling all of them as WASM would balloon the page. Instead, this tool uses a BPE-style heuristic: it takes the larger of chars / 4 and words × 1.3. The result lands within roughly 10% of real tokenizer counts for natural English, code and mixed content, which is enough for cost estimates and budget conversations.
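
A minimal sketch of that heuristic, assuming words are split on whitespace and the result is rounded up (the page does not specify either detail, and `estimate_tokens` is a hypothetical name):

```python
import math


def estimate_tokens(text: str) -> int:
    """Heuristic token estimate: the larger of chars / 4 and words * 1.3."""
    if not text:
        return 0
    by_chars = len(text) / 4
    by_words = len(text.split()) * 1.3  # assumption: whitespace-delimited words
    # Assumption: round up so the estimate never under-counts a fraction.
    return math.ceil(max(by_chars, by_words))


print(estimate_tokens("The quick brown fox jumps over the lazy dog."))
```

For that 44-character, 9-word sentence the word-based branch wins (9 × 1.3 = 11.7 against 44 / 4 = 11), so the estimate is 12.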

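For precise counts before a production rollout, use the vendor's own tooling: tiktoken for OpenAI, @anthropic-ai/tokenizer for Claude (that package targets older Claude models; Anthropic's token-counting API endpoint covers current ones).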

What to watch in your bill

  • Output dominates. Output tokens are typically 3–5× more expensive than input. Even a small ratio shift moves the bill significantly.
  • Frontier vs. cheap-bulk is ~100×. For the same prompt, Opus 4.7 costs ~150× more than Llama 4 Scout. Route accordingly.
  • Caching changes everything. Prompt caching (Anthropic, OpenAI, Gemini) can cut repeated-input cost by up to ~90%; this calculator does NOT account for it.
  • Self-hosting is its own equation. Llama / DeepSeek prices here are hosted-API; running on your own GPUs trades $/token for $/hour.
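
To make the first bullet concrete: if output tokens cost k times as much as input tokens and the output ratio is r, the per-request cost is proportional to 1 + r·k. The sketch below applies that relation; the function name and the k values are illustrative, not taken from any vendor's price list.

```python
def relative_cost(output_ratio: float, price_multiple: float) -> float:
    """Cost of one request, in units of 'cost of the input alone'.

    output_ratio: output tokens / input tokens (1.0 means 100%).
    price_multiple: output price / input price (illustrative values below).
    """
    return 1 + output_ratio * price_multiple


for k in (3, 5):
    low = relative_cost(0.20, k)
    high = relative_cost(1.00, k)
    print(f"k={k}: 20% -> {low:.1f}, 100% -> {high:.1f}, increase {high / low:.1f}x")
```

Under these assumptions, moving from a 20% to a 100% output ratio raises the per-request cost by about 2.5× when k = 3 and about 3× when k = 5.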

Frequently Asked Questions

How accurate is the token count?
Within ~10% of the real tokenizer for natural text, code and mixed content. For exact counts, use the vendor SDK (tiktoken, @anthropic-ai/tokenizer, etc.).
Does this tool send my text anywhere?
No. Everything runs in your browser. Open DevTools → Network and paste anything — no requests fire.
Why is the output ratio important?
Output tokens typically cost 3–5× more than input. At that price gap, a 100% output ratio instead of 20% raises your monthly bill by roughly 2.5–3× even when the input is identical.
Does this account for prompt caching?
No. Prompt caching can reduce repeated-input cost by up to ~90%. If your workload has a large stable system prompt and changing user messages, your real bill will be significantly lower than what this tool shows.
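
A rough way to gauge the caching effect on the input side, assuming a fixed share of each prompt is read from cache at a discounted rate. The function name, the 80% cached share and the 90% discount are assumptions for illustration; some vendors also bill cache writes at a premium, which this sketch ignores.

```python
def cached_input_cost(input_cost: float, cached_fraction: float,
                      discount: float = 0.90) -> float:
    """Input cost after caching.

    cached_fraction: share of input tokens read from cache (0..1).
    discount: price reduction on cached tokens (0.90 means 90% off).
    Cache-write surcharges are not modeled.
    """
    uncached = input_cost * (1 - cached_fraction)
    cached = input_cost * cached_fraction * (1 - discount)
    return uncached + cached


# An uncached input bill of 100.00 with 80% of input tokens served from cache.
print(round(cached_input_cost(100.0, 0.80), 2))
```

With those assumed values the input portion of the bill falls from 100 to about 28; output cost is unaffected.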