llm-costs: A CLI Tool for Real-Time LLM API Price Comparison

These articles are AI-generated summaries. Please check the original sources for full details.

I Built an Open-Source CLI to Compare LLM API Costs in Your Terminal (npx, Zero Install)

Developer Followtayeeb released llm-costs, a terminal utility that replaces manual spreadsheet math when comparing AI model pricing. The tool supports 17 models from major providers, including OpenAI, Anthropic, and DeepSeek, and counts tokens with the tiktoken tokenizer for OpenAI models, falling back to character-based estimation for other providers.

Why This Matters

Developers often rely on stale blog posts or manual calculations that ignore differences in tokenization, such as character-based estimation versus tiktoken. API prices also change over time and vary widely between models: GPT-4o ($0.01875 per 1k tokens) costs far more than DeepSeek ($0.00011). The tool addresses both problems by fetching prices automatically through GitHub Actions and LiteLLM’s aggregate data.
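
The price gap quoted above can be made concrete with a little arithmetic. The following is a minimal sketch, not the tool's own code: it assumes both figures are per-1k-token prices in USD, and the dictionary keys and the `prompt_cost` helper are hypothetical names chosen for illustration.

```python
# Per-1k-token prices in USD, as quoted in the summary above.
# Assumption: both figures use the same unit (USD per 1k tokens).
PRICE_PER_1K = {
    "gpt-4o": 0.01875,
    "deepseek": 0.00011,
}


def prompt_cost(model: str, tokens: int) -> float:
    """Return the cost in USD of `tokens` tokens at the model's per-1k price."""
    return tokens / 1000 * PRICE_PER_1K[model]


# Compare both models on the same hypothetical 2,000-token prompt.
for model in PRICE_PER_1K:
    print(f"{model}: ${prompt_cost(model, 2000):.5f}")

ratio = PRICE_PER_1K["gpt-4o"] / PRICE_PER_1K["deepseek"]
print(f"price ratio: about {ratio:.0f}x")
```

For a 2,000-token prompt this works out to $0.0375 for GPT-4o and $0.00022 for DeepSeek, a ratio of about 170x under the same-unit assumption.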

Key Insights

  • A GitHub Actions workflow fetches pricing from LiteLLM’s aggregate JSON every Monday morning and opens a pull request when prices change (Followtayeeb, 2026).
  • Batch mode computes the total cost for a file of prompts via the command ‘llm-costs batch prompts.txt’.
  • Pricing updates use two layers: a cache with a 7-day TTL stored at ~/.llm-costs/pricing.json, and a 5-second non-blocking background fetch.
  • Token counts use tiktoken for OpenAI models and character-based estimation for providers such as Anthropic and Google.
  • The tool includes an MCP server mode for direct integration with Claude Desktop and other compatible environments.
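
The two-layer update described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the function names, the `bundled` fallback table, and the use of the file's modification time as the cache timestamp are all hypothetical. Only the 7-day TTL and the cache path come from the summary.

```python
import json
import time
from pathlib import Path
from typing import Optional, Tuple

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL, as described above


def is_fresh(cache_file: Path, now: Optional[float] = None) -> bool:
    """Return True if the cache file exists and is younger than the TTL.

    Assumption: the file's modification time marks the last refresh.
    """
    if not cache_file.exists():
        return False
    now = time.time() if now is None else now
    return now - cache_file.stat().st_mtime < CACHE_TTL_SECONDS


def load_pricing(
    cache_file: Path, bundled: dict, now: Optional[float] = None
) -> Tuple[dict, bool]:
    """Return (pricing, needs_refresh) without ever waiting on the network.

    Cached prices are used even when stale, so the caller gets an answer
    immediately; `needs_refresh` signals that a background fetch should run.
    `bundled` is a hypothetical built-in price table used when no readable
    cache exists.
    """
    fresh = is_fresh(cache_file, now)
    try:
        pricing = json.loads(cache_file.read_text())
    except (OSError, ValueError):  # missing, unreadable, or corrupt cache
        pricing = bundled
    return pricing, not fresh
```

A caller would pass `Path.home() / ".llm-costs" / "pricing.json"` as `cache_file` and, when `needs_refresh` is true, start the background fetch on a separate thread with a short timeout so the command itself never blocks.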

Working Examples

Compare prompt costs across all major providers in a terminal table.

npx llm-costs "Build a REST API in Python" --compare

Set a cost ceiling for CI/CD pipelines to prevent budget overruns.

llm-costs guard --max 0.10
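
A cost ceiling works in CI because a non-zero exit code fails the pipeline step. The sketch below shows that idea only; how llm-costs itself computes the estimate or reports the result is not described in the summary, so `guard_exit_code` and its parameters are hypothetical.

```python
def guard_exit_code(estimated_cost_usd: float, max_cost_usd: float) -> int:
    """Return a process exit code for a CI cost check.

    0 means the estimate is within the ceiling (step passes);
    1 means it exceeds the ceiling (step fails).
    """
    return 0 if estimated_cost_usd <= max_cost_usd else 1


# Example: an $0.08 estimate passes a $0.10 ceiling, a $0.12 estimate fails.
print(guard_exit_code(0.08, 0.10))
print(guard_exit_code(0.12, 0.10))
```

In a real pipeline the returned code would be handed to `sys.exit`, so that the CI runner marks the step as failed when the ceiling is exceeded.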

Calculate budget projections for high-volume API usage.

llm-costs budget --requests 10000
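
The arithmetic behind such a projection is straightforward: cost per request multiplied by request volume. The sketch below is illustrative only. The 2,000 tokens per request is an assumed figure, `project_spend` is a hypothetical helper, and the per-1k prices are the ones quoted earlier in this summary.

```python
def project_spend(
    tokens_per_request: int, price_per_1k_usd: float, requests: int
) -> float:
    """Project total spend in USD for `requests` calls of a fixed size."""
    cost_per_request = tokens_per_request / 1000 * price_per_1k_usd
    return cost_per_request * requests


# Assumed workload: 10,000 requests of 2,000 tokens each.
for name, price in [("GPT-4o", 0.01875), ("DeepSeek", 0.00011)]:
    print(f"{name}: ${project_spend(2000, price, 10_000):.2f}")
```

Under these assumptions, 10,000 requests come to about $375 at the GPT-4o price and about $2.20 at the DeepSeek price.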

Practical Applications

  • CI/CD Integration: Use the ‘guard’ feature to block deployments when prompt changes push the estimated cost above a $0.10 limit. Pitfall: a stale local price cache can lead to incorrect budget approvals.
  • Prompt Engineering: Use ‘watch mode’ to get live-refreshing cost analysis while drafting prompts. Pitfall: overlooking the different tokenization methods between models can result in a 2-3x variance in actual billing.
