llm-costs: A CLI Tool for Real-Time LLM API Price Comparison

I Built an Open-Source CLI to Compare LLM API Costs in Your Terminal (npx, Zero Install)

Developer Followtayeeb released llm-costs, a terminal utility that replaces manual spreadsheet math when comparing AI model pricing. The tool supports 17 models from major providers, including OpenAI, Anthropic, and DeepSeek, and counts tokens for OpenAI models with the tiktoken tokenizer instead of relying on rough estimates alone.

Why This Matters

Developers often rely on stale blog posts or manual calculations that ignore differences in tokenization, such as character-based estimation versus tiktoken. API prices also change over time, and the gaps between models are large: GPT-4o is listed at $0.01875 per 1k tokens against $0.00011 for DeepSeek. llm-costs addresses both problems by automating price fetching through GitHub Actions and LiteLLM’s aggregate data.
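To make the arithmetic concrete, here is a minimal Python sketch of a character-based cost estimate. It is not code from llm-costs: the 4-characters-per-token ratio is a common rule of thumb, and the model names and per-million-token prices are hypothetical placeholders.

```python
import math

# Assumption: roughly 4 characters per token for English text (rule of thumb).
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Approximate a token count from character length, without a real tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def input_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of sending `tokens` input tokens at a given per-million-token price."""
    return tokens * usd_per_million_tokens / 1_000_000


prompt = "Build a REST API in Python"
tokens = estimate_tokens(prompt)  # 26 characters -> 7 estimated tokens

# Hypothetical prices in USD per 1M input tokens, for illustration only.
hypothetical_prices = {"model-a": 2.50, "model-b": 0.25}
costs = {
    name: input_cost_usd(tokens, price)
    for name, price in hypothetical_prices.items()
}

for name, cost in costs.items():
    print(f"{name}: ~${cost:.8f} for {tokens} estimated tokens")
```

A real tokenizer can return a noticeably different count for the same prompt, which is why an estimate like this should be treated as a rough bound and not a billing figure.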

Key Insights

A GitHub Actions workflow fetches pricing from LiteLLM’s aggregate JSON every Monday morning and opens a pull request when prices change (Followtayeeb, 2026).
Batch mode calculates the total cost of a file of prompts via the command ‘llm-costs batch prompts.txt’.
The tool uses a two-layer update approach: a cache with a 7-day TTL stored at ~/.llm-costs/pricing.json, plus a 5-second non-blocking background fetch.
Token counts come from tiktoken for OpenAI models and from character-based estimation for providers like Anthropic or Google.
An MCP server mode allows direct integration with Claude Desktop and other compatible environments.
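The TTL cache described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: it judges freshness by the file's modification time, whereas llm-costs may record a timestamp inside the JSON, and the function names are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Optional

CACHE_PATH = Path.home() / ".llm-costs" / "pricing.json"  # path named in the article
TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL


def is_fresh(path: Path, ttl_seconds: int = TTL_SECONDS,
             now: Optional[float] = None) -> bool:
    """True if the file exists and was modified within the TTL window.

    Assumption: freshness is derived from the file's mtime.
    """
    if not path.exists():
        return False
    now = time.time() if now is None else now
    return (now - path.stat().st_mtime) < ttl_seconds


def load_cached_pricing(path: Path = CACHE_PATH) -> Optional[dict]:
    """Return cached pricing if it is still fresh, else None so the caller refetches."""
    if not is_fresh(path):
        return None
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

A caller would fall back to a network fetch when `load_cached_pricing()` returns `None`, then rewrite the cache file so the TTL window restarts.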

Working Examples

Compare prompt costs across all major providers in a terminal table.

npx llm-costs "Build a REST API in Python" --compare

Set a cost ceiling for CI/CD pipelines to prevent budget overruns.

llm-costs guard --max 0.10
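
The logic behind a cost ceiling like this can be sketched in a few lines of Python. The source does not document how ‘guard’ reports failure, so the exit codes and messages here are assumptions, and `guard` is a hypothetical function, not the tool's API.

```python
import sys


def guard(estimated_cost_usd: float, max_cost_usd: float) -> int:
    """Return a process exit code: 0 if within the ceiling, 1 if over it.

    Assumption: a non-zero exit code is what makes the CI step fail.
    """
    if estimated_cost_usd > max_cost_usd:
        print(
            f"FAIL: estimated ${estimated_cost_usd:.4f} "
            f"exceeds ceiling ${max_cost_usd:.2f}",
            file=sys.stderr,
        )
        return 1
    print(f"OK: estimated ${estimated_cost_usd:.4f} "
          f"is within ceiling ${max_cost_usd:.2f}")
    return 0
```

In a pipeline script, `sys.exit(guard(estimate, 0.10))` would stop the job whenever the estimate passes the ceiling.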

Calculate budget projections for high-volume API usage.

llm-costs budget --requests 10000
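
The projection itself is simple multiplication, sketched here in Python. The per-request cost is a hypothetical figure chosen for illustration; in practice it would come from a token count and the current price table.

```python
def project_budget(cost_per_request_usd: float, requests: int) -> float:
    """Projected total spend for a given number of identical requests."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    return cost_per_request_usd * requests


# Hypothetical average cost of one request, in USD.
per_request = 0.002
total = project_budget(per_request, 10_000)
print(f"Projected spend for 10,000 requests: ${total:.2f}")
```

The projection is only as good as the per-request estimate, so a wrong token count scales linearly into the total.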

Practical Applications

CI/CD Integration: Use the ‘guard’ feature to block deployments when prompt changes push the cost past a $0.10 budget limit. Pitfall: a stale local price cache can lead to incorrect budget approvals.
Prompt Engineering: Use ‘watch mode’ to get live-refreshing cost analysis while drafting prompts. Pitfall: overlooking the different tokenization methods between models can result in a 2-3x variance in actual billing.
