MCP vs. CLI: Measuring Token Overhead in Agent Search

These articles are AI-generated summaries. Please check the original sources for full details.

Engineer Ary Rabelo compared the token costs of SerpApi’s official MCP server against a custom MIT-licensed CLI. The MCP server returned 6,047 tokens per call, compared with 351 for the CLI using field projection, roughly a 17-fold difference.

Why This Matters

MCP provides a standardized transport, but it also introduces significant ‘standing costs’: tool schemas are injected into the context on every turn. For stateless operations like search, this overhead grows with each tool added and can waste thousands of tokens before the agent performs any work. The leaner alternative is to pass the LLM only the data it needs.
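The way standing costs compound can be sketched with simple arithmetic. This is a minimal model, not a measurement: it assumes each tool's schema is re-sent in full on every turn. The 771-token figure is the per-turn schema cost reported for the SerpApi MCP server in this article; the other schema sizes and the function name `standing_tokens` are hypothetical.

```python
def standing_tokens(schema_tokens_per_tool, turns):
    """Total tokens spent re-sending tool schemas over a conversation.

    Assumes every schema is injected in full on every turn.
    """
    per_turn = sum(schema_tokens_per_tool)
    return per_turn * turns


# 771 is the schema cost reported in the article; 900 and 1200 are
# made-up sizes for two additional, hypothetical MCP servers.
one_server = standing_tokens([771], turns=10)
three_servers = standing_tokens([771, 900, 1200], turns=10)

print(one_server)     # 7710
print(three_servers)  # 28710
```

Under this model the cost scales with both the number of turns and the number of loaded tools, which is why it is paid even on turns where no search happens.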

Key Insights

  • Standing cost disparity: MCP injects 771 tokens per turn for tool schema, whereas a binary on PATH (CLI) incurs ~0 standing tokens (Rabelo, 2026).
  • Field Projection over Full Payloads: Using --fields title,link reduces response size to 351 tokens versus 6,047 for default MCP output.
  • Context Reduction via Code Execution: Anthropic reported that a Drive-to-Salesforce workflow dropped from 150,000 tokens to 2,000 when the agent called tools as code instead of loading every tool definition into context.
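Field projection itself is a small transformation. The sketch below is an illustration under assumptions, not Rabelo's CLI: the payload shape, the helper names `project_fields` and `minify`, and the sample results are all hypothetical. It keeps only the requested keys (mirroring `--fields title,link`) and serializes without optional whitespace.

```python
import json


def project_fields(results, fields):
    """Keep only the requested keys from each result, in the order requested."""
    return [{key: item[key] for key in fields if key in item} for item in results]


def minify(obj):
    """Serialize to JSON with no optional whitespace."""
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False)


# Hypothetical search results; real payloads carry many more keys per hit.
results = [
    {
        "position": 1,
        "title": "Example result",
        "link": "https://example.com/a",
        "snippet": "A longer description that the agent may not need.",
        "displayed_link": "example.com/a",
    },
    {
        "position": 2,
        "title": "Another result",
        "link": "https://example.com/b",
        "snippet": "More text that inflates the context window.",
        "displayed_link": "example.com/b",
    },
]

full_text = json.dumps(results, indent=2)
slim_text = minify(project_fields(results, ["title", "link"]))

print(slim_text)
print(len(full_text), "characters before,", len(slim_text), "after")
```

Character counts are only a rough proxy for tokens, but the direction of the saving is the same: fewer keys and less whitespace mean less text placed in the context window.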

Practical Applications

  • Stateless Search: Use a CLI with minified JSON output and field projection to minimize context bloat in coding loops.
  • Governed Connections: Use MCP when you need OAuth, multi-user auth, or server-side rate limiting across multiple clients.
  • Tool Proliferation Pitfall: Loading multiple MCP servers simultaneously creates additive standing costs that can consume several thousand tokens of the context window.
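For the stateless search case, an agent harness can shell out to a binary and pass along only compact output. The following is a hedged sketch: the helper `run_json_command` is hypothetical, and because the article does not give the CLI's name, a Python one-liner stands in for it here. With a real CLI the argument list would carry the tool's own arguments, such as the `--fields title,link` flag mentioned above.

```python
import json
import subprocess
import sys


def run_json_command(argv, timeout=30):
    """Run a command, parse its stdout as JSON, and return minified JSON text."""
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    data = json.loads(completed.stdout)
    # Re-serialize without optional whitespace before it reaches the model.
    return json.dumps(data, separators=(",", ":"))


# Stand-in for a search CLI: prints pretty-printed JSON to stdout.
stand_in = [
    sys.executable,
    "-c",
    "import json; print(json.dumps([{'title': 'Example', "
    "'link': 'https://example.com'}], indent=2))",
]

print(run_json_command(stand_in))
# [{"title":"Example","link":"https://example.com"}]
```

Nothing about the command needs to be described to the model in advance beyond what the agent already knows about its shell, which is the sense in which a binary on PATH has close to zero standing cost.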
