AI Agent Security Audit: 76% of Tool Calls Lack Protective Guards

We Scanned 16 AI Agent Repos. 76% of Tool Calls Had Zero Guards

Researcher Josselin Guarnelli analyzed 16 prominent AI agent repositories, including CrewAI and Skyvern. The study found that 76% of tool calls with real-world side effects—such as database writes and HTTP requests—had zero protective guards.

Why This Matters

In traditional software, UI-level validation and rate limits constrain what a human user can do. AI agents instead delegate action-triggering to LLMs, which have no inherent understanding of business rules. Without code-level guards such as input validation or idempotency keys, a single prompt injection or hallucination can have severe consequences, for example exhausting API quotas through recursive loops or deleting database records without validation. Even production-grade applications like Skyvern (76% unguarded) and Dify (75% unguarded) fail to place the necessary safeguards between the LLM’s decision and the final execution.
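As a rough illustration of what a code-level guard can look like, the sketch below wraps a side-effecting tool with input validation and an idempotency key. Everything in it is hypothetical (the RefundTool class, the "ord_" prefix, the amount cap, and the in-memory key store are assumptions for illustration, not taken from any of the scanned repositories), and a real system would persist the keys and call an actual payment API.

```python
from dataclasses import dataclass, field


class GuardError(Exception):
    """Raised when a tool call is rejected before any side effect runs."""


@dataclass
class RefundTool:
    """Hypothetical refund tool; names and limits are illustrative only."""
    max_amount_cents: int = 50_000
    _seen_keys: dict = field(default_factory=dict)  # idempotency key -> earlier result
    issued: list = field(default_factory=list)      # stands in for the payment API

    def refund(self, order_id: str, amount_cents: int, idempotency_key: str) -> str:
        # Input validation: reject values the LLM may have hallucinated.
        if not order_id.startswith("ord_"):
            raise GuardError(f"unrecognized order id: {order_id!r}")
        if not 0 < amount_cents <= self.max_amount_cents:
            raise GuardError(f"amount out of bounds: {amount_cents}")
        # Idempotency: a repeated key returns the earlier result instead of
        # performing the side effect a second time.
        if idempotency_key in self._seen_keys:
            return self._seen_keys[idempotency_key]
        self.issued.append((order_id, amount_cents))
        result = f"refunded {amount_cents} cents for {order_id}"
        self._seen_keys[idempotency_key] = result
        return result
```

With this in place, an agent stuck in a retry loop that keeps sending the same key triggers the refund once, and an out-of-range amount is rejected before anything is written.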

Key Insights

A scan of 16 repositories showed that 76% of functions with side effects, including database writes and payment processing, lacked any form of guard such as input validation, rate limiting, or authentication checks.
Frameworks like CrewAI (78% unguarded) and PraisonAI (89% unguarded) ship without guards by design and leave them to the application layer, where developers are failing to add them.
The Khoj AI assistant contains an unguarded ‘ai_update_memories’ function that allows an LLM to delete and replace user data without confirmation or rate limits.
Diplomat-agent, an AST-based static analyzer, identifies risk by walking the Python syntax tree to find side-effect patterns and matching them against existing guards.
The OWASP Top 10 for Agentic Applications (2025) and the EU AI Act (2026) now necessitate documented inventories of agent capabilities and human oversight measures.

Working Examples

Installation and execution of the diplomat-agent static analyzer.

pip install diplomat-agent
diplomat-agent .

Using a comment to manually acknowledge an intentionally unguarded tool call.

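import requests
ALERT_URL = "https://alerts.example.com/notify"  # placeholder endpoint, not from the source
def send_alert(message): # checked:ok — protected by API gateway
    requests.post(ALERT_URL, json={"msg": message})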

Practical Applications

Use Case: Integrating diplomat-agent into CI pipelines with the ‘--fail-on-unchecked’ flag to block pull requests that introduce dangerous, unguarded functions.
Pitfall: Relying on LLM logic for safety; an attacker can use prompt injection to bypass natural language instructions and trigger functions like ‘refund()’ repeatedly.
Use Case: Generating a ‘toolcalls.yaml’ registry to maintain a committable, auditable inventory of every function that can modify the real world.
Pitfall: Assuming generic framework code is secure; application developers must add ‘Depends()’ or ‘Security()’ checks themselves in FastAPI-based agent tools.
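To make the first pitfall concrete, here is a minimal sketch of a guard that lives in code rather than in the prompt: a sliding-window call budget applied to a tool function. The rate_limited decorator, the RateLimitExceeded exception, and the refund body are all hypothetical names introduced for illustration. The state is in-memory and not thread-safe, so a production version would likely need shared storage and locking.

```python
import time
from collections import deque
from functools import wraps


class RateLimitExceeded(Exception):
    """Raised when a tool is called more often than its budget allows."""


def rate_limited(max_calls: int, per_seconds: float, clock=time.monotonic):
    """Allow at most max_calls per sliding window of per_seconds."""
    def decorator(func):
        recent = deque()  # timestamps of calls inside the current window

        @wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            # Drop timestamps that have aged out of the window.
            while recent and now - recent[0] >= per_seconds:
                recent.popleft()
            if len(recent) >= max_calls:
                raise RateLimitExceeded(
                    f"{func.__name__}: budget of {max_calls} calls exhausted"
                )
            recent.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@rate_limited(max_calls=2, per_seconds=60)
def refund(order_id: str) -> str:
    # Hypothetical tool body; a real one would call a payment provider.
    return f"refund issued for {order_id}"
```

However the LLM is persuaded to call refund, the third call inside the window raises before the tool body runs, because the limit is enforced outside anything the model can rewrite.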

References:

https://dev.to/josselin_guarnelli/we-scanned-16-ai-agent-repos-76-of-tool-calls-had-zero-guards-5c2h
