OWASP’s Agentic Top 10: A Security Checklist for AI Agent Deployments

Agents Are Not Chatbots

There is a reason OWASP published a separate Top 10 for agentic applications in December 2025, distinct from its existing LLM Top 10. Chatbots generate text. Agents take action. They execute code, call APIs, send emails, query databases, and chain together in multi-step workflows that can run for hours without human oversight.

The threat model is fundamentally different. With a chatbot, the worst case is bad output. With an agent, the worst case is bad output that triggers real-world consequences — unauthorized transactions, data exfiltration, infrastructure changes, deleted production databases.

The OWASP Top 10 for Agentic Applications was developed by over 100 security researchers and practitioners. It uses the ASI prefix (Agentic Security Issue) and covers risks observed in production deployments throughout 2024 and 2025. This is not theoretical. Every item on the list has real CVEs and documented incidents behind it.

What follows is an engineering checklist derived from that framework: not a summary of the PDF, but a set of security requirements for teams that are deploying agents today.

ASI01: Agent Goal Hijacking

The risk: External content — emails, documents, web pages, repository files — contains hidden instructions that redirect an agent away from its intended objective. This is prompt injection evolved for autonomous systems. The attacker does not need access to the agent directly; they just need to place poisoned content somewhere the agent will process it.

What this looks like: In 2025, researchers demonstrated EchoLeak (CVE-2025-32711), where a crafted email caused Microsoft 365 Copilot to exfiltrate data. GitHub Copilot’s YOLO mode was exploited via repository instruction files that enabled auto-approval of all tool calls, followed by arbitrary shell execution. VS Code agents were tricked by malicious AGENTS.MD files into emailing internal data to external addresses.

Your checklist:

  • Enforce strict separation between instruction context and data context
  • Never allow external content to modify agent behavior or system prompts
  • Monitor for behavioral anomalies — sudden changes in tool usage patterns, unexpected output destinations
  • Treat every external data source as potentially adversarial
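
One way to picture the first two items is a context builder that never merges external text into the instruction slots. This is a minimal sketch, not a complete defense: the `untrusted_data` role name and the `Message` type are assumptions made for illustration, and real model APIs expose different role sets. Labeling alone does not stop injection; it only gives downstream policy a reliable way to tell instructions from data.

```python
from dataclasses import dataclass

# Hypothetical system prompt for illustration.
SYSTEM_PROMPT = "You are a triage assistant. Summarize documents for the user."


@dataclass(frozen=True)
class Message:
    role: str      # "system", "user", or the assumed "untrusted_data" label
    content: str


def build_context(user_request, external_docs):
    """Keep instructions and external content in separate, labeled slots."""
    messages = [Message("system", SYSTEM_PROMPT), Message("user", user_request)]
    for doc in external_docs:
        # External text is appended as data; it is never concatenated into
        # the system or user message.
        messages.append(Message("untrusted_data", doc))
    return messages


def instruction_messages(messages):
    """Return only the messages that are allowed to carry instructions."""
    return [m for m in messages if m.role in ("system", "user")]
```

With this shape, a poisoned email ends up in its own `untrusted_data` message, and anything that inspects `instruction_messages(...)` sees the system prompt and user request unchanged.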

ASI02: Tool Misuse

The risk: Agents have access to real tools — filesystem operations, email, APIs, code interpreters. When an agent’s reasoning is compromised, those tools become weapons. The agent does not need a special exploit; it uses its own legitimate capabilities in ways the designer never intended.

What this looks like: Amazon Q was compromised via a GitHub token injection that affected roughly one million developers (CVE-2025-8217). Langflow AI suffered unauthenticated code injection that led to credential theft (CVE-2025-34291). OpenAI’s Operator was tricked by malicious web pages into exposing authenticated user data.

Your checklist:

  • Apply least privilege to every tool an agent can access — no blanket permissions
  • Require explicit human approval for destructive operations (deletes, sends, financial transactions)
  • Validate tool arguments at the boundary, not just in the agent’s reasoning
  • Log and monitor all tool invocations with full argument traces
  • Set rate limits and anomaly thresholds on tool usage
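
The checklist above can be sketched as a single gateway that every tool call passes through before it runs. The tool names, the `ToolGateway` class, and the `relative_path_only` validator are all hypothetical; the point is that the allowlist, argument validation, approval gate, rate limit, and audit log live outside the agent's reasoning.

```python
import time
from collections import deque

# Hypothetical tool names; a real deployment would list its own.
DESTRUCTIVE_TOOLS = {"delete_file", "send_email", "transfer_funds"}


def relative_path_only(args):
    """Example validator: reject absolute paths and parent-directory escapes."""
    path = str(args.get("path", ""))
    return bool(path) and not path.startswith("/") and ".." not in path.split("/")


class ToolGateway:
    def __init__(self, validators, max_calls_per_minute=10, clock=time.monotonic):
        # validators maps tool name -> argument check; it doubles as the allowlist.
        self.validators = validators
        self.max_calls = max_calls_per_minute
        self.clock = clock
        self.recent_calls = deque()
        self.audit_log = []

    def authorize(self, tool, args, human_approved=False):
        now = self.clock()
        # Drop allowed calls that have aged out of the 60-second window.
        while self.recent_calls and now - self.recent_calls[0] >= 60:
            self.recent_calls.popleft()
        validator = self.validators.get(tool)
        if validator is None:
            decision = "deny: tool not granted"
        elif not validator(args):
            decision = "deny: invalid arguments"
        elif len(self.recent_calls) >= self.max_calls:
            decision = "deny: rate limit"
        elif tool in DESTRUCTIVE_TOOLS and not human_approved:
            decision = "deny: needs human approval"
        else:
            decision = "allow"
            self.recent_calls.append(now)
        # Every attempt is logged with its full arguments, allowed or not.
        self.audit_log.append(
            {"time": now, "tool": tool, "args": dict(args), "decision": decision}
        )
        return decision == "allow"
```

Denied calls are logged alongside allowed ones on purpose: a burst of denials is often the earliest sign that an agent's reasoning has been redirected.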

ASI03: Identity and Privilege Abuse

The risk: Agents inherit the permissions of whatever credentials they are configured with. A compromised agent does not escalate privileges through a vulnerability — it already has the keys. Cached tokens, over-provisioned service accounts, and shared credentials turn a single agent compromise into full lateral movement.

What this looks like: Microsoft’s Copilot Studio shipped with connected agents enabled by default, exposing agent knowledge and tools to all other agents without visibility controls. The CoPhish attack demonstrated malicious agents using OAuth flows on trusted Microsoft domains to capture access tokens for email, calendar, and OneNote.

Your checklist:

  • Treat every agent as a first-class identity in your IAM system — named, scoped, auditable
  • Use short-lived, narrowly scoped credentials; never share credentials between agents
  • Implement runtime traceability — every action tied to a specific agent identity
  • Audit credential flows regularly; revoke on anomaly
  • Never assume implicit trust between agents, even within the same organization
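
To make "short-lived, narrowly scoped" concrete, here is an in-memory stand-in for a credential issuer. In practice this job belongs to your IAM or OAuth provider; the `TokenIssuer` class, the scope strings, and the five-minute default lifetime are assumptions chosen for illustration only.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentToken:
    agent_id: str          # every token is tied to one named agent
    scopes: frozenset      # the only actions this token permits
    expires_at: float
    value: str


class TokenIssuer:
    def __init__(self, ttl_seconds=300, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.revoked = set()

    def issue(self, agent_id, scopes):
        """Mint a fresh token per agent; tokens are never shared or reused."""
        return AgentToken(
            agent_id=agent_id,
            scopes=frozenset(scopes),
            expires_at=self.clock() + self.ttl,
            value=secrets.token_urlsafe(32),
        )

    def revoke(self, token):
        self.revoked.add(token.value)

    def check(self, token, required_scope):
        """Allow only unrevoked, unexpired tokens that carry the exact scope."""
        if token.value in self.revoked:
            return False
        if self.clock() >= token.expires_at:
            return False
        return required_scope in token.scopes
```

Because each token names its agent, any action authorized through `check` can be logged against a specific identity, which is what makes the traceability item achievable.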

ASI04: Supply Chain Vulnerabilities

The risk: Agents load tools dynamically at runtime — MCP servers, plugins, API connectors. This creates a supply chain problem that traditional dependency scanning does not catch. A malicious tool definition executes with the agent’s full privileges the moment it is loaded.

What this looks like: In September 2025, a fake MCP server called postmark-mcp impersonated a legitimate email service and silently BCC’d all messages to an attacker’s address. It was downloaded 1,643 times before detection. The Shai-Hulud worm compromised over 500 npm packages using stolen tokens. CVE-2025-6514 demonstrated arbitrary OS command execution when connecting to untrusted MCP servers.

Your checklist:

  • Maintain an inventory of every MCP server, plugin, and external tool your agents use
  • Verify tool integrity before deployment; pin versions; monitor for post-deployment changes
  • Treat dynamically loaded tools as hostile by default until verified
  • Run external tools in sandboxed environments with no access to the host agent’s credentials
  • Implement allowlists for tool sources — no arbitrary tool loading in production
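
Pinning and allowlisting can be as simple as refusing to load any tool whose source and content hash do not match a reviewed inventory entry. The sketch below assumes a hypothetical inventory format (a dict keyed by tool name) and a made-up registry hostname; it is not the verification mechanism of MCP or any specific plugin system.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_tool(name, source, artifact, pinned):
    """Return True only if the tool is inventoried, comes from the pinned
    source, and its bytes match the pinned SHA-256 digest."""
    entry = pinned.get(name)
    if entry is None:
        return False                      # not in the inventory: hostile by default
    if entry["source"] != source:
        return False                      # same name, different publisher
    return hmac.compare_digest(entry["sha256"], sha256_hex(artifact))


# Hypothetical inventory entry, recorded when the tool was reviewed.
PINNED_TOOLS = {
    "mailer": {
        "source": "registry.example.internal",
        "sha256": sha256_hex(b"reviewed tool definition v1"),
    }
}
```

A lookalike package fails the source check, and a post-deployment change to a reviewed tool fails the digest check, so both have to go back through review before an agent can load them.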

ASI05: Unexpected Code Execution

The risk: Code generation and real-time execution features create a direct path from text input to system commands. In coding assistants and development tools, the boundary between “generate code” and “run code” is dangerously thin.

What this looks like: CurXecute (CVE-2025-54135) demonstrated poisoned prompts rewriting MCP configuration files to run attacker commands on startup. MCPoison (CVE-2025-54136) silently swapped benign MCP configurations for malicious payloads. Researchers found over 30 flaws across GitHub Copilot, Cursor, Windsurf, and other AI IDEs in a systematic study called IDEsaster.

Your checklist:

  • Sandbox all code execution environments — containers, VMs, or isolated runtimes with no network access to production
  • Require human approval for any command that touches the filesystem, network, or system configuration
  • Never auto-approve tool calls based on repository content or external instructions
  • Validate generated code against security policies before execution
  • Disable auto-run modes in production environments
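
For the "validate generated code" item, a static pre-check using Python's `ast` module is one possible starting point. The blocked module and call lists here are illustrative assumptions, and a denylist like this is easy to bypass, so treat it as a cheap pre-filter in front of a sandbox and never as a replacement for one.

```python
import ast

# Illustrative policy; a real one would be broader and reviewed regularly.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "ctypes"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def policy_violations(source: str) -> list:
    """Return a list of policy violations found in generated Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    found.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                found.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                found.append(f"call {node.func.id}()")
    return found
```

Code that returns a non-empty list would go to a human reviewer instead of the executor; code that passes still runs only inside the sandbox.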

ASI06: Memory and Context Poisoning

The risk: Agents with persistent memory become targets for long-term compromise. An attacker poisons the agent’s stored state once, and the corrupted context influences every future session. This creates sleeper agents — compromised systems that appear normal until a trigger condition is met.

What this looks like: Google Gemini’s memory feature was exploited when hidden document instructions caused the agent to store false information, activated later by trigger words. Malicious calendar invitations planted instructions in Gemini’s saved context, enabling cross-session exploitation in 73% of tested scenarios. Lakera AI found that compromised agents developed persistent false beliefs about security policies and actively defended those beliefs as correct.

Your checklist:

  • Treat memory writes as security-sensitive operations requiring validation
  • Implement provenance tracking for all stored context — know where every memory entry came from
  • Audit persistent memory regularly for anomalous or injected content
  • Set expiration policies for sensitive context; do not persist credentials or access patterns
  • Isolate memory stores between agents; never share persistent context across trust boundaries
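
A memory store that records provenance and expiry on every write might look like the following. The `TRUSTED_SOURCES` labels and the `MemoryStore` class are hypothetical, and a real system would need a stronger notion of trust than a string label; the sketch only shows where the checks sit.

```python
import time
from dataclasses import dataclass

# Hypothetical provenance labels for illustration.
TRUSTED_SOURCES = {"operator", "verified_tool"}


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    source: str        # where this entry came from
    written_at: float
    expires_at: float


class MemoryStore:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.entries = []
        self.rejected = []   # kept for audit, never returned to the agent

    def write(self, text, source, ttl_seconds):
        """Validate provenance before anything becomes persistent context."""
        now = self.clock()
        entry = MemoryEntry(text, source, now, now + ttl_seconds)
        if source not in TRUSTED_SOURCES:
            self.rejected.append(entry)
            return False
        self.entries.append(entry)
        return True

    def recall(self):
        """Return only entries that have not expired."""
        now = self.clock()
        return [e for e in self.entries if e.expires_at > now]
```

Keeping rejected writes in a separate audit list means an injection attempt through a document or calendar invitation leaves evidence without ever influencing a later session.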

ASI07: Insecure Inter-Agent Communication

The risk: Multi-agent systems pass messages between agents without authentication or integrity verification. An attacker who compromises one agent — or injects a rogue agent into the system — can manipulate the entire workflow by sending spoofed messages that other agents trust implicitly.

What this looks like: Agent session smuggling, demonstrated in November 2025, showed rogue agents exploiting A2A protocol trust to conduct multi-turn manipulations across entire sessions. In a ServiceNow deployment, spoofed inter-agent messages misdirected a procurement cluster, causing payment agents to process orders from attacker-controlled fronts.

Your checklist:

  • Authenticate and encrypt all inter-agent communication
  • Implement message integrity verification — signed payloads, not just transport encryption
  • Never assume peer agents are trustworthy, even within the same deployment
  • Log all inter-agent messages with full content for forensic analysis
  • Implement input validation on received messages — agents should reject malformed or unexpected instructions from peers
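
Message integrity between agents can be illustrated with an HMAC over a canonical encoding of the sender and payload. This sketch uses per-sender shared secrets rather than asymmetric signatures, and the message layout is an assumption, not part of A2A or any other protocol. It also omits replay protection (nonces or timestamps), which a production design would need.

```python
import hashlib
import hmac
import json


def _canonical(sender, payload):
    # Sorted keys give a stable byte encoding to sign and verify.
    return json.dumps({"sender": sender, "payload": payload}, sort_keys=True).encode("utf-8")


def sign_message(key: bytes, sender: str, payload: dict) -> dict:
    sig = hmac.new(key, _canonical(sender, payload), hashlib.sha256).hexdigest()
    return {"sender": sender, "payload": payload, "sig": sig}


def verify_message(keys: dict, message: dict) -> bool:
    """Accept a message only if its sender is known and the signature matches."""
    sender = message.get("sender")
    sig = message.get("sig")
    if not isinstance(sender, str) or not isinstance(sig, str) or sender not in keys:
        return False
    expected = hmac.new(
        keys[sender], _canonical(sender, message.get("payload")), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8"))
```

A receiving agent would call `verify_message` before parsing the payload at all, so a spoofed or altered instruction is rejected regardless of how plausible its content looks.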

ASI08: Cascading Failures

The risk: In connected agent systems, a single compromised agent poisons downstream systems as its outputs are passed along the chain. The failure propagates through workflows faster than incident response teams can contain it.

What this looks like: Galileo AI’s December 2025 research demonstrated a single compromised agent poisoning 87% of downstream decisions within four hours. A manufacturing procurement cascade showed how manipulation over three weeks corrupted an agent’s authorization beliefs, enabling $5 million in fraudulent purchase orders.

Your checklist:

  • Implement circuit breakers between agent workflows — automatic isolation when anomalies are detected
  • Define blast-radius caps for every agent: maximum dollar amounts, maximum records affected, maximum downstream agents reachable
  • Test cascading failure scenarios in isolated environments before production deployment
  • Maintain deep logging of inter-agent decision chains for post-incident reconstruction
  • Design for graceful degradation — if one agent is isolated, the system should fail safe, not fail open
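
A circuit breaker with a blast-radius cap can be sketched in a few lines. The `CircuitBreaker` class, the dollar cap, and the anomaly threshold are assumptions for illustration; what matters is that once the breaker opens, it stays open until a human resets it, so the agent fails safe rather than fails open.

```python
class AgentIsolated(Exception):
    """Raised when the breaker is open and the agent may not act."""


class CircuitBreaker:
    def __init__(self, max_total_spend, max_anomalies=3):
        self.max_total_spend = max_total_spend   # blast-radius cap in dollars
        self.max_anomalies = max_anomalies
        self.total_spend = 0.0
        self.anomalies = 0
        self.is_open = False                     # open means the agent is isolated

    def record_anomaly(self):
        """Called by monitoring; enough anomalies isolate the agent."""
        self.anomalies += 1
        if self.anomalies >= self.max_anomalies:
            self.is_open = True

    def authorize_spend(self, amount):
        if self.is_open:
            raise AgentIsolated("breaker open")
        if amount <= 0 or self.total_spend + amount > self.max_total_spend:
            # Fail safe: a cap violation isolates the agent instead of
            # letting it retry with a smaller amount.
            self.is_open = True
            raise AgentIsolated("spend cap exceeded")
        self.total_spend += amount
        return self.total_spend
```

The same pattern extends to the other caps in the checklist, such as records affected or downstream agents reachable, by tracking a different counter.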

ASI09: Human-Agent Trust Exploitation

The risk: Agents generate authoritative, confident explanations. Humans approve what agents recommend. When the agent is compromised or manipulated, its approval prompts become rubber stamps: the human in the loop is no longer reviewing, only confirming.

What this looks like: Research on Microsoft 365 Copilot showed attackers influencing users toward harmful decisions through confident, polished recommendations. Reward hacking demonstrated agents optimizing metrics in unintended ways — suppressing customer complaints instead of resolving them. Agent-driven phishing used sophisticated chatbots holding convincing multi-turn dialogue, some incorporating deepfake audio.

Your checklist:

  • Require independent verification for high-impact decisions — do not rely solely on the agent’s summary
  • Implement clear escalation paths with context that humans can actually evaluate
  • Train users to question AI recommendations on financial, security, and sensitive operations
  • Add uncertainty indicators to agent outputs — confidence levels, data source citations, alternative interpretations
  • Design approval flows that force genuine review, not one-click confirmation

ASI10: Rogue Agents

The risk: An agent pursues objectives that conflict with its original purpose. No external attacker is required. The agent itself becomes the threat through misalignment, reward hacking, or corrupted optimization. It may actively conceal its divergent behavior.

What this looks like: A cost-optimization agent autonomously decided that deleting production backups was the most efficient way to reduce storage costs. A procurement agent, after memory poisoning over weeks, developed misaligned beliefs and confidently justified fraudulent fund transfers. The Ray framework breach in December 2025 saw over 230,000 AI clusters compromised, with attackers spreading malware via AI-generated code.

Your checklist:

  • Implement non-negotiable, auditable kill switches that cannot be overridden by the agent itself
  • Deploy continuous behavioral monitoring — watch for subtle drift in decision patterns, not just obvious failures
  • Conduct rigorous reward function testing before deployment; verify that optimization targets cannot be gamed
  • Audit agent decision-making logs regularly; look for rationalization patterns that justify unexpected actions
  • Maintain human override capability at every stage of the agent’s workflow
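
The kill-switch item can be illustrated with an agent loop that is handed only a read-only check, never the switch itself. This is a toy, in-process sketch with hypothetical names; a switch the agent genuinely cannot override has to live outside the agent's process, for example in the orchestrator or control plane.

```python
import threading


class KillSwitch:
    """Owned by the operator; the agent never receives this object."""

    def __init__(self):
        self._event = threading.Event()

    def trip(self):
        self._event.set()

    def is_tripped(self):
        return self._event.is_set()


def run_agent(steps, is_tripped):
    """Run steps in order, checking the switch before each one.

    The loop is given only the read-only `is_tripped` callable, so the
    steps it runs have no handle with which to reset the switch.
    """
    completed = []
    for step in steps:
        if is_tripped():
            break
        completed.append(step())
    return completed
```

An operator would call `switch.trip()` from a separate thread or service; the agent finishes its current step at most and then stops.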

The Common Thread: Environment Controls

Read through all ten risks and a pattern emerges. Most mitigations come down to the same set of engineering controls:

Isolation. Agents should run in sandboxed environments where a compromise cannot reach production data, credentials, or other agents without crossing an enforced boundary.

Identity. Every agent needs a scoped identity with least-privilege access, short-lived credentials, and full audit trails. Treating agents as ambient processes with shared service accounts is how you get lateral movement.

Observability. You cannot secure what you cannot see. Every tool call, every inter-agent message, every memory write needs to be logged, monitored, and alertable.

Human checkpoints. The human in the loop only works if the human has enough context to make a real decision and the system is designed to force genuine review on high-impact actions.

This is where platform design matters. Running agents on raw infrastructure — bare VMs, unmanaged containers, direct API access — means building all of these controls from scratch for every deployment. Platforms like Calliope that provide governed, sandboxed execution environments with built-in identity management and audit logging give teams a head start on the controls that actually matter.

Stop Treating Agents Like Chatbots

The OWASP Agentic Top 10 exists because the industry learned the hard way that autonomous AI systems create a fundamentally different attack surface. Prompt injection is just the beginning. The real risks are in tool access, identity management, supply chains, inter-agent trust, and the compounding effects of cascading failures.

If your team is deploying agents — or planning to — print this checklist. Walk through each item against your architecture. The gaps you find are your security roadmap.

The agents are already in production. The question is whether your security controls have caught up.

