Unsandboxed Execution
The Anti-Pattern
Giving agents direct access to production systems, real databases, or real-world actions without sandboxing, rollback, or approval gates.
Why It Happens
During development, it's tempting to give agents real credentials for faster iteration. Agents then execute destructive operations (deleting data, sending emails, making purchases, deploying code) with no way to undo them. The failure mode isn't "agent doesn't work," it's "agent works too well on the wrong thing." By the time you realize what happened, the damage is done.
How to Fix It
Always sandbox agent actions in development and testing. Use human-in-the-loop approval for high-risk operations. Implement dry-run mode for every destructive action. Classify all actions by risk level and require escalating approval thresholds. Apply the principle of least privilege: agents get only the permissions they need, nothing more. Design for the worst case: what's the most damage this agent could do? Then make that impossible.
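The risk tiers, dry-run mode, and approval gate described above can be sketched together in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the names `Risk`, `Action`, and `execute` are hypothetical, and the `approve` callback stands in for whatever human review channel a real system would use. The sketch uses a single approval threshold; escalating thresholds would add more tiers to the same check.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable


class Risk(IntEnum):
    """Hypothetical risk tiers, ordered so they can be compared."""
    READ_ONLY = 0
    LOW_RISK_WRITE = 1
    HIGH_RISK_WRITE = 2


@dataclass
class Action:
    name: str
    risk: Risk
    run: Callable[[], str]       # performs the real side effect
    describe: Callable[[], str]  # reports what run() would do, without doing it


def execute(action: Action, *, dry_run: bool = True,
            approve: Callable[[Action], bool] = lambda action: False) -> str:
    """Run an action only if dry-run is off and its risk tier is cleared."""
    if dry_run:
        return f"[dry-run] {action.describe()}"
    if action.risk >= Risk.HIGH_RISK_WRITE and not approve(action):
        return f"[blocked] {action.name} needs human approval"
    return action.run()


# Stand-in for a destructive operation: records the drop instead of touching a database.
dropped = []
drop_users = Action(
    name="drop_users",
    risk=Risk.HIGH_RISK_WRITE,
    run=lambda: dropped.append("users") or "dropped table users",
    describe=lambda: "would drop table users",
)

preview = execute(drop_users)                 # dry run is the default
blocked = execute(drop_users, dry_run=False)  # no approver supplied, so it is refused
print(preview)
print(blocked)
print(dropped)
```

Making dry-run the default and denial the default approver means a forgotten argument fails safe: the destructive path runs only when a caller opts in twice.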
Diagram
Unsandboxed (dangerous):            Sandboxed (safe):
┌───────┐    ┌────────────┐         ┌───────┐    ┌─────────┐
│ Agent │───▶│ Production │         │ Agent │───▶│ Sandbox │
│       │    │ Database   │         │       │    │ (copy)  │
└───────┘    └────────────┘         └───────┘    └─────────┘
                   │                                  │
            DROP TABLE users                  DROP TABLE users
                   │                                  │
                   ▼                                  ▼
           Data gone forever                Only sandbox affected
                                                      │
                                                      ▼
                                               [Human review]
                                                      │
                                            'Looks wrong, reject'
Symptoms
- Agents have write access to production databases or APIs
- No approval workflow exists for destructive operations
- Testing happens directly against production systems
- Agent errors result in real-world consequences that can't be reversed
False Positives
- Agents operating in read-only mode with no write capabilities
- Well-tested agents with comprehensive approval workflows already in place
- Development environments that are already properly isolated
Warning Signs & Consequences
Warning Signs
- Agent performing write operations on production without any review step
- No separation between dev/staging/production for agent access
- Credentials with broad permissions given directly to agents
- Lack of audit trail for agent-initiated actions
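Two of the warning signs above, broad credentials and a missing audit trail, can be addressed by the same wrapper. The sketch below is one hedged way to do it: `ScopedToolbox` and the two ticket tools are hypothetical names invented for illustration, and the in-memory list stands in for a durable, append-only log.

```python
import time
from typing import Any, Callable


class ScopedToolbox:
    """Expose only explicitly granted tools and record every call attempt."""

    def __init__(self, tools: dict[str, Callable[..., Any]], granted: set[str]):
        # Tools that were not granted are never stored, so they cannot be reached.
        self._tools = {name: fn for name, fn in tools.items() if name in granted}
        self.audit_log: list[dict[str, Any]] = []

    def call(self, name: str, **kwargs: Any) -> Any:
        allowed = name in self._tools
        # Log before executing so that denied and failed attempts are also captured.
        self.audit_log.append(
            {"ts": time.time(), "tool": name, "args": kwargs, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"tool {name!r} is not granted to this agent")
        return self._tools[name](**kwargs)


# Hypothetical tools: the agent is granted the read tool only.
all_tools = {
    "read_ticket": lambda ticket_id: f"ticket {ticket_id}",
    "delete_ticket": lambda ticket_id: f"deleted {ticket_id}",
}
box = ScopedToolbox(all_tools, granted={"read_ticket"})

print(box.call("read_ticket", ticket_id=7))
try:
    box.call("delete_ticket", ticket_id=7)
except PermissionError as exc:
    print(f"denied: {exc}")
print([(entry["tool"], entry["allowed"]) for entry in box.audit_log])
```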
Consequences
- Irreversible damage from agent errors: data loss, wrong emails sent, bad deployments
- Compliance and regulatory violations from uncontrolled access
- Impossible debugging when the agent has modified the system it's operating on
- Complete loss of user trust after a catastrophic agent-initiated incident
Remediation Steps
1. Classify all agent actions by risk level: read-only, low-risk write, high-risk write
2. Sandbox all development and testing; never test against production
3. Implement dry-run mode that shows what would happen without executing
4. Add human-in-the-loop approval for all high-risk operations
5. Apply the principle of least privilege: minimal permissions, explicit scope
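To make the sandboxing step concrete, here is a small sketch using Python's standard `sqlite3` module, whose `Connection.backup` method copies one database into another. The "production" database is an in-memory stand-in created purely for the demonstration, and `make_sandbox` is a hypothetical helper name. In practice the copy would more likely come from a snapshot or fixture data than from a live production connection.

```python
import sqlite3


def make_sandbox(source: sqlite3.Connection) -> sqlite3.Connection:
    """Return an independent in-memory copy of the source database."""
    sandbox = sqlite3.connect(":memory:")
    source.backup(sandbox)  # copies all pages from source into sandbox
    return sandbox


# Stand-in for production: a throwaway in-memory database with one user.
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
production.execute("INSERT INTO users (email) VALUES ('a@example.com')")
production.commit()

# The agent only ever receives the sandbox connection.
sandbox = make_sandbox(production)
sandbox.execute("DROP TABLE users")

prod_count = production.execute("SELECT COUNT(*) FROM users").fetchone()[0]
sandbox_tables = sandbox.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(prod_count)      # the production row is still there
print(sandbox_tables)  # the table is gone only in the sandbox
```

The important design choice is that the agent never holds a handle to the original: isolation comes from what the agent is given, not from what it is asked to avoid.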
Real-World Example
The Accidental Email Blast
A customer service agent is given access to the email API for sending individual support responses. Due to a prompt injection in a customer ticket, the agent sends the same response to every customer in the database: 50,000 emails in 3 minutes. A human-in-the-loop approval gate on batch operations (more than one recipient) would have held the send for review before any of those emails went out.
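A guard of the kind that would have held this send for review might look like the following sketch. Everything here is assumed for illustration: `guarded_send`, `ApprovalRequired`, and the `send` callable are hypothetical, and the threshold of one recipient simply mirrors the example above. The real email API is replaced by a function that appends to a list.

```python
from typing import Callable, Optional, Sequence

MAX_UNAPPROVED_RECIPIENTS = 1  # more than one recipient counts as a batch


class ApprovalRequired(Exception):
    """Raised when a send needs human sign-off before it can proceed."""


def guarded_send(send: Callable[[str, str], None], recipients: Sequence[str],
                 body: str, approved_by: Optional[str] = None) -> int:
    """Send to each unique recipient, refusing unapproved batches."""
    unique = sorted(set(recipients))
    if len(unique) > MAX_UNAPPROVED_RECIPIENTS and approved_by is None:
        raise ApprovalRequired(
            f"{len(unique)} recipients exceeds the unapproved limit "
            f"of {MAX_UNAPPROVED_RECIPIENTS}"
        )
    for address in unique:
        send(address, body)
    return len(unique)


# Stand-in for the email API: collects messages instead of sending them.
outbox = []


def fake_send(to: str, body: str) -> None:
    outbox.append((to, body))


guarded_send(fake_send, ["a@example.com"], "Your ticket is resolved.")
try:
    everyone = [f"user{i}@example.com" for i in range(50_000)]
    guarded_send(fake_send, everyone, "Your ticket is resolved.")
except ApprovalRequired as exc:
    print(f"held for review: {exc}")
print(len(outbox))  # only the single-recipient message was sent
```

The check runs before the first message is sent, so a refused batch delivers nothing rather than stopping partway through.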