Unsandboxed Execution

Advanced · 🚫 Anti-Pattern · ⚙️ Anti-Patterns: Tool Use · Microsoft Research / Industry observation

🚫 Anti-Pattern: This describes a common mistake to avoid, not a pattern to follow.

The Anti-Pattern

Giving agents direct access to production systems, real databases, or real-world actions without sandboxing, rollback, or approval gates.

Why It Happens

During development, it’s tempting to give agents real credentials for faster iteration. With those credentials, an agent can execute destructive operations (deleting data, sending emails, making purchases, deploying code) that cannot be undone. The failure mode isn’t ‘agent doesn’t work,’ it’s ‘agent works too well on the wrong thing.’ By the time you realize what happened, the damage is done.

How to Fix It

Always sandbox agent actions in development and testing. Use human-in-the-loop approval for high-risk operations. Implement dry-run mode for every destructive action. Classify all actions by risk level and require escalating approval thresholds. Apply the principle of least privilege — agents get only the permissions they need, nothing more. Design for the worst case: what’s the most damage this agent could do? Then make that impossible.
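A minimal sketch of how dry-run mode and escalating approval could fit together. The names `Risk`, `Action`, and `execute` are hypothetical and not taken from any particular agent framework; the sketch assumes each action can describe its own effect without performing it.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional


class Risk(IntEnum):
    READ_ONLY = 0
    LOW_RISK_WRITE = 1
    HIGH_RISK_WRITE = 2


@dataclass
class Action:
    name: str
    risk: Risk
    run: Callable[[], str]       # performs the real side effect
    describe: Callable[[], str]  # reports what run() would do, with no side effect


def execute(action: Action, *, dry_run: bool = True,
            approve: Optional[Callable[[Action], bool]] = None) -> str:
    """Run an action only once its risk level has been cleared."""
    if dry_run:
        # Dry-run is the default, so a forgotten flag cannot cause damage.
        return f"[dry-run] {action.describe()}"
    if action.risk >= Risk.HIGH_RISK_WRITE:
        # Fail closed: a missing approver or a rejection means nothing executes.
        if approve is None or not approve(action):
            return f"[blocked] {action.name} needs human approval"
    return action.run()


deleted = []
drop = Action(
    name="delete_user",
    risk=Risk.HIGH_RISK_WRITE,
    run=lambda: deleted.append("42") or "deleted user 42",
    describe=lambda: "would delete user 42",
)

print(execute(drop))                                         # [dry-run] would delete user 42
print(execute(drop, dry_run=False))                          # [blocked] delete_user needs human approval
print(execute(drop, dry_run=False, approve=lambda a: True))  # deleted user 42
```

In this sketch the `approve` callback stands in for whatever review channel a team actually uses (a ticket, a chat prompt, a CLI confirmation). The point of the design is that the safe behavior is the default and the destructive path requires two explicit opt-ins.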

Diagram

  Unsandboxed (dangerous):            Sandboxed (safe):
  ┌───────┐    ┌────────────┐        ┌───────┐    ┌─────────┐
  │ Agent │───▶│ Production │        │ Agent │───▶│ Sandbox │
  │       │    │ Database   │        │       │    │  (copy) │
  └───────┘    └────────────┘        └───────┘    └─────────┘
       │            │                     │            │
  DROP TABLE users  │                  DROP TABLE users │
       │            ▼                     │            ▼
       │     💀 Data gone forever         │     ✓ Only sandbox affected
       │                                  │
       │                                  ▼
       │                            [Human review]
       │                                  │
       │                            'Looks wrong, reject'
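The sandboxed side of the diagram can be illustrated with Python's standard `sqlite3` module. This is only a sketch: an in-memory SQLite database stands in for production, and `make_sandbox` and `table_exists` are hypothetical helper names. A real deployment would more likely use a restored snapshot or a separate staging instance.

```python
import sqlite3


def make_sandbox(source: sqlite3.Connection) -> sqlite3.Connection:
    """Return an in-memory copy of `source` that the agent is free to damage."""
    sandbox = sqlite3.connect(":memory:")
    source.backup(sandbox)  # copies schema and rows into the sandbox
    return sandbox


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# Stand-in for the production database (an assumption for this example).
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
prod.execute("INSERT INTO users (email) VALUES ('a@example.com')")
prod.commit()

# The agent only ever receives the sandbox connection.
sandbox = make_sandbox(prod)
sandbox.execute("DROP TABLE users")

print(table_exists(sandbox, "users"))  # False
print(table_exists(prod, "users"))     # True
```

Because the agent never holds the `prod` connection, the destructive statement has nothing real to act on, and a reviewer can inspect the sandbox afterwards to decide whether the change should be rejected.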

Symptoms

Agents have write access to production databases or APIs
No approval workflow exists for destructive operations
Testing happens directly against production systems
Agent errors result in real-world consequences that can’t be reversed

False Positives

Agents operating in read-only mode with no write capabilities
Well-tested agents with comprehensive approval workflows already in place
Development environments that are already properly isolated

Warning Signs & Consequences

Warning Signs

Agent performing write operations on production without any review step
No separation between dev/staging/production for agent access
Credentials with broad permissions given directly to agents
Lack of audit trail for agent-initiated actions

Consequences

Irreversible damage from agent errors — data loss, wrong emails sent, bad deployments
Compliance and regulatory violations from uncontrolled access
Impossible debugging when the agent has modified the system it’s operating on
Complete loss of user trust after a catastrophic agent-initiated incident

Remediation Steps

1. Classify all agent actions by risk level: read-only, low-risk write, high-risk write
2. Sandbox all development and testing; never test against production
3. Implement dry-run mode that shows what would happen without executing
4. Add human-in-the-loop approval for all high-risk operations
5. Apply principle of least privilege: minimal permissions, explicit scope
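The classification and least-privilege steps can be sketched as a tool registry with an explicit allowlist per agent. Everything here is hypothetical (`Tool`, `ScopedToolbox`, and the three example tools); the sketch assumes the agent can only reach tools through the toolbox object it is handed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Tool:
    risk: str               # "read-only", "low-risk write", or "high-risk write"
    fn: Callable[..., str]


class ScopedToolbox:
    """Exposes only the tools explicitly granted to one agent."""

    def __init__(self, registry: Dict[str, Tool], granted: FrozenSet[str]) -> None:
        unknown = granted - set(registry)
        if unknown:
            raise ValueError(f"unknown tools granted: {sorted(unknown)}")
        # Ungranted tools are never copied in, so they cannot be reached by name.
        self._tools = {name: registry[name] for name in granted}

    def available(self) -> List[Tuple[str, str]]:
        return sorted((name, tool.risk) for name, tool in self._tools.items())

    def call(self, name: str, *args: str) -> str:
        if name not in self._tools:
            raise PermissionError(f"tool {name!r} is not granted to this agent")
        return self._tools[name].fn(*args)


REGISTRY = {
    "search_tickets": Tool("read-only", lambda query: f"results for {query}"),
    "add_ticket_note": Tool("low-risk write", lambda ticket: f"note added to {ticket}"),
    "issue_refund": Tool("high-risk write", lambda order: f"refunded {order}"),
}

support_agent = ScopedToolbox(REGISTRY, frozenset({"search_tickets", "add_ticket_note"}))

print(support_agent.available())
# [('add_ticket_note', 'low-risk write'), ('search_tickets', 'read-only')]

try:
    support_agent.call("issue_refund", "order-17")
except PermissionError as err:
    print(err)  # tool 'issue_refund' is not granted to this agent
```

Recording the risk tier next to each tool keeps the classification in one place, so a review of what an agent can do is a review of a single grant list.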

Real-World Example

The Accidental Email Blast

A customer service agent is given access to the email API for sending individual support responses. Due to a prompt injection in a customer ticket, the agent sends the same response to every customer in the database: 50,000 emails in 3 minutes. A human-in-the-loop approval for batch operations (>1 recipient) would have stopped the send before any email went out.
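A rough sketch of the batch gate described in this example. `guarded_send`, `ApprovalRequired`, and the `send` callback are hypothetical names; the real email API is not shown in the source, so a plain function stands in for it.

```python
from typing import Callable, List, Sequence, Tuple

MAX_UNAPPROVED_RECIPIENTS = 1  # anything larger counts as a batch operation


class ApprovalRequired(Exception):
    """Raised when a send must wait for a human decision."""


def guarded_send(send: Callable[[str, str], None], recipients: Sequence[str],
                 body: str, *, approved: bool = False) -> int:
    """Send `body` to each recipient, refusing unapproved batch sends."""
    unique = sorted(set(recipients))
    if len(unique) > MAX_UNAPPROVED_RECIPIENTS and not approved:
        # Nothing has been sent at this point; the whole batch is held.
        raise ApprovalRequired(f"{len(unique)} recipients require human approval")
    for address in unique:
        send(address, body)
    return len(unique)


outbox: List[Tuple[str, str]] = []


def fake_send(address: str, body: str) -> None:
    # Stand-in for the real email API call.
    outbox.append((address, body))


print(guarded_send(fake_send, ["alice@example.com"], "Your ticket is resolved."))  # 1

everyone = [f"user{i}@example.com" for i in range(50_000)]
try:
    guarded_send(fake_send, everyone, "Your ticket is resolved.")
except ApprovalRequired as err:
    print(err)  # 50000 recipients require human approval

print(len(outbox))  # 1
```

This gate only covers a single call with many recipients. If the agent could instead loop over individual sends, a per-agent rate limit would likely be needed as well; that is outside what this sketch shows.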