blog June 01, 2026 6 min read

Coding Agent Horror Stories: The rm -rf ~/ Incident

CodeQuest Editorial Team

News Desk


The `rm -rf ~/` Incident: How AI Coding Agents Can Wipe Your System—and How to Stop It

A single misplaced command can erase years of work. That’s exactly what happened in a real-world incident involving an AI coding agent, where a developer’s entire home directory vanished in seconds. The command? rm -rf ~/. The lesson? Even the most advanced AI tools aren’t foolproof—and the risks of running unchecked automation in your local environment are severe.

Docker’s latest blog post in their AI Coding Agent Horror Stories series details how this incident unfolded, exposing a critical gap in developer workflows: AI-generated code isn’t always safe to execute. The post also highlights how Docker Sandboxes provide a layer of isolation to contain these failures before they become catastrophic. For developers relying on AI assistants, this isn’t just a cautionary tale—it’s a wake-up call about execution-layer security.

—

What Happened: The `rm -rf ~/` Incident

The incident began when a developer used an AI coding agent to automate a routine task. The agent, following a common pattern, suggested a command to clean up temporary files. But instead of targeting /tmp, it generated:

rm -rf ~/

This command recursively deletes everything in the user’s home directory: documents, projects, configurations, and credentials such as SSH keys. System files outside the home directory are not touched, but everything the user owns under it is gone. The developer, trusting the AI’s output, ran it without review. The result? Irrecoverable data loss.
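
To see why the command reaches so far, it can help to look at what the shell actually hands to rm once the tilde is expanded. The snippet below is a minimal sketch: preview is a hypothetical helper that prints its arguments instead of running them, and /home/alice is a made-up home directory.

```shell
# Print the arguments exactly as the shell expanded them, without running them.
preview() {
  printf '%s\n' "$*"
}

# /home/alice is a hypothetical home directory used only for this demo.
# The subshell keeps the HOME override from leaking into your session.
( HOME=/home/alice; preview rm -rf ~/ )
# prints: rm -rf /home/alice/
```

By the time rm starts, it no longer sees a tilde at all, only the absolute path of the home directory.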

Docker’s analysis of the incident points to three key failures:

No execution review: The AI provided a command without warning of its destructive potential.
No sandbox isolation: The command ran directly in the host environment, with full system access.
No rollback mechanism: Once executed, there was no way to undo the damage.

The blog post emphasizes that this isn’t an isolated case: similar incidents have occurred with other AI tools, where agents generate commands like chmod -R 777 / or dd if=/dev/zero of=/dev/sda, both of which can brick systems if misused.
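
The missing execution review could be approximated with a small pre-flight check that looks at a suggested command before anything runs. The sketch below is an assumption-laden illustration, not something from Docker's post: is_destructive is a hypothetical helper, and its list of tools and targets is deliberately short.

```shell
# Exit status 0 if the command string aims a destructive tool at a home,
# root, or raw-device target; 1 otherwise. The pattern list is illustrative.
is_destructive() (
  # The ( ) body runs in a subshell, so set -f and the variables stay local.
  set -f            # keep * and ? literal while splitting
  set -- $1         # split the command string on whitespace
  [ "$#" -gt 0 ] || exit 1
  tool=$1
  shift
  case "$tool" in
    rm|chmod|chown|dd) ;;
    *) exit 1 ;;
  esac
  for arg in "$@"; do
    case "$arg" in
      /|'/*'|'~'|'~/'|'~/*'|'$HOME'|'$HOME/'|of=/dev/*) exit 0 ;;
    esac
  done
  exit 1
)

# Classify a few sample commands without executing any of them.
for cmd in 'rm -rf ~/' 'chmod -R 777 /' 'rm -rf /tmp/test'; do
  if is_destructive "$cmd"; then
    echo "BLOCK: $cmd"
  else
    echo "ok:    $cmd"
  fi
done
```

A check like this only catches the obvious cases. It does not see through sudo, quoting tricks, or variables, so it complements review and isolation rather than replacing them.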

—

Why This Matters for Developers

1. AI-Generated Code Isn’t Trustworthy by Default

AI tools like GitHub Copilot, Amazon CodeWhisperer, or custom agents are trained on vast codebases, but they don’t understand context—especially not the implications of commands like rm -rf.
Skill shift: Developers must now treat AI suggestions as drafts, not final products. Manual review of critical commands is no longer optional.
Project risk: In team environments, an unchecked rm -rf could wipe shared repositories, CI/CD pipelines, or production-like sandboxes.

2. Execution-Layer Security Is Overlooked

Most discussions about AI safety focus on input validation (e.g., preventing prompt injection). But the real danger lies in execution—where a single command can cause irreversible damage.
Tooling gap: Traditional IDEs and linters don’t flag destructive commands. Static analysis tools like shellcheck can help, but they’re often bypassed in fast-paced workflows.

3. Sandboxing Isn’t Just for Security: It’s for Survival

Docker Sandboxes (and similar tools like Podman, LXC, or Firecracker) create isolated environments where commands like rm -rf only affect the container, not the host.
Practical implication: Developers should adopt sandboxing for:
Testing AI-generated scripts.
Running untrusted code snippets.
Automating repetitive tasks (e.g., cron jobs, CI/CD steps).

—

How to Protect Yourself: Practical Steps

1. Never Run AI-Generated Commands Directly

Rule of thumb: Treat every AI-suggested command as potentially harmful. Even harmless-looking ones like curl | sh can be dangerous.
Workaround: Use a sandbox first. For example:

     docker run --rm -it alpine sh -c "rm -rf /tmp/test && echo 'Command ran in sandbox'"

This isolates the command to a throwaway container.
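
If you find yourself typing that pattern often, a tiny wrapper may be convenient. The following is a sketch under assumptions: in_sandbox is a hypothetical helper name, and alpine is simply the image used in the example above. No host directory is mounted, so the command can only damage the container's own filesystem.

```shell
# Run one command string inside a throwaway Alpine container.
# --rm deletes the container on exit; --network none cuts off network access.
# Nothing from the host is mounted, so host files are out of reach.
in_sandbox() {
  docker run --rm --network none alpine sh -c "$1"
}
```

With Docker installed, in_sandbox 'rm -rf ~/ && ls -la /root' would let you watch the agent's command do its worst against a container that disappears afterwards.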

2. Enable Docker Sandboxes for High-Risk Workflows

Docker’s blog recommends using workspace-scoped isolation to contain failures. Here’s how to set it up:

Install Docker Desktop or Docker Engine.
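Create a disposable, named container with only the current directory mounted:

        docker run --rm -it --name my_sandbox -v "$PWD:/workspace" -w /workspace ubuntu bash

Run commands inside the container. With --rm, the container is removed as soon as you exit. If it hangs instead, force-remove it from another terminal and start a fresh one:

        docker rm -f my_sandbox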

Time estimate: 5–10 minutes to set up a basic workflow.
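
One caveat with a plain -v "$PWD:/workspace" mount is that it is writable, so a destructive command inside the container can still delete the project files it points at. A stricter variant, sketched below, mounts the project read-only and gives the agent a separate in-memory scratch area. The function name workspace_sandbox_ro and the /scratch path are assumptions made for illustration.

```shell
# Open an interactive shell in a container that can read the project
# but cannot modify it.
workspace_sandbox_ro() {
  # :ro makes the bind mount read-only inside the container.
  # --tmpfs gives the session a writable, in-memory /scratch directory
  # that vanishes together with the container.
  docker run --rm -it \
    -v "$PWD:/workspace:ro" \
    --tmpfs /scratch \
    -w /workspace \
    ubuntu bash
}
```

Inside that shell, rm -rf /workspace fails with a read-only filesystem error, while experiments under /scratch work normally.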

3. Use Static Analysis Tools

Tools like shellcheck (for shell scripts) or bandit (for Python code) can flag some dangerous patterns:

     shellcheck script.sh  # Warns on risky patterns such as rm -rf "$dir/"* with a possibly empty variable (SC2115)

Limitations: These tools aren’t perfect—false negatives still exist. Combine them with sandboxing.
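
One fix that ShellCheck itself suggests for the empty-variable case is the ${var:?} expansion, which aborts instead of letting a path collapse to /*. Here is a minimal sketch of that idiom. clean_dir is a hypothetical helper, and the demo only touches a directory created by mktemp.

```shell
# Delete the visible contents of a directory, refusing to run on an empty path.
clean_dir() {
  # ${1:?msg} stops with an error when $1 is unset or empty,
  # so the glob below can never collapse to "/*".
  dir=${1:?clean_dir: directory argument is required}
  rm -rf -- "$dir"/*
}

# Demo on a throwaway directory.
scratch=$(mktemp -d)
touch "$scratch/a.tmp" "$scratch/b.tmp"
clean_dir "$scratch"

# An empty argument is rejected; the subshell contains the abort.
( clean_dir "" ) 2>/dev/null || echo "refused: empty path"
rmdir "$scratch"
```

The guard only protects against an empty or missing argument. A caller that passes / explicitly still gets what it asked for, which is where sandboxing comes back in.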

4. Adopt a "No Host Execution" Policy for AI Tools

Configure your AI agent to:
Never run commands in the host environment by default.
Require explicit confirmation for destructive operations.
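Example .bashrc function to route deletions through a container (an alias cannot place its arguments inside the quoted command, so a function is used instead):

     # Runs rm -i inside a container that can only see the current directory.
     sandbox_rm() { docker run --rm -it -v "$PWD:/workspace" -w /workspace alpine rm -i -- "$@"; }

(Note: This is a simplified example. Paths must be relative to the current directory, and real-world use requires careful handling.)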
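
The "explicit confirmation" half of the policy can be sketched as a wrapper that shows the exact command and refuses to proceed unless a human types yes. confirm_run is a hypothetical name, and how an agent would be wired to call it depends on the tool, so treat this as an illustration of the idea only.

```shell
# Show the command, wait for a typed "yes", then run it. Anything else aborts.
confirm_run() {
  printf 'About to run: %s\nType "yes" to continue: ' "$*" >&2
  read -r answer
  if [ "$answer" != "yes" ]; then
    echo "aborted" >&2
    return 1
  fi
  "$@"
}

# Non-interactive demo: the answer is piped in instead of typed.
printf 'yes\n' | confirm_run echo "pretend this was a destructive command"
```

Prompts go to stderr so the wrapped command's own output stays clean on stdout.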

5. Test with Minimal, Reproducible Examples

Before running AI-generated code in production:

Create a minimal test case.
Run it in a sandbox.
Verify behavior with diff or git status.

Example workflow:

     # 1. Clone a test repo
     git clone https://github.com/example/minimal-repo.git
     cd minimal-repo

     # 2. Run AI-generated script in a sandbox
     docker run --rm -it -v "$PWD:/app" python:3.9 bash -c "python /app/script.py && echo 'Test passed'"
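
For directories that are not under git, the "verify behavior" step can be approximated by comparing file listings taken before and after the run. The sketch below assumes two hypothetical helpers, snapshot and missing_after, and uses a plain rm as a stand-in for the AI-generated script.

```shell
# List every file under a directory, sorted, so two listings can be compared.
snapshot() {
  ( cd "$1" && find . -type f | sort )
}

# Print files present in the first listing but absent from the second.
missing_after() {
  before_file=$(mktemp)
  after_file=$(mktemp)
  printf '%s\n' "$1" > "$before_file"
  printf '%s\n' "$2" > "$after_file"
  comm -23 "$before_file" "$after_file"
  rm -f "$before_file" "$after_file"
}

# Demo in a throwaway directory.
work=$(mktemp -d)
touch "$work/keep.txt" "$work/data.csv"
before=$(snapshot "$work")
rm -f "$work/data.csv"        # stand-in for the AI-generated script
after=$(snapshot "$work")
missing_after "$before" "$after"
rm -rf "$work"
```

Any path printed by missing_after is a file the script removed, which is exactly the kind of surprise you want to catch in a test repo rather than in your home directory.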

—


Challenges and Limitations

1. Sandboxing Adds Friction

Problem: Isolating every command slows down workflows.
Mitigation: Start with high-risk operations (e.g., rm, chmod, dd) and gradually expand.
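
One way to start small is a dispatcher that sends only the named high-risk tools to a container and lets everything else run as usual. This is a sketch under assumptions: run_guarded is a hypothetical helper, the tool list mirrors the examples above, and the current directory is mounted writable on purpose so that rm can still do its job there and nowhere else.

```shell
# Route high-risk tools through a container scoped to the current directory;
# run everything else directly on the host.
run_guarded() {
  case "$1" in
    rm|chmod|dd)
      # Only $PWD is visible inside the container, so damage stops there.
      docker run --rm -v "$PWD:/workspace" -w /workspace alpine "$@"
      ;;
    *)
      "$@"
      ;;
  esac
}
```

For example, run_guarded ls -la runs on the host unchanged, while run_guarded rm -rf build executes inside the container against the mounted project directory.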

2. Not All Tools Support Sandboxing

Example: Some CLI tools (e.g., kubectl, terraform) don’t play well with containerized execution.
Workaround: Use nested sandboxes (e.g., Docker-in-Docker) or virtual machines for complex tools.

3. AI Agents May Still Generate Bad Advice

Reality check: Sandboxing prevents damage, but it doesn’t fix flawed logic. Always review the intent behind commands.
Example: An AI might suggest rm -rf /var/log/* to "clean logs," but this could break system monitoring tools.

4. Cost and Complexity

Overhead: Running every command in a container requires Docker setup, which may not be feasible in restricted environments (e.g., shared servers).
Alternative: Use lightweight sandboxes like firejail or unshare for Linux:

     firejail rm -rf /tmp/safe_dir/
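
Because the available tooling differs from machine to machine, a script can probe for whichever isolation tool is installed before deciding how to run a command. The sketch below assumes a hypothetical pick_sandbox helper and an arbitrary preference order of docker, podman, then firejail.

```shell
# Print the first isolation tool found on PATH, or "none" if there is none.
pick_sandbox() {
  for tool in docker podman firejail; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Report what this machine offers.
echo "sandbox tool available: $(pick_sandbox)"
```

A wrapper script could refuse to run AI-generated commands at all when the answer is none, which keeps the "no host execution" rule intact on restricted servers.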

—

What’s Next: The Future of AI and Execution Safety

1. AI Tools Will Add Built-in Sandboxing

Signal to watch: GitHub Copilot and VS Code already integrate with Docker. Expect more tools to bake in isolation by default.
Example: JetBrains IDEs now support running code in disposable containers via plugins.

2. Standardized "Safe Execution" Protocols

Emerging trend: Frameworks like OpenAI’s "Safe Mode" or Google’s "Execution Guard" aim to restrict dangerous operations.
Adoption timeline: Likely within 1–2 years as regulatory pressures grow (e.g., GDPR, HIPAA compliance).

3. Developer Education Will Shift

New skills in demand:
Sandbox management: Understanding Docker, Podman, or systemd-nspawn.
Command auditing: Tools like auditd or strace to monitor system calls.
CodeQuest alignment: Look for missions covering:
Secure automation scripts.
Containerized development environments.
Static analysis for shell scripts.

4. Incident Reporting Will Improve

Current gap: Most rm -rf incidents go unreported. Platforms like Docker’s blog are starting to document cases.
Actionable step: Developers should contribute to public databases of AI-generated command risks (e.g., GitHub Issues, CVE tracking).

—

Final Takeaway: Assume the Worst—and Prepare

The rm -rf ~/ incident isn’t a bug—it’s a feature of how AI tools interact with systems today. The fix isn’t better AI; it’s better execution hygiene. Here’s the minimal action plan to start today:

Sandbox one high-risk command this week. Use Docker or firejail.
Add a static analyzer to your CI pipeline (e.g., shellcheck for Bash scripts).
Review your .bashrc and .zshrc for aliases that bypass safety checks.
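
The alias review in the last step can be partly automated. The sketch below assumes that "bypassing safety checks" means an alias that bakes a force flag into rm, such as alias rm='rm -rf'. find_risky_aliases is a hypothetical helper, and the pattern is a starting point rather than a complete audit.

```shell
# Print alias definitions (with line numbers) that hard-code rm with -f.
find_risky_aliases() {
  grep -nE '^[[:space:]]*alias[[:space:]]+[^=]+=.*rm[[:space:]]+-[[:alnum:]]*f' "$@"
}

# Demo against a temporary file standing in for a shell startup file.
rcfile=$(mktemp)
printf '%s\n' "alias rm='rm -rf'" "alias ll='ls -l'" > "$rcfile"
find_risky_aliases "$rcfile"
rm -f "$rcfile"
```

On a real machine you might call it as find_risky_aliases ~/.bashrc ~/.zshrc 2>/dev/null and read through whatever it prints.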

The goal isn’t to eliminate risk; it’s to ensure that when AI fails, your system doesn’t. And that’s a skill every developer needs in 2026 and beyond.

—

Sources

Docker Blog. https://www.docker.com/blog/coding-agent-horror-stories-the-rm-rf-incident/
