
OpenClaw Architecture: What Happens When Your Agent Runs

Trace the full path from user message to shell execution inside OpenClaw. Understand the gateway, sandbox, and approval flow — and where security gaps live.

Clawctl Team

Product & Engineering


Your agent just ran rm -rf /workspace. Here's everything that happened in the 340 milliseconds before that command hit disk.

Understanding this matters. If you run OpenClaw in production, you need to know where data flows, where decisions get made, and where things can go wrong.

The Three Layers

OpenClaw has three main components. Every interaction passes through all three.

1. The Gateway — the front door. It handles WebSocket connections, routes messages, manages sessions, and talks to your LLM provider.

2. The Sandbox — an isolated Docker container where the agent executes code. File system access, shell commands, package installs — all happen here.

3. The LLM Provider — Claude, GPT-4, Gemini, or whatever model you've configured. The brain that decides what to do next.

The gateway sits between the user and everything else. It's the orchestrator.

What Happens When You Send a Message

Let's trace a real interaction. You type: "Read the CSV file in /workspace and summarize the top 10 rows."

Step 1: WebSocket to Gateway (2ms)

Your browser holds a persistent WebSocket connection to the gateway. Your message hits the gateway's message handler.

The gateway wraps your message with context: session ID, conversation history, system prompt, available tools. This becomes the payload sent to the LLM.
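The exact payload shape depends on your provider, but the wrapping step might look roughly like this. Field names here are illustrative assumptions, not OpenClaw's actual wire format:

```python
# Sketch of the gateway's context-wrapping step. Field names are
# illustrative assumptions, not OpenClaw's actual schema.
def build_llm_payload(session, user_message):
    return {
        "session_id": session["id"],
        "system": session["system_prompt"],   # instructions + tool docs
        "messages": session["history"] + [    # full conversation so far
            {"role": "user", "content": user_message}
        ],
        "tools": session["tools"],            # e.g. execute_command
    }

session = {
    "id": "sess-123",
    "system_prompt": "You are a coding agent...",
    "history": [],
    "tools": [{"name": "execute_command"}],
}
payload = build_llm_payload(
    session,
    "Read the CSV file in /workspace and summarize the top 10 rows.",
)
```

Every turn rebuilds this payload from scratch, which is why context grows with each tool call.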

Step 2: Gateway to LLM Provider (200-800ms)

The gateway sends the full context to your configured LLM provider. Anthropic, OpenAI, whatever you set up.

The LLM responds with a tool call. Something like:

{
  "tool": "execute_command",
  "arguments": {
    "command": "head -n 11 /workspace/data.csv"
  }
}

The model decided to run a shell command. The gateway now has to execute it.

Step 3: Gateway to Sandbox (15-50ms)

The gateway talks to Docker to execute the command inside the sandbox container. This is where it gets interesting.

The sandbox is a Docker container with:

  • A mounted workspace directory
  • Network access (by default, unrestricted)
  • Shell access (bash)
  • Whatever packages you've installed

The gateway uses the Docker API to create and start an exec inside this container — the API equivalent of docker exec. The command runs. Output comes back.
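Using the Docker SDK for Python, that exec step might look roughly like this. The container name and workdir are assumptions; OpenClaw's internals may differ:

```python
# Sketch: executing a tool call inside the sandbox container via the
# Docker SDK for Python. "openclaw-sandbox" and /workspace are assumptions.
def build_exec_argv(command):
    # Wrap in bash -lc so pipes, globs, and env expansion behave like a
    # shell, while the whole command stays a single argv entry.
    return ["bash", "-lc", command]

def run_in_sandbox(client, container_name, command):
    container = client.containers.get(container_name)
    # exec_run is the SDK's equivalent of `docker exec`
    result = container.exec_run(build_exec_argv(command), workdir="/workspace")
    return result.exit_code, result.output.decode()

# Usage (requires the `docker` package and a running sandbox container):
#   import docker
#   code, out = run_in_sandbox(docker.from_env(), "openclaw-sandbox",
#                              "head -n 11 /workspace/data.csv")
```

Note that nothing in this call path validates the command itself — whatever the LLM emitted is what runs.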

Running OpenClaw in production? That Docker API call is the most dangerous step in the entire chain. No access controls by default. Deploy securely with Clawctl →

Step 4: Result Back to LLM (200-800ms)

The command output goes back to the LLM. The model reads the CSV output and generates a natural language summary.

Step 5: Response to User (2ms)

The summary streams back over the WebSocket to your browser.

Total time: 400ms to 1.6 seconds. Most of it waiting on the LLM.

The Docker Socket: Where Power Meets Risk

Here's the critical detail most people miss.

The gateway needs Docker access to manage the sandbox. In a default install, that means mounting the Docker socket:

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

This gives the gateway root-equivalent access to your entire Docker host. It can:

  • Start and stop any container
  • Read environment variables from any container
  • Mount any host directory
  • Pull and run arbitrary images

The gateway only needs to manage one sandbox container. But the socket gives it the keys to everything.

This is the single biggest security gap in a default OpenClaw deployment. We wrote about this in detail in our Docker exposure guide.

The Socket Proxy Pattern

The fix is a socket proxy. Instead of mounting the raw socket, you run a proxy container that filters Docker API calls.

User → Gateway → Socket Proxy → Docker Socket
                     ↓
              (only allows API calls
               for this tenant's containers)

The proxy uses regex filtering to only allow operations on containers matching a specific naming pattern. Your gateway can manage its sandbox. Nothing else.
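A minimal sketch of that filtering logic, assuming per-tenant container names like openclaw-&lt;tenant&gt;-sandbox (the naming scheme is an assumption, not a Clawctl spec):

```python
import re

# Sketch of socket-proxy filtering: allow only Docker API paths that
# target this tenant's sandbox container. Naming pattern is an assumption.
def make_filter(tenant):
    # Matches e.g. /v1.43/containers/openclaw-t42-sandbox/exec
    allowed = re.compile(
        rf"^(/v[\d.]+)?/containers/openclaw-{re.escape(tenant)}-sandbox(/|$)"
    )
    def allow(method, path):
        return bool(allowed.match(path))
    return allow

allow = make_filter("t42")
```

Anything outside the pattern — listing all containers, creating new ones, touching another tenant's sandbox — gets a 403 from the proxy instead of reaching the socket.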

The Sandbox: Isolation (Sort Of)

The sandbox container runs your agent's code. It's a standard Docker container with some constraints.

What the sandbox provides:

  • Filesystem isolation. The agent sees /workspace, not your host filesystem.
  • Process isolation. The agent's processes are contained.
  • Resource limits. CPU, memory, and disk can be capped.

What the sandbox doesn't provide by default:

  • Network isolation. The agent can reach any URL on the internet.
  • Egress filtering. No control over what APIs the agent calls.
  • Exec logging. No record of what commands ran, when, or what they returned.
  • Approval workflows. Every command executes immediately. No human review.

An agent that can reach the internet and run arbitrary shell commands is powerful. That's the point. But in production, "powerful" without "controlled" is a liability. For a deeper look at what the sandbox does and doesn't protect, see our sandbox explainer.
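Egress filtering, when you add it, usually means an allowlist enforced at the network layer. Conceptually it reduces to a check like this (a sketch only — real deployments enforce this in a proxy or firewall, and the hostnames here are examples):

```python
from urllib.parse import urlparse

# Conceptual sketch of allowlist-based egress filtering. Real deployments
# enforce this at the network layer, not in application Python.
ALLOWED_HOSTS = {"api.anthropic.com", "pypi.org", "files.pythonhosted.org"}

def egress_allowed(url):
    host = urlparse(url).hostname or ""
    # Exact match, or a subdomain of an allowed host
    return host in ALLOWED_HOSTS or any(
        host.endswith("." + allowed) for allowed in ALLOWED_HOSTS
    )
```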

The LLM Layer: Trust and Tokens

Your API keys for the LLM provider live in the gateway's environment. In a default setup, that's a .env file or docker-compose environment block.

The gateway sends your full conversation context to the LLM on every turn. That includes:

  • System prompts (with any secrets you put there)
  • Full conversation history
  • Tool call results (including command output)

If your agent reads a file containing credentials and the LLM processes that content, those credentials have now been sent to a third-party API.

This isn't a bug. It's how LLM-based agents work. But it's worth knowing. For more on credential security, see our API key leak guide.
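One mitigation is to scrub obvious secrets from tool output before it rejoins the conversation. A rough sketch — the patterns below catch only a few common key shapes and are illustrative, not an OpenClaw feature:

```python
import re

# Sketch: redact common credential shapes from command output before it
# is appended to the conversation. Patterns are illustrative, not exhaustive.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),               # AWS access key IDs
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),          # common API-key prefix
    re.compile(r"(?i)(password|secret)\s*=\s*\S+"),
]

def redact(text):
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Pattern-based redaction is best-effort: it reduces accidental leakage but can't catch every secret format, so it complements rather than replaces keeping credentials out of the workspace.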

Multi-Turn Execution Chains

Here's where things get complex. Most agent tasks aren't single-turn.

The agent might:

  1. List files in the workspace
  2. Read a config file
  3. Install a Python package
  4. Run a script
  5. Parse the output
  6. Call an external API
  7. Write results to a file

Each step is a full loop: gateway → LLM → tool call → sandbox → result → LLM.

Seven turns. Seven shell commands. Seven chances for something unexpected. And in a default install, all seven execute without any human review.
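The loop itself is simple. Here's a sketch with a stubbed LLM standing in for the provider call — function names and message shapes are illustrative, not OpenClaw's internals:

```python
# Sketch of the multi-turn agent loop. The LLM is stubbed; in OpenClaw the
# call goes to your configured provider and the tool call hits the sandbox.
def run_agent(llm, execute_tool, user_message, max_turns=10):
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = llm(history)              # gateway -> LLM
        if reply.get("tool") is None:     # plain text: task is done
            return reply["content"]
        output = execute_tool(reply)      # gateway -> sandbox
        history.append({"role": "tool", "content": output})
    raise RuntimeError("agent did not finish within max_turns")

# Stub LLM: one tool call, then a final answer.
def stub_llm(history):
    if history[-1]["role"] == "tool":
        return {"tool": None, "content": "The CSV has 10 rows of sales data."}
    return {"tool": "execute_command",
            "arguments": {"command": "head -n 11 /workspace/data.csv"}}

answer = run_agent(stub_llm, lambda call: "id,amount\n1,100\n...",
                   "Summarize the CSV")
```

Every pass through that for-loop is another unreviewed command in a default install — an approval hook would sit right before the execute_tool call.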

For a deeper look at setting up approval workflows, see our agent approval workflows guide.

Where the Gaps Are

Here's a summary of what a default OpenClaw install gives you vs. what production requires:

Capability           | Default Install   | Production Requirement
---------------------|-------------------|------------------------
TLS/HTTPS            | No                | Yes
API key encryption   | Plaintext .env    | Encrypted at rest
Docker socket access | Full host access  | Scoped proxy
Network egress       | Unrestricted      | Allowlist-based
Command audit log    | None              | Full trail
Human approval       | None              | Configurable per action
Auto-recovery        | None              | Health checks + restart
Key rotation         | Manual            | Automated
The left column is what you get out of the box. The right column is what your security team (or your customers' security teams) will require. For the dollar cost of bridging this gap yourself, see the true cost of self-hosting OpenClaw.

How Clawctl Fills the Gaps

Clawctl is a managed layer that wraps OpenClaw with production infrastructure.

Docker socket proxy. Per-tenant scoping. Your gateway can only touch its own sandbox. The proxy runs health checks and filters every API call.

Encrypted key storage. API keys are encrypted before they hit disk. Decrypted only at runtime, in memory. Never visible in docker inspect or process listings.

Egress controls. Define which domains your agent can reach. Everything else is blocked.

Audit logging. Every command, every API call, every file read. Timestamped and searchable. When your enterprise customer asks "what did the agent do?", you have the answer.

Human approval flows. Configure which actions need approval. Destructive commands, external API calls, file deletions. The agent pauses and asks before executing.

Auto-recovery. Container health monitoring. Automatic restarts. Escalation if restarts fail. You get paged, not your users.

The architecture is the same. OpenClaw gateway, sandbox, LLM provider. Clawctl adds the layer between them that makes it safe.

Ready to stop worrying?

Clawctl locks down your OpenClaw instance in 60 seconds. Encrypted keys, audit logs, egress controls, human approvals. $49/mo. No contracts. Start now →

This content is for informational purposes only and does not constitute financial, legal, medical, tax, or other professional advice. Individual results vary. See our Terms of Service for important disclaimers.

Done researching? See how the options compare.

Self-hosting, cloud VMs, or managed hosting — we broke down the real costs side by side.