
Enterprise Security Face — code-mode in WasmAgent

Last refreshed: 2026-06-12. Companion to SECURITY.md, docs/kernels/comparison.md, and docs/strategy/2026-06-competitiveness.md.

This document describes the security face an enterprise procurement review will care about: what we control, what the consumer controls, what each kernel actually enforces, and how the OAuth / capability story maps onto Cloudflare's 2026 enterprise-MCP reference architecture without binding you to Cloudflare.

It is deliberately written as a face document — not a marketing piece. Every claim points at a file. Every claim that is not enforced (only honoured best-effort) is labelled as such.


1. Why a separate enterprise face

Three observations from 2026 procurement reviews shape this doc:

  1. MCP without governance is a liability. Forrester's MCP brief (2026-Q2) flags MCP servers as a new trust boundary, not a solved one. Cloudflare's 2026-04/05 enterprise reference architecture treats sandboxed execution, OAuth-stepdown, and centralised policy as the prerequisites — not the cherry on top.
  2. "It runs in WASM" is not a kernel review. The kernel matrix in docs/kernels/comparison.md was written for engineers picking a tier. This doc is the same matrix re-projected onto the questions a security reviewer asks: what isolation, what allow-listing, what failure mode if the guest tries to escape.
  3. Single-tier isolation is rejected by default. A serious review will ask whether you can fall back to a stronger tier for untrusted code or escalating threat. The WasmAgent answer is the three-tier matrix; this doc is what to point at when asked.

2. Three-tier isolation decision tree (security view)

┌──────────────────────────────────────────────────────────────┐
│ Q1. Who supplies the code that will run?                     │
└──────────────────────────────────────────────────────────────┘

        ├── Internal team, signed at build time
        │      → VmKernel (in-process node:vm) is acceptable.
        │        Trust the surrounding process boundary.
        │        File: packages/core/src/executor/JsKernel.ts

        ├── Generated by an LLM call you control
        │      → WASM kernel (QuickJS / Wasmtime / Pyodide).
        │        Language-level isolation; cannot reach process FS
        │        or sockets except through host-injected hooks.
        │        Files: packages/kernel-quickjs/QuickJSKernel.ts
        │               packages/kernel-pyodide/PyodideKernel.ts
        │               packages/kernel-wasmtime/WasmtimeKernel.ts

        └── Untrusted third party (multi-tenant SaaS, public
            playground, processing PII alongside opaque blobs)
            → RemoteSandboxKernel (E2B / Cloudflare Sandbox).
              Full process + filesystem + network isolation; the
              guest is in another machine, not another vm.
              File: packages/kernel-remote/RemoteSandboxKernel.ts

If the answer to Q1 is "I'm not sure," pick the stronger of the tiers you are weighing.
A WASM guest adds roughly 5–30 ms per cold start and is the recommended
default for anything LLM-generated.

KernelFactory.createKernel(tier) chooses the implementation; the same CapabilityManifest is honoured across all three tiers, so the host's allow-list does not change shape when the tier changes.
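
As a rough illustration, the decision tree above can be written as a small pure function. This is a sketch only: `CodeSource`, `Tier`, and `pickTier` are hypothetical names, and the tier strings are placeholders rather than the values `KernelFactory.createKernel` actually accepts. It also assumes that having no information about the code at all should be treated as untrusted.

```typescript
// Hypothetical types for illustration; not part of the WasmAgent API.
type CodeSource = "internal-signed" | "llm-generated" | "untrusted-third-party";
type Tier = "vm" | "wasm" | "remote";

// Tiers ordered from weakest to strongest isolation so they can be compared.
const STRENGTH: Record<Tier, number> = { vm: 0, wasm: 1, remote: 2 };

// One entry per branch of the Q1 decision tree.
const TIER_FOR: Record<CodeSource, Tier> = {
  "internal-signed": "vm",
  "llm-generated": "wasm",
  "untrusted-third-party": "remote",
};

// Takes every answer to Q1 that is still plausible and returns the strongest
// tier any of them requires ("if unsure, pick the stronger tier").
// Assumption: an empty list means nothing is known, so use the strongest tier.
function pickTier(plausibleSources: CodeSource[]): Tier {
  if (plausibleSources.length === 0) return "remote";
  return plausibleSources
    .map((source) => TIER_FOR[source])
    .reduce((a, b) => (STRENGTH[b] > STRENGTH[a] ? b : a));
}
```

A caller unsure whether a snippet is internal or LLM-generated would pass both candidates and get the WASM tier back.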


3. CapabilityManifest — the cross-kernel policy face

packages/core/src/executor/types.ts:41 defines the manifest. Every kernel run accepts a Partial<CapabilityManifest>, defaulting to deny for every field. The fields and the enforcement matrix:

| Field | Meaning | VmKernel | QuickJSKernel | WasmtimeKernel | Pyodide | RemoteSandbox |
| --- | --- | --- | --- | --- | --- | --- |
| allowedHosts | Glob list of outbound HTTP hosts | enforced (fetch wrapper) | enforced (fetch wrapper) | enforced (WASI socket gate) | enforced (fetch wrapper) | enforced (sandbox network policy) |
| allowedReadPaths | Path prefixes the guest may read | enforced | n/a (no FS) | enforced (WASI preopens) | n/a | enforced (sandbox FS) |
| allowedWritePaths | Path prefixes the guest may write | enforced | n/a (no FS) | enforced (WASI preopens) | n/a | enforced (sandbox FS) |
| extraCapabilities | Named hooks (tool:web_search, …) | enforced | enforced | enforced | enforced | enforced |
| env | Allow-list of env values exposed as __env__ | enforced | enforced | enforced | enforced | enforced |
| cpuMs | Hard wall-clock ceiling per run() | enforced | enforced | enforced | best-effort | enforced |
| memoryLimitBytes | Soft memory ceiling | best-effort | enforced | enforced | best-effort | enforced |

enforced means the kernel rejects the operation; best-effort means the kernel records the limit but cannot trip on it (a runtime warning is logged so the caller is not silently misled — see the constructor docstring in types.ts). A reviewer should note which tier they are deploying and treat best-effort cells accordingly.
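
One way a reviewer might use the matrix is to encode it as data and ask which of the fields a deployment depends on are not enforced on the chosen kernel. The sketch below transcribes the table by hand; `MATRIX`, `row`, and `fieldsNotEnforced` are hypothetical names, and nothing here reads the real definitions in types.ts.

```typescript
type Kernel = "VmKernel" | "QuickJSKernel" | "WasmtimeKernel" | "Pyodide" | "RemoteSandbox";
type Field =
  | "allowedHosts" | "allowedReadPaths" | "allowedWritePaths"
  | "extraCapabilities" | "env" | "cpuMs" | "memoryLimitBytes";
type Level = "enforced" | "best-effort" | "n/a";

// One table row; arguments follow the column order of the matrix above.
function row(vm: Level, quickjs: Level, wasmtime: Level, pyodide: Level, remote: Level): Record<Kernel, Level> {
  return { VmKernel: vm, QuickJSKernel: quickjs, WasmtimeKernel: wasmtime, Pyodide: pyodide, RemoteSandbox: remote };
}

const E: Level = "enforced";
const B: Level = "best-effort";
const NA: Level = "n/a";

// Hand-transcribed copy of the enforcement matrix (illustrative only).
const MATRIX: Record<Field, Record<Kernel, Level>> = {
  allowedHosts:      row(E, E, E, E, E),
  allowedReadPaths:  row(E, NA, E, NA, E),
  allowedWritePaths: row(E, NA, E, NA, E),
  extraCapabilities: row(E, E, E, E, E),
  env:               row(E, E, E, E, E),
  cpuMs:             row(E, E, E, B, E),
  memoryLimitBytes:  row(B, E, E, B, E),
};

// Returns the needed fields that the kernel does not strictly enforce
// (either best-effort or not applicable on that kernel).
function fieldsNotEnforced(kernel: Kernel, needed: Field[]): Field[] {
  return needed.filter((field) => MATRIX[field][kernel] !== "enforced");
}
```

For example, a deployment on Pyodide that relies on `cpuMs` and `memoryLimitBytes` would get both fields back, which is the signal to reconsider the tier or add an outer limit.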

Defence-in-depth on cpuMs: the kernel's own KernelOptions.timeoutMs and the manifest's cpuMs both apply, with the lower value winning. A host can pin per-tool-call limits without owning kernel construction.
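
The "lower value wins" rule can be sketched as follows. `effectiveTimeoutMs` is a hypothetical helper, not the kernel's actual code, and its handling of unset values is an assumption: if only one limit is set it applies on its own, and if neither is set the function returns `undefined`.

```typescript
// Combines the kernel-level timeout with the manifest's cpuMs.
// Assumption: an unset limit does not constrain the other one.
function effectiveTimeoutMs(kernelTimeoutMs?: number, manifestCpuMs?: number): number | undefined {
  const limits = [kernelTimeoutMs, manifestCpuMs].filter(
    (value): value is number => typeof value === "number",
  );
  // The lower of the configured limits wins; no limits means no ceiling here.
  return limits.length > 0 ? Math.min(...limits) : undefined;
}
```

With a kernel constructed at `timeoutMs: 10_000`, a per-call manifest carrying `cpuMs: 5_000` would tighten the ceiling to five seconds, while a manifest asking for more than ten seconds could not loosen it.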


4. OAuth-stepdown for code-mode MCP

Cloudflare's enterprise MCP reference architecture (2026-05) requires the MCP server to step down the user's full identity to a scoped agent identity before any tool execution. The WasmAgent code-mode MCP server (packages/mcp-server/src/codeMode.ts) does not bind you to Cloudflare's OAuth implementation, but it accepts the same shape.

The minimal integration looks like this — pseudocode is in the packages/mcp-server README; the live source is at packages/mcp-server/src/fetchHandler.ts:

```ts
// Pseudo — adapt to your OAuth resolver of choice.
const principal = await yourOauthResolver.exchange(req);
//   ↳ returns: { sub, scopes: [...], on_behalf_of: '<user@org>' }

const manifest: CapabilityManifest = {
  allowedHosts:        principal.scopes.includes("net.read")  ? ["api.example.com"] : [],
  allowedReadPaths:    principal.scopes.includes("fs.read")   ? ["/workspace"]      : [],
  allowedWritePaths:   principal.scopes.includes("fs.write")  ? ["/workspace/out"]  : [],
  extraCapabilities:   principal.scopes.filter(s => s.startsWith("tool:")),
  env: { TENANT_ID: principal.sub },
  cpuMs: 5_000,
  memoryLimitBytes: 64 * 1024 * 1024,
};

const result = await codeModeServer.executeCode(req.body.code, { capabilities: manifest });
```

The point: the OAuth principal turns into a CapabilityManifest once, and from there the same enforcement matrix in §3 applies no matter which kernel runs the code. The manifest is the policy face; the kernel is the enforcement face. They compose; neither one alone is enough.
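
Because every kernel run takes a Partial<CapabilityManifest> and defaults each field to deny (§3), a step-down that omits a field should fail closed. A minimal sketch of that normalisation, under stated assumptions: `CapabilityManifestSketch` is a local stand-in for the real type in types.ts, `withDenyDefaults` is a hypothetical helper, and "deny" is taken to mean an empty list or empty record. The two numeric limits are left optional because the source does not say what their defaults are.

```typescript
// Local stand-in for the real CapabilityManifest; field names follow §3.
interface CapabilityManifestSketch {
  allowedHosts: string[];
  allowedReadPaths: string[];
  allowedWritePaths: string[];
  extraCapabilities: string[];
  env: Record<string, string>;
  cpuMs?: number;
  memoryLimitBytes?: number;
}

// Fills every omitted (or explicitly undefined) field with its deny value.
function withDenyDefaults(partial: Partial<CapabilityManifestSketch>): CapabilityManifestSketch {
  return {
    allowedHosts: partial.allowedHosts ?? [],
    allowedReadPaths: partial.allowedReadPaths ?? [],
    allowedWritePaths: partial.allowedWritePaths ?? [],
    extraCapabilities: partial.extraCapabilities ?? [],
    env: partial.env ?? {},
    cpuMs: partial.cpuMs,
    memoryLimitBytes: partial.memoryLimitBytes,
  };
}
```

Under this reading, a principal that carries only the `net.read` scope ends up with one allowed host and no filesystem access, no hooks, and no environment values.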


5. Failure-mode summary (what does each tier do when escape is attempted?)

| Threat | VmKernel response | WASM kernel response | RemoteSandbox response |
| --- | --- | --- | --- |
| Guest tries to require('fs') | Throws: no Node bindings exposed | Throws: capability not in WASI | Confined to sandbox FS only |
| Guest fetches a non-allowlisted host | Wrapper rejects | Wrapper / WASI socket gate rejects | Sandbox network policy rejects |
| Guest spawns a child process | Throws: no child_process | Throws: no syscall | Spawns inside sandbox (still confined) |
| Guest exhausts CPU | Watchdog kills run | Watchdog kills run | Sandbox kills the guest |
| Guest exhausts memory | Process pressure (best-effort) | Hard limit | Sandbox OOM-kills the guest |
| Guest tries Spectre-style timing attack | Same process; assume reachable | Different vm, but same process; assume some reachability | Different machine; practically infeasible |

For the last row in particular: VmKernel and WASM kernels share the host process, so a sufficiently advanced timing attack on shared L1/L2 cache is in scope of the threat model. If your review includes that attack class, the answer is RemoteSandbox.
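
To make the "wrapper rejects" cell in the table concrete, here is a sketch of the kind of check a fetch wrapper could run against `allowedHosts` before letting a request out. It is not the project's implementation: `globToRegExp` and `isHostAllowed` are hypothetical, and the glob semantics (`*` matches any run of characters, matching is case-insensitive and anchored to the whole hostname) are assumptions, since the source only says the field is a glob list.

```typescript
// Converts a host glob into an anchored, case-insensitive regular expression.
// Regex metacharacters are escaped first so "." only matches a literal dot.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Default deny: an empty allow-list rejects every hostname.
function isHostAllowed(hostname: string, allowedHosts: string[]): boolean {
  return allowedHosts.some((glob) => globToRegExp(glob).test(hostname));
}

// A wrapper would call this before forwarding the request and throw otherwise.
function assertHostAllowed(hostname: string, allowedHosts: string[]): void {
  if (!isHostAllowed(hostname, allowedHosts)) {
    throw new Error(`host not in allowedHosts: ${hostname}`);
  }
}
```

Anchoring the pattern matters: without it, an entry such as `api.example.com` would also admit `api.example.com.attacker.test`.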


6. Disclosure SLA and audit posture

  • Sandbox-escape disclosure SLA: see SECURITY.md. This is a written floor; updates to it land via PR.
  • Public security audit: not yet performed. This is an open item the strategy memo flags as a funding-dependent action (D4 in the optimization plan). Procurement reviews that require an external audit should treat this as a known gap and weight accordingly.
  • Bounty program: not yet stood up. Tracked as part of D4.

The honest summary: the enforcement is real, the matrix is testable (packages/core/src/executor/capabilities.test.ts), and there is no third-party audit yet. We do not paper over that last point.


7. What an enterprise review should ask us

If you are evaluating WasmAgent for a security-sensitive deployment, here is the question set that maps to a clear yes/no answer:

  1. Which kernel tier do we plan to use, and is it appropriate for the trust level of the code we will run? — §2 decision tree.
  2. Is every capability we depend on enforced rather than best-effort on that tier? — §3 matrix.
  3. Does our MCP integration step down the OAuth principal into a per-call CapabilityManifest before any code runs? — §4 pattern.
  4. Have we defined the response we expect on each of the failure modes in §5? — failure-mode table.
  5. Are we comfortable deploying without an external audit, or should we co-fund one? — §6 honest gap.

Open an issue tagged security:review if any of those questions have unclear answers in your context, and we'll tighten the doc.

Released under the Apache-2.0 License.