Enterprise Security Face — code-mode in WasmAgent

Last refreshed: 2026-06-12. Companion to SECURITY.md, docs/kernels/comparison.md, and docs/strategy/2026-06-competitiveness.md.

This document describes the security face an enterprise procurement review will care about: what we control, what the consumer controls, what each kernel actually enforces, and how the OAuth / capability story maps onto Cloudflare's 2026 enterprise-MCP reference architecture without binding you to Cloudflare.

It is deliberately written as a face document — not a marketing piece. Every claim points at a file. Every claim that is not enforced (only honoured best-effort) is labelled as such.

1. Why a separate enterprise face

Three observations from 2026 procurement reviews shape this doc:

MCP without governance is a liability. Forrester's MCP brief (2026-Q2) flags MCP servers as a new trust boundary, not a solved one. Cloudflare's 2026-04/05 enterprise reference architecture treats sandboxed execution, OAuth-stepdown, and centralised policy as the prerequisites — not the cherry on top.
"It runs in WASM" is not a kernel review. The kernel matrix in docs/kernels/comparison.md was written for engineers picking a tier. This doc is the same matrix re-projected onto the questions a security reviewer asks: what isolation, what allow-listing, what failure mode if the guest tries to escape.
Single-tier isolation is rejected by default. A serious review will ask whether you can fall back to a stronger tier for untrusted code or when the threat level escalates. The WasmAgent answer is the three-tier matrix; this doc is what to point at when asked.

2. Three-tier isolation decision tree (security view)

┌──────────────────────────────────────────────────────────────┐
│ Q1. Who supplies the code that will run?                     │
└──────────────────────────────────────────────────────────────┘
        │
        ├── Internal team, signed at build time
        │      → VmKernel (in-process node:vm) is acceptable.
        │        Trust the surrounding process boundary.
        │        File: packages/core/src/executor/JsKernel.ts
        │
        ├── Generated by an LLM call you control
        │      → WASM kernel (QuickJS / Wasmtime / Pyodide).
        │        Language-level isolation; cannot reach process FS
        │        or sockets except through host-injected hooks.
        │        Files: packages/kernel-quickjs/QuickJSKernel.ts
        │               packages/kernel-pyodide/PyodideKernel.ts
        │               packages/kernel-wasmtime/WasmtimeKernel.ts
        │
        └── Untrusted third party (multi-tenant SaaS, public
            playground, processing PII alongside opaque blobs)
            → RemoteSandboxKernel (E2B / Cloudflare Sandbox).
              Full process + filesystem + network isolation; the
              guest is in another machine, not another vm.
              File: packages/kernel-remote/RemoteSandboxKernel.ts

If the answer to Q1 is "I'm not sure," pick the stronger tier.
A WASM guest costs ~5–30ms per cold-start and is the recommended
default for anything LLM-generated.

KernelFactory.createKernel(tier) chooses the implementation; the same CapabilityManifest is honoured across all three tiers, so the host's allow-list does not change shape when the tier changes.
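
The decision tree can be expressed as a small lookup that a host runs before asking the factory for a kernel. This is a minimal sketch, not the WasmAgent API: `CodeSource`, `KernelTier`, `pickTier`, and the tier strings `"vm"`, `"wasm"`, `"remote"` are hypothetical names chosen for illustration, and the values that `KernelFactory.createKernel` actually accepts may differ.

```typescript
// Illustrative names only: CodeSource, KernelTier and pickTier are not
// taken from the WasmAgent source.
type CodeSource = "internal-signed" | "llm-generated" | "untrusted-third-party";
type KernelTier = "vm" | "wasm" | "remote";

// Q1 answer -> weakest tier the decision tree accepts for that answer.
const TIER_FOR_SOURCE: Record<CodeSource, KernelTier> = {
  "internal-signed": "vm",
  "llm-generated": "wasm",
  "untrusted-third-party": "remote",
};

// Higher number = stronger isolation.
const STRENGTH: Record<KernelTier, number> = { vm: 0, wasm: 1, remote: 2 };

// Pass every answer to Q1 that is still plausible. When more than one is,
// the strongest matching tier wins; with no information at all, fall back
// to the strongest tier.
function pickTier(plausibleSources: readonly CodeSource[]): KernelTier {
  let chosen: KernelTier | undefined;
  for (const source of plausibleSources) {
    const tier = TIER_FOR_SOURCE[source];
    if (chosen === undefined || STRENGTH[tier] > STRENGTH[chosen]) {
      chosen = tier;
    }
  }
  return chosen ?? "remote";
}

const signedOnly = pickTier(["internal-signed"]);                // "vm"
const notSure = pickTier(["internal-signed", "llm-generated"]);  // "wasm"
```

Taking a list of plausible answers rather than a single one is a design choice that makes the "I'm not sure" rule mechanical: uncertainty between two sources always resolves to the stronger tier.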

3. CapabilityManifest — the cross-kernel policy face

packages/core/src/executor/types.ts:41 defines the manifest. Every kernel run accepts a Partial<CapabilityManifest>, defaulting to deny for every field. The fields and the enforcement matrix:

| Field | Meaning | VmKernel | QuickJSKernel | WasmtimeKernel | Pyodide | RemoteSandbox |
| --- | --- | --- | --- | --- | --- | --- |
| `allowedHosts` | Glob list of outbound HTTP hosts | enforced (fetch wrapper) | enforced (fetch wrapper) | enforced (WASI socket gate) | enforced (fetch wrapper) | enforced (sandbox network policy) |
| `allowedReadPaths` | Path prefixes the guest may read | enforced | n/a (no FS) | enforced (WASI preopens) | n/a | enforced (sandbox FS) |
| `allowedWritePaths` | Path prefixes the guest may write | enforced | n/a (no FS) | enforced (WASI preopens) | n/a | enforced (sandbox FS) |
| `extraCapabilities` | Named hooks (`tool:web_search`, …) | enforced | enforced | enforced | enforced | enforced |
| `env` | Allow-list of env values exposed as `__env__` | enforced | enforced | enforced | enforced | enforced |
| `cpuMs` | Hard wall-clock ceiling per `run()` | enforced | enforced | enforced | best-effort | enforced |
| `memoryLimitBytes` | Soft memory ceiling | best-effort | enforced | enforced | best-effort | enforced |

enforced means the kernel rejects the operation; best-effort means the kernel records the limit but cannot trip on it (a runtime warning is logged so the caller is not silently misled; see the constructor docstring in types.ts); n/a means the kernel exposes no such resource (for example, no filesystem), so there is nothing to grant. A reviewer should note which tier they are deploying and treat best-effort cells accordingly.
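
A reviewer can turn the matrix into a pre-deployment check that lists every required field that is not enforced on the chosen kernel. The sketch below hard-codes the cells from the table above; the type names and the `notEnforced` helper are hypothetical and do not exist in the WasmAgent packages, and the table itself remains the source of truth if the two ever disagree.

```typescript
// Column and row names copied from the enforcement matrix.
type Kernel = "VmKernel" | "QuickJSKernel" | "WasmtimeKernel" | "Pyodide" | "RemoteSandbox";
type Field =
  | "allowedHosts" | "allowedReadPaths" | "allowedWritePaths"
  | "extraCapabilities" | "env" | "cpuMs" | "memoryLimitBytes";
type Level = "enforced" | "best-effort" | "n/a";

// Only the cells that are NOT "enforced" are listed; every other cell
// in the matrix reads "enforced".
const EXCEPTIONS: Partial<Record<Kernel, Partial<Record<Field, Level>>>> = {
  VmKernel: { memoryLimitBytes: "best-effort" },
  QuickJSKernel: { allowedReadPaths: "n/a", allowedWritePaths: "n/a" },
  Pyodide: {
    allowedReadPaths: "n/a",
    allowedWritePaths: "n/a",
    cpuMs: "best-effort",
    memoryLimitBytes: "best-effort",
  },
};

function levelFor(kernel: Kernel, field: Field): Level {
  return EXCEPTIONS[kernel]?.[field] ?? "enforced";
}

// Returns the required fields the given kernel cannot actually enforce.
function notEnforced(kernel: Kernel, required: readonly Field[]): Field[] {
  return required.filter((field) => levelFor(kernel, field) !== "enforced");
}

const gaps = notEnforced("Pyodide", ["allowedHosts", "cpuMs", "memoryLimitBytes"]);
// gaps is ["cpuMs", "memoryLimitBytes"]
```

An empty result means every capability the deployment depends on is enforced on that tier; a non-empty result is the list to raise in review.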

Defence-in-depth on cpuMs: the kernel's own KernelOptions.timeoutMs and the manifest's cpuMs both apply, with the lower value winning. A host can pin per-tool-call limits without owning kernel construction.
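
The "lower value wins" rule amounts to taking the minimum of whichever limits are present. A minimal sketch, assuming both values are optional numbers in milliseconds; `effectiveTimeoutMs` is a hypothetical helper, and how the real kernels treat a missing value is not stated here, so returning `undefined` when neither limit is set is an assumption.

```typescript
// Hypothetical helper: combines the kernel-level timeout with the
// manifest-level cpuMs. The smaller of the two limits applies.
function effectiveTimeoutMs(
  kernelTimeoutMs?: number,
  manifestCpuMs?: number,
): number | undefined {
  const limits = [kernelTimeoutMs, manifestCpuMs].filter(
    (value): value is number => typeof value === "number",
  );
  // Assumption: with no limit from either side there is nothing to combine.
  return limits.length > 0 ? Math.min(...limits) : undefined;
}

const perCall = effectiveTimeoutMs(30_000, 5_000);  // 5000: the manifest is stricter
const perKernel = effectiveTimeoutMs(2_000, 5_000); // 2000: the kernel is stricter
```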

4. OAuth-stepdown for code-mode MCP

Cloudflare's enterprise MCP reference architecture (2026-05) requires the MCP server to step down the user's full identity to a scoped agent identity before any tool execution. The WasmAgent code-mode MCP server (packages/mcp-server/src/codeMode.ts) does not bind you to Cloudflare's OAuth implementation, but it accepts the same shape.

The minimal integration looks like the pseudocode below, which also appears in the packages/mcp-server README; the live source is at packages/mcp-server/src/fetchHandler.ts:

// Pseudo — adapt to your OAuth resolver of choice.
const principal = await yourOauthResolver.exchange(req);
//   ↳ returns: { sub, scopes: [...], on_behalf_of: '<user@org>' }

const manifest: CapabilityManifest = {
  allowedHosts:        principal.scopes.includes("net.read")  ? ["api.example.com"] : [],
  allowedReadPaths:    principal.scopes.includes("fs.read")   ? ["/workspace"]      : [],
  allowedWritePaths:   principal.scopes.includes("fs.write")  ? ["/workspace/out"]  : [],
  extraCapabilities:   principal.scopes.filter(s => s.startsWith("tool:")),
  env: { TENANT_ID: principal.sub },
  cpuMs: 5_000,
  memoryLimitBytes: 64 * 1024 * 1024,
};

const result = await codeModeServer.executeCode(req.body.code, { capabilities: manifest });

The point: the OAuth principal turns into a CapabilityManifest once, and from there the same enforcement matrix in §3 applies no matter which kernel runs the code. The manifest is the policy face; the kernel is the enforcement face. They compose; neither one alone is enough.
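
Because §3 describes every field as defaulting to deny, a host could also build only the fields a principal was granted and let a normaliser fill in the rest. The sketch below shows that idea under stated assumptions: `AccessManifest` is a local stand-in for the list and map fields of `CapabilityManifest` (the real type lives in packages/core/src/executor/types.ts and may differ), `withDenyDefaults` is a hypothetical helper, and the numeric limits are left out because this document does not state what their defaults are.

```typescript
// Local stand-in for illustration; field types are assumed, not copied
// from the WasmAgent source.
interface AccessManifest {
  allowedHosts: string[];
  allowedReadPaths: string[];
  allowedWritePaths: string[];
  extraCapabilities: string[];
  env: Record<string, string>;
}

// Any field the caller did not set becomes empty, i.e. nothing is granted.
// Arrays and the env map are copied so later mutation of the input cannot
// widen the manifest after the fact.
function withDenyDefaults(partial: Partial<AccessManifest>): AccessManifest {
  return {
    allowedHosts: [...(partial.allowedHosts ?? [])],
    allowedReadPaths: [...(partial.allowedReadPaths ?? [])],
    allowedWritePaths: [...(partial.allowedWritePaths ?? [])],
    extraCapabilities: [...(partial.extraCapabilities ?? [])],
    env: { ...(partial.env ?? {}) },
  };
}

// A principal granted only network read access ends up with every
// filesystem and tool field empty.
const netOnly = withDenyDefaults({ allowedHosts: ["api.example.com"] });
```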

5. Failure-mode summary (what does each tier do when escape is attempted?)

| Threat | VmKernel response | WASM kernel response | RemoteSandbox response |
| --- | --- | --- | --- |
| Guest tries to require('fs') | Throws: no Node bindings exposed | Throws: capability not in WASI | Confined to sandbox FS only |
| Guest fetches a non-allowlisted host | Wrapper rejects | Wrapper / WASI socket gate rejects | Sandbox network policy rejects |
| Guest spawns a child process | Throws: no `child_process` | Throws: no syscall | Spawns inside sandbox (still confined) |
| Guest exhausts CPU | Watchdog kills run | Watchdog kills run | Sandbox kills the guest |
| Guest exhausts memory | Process pressure (best-effort) | Hard limit | Sandbox OOM-kills the guest |
| Guest tries Spectre-style timing attack | Same process; assume reachable | Different vm, but same process; assume some reachability | Different machine; practically infeasible |

For the last row in particular: VmKernel and WASM kernels share the host process, so a sufficiently advanced timing attack on shared L1/L2 cache is in scope of the threat model. If your review includes that attack class, the answer is RemoteSandbox.
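
For the "non-allowlisted host" row, the check a fetch wrapper performs can be sketched as a hostname match against the `allowedHosts` globs. This is an illustration of the technique, not the wrapper the kernels ship: `globToRegExp` and `isHostAllowed` are hypothetical names, and the glob semantics (a `*` matches exactly one DNS label) are an assumption, since this document does not define them.

```typescript
// Converts a host glob to an anchored RegExp. Assumed semantics: "*"
// matches one DNS label (one or more characters other than "."), and
// every other character is matched literally.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .split("*")
    .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]+");
  return new RegExp(`^${pattern}$`, "i");
}

// Returns true only when the URL parses and its hostname matches at least
// one glob. Anything unparseable is rejected, and an empty allow-list
// rejects everything, in line with the default-deny stance in §3.
function isHostAllowed(url: string, allowedHosts: readonly string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false;
  }
  return allowedHosts.some((glob) => globToRegExp(glob).test(hostname));
}

const ok = isHostAllowed("https://api.example.com/v1", ["api.example.com"]);        // true
const spoof = isHostAllowed("https://api.example.com.evil.test/", ["api.example.com"]); // false
```

Anchoring the pattern at both ends is what stops a look-alike host such as `api.example.com.evil.test` from passing on a prefix match.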

6. Disclosure SLA and audit posture

Sandbox-escape disclosure SLA — see SECURITY.md. Written floor; updates to it land via PR.
Public security audit — not yet performed. This is an open item the strategy memo flags as a funding-dependent action (D4 in the optimization plan). Procurement reviews that require an external audit should treat this as a known gap and weight accordingly.
Bounty program — not yet stood up. Tracked as part of D4.

The honest summary: the enforcement is real, the matrix is testable (packages/core/src/executor/capabilities.test.ts), and there is no third-party audit yet. We do not paper over that last point.

7. What an enterprise review should ask us

If you are evaluating WasmAgent for a security-sensitive deployment, here is the question set that maps to a clear yes/no answer:

Which kernel tier do we plan to use, and is it appropriate for the trust level of the code we will run? — §2 decision tree.
Is every capability we depend on enforced rather than best-effort on that tier? — §3 matrix.
Does our MCP integration step down the OAuth principal into a per-call CapabilityManifest before any code runs? — §4 pattern.
Have we defined the response we expect on each of the failure modes in §5? — failure-mode table.
Are we comfortable deploying without an external audit, or should we co-fund one? — §6 honest gap.

Open an issue tagged security:review if any of those questions have unclear answers in your context, and we'll tighten the doc.
