Fast-spawning compute for the agent era

High-throughput runtime infrastructure for AI agents.

vmyard turns agent work into isolated, stateful sandboxes that spawn, wake, and disappear at fleet speed. Code agents, browser agents, workflow agents, and tool-using agents get runnable capacity without waiting on service-era orchestration.

Why not Kubernetes

Kubernetes schedules services that start once and run for months. Agents are the opposite workload: they arrive in bursts, need isolated filesystems, browsers, tools, secrets, and network policy, then pause, resume, fork, or disappear as the task changes. Pushed through pod machinery, every transition pays for API round-trips, scheduling passes, kubelet reconciliation, and network setup that were priced for workloads changing state a few times a day.

Waiting for Kubernetes to grow into this is a bet against its own architecture: a control plane built on slowly converging desired state cannot become the fast path for millions of short-lived, stateful agent runtimes. Enterprise agents need a scheduler built around spawn, wake, isolation, and teardown from day one.

Agent lifecycle

| Phase | What happens |
| --- | --- |
| spawn | Create an isolated runtime with workspace, tools, network policy, and identity. |
| place | Pick a host. Bin-pack for density while reserving capacity for wakes. |
| run | The agent executes code, drives browsers, calls tools, or runs workflow steps. |
| sleep | Freeze state and release hot resources while preserving the agent's context. |
| wake | Return the agent to runnable state near its snapshot and cached data. |
| destroy | Reclaim compute, storage, credentials, and network state cleanly. |
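
The phases above read naturally as a small state machine. The sketch below is illustrative only: the source names the phases but not the legal transitions, so the `TRANSITIONS` graph, the `Agent` class, and the `advance` method are assumptions, not vmyard's API.

```python
from enum import Enum


class Phase(Enum):
    SPAWN = "spawn"
    PLACE = "place"
    RUN = "run"
    SLEEP = "sleep"
    WAKE = "wake"
    DESTROY = "destroy"


# Assumed transition graph: the phase names come from the lifecycle list,
# the edges between them are a guess made for illustration.
TRANSITIONS = {
    Phase.SPAWN: {Phase.PLACE, Phase.DESTROY},
    Phase.PLACE: {Phase.RUN, Phase.DESTROY},
    Phase.RUN: {Phase.SLEEP, Phase.DESTROY},
    Phase.SLEEP: {Phase.WAKE, Phase.DESTROY},
    Phase.WAKE: {Phase.RUN, Phase.DESTROY},
    Phase.DESTROY: set(),
}


class Agent:
    """Hypothetical agent record that only tracks its lifecycle phase."""

    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.phase = Phase.SPAWN
        self.history = [Phase.SPAWN]

    def advance(self, target):
        # Reject any move that the assumed graph does not allow.
        if target not in TRANSITIONS[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.value} -> {target.value}"
            )
        self.phase = target
        self.history.append(target)
```

An agent that runs, sleeps once, and is then torn down would pass through `place`, `run`, `sleep`, `wake`, `run`, and `destroy` in that order; any further `advance` call after `destroy` raises `ValueError`.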

Runtime boundary

Agents may execute generated code, browse the open web, and touch enterprise systems, so isolation is not decoration. vmyard treats the sandbox boundary as infrastructure: the scheduler sees capacity, constraints, lifecycle verbs, and restore cost; drivers own the host dataplane underneath it.

| Driver | Isolation | Status |
| --- | --- | --- |
| firecracker | microVM | production path |
| cloud hypervisor | microVM | planned |
| docker | container | development and CI |
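
One way to picture the split between scheduler and driver is a narrow interface that each driver implements. The sketch below is a guess at the shape of such a contract, not the project's actual one: the `Driver` protocol, its method names and parameters, and `run_once` are all hypothetical. The no-op driver mirrors the idea of measuring scheduler overhead with no host work behind it.

```python
from typing import Protocol


class Driver(Protocol):
    """Hypothetical driver contract: lifecycle verbs only, no scheduling."""

    def spawn(self, agent_id: str, host: str) -> None: ...
    def sleep(self, agent_id: str) -> None: ...
    def wake(self, agent_id: str, host: str) -> None: ...
    def destroy(self, agent_id: str) -> None: ...


class NoopDriver:
    """Records each call and does no host work."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def spawn(self, agent_id: str, host: str) -> None:
        self.calls.append(("spawn", agent_id))

    def sleep(self, agent_id: str) -> None:
        self.calls.append(("sleep", agent_id))

    def wake(self, agent_id: str, host: str) -> None:
        self.calls.append(("wake", agent_id))

    def destroy(self, agent_id: str) -> None:
        self.calls.append(("destroy", agent_id))


def run_once(driver: Driver, agent_id: str, host: str) -> None:
    # The caller sees only lifecycle verbs; how a microVM or container
    # is actually created stays behind the driver.
    driver.spawn(agent_id, host)
    driver.sleep(agent_id)
    driver.wake(agent_id, host)
    driver.destroy(agent_id)
```

Under this assumption, swapping `NoopDriver` for a container or microVM driver would change what each verb costs without changing the calling code.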

Benchmarks

Every number is reproducible from the repository with one command; each benchmark runs from a fixed seed and reports latency percentiles next to throughput. The point is not only scheduling speed; it is time to runnable agent capacity under burst. Measured 2026-06-12; machines noted per table.
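
As a rough illustration of what "fixed seed, percentiles next to throughput" can look like, here is a minimal harness. It is not the repository's benchmark command: `bench`, `percentile`, the default seed, and the nearest-rank percentile method are all choices made for this sketch.

```python
import math
import random
import time


def percentile(sorted_samples, p):
    """Nearest-rank percentile of an already sorted list; p is in (0, 100]."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def bench(op, n, seed=42):
    """Time n calls of op and report throughput with latency percentiles."""
    rng = random.Random(seed)  # fixed seed: same input sequence on every run
    inputs = [rng.random() for _ in range(n)]
    samples = []
    start = time.perf_counter_ns()
    for x in inputs:
        t0 = time.perf_counter_ns()
        op(x)
        samples.append(time.perf_counter_ns() - t0)
    elapsed_ns = max(time.perf_counter_ns() - start, 1)  # avoid divide by zero
    samples.sort()
    return {
        "throughput_per_s": n / (elapsed_ns / 1e9),
        "p50_ns": percentile(samples, 50),
        "p99_ns": percentile(samples, 99),
    }
```

Calling `bench(some_operation, 100_000)` returns a dictionary with the three figures. Timings will vary by machine; the fixed seed only makes the inputs repeatable, not the clock.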

Agent placement decisions - single thread, Apple M5 laptop, fleet of 32-core/128 GiB hosts

| Fleet | Sustained churn | Decision p50 | Decision p99 | Wake locality |
| --- | --- | --- | --- | --- |
| 100 hosts | 7,433k placements/s | 208 ns | 334 ns | 98.1% local |
| 1,000 hosts | 912k placements/s | 1.7 µs | 2.3 µs | 98.2% local |
| 10,000 hosts | 88k placements/s | 17.6 µs | 22.0 µs | 98.3% local |

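Placement is a deliberate linear scan, and the table above is its published cost: decision p50 grows roughly tenfold for each tenfold increase in fleet size, or about 2 ns per host. Wake locality holds at ~98% at every fleet size because it depends on the per-host wake reserve, not on scale.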
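
A linear scan with a per-host wake reserve can be sketched in a few lines. This is a simplified model under stated assumptions, not the scheduler's code: the `Host` fields, the `wake_reserve` size of 4 cores, the best-fit tie-breaking, and the `place` signature are all hypothetical, and only CPU cores are tracked.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cores: int = 32        # matches the 32-core hosts in the benchmark fleet
    used: int = 0
    wake_reserve: int = 4  # hypothetical: cores that only wakes may consume

    def free_for(self, is_wake: bool) -> int:
        free = self.cores - self.used
        # New spawns cannot dip into the reserve; wakes can.
        return free if is_wake else free - self.wake_reserve


def place(hosts, cores, is_wake=False, preferred=None):
    """Pick a host in one pass over the fleet; return it, or None if full."""
    # A wake first tries the host that holds its snapshot (wake locality).
    if is_wake and preferred is not None and preferred.free_for(True) >= cores:
        preferred.used += cores
        return preferred
    best = None
    for h in hosts:  # linear scan: cost grows with the number of hosts
        free = h.free_for(is_wake)
        if free >= cores and (best is None or free < best.free_for(is_wake)):
            best = h  # best fit: the tightest host that still fits
    if best is not None:
        best.used += cores
    return best
```

In this model, a host filled to 28 of 32 cores refuses further spawns but still accepts a 4-core wake for an agent whose snapshot lives there, which is why locality would depend on the reserve rather than on fleet size.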

Agent time to runnable - through the Docker dev driver, real containers, CI runner

| Mode | Spawn p50 | Spawn p95 | Wake p50 | Wake p95 |
| --- | --- | --- | --- | --- |
| serial, 30 containers | 126 ms | 147 ms | 18.1 ms | 19.7 ms |
| storm, 60 x 8 workers | 785 ms | 1.08 s | 113 ms | 156 ms |

The scheduler contributes nanoseconds (42 ns p50 against a no-op driver); the rest is Docker's container setup, which serializes under the 8-way storm. Docker wake is pause/unpause, not snapshot restore. Real agent restore benchmarks begin with the microVM driver path.

Status

In development.

The scheduler core, simulator, driver contract, and a Docker development driver exist and are benchmarked in CI against real containers on every push. There is no installable release yet. The project is being shaped for enterprise teams building serious agent infrastructure.