Fast-spawning compute for the agent era
High-throughput runtime infrastructure for AI agents.
vmyard turns agent work into isolated, stateful sandboxes that spawn, wake, and disappear at fleet speed. Code agents, browser agents, workflow agents, and tool-using agents get runnable capacity without waiting on service-era orchestration.
Why not Kubernetes
Kubernetes schedules services that start once and run for months. Agents are the opposite workload: they arrive in bursts, need isolated filesystems, browsers, tools, secrets, and network policy, then pause, resume, fork, or disappear as the task changes. Pushed through pod machinery, every transition pays for API round-trips, scheduling passes, kubelet reconciliation, and network setup that were priced for workloads changing state a few times a day.
Waiting for Kubernetes to grow into this is a bet against its own architecture: a control plane built on slowly converging desired state cannot become the fast path for millions of short-lived, stateful agent runtimes. Enterprise agents need a scheduler built around spawn, wake, isolation, and teardown from day one.
Agent lifecycle
| Phase | What happens |
|---|---|
| spawn | Create an isolated runtime with workspace, tools, network policy, and identity. |
| place | Pick a host. Bin-pack for density while reserving capacity for wakes. |
| run | The agent executes code, drives browsers, calls tools, or runs workflow steps. |
| sleep | Freeze state and release hot resources while preserving the agent's context. |
| wake | Return the agent to runnable state near its snapshot and cached data. |
| destroy | Reclaim compute, storage, credentials, and network state cleanly. |
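One way to read the table is as a small state machine driven by the lifecycle verbs. The sketch below is illustrative only: the `State` names, the `TRANSITIONS` table, and the `apply` helper are hypothetical and are not taken from vmyard's code. It also assumes that `wake` returns an agent directly to the running state and that `destroy` is legal from any live state.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    SPAWNED = "spawned"
    PLACED = "placed"
    RUNNING = "running"
    SLEEPING = "sleeping"
    DESTROYED = "destroyed"


# Hypothetical transition table: (current state, verb) -> next state.
TRANSITIONS = {
    (State.NEW, "spawn"): State.SPAWNED,
    (State.SPAWNED, "place"): State.PLACED,
    (State.PLACED, "run"): State.RUNNING,
    (State.RUNNING, "sleep"): State.SLEEPING,
    (State.SLEEPING, "wake"): State.RUNNING,
}

# Assumption: any live agent can be torn down, whatever phase it is in.
for live in (State.SPAWNED, State.PLACED, State.RUNNING, State.SLEEPING):
    TRANSITIONS[(live, "destroy")] = State.DESTROYED


def apply(state: State, verb: str) -> State:
    """Return the next state, or raise ValueError if the verb is illegal here."""
    try:
        return TRANSITIONS[(state, verb)]
    except KeyError:
        raise ValueError(f"cannot {verb!r} an agent in state {state.value!r}") from None


# Walk one agent through the full lifecycle from the table.
state = State.NEW
for verb in ("spawn", "place", "run", "sleep", "wake", "destroy"):
    state = apply(state, verb)
```

Keeping the legal transitions in a single table makes illegal sequences, such as running a sleeping agent without waking it, fail loudly instead of silently corrupting state.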
Runtime boundary
Agents may execute generated code, browse the open web, and touch enterprise systems, so isolation is not decoration. vmyard treats the sandbox boundary as infrastructure: the scheduler sees capacity, constraints, lifecycle verbs, and restore cost; drivers own the host dataplane underneath it.
| Driver | Isolation | Status |
|---|---|---|
| firecracker | microVM | production path |
| cloud hypervisor | microVM | planned |
| docker | container | development and CI |
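The split between scheduler and driver can be pictured as a narrow interface with one method per lifecycle verb the host must carry out. The sketch below is a guess at the shape of such a contract, not vmyard's actual one: `Driver`, `NoopDriver`, `run_once`, and every method signature are hypothetical names chosen for illustration.

```python
from typing import Protocol


class Driver(Protocol):
    """Hypothetical driver contract: the host-side half of each lifecycle verb."""

    def spawn(self, agent_id: str) -> None: ...
    def sleep(self, agent_id: str) -> None: ...
    def wake(self, agent_id: str) -> None: ...
    def destroy(self, agent_id: str) -> None: ...


class NoopDriver:
    """Records each call and does no host work, so scheduler cost can be timed alone."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def spawn(self, agent_id: str) -> None:
        self.calls.append(("spawn", agent_id))

    def sleep(self, agent_id: str) -> None:
        self.calls.append(("sleep", agent_id))

    def wake(self, agent_id: str) -> None:
        self.calls.append(("wake", agent_id))

    def destroy(self, agent_id: str) -> None:
        self.calls.append(("destroy", agent_id))


def run_once(driver: Driver, agent_id: str) -> None:
    """Drive one agent through spawn, sleep, wake, and destroy via any driver."""
    driver.spawn(agent_id)
    driver.sleep(agent_id)
    driver.wake(agent_id)
    driver.destroy(agent_id)


driver = NoopDriver()
run_once(driver, "agent-1")
```

Because `run_once` depends only on the protocol, a microVM or container driver could be swapped in behind the same calls without the caller changing.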
Benchmarks
Every number is reproducible from the repository with one command, runs from a fixed seed, and reports latency percentiles next to throughput. The point is not only scheduling speed; it is time to runnable agent capacity under burst. Measured 2026-06-12; machines noted per table.
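As a rough illustration of that reporting style, the sketch below summarizes a batch of latency samples into p50, p99, and throughput. It is not the repository's benchmark harness: the `percentile` and `summarize` helpers are hypothetical, the samples are synthetic, and the nearest-rank percentile method is an assumption.

```python
import math
import random


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ns):
    """Report latency percentiles alongside single-threaded throughput."""
    total_seconds = sum(latencies_ns) / 1e9
    return {
        "p50_ns": percentile(latencies_ns, 50),
        "p99_ns": percentile(latencies_ns, 99),
        "ops_per_s": len(latencies_ns) / total_seconds,
    }


# A fixed seed means every rerun draws the same synthetic samples.
rng = random.Random(42)
samples = [rng.randint(150, 400) for _ in range(10_000)]
report = summarize(samples)
```

Reporting p99 next to throughput matters because a high average rate can hide a slow tail, and an agent waiting to become runnable feels the tail.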
Agent placement decisions - single thread, Apple M5 laptop, fleet of 32-core/128 GiB hosts
| Fleet | Sustained churn | Decision p50 | Decision p99 | Wake locality |
|---|---|---|---|---|
| 100 hosts | 7,433k placements/s | 208 ns | 334 ns | 98.1% local |
| 1,000 hosts | 912k placements/s | 1.7 µs | 2.3 µs | 98.2% local |
| 10,000 hosts | 88k placements/s | 17.6 µs | 22.0 µs | 98.3% local |
Placement is a deliberate linear scan, and the table above is its published cost: p50 decision latency grows in step with fleet size, at roughly 2 ns per host in the fleet. Wake locality holds at ~98% at every fleet size because it depends on the per-host wake reserve, not on scale.
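A linear-scan placer with a per-host wake reserve can be sketched in a few lines. This is a minimal sketch under stated assumptions, not vmyard's scheduler: the `Host` fields, the `place` signature, and the best-fit rule (pick the fullest host that still fits) are all hypothetical. New spawns are kept out of the reserve, while a wake may use the reserve on the host that holds its snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    cores: int
    used: int = 0
    wake_reserve: int = 0  # cores held back for agents waking on this host


def place(hosts: list[Host], need: int, home: Optional[str] = None) -> Optional[Host]:
    """Place `need` cores with one pass over the fleet.

    If `home` is given, the request is a wake: try the home host first and
    allow it to consume that host's wake reserve.
    """
    if home is not None:
        for host in hosts:
            if host.name == home and host.cores - host.used >= need:
                host.used += need
                return host

    # Spawn, or a wake whose home host is full: best-fit scan that never
    # dips into any host's wake reserve.
    best: Optional[Host] = None
    best_free = 0
    for host in hosts:
        free = host.cores - host.used - host.wake_reserve
        if free >= need and (best is None or free < best_free):
            best, best_free = host, free
    if best is not None:
        best.used += need
    return best


fleet = [
    Host("a", cores=32, used=28, wake_reserve=4),
    Host("b", cores=32, used=10, wake_reserve=4),
    Host("c", cores=32, used=20, wake_reserve=4),
]
spawned_on = place(fleet, need=4)          # tightest fit outside the reserve
woke_on = place(fleet, need=4, home="a")   # wake uses host a's reserve
```

In this sketch the cost of a decision is one pass over `hosts`, which is why latency tracks fleet size, while a wake's chance of landing at home depends only on how much reserve its own host kept free.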
Agent time to runnable - through the Docker dev driver, real containers, CI runner
| Mode | Spawn p50 | Spawn p95 | Wake p50 | Wake p95 |
|---|---|---|---|---|
| serial, 30 containers | 126 ms | 147 ms | 18.1 ms | 19.7 ms |
| storm, 60 x 8 workers | 785 ms | 1.08 s | 113 ms | 156 ms |
The scheduler contributes nanoseconds (42 ns p50 against a no-op driver); the rest is Docker's container setup, which serializes under the 8-way storm. Docker wake is pause/unpause, not snapshot restore. Real agent restore benchmarks begin with the microVM driver path.
Status
In development.
The scheduler core, simulator, driver contract, and a Docker development driver exist and are benchmarked in CI against real containers on every push. There is no installable release yet. The project is being shaped for enterprise teams building serious agent infrastructure.