vmyard places and bin-packs microVM and container sandboxes onto a fleet of hosts, with the dataplane behind pluggable drivers for Firecracker, Cloud Hypervisor, and Docker. The design target is thousands of placements per second from a single scheduler node.
Kubernetes schedules services that start once and run for months. A sandbox is created in a burst next to a thousand siblings, runs for a few seconds, goes to sleep, and is expected back fast when its owner returns. Pushed through pod machinery, every one of those transitions pays for API round-trips, scheduling passes, kubelet reconciliation, and network setup that were priced for a workload changing state a few times a day.
Waiting for Kubernetes to grow into this is a bet against its own architecture: a control plane built on slowly converging desired state cannot become the fast path for millions of short-lived, stateful workloads. That gap is what vmyard is for.
| Phase | What happens |
|---|---|
| create | Boot a fresh sandbox from an image. The cold path. |
| place | Pick a host. Bin-pack for density while leaving headroom for wakes. |
| run | Code executes. The scheduler stays out of the way. |
| sleep | Snapshot and release CPU and memory. Identity and state are kept. |
| wake | Restore from snapshot. The latency this project is judged on. |
| destroy | Reclaim everything. |
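The phases above can be read as a small state machine. The sketch below is one possible reading, not vmyard's actual implementation: the transition graph (for example, that `wake` leads back to `run`, and that a sandbox can be destroyed while running or asleep) is an assumption, and the names `ALLOWED` and `is_valid_history` are hypothetical.

```python
# Assumed transition graph between the lifecycle phases in the table.
# Keys are phases; values are the phases allowed to follow them.
ALLOWED = {
    "create": {"place"},
    "place": {"run"},
    "run": {"sleep", "destroy"},
    "sleep": {"wake", "destroy"},
    "wake": {"run"},
    "destroy": set(),  # terminal: everything has been reclaimed
}


def is_valid_history(phases):
    """Return True if the sequence starts at 'create' and every step
    follows an edge in ALLOWED."""
    if not phases or phases[0] != "create":
        return False
    for current, nxt in zip(phases, phases[1:]):
        if nxt not in ALLOWED.get(current, set()):
            return False
    return True


if __name__ == "__main__":
    history = ["create", "place", "run", "sleep", "wake", "run", "destroy"]
    print(is_valid_history(history))
```

A sandbox that sleeps and wakes many times simply loops through `run`, `sleep`, `wake` before reaching `destroy`.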
Drivers own the dataplane. The scheduling core sees capacity, constraints, and lifecycle verbs; it never talks to a runtime directly. The shape follows the task-driver model proven in HashiCorp Nomad.
| Driver | Isolation | Status |
|---|---|---|
| firecracker | microVM | first target |
| cloud hypervisor | microVM | planned |
| docker | container | planned, for development and CI |
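To illustrate the separation between core and dataplane, here is a minimal sketch of what a driver contract could look like. The real contract is not published, so every name here (`Driver`, `NoopDriver`, `run_lifecycle`, the method signatures) is hypothetical; the only point is that the caller depends on lifecycle verbs and never on a specific runtime.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical driver contract: one method per lifecycle verb."""

    @abstractmethod
    def create(self, sandbox_id: str, image: str) -> None: ...

    @abstractmethod
    def sleep(self, sandbox_id: str) -> None: ...

    @abstractmethod
    def wake(self, sandbox_id: str) -> None: ...

    @abstractmethod
    def destroy(self, sandbox_id: str) -> None: ...


class NoopDriver(Driver):
    """Does no runtime work; records calls so overhead can be isolated."""

    def __init__(self):
        self.calls = []

    def create(self, sandbox_id, image):
        self.calls.append(("create", sandbox_id))

    def sleep(self, sandbox_id):
        self.calls.append(("sleep", sandbox_id))

    def wake(self, sandbox_id):
        self.calls.append(("wake", sandbox_id))

    def destroy(self, sandbox_id):
        self.calls.append(("destroy", sandbox_id))


def run_lifecycle(driver: Driver, sandbox_id: str, image: str) -> None:
    """Core-side code: written against Driver, unaware of the runtime."""
    driver.create(sandbox_id, image)
    driver.sleep(sandbox_id)
    driver.wake(sandbox_id)
    driver.destroy(sandbox_id)
```

Swapping `NoopDriver` for a Firecracker or Docker implementation would leave `run_lifecycle` unchanged, which is the property the driver model is meant to provide.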
Every benchmark is reproducible from the repository with one command, runs from a fixed seed, and reports latency percentiles next to throughput. Measured 2026-06-12; machines noted per table.
Placement decisions — single thread, Apple M5 laptop, fleet of 32-core/128 GiB hosts
| Fleet | Sustained churn | Decision p50 | Decision p99 | Wake locality |
|---|---|---|---|---|
| 100 hosts | 7,433k placements/s | 208 ns | 334 ns | 98.1% local |
| 1,000 hosts | 912k placements/s | 1.7 µs | 2.3 µs | 98.2% local |
| 10,000 hosts | 88k placements/s | 17.6 µs | 22.0 µs | 98.3% local |
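Placement is a deliberate linear scan, and the table above is its published cost: decision latency grows in proportion to fleet size, at roughly 2 ns per host scanned (208 ns for 100 hosts, 17.6 µs for 10,000). Wake locality holds at ~98% at every fleet size because it depends on the per-host wake reserve, not on scale.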
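A linear-scan placer with a per-host wake reserve can be sketched as follows. This is an illustration under stated assumptions, not vmyard's algorithm: the 10% reserve, the best-fit-on-CPU scoring, and the names `Host`, `fits`, `place`, and `wake` are all hypothetical. The source only states that placement is a linear scan, that it bin-packs for density, and that headroom is kept for wakes.

```python
from dataclasses import dataclass
from typing import List, Optional

WAKE_RESERVE_PCT = 10  # assumed: share of each host held back for wakes


@dataclass
class Host:
    cpu_total: int  # cores
    mem_total: int  # GiB
    cpu_used: int = 0
    mem_used: int = 0


def fits(h: Host, cpu: int, mem: int, use_reserve: bool) -> bool:
    """Check capacity; fresh placements may not touch the wake reserve."""
    pct = 100 if use_reserve else 100 - WAKE_RESERVE_PCT
    return ((h.cpu_used + cpu) * 100 <= h.cpu_total * pct
            and (h.mem_used + mem) * 100 <= h.mem_total * pct)


def commit(h: Host, cpu: int, mem: int) -> None:
    h.cpu_used += cpu
    h.mem_used += mem


def place(hosts: List[Host], cpu: int, mem: int) -> Optional[int]:
    """Linear scan, best fit: pick the fullest host that still fits."""
    best, best_free = None, None
    for i, h in enumerate(hosts):
        if not fits(h, cpu, mem, use_reserve=False):
            continue
        free = h.cpu_total - h.cpu_used
        if best is None or free < best_free:
            best, best_free = i, free
    if best is not None:
        commit(hosts[best], cpu, mem)
    return best


def wake(hosts: List[Host], home: int, cpu: int, mem: int) -> Optional[int]:
    """Prefer the host holding the snapshot; it may use the reserve."""
    if fits(hosts[home], cpu, mem, use_reserve=True):
        commit(hosts[home], cpu, mem)
        return home
    return place(hosts, cpu, mem)  # non-local wake: fall back to a scan
```

In this sketch a host that is "full" for new placements can still accept a wake, which is why locality would depend on the size of the reserve and not on how many hosts are scanned.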
Time to runnable — through the Docker dev driver, real containers, CI runner
| Mode | Spawn p50 | Spawn p95 | Wake p50 | Wake p95 |
|---|---|---|---|---|
| serial, 30 containers | 126 ms | 147 ms | 18.1 ms | 19.7 ms |
| storm, 60 × 8 workers | 785 ms | 1.08 s | 113 ms | 156 ms |
The scheduler contributes nanoseconds (42 ns p50 against a no-op driver); the rest is Docker's container setup, which serializes under the 8-way storm. Docker wake is pause/unpause, not snapshot restore, which is why the table is labeled as the Docker dev driver and not presented as restore latency. Real restore benchmarks begin with the Firecracker driver.
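The reporting convention used in these tables (a fixed seed, percentiles next to throughput) can be sketched with a small harness. This is not the project's benchmark command; `make_workload`, `percentile`, and `bench` are hypothetical names, the request shapes are invented for illustration, and the nearest-rank percentile definition is an assumption.

```python
import random
import time


def make_workload(seed: int, n: int):
    """Seeded, so two runs see the identical request sequence."""
    rng = random.Random(seed)
    # (cpu cores, memory GiB) per request; sizes are illustrative only.
    return [(rng.choice([1, 2, 4]), rng.choice([1, 2, 4, 8])) for _ in range(n)]


def percentile(samples, p: int):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # ceil(p * n / 100)
    return ordered[rank - 1]


def bench(decide, workload):
    """Time each decision and report percentiles alongside throughput."""
    latencies = []
    start = time.perf_counter_ns()
    for request in workload:
        t0 = time.perf_counter_ns()
        decide(request)
        latencies.append(time.perf_counter_ns() - t0)
    elapsed = max(1, time.perf_counter_ns() - start)
    return {
        "p50_ns": percentile(latencies, 50),
        "p99_ns": percentile(latencies, 99),
        "per_second": len(workload) * 1e9 / elapsed,
    }


if __name__ == "__main__":
    # A no-op decision function measures harness overhead only.
    print(bench(lambda request: None, make_workload(seed=7, n=10_000)))
```

Timer overhead in an interpreted harness like this one is far larger than the nanosecond figures quoted above, so the sketch shows the reporting shape and not comparable numbers.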
In development.
The scheduler core, simulator, driver contract, and a Docker development driver exist and are benchmarked in CI against real containers on every push. There is no installable release yet. Source opens with the first runnable release.