Adaptive Runtime¶

Use a probe-first planner now, and leave room for a fuller controller later.

Intent¶

Define how runtime adaptation works before owned buffers and kernel families expand, without overcommitting to a live feedback controller too early.

Request Signals¶

adaptive runtime
nvml
variant registry
probe and adapt
chunk planning
saturation monitoring

Open First¶

docs/architecture/adaptive-runtime.md
docs/architecture/runtime.md
src/vibespatial/runtime/adaptive.py
src/vibespatial/runtime/kernel_registry.py
docs/decisions/0007-probe-first-adaptive-runtime.md

Verify¶

uv run pytest tests/test_adaptive_runtime.py
uv run python scripts/check_docs.py --check

Risks¶

A full live controller would create more machinery than value before real kernels and chunked workloads exist.
Overfitting variant choice too early can freeze bad metadata into the registry contract.
Hard-coding NVML into call sites would make later telemetry upgrades expensive.

Canonical Rule¶

Adaptive planning happens before execution and, for streaming work, at chunk boundaries.
The first landing is a planner, not a continuous controller.
Telemetry is optional. When monitoring is unavailable, planning falls back to static heuristics and declared metadata.
Explicit cpu, gpu, and precision overrides remain authoritative.

Required Layers¶

telemetry snapshot: GPU availability plus optional NVML saturation and memory signals
variant registry: typed metadata, not just variant names
planner input: kernel class, row count, geometry mix, residency, and requested mode
planner output: selected runtime, variant, precision plan, chunk size hint, and reason log

Decision Scope¶

The planner may adapt:

kernel variant
chunk size hint
precision path through the existing precision-policy contract
auto runtime target through the existing crossover policy

The planner must not:

switch mid-kernel
override explicit user pins
depend on continuous background polling

Upgrade Path¶

This design is intentionally a stepping stone.

Today: one-shot planning plus optional re-plan after the first chunk.
Later: richer telemetry, runtime history, and tighter re-plan cadence.

Moving from the planner to a live controller should only replace internal policy and telemetry sources. Kernel call sites, registry metadata, and plan objects should stay stable.