Probe-First Adaptive Runtime

Context

The repo now has explicit precision, residency, robustness, and crossover policy, plus a kernel variant registry. The remaining question is how much adaptive runtime machinery should land before real kernels, owned buffers, and streaming pipelines exist.

Decision

Adopt a probe-first planner as the first adaptive runtime.

  • collect one device snapshot per operation or chunk boundary

  • choose runtime target, precision plan, variant, and chunk-size hint from declared metadata plus the snapshot

  • allow a re-plan after the first chunk for streaming workloads

  • keep explicit user pins authoritative

  • avoid continuous feedback control, mid-kernel switching, and CUPTI-style machinery in the first landing

Consequences

  • Kernel authors get a stable variant-registration contract immediately.

  • The planner can evolve into a fuller controller later without changing public dispatch APIs.

  • NVML remains an optional telemetry source instead of becoming a hard dependency for basic planning.

Alternatives Considered

  • a full live controller with continuous utilization feedback from day one

  • static registry metadata with no runtime adaptation at all

  • embedding variant logic directly inside each kernel module

Acceptance Notes

The landed implementation provides typed registry metadata, optional NVML-style device snapshots, plan objects, and chunk-boundary re-planning. Continuous control, telemetry history, and richer diagnostics remain follow-up work.