feat: cross-node migration (nodeName affinity + migration state machine) #11
tonicmuroq wants to merge 1 commit into
2edf006 to 3b78805
Rebased onto
One thing left for you — needs vk-cocoon context I can't verify locally:
Otherwise #11's migration flow is now correct (CRD + loop fixed, rebase verified intact).
…state machine)

Rewritten on current main atop the merged restore-from-hibernate producer (#14): the control plane patches CocoonSet.spec.nodeName and the operator hibernates the main agent, waits for the :hibernate snapshot in the OCI registry, deletes the old pod, recreates it with hostname nodeAffinity + restore-from-hibernate, and drops the snapshot once the restored VM runs with a fresh VMID.

Decisions are pure functions of durable state (spec.nodeName, status.phase, the pod, the snapshot), so every step is idempotent and crash-recoverable.

Hardening over the original branch:
- a registry probe error now owns the reconcile (handled=true): falling through would let applyUnsuspend unwind the migration or fresh-boot over the only copy of the state
- a :hibernate tag on a pod this controller never quiesced is treated as a leftover (suspend/unsuspend never deletes the tag) and dropped instead of restored; a raw presence check would delete a live pod and roll back state
- re-targeting nodeName back to the current node mid-migration wakes the pod in place instead of deadlocking (unless a CocoonHibernation CR owns it)
- clearing nodeName in the deleted-pod window finishes the restore instead of stranding the snapshot behind a fresh boot
- steady-state pinned sets skip the registry probe (Migrating is persisted before the first side effect, so in-flight migrations are never mistaken)

Scoped to the main agent (slot 0); sub-agents follow via their hard bind.
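The "pure function of durable state" idea in the commit message can be illustrated with a small sketch. This is not the operator's `reconcileMigration`: the `Observed` struct, the `Step` names, and `nextStep` are all hypothetical, and the hardening cases (registry probe errors, leftover tags, re-targeting, clearing nodeName) are left out. It only shows how the happy-path ordering and the two gates could fall out of re-reading state on every reconcile.

```go
package main

import "fmt"

// Step names one reconcile action. All names are hypothetical.
type Step string

const (
	StepNone          Step = "none"
	StepMarkMigrating Step = "persist-migrating-phase"
	StepHibernate     Step = "set-hibernate-annotation"
	StepWaitSnapshot  Step = "wait-for-snapshot"
	StepDeleteOldPod  Step = "delete-old-pod"
	StepRecreate      Step = "recreate-with-restore"
	StepWaitVMID      Step = "wait-for-fresh-vmid"
	StepDropSnapshot  Step = "drop-snapshot"
	StepSettle        Step = "clear-migrating-phase"
)

// Observed is an assumed snapshot of the durable state read at the
// start of each reconcile; the field names are illustrative only.
type Observed struct {
	TargetNode     string // spec.nodeName ("" means unpinned)
	Migrating      bool   // status.phase is Migrating
	PodExists      bool
	PodNode        string // node the pod is bound to ("" while Pending)
	PodQuiesced    bool   // this controller set the hibernate annotation
	PodRestoring   bool   // pod was created with restore-from-hibernate
	SnapshotExists bool   // :hibernate tag present in the registry
	FreshVMID      bool   // restored VM reports a new VMID
}

// nextStep derives the next action from the observation alone, so
// replaying it after a crash yields the same answer.
func nextStep(o Observed) Step {
	if !o.Migrating {
		// Persist the phase before any side effect.
		if o.TargetNode != "" && o.PodExists && o.PodNode != "" && o.PodNode != o.TargetNode {
			return StepMarkMigrating
		}
		return StepNone
	}
	switch {
	case !o.PodExists:
		if o.SnapshotExists {
			return StepRecreate
		}
		return StepNone // no pod and no snapshot: not modelled in this sketch
	case o.PodRestoring:
		if !o.FreshVMID {
			return StepWaitVMID
		}
		if o.SnapshotExists {
			return StepDropSnapshot // gate: only after a fresh VMID
		}
		return StepSettle
	case !o.PodQuiesced:
		return StepHibernate
	case !o.SnapshotExists:
		return StepWaitSnapshot
	default:
		return StepDeleteOldPod // gate: only after the snapshot lands
	}
}

func main() {
	o := Observed{TargetNode: "node-b", Migrating: true, PodExists: true, PodNode: "node-a", PodQuiesced: true}
	fmt.Println(nextStep(o)) // snapshot not there yet
	o.SnapshotExists = true
	fmt.Println(nextStep(o)) // snapshot landed
}
```

Because `nextStep` keeps no in-memory progress marker, calling it twice with the same observation returns the same step, which is the property the commit message relies on for crash recovery.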
Rewrote the branch on current main (
Follow-up (not in this PR): a migration timeout + Warning events like the hibernation controller's, and sub-agent migration (currently scoped out, one-VM-per-set model).
8ba5d6a to e71b762
Live E2E on the GKE cluster (operator
Two pre-existing gaps surfaced by the final drop-snapshot step (not this PR's logic):
Once the IAM grant lands, the full loop (drop + settle to Running) can be re-validated; everything up to that point is verified.
Operator side of cross-node `migrate(vmname, node)`: the control plane patches `CocoonSet.spec.nodeName`, the operator does the rest.

## What

- `buildAgentPod`: the main agent (slot 0) gets a required hostname `nodeAffinity` from `spec.nodeName` instead of a hard `NodeName` bind. It lands on the target only if it fits and the node is schedulable, else stays Pending (respects capacity/cordon, no OOM). Sub-agents keep their hard-bind to the main's node.
- `reconcileMigration`: a pure observation function over durable state (`spec.nodeName`, the pod, the epoch `:hibernate` snapshot): set internal hibernate annotation → wait for snapshot → delete old pod → recreate on target with `restore-from-hibernate` → wait for the restored VMID → drop the snapshot. Idempotent and crash-recoverable; runs before `applyUnsuspend` so its hibernate annotation isn't cleared mid-flight. Ordering gates: old pod deleted only after the snapshot lands; snapshot dropped only after the new VM has a fresh VMID. Surfaces `CocoonSetPhaseMigrating`. Scoped to the main agent (one VM per CocoonSet).

## Dependency

Depends on cocoonstack/cocoon-common#3 (`spec.nodeName` + `Migrating` phase). go.mod pins the branch commit via pseudo-version; bump to the cocoon-common release tag after #3 merges.

## Tests

`migrate_test.go` (7 transitions incl. both ordering gates), `pods_test.go` (3 affinity cases); full suite + `make lint` clean on linux + darwin.

## Not in scope

Control-plane `migrate` API + IP backfill + involuntary-eviction reconcile (simular-pro-vm-service); end-to-end + crash-injection tests (need a cluster).
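For reference, the required hostname nodeAffinity described under What can be sketched as the pod-spec stanza below. The helper name `hostnameAffinity` is hypothetical and the stanza is built from plain maps so the example runs without `k8s.io/api`; the real `buildAgentPod` presumably uses the typed Kubernetes structs. The field names and the `kubernetes.io/hostname` label are the standard Kubernetes ones.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostnameAffinity returns the "affinity" stanza that requires the pod
// to schedule onto the named node. Hypothetical helper: it returns nil
// when no node is pinned, leaving the pod free to land anywhere.
func hostnameAffinity(node string) map[string]any {
	if node == "" {
		return nil
	}
	return map[string]any{
		"nodeAffinity": map[string]any{
			// "required": the scheduler never places the pod elsewhere;
			// if the node is full or cordoned the pod stays Pending.
			"requiredDuringSchedulingIgnoredDuringExecution": map[string]any{
				"nodeSelectorTerms": []any{
					map[string]any{
						"matchExpressions": []any{
							map[string]any{
								"key":      "kubernetes.io/hostname",
								"operator": "In",
								"values":   []string{node},
							},
						},
					},
				},
			},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(hostnameAffinity("node-b"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Unlike setting `spec.nodeName` on the pod, which bypasses the scheduler, this form still goes through scheduling, which is why capacity and cordon are respected.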