Fix NAT keepalive dropped by src-binding; flow poll retry; recover test #303

Merged
TeoSlayer merged 1 commit into main from fix/scrub-findings on Jun 21, 2026

@TeoSlayer

Collaborator

Fixes from the multi-agent scrub of the verified-badges work.

HIGH: NAT keepalive regression (from #294). The transport src-binding check drops frames whose inner Src.Node != the AEAD-authenticated peer, but keepaliveSweep built the keepalive with Src.Node unset (0). Every NAT keepalive was therefore dropped at the receiver and the mappings expired, reintroducing the v1.9.1 bug. Fix: newKeepalivePacket now stamps our own node id, which is the peer id from the receiver's point of view. Packet construction is extracted into that helper, and the new TestKeepalivePacketStampsOwnSource exercises the real packet shape; the old test hand-crafted the packet and so masked the bug.
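To make the failure mode concrete, here is a minimal sketch of how the check and the keepalive interact. It is not the daemon's code: `NodeID`, `Addr`, `Packet`, `acceptFrame`, and the signature of `newKeepalivePacket` are assumptions made for illustration; only the names `Src.Node` and `newKeepalivePacket` come from this PR.

```go
package main

import "fmt"

// NodeID, Addr and Packet are hypothetical stand-ins for the daemon's types.
type NodeID uint32

type Addr struct {
	Node NodeID
}

type Packet struct {
	Src       Addr
	Keepalive bool
}

// newKeepalivePacket stamps the sender's own node id into Src.Node.
// The real helper's signature is not shown in this PR; this one is assumed.
func newKeepalivePacket(self NodeID) Packet {
	return Packet{Src: Addr{Node: self}, Keepalive: true}
}

// acceptFrame models the receiver's src-binding check: the inner Src.Node
// must equal the peer that the AEAD layer authenticated.
func acceptFrame(authenticatedPeer NodeID, p Packet) bool {
	return p.Src.Node == authenticatedPeer
}

func main() {
	const self NodeID = 7

	unstamped := Packet{Keepalive: true} // Src.Node left at zero, as before the fix
	stamped := newKeepalivePacket(self)

	fmt.Println(acceptFrame(self, unstamped)) // false: keepalive dropped
	fmt.Println(acceptFrame(self, stamped))   // true: keepalive accepted
}
```

A test that builds the packet through the same helper the sweep uses, rather than constructing it by hand, is what catches the zero-valued source.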

MED: device-flow poll resilience. pilotctl verify --provider aborted the whole flow on a single transient poll error; it now tolerates up to 5 consecutive failures before giving up.
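The retry behaviour can be sketched as a counter of consecutive failures that resets on any successful poll. This is an illustration under assumptions, not the pilotctl implementation: `pollUntilDone`, `errTransient`, and the poll function's shape are hypothetical, and the sketch reads "tolerates up to 5" as aborting on the sixth consecutive failure.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxConsecutivePollFailures mirrors the limit of 5 stated in the PR.
const maxConsecutivePollFailures = 5

// errTransient is a hypothetical error used only by this example.
var errTransient = errors.New("transient poll error")

// pollUntilDone calls poll until it reports done. Errors are tolerated as
// long as no more than maxConsecutivePollFailures occur in a row; any
// successful poll (even a "still pending" one) resets the counter.
func pollUntilDone(poll func() (done bool, err error), interval time.Duration) error {
	failures := 0
	for {
		done, err := poll()
		switch {
		case err != nil:
			failures++
			if failures > maxConsecutivePollFailures {
				return fmt.Errorf("giving up after %d consecutive poll failures: %w", failures, err)
			}
		case done:
			return nil
		default:
			failures = 0 // pending but healthy: forget earlier errors
		}
		time.Sleep(interval)
	}
}

func main() {
	calls := 0
	// Fails three times, then succeeds: the flow survives.
	err := pollUntilDone(func() (bool, error) {
		calls++
		if calls <= 3 {
			return false, errTransient
		}
		return true, nil
	}, 0)
	fmt.Println(err == nil, calls)
}
```

Counting consecutive rather than total failures keeps a long-running flow alive across occasional errors while still bounding how long a dead endpoint is retried.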

MED: recover coverage. Added TestCmdRecoveryRecoverInstallsNewIdentity, which covers the most destructive command (keyless force-rotate + local identity install).

go test -race is green on pkg/daemon and cmd/pilotctl; gofmt and vet are clean.

TeoSlayer merged commit eeee27e into main on Jun 21, 2026
10 checks passed