Skip to content

fix: align sync status heuristic with leanSpec#417

Open
dicethedev wants to merge 3 commits into
lambdaclass:mainfrom
dicethedev:fix/sync-status-heuristic
Open

fix: align sync status heuristic with leanSpec#417
dicethedev wants to merge 3 commits into
lambdaclass:mainfrom
dicethedev:fix/sync-status-heuristic

Conversation

@dicethedev
Copy link
Copy Markdown
Contributor

🗒️ Description / Motivation

  • Updates the heuristic used by lean_node_sync_status to follow leanSpec PR feat: add validator sync-lag duty gate leanEthereum/leanSpec#708.
  • The previous heuristic considered the node synced only when its head reached the current slot.
  • This provides a sufficiently stable sync signal for a follow-up PR to gate validator duties.

What Changed

  • Added the leanSpec sync-lag thresholds:
    • 4-slot local sync-lag threshold.
    • 8-slot network-stall threshold.
    • 2-slot hysteresis band.
  • Added a stateful sync-status tracker to the blockchain actor.
  • Refreshes sync status on every tick.
  • Uses the freshest validated live-chain block to distinguish local lag from a network-wide stall.
  • Added tests for threshold boundaries, network stalls, hysteresis, and future-head clock skew.

Correctness / Behavior Guarantees

  • A node is marked syncing when its head is more than four slots behind while fresher validated blocks are known.
  • Network-wide stalls do not mark the node as syncing.
  • Once syncing, the node must recover to two slots behind before returning to synced.
  • Slot differences use saturating subtraction to handle future-head clock skew.
  • This PR only updates the metric heuristic. It does not yet skip proposals or attestations.

Tests Added / Run

  • Added unit tests covering:
    • Lag at and below the threshold.
    • Lag above the threshold.
    • Network-wide stalls.
    • Hysteresis during recovery.
    • Clock skew with a future head.
  • Ran cargo fmt --all -- --check.
  • Ran git diff --check.
  • Full tests and lint remain to be run after the pinned leanVM dependency is available.

Related Issues / PRs

✅ Verification Checklist

  • Ran make fmt — clean
  • [ x] Ran make lint (clippy with -D warnings) — clean
  • Ran cargo test --workspace --release — all passing

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 6, 2026

Greptile Summary

This PR replaces the simplistic "head == current slot" sync heuristic with a stateful SyncStatusTracker that follows leanSpec PR #708, adding lag thresholds, network-stall detection, and hysteresis to prevent flapping.

  • Introduces SYNC_LAG_THRESHOLD (4), NETWORK_STALL_THRESHOLD (8), and SYNC_HYSTERESIS_BAND (2) constants plus a SyncStatusTracker struct updated on every tick (5x per slot at 800 ms each).
  • Moves sync-status updates from process_block to on_tick so the metric refreshes even when no new blocks arrive; adds unit tests covering threshold boundaries, hysteresis, network stalls, and clock skew.

Confidence Score: 4/5

Safe to merge for the stated observability goal; the state-machine logic follows leanSpec PR #708 and is well-covered by the new unit tests.

The state-machine logic and test coverage are solid. Two small issues are worth tidying before this feeds into validator-duty gating: the live-chain full-HashMap allocation on every 800 ms tick, and the bare unsigned subtraction in the hysteresis recovery condition which would silently break hysteresis in a release build if the constants are ever reordered.

crates/blockchain/src/lib.rs — the update_sync_status helper and the hysteresis recovery expression on line 89.

Important Files Changed

Filename Overview
crates/blockchain/src/lib.rs Core change: adds SyncStatusTracker with correct threshold/hysteresis logic; two minor issues — full HashMap allocation on every 800 ms tick and an unguarded unsigned subtraction in the recovery condition.
crates/blockchain/src/metrics.rs Adds #[derive(Debug, Clone, Copy, PartialEq, Eq)] to SyncStatus to support equality assertions in the new unit tests; no logic changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([Every Tick 800ms]) --> B{network_lag > 8?}
    B -- Yes --> C[syncing = false]
    B -- No --> D{currently syncing?}
    D -- Yes --> E{head_lag > 2?}
    E -- Yes --> F[syncing = true]
    E -- No --> G[syncing = false]
    D -- No --> H{head_lag > 4?}
    H -- Yes --> I[syncing = true]
    H -- No --> J[syncing = false]
    C & G & J --> K([Synced])
    F & I --> L([Syncing])
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
crates/blockchain/src/lib.rs:721-727
`get_live_chain()` collects the entire `LiveChain` table into a `HashMap<H256, (u64, H256)>` only to compute the maximum slot. This allocation runs 5 times per slot (once per 800 ms tick). During finality-delay events the live chain can grow to hundreds of entries. Adding a dedicated `Store::max_live_chain_slot()` that iterates without collecting would avoid the allocation entirely.

```suggestion
        let max_seen_slot = self
            .store
            .max_live_chain_slot()
            .unwrap_or(head_slot);
```

### Issue 2 of 2
crates/blockchain/src/lib.rs:89
The expression `SYNC_LAG_THRESHOLD - SYNC_HYSTERESIS_BAND` is a bare `u64` subtraction. In a debug build this would panic if `SYNC_HYSTERESIS_BAND` were ever raised above `SYNC_LAG_THRESHOLD`; in a release build it silently wraps to a huge value, making the condition `head_lag > u64::MAX - delta` always false — the syncing flag would reset on the very next tick regardless of actual lag, silently breaking hysteresis. Using `saturating_sub` costs nothing and makes the intent explicit.

```suggestion
            self.syncing = head_lag > SYNC_LAG_THRESHOLD.saturating_sub(SYNC_HYSTERESIS_BAND);
```

Reviews (1): Last reviewed commit: "fix: align sync status heuristic with le..." | Re-trigger Greptile

Comment thread crates/blockchain/src/lib.rs
Comment thread crates/blockchain/src/lib.rs Outdated
dicethedev and others added 2 commits June 6, 2026 06:08
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant