spock_conflict: fix spurious tiebreaker-equal WARNING in single-writer topology#504
spock_conflict: fix spurious tiebreaker-equal WARNING in single-writer topology#504rasifr wants to merge 3 commits into
Conversation
Reproduces the bug where conflict_resolve_by_timestamp() fires a "tiebreaker values are equal" WARNING when the same row is updated twice in a single upstream transaction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r topology conflict_resolve_by_timestamp() used get_node(local_origin_id) for loc_node, where local_origin_id is the replication origin stamped on the existing local tuple. In a single-writer topology every row on the subscriber carries origin = provider.id and incoming changes also have remote_origin = provider.id, so both lookups resolved to the same SpockNode. The same ambiguity arises from a node_id hash collision: the id is a 16-bit hash of the node name, so two distinct nodes can share the same id and again resolve to the same SpockNode. In either case the tiebreaker comparison is trivially equal, firing a spurious WARNING on every timestamp-tied conflict. Use get_local_node() for loc_node so the tiebreaker always compares the applying node against the remote sender, not the tuple's origin. Add a same-origin shortcut: when local_origin_id == remote_origin_id the change shares the same source (e.g. the same row updated twice in one upstream transaction), so there is no real conflict and remote is applied directly without consulting the tiebreaker. Detect node_id hash collisions at node creation time.
The collision check in create_node() now emits "collides with existing node" instead of a PK uniqueness error. Update the like() patterns in 024 to match the new message. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Duplication | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
📝 WalkthroughWalkthrough
ChangesNode ID collision detection and tiebreaker conflict resolution
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/tap/t/025_tiebreaker_equal_warning.pl (1)
45-46: ⚡ Quick winPrefer deterministic wait over fixed
sleepfor replication checks.Using fixed sleeps at Line 45 and Line 60 can make this TAP test flaky under slow CI or loaded hosts. Polling until the expected replicated value appears (with timeout) is more stable.
Also applies to: 60-63
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/tap/t/025_tiebreaker_equal_warning.pl` around lines 45 - 46, Replace the fixed sleep calls used in the tiebreaker equal warning test with deterministic polling logic that repeatedly checks for the expected replicated value until it appears or a timeout is reached. Instead of using system_or_bail with 'sleep', implement a polling loop with a reasonable timeout that checks the replication status at intervals, allowing the test to proceed as soon as the expected condition is met rather than waiting a fixed duration. This approach eliminates flakiness on slow or loaded CI environments.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/tap/t/025_tiebreaker_equal_warning.pl`:
- Around line 71-72: The unlike() assertion in the tiebreaker test is checking
for outdated warning text. The regex pattern is looking for the old phrase
"tiebreaker values are equal" but the current warning message emitted by
src/spock_conflict.c has been changed to "have equal tiebreaker value ...;
applying remote". Update the regex in the unlike() call at line 71 to match the
current warning text so the negative assertion properly prevents false positives
when the warning is actually emitted.
---
Nitpick comments:
In `@tests/tap/t/025_tiebreaker_equal_warning.pl`:
- Around line 45-46: Replace the fixed sleep calls used in the tiebreaker equal
warning test with deterministic polling logic that repeatedly checks for the
expected replicated value until it appears or a timeout is reached. Instead of
using system_or_bail with 'sleep', implement a polling loop with a reasonable
timeout that checks the replication status at intervals, allowing the test to
proceed as soon as the expected condition is met rather than waiting a fixed
duration. This approach eliminates flakiness on slow or loaded CI environments.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 61fb842d-1102-49ed-af0a-ffd52f3507e4
📒 Files selected for processing (5)
src/spock_conflict.csrc/spock_node.ctests/tap/scheduletests/tap/t/024_node_id_collision.pltests/tap/t/025_tiebreaker_equal_warning.pl
| unlike($new_log, qr/tiebreaker values are equal/i, | ||
| 'no spurious tiebreaker-equal WARNING'); |
There was a problem hiding this comment.
Negative log assertion is checking stale warning text.
At Line 71, the regex only matches the old phrase (tiebreaker values are equal), but the current warning text in src/spock_conflict.c is have equal tiebreaker value ...; applying remote. This can let the test pass even when the warning is emitted.
Proposed fix
-unlike($new_log, qr/tiebreaker values are equal/i,
- 'no spurious tiebreaker-equal WARNING');
+unlike($new_log, qr/(tiebreaker values are equal|equal tiebreaker value)/i,
+ 'no spurious tiebreaker-equal WARNING');🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/tap/t/025_tiebreaker_equal_warning.pl` around lines 71 - 72, The
unlike() assertion in the tiebreaker test is checking for outdated warning text.
The regex pattern is looking for the old phrase "tiebreaker values are equal"
but the current warning message emitted by src/spock_conflict.c has been changed
to "have equal tiebreaker value ...; applying remote". Update the regex in the
unlike() call at line 71 to match the current warning text so the negative
assertion properly prevents false positives when the warning is actually
emitted.
conflict_resolve_by_timestamp() used get_node(local_origin_id) for loc_node,
where local_origin_id is the replication origin stamped on the existing local
tuple. In a single-writer topology every row on the subscriber carries
origin = provider.id and incoming changes also have remote_origin = provider.id,
so both lookups resolved to the same SpockNode. The same ambiguity arises from
a node_id hash collision: the id is a 16-bit hash of the node name, so two
distinct nodes can share the same id and again resolve to the same SpockNode.
In either case the tiebreaker comparison is trivially equal, firing a spurious
WARNING on every timestamp-tied conflict.
Use get_local_node() for loc_node so the tiebreaker always compares the
applying node against the remote sender, not the tuple's origin. Add a
same-origin shortcut: when local_origin_id == remote_origin_id the change
shares the same source (e.g. the same row updated twice in one upstream
transaction), so there is no real conflict and remote is applied directly
without consulting the tiebreaker.
Detect node_id hash collisions at node creation time.