probe(1brc): lanes G–J — kanban-update arc: ownership, orchestration, batch pipeline, and the 64×64 gridlake sweet spot#635
Conversation
… merge, tax is all boundary, never shard ownership below contention Operator follow-up: 'compare morton and the kanban vs without — if 64k concurrent SoA vs Morton tile can help us understand the pros and cons of our architecture when using kanban update.' ndarray checkout rebased onto master (#227 merged — sibling Morton scatter/morsel probe). Lane G (feature lane-g): lane F's Morton-tile 64K SoA held as OWNED state by shard mailbox actors (mailbox-as-owner over contiguous Morton tile ranges). Workers pre-reduce 64K-row morsels in private tables (identical hot loop to F), extract dirty slots clear-by-undo, cast entries prefix-routed to the owning shard; every applied batch is witnessed with a KanbanMove on the owner WAL (journal==casts asserted; CognitiveWork->Evaluation legal edge, storage_kanban journaling precedent). Scan workers on the blocking pool so owners never starve. t4 (recipe corpus re-verified, 4 cores, 3 passes, medians): F 79.5 / G(1 shard) 43.0 / G(4) 39.9 / G(16) 36.0 with one thrash collapse to 11.7; workers=3 variants strictly worse. Ledger: kanban update = 0.54x at morsel granularity, the tax is all boundary (corpus copy + oversubscription + messaging; the witness itself ~free per lane E); it buys live bounded-staleness state, witnessed replayable writes, single-writer safety, bounded worker memory. Do NOT shard ownership below contention — one mailbox absorbs ~400-group apply work; shards scale with owner WORK, never rows; the Morton prefix route mechanism is free (G@4 within ~7% of G@1). README §5.4 tables; plan Addendum-13 t4; board E-1BRC-KANBAN-UPDATE-1. Gates: 14 std / 16 lane-g / 20 all-features tests green; clippy -D warnings exit 0; fmt clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
📝 WalkthroughWalkthroughThis PR adds probe lanes G, H, I, and J, wires them behind features and CLI entrypoints, exports preset-based runners, and updates the measurement records and session documentation for the new lane behaviors. ChangesProbe lanes and supporting docs
Estimated code review effort: 5 (Critical) | ~120 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
… full ownership-granularity curve to 64K mailboxes Operator correction: 'I thought we spawn one ractor mailbox per SoA?' — ratified. Each owner's actor State is now its OWN OwnerSoa sized to its tile span (span-clamped capacity, local probe, full-name verification; panic-on-full, never silent) — one mailbox = one SoA verbatim, no full-64K tables per owner, no 'sharded one SoA' framing. Flush grouping sort-based (no dense per-shard vec allocation at 64K owners). Parity test extended to 4096 mailboxes; shards clamped to [1, 65536]. t4a curve (recipe corpus re-verified, 4 cores, medians of 3): G(1) 43.4 / G(16) 30.3 / G(256) 35.9 / G(4096) 18.3 / G(65536 = one mailbox per tile) 2.1 Mrows/s — a 20x collapse at the literal 64K-concurrent-SoAs end (64K spawns in-run, cast fragmentation ~150 -> ~63K, 64K tasks on 4 cores). Ruling: ownership plateau 1-256 owners, then the cliff; Morton tile GROUPING is what makes mailbox-as-owner viable — mailbox = OWNER boundary, tile = ADDRESS boundary, never 1:1 under load. README §5.4a; plan Addendum-13 t4a; board E-1BRC-OWNER-GRANULARITY-1. Gates: 16 lane-g / 20 all-features tests green; clippy -D warnings exit 0; fmt clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…tion + ahead-firing batching flatten the ownership curve (23x at the 64K end) Operator: 'the 65536 mailboxes had no Orchestration at all — check with rs-graph-llm or lance-graph-planner + kanban update to find the sweet spot.' Lane H (feature lane-h) interposes the planner/kanban-executor domain's two in-loop mechanisms over lane G's unchanged one-mailbox-per-SoA substrate: LAZY activation (router tier spawns an owner only on first traffic — live mailboxes track occupancy ~413, never the 64K address space) and AHEAD-FIRING batched delivery (routers buffer per-owner entries, fire batched Applys at batch_k=64, drain flushes remainders). Witness discipline preserved: owner journals == router casts asserted. graph-flow (rs-graph-llm) stays the OUTER loop by design: task-granularity persisted cursor (M25); per-morsel it would put a session save on the hot path, and its in-container build is blocked by the pre-existing burn-submodule 403 (W3b). t5 (recipe corpus re-verified, 4 cores, medians of 3): H(16) 42.2 / H(256) 36.8 / H(4096) 40.2 / H(65536) 39.4 Mrows/s vs same-session flat G(65536) 1.7 — a 23x recovery, within ~9% of G(1) 43.2 (F 81.7). Ruling: orchestration FLATTENS the granularity curve; the sweet spot is not a shard count, it is the orchestration tier itself. Fine-grained mailbox-as-owner is viable iff producers never address owners directly — the router/delegation (ahead-firing batch-writer) tier is load-bearing, not an optimization; flat fan-out to fine owners is the measured 20x anti-pattern. README §5.5; plan Addendum-13 t5; board E-1BRC-ORCHESTRATION-SWEETSPOT-1. Gates: 14 std / 17 lane-h / 21 all-features tests green; clippy -D warnings exit 0; fmt clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/onebrc-probe/src/main.rs (2)
135-164: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winFallback "unknown lane" message doesn't mention new lanes
g/h.Adding the new "g"/"h" match arms leaves the
otherfallback's expected-lane list stale (Line 161 still lists onlya, b, c, d, e, f, r), which will mislead users who mistype a lane or check the error text for valid options.🩹 Proposed fix
other => { - eprintln!("unknown lane '{other}' (expected 'a', 'b', 'c', 'd', 'e', 'f', or 'r')"); + eprintln!("unknown lane '{other}' (expected 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', or 'r')"); std::process::exit(2); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/main.rs` around lines 135 - 164, The fallback error message in the `other` match arm is stale and still lists only the old lane options. Update the unknown-lane message in `main` to include the new valid lanes `g` and `h` alongside `a` through `f` and `r`, so the `lane` dispatch and the user-facing validation text stay in sync.
172-184: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winLane
h's owner-granularity parameter isn't printed.Lane
ggets a dedicated print branch showingshards={shards}(176-179), but laneh(which also consumesshardsas the nominal owner count) falls through to the generic branch and never prints it. This is the exact sweep parameter used to produce README §5.5's ownership-granularity table, so runs can't be distinguished from their output alone.🩹 Proposed fix
if lane == "e" { println!( "lane={lane} rows={rows} workers={workers} batches={batches} elapsed_ms={elapsed_ms:.3} throughput_mrows_s={throughput_mrows_s:.3}" ); - } else if lane == "g" { + } else if lane == "g" || lane == "h" { println!( "lane={lane} rows={rows} workers={workers} shards={shards} elapsed_ms={elapsed_ms:.3} throughput_mrows_s={throughput_mrows_s:.3}" ); } else {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/main.rs` around lines 172 - 184, The lane-specific summary in main does not print the owner-granularity value for lane h, so runs using that lane cannot be distinguished from the output. Update the conditional printing logic in main to give lane h its own branch, similar to the existing lane g branch, and include shards={shards} in the formatted output while keeping the other lanes unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/onebrc-probe/README.md`:
- Around line 96-101: The CLI documentation in README.md is missing lane h and
its usage details. Update the onebrc-probe run syntax and lane selector to
include h, and revise the feature-gate note to mention that lane-g and lane-h
both require their corresponding features and are enforced by main.rs. Also
document that lane h uses the same 4th positional argument semantics as lane g,
so readers can find the correct behavior from the onebrc-probe CLI docs.
In `@crates/onebrc-probe/src/lane_h.rs`:
- Around line 158-172: The routing logic in RouteMsg::Morsel can spawn the same
owner_idx in multiple routers when routers_n and owners_nominal partition Morton
slots differently, inflating live-owner counts. Update the router-to-owner
mapping so each owner_idx is owned by exactly one router, either by snapping
router boundaries to owner boundaries or by routing based directly on owner_idx
instead of morton_slot-derived router partitions. Keep the buffer/fire flow in
the RouteMsg::Morsel path and the setup around routers_n and owners_nominal
consistent with that single-owner ownership model.
---
Outside diff comments:
In `@crates/onebrc-probe/src/main.rs`:
- Around line 135-164: The fallback error message in the `other` match arm is
stale and still lists only the old lane options. Update the unknown-lane message
in `main` to include the new valid lanes `g` and `h` alongside `a` through `f`
and `r`, so the `lane` dispatch and the user-facing validation text stay in
sync.
- Around line 172-184: The lane-specific summary in main does not print the
owner-granularity value for lane h, so runs using that lane cannot be
distinguished from the output. Update the conditional printing logic in main to
give lane h its own branch, similar to the existing lane g branch, and include
shards={shards} in the formatted output while keeping the other lanes unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 0bde42de-a647-44c7-bb1c-d5c3128a3672
📒 Files selected for processing (9)
.claude/board/EPIPHANIES.md.claude/v3/INTEGRATION-PLAN.mdcrates/onebrc-probe/Cargo.tomlcrates/onebrc-probe/README.mdcrates/onebrc-probe/src/lane_f.rscrates/onebrc-probe/src/lane_g.rscrates/onebrc-probe/src/lane_h.rscrates/onebrc-probe/src/lib.rscrates/onebrc-probe/src/main.rs
… codebook CAM addressing, whole-table double-casts, flush cache Operator spec implemented verbatim: (1) all 65536 mailboxes spawned UPFRONT — standing ownership registry, spawn measured separately (1.1-2.7 s, 17-40 us/actor); (2) two fixed aligned indices — mailbox idx == SoA row idx; ownership guarantee = the row_owner[i]==i binding plus write-on-behalf at the sink, never a message path; (3) codebook index — identity minted once (Morton place + probe + name verify at mint only), worker-local memo, hot loop = direct CAM index with no probe and no compare; (4) whole-table DOUBLE-CASTS — full 65536-row SoA batch tables frozen into one Arc cast to BOTH the mailbox-ownership- guarantee sink and the Lance row-address sink (156 batches -> 312 messages total vs ~63K flat / ~2.6K orchestrated; messages track batches, independent of occupancy and address space); (5) flush cache — refcount-gated table pool, peak 2-3 tables/worker, flushing and reindex-next interleave, worker never waits. Double-WAL asserted: ownership journal == lance journal == 156 batches + one DatasetVersion tick per batch — replayable from either end (the full W1b batch-writer shape). t6: steady state ~20-22 Mrows/s (~1/2 of G(1) 43 — residency-footprint attribution flagged CONJECTURE); total incl. one-time spawn 3.2-6.1. Ruling: the batch pipeline wins the messaging war outright and carries the strongest witness story; the standing 64K registry is affordable infrastructure; the remaining surface is residency, not architecture. README §5.6 + lane table; plan Addendum-13 t6; board E-1BRC-BATCH-PIPELINE-1. Gates: 14 std / 16 lane-i / 23 all-features tests green (incl. lane I parity with recycled flush cache + codebook mint idempotence); clippy -D warnings exit 0; fmt clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
… spot (46 Mrows/s, equals best streamed topology, double-WAL) One parameterized lane answers the operator's four follow-ups as knobs: grid (4096 gridlake vs 65536), sink lanes (1/8/64), registry (on/off), over lane I's unchanged batch pipeline (codebook CAM, whole-table Arc double-cast, flush cache; per-batch witness on every lane asserted). t7 (recipe corpus, 4 cores; same-session refs G(1)=46.3 / H=40.5 / F=70.1): J(4096, 1 lane, no registry) = 46.2-46.3 — the measured sweet spot. Registry ON halves steady state net of spawn (t6 residency CONJECTURE -> FINDING). Grid 65536 -> 40 (L2-busting table+memo working set ~2.5 MB/worker). Lanes: 1 suffices, 8 free, 64 over-lanes — lane count scales with per-batch APPLY work, never data. Composed recipe: 64x64 gridlake batch SoA + codebook CAM + 1-8 lane pairs + whole-table double-cast + flush cache; ownership as the index-aligned guarantee table (no standing per-cell actor registry); BF16 planes per ndarray #227's proven VDPBF16PS tier as the tile-GEMM continuation. Gates: 25 all-features tests green (incl. lane J 4-corner knob-matrix parity); clippy -D warnings exit 0; fmt clean. README §5.7; plan Addendum-13 t7; board E-1BRC-GRIDLAKE-SWEETSPOT-1. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…rray_chunks wiring Consolidation per operator request: (1) FINDINGS.md — the agnostic record: environment, methods, all t0-t7 measurement tables, and all 11 asserted invariants WITH their code (merge algebra, cross-method parity, corpus reproducibility, witness completeness single- and double-sink, aligned-index ownership guarantee, codebook mint idempotence, flush-cache refcount exclusivity, bounded-probe panic, lifecycle legality, SIMD provenance); no interpretation. (2) COMMENTARY.md — this session's prime stored separately (readings, composed recipe, flagged uncertainty, suggested lab sweeps) so another session can read the findings from its own angle. (3) src/presets.rs (feature presets) — the 8 batching methods frozen as named reproducible presets with one shared signature and one parity harness: all_presets_agree_with_lane_a asserts every preset byte-identical to lane A, so lab sweeps inherit correctness for free. (4) lane B stride walk now routes through ndarray::simd::array_chunks (simd_ops.rs non-overlapping walker; array_windows is the overlapping GEMM-style sibling, deliberately unused for delimiter scanning); simd_soa.rs SoaBytes noted as the open follow-up for vectorized sink sweeps. Gates: 14 std / 16 lane-b / 21 presets tests green; clippy -D warnings exit 0; fmt clean. (Probe target/debug purged mid-round — container disk hit 100%; all gates re-run clean after.) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
CodeRabbit #635 r3515365083 (open, Functional Correctness / Minor): lane H partitioned the 65536 Morton-slot space TWICE independently — `routers_n` (worker grouping) and `owners_nominal` (router-internal owner_idx). When the two grids don't nest (e.g. 3 routers vs 16 owners: 21845-wide router ranges vs 4096-wide owner ranges), one owner_idx range straddled a router boundary and was lazily spawned as a separate ShardOwner in two routers — duplicating MailboxIds and inflating the live-owner count that lane H exists to measure ("activates only ~occupancy owners"). Final aggregates were already correct (each station lands in one router→one owner, disjoint maps merged); the defect was measurement integrity. Fix (CodeRabbit option 2 — route owners→routers via owner_idx): - `worker_scan_grouped` now takes a `group_of: impl Fn(u64) -> usize` closure instead of the hardwired `slot * groups >> 16`, so the caller chooses the bucket mapping. - lane H adds `owner_of(h, owners_nominal)` (the fine, stable global tile grain) and `router_of_owner(owner_idx, owners_nominal, routers_n)` (coarse grain derived FROM owner_idx by integer division). The worker routes each entry to `router_of_owner(owner_of(h))`, a pure function of owner_idx, so every owner_idx maps to exactly one router by construction. The router computes the same `owner_of` — identical formula, single source. - `RouteMsg::Drain` now also returns each router's activated owner_idxs; the coordinator asserts their union is duplicate-free (`unique.len() == all.len()`), so a straddle regression fails the run. The existing lane_h test already exercises the misaligned 16-owner / 3-router case; its doc now names it as the straddle guard. All 21 onebrc-probe tests pass (incl. presets parity); clippy -D warnings clean; fmt clean. Thread 1 (README omits lane h) was already resolved in 60bc1e6. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
crates/onebrc-probe/src/lane_i.rs (1)
93-110: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valueSame unbounded-probe pattern as
lane_j.rs::GridCodebook::mint, lower risk here.
Codebook::minthas no capacity guard either, butSLOTS(65536) comfortably exceeds the 1BRC spec's ~10,000-station ceiling, so this is much less likely to hang in practice than lane J's 4096-cell default. Still, the same defensiveassert!(self.len < SLOTS, ...)would make this explicit rather than implicit.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/lane_i.rs` around lines 93 - 110, `Codebook::mint` can still probe forever when the table fills, even though the risk is lower than in `lane_j.rs::GridCodebook::mint`. Add a defensive capacity check in `mint` before entering the probe loop, using the existing `self.len` and `SLOTS` symbols, so the code explicitly fails once all slots are consumed instead of relying on the spec’s station limit.crates/onebrc-probe/src/lane_b.rs (2)
71-74: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valuePrefer
U8x32::LANESover the literal32inarray_chunks::<u8, 32>.
aligned_endis derived fromU8x32::LANES, but the chunk size passed toarray_chunksis a separate hardcoded literal. IfLANESever changes, these two would silently diverge.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/lane_b.rs` around lines 71 - 74, The chunking logic in lane_b uses a hardcoded 32 in array_chunks::<u8, 32> while aligned_end and pos already rely on U8x32::LANES, so update the array_chunks call to use the same U8x32::LANES constant. Keep the chunk size, aligned_end calculation, and U8x32::from_slice path consistent so the code stays correct if the lane count changes.
46-46: 🗄️ Data Integrity & Integration | 🔵 Trivial | 💤 Low valueUse
U8x32::LANEShere instead of the hardcoded32. The loop already derives its stride fromU8x32::LANES, so this keeps the chunk size and offset math aligned.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/lane_b.rs` at line 46, The loop in lane_b is using a hardcoded chunk size of 32 instead of the SIMD lane count, which can drift from the existing stride logic. Update the chunking/offset math in the lane_b code to use U8x32::LANES consistently, matching the stride already derived from that symbol so the array_chunks-based processing stays aligned.crates/onebrc-probe/src/lane_j.rs (1)
56-181: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoffSignificant duplication with
lane_i.rs'sCodebook/SlotMemo/BatchTable/FlushCache.
GridCodebook,GridMemo,GridBatch, andGridFlushCacheare near-identical to their lane_i counterparts, generalized only by a runtimegridsize. Given lane J is documented as "lane I's shape with knobs," lane I's fixed-SLOTStypes could be expressed as lane J withgrid = SLOTS, removing ~150 lines of parallel logic to maintain.Also applies to: 361-398
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/src/lane_j.rs` around lines 56 - 181, `GridCodebook`, `GridMemo`, `GridBatch`, and `GridFlushCache` are duplicating the same slot-memo/batch logic already present in lane I, differing mainly by a configurable grid size. Refactor the lane J implementation to reuse the shared abstractions instead of maintaining parallel copies: make the common codebook/memo/batch/cache logic parameterized by grid and have lane I instantiate it with `SLOTS`, keeping the lane J-specific behavior only where it truly differs. Preserve the existing behavior of `mint`, `resolve`, `observe`, and `reset`, but consolidate the near-identical implementations into a single path to reduce maintenance overhead.crates/onebrc-probe/Cargo.toml (1)
61-61: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsider pinning the
ractorgit dependency to a rev/tag.The git source has no
rev/tag/branch, so it floats on the fork's default branch — a rebuild months later could silently pick up breaking changes.♻️ Suggested pin
-ractor = { git = "https://github.com/AdaWorldAPI/ractor", optional = true, default-features = false, features = ["tokio_runtime"] } +ractor = { git = "https://github.com/AdaWorldAPI/ractor", rev = "<pin-me>", optional = true, default-features = false, features = ["tokio_runtime"] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/onebrc-probe/Cargo.toml` at line 61, The ractor git dependency is floating on the fork’s default branch because it has no pinned source, which can cause rebuilds to silently pick up different code. Update the ractor entry in Cargo.toml to pin it to a specific rev, tag, or branch so the dependency is reproducible and stable over time.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/v3/INTEGRATION-PLAN.md:
- Line 688: The Markdown in INTEGRATION-PLAN.md is triggering MD018 because a
wrapped line starts with an inline `#227-style` reference that looks like a
heading marker. Update the affected prose so the line no longer begins with a
bare # reference, or reflow the surrounding text in the section containing the
`#227` reference to keep the hash from appearing at the start of a line. Use the
nearby markdown content in the integration plan to locate and adjust the wrapped
sentence.
In `@crates/onebrc-probe/src/lane_j.rs`:
- Around line 76-91: GridCodebook::mint can loop forever when the table is full
and no empty slot remains; update the probing logic in mint to detect exhaustion
instead of continuing indefinitely. Use the existing GridCodebook state (for
example len, mask, used, tags, names) to cap the probe count or check fullness
before/while probing, and return a failure result or otherwise fail fast once
every slot has been examined. Keep the existing matching behavior for identical
(h, name) entries, but ensure mint always terminates even when the number of
unique station identities exceeds the grid size.
---
Nitpick comments:
In `@crates/onebrc-probe/Cargo.toml`:
- Line 61: The ractor git dependency is floating on the fork’s default branch
because it has no pinned source, which can cause rebuilds to silently pick up
different code. Update the ractor entry in Cargo.toml to pin it to a specific
rev, tag, or branch so the dependency is reproducible and stable over time.
In `@crates/onebrc-probe/src/lane_b.rs`:
- Around line 71-74: The chunking logic in lane_b uses a hardcoded 32 in
array_chunks::<u8, 32> while aligned_end and pos already rely on U8x32::LANES,
so update the array_chunks call to use the same U8x32::LANES constant. Keep the
chunk size, aligned_end calculation, and U8x32::from_slice path consistent so
the code stays correct if the lane count changes.
- Line 46: The loop in lane_b is using a hardcoded chunk size of 32 instead of
the SIMD lane count, which can drift from the existing stride logic. Update the
chunking/offset math in the lane_b code to use U8x32::LANES consistently,
matching the stride already derived from that symbol so the array_chunks-based
processing stays aligned.
In `@crates/onebrc-probe/src/lane_i.rs`:
- Around line 93-110: `Codebook::mint` can still probe forever when the table
fills, even though the risk is lower than in `lane_j.rs::GridCodebook::mint`.
Add a defensive capacity check in `mint` before entering the probe loop, using
the existing `self.len` and `SLOTS` symbols, so the code explicitly fails once
all slots are consumed instead of relying on the spec’s station limit.
In `@crates/onebrc-probe/src/lane_j.rs`:
- Around line 56-181: `GridCodebook`, `GridMemo`, `GridBatch`, and
`GridFlushCache` are duplicating the same slot-memo/batch logic already present
in lane I, differing mainly by a configurable grid size. Refactor the lane J
implementation to reuse the shared abstractions instead of maintaining parallel
copies: make the common codebook/memo/batch/cache logic parameterized by grid
and have lane I instantiate it with `SLOTS`, keeping the lane J-specific
behavior only where it truly differs. Preserve the existing behavior of `mint`,
`resolve`, `observe`, and `reset`, but consolidate the near-identical
implementations into a single path to reduce maintenance overhead.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: bfc9f672-ce62-4c29-b1a8-0ea5761cdd59
📒 Files selected for processing (14)
.claude/board/EPIPHANIES.md.claude/v3/INTEGRATION-PLAN.mdcrates/onebrc-probe/COMMENTARY.mdcrates/onebrc-probe/Cargo.tomlcrates/onebrc-probe/FINDINGS.mdcrates/onebrc-probe/README.mdcrates/onebrc-probe/src/lane_b.rscrates/onebrc-probe/src/lane_g.rscrates/onebrc-probe/src/lane_h.rscrates/onebrc-probe/src/lane_i.rscrates/onebrc-probe/src/lane_j.rscrates/onebrc-probe/src/lib.rscrates/onebrc-probe/src/main.rscrates/onebrc-probe/src/presets.rs
✅ Files skipped from review due to trivial changes (3)
- crates/onebrc-probe/FINDINGS.md
- .claude/board/EPIPHANIES.md
- crates/onebrc-probe/README.md
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/onebrc-probe/src/main.rs
- crates/onebrc-probe/src/lane_g.rs
| gridlake batch SoA + codebook CAM + 1–8 lane pairs + whole-table | ||
| double-cast + flush cache; ownership as the index-aligned guarantee | ||
| table, NOT a standing per-cell actor registry; BF16 planes per ndarray | ||
| #227's proven VDPBF16PS tier as the tile-GEMM continuation. README |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Markdownlint MD018: missing space after #.
Likely triggered by an inline #227-style reference sitting at the start of a wrapped line, which markdown parses as an ATX heading attempt.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 688-688: No space after hash on atx style heading
(MD018, no-missing-space-atx)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.claude/v3/INTEGRATION-PLAN.md at line 688, The Markdown in
INTEGRATION-PLAN.md is triggering MD018 because a wrapped line starts with an
inline `#227-style` reference that looks like a heading marker. Update the
affected prose so the line no longer begins with a bare # reference, or reflow
the surrounding text in the section containing the `#227` reference to keep the
hash from appearing at the start of a line. Use the nearby markdown content in
the integration plan to locate and adjust the wrapped sentence.
Source: Linters/SAST tools
| fn mint(&mut self, h: u64, name: &[u8]) -> u16 { | ||
| let mut s = morton_slot(h) as usize & self.mask; | ||
| loop { | ||
| if !self.used[s] { | ||
| self.used[s] = true; | ||
| self.tags[s] = h; | ||
| self.names[s] = name.to_vec(); | ||
| self.len += 1; | ||
| return s as u16; | ||
| } | ||
| if self.tags[s] == h && self.names[s] == name { | ||
| return s as u16; | ||
| } | ||
| s = (s + 1) & self.mask; | ||
| } | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🔴 Critical | ⚡ Quick win
GridCodebook::mint can spin forever once the grid fills up.
With grid as low as 4096 (the default "gridlake" preset), any corpus with more unique station identities than grid cells will cause mint to loop indefinitely — no empty slot is ever found and no matching (h, name) exists, so the loop never terminates. The classic 1BRC dataset spec allows up to 10,000 distinct stations, well above the 4096-cell default, so this is reachable with realistic data even though the probe's own synthetic corpora stay under it.
🛡️ Proposed fix: fail fast instead of hanging
fn mint(&mut self, h: u64, name: &[u8]) -> u16 {
+ assert!(
+ self.len < self.tags.len(),
+ "GridCodebook capacity ({}) exceeded — increase `grid`",
+ self.tags.len()
+ );
let mut s = morton_slot(h) as usize & self.mask;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn mint(&mut self, h: u64, name: &[u8]) -> u16 { | |
| let mut s = morton_slot(h) as usize & self.mask; | |
| loop { | |
| if !self.used[s] { | |
| self.used[s] = true; | |
| self.tags[s] = h; | |
| self.names[s] = name.to_vec(); | |
| self.len += 1; | |
| return s as u16; | |
| } | |
| if self.tags[s] == h && self.names[s] == name { | |
| return s as u16; | |
| } | |
| s = (s + 1) & self.mask; | |
| } | |
| } | |
| fn mint(&mut self, h: u64, name: &[u8]) -> u16 { | |
| assert!( | |
| self.len < self.tags.len(), | |
| "GridCodebook capacity ({}) exceeded — increase `grid`", | |
| self.tags.len() | |
| ); | |
| let mut s = morton_slot(h) as usize & self.mask; | |
| loop { | |
| if !self.used[s] { | |
| self.used[s] = true; | |
| self.tags[s] = h; | |
| self.names[s] = name.to_vec(); | |
| self.len += 1; | |
| return s as u16; | |
| } | |
| if self.tags[s] == h && self.names[s] == name { | |
| return s as u16; | |
| } | |
| s = (s + 1) & self.mask; | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/onebrc-probe/src/lane_j.rs` around lines 76 - 91, GridCodebook::mint
can loop forever when the table is full and no empty slot remains; update the
probing logic in mint to detect exhaustion instead of continuing indefinitely.
Use the existing GridCodebook state (for example len, mask, used, tags, names)
to cap the probe count or check fullness before/while probing, and return a
failure result or otherwise fail fast once every slot has been examined. Keep
the existing matching behavior for identical (h, name) entries, but ensure mint
always terminates even when the number of unique station identities exceeds the
grid size.
The full operator arc on the Addendum-13 probe, five follow-ups: kanban vs without (G) → one mailbox per SoA (topology corrected) → orchestration sweet spot (H) → the batch-pipeline spec: 64K upfront + aligned indices + codebook CAM + whole-table double-cast + flush cache (I) → the knob matrix: grid × lanes × registry (J). ndarray rebased onto master (#227 — sibling Morton/morsel probe; its VDPBF16PS tier is the BF16 continuation).
The final ledger (same recipe corpus, 4 cores; same-session refs G(1)=46.3, H=40.5, F=70.1)
The operator's questions, answered by knobs (README §5.7)
The composed recipe: 64×64 gridlake batch SoA + codebook CAM addressing + 1–8 sink lane pairs + whole-table
Arcdouble-cast + flush cache; ownership as the index-aligned guarantee table (no standing per-cell actor registry); BF16 planes per #227's tier when tile-GEMM lands.Gates
All-features test matrix green (incl. lane J 4-corner knob parity, lane I recycled-flush-cache parity, codebook idempotence, double-cast completeness asserts); clippy
-D warningsexit 0; fmt clean. Board:E-1BRC-KANBAN-UPDATE-1,E-1BRC-OWNER-GRANULARITY-1,E-1BRC-ORCHESTRATION-SWEETSPOT-1,E-1BRC-BATCH-PIPELINE-1,E-1BRC-GRIDLAKE-SWEETSPOT-1; plan Addendum-13 t4–t7; README §5.4–§5.7.🤖 Generated with Claude Code
https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
Summary by CodeRabbit
New Features
Documentation
Tests