
W2 arc: real-owner KanbanActor probe, W2a LAYOUT-GATED ruling, R-2 edges proof, semantic-OS capstone on the board #634

Merged: AdaWorldAPI merged 14 commits into main from claude/v3-substrate-migration-review-o0yoxv on Jul 2, 2026

@AdaWorldAPI AdaWorldAPI commented Jul 2, 2026

The thinking↔substrate lane's W2 opening moves (companion: the rs-graph-llm branch carries W3b/M25 — see below).

Shipped

  • W2b probe (D-V3-W2b): KanbanActor spawned over the REAL cognitive_shader_driver::mailbox_soa::MailboxSoA for the first time (previously TestBoard-only). Dev-dependency only: the structural-owner proof gains no runtime dep. Three probes green: two legal advances persist on the real SoA purely via KanbanMsg; Planning→Commit is rejected with RubiconTransitionError and the row is unchanged; sole-mutator is shown structurally (the SoA moves into Actor::spawn, no handle survives).
  • R-2 closure residual (edges_only_strided_read_via_descriptors_r2_residual): every row's EdgeBlock recovered touching exactly 16 bytes/row via the NODE_ROW_COLUMNS descriptor + stride; the frozen 512-byte unit untouched (the read-side proof of the "edges cheap without loading values" ruling). Contract 793/793.
  • W2a envelope-audit ruling (Addendum-12a) — the board-as-tenant spec is LAYOUT-GATED: byte-sound, zero ENVELOPE_LAYOUT_VERSION change, but the textbook I-LEGACY shape. Decisions recorded: BoardAggregates = NEW append-only 10th ValueTenant @ row_offset 152 (reuse-reinterpretation rejected — pre-P4 inexpressible); board classid via the next batched mint only; tests T1–T6 mandatory; nan_projection + symbiont fixed-offset sweepers identified as the two EXPOSED readers to gate. Guardrails §2 vocabulary reconciled to the ruling.
  • Boards: E-SEMANTIC-OS-CONVERGENCE-1 (the operator's capstone, canonical text + [G] grounding table + two sharpenings: DUPLICATED is the third membrane failure mode; a membrane without a build-failing tripwire is prose), M25 status → SHIPPED v1, Addendum-12/12a, post-merge hygiene for #632 ("Cross-session intake: RouteBucketTyped (C6) merged, emission_scan minted, OCR codebook mirror, GraphRAG-rs inventory + operator rulings") and #148 ("docs: Statusmatrix (DE) — was funktioniert, Mitigation, Schulden, Pot…"), incl. the ogar-vocab lock bump that cleared COUNT_FUSE (68 == 68 after the fuse demonstrably fired in the merge window).
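
The sole-mutator claim in the W2b bullet can be pictured with a std-only sketch. Nothing below is the real KanbanActor, KanbanMsg, or MailboxSoA API: `Board`, `Phase`, `Msg`, `TransitionError`, `spawn_owner`, and the assumed phase order Planning → CognitiveWork → Evaluation → Commit are hypothetical stand-ins. The point it illustrates is structural: once the board is moved into the spawned owner, the caller holds only a message channel, so every mutation has to travel as a message, and an illegal jump leaves the row untouched.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase { Planning, CognitiveWork, Evaluation, Commit }

#[derive(Debug, PartialEq, Eq)]
pub struct TransitionError { pub from: Phase, pub to: Phase }

/// Hypothetical stand-in for the owned board: one phase per row.
pub struct Board { rows: Vec<Phase> }

impl Board {
    pub fn new(n: usize) -> Self { Board { rows: vec![Phase::Planning; n] } }

    /// Only single forward steps are legal (assumed order); anything else
    /// is rejected and the row keeps its previous phase.
    fn advance(&mut self, row: usize, to: Phase) -> Result<(), TransitionError> {
        let from = self.rows[row];
        let legal = matches!(
            (from, to),
            (Phase::Planning, Phase::CognitiveWork)
                | (Phase::CognitiveWork, Phase::Evaluation)
                | (Phase::Evaluation, Phase::Commit)
        );
        if legal { self.rows[row] = to; Ok(()) } else { Err(TransitionError { from, to }) }
    }
}

pub enum Msg {
    Advance { row: usize, to: Phase, reply: Sender<Result<(), TransitionError>> },
    Phase { row: usize, reply: Sender<Phase> },
    Stop,
}

/// Takes the board BY VALUE: after this call the caller has no handle to it.
pub fn spawn_owner(board: Board) -> (Sender<Msg>, JoinHandle<()>) {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut board = board;
        for msg in rx {
            match msg {
                Msg::Advance { row, to, reply } => { let _ = reply.send(board.advance(row, to)); }
                Msg::Phase { row, reply } => { let _ = reply.send(board.rows[row]); }
                Msg::Stop => break,
            }
        }
    });
    (tx, handle)
}

/// Request/reply helpers so a caller can drive the owner synchronously.
pub fn advance(tx: &Sender<Msg>, row: usize, to: Phase) -> Result<(), TransitionError> {
    let (reply, rx) = channel();
    tx.send(Msg::Advance { row, to, reply }).expect("owner alive");
    rx.recv().expect("owner replied")
}

pub fn phase(tx: &Sender<Msg>, row: usize) -> Phase {
    let (reply, rx) = channel();
    tx.send(Msg::Phase { row, reply }).expect("owner alive");
    rx.recv().expect("owner replied")
}
```

In this sketch, writing `board.rows[0] = ...` after `spawn_owner(board)` is a compile error (use after move), which is the same kind of structural guarantee the probe describes.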

Companion (rs-graph-llm, same branch name)

W3b/M25: graph-flow gains a kanban feature — KanbanSessionStorage (snapshot upsert + append-only real-KanbanMove log, doc-pinned V1 Rubicon mapping). The M25 kill-mid-graph replay gate is GREEN: killed after task 2, resumed from the same storage on a fresh runner, completed with no repeats/gaps and the pinned column sequence [CognitiveWork, Evaluation, Commit]. Verified in an isolated two-crate workspace (16+59 tests) — in-repo cargo is blocked by the pre-existing burn-submodule 403, stash-verified unrelated.
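
The kill-mid-graph replay gate described above can be sketched in a few lines, assuming a storage that holds a snapshot plus an append-only log. `Storage`, `run`, and `kill_after` are hypothetical names for illustration and are not the KanbanSessionStorage API; only the column sequence is taken from the description.

```rust
/// Hypothetical stand-in for session storage.
#[derive(Default)]
pub struct Storage {
    /// Snapshot (upserted): index of the next task to run.
    pub next_task: usize,
    /// Append-only log of the columns reached so far.
    pub log: Vec<&'static str>,
}

/// The pinned column sequence quoted in the PR description.
pub const COLUMNS: [&str; 3] = ["CognitiveWork", "Evaluation", "Commit"];

/// One runner. It keeps no state of its own, so every call behaves like a
/// fresh runner resuming from `storage`. `kill_after` simulates a crash
/// once that many tasks have completed in this run. Returns true when the
/// whole graph has completed.
pub fn run(storage: &mut Storage, kill_after: Option<usize>) -> bool {
    let mut done_this_run = 0;
    while storage.next_task < COLUMNS.len() {
        if kill_after == Some(done_this_run) {
            return false; // simulated kill: nothing else is written
        }
        storage.log.push(COLUMNS[storage.next_task]); // append the move
        storage.next_task += 1;                       // then upsert the snapshot
        done_this_run += 1;
    }
    true
}
```

Killing after two tasks and calling `run` again on the same storage yields the full three-column log with no repeats and no gaps. A real implementation would also have to make the log append and the snapshot upsert atomic with respect to a crash, which this sketch does not model.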

🤖 Generated with Claude Code

https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM



Summary by CodeRabbit

  • New Features
    • Added a standalone onebrc-probe CLI to generate deterministic 1BRC datasets and run multiple benchmark “lanes” (including SIMD, actor-based, kanban-scheduled, and Morton/radix routing variants).
  • Documentation
    • Updated operational guidance and released workflow/validation rules; refreshed shipped-pr tracking and milestone details.
  • Tests
    • Added supervisor-backed integration checks for real-owner phase transitions and added coverage for edge-only residual reads.
  • Chores
    • Extended CI to run supervisor feature tests for the supervisor crate.

claude added 7 commits July 2, 2026 15:26
…NT_FUSE (68 == 68)

The compile-time fuse fired in the between-merges window (mirror 68 vs
locked ogar_vocab 65) exactly as designed, and cleared with the lock
bump (lance-graph-ogar's own lockfile is gitignored — CI resolves the
merged OGAR main fresh). lance-graph-ogar 81 tests green locally.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…ard-row spec sketch, execution order)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…ng of the actor to the production owner

Closes the D-V3-W2b gap (actor was only ever exercised against its own
TestBoard): dev-dependency-only on cognitive-shader-driver (the
structural-owner proof gains no runtime dep); three probes green —
two legal advances persist on the real SoA via KanbanMsg only,
Planning->Commit rejected with RubiconTransitionError and the row
unchanged, and sole-mutator shown structurally (the SoA is moved into
Actor::spawn; no handle survives).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…ors — R-2 closure residual

The read-side proof of the operator's edges-cheap ruling: every row's
EdgeBlock recovered touching exactly 16 bytes/row, driven purely by the
Edges descriptor's (row_offset, elems_per_row) + NODE_ROW_STRIDE; the
512-byte storage unit untouched.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
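
A minimal sketch of a descriptor-driven strided read, for readers unfamiliar with the shape of the proof above. Only the 512-byte row unit and the 16 bytes per row come from the commit message; `ColumnDescriptor`, `read_column`, the `elem_bytes` field, and the offset used in the usage note are assumptions made for illustration and are not the lance-graph-contract types.

```rust
/// Size of one storage row in bytes (the 512-byte unit from the commit message).
pub const ROW_STRIDE: usize = 512;

/// Hypothetical column descriptor: where a column lives inside each row.
#[derive(Clone, Copy)]
pub struct ColumnDescriptor {
    pub row_offset: u32,
    pub elem_bytes: u32,
    pub elems_per_row: u32,
}

impl ColumnDescriptor {
    /// Bytes this column occupies in every row.
    pub fn byte_len(&self) -> usize {
        (self.elem_bytes * self.elems_per_row) as usize
    }
}

/// Returns one borrowed slice per row, covering only the described column.
/// No byte outside `row_offset..row_offset + byte_len()` is read, and
/// nothing is copied.
pub fn read_column<'a>(rows: &'a [u8], desc: &ColumnDescriptor) -> Vec<&'a [u8]> {
    let start = desc.row_offset as usize;
    let len = desc.byte_len();
    assert!(start + len <= ROW_STRIDE, "column must fit inside one row");
    rows.chunks_exact(ROW_STRIDE)
        .map(|row| &row[start..start + len])
        .collect()
}
```

With a descriptor such as `ColumnDescriptor { row_offset: 64, elem_bytes: 1, elems_per_row: 16 }` (the offset is made up), each returned slice is exactly 16 bytes long and the rest of the row is never touched.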
…membranes, 'do not copy meaning') pinned with grounding table + broadcast

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…rdAggregates 10th tenant decided; T1-T6 + sweeper exposure) + guardrails vocab reconciled

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

📝 Walkthrough

This PR adds board and knowledge records for a W2/W3b operational arc, introduces the standalone onebrc-probe crate with multiple lane implementations and CLI tooling, and adds supervisor, CI, and canonical-node coverage around the new probes.

Changes

Board and Knowledge Documentation

| Layer / File(s) | Summary |
|---|---|
| Semantic findings and broadcast: .claude/board/EPIPHANIES.md, .claude/board/CROSS_SESSION_BROADCAST.md | Adds new dated findings and matching broadcast entries for semantic convergence and the W2 arc. |
| PR #632 tracking records: .claude/board/LATEST_STATE.md, .claude/board/PR_ARC_INVENTORY.md | Adds PR #632 state and arc entries with merge metadata and confidence notes. |
| Integration plan and guardrails: .claude/v3/INTEGRATION-PLAN.md, .claude/v3/ENTROPY-MILESTONES.md, .claude/v3/knowledge/sonnet-worker-guardrails.md | Adds W2/W3b addenda, updates M25 to shipped, and revises the kanban board vocabulary entry. |

Estimated code review effort: 3 (Moderate) | ~25 minutes

onebrc-probe crate and supervisor coverage

| Layer / File(s) | Summary |
|---|---|
| Crate scaffold and usage docs: Cargo.toml, crates/onebrc-probe/.cargo/config.toml, crates/onebrc-probe/Cargo.toml, crates/onebrc-probe/README.md | Adds the standalone crate manifest, workspace exclusion, CPU pinning, feature flags, and usage/measurement documentation. |
| Stats, parsing, generator, and SHA-256: crates/onebrc-probe/src/lib.rs, crates/onebrc-probe/src/gen.rs, crates/onebrc-probe/src/sha256.rs | Defines shared aggregation/parsing primitives, deterministic corpus generation, and streaming digest support with tests. |
| Lane implementations and CLI: crates/onebrc-probe/src/lane_b.rs, crates/onebrc-probe/src/lane_d.rs, crates/onebrc-probe/src/lane_e.rs, crates/onebrc-probe/src/lane_f.rs, crates/onebrc-probe/src/main.rs | Adds SIMD, actor, kanban, Morton/radix, and command-line execution paths with lane-specific tests. |
| Supervisor probe, CI, and edge descriptor test: crates/lance-graph-supervisor/Cargo.toml, crates/lance-graph-supervisor/tests/w2b_real_owner_probe.rs, .github/workflows/rust-test.yml, crates/lance-graph-contract/src/canonical_node.rs | Adds the real-owner Kanban probe, runs it in CI, and adds the edge-only row-descriptor test. |

Estimated code review effort: 4 (Complex) | ~60 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Test
  participant KanbanActor
  participant MailboxSoA

  Test->>KanbanActor: spawn(real_mailbox())
  Test->>KanbanActor: KanbanMsg::Advance
  KanbanActor->>MailboxSoA: persist phase
  Test->>KanbanActor: KanbanMsg::Phase
  KanbanActor->>MailboxSoA: read phase
  MailboxSoA-->>Test: current phase
  Test->>KanbanActor: stop + join

Poem

A rabbit raced through lanes and logs,
Through boards and probes and runtime cogs.
It hopped from bytes to phase to hash,
Then left a tidy, measured stash. 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main shipped areas: W2b real-owner probe, W2a layout ruling, R-2 edges proof, and semantic-OS board updates. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/lance-graph-supervisor/tests/w2b_real_owner_probe.rs`:
- Around line 18-19: Add CI coverage for the supervisor probe tests by updating
the Rust test workflow to run `cargo test --manifest-path
crates/lance-graph-supervisor/Cargo.toml --features supervisor`; the issue is
that the `w2b_real_owner_probe` tests are gated behind the `supervisor` feature
and currently never execute in CI, so add a dedicated job alongside the existing
`lance-graph`, `lance-graph-contract`, and `deepnsm` jobs to ensure the
`lance-graph-supervisor` crate’s probes are exercised.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ace9a976-c12e-4772-81f8-f1b317f80977

📥 Commits

Reviewing files that changed from the base of the PR and between df36747 and cda3b9c.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • .claude/board/CROSS_SESSION_BROADCAST.md
  • .claude/board/EPIPHANIES.md
  • .claude/board/LATEST_STATE.md
  • .claude/board/PR_ARC_INVENTORY.md
  • .claude/v3/ENTROPY-MILESTONES.md
  • .claude/v3/INTEGRATION-PLAN.md
  • .claude/v3/knowledge/sonnet-worker-guardrails.md
  • crates/lance-graph-contract/src/canonical_node.rs
  • crates/lance-graph-supervisor/Cargo.toml
  • crates/lance-graph-supervisor/tests/w2b_real_owner_probe.rs

Comment thread crates/lance-graph-supervisor/tests/w2b_real_owner_probe.rs
claude added 7 commits July 2, 2026 16:31
…-D warnings); row_offset (u32) keeps try_from

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…probes were never executed in CI)

The whole crate is feature-gated, so no existing step reached it — the
'green CI that didn't test the real fuse' failure mode. Flagged by
coderabbit on #634.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…strate-native Morton-tile cascaded shader lane

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
…d 1BRC workload (Addendum-13)

Standalone workspace-excluded crate, zero deps for the baselines.
Deterministic SplitMix64 corpus generator (procedural station names, no
external dataset) with recipe+hash emission per the archival convention;
in-crate SHA-256 verified against system sha256sum on the real corpus.
Lane A scalar single-pass (integer-tenths parse, no float in the hot
loop); lane C newline-aligned chunks + owned per-worker maps +
commutative merge (the borrow-strategy shape). 11/11 tests incl. A==C
aggregate equality and generator determinism.

t0 @ 10M rows, 4 cores (container): A 7.11 Mrows/s; C 26.41 Mrows/s
(3.71x). recipe rows=10000000 seed=42 sha256=f1853caa...5691.
Lanes B (ndarray SIMD) / D (ractor ratio) / E (kanban scheduling tax) /
F (Morton-tile cascaded shader — the addressing-is-aggregation thesis
test) are README-stubbed follow-ups.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
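
For readers who want the lane C shape in miniature, here is a std-only sketch of newline-aligned chunking, owned per-worker maps, and a commutative merge, with an integer-tenths parse in the spirit of lane A. It is an illustration written for this page and is not the onebrc-probe source; the helper names and the simplified `Stats` are assumptions, and the parser assumes well-formed readings with exactly one fractional digit.

```rust
use std::collections::BTreeMap;
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Stats { pub min: i32, pub max: i32, pub sum: i64, pub count: u64 }

impl Stats {
    fn single(t: i32) -> Self { Stats { min: t, max: t, sum: t as i64, count: 1 } }
    fn observe(&mut self, t: i32) { self.merge(&Stats::single(t)); }
    /// Commutative and associative, so partial maps can merge in any order.
    fn merge(&mut self, o: &Stats) {
        self.min = self.min.min(o.min);
        self.max = self.max.max(o.max);
        self.sum += o.sum;
        self.count += o.count;
    }
}

/// "-12.3" -> -123. Assumes well-formed input with one fractional digit.
fn parse_tenths(s: &[u8]) -> i32 {
    let (mut val, mut neg) = (0i32, false);
    for &b in s {
        match b {
            b'-' => neg = true,
            b'0'..=b'9' => val = val * 10 + (b - b'0') as i32,
            _ => {} // skips the '.'
        }
    }
    if neg { -val } else { val }
}

/// Single-threaded aggregation of one chunk into an owned map.
pub fn aggregate(chunk: &[u8]) -> BTreeMap<String, Stats> {
    let mut map: BTreeMap<String, Stats> = BTreeMap::new();
    for line in chunk.split(|&b| b == b'\n').filter(|l| !l.is_empty()) {
        let sep = match line.iter().position(|&b| b == b';') { Some(p) => p, None => continue };
        let name = String::from_utf8_lossy(&line[..sep]).into_owned();
        let t = parse_tenths(&line[sep + 1..]);
        map.entry(name).and_modify(|s| s.observe(t)).or_insert(Stats::single(t));
    }
    map
}

/// Cuts `data` into roughly `parts` pieces, each ending right after a '\n'.
pub fn newline_chunks(data: &[u8], parts: usize) -> Vec<&[u8]> {
    let target = data.len() / parts.max(1) + 1;
    let (mut out, mut start) = (Vec::new(), 0);
    while start < data.len() {
        let mut end = (start + target).min(data.len());
        while end < data.len() && data[end - 1] != b'\n' { end += 1; }
        out.push(&data[start..end]);
        start = end;
    }
    out
}

/// One scoped worker per chunk, each with its own map; merge at the end.
pub fn parallel(data: &[u8], workers: usize) -> BTreeMap<String, Stats> {
    let chunks = newline_chunks(data, workers);
    let partials: Vec<BTreeMap<String, Stats>> = thread::scope(|s| {
        let mut handles = Vec::new();
        for &chunk in &chunks {
            handles.push(s.spawn(move || aggregate(chunk)));
        }
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });
    let mut total: BTreeMap<String, Stats> = BTreeMap::new();
    for part in partials {
        for (name, stats) in part {
            total.entry(name).and_modify(|t| t.merge(&stats)).or_insert(stats);
        }
    }
    total
}
```

Because `merge` is order-independent, `parallel(data, n)` equals `aggregate(data)` for any worker count, which is the same kind of A==C equality the commit's tests assert.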
…) — t1 measured

Lane B: 32-byte-stride ;/\n scan via ndarray::simd::U8x32::cmpeq_mask
(all SIMD from ndarray::simd per the workspace rule; probe pinned to
x86-64-v3 so the ops are real AVX2 intrinsics), cross-block state carry,
scalar parse kept. Lane D: actor-per-worker over Arc<Vec<u8>> via the
AdaWorldAPI ractor fork (supervisor coordinates), identical chunking and
commutative merge as lane C — only the worker primitive changes.

Feature-gated (lane-b/lane-d); lanes A/C stay zero-dep (11/13/12/14 tests
green across the four feature combos; clippy -D warnings clean, incl. the
pre-existing gen.rs byte-grouping lint fixed here).

t1 (recipe corpus rows=10000000 seed=42 sha256=f1853caa…5691 re-verified,
4 cores, best-of-2): A 7.012 / B 7.455 (1.06x vs A — delimiter find is
not the bottleneck) / C 27.586 / D 22.078 Mrows/s (0.80x vs C — the
'ractor is a helper, not a messaging path' ruling as a measured ~20%
actor tax incl. the forced corpus copy). README §5.1 carries the tables.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
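
Lane B's scan uses `ndarray::simd::U8x32::cmpeq_mask` from the workspace's own ndarray, which cannot be reproduced here. As a loosely related, std-only illustration of the same idea (compare a whole block of bytes against a delimiter at once, then locate the hit from a bit mask), the sketch below swaps in the classic 8-byte SWAR zero-byte trick. It is a different and narrower technique than the AVX2 lane, and `match_mask` and `find_byte` are hypothetical names.

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Non-zero iff `word` contains `needle`. The LOWEST set bit always marks a
/// true match; bits above it may be false positives, so only the first hit
/// should be trusted.
fn match_mask(word: u64, needle: u8) -> u64 {
    let x = word ^ (LO * needle as u64); // matching bytes become 0x00
    x.wrapping_sub(LO) & !x & HI         // zero-byte detector
}

/// Index of the first `needle` in `hay`, scanning 8 bytes per step with a
/// scalar loop for the tail.
pub fn find_byte(hay: &[u8], needle: u8) -> Option<usize> {
    let mut i = 0;
    while i + 8 <= hay.len() {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&hay[i..i + 8]);
        let mask = match_mask(u64::from_le_bytes(buf), needle);
        if mask != 0 {
            // Little-endian load: byte k of the slice sits in bits 8k..8k+8.
            return Some(i + (mask.trailing_zeros() / 8) as usize);
        }
        i += 8;
    }
    hay[i..].iter().position(|&b| b == needle).map(|p| i + p)
}
```

The t1 reading in the commit message (B at 1.06x of A) is a reminder that a faster delimiter scan on its own may not move the total much when the scan is not the bottleneck.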
…ithin noise, ~66 us/card

One kanban card per batch: fresh KanbanActor<ProbeBoard> (real
lance-graph-supervisor machinery, feature lane-e) driven through the full
Rubicon forward arc (3x drive_version_tick) around each aggregation
batch; AtomicUsize batch queue, workers pull; combined journal asserted
3*batches legal KanbanMoves. ProbeBoard mirrors the supervisor's own
TestBoard shape (stand-in board; the real SoA board is lane F's business).

t2 (recipe corpus re-verified, 4 cores, best-of-2): C 28.310 / D 22.381 /
E(4) 22.963 / E(64) 22.477 / E(256) 22.118 Mrows/s. E-D ~ 0 at chunk
granularity — the per-card journal (spawn + 3 ticks + join) is invisible
next to the shared actor-boundary corpus copy; 256 cards cost ~4% total,
~66 us per card = ~0.01% of the W2d 550 ms Libet budget. The board is not
a scheduling threat; the actor boundary remains the only material tax.

Gates: 11/12/15 tests green (no-feature / lane-e / all-lanes), clippy -D
warnings clean, fmt clean. README §5.2 carries tables + readings; plan
Addendum-13 updated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
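
The "AtomicUsize batch queue, workers pull" arrangement mentioned above can be sketched with the standard library alone. This is an illustration of the pull pattern only: `process_batches` is a hypothetical name, the per-batch work is a plain sum, and the KanbanActor journal around each batch is deliberately left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Workers claim batch indices from a shared counter until none remain.
/// Returns (sum over all batches, number of batches processed).
pub fn process_batches(batches: &[Vec<u64>], workers: usize) -> (u64, usize) {
    let next = AtomicUsize::new(0);
    let per_worker: Vec<(u64, usize)> = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..workers {
            handles.push(s.spawn(|| {
                let (mut sum, mut taken) = (0u64, 0usize);
                loop {
                    // fetch_add hands every index to exactly one worker.
                    let i = next.fetch_add(1, Ordering::Relaxed);
                    if i >= batches.len() {
                        break;
                    }
                    sum += batches[i].iter().sum::<u64>();
                    taken += 1;
                }
                (sum, taken)
            }));
        }
        handles.into_iter().map(|h| h.join().expect("worker panicked")).collect()
    });
    per_worker
        .into_iter()
        .fold((0, 0), |(s, t), (ws, wt)| (s + ws, t + wt))
}
```

Relaxed ordering is enough in this sketch because the counter only hands out indices; the results are published through `join`, which provides the needed synchronization.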
…e-and-write 3x, addressing tax ~10%; probe complete

Lane F: station identity -> FNV-1a 64 -> two axis bytes nibble-interleaved
into the 16-bit Morton tile position (the GUID canon's 256x256
centroid-tile read) -> flat SoA accumulators (min[]/max[]/sum[]/count[],
open-addressed, name-verified on tag hit), gated indexed writes,
per-worker owned tables BUNDLE-merged. Lane R: byte-identical pipeline,
slot = hash & 0xFFFF — F-R isolates the Morton addressing tax exactly,
R-C prices flat-SoA-vs-BTreeMap. Both std-only; A/C zero-dep contract
holds. Collision-forcing probe test (constant slot fn) included.

t3 (recipe corpus re-verified, 4 cores, 5 passes, medians): C 28.3 /
F 77.4 / R 86.3 Mrows/s; F/R single-thread 21.5/23.3 vs A 7.16. Readings:
route-and-write beats look-up-and-compare 3x; the semantic address costs
~10% over plain radix at ~400-group cardinality (high-cardinality
prefix-local payoff untested, unclaimed); the accumulator, not the scan,
is where the win lives. Board: E-1BRC-ADDRESSING-1. Addendum-13 probe
COMPLETE — all lanes A-F + R measured on one regenerable recipe corpus.

Gates: 14 (std) / 18 (all features) tests green, clippy -D warnings
clean, fmt clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MLBnPuScZy6w9di2QEjsXM
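
The two addressing schemes compared above can be sketched as follows. FNV-1a 64 uses the published offset basis and prime; everything else is an assumption made for illustration: which two hash bytes serve as the axis bytes, the exact nibble order of the interleave, and the function names are not taken from lane_f.rs.

```rust
/// FNV-1a, 64-bit, with the standard offset basis and prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Interleaves the nibbles of two axis bytes into 16 bits:
/// [x_hi, y_hi, x_lo, y_lo]. The nibble order is an assumption.
pub fn nibble_interleave(x: u8, y: u8) -> u16 {
    let (xh, xl) = ((x >> 4) as u16, (x & 0x0F) as u16);
    let (yh, yl) = ((y >> 4) as u16, (y & 0x0F) as u16);
    (xh << 12) | (yh << 8) | (xl << 4) | yl
}

/// Lane-F-style slot: hash, take two axis bytes (assumed here to be the two
/// low bytes of the hash), interleave into a position in a 256x256 tile grid.
pub fn morton_slot(name: &[u8]) -> u16 {
    let h = fnv1a_64(name);
    nibble_interleave(h as u8, (h >> 8) as u8)
}

/// Lane-R-style slot: plain radix, the low 16 bits of the hash.
pub fn radix_slot(name: &[u8]) -> u16 {
    (fnv1a_64(name) & 0xFFFF) as u16
}
```

In this sketch both slots are permutations of the same 16 hash bits, so they differ only in where a station lands in the table; that makes the comparison a pure addressing comparison, which is how the commit message frames the F-R difference.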

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
.github/workflows/rust-test.yml (1)

111-125: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider a CI gate for the new onebrc-probe crate, similar to deepnsm's.

This same PR adds onebrc-probe to the workspace exclude list with non-trivial feature-gated lanes (ractor/tokio actors, ndarray SIMD, kanban integration with lance-graph-supervisor), but no step here exercises cargo test --manifest-path crates/onebrc-probe/Cargo.toml [--all-features]. The deepnsm step right above this diff was added for the same reason (a workspace-excluded crate whose tests never ran) — the new crate's lane-d/lane-e code carries a comparable "untested feature-gated code" risk.

🧪 Proposed CI step
       - name: Run supervisor tests (W2b real-owner probes)
         run: cargo test --manifest-path crates/lance-graph-supervisor/Cargo.toml --features supervisor
+      # onebrc-probe: workspace-excluded, own [workspace]. Gate its
+      # feature-gated lanes (b/d/e) so lane compile/logic breakage is
+      # caught in CI rather than only via the README's manual instructions.
+      - name: Run onebrc-probe tests
+        run: cargo test --manifest-path crates/onebrc-probe/Cargo.toml --all-features
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/rust-test.yml around lines 111 - 125, Add a CI test gate
for the workspace-excluded onebrc-probe crate, since its feature-gated tests are
not covered by existing workflow steps. Update the rust-test workflow near the
existing Run deepnsm tests and Run supervisor tests steps to include a cargo
test invocation for crates/onebrc-probe/Cargo.toml, and ensure the relevant
feature combinations used by onebrc-probe are exercised so its
ractor/tokio/ndarray/kanban code paths are actually tested.
crates/onebrc-probe/Cargo.toml (1)

42-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Pin the ractor fork to a specific rev.

The git dependency has no rev/tag/branch, so cargo update can silently pull a different upstream-fork commit on rebuild, undermining reproducibility (unless a checked-in Cargo.lock for this workspace-excluded crate already pins it — please confirm one exists and is committed).

♻️ Suggested fix
-ractor = { git = "https://github.com/AdaWorldAPI/ractor", optional = true, default-features = false, features = ["tokio_runtime"] }
+ractor = { git = "https://github.com/AdaWorldAPI/ractor", rev = "<pinned-commit-sha>", optional = true, default-features = false, features = ["tokio_runtime"] }
#!/bin/bash
fd -HI Cargo.lock crates/onebrc-probe
rg -n -A3 'name = "ractor"' crates/onebrc-probe/Cargo.lock 2>/dev/null || echo "no committed lockfile found for onebrc-probe"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/onebrc-probe/Cargo.toml` at line 42, The `ractor` git dependency in
`Cargo.toml` is not pinned, so the `crates/onebrc-probe` build can drift to
different commits on update; update the dependency declaration for `ractor` to
include a specific `rev` (or an equivalent fixed git reference) and verify that
a committed `Cargo.lock` exists for this crate/workspace-excluded target and
already locks the same commit. If no lockfile is present, make sure the pinned
`rev` is present so reproducible builds do not depend on upstream HEAD.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 75eab3a8-ee70-407c-b25c-68e9419ddc50

📥 Commits

Reviewing files that changed from the base of the PR and between cda3b9c and 63243bc.

⛔ Files ignored due to path filters (1)
  • crates/onebrc-probe/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (16)
  • .claude/board/EPIPHANIES.md
  • .claude/v3/INTEGRATION-PLAN.md
  • .github/workflows/rust-test.yml
  • Cargo.toml
  • crates/lance-graph-contract/src/canonical_node.rs
  • crates/onebrc-probe/.cargo/config.toml
  • crates/onebrc-probe/Cargo.toml
  • crates/onebrc-probe/README.md
  • crates/onebrc-probe/src/gen.rs
  • crates/onebrc-probe/src/lane_b.rs
  • crates/onebrc-probe/src/lane_d.rs
  • crates/onebrc-probe/src/lane_e.rs
  • crates/onebrc-probe/src/lane_f.rs
  • crates/onebrc-probe/src/lib.rs
  • crates/onebrc-probe/src/main.rs
  • crates/onebrc-probe/src/sha256.rs
✅ Files skipped from review due to trivial changes (2)
  • crates/onebrc-probe/.cargo/config.toml
  • .claude/board/EPIPHANIES.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/lance-graph-contract/src/canonical_node.rs

Comment on lines +134 to +183
fn parse_temp_tenths(bytes: &[u8]) -> i32 {
    let mut i = 0usize;
    let neg = bytes[0] == b'-';
    if neg {
        i += 1;
    }
    let mut val: i32 = 0;
    while bytes[i] != b'.' {
        val = val * 10 + (bytes[i] - b'0') as i32;
        i += 1;
    }
    i += 1; // skip '.'
    val = val * 10 + (bytes[i] - b'0') as i32;
    if neg {
        -val
    } else {
        val
    }
}

/// Lane A — single-thread scalar baseline. One pass over `data`, byte-wise
/// scan for `;` and `\n`, integer temp parse, `BTreeMap<String, Stats>`
/// accumulation (owned per-station microcopies — see `Stats::merge` doc).
pub fn lane_a_scalar(data: &[u8]) -> BTreeMap<String, Stats> {
    let mut map: BTreeMap<String, Stats> = BTreeMap::new();
    let len = data.len();
    let mut i = 0usize;
    while i < len {
        let name_start = i;
        while data[i] != b';' {
            i += 1;
        }
        let name = std::str::from_utf8(&data[name_start..i]).expect("station name is valid utf8");
        i += 1; // skip ';'
        let temp_start = i;
        while data[i] != b'\n' {
            i += 1;
        }
        let tenths = parse_temp_tenths(&data[temp_start..i]);
        i += 1; // skip '\n'

        match map.get_mut(name) {
            Some(stats) => stats.observe(tenths),
            None => {
                map.insert(name.to_string(), Stats::single(tenths));
            }
        }
    }
    map
}

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

No bounds guard against malformed/unterminated input.

parse_temp_tenths indexes bytes[0]/bytes[i] unconditionally, and lane_a_scalar's while data[i] != b';' { i += 1; } / while data[i] != b'\n' { i += 1; } scans have no i < len check. A corpus file passed to run <path> that lacks a trailing \n (or is otherwise malformed) will panic with an out-of-bounds index rather than failing with a clear message. lane_b_simd's tail loop (lane_b.rs) has the identical pattern, so a fix here should be mirrored there.

🛡️ Example: fail with a clear message instead of an opaque panic
     while i < len {
         let name_start = i;
-        while data[i] != b';' {
+        while i < len && data[i] != b';' {
             i += 1;
         }
+        assert!(i < len, "corpus record at byte {name_start} is missing ';'");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/onebrc-probe/src/lib.rs` around lines 134 - 183, The parsing path in
parse_temp_tenths and lane_a_scalar assumes well-formed, terminated input and
can panic on out-of-bounds access when a record is missing ';', '\n', or a
temperature digit. Add explicit bounds checks in the scan loops and in
parse_temp_tenths so malformed or unterminated input fails with a clear error
instead of indexing past the slice. Mirror the same guard logic in lane_b_simd’s
tail parsing loop so both code paths behave consistently.
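
One possible shape for the guard logic requested in this comment is sketched below: a scanner that reports where a record is malformed instead of indexing past the slice. It is an illustration and not the fix that landed in the crate; `RecordError`, `parse_temp_tenths_checked`, and `scan_records` are hypothetical names, and the accepted temperature format (optional `-`, at least one integer digit, `.`, exactly one fractional digit) is an assumption about the corpus.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum RecordError {
    MissingSeparator { at: usize },
    MissingNewline { at: usize },
    BadTemperature { at: usize },
}

/// Checked variant of the tenths parser: `None` instead of a panic.
pub fn parse_temp_tenths_checked(bytes: &[u8]) -> Option<i32> {
    let (neg, digits) = match bytes.split_first() {
        Some((&b'-', rest)) => (true, rest),
        _ => (false, bytes),
    };
    let dot = digits.iter().position(|&b| b == b'.')?;
    let (int_part, frac) = (&digits[..dot], &digits[dot + 1..]);
    if int_part.is_empty() || frac.len() != 1 {
        return None;
    }
    let mut val = 0i32;
    for &b in int_part.iter().chain(frac) {
        if !b.is_ascii_digit() {
            return None;
        }
        val = val.checked_mul(10)?.checked_add((b - b'0') as i32)?;
    }
    Some(if neg { -val } else { val })
}

/// Splits `data` into (station name, tenths) records, returning an error
/// with the byte offset of the first malformed record.
pub fn scan_records(data: &[u8]) -> Result<Vec<(&[u8], i32)>, RecordError> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        // The name must end at ';' before any newline or end of input.
        let sep = match data[i..].iter().position(|&b| b == b';' || b == b'\n') {
            Some(p) if data[i + p] == b';' => i + p,
            _ => return Err(RecordError::MissingSeparator { at: i }),
        };
        let temp_start = sep + 1;
        let nl = match data[temp_start..].iter().position(|&b| b == b'\n') {
            Some(p) => temp_start + p,
            None => return Err(RecordError::MissingNewline { at: temp_start }),
        };
        let tenths = parse_temp_tenths_checked(&data[temp_start..nl])
            .ok_or(RecordError::BadTemperature { at: temp_start })?;
        out.push((&data[i..sep], tenths));
        i = nl + 1;
    }
    Ok(out)
}
```

Whether the hot loop should pay for these checks or validate the corpus once up front is a design choice this sketch does not settle.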

@AdaWorldAPI AdaWorldAPI merged commit a575c7c into main Jul 2, 2026
6 checks passed