docs(blog): add benchmarking under-the-hood post with redirect infrastructure #261
SamBarker wants to merge 11 commits
Companion to the operator-focused Post 1. Covers the OMB harness design, workload choices, flamegraph analysis, coefficient derivation across the full 3×3 grid, and bugs found in our own tooling. Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Post 2 publication date moved to 2026-06-04 04:30 UTC (4:30 PM NZST). Post 1 brought onto this branch with all coming-soon companion references replaced by links to the published post. Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Post 1 was renamed from 2026-05-26 to 2026-05-28; update the copy on this branch and the post_url link in Post 2 to match. Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
The flamegraphs for both no-filter and encryption scenarios were from the original 3-node cluster (which had co-location issues and produced unrealistically high numbers). Replace with flamegraphs captured at 13,600 msg/s on the correct 11-node cluster (RF=3, 1 topic, 1 partition) from the rate-sweep-1core-rf3 suite.
No-filter flamegraph: proxy-no-filters-cpu-profile.html updated in place.
Encryption flamegraph: encryption-cpu-profile-36k.html removed (wrong cluster), encryption-cpu-profile-13k.html added (correct cluster, 13k rate).
CPU percentage tables in Post 2 recomputed from async-profiler self-time data in the new flamegraph HTML files.
Key findings change:
- No-filter syscalls: 59.2% → 63.2% (distributed cluster has real network)
- No-filter Kroxylicious: 1.4% → 1.7%
- Encryption GC pressure: +4.9% → +10.3% (bigger indirect cost)
- Encryption direct AES-GCM: 11.3% → 6.5%
Also fix the record-encryption docs link in Post 1 (0.20.0 → 0.21.0).
Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Rename section to 'Wait! Just... one... more... run...' and add two flamegraph-motivated experiments: io_uring (to address the 63% send/recv syscall cost in the passthrough proxy) and GC tuning (to address the 10.5% GC overhead in the encryption scenario, with Generational ZGC as a well-motivated first experiment). Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Adds AbsoluteRedirectPage to support version-free redirects to external
URLs. Mappings with absoluteTarget bypass the version loop and generate a
single page at /redirect/{group}/{subgroup}/{name}. Existing versioned
redirect mappings are unaffected.
Adds blog.yaml with a redirect to the benchmarking-the-proxy raw data
folder on Google Drive.
Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
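The branching described in this commit can be sketched roughly as follows. This is an illustrative Python model, not the site's actual generator code: the `Mapping` class, the `versioned_target` field, the versioned path layout, and the `example.invalid` URL are all assumptions made for the example. Only `absoluteTarget`, `subgroup`, and the `/redirect/{group}/{subgroup}/{name}` shape come from the commit message.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Mapping:
    """Hypothetical in-memory form of one redirect mapping entry."""
    group: str
    name: str
    subgroup: Optional[str] = None
    absolute_target: Optional[str] = None  # mirrors the absoluteTarget key
    versioned_target: str = ""             # assumed field; may contain "{version}"


def redirect_pages(mapping: Mapping, versions: List[str]) -> List[Tuple[str, str]]:
    """Return (path, target) pairs generated for one mapping."""
    if mapping.absolute_target is not None:
        # Absolute mappings skip the version loop and yield exactly one page.
        parts = ["redirect", mapping.group, mapping.subgroup, mapping.name]
        path = "/" + "/".join(part for part in parts if part)
        return [(path, mapping.absolute_target)]
    # Versioned mappings yield one page per version (path layout is assumed).
    return [
        (f"/redirect/{mapping.group}/{version}/{mapping.name}",
         mapping.versioned_target.format(version=version))
        for version in versions
    ]


# Placeholder target: the real Google Drive URL is not given in this PR text.
blog = Mapping(
    group="blog",
    subgroup="benchmarking-the-proxy-under-the-hood",
    name="benchmark-data",
    absolute_target="https://example.invalid/raw-data",
)
pages = redirect_pages(blog, versions=["0.20.0", "0.21.0"])
```

Under these assumptions `pages` holds a single entry regardless of how many versions are passed in, which is the property the commit relies on for version-free links.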
Future-dated posts are excluded from Jekyll builds by default, breaking post_url references in Post 1. Backdate to today so the site builds without --future. Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
| | 1-core (1000m) | 2-core (2000m) | 4-core (4000m) |
|---|---|---|---|
| **1 topic** | 15.7 mc/MB/s | 24.3 mc/MB/s | 30.5 mc/MB/s† |
| **10 topics** | 10.0 mc/MB/s | 19.9 mc/MB/s | 25.1 mc/MB/s |
| **100 topics** | 4.3 mc/MB/s | 6.8 mc/MB/s | 8.0 mc/MB/s |
Claude was complaining that this table is a bit hard to square with the linear scaling claim, saying a constant coefficient would be expected in that case. Maybe it just needs some further explanation?
Good catch. After digging into the JFR data from the coefficient sweep, the short answer is: the rising coefficient is real, not noise, but the sweep methodology conflates two effects (more connections and more throughput at each step), so we can't cleanly attribute it yet.
Preliminary regression on the non-saturated probes suggests the per-byte encryption cost is roughly constant across core counts (~38 mc/MB/s for 1-core and 2-core), but the baseline JVM/Netty thread overhead scales with core count — each additional Netty event loop thread adds a fixed per-thread cost regardless of how many bytes it's processing. The proper experiment to confirm this is a rate sweep at fixed connection count, which hasn't been run yet.
I've added a footnote to the table acknowledging this and pointing readers to the ceiling table as the cleaner test of linear scaling. The clean sweep is on the todo list for a follow-up post.
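To make the distinction between a per-byte cost and a fixed baseline concrete, here is a minimal sketch of the kind of regression described above. The numbers are synthetic and chosen only for illustration; they are not taken from the JFR data or the coefficient table, and `fit_line` is a hypothetical helper rather than anything in the benchmark tooling.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic probes (illustrative only): throughput in MB/s, CPU in millicores.
# Both configurations share a per-MB cost of 5 mc but have different baselines.
throughputs = [10.0, 20.0, 30.0, 40.0]
small_cpu = [50.0 + 5.0 * t for t in throughputs]   # assumed baseline of 50 mc
large_cpu = [100.0 + 5.0 * t for t in throughputs]  # assumed baseline of 100 mc

small_intercept, small_slope = fit_line(throughputs, small_cpu)
large_intercept, large_slope = fit_line(throughputs, large_cpu)

# A naive coefficient (total CPU divided by throughput) mixes both terms,
# so it differs between configurations even though the slopes match.
naive_small = small_cpu[1] / throughputs[1]
naive_large = large_cpu[1] / throughputs[1]
```

In this toy setup the fitted slopes are identical while the naive ratio is higher for the configuration with the larger baseline. That is the shape of the explanation offered above, though whether the real data follows it depends on the fixed-connection rate sweep that has not been run yet.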
robobario left a comment
LGTM. Claude suggests the triple question mark in the title is a bit much, too social-media-ey. It also finds the post quite aside-heavy, so maybe one or two of the asides could be dropped; (WHAT BROKE NOW??) is the most expendable.
Add an explanatory note clarifying that the coefficient rising with core count is not inconsistent with the linear scaling claim. Points to the ceiling table as the cleaner test of linearity, and honestly flags that the connection-count sweep conflates connections and throughput — a separate rate sweep at fixed connection count is needed to cleanly decompose the two effects. Preliminary analysis suggests per-byte cost is stable and the baseline Netty thread overhead is the driver. Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Summary
- `absoluteTarget` support for version-free external redirects, with a `subgroup` field for nested URL paths
- `blog.yaml` redirect mapping linking to raw benchmark data (OMB results, metrics, JFR recordings) on Google Drive via `/redirect/blog/benchmarking-the-proxy-under-the-hood/benchmark-data`
Test plan
- `./run.sh` builds without errors
- `http://127.0.0.1:4000/redirect/blog/benchmarking-the-proxy-under-the-hood/benchmark-data` redirects to Google Drive folder
- Existing redirects (`/redirect/errors/`, `/redirect/qr-codes/`) still work
🤖 Generated with Claude Code