Add JSON encode fuzz runner#1
Conversation
|
Warning Review limit reached
More reviews will be available in 20 minutes and 48 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThis PR introduces a complete fuzzing framework for lua-rapidjson that generates diverse JSON payloads via schema-guided or recursive-random modes, encodes them with RapidJSON, validates sorted-key ordering and structural correctness, and runs workers concurrently with progress reporting and failure diagnostics. ChangesRapidJSON Fuzzing Framework
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Makefile`:
- Line 20: The Makefile's fuzz recipe must coerce SEED to a numeric fallback
before doing shell arithmetic; modify the recipe to compute a numeric SEED value
(e.g., set SEED_NUM from SEED if it matches an integer, otherwise use the
current epoch like date +%s) and then compute per-worker seed using that numeric
SEED_NUM (replace the existing seed=$$(( $(SEED) + $$worker - 1 )) with
arithmetic based on SEED_NUM). Reference the fuzz recipe, the SEED variable, the
per-worker seed calculation, and tools/fuzz_encode_lib.lua::normalize_seed() so
the behavior matches Lua's fallback to os.time().
- Around line 15-17: The polling loop in the fuzz target can hang after
stop_workers() calls cleanup() and removes $$tmpdir; update the loop that waits
on ls "$$tmpdir"/fail.* / done.* (the while :; do ... done loop) so it breaks if
the temp directory is gone (e.g., add a check if [ ! -d "$$tmpdir" ]; then
break; fi) or also check for a sentinel file; alternatively, have stop_workers()
create a sentinel outside $$tmpdir before calling cleanup() and make the loop
exit when that sentinel is present. Reference: stop_workers(), cleanup, the fuzz
target while loop, $$tmpdir, fail.*, and done_count.
In `@tools/fuzz_encode_lib.lua`:
- Around line 1005-1009: validate_key_order is always enforcing alphabetical
member order because validate_encoded_case (in tools/fuzz_encode_lib.lua)
doesn’t receive the sort_keys flag; when cfg.sort_keys is false we should skip
order validation. Change the signature of M.validate_encoded_case to accept a
sort_keys boolean (e.g. function M.validate_encoded_case(rapidjson, case, json,
sort_keys)) and thread cfg.sort_keys from the caller in tools/fuzz_encode.lua
when calling fuzz.validate_encoded_case; inside
validate_encoded_case/validate_key_order, if sort_keys is false, skip or
short-circuit the member-order check (the parts that use entry.keys /
string_keys()), so generated JSON encoded with sort_keys=false won’t be
rejected. Also add a regression test covering SORT_KEYS=0 to ensure order
validation is skipped.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f2ed8d4-d991-4b7c-a57e-258d52176b8e
📒 Files selected for processing (5)
.gitignoreMakefilespec/fuzz_encode_lib_spec.luatools/fuzz_encode.luatools/fuzz_encode_lib.lua
| stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \ | ||
| trap cleanup EXIT; \ | ||
| trap stop_workers INT TERM; \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify trap + loop dependency on tmpdir markers.
rg -n -C2 'stop_workers\(\)|trap stop_workers|tmpdir|done_count|fail\.\*|done\.\*' MakefileRepository: api7/lua-rapidjson
Length of output: 954
Stop the fuzz polling loop from continuing after INT/TERM cleanup
In Makefile’s fuzz target, stop_workers() (lines ~15–17) calls cleanup() which rm -rfs $$tmpdir, but the parent while :; do polling loop (lines ~39–50) keeps waiting on ls "$$tmpdir"/fail.* / ls "$$tmpdir"/done.*. After $$tmpdir is removed, fail.* never appears and done_count stays at 0, so the loop can sleep indefinitely.
Suggested fix
- stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \
+ stop_workers() { \
+ for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; \
+ exit 130; \
+ }; \📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| stop_workers() { for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; cleanup; }; \ | |
| trap cleanup EXIT; \ | |
| trap stop_workers INT TERM; \ | |
| stop_workers() { \ | |
| for pid in $$pids; do kill "$$pid" 2>/dev/null || true; done; \ | |
| exit 130; \ | |
| }; \ | |
| trap cleanup EXIT; \ | |
| trap stop_workers INT TERM; \ |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` around lines 15 - 17, The polling loop in the fuzz target can hang
after stop_workers() calls cleanup() and removes $$tmpdir; update the loop that
waits on ls "$$tmpdir"/fail.* / done.* (the while :; do ... done loop) so it
breaks if the temp directory is gone (e.g., add a check if [ ! -d "$$tmpdir" ];
then break; fi) or also check for a sentinel file; alternatively, have
stop_workers() create a sentinel outside $$tmpdir before calling cleanup() and
make the loop exit when that sentinel is present. Reference: stop_workers(),
cleanup, the fuzz target while loop, $$tmpdir, fail.*, and done_count.
| trap stop_workers INT TERM; \ | ||
| worker=1; \ | ||
| while [ "$$worker" -le "$(WORKERS)" ]; do \ | ||
| seed=$$(( $(SEED) + $$worker - 1 )); \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify mismatch between Makefile arithmetic requirement and fuzz lib seed normalization.
rg -n -C2 'seed=\$\$\(\(|SEED' Makefile
rg -n -C3 'normalize_seed|tonumber\(|SEED' tools/fuzz_encode_lib.luaRepository: api7/lua-rapidjson
Length of output: 1770
🏁 Script executed:
#!/bin/bash
# Inspect Makefile around SEED and seed arithmetic.
sed -n '1,60p' Makefile
# Find other SEED usage in repo.
rg -n 'SEED\b' Makefile tools -SRepository: api7/lua-rapidjson
Length of output: 1750
🏁 Script executed:
#!/bin/bash
# Inspect how SEED is passed into Lua and normalized.
sed -n '1,120p' tools/fuzz_encode.lua
sed -n '1,120p' tools/fuzz_encode_lib.luaRepository: api7/lua-rapidjson
Length of output: 4798
Coerce SEED to numeric (matching Lua) before Make arithmetic
The fuzz recipe computes per-worker seeds with shell arithmetic (seed=$$(( $(SEED) + $$worker - 1 ))), which requires $(SEED) to be numeric; SEED=abc fails before Lua starts. Lua’s tools/fuzz_encode_lib.lua normalize_seed() already falls back to os.time() for non-numeric inputs, so the Makefile should apply the same numeric fallback (e.g., use date +%s when SEED isn’t an integer) instead of failing early.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` at line 20, The Makefile's fuzz recipe must coerce SEED to a
numeric fallback before doing shell arithmetic; modify the recipe to compute a
numeric SEED value (e.g., set SEED_NUM from SEED if it matches an integer,
otherwise use the current epoch like date +%s) and then compute per-worker seed
using that numeric SEED_NUM (replace the existing seed=$$(( $(SEED) + $$worker -
1 )) with arithmetic based on SEED_NUM). Reference the fuzz recipe, the SEED
variable, the per-worker seed calculation, and
tools/fuzz_encode_lib.lua::normalize_seed() so the behavior matches Lua's
fallback to os.time().
| local ordered, order_err = validate_key_order(rapidjson, json, entry) | ||
| if not ordered then | ||
| return false, order_err | ||
| end | ||
| end |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Encode call should use cfg.sort_keys; validate call should currently omit it.
rg -nP -C3 '\bvalidate_encoded_case\s*\(' --type=lua
rg -nP -C2 'sort_keys\s*=\s*cfg\.sort_keys' --type=luaRepository: api7/lua-rapidjson
Length of output: 3573
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect validate_encoded_case and validate_key_order usage + relevant logic
sed -n '930,1060p' tools/fuzz_encode_lib.lua | nl -ba | sed -n '1,200p'Repository: api7/lua-rapidjson
Length of output: 105
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect validate_encoded_case and key-order check logic
sed -n '920,1060p' tools/fuzz_encode_lib.lua | nl -baRepository: api7/lua-rapidjson
Length of output: 105
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Validate key-order logic and where it's called
sed -n '940,1025p' tools/fuzz_encode_lib.lua | cat -n
echo "----"
sed -n '1025,1105p' tools/fuzz_encode_lib.lua | cat -n
echo "----"
# show the exact snippet around the originally referenced lines
sed -n '990,1025p' tools/fuzz_encode_lib.lua | cat -nRepository: api7/lua-rapidjson
Length of output: 6863
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and inspect validate_key_order and string_keys
rg -n --hidden --no-mmap "function validate_key_order|local function validate_key_order|string_keys\s*\(" tools/fuzz_encode_lib.lua
echo "----"
rg -n --hidden --no-mmap "local function string_keys|function string_keys" tools/fuzz_encode_lib.lua
echo "----"
# Print the relevant regions around the definitions
# (Use sed ranges based on the match lines; adjust after seeing first pass.)
python3 - <<'PY'
import re, subprocess, textwrap, os, sys, json, pathlib
import subprocess
p=subprocess.run(["rg","-n","--hidden","--no-mmap","function validate_key_order|local function validate_key_order|string_keys\\s*\\(","tools/fuzz_encode_lib.lua"],capture_output=True,text=True)
print(p.stdout)
PYRepository: api7/lua-rapidjson
Length of output: 653
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '270,360p' tools/fuzz_encode_lib.lua | cat -n
echo "----"
sed -n '770,900p' tools/fuzz_encode_lib.lua | cat -n
echo "----"
sed -n '900,980p' tools/fuzz_encode_lib.lua | cat -nRepository: api7/lua-rapidjson
Length of output: 9789
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --hidden --no-mmap -S "SORT_KEYS|sort_keys\s*=" -g'*.lua' .Repository: api7/lua-rapidjson
Length of output: 1600
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- tools/fuzz_encode.lua around validate call and failure handling ----"
sed -n '1,140p' tools/fuzz_encode.lua | cat -n
echo "---- tools/fuzz_encode_lib.lua config defaults / env parsing for sort_keys ----"
sed -n '1,120p' tools/fuzz_encode_lib.lua | cat -n
echo "---- spec/fuzz_encode_lib_spec.lua around SORT_KEYS=0 ----"
sed -n '1,120p' spec/fuzz_encode_lib_spec.lua | cat -nRepository: api7/lua-rapidjson
Length of output: 11320
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- validate_encoded_case usages in spec/fuzz_encode_lib_spec.lua ----"
rg -n "validate_encoded_case" spec/fuzz_encode_lib_spec.lua
echo "---- sort_keys usages in spec/fuzz_encode_lib_spec.lua ----"
rg -n "sort_keys" spec/fuzz_encode_lib_spec.luaRepository: api7/lua-rapidjson
Length of output: 961
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --hidden --no-mmap -S "\bvalidate_encoded_case\s*\(" .Repository: api7/lua-rapidjson
Length of output: 855
Fix key-order validation to respect sort_keys=false
tools/fuzz_encode.lua encodes with sort_keys = cfg.sort_keys, but fuzz.validate_encoded_case/validate_key_order always enforces member order against entry.keys (built from string_keys() which sorts keys alphabetically). When SORT_KEYS=0 makes cfg.sort_keys false, validation can still reject the generated JSON and the worker aborts via os.exit(1).
Skip the order check when sorting is disabled (and thread the flag into validate_encoded_case):
🐛 Proposed fix
- local ordered, order_err = validate_key_order(rapidjson, json, entry)
- if not ordered then
- return false, order_err
- end
+ if sort_keys ~= false then
+ local ordered, order_err = validate_key_order(rapidjson, json, entry)
+ if not ordered then
+ return false, order_err
+ end
+ endUpdate signature in tools/fuzz_encode_lib.lua:
function M.validate_encoded_case(rapidjson, case, json, sort_keys)Pass from the worker in tools/fuzz_encode.lua:
local valid, reason =
fuzz.validate_encoded_case(rapidjson, generated_case, json_or_err, cfg.sort_keys)Add a regression test for the SORT_KEYS=0/sort_keys=false path: existing validate_encoded_case coverage encodes with { sort_keys = true } and doesn’t cover “encoded without sorting, so don’t enforce sorted member order”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/fuzz_encode_lib.lua` around lines 1005 - 1009, validate_key_order is
always enforcing alphabetical member order because validate_encoded_case (in
tools/fuzz_encode_lib.lua) doesn’t receive the sort_keys flag; when
cfg.sort_keys is false we should skip order validation. Change the signature of
M.validate_encoded_case to accept a sort_keys boolean (e.g. function
M.validate_encoded_case(rapidjson, case, json, sort_keys)) and thread
cfg.sort_keys from the caller in tools/fuzz_encode.lua when calling
fuzz.validate_encoded_case; inside validate_encoded_case/validate_key_order, if
sort_keys is false, skip or short-circuit the member-order check (the parts that
use entry.keys / string_keys()), so generated JSON encoded with sort_keys=false
won’t be rejected. Also add a regression test covering SORT_KEYS=0 to ensure
order validation is skipped.
Summary
make fuzzVerification
/Users/yuanshengwang/.luarocks/bin/busted-> 114 successesmake fuzz LUA=luajit DURATION=2 INTERVAL=1 WORKERS=1 SEED=123make fuzz LUA=luajit DURATION=2 INTERVAL=1 WORKERS=2 SEED=123Summary by CodeRabbit
New Features
Chores
.gitignoreto exclude RapidJSON build artifacts and documentation directories