Fix A2 fast-path prewarm count to match the generic WaveNet#300
Merged
Conversation
A2FastModel computed its prewarm sample count as the layer-stack lookback
distance: sum of per-layer (kernel_size-1)*dilation, plus (head_kernel-1).
The generic WaveNet it replaces computes the receptive field, which is one
greater: it seeds mPrewarmSamples at 1 (the sample being produced) before
adding the same lookback terms (model.cpp). So the fast path warmed up by
one fewer sample than the model it is meant to be a drop-in for.
prewarm() runs process() in whole maxBufferSize blocks until the count is
reached, so the off-by-one only changes the block count when the total
(6346 for the A2 shape) is an exact multiple of the buffer size. Power-of-2
buffers (64/256/4096) mask it, which is why the existing equivalence test
(test_matches_generic, blocks 64/256) never caught it; on a buffer size that
divides 6346 the two paths warm up by a different number of blocks and their
first post-Reset output diverges.
Seed prewarm at 1 to match the generic receptive-field formula. Add
test_prewarm_matches_generic_{nano,standard}, which builds both the fast and
generic model from the same config and asserts equal GetPrewarmSamples().
Verified the test catches the regression: reverting the seed to 0 aborts the
new assertion; with the fix the full suite passes.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A2FastModel(the A2 fast-path WaveNet, a drop-in for the genericWaveNet)computed its prewarm sample count as the layer-stack lookback distance —
Σ (kernel_size-1)·dilation + (head_kernel-1)— instead of the receptivefield, which is one greater. The generic path seeds
mPrewarmSamplesat1(the sample being produced) before adding the same lookback terms
(
model.cpp), so the fast path warmed up by one fewer sample than themodel it replaces.
Impact
prewarm()runsprocess()in wholemaxBufferSizeblocks until the count isreached, so the off-by-one only changes the block count when the total (
6346for the A2 shape) is an exact multiple of the buffer size. Power-of-2 buffers
(64/256/4096) mask it — which is exactly why the existing equivalence test
(
test_matches_generic, blocks 64/256) never caught it. On a buffer size thatdivides
6346(e.g. 334), the two paths warm up by a different number ofblocks and their first post-
Resetoutput diverges.Low severity (no audible artifact at typical buffer sizes, no RT-safety
impact), but a real contract violation for a path whose whole purpose is to be
numerically identical to the generic WaveNet.
Fix
prewarmat1to match the generic receptive-field formula, with acomment explaining the anchor so the count isn't re-derived incorrectly again.
test_prewarm_matches_generic_{nano,standard}: builds both the fast andgeneric model from the same config and asserts equal
GetPrewarmSamples().Verification
Confirmed the test catches the regression — reverting the seed to
0abortsthe new assertion; with the fix the full
run_testssuite passes (A2 fast pathis built by default via
NAM_ENABLE_A2_FAST).