Fix A2 fast-path prewarm count to match the generic WaveNet #300

Merged
sdatkinson merged 1 commit into sdatkinson:main from rhaist:rhabugfix/a2-prewarm-count
Jun 25, 2026

@rhaist rhaist commented Jun 25, 2026

Contributor

Summary

A2FastModel (the A2 fast-path WaveNet, a drop-in for the generic WaveNet)
computed its prewarm sample count as the layer-stack lookback distance,
Σ (kernel_size-1)·dilation + (head_kernel-1), instead of the receptive
field, which is one greater. The generic path seeds mPrewarmSamples at 1
(the sample being produced) before adding the same lookback terms
(model.cpp), so the fast path warmed up by one fewer sample than the
model it replaces.
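
The difference between the two counts can be sketched in a few lines. This is a minimal illustration, not the code from model.cpp: `LayerSpec`, `lookback_distance`, and `receptive_field` are hypothetical names, and the layer values in the note below are made up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one dilated convolution layer.
struct LayerSpec
{
  int kernel_size;
  int dilation;
};

// Lookback distance: how many past samples the stack reaches back.
// This is the quantity the fast path counted before the fix.
int lookback_distance(const std::vector<LayerSpec>& layers, int head_kernel)
{
  assert(head_kernel >= 1);
  int total = 0;
  for (const LayerSpec& layer : layers)
    total += (layer.kernel_size - 1) * layer.dilation;
  return total + (head_kernel - 1);
}

// Receptive field: the lookback plus the sample being produced.
// Seeding at 1 mirrors what the description says the generic path does.
int receptive_field(const std::vector<LayerSpec>& layers, int head_kernel)
{
  return 1 + lookback_distance(layers, head_kernel);
}
```

For a made-up stack of three layers with kernel size 3 and dilations 1, 2, 4, plus a head kernel of 1, the lookback is 2·(1+2+4) = 14 and the receptive field is 15.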

Impact

prewarm() runs process() in whole maxBufferSize blocks until the count is
reached, so the off-by-one only changes the block count when the total (6346
for the A2 shape) is an exact multiple of the buffer size. Power-of-2 buffers
(64/256/4096) mask it — which is exactly why the existing equivalence test
(test_matches_generic, blocks 64/256) never caught it. On a buffer size that
divides 6346 (e.g. 334), the two paths warm up by a different number of
blocks and their first post-Reset output diverges.
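
Why only some buffer sizes expose the off-by-one can be checked with a small simulation. The sketch below assumes prewarm() keeps processing whole blocks while the processed count is still below the target; that loop condition and the names `blocks_to_prewarm` and `off_by_one_is_visible` are assumptions for illustration, not the library's code. Small made-up counts keep the arithmetic easy to follow.

```cpp
#include <cassert>

// Number of whole blocks processed before `processed >= target`.
// Assumed loop shape: while (processed < target) { process one block; }
int blocks_to_prewarm(int target, int buffer_size)
{
  assert(buffer_size > 0);
  int processed = 0;
  int blocks = 0;
  while (processed < target)
  {
    processed += buffer_size;
    ++blocks;
  }
  return blocks;
}

// True when two targets that differ by one sample need different block counts.
bool off_by_one_is_visible(int smaller_target, int buffer_size)
{
  return blocks_to_prewarm(smaller_target, buffer_size)
         != blocks_to_prewarm(smaller_target + 1, buffer_size);
}
```

With targets 12 and 13, a buffer of 4 needs 3 versus 4 blocks, while a buffer of 5 needs 3 blocks for both. Under this assumed loop the mismatch appears exactly when the smaller target is a multiple of the buffer size; a different termination condition would shift which of the two counts has to be the multiple. The divisibility facts quoted above are easy to confirm: 6346 = 334 × 19, and none of 64, 256, or 4096 divides 6346.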

Low severity (no audible artifact at typical buffer sizes, no RT-safety
impact), but a real contract violation for a path whose whole purpose is to be
numerically identical to the generic WaveNet.

Fix

  • Seed prewarm at 1 to match the generic receptive-field formula, with a
    comment explaining the anchor so the count isn't re-derived incorrectly again.
  • Add test_prewarm_matches_generic_{nano,standard}: builds both the fast and
    generic model from the same config and asserts equal GetPrewarmSamples().
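
The shape of such an equivalence check can be sketched as follows. Only GetPrewarmSamples() and mPrewarmSamples are taken from the description; `StubModel`, its constructor, and `prewarm_counts_match` are hypothetical stand-ins, since the real tests build both models from a shared config that is not shown here.

```cpp
#include <cassert>

// Hypothetical stand-in for a model; the real classes are built from a config.
class StubModel
{
public:
  // `seed` is the starting value of the count; `lookback` is added on top.
  StubModel(int seed, int lookback)
  : mPrewarmSamples(seed + lookback)
  {
    assert(lookback >= 0);
  }

  int GetPrewarmSamples() const { return mPrewarmSamples; }

private:
  int mPrewarmSamples;
};

// Works for any two types that expose GetPrewarmSamples().
template <typename Fast, typename Generic>
bool prewarm_counts_match(const Fast& fast, const Generic& generic)
{
  return fast.GetPrewarmSamples() == generic.GetPrewarmSamples();
}
```

With both stubs seeded at 1 the check passes; seeding the fast stub at 0 makes it fail, which is the regression the new tests are meant to catch.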

Verification

Confirmed the test catches the regression — reverting the seed to 0 aborts
the new assertion; with the fix the full run_tests suite passes (A2 fast path
is built by default via NAM_ENABLE_A2_FAST).

@sdatkinson sdatkinson left a comment

Owner

Thanks again!

@sdatkinson sdatkinson merged commit 763a079 into sdatkinson:main Jun 25, 2026
4 checks passed
@rhaist rhaist deleted the rhabugfix/a2-prewarm-count branch June 26, 2026 06:37