Remove over-strict softmax mask divisibility assert #19903
Summary: `SoftmaxPattern.fuse()` asserts `mask_shape[-1] % 16 == 0`. The softmax mask passed to the fused op is a dummy (`mask_type=0`, no masking is applied), so its trailing dimension does not affect numerics, and the historical `QuantFusion` simply floor-divided without asserting. The assert rejects otherwise-valid shapes (e.g. softmax over a last dim of 17 or 33) and fails `test_quantized_softmax_out_*` (T273477740). Remove the assert and floor-divide the mask shape like before, in both the `fbcode/` and `xplat/` cells. Differential Revision: D106957459
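The difference between the asserting and the floor-dividing behavior can be sketched as below. This is a minimal illustration, not the code in `SoftmaxPattern.fuse()`: the helper names `strict_mask_shape` and `relaxed_mask_shape` and the constant `LANES` are hypothetical, and it assumes that only the trailing dimension of the mask shape is divided by 16.

```python
from typing import Sequence, Tuple

# Assumption: 16 is the divisor implied by the removed `% 16` assert.
LANES = 16


def strict_mask_shape(shape: Sequence[int]) -> Tuple[int, ...]:
    """Hypothetical stand-in for the old behavior: assert, then divide."""
    assert shape[-1] % LANES == 0, f"last dim {shape[-1]} not divisible by {LANES}"
    return (*shape[:-1], shape[-1] // LANES)


def relaxed_mask_shape(shape: Sequence[int]) -> Tuple[int, ...]:
    """Hypothetical stand-in for the new behavior: floor-divide, no assert.

    The summary states the mask is a dummy (mask_type=0), so a truncated
    trailing dimension is not expected to change the softmax result.
    """
    return (*shape[:-1], shape[-1] // LANES)


if __name__ == "__main__":
    for last_dim in (17, 32, 33):
        print(last_dim, relaxed_mask_shape((2, last_dim)))
```

Under these assumptions, a last dim of 17 or 33 trips the assert in the strict version, while the relaxed version returns a mask shape for every input.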
Dr. CI: No failures as of commit 5a3b3cd with merge base ec31735. Artifacts and rendered test results: hud.pytorch.org/pr/pytorch/executorch/19903
@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106957459.