Arm backend: support depthwise Conv3D #19902
DecomposeGroupedConvPass skips depthwise convolutions (in_channels == groups) because Conv2D depthwise maps to TOSA DEPTHWISE_CONV2D. For Conv3D (rank-5 input) there is no DEPTHWISE_CONV3D op, so the skip caused RewriteConvPass to raise a RuntimeError during preprocess.

Narrow the depthwise skip to Conv2D only. For rank-5 inputs the pass falls through to the existing slice->CONV3D->cat decomposition, which already handles arbitrary groups and produces the correct per-channel computations that depthwise convolution requires.

Signed-off-by: Youngsik Yang <vacu9708@gmail.com>
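The narrowed skip condition can be sketched as a small predicate. This is only an illustration of the rule described in the commit message, not the code in the pass: the function name `skips_grouped_decomposition` and its integer arguments are hypothetical, and the early return for `groups == 1` is an assumption (the real pass inspects graph nodes rather than plain integers).

```python
def skips_grouped_decomposition(input_rank: int, in_channels: int, groups: int) -> bool:
    """Return True when the convolution would be left untouched by the decomposition.

    Hypothetical helper for illustration; names and signature are not from the PR.
    """
    if groups == 1:
        # Assumption: an ungrouped convolution has nothing to decompose.
        return True
    is_depthwise = in_channels == groups
    # Depthwise is skipped only when the input is not rank 5, so a
    # depthwise Conv3D (rank 5) falls through to the decomposition.
    return is_depthwise and input_rank != 5


print(skips_grouped_decomposition(4, 8, 8))  # depthwise Conv2D: skipped
print(skips_grouped_decomposition(5, 8, 8))  # depthwise Conv3D: decomposed
print(skips_grouped_decomposition(5, 8, 2))  # grouped Conv3D: decomposed
```

Under these assumptions, only the second call changes behaviour relative to the old rule, which skipped every depthwise convolution regardless of rank.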
b94b5b8 to 5e0aff3
Summary
Depthwise Conv3D (in_channels == groups, rank-5 input) previously crashed with a RuntimeError inside RewriteConvPass because TOSA has no DEPTHWISE_CONV3D op. DecomposeGroupedConvPass already handles non-depthwise grouped Conv3D by splitting it into groups==1 convolutions via slice→conv→cat, but it explicitly skipped the depthwise case since Conv2D depthwise maps to the native DEPTHWISE_CONV2D TOSA op.

For Conv3D there is no such native op, so the fix is to extend DecomposeGroupedConvPass to stop skipping depthwise convolutions when the input is rank 5 (Conv3D). The existing slice→CONV3D→cat decomposition handles this case correctly.

    flowchart LR
        DW2D["Depthwise Conv2D\n(in_channels == groups, rank 4)"]
        DW3D["Depthwise Conv3D\n(in_channels == groups, rank 5)"]
        GRP["DecomposeGroupedConvPass"]
        RC2D["RewriteConvPass"]
        RC3D["RewriteConvPass"]
        DELEGATE_CONV2D["DEPTHWISE_CONV2D"]
        DELEGATE_CONV3D["CONV3D"]
        DW2D --> RC2D
        DW3D -->|"decomposed"| GRP
        GRP -->|"CONV3D (groups==1)"| RC3D
        RC2D -->|"delegated to native op"| DELEGATE_CONV2D
        RC3D -->|"delegated to native op"| DELEGATE_CONV3D

Files changed:
- backends/arm/_passes/decompose_grouped_conv_pass.py: in call_operator, narrow the depthwise skip to Conv2D only (len(input.data.shape) != 5); for rank-5 inputs (Conv3D), fall through to the existing decomposition.
- backends/arm/_passes/rewrite_conv_pass.py: update _is_conv3d to reflect that both grouped and depthwise Conv3D are now decomposed upstream; retain the RuntimeError as defense-in-depth.
- backends/arm/test/ops/test_conv3d.py: update test_convolution_3d_tosa_FP_depthwise to assert delegation.

Test result

    python -m pytest backends/arm/test/ops/test_conv3d.py::test_convolution_u55_INT_not_delegated_3d
    # 2 passed, 0 failed.

    lintrunner -a \
        backends/arm/_passes/decompose_grouped_conv_pass.py \
        backends/arm/_passes/rewrite_conv_pass.py \
        backends/arm/test/ops/test_conv3d.py
    # ok No lint issues.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani
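For readers unfamiliar with the slice→conv→cat idea from the summary, the toy sketch below shows why it also covers the depthwise case. It is a pure-Python illustration on nested lists, not the ExecuTorch pass: the function names `conv3d` and `grouped_conv3d_decomposed` are hypothetical, and the batch dimension, stride, padding, dilation and bias are all left out as simplifying assumptions.

```python
def conv3d(x, w):
    """Naive groups==1 3D convolution (stride 1, no padding, no bias).

    x: [C_in][D][H][W], w: [C_out][C_in][kD][kH][kW], both nested lists.
    """
    c_in = len(x)
    depth, height, width = len(x[0]), len(x[0][0]), len(x[0][0][0])
    kd, kh, kw = len(w[0][0]), len(w[0][0][0]), len(w[0][0][0][0])
    out = []
    for filt in w:  # one output channel per filter
        volume = [
            [
                [
                    sum(
                        filt[c][dz][dy][dx] * x[c][z + dz][y + dy][col + dx]
                        for c in range(c_in)
                        for dz in range(kd)
                        for dy in range(kh)
                        for dx in range(kw)
                    )
                    for col in range(width - kw + 1)
                ]
                for y in range(height - kh + 1)
            ]
            for z in range(depth - kd + 1)
        ]
        out.append(volume)
    return out


def grouped_conv3d_decomposed(x, w, groups):
    """Slice channels per group, run a groups==1 conv on each slice, then concatenate."""
    in_per_group = len(x) // groups
    out_per_group = len(w) // groups
    out = []
    for g in range(groups):
        x_slice = x[g * in_per_group:(g + 1) * in_per_group]    # slice input channels
        w_slice = w[g * out_per_group:(g + 1) * out_per_group]  # slice matching filters
        out.extend(conv3d(x_slice, w_slice))                    # conv, then cat on channels
    return out


# Depthwise case: 2 input channels, groups == 2, one 2x2x2 filter per channel.
x = [
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
    [[[10, 20], [30, 40]], [[50, 60], [70, 80]]],
]
ones = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
twos = [[[2, 2], [2, 2]], [[2, 2], [2, 2]]]
w = [[ones], [twos]]  # [C_out=2][C_in/groups=1][2][2][2]

result = grouped_conv3d_decomposed(x, w, groups=2)
print(result)  # [[[[36]]], [[[720]]]]
```

With groups equal to the channel count, each slice holds a single channel, so every output channel depends only on its own input channel, which is the per-channel computation that depthwise convolution requires.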