Summary
Investigate whether Matrix::inf_norm can be made faster without weakening its finite-value and overflow error behavior.
Current State
The v0.4.3 release-signal comparison against v0.4.2 is broadly positive, so this should not block the v0.4.3 release. Larger-dimension local vs_linalg measurements suggest Matrix::inf_norm is the most interesting remaining leaf-kernel target:
- D=16: about 66.7 ns
- D=32: about 294 ns
- D=64: about 2.15 us
Dot and squared-norm rows also deserve a glance, but inf_norm looks more likely to matter for determinant/error-bound and validation-adjacent paths.
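To put the three measurements on a common scale, each can be divided by the number of matrix entries (D squared). The snippet below is only arithmetic on the figures listed above; the helper name `ns_per_element` is made up for illustration and is not part of the crate.

```rust
/// Hypothetical helper: nanoseconds per matrix entry for a D x D matrix.
fn ns_per_element(total_ns: f64, d: usize) -> f64 {
    total_ns / (d * d) as f64
}

fn main() {
    // (D, measured time in ns) copied from the list above; 2.15 us = 2150 ns.
    let rows = [(16_usize, 66.7_f64), (32, 294.0), (64, 2150.0)];
    for (d, ns) in rows {
        println!("D={d}: {:.3} ns/element", ns_per_element(ns, d));
    }
}
```

This works out to roughly 0.26, 0.29, and 0.52 ns per element, so the per-element cost is close to flat between D=16 and D=32 and about doubles at D=64. The investigation may want to account for that jump before attributing it to the kernel itself.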
Proposed Changes
- Inspect the current Matrix::inf_norm implementation and call sites for avoidable fallibility, iterator, or row-sum overhead.
- Compare clear indexed loops, helper extraction, and any FMA/accumulation variants that preserve the documented semantics.
- Benchmark focused la_stack_inf_norm rows across D=2, 3, 4, 5, 8, 16, 32, and 64 before and after any change.
- Keep an optimization only if the improvement is consistent across the benchmarked dimensions and it does not obscure the mathematical invariant or typed error behavior.
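As a starting point for the loop-shape comparison, here is a minimal sketch of two candidate variants. It assumes a plain `[[f64; D]; D]` storage layout, which may not match the crate's actual Matrix type, and it deliberately omits the finite-value and overflow checks so that only the loop structure differs. Both function names are hypothetical.

```rust
/// Iterator-style variant: maximum over rows of the sum of absolute values.
/// No error handling here; this sketch only isolates the loop shape.
fn inf_norm_iter<const D: usize>(m: &[[f64; D]; D]) -> f64 {
    m.iter()
        .map(|row| row.iter().map(|x| x.abs()).sum::<f64>())
        .fold(0.0, f64::max)
}

/// Indexed-loop variant with the same left-to-right summation order,
/// so results should match the iterator variant bit for bit.
fn inf_norm_indexed<const D: usize>(m: &[[f64; D]; D]) -> f64 {
    let mut max_sum = 0.0_f64;
    for r in 0..D {
        let mut sum = 0.0_f64;
        for c in 0..D {
            sum += m[r][c].abs();
        }
        if sum > max_sum {
            max_sum = sum;
        }
    }
    max_sum
}

fn main() {
    let m = [[1.0, -2.0, 3.0], [-4.0, 5.0, -6.0], [0.5, 0.25, -0.125]];
    // Row sums are 6.0, 15.0, and 0.875, so the norm is 15.0.
    assert_eq!(inf_norm_iter(&m), 15.0);
    assert_eq!(inf_norm_indexed(&m), 15.0);
}
```

Keeping the summation order identical across variants makes benchmark differences attributable to code shape rather than to changed floating-point results. An FMA or multi-accumulator variant would change the rounding and would need to be checked against the documented semantics separately.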
Benefits
- Improves a hot matrix leaf kernel without expanding crate scope.
- Keeps post-v0.4.3 performance work separate from release preparation.
- Gives future release notes a clean, auditable benchmark story.
Implementation Notes
- Preserve LaError::NonFinite { row, col } source-location behavior for non-finite entries or overflowed row sums.
- Do not trade correctness or diagnostics for speed.
- If the investigation finds no robust improvement, close with benchmark evidence rather than forcing a code change.
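The error behavior that any faster variant must keep can be pinned down with a small reference sketch. The `LaError` enum below is a local stand-in that models only the variant named above, not the crate's real definition, and the storage layout is again assumed to be `[[f64; D]; D]`. Which column is reported for an overflowed row sum is also an assumption here (the column at which the running sum first became non-finite); the actual implementation should be treated as the authority.

```rust
/// Stand-in for the crate's error type; only the variant named above is modeled.
#[derive(Debug, PartialEq)]
enum LaError {
    NonFinite { row: usize, col: usize },
}

/// Reference sketch: infinity norm with per-entry and per-row-sum checks.
fn inf_norm_checked<const D: usize>(m: &[[f64; D]; D]) -> Result<f64, LaError> {
    let mut max_sum = 0.0_f64;
    for (row, entries) in m.iter().enumerate() {
        let mut sum = 0.0_f64;
        for (col, x) in entries.iter().enumerate() {
            // Reject NaN and infinite entries at their exact location.
            if !x.is_finite() {
                return Err(LaError::NonFinite { row, col });
            }
            sum += x.abs();
            // Assumption: an overflowed row sum is reported at the column
            // where the running sum first became non-finite.
            if !sum.is_finite() {
                return Err(LaError::NonFinite { row, col });
            }
        }
        max_sum = max_sum.max(sum);
    }
    Ok(max_sum)
}

fn main() {
    let ok = [[1.0, -2.0], [3.0, 4.0]];
    assert_eq!(inf_norm_checked(&ok), Ok(7.0));

    let nan = [[1.0, 2.0], [f64::NAN, 4.0]];
    assert_eq!(inf_norm_checked(&nan), Err(LaError::NonFinite { row: 1, col: 0 }));

    let overflow = [[f64::MAX, f64::MAX], [0.0, 0.0]];
    assert_eq!(inf_norm_checked(&overflow), Err(LaError::NonFinite { row: 0, col: 1 }));
}
```

Cases like these could serve as a regression check: a candidate optimization that returns a different row or col for the same input has changed diagnostics, even if its numeric result on finite inputs is identical.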