gh-150424: Widen specialized int fast paths to full int64 range by KRRT7 · Pull Request #150425 · python/cpython

KRRT7 · 2026-05-25T18:59:21Z

Proposed PR title

gh-150424: Widen specialized int fast paths to full int64 range

Proposed PR body

This widens the interpreter’s specialized int fast paths (tier-2 / uop execution) from the compact-only range to the full int64_t range, and fixes the correctness issues that the widening exposed for non-compact exact ints and on 15-bit builds.

Changes:

- widen specialized integer add/subtract/multiply fast paths to operate across the full int64_t range
- relax guards from compact-only exact ints to exact ints that fit in int64_t
- add fast extraction/support code for non-compact exact ints in the widened range
- keep specialized in-place mutation compact-only and fall back safely for non-compact inputs
- handle widened integer comparisons without compact-only assumptions
- construct widened arithmetic results with PyLong_FromInt64() so 15-bit builds do not narrow through stwodigits
- add regression coverage for widened operations, non-compact exact ints, boundary cases, and overflow fallback
- add benchmark scripts for measuring widened specialized integer fast-path performance
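The guard and fallback behavior listed above can be modeled in a few lines of Python. This is only an illustrative sketch, not the PR's C code: the names `fits_int64`, `is_compact`, and `specialized_add` are hypothetical, and the compact limit of 2**30 assumes a 30-bit digit build.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
COMPACT_LIMIT = 2**30  # assumption: one 30-bit digit per compact int


def fits_int64(value):
    """True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


def is_compact(value):
    """True if value fits in a single digit (the old, narrower guard)."""
    return -COMPACT_LIMIT < value < COMPACT_LIMIT


def specialized_add(a, b):
    """Model of a widened add: return (result, path taken).

    The fast path requires exact ints (no subclasses such as bool) that
    fit in int64, and a result that also fits; otherwise fall back.
    """
    if type(a) is not int or type(b) is not int:
        return a + b, "generic"
    if not (fits_int64(a) and fits_int64(b)):
        return a + b, "generic"
    result = a + b
    if not fits_int64(result):
        # Overflow fallback: the result needs arbitrary precision.
        return result, "generic"
    return result, "fast"
```

In this model, `specialized_add(2**40, 2**40)` takes the fast path even though neither operand is compact, while `specialized_add(2**62, 2**62)` falls back because the sum no longer fits in int64.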

Tests run:

./python.exe -m unittest test.test_capi.test_opt.TestUopsOptimization
./python.exe -m test test_generated_cases

Additional validation:

built and tested with --with-pydebug --enable-experimental-jit --enable-big-digits=15
ran targeted non-compact widened int regression tests on the 15-bit build

Comparison: main (1310d2c) vs this branch (160d3cd).
Build: PGO + full LTO, installed binary, macOS arm64, Clang 22, PYTHON_JIT=0 (interpreter only).
Microbenchmarks target non-compact int arithmetic directly (values that exceed 2**30 but fit in int64_t).
pyperformance benchmarks are general workloads not specific to this change — included to confirm no regression.
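For readers who want a feel for the kind of workload the microbenchmarks target, here is a hypothetical timeit-based sketch. It is not the script from the PR; the function name `accumulate_large` and the chosen constants are assumptions, picked only so that every intermediate value exceeds 2**30 while staying inside int64_t.

```python
import timeit


def accumulate_large(n, start=2**40, step=2**31):
    """Repeatedly add a non-compact step to a non-compact running total."""
    total = start
    for _ in range(n):
        total += step
    return total


if __name__ == "__main__":
    calls = 1000
    seconds = timeit.timeit(lambda: accumulate_large(1000), number=calls)
    print(f"{seconds / calls * 1e6:.1f} us per call")
```

Running the same file under two builds gives a rough comparison; pyperf, as used in the PR, is the better tool when statistical rigor matters.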

pyperformance

| Benchmark | main | branch | Change | Significant |
|---|---|---|---|---|
| chaos | 33.2 ms ± 1.1 ms | 30.6 ms ± 0.2 ms | 1.08x faster | Yes (t=18.05) |
| float | 43.1 ms ± 0.6 ms | 40.1 ms ± 0.6 ms | 1.08x faster | Yes (t=26.37) |
| nbody | 67.3 ms ± 0.5 ms | 63.7 ms ± 0.6 ms | 1.06x faster | Yes (t=34.13) |
| pidigits | 228 ms ± 0 ms | 228 ms ± 0 ms | 1.00x | No |
| pyflate | 254 ms ± 2 ms | 247 ms ± 2 ms | 1.03x faster | Yes (t=19.73) |
| raytrace | 146 ms ± 1 ms | 138 ms ± 1 ms | 1.06x faster | Yes (t=64.71) |
| scimark_fft | 168 ms ± 2 ms | 157 ms ± 2 ms | 1.07x faster | Yes (t=29.51) |
| scimark_lu | 62.6 ms ± 7.4 ms | 60.3 ms ± 0.4 ms | 1.04x faster | Yes (t=2.44) |
| scimark_monte_carlo | 41.4 ms ± 4.2 ms | 37.5 ms ± 0.3 ms | 1.10x faster | Yes (t=7.09) |
| scimark_sor | 75.8 ms ± 10.2 ms | 66.8 ms ± 0.4 ms | 1.13x faster | Yes (t=6.81) |
| scimark_sparse_mat_mult | 2.53 ms ± 0.06 ms | 2.42 ms ± 0.02 ms | 1.05x faster | Yes (t=13.25) |
| spectral_norm | 51.8 ms ± 7.5 ms | 50.5 ms ± 0.5 ms | 1.03x faster | No |

Microbenchmarks (`Tools/scripts/jit_int_benchmark_pyperf.py`)

| Benchmark | main | branch | Change |
|---|---|---|---|
| jit_int_small | 209 ns ± 0 ns | 181 ns ± 0 ns | 1.16x faster |
| jit_int_intermediate_overflow | 1.28 µs ± 0.01 µs | 1.17 µs ± 0.01 µs | 1.10x faster |
| jit_int_double_add | 1.34 µs ± 0.01 µs | 1.26 µs ± 0.06 µs | 1.06x faster |
| jit_int_accumulate | 806 ns ± 33 ns | 664 ns ± 20 ns | 1.21x faster |
| jit_int_always_large | 459 ns ± 17 ns | 422 ns ± 10 ns | 1.09x faster |
| jit_int_mixed | 176 ns ± 5 ns | 153 ns ± 6 ns | 1.15x faster |
| Geometric mean | | | 1.13x faster |

Replace the 30-bit compact-only range check (`is_medium_int`) with `__builtin_add_overflow` / `sub_overflow` / `mul_overflow`, widening the JIT arithmetic fast path from ±2^30 to ±2^62. Relax operand guards from `_PyLong_CheckExactAndCompact` to `PyLong_CheckExact` so non-compact inputs also stay in the JIT trace. Add inline `_PyLong_AsInt64` for fast digit extraction from non-compact PyLongs (avoids calling the heavy `PyLong_AsLongLongAndOverflow`).

Results (pyperf, ARM64, rigorous):

- intermediate_overflow: 2.70x faster
- double_add: 2.73x faster
- accumulate: 1.65x faster
- geometric mean: 1.52x faster

Includes pyperf-based microbenchmark (`Tools/scripts/jit_int_benchmark_pyperf.py`) and a simpler timeit-based version (`jit_int_benchmark.py`).
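The digit extraction mentioned in the commit message can be pictured with a small Python model. The inline helper itself is not shown in this thread, so what follows is only a sketch under stated assumptions: a sign-and-magnitude layout with little-endian 30-bit digits, and hypothetical helper names `to_digits` and `digits_to_int64`.

```python
SHIFT = 30                 # assumption: 30-bit digits
MASK = (1 << SHIFT) - 1
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def to_digits(value):
    """Split an int into (sign, little-endian list of 30-bit digits)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    digits = []
    while magnitude:
        digits.append(magnitude & MASK)
        magnitude >>= SHIFT
    return sign, digits


def digits_to_int64(sign, digits):
    """Reassemble digits into an int64 value, or None if it does not fit."""
    if len(digits) > 3:
        # Four 30-bit digits always exceed 64 bits of magnitude.
        return None
    magnitude = 0
    for digit in reversed(digits):
        magnitude = (magnitude << SHIFT) | digit
    value = sign * magnitude
    # Three digits hold up to 90 bits, so a final range check is needed.
    if INT64_MIN <= value <= INT64_MAX:
        return value
    return None
```

The point of the sketch is that an int64-sized value needs at most three 30-bit digits, so extraction can be a short, branch-light loop rather than a general-purpose conversion.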

bedevere-app · 2026-05-25T18:59:24Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

python-cla-bot · 2026-05-25T18:59:25Z

The following commit authors need to sign the Contributor License Agreement:

106575910+KRRT7@users.noreply.github.com

picnixz · 2026-05-25T19:08:39Z

> 15-bit builds

Do we actually support such builds?

picnixz · 2026-05-25T19:09:57Z

I suspect this has been entirely generated by an LLM (at least the summary has been generated by that). It's unclear what the issue is so please first wait for some feedback on the issue before opening the PR. This is the common process as highlighted in our devguide.

KRRT7 · 2026-05-25T19:10:47Z

> I suspect this has been entirely generated by an LLM (at least the summary has been generated by that). It's unclear what the issue is so please first wait for some feedback on the issue before opening the PR. This is the common process as highlighted in our devguide.

Hi, that's incorrect. I'm talking to a core dev directly on the PR and, as noted by the status, it's still a draft and WIP.

KRRT7 · 2026-05-25T19:13:02Z

> Do we actually support such builds?

15-bit builds are still supported: --enable-big-digits=15 is a documented configure option, and CPython still has both 15-bit and 30-bit PyLong layouts. I called it out because widening the JIT path to full int64_t exposed a real correctness issue on that

https://github.com/python/cpython/blob/main/Doc/using/configure.rst#L229-L237
https://github.com/python/cpython/blob/main/configure.ac#L6452-L6466
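As a quick way to tell which PyLong layout a given interpreter was built with, `sys.int_info` reports the digit size. The snippet below is a small sketch; the helper name `digit_layout` is made up for illustration.

```python
import sys


def digit_layout():
    """Return (bits per digit, exclusive magnitude limit of one digit)."""
    bits = sys.int_info.bits_per_digit
    return bits, 2**bits


bits, limit = digit_layout()
print(f"{bits}-bit digits; single-digit ints have magnitude below {limit}")
```

On a build configured with --enable-big-digits=15 this reports 15-bit digits, which is the configuration where the regression tests mentioned in the PR body were run.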

picnixz · 2026-05-25T19:15:15Z

> Hi, that's incorrect, I'm talking to a core dev directly on the PR and as noted by the status

Who's the core dev?

Ok for the 15-bit but please, first share a reproducer on the issue before jumping onto PRs. I don't know if you want to address correctness or performance. Those are different. And the described scope is not helpful as there are too many points on the issue.

KRRT7 · 2026-05-25T19:17:42Z

> I don't know if you want to address correctness or performance.

I'm addressing performance; the performance changes surfaced correctness issues on 15-bit builds, which the test suite caught.

> the described scope is not helpful as there are too many points on the issue.

I'm working on updating the title and body. As I said, this is a draft / WIP, and I haven't settled on the final body yet because I'm still doing some cleanup.

picnixz · 2026-05-25T19:20:19Z

Before doing any work, we should first understand the issue. It doesn't make sense to make PRs and update the issue accordingly. This is not how our workflow works. So please, first discuss the change on the issue by including a reproducer.

KRRT7 · 2026-05-28T12:02:12Z

> Before doing any work, we should first understand the issue.

I don't think what I'm doing is an "issue", though arguably performance bottlenecks are issues (which is something that I personally believe).

> It doesn't make sense to make PRs and update the issue accordingly. This is not how our workflow works.

It is, however, the GitHub-native workflow (opening a rough PR as a draft, then iterating and improving on it), and GitHub is where CPython happens to be hosted.

picnixz · 2026-05-28T12:31:31Z

> It is, however, the GitHub-native workflow (opening a rough PR as a draft, then iterating and improving on it), and GitHub is where CPython happens to be hosted.

We have a policy written in the devguide. Contributors need to follow that policy. The purpose is to avoid back-and-forth, so knowing what the issue is about is the first step. PRs come after.

Nothing forbids you from opening something on your fork, though. But any change you make in a PR against CPython takes CI resources, which affects other contributors.

picnixz · 2026-05-28T12:35:03Z

Considering @Fidget-Spinner's stance, I'm reopening it. But please follow what the devguide says. We have too many AI-generated PRs that take up both triagers' and reviewers' time.

When it comes to performance PRs, we really want to see benchmarks before PRs, even in a draft state. This gives more intuition on whether the change is worth it or not.

KRRT7 added 4 commits May 25, 2026 12:52

cleanup

1181bf0

Fix widened JIT int fast paths

8624d26

Add widened JIT int boundary tests

91854c2

bedevere-app Bot mentioned this pull request May 25, 2026

Widen specialized int fast paths to full int64 range #150424

Open

KRRT7 changed the title ~~gh-150424: Fix widened JIT int fast paths~~ gh-150424: Widen JIT int fast paths to full int64 range May 25, 2026

Merge branch 'main' into jit-wide-int-fastpath

e0089f7

picnixz closed this May 25, 2026

fix failing unit tests

160d3cd

KRRT7 changed the title ~~gh-150424: Widen JIT int fast paths to full int64 range~~ gh-150424: Widen specialized int fast paths to full int64 range May 28, 2026

picnixz reopened this May 28, 2026
