Fix IMU process memory leak + CPU spin (and guard contrast-reserve log spam) #472
Merged
The imu_monitor loop (imu_pi.py) had no sleep: imu.update() throttles the I2C reads to 30 Hz, but the loop body still ran thousands/sec, calling shared_state.set_imu() every iteration. set_imu crosses a multiprocessing Manager proxy, so each call pickles the ImuSample -- and pickling a numpy.quaternion leaks (~25 MB/200k dumps, numpy-quaternion 2023.0.4). On real hardware this was ~16 MB/min + ~19% CPU in the IMU child; on a 2 GB Pi it drove swap-thrash toward OOM. Invisible to the fake-IMU headless harness (its update() self-throttles with sleep(0.1)).

Two parts:

1. Throttle the loop to the IMU sample rate: sleep only the remainder of the sample period (period - work already done this iteration), so the publish cadence tracks the 30 Hz sample rate rather than drifting to period + work. The >0 guard keeps the fake-IMU fallback (whose update() already sleeps) from double-sleeping. 19% -> ~2.4% CPU.

2. Stop pickling the numpy.quaternion. Add _quat_to_floats/_floats_to_quat helpers and __getstate__/__setstate__ to ImuSample (quat) and to PointingEstimate / SuccessfulSolve (imu_anchor -- the same quaternion also pickles via set_solution's deepcopy and solver_queue). The quaternion pickles as 4 plain floats and is rebuilt on unpickle; the in-process attribute stays a real numpy.quaternion, so consumers are unchanged.

Verified:

- A/B pickle.dumps RSS: bare quaternion +24.8 MB/200k, all three patched dataclasses +0.0 MB.
- 50-min real-hardware endurance (real BNO055, Test Mode): IMU-child anonymous heap flat (113.8 MB, slope -0.000 MB/min), CPU mean 2.35%, vs the prior 134->545 MB / 19% in 28 min.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
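The loop throttle from part 1 of the commit message can be sketched roughly as below. This is an illustration, not the code in imu_pi.py: `run_throttled`, `remaining_sleep`, and the `update`/`publish` callables are hypothetical stand-ins for the real loop, `imu.update()`, and `shared_state.set_imu()`.

```python
import time
from typing import Callable

SAMPLE_HZ = 30.0          # sample rate quoted in the commit message
PERIOD = 1.0 / SAMPLE_HZ  # seconds per sample


def remaining_sleep(period: float, started: float, now: float) -> float:
    """Time left in the current sample period, clamped at zero."""
    return max(0.0, period - (now - started))


def run_throttled(
    update: Callable[[], None],
    publish: Callable[[], None],
    iterations: int,
    period: float = PERIOD,
) -> None:
    """Hypothetical stand-in for the monitor loop: do the work, then sleep the remainder."""
    for _ in range(iterations):
        started = time.monotonic()
        update()   # stand-in for imu.update(); may itself sleep (fake-IMU case)
        publish()  # stand-in for shared_state.set_imu(...)
        remainder = remaining_sleep(period, started, time.monotonic())
        if remainder > 0:  # work already filled the period: do not sleep again
            time.sleep(remainder)
```

Sleeping `period - work` rather than a fixed `period` keeps the cadence near the sample rate, and the `> 0` check means an `update()` that already slept for a full period adds no second delay.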
Summary
Two fixes found by the real-hardware endurance test that followed #470. #470 fixed the comet CPU hog; this PR fixes a second, distinct failure mode the endurance test then surfaced — and that mode (a memory leak) is the more likely driver of the actual OOM/freeze. See Relationship to #470 below.
1. IMU process memory leak + CPU spin (primary)
The imu_monitor loop (imu_pi.py) had no sleep. imu.update() throttles the I²C reads to 30 Hz, but the loop body still ran thousands of times per second, calling shared_state.set_imu() every iteration. set_imu crosses a multiprocessing Manager proxy, so each call pickles the ImuSample, and pickling a numpy.quaternion leaks (~25 MB/200k dumps, numpy-quaternion 2023.0.4). On real hardware the IMU child leaked ~16 MB/min and spun at ~19 % CPU; on a 2 GB Pi that exhausts RAM → swap-thrash → OOM/freeze. The problem was invisible to the fake-IMU headless harness (its update() self-throttles with sleep(0.1)), which is why earlier headless profiling never caught it.

- Throttle the loop to the IMU sample rate: sleep only the remainder of the sample period (period − work), so publishing tracks 30 Hz instead of drifting to period + work. The > 0 guard keeps the fake-IMU fallback from double-sleeping. 19 % → ~2.4 % CPU.
- Stop pickling the numpy.quaternion: add __getstate__/__setstate__ to ImuSample (for quat) and to PointingEstimate and SuccessfulSolve (for imu_anchor; the same quaternion also rides set_solution's deepcopy + solver_queue). The quaternion pickles as 4 plain floats and is rebuilt on unpickle; the in-process attribute stays a real numpy.quaternion, so consumers are unchanged.

2. Contrast-reserve log spam (rider)
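The failure mode and guard covered in this section can be sketched as follows. Everything here is illustrative: `contrast_reserve_stub` is a hypothetical stand-in that only mimics the described log-and-return behaviour (it is not the pydeepskylog API, and its return value is a placeholder, not a real contrast formula), and `contrast_text` is an assumed caller name.

```python
import logging
from typing import Optional

logger = logging.getLogger("contrast_demo")


def contrast_reserve_stub(
    diameter1: Optional[float], diameter2: Optional[float]
) -> Optional[float]:
    """Hypothetical stand-in: logs an error and returns None instead of raising."""
    if diameter1 is None or diameter2 is None:
        logger.error("object diameter is None")
        return None
    return 0.1 * (diameter1 + diameter2)  # placeholder value only


def contrast_text(diameter1: Optional[float], diameter2: Optional[float]) -> str:
    """Guarded caller: skip the call entirely for sizeless objects."""
    if diameter1 is None or diameter2 is None:
        return ""  # same blank result, but no ERROR record is emitted
    try:
        value = contrast_reserve_stub(diameter1, diameter2)
    except Exception:
        return ""  # an except clause only helps when the callee raises
    return "" if value is None else f"{value:.2f}"
```

Because the callee logs rather than raises, only a check made before the call can stop the log record from being produced.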
pydeepskylog.contrast_reserve() logs logger.error(...) and returns (doesn't raise) when an object diameter is None, so the surrounding except can't suppress it. object_details calls it per redraw with diameter=None for sizeless objects → steady ERROR-level spam that bypasses the root=ERROR filter. Guard: skip the call when a diameter is None (same blank-contrast result, minus the error).

Verification
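A round-trip check of the kind used to verify the pickling change might look like the sketch below. It is not the project's code: `Quat` is a dependency-free stand-in for numpy.quaternion, `Sample` is a hypothetical analogue of the patched dataclasses, and the helper names are assumptions modelled on the description.

```python
import copy
import pickle
from dataclasses import dataclass
from typing import Optional, Tuple


class Quat:
    """Stand-in for numpy.quaternion so the sketch needs no third-party package."""

    def __init__(self, w: float, x: float, y: float, z: float) -> None:
        self.w, self.x, self.y, self.z = w, x, y, z


def quat_to_floats(q: Optional[Quat]) -> Optional[Tuple[float, float, float, float]]:
    return None if q is None else (float(q.w), float(q.x), float(q.y), float(q.z))


def floats_to_quat(f: Optional[Tuple[float, float, float, float]]) -> Optional[Quat]:
    return None if f is None else Quat(*f)


@dataclass
class Sample:
    """Hypothetical analogue of a dataclass carrying a quaternion attribute."""

    timestamp: float
    quat: Optional[Quat] = None

    def __getstate__(self):
        state = dict(self.__dict__)
        state["quat"] = quat_to_floats(self.quat)  # only plain floats get serialized
        return state

    def __setstate__(self, state):
        state = dict(state)
        state["quat"] = floats_to_quat(state["quat"])  # rebuild the rich object
        self.__dict__.update(state)


if __name__ == "__main__":
    restored = pickle.loads(pickle.dumps(Sample(1.0, Quat(1.0, 0.0, 0.0, 0.0))))
    print(type(restored.quat).__name__, restored.quat.w)
    print(copy.deepcopy(Sample(2.0)).quat)  # deepcopy uses the same hooks; None survives
```

The hooks change only the serialized form, so code reading `sample.quat` in-process still sees the quaternion object, and a `None` value passes through unchanged.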
- A/B pickle.dumps RSS: bare quaternion +24.8 MB/200k; all three patched dataclasses +0.0 MB.
- ImuSample / None-anchor round-trip tests in TestPicklability; ruff + mypy clean; test_ui_modules 211 pass.

Relationship to #470
#470 ("Vectorize comet propagation") fixed the CPU hog: calc_comets pegging a core whenever locked. That is a real pathology (UI starvation, heat), but a CPU hog alone doesn't typically hard-crash the OS. This PR fixes a memory leak that does: ~16 MB/min on a 2 GB Pi exhausts RAM in well under an hour → swap-death / OOM, which presents as the field "hang." The endurance test that found this leak ran after the comet fix, with the hog already gone, so the two are independent.

Best current understanding: the field freeze was primarily this memory leak, with the comet CPU hog a compounding factor. Both fixes are needed for a healthy long observing session. #470's description has been updated to match.
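The time-to-exhaustion reasoning above comes down to dividing free memory by the leak rate. The sketch below works that through. The headroom values are assumptions chosen for illustration, since the PR does not state how much RAM is actually free on the device. Only the 16 MB/min rate and the 134 → 545 MB in 28 min observation come from the text.

```python
def minutes_to_exhaust(headroom_mb: float, leak_mb_per_min: float) -> float:
    """Minutes until a steady leak consumes the given free memory."""
    if leak_mb_per_min <= 0:
        raise ValueError("leak rate must be positive")
    return headroom_mb / leak_mb_per_min


def observed_rate(start_mb: float, end_mb: float, minutes: float) -> float:
    """Average growth rate between two memory readings."""
    return (end_mb - start_mb) / minutes


LEAK_MB_PER_MIN = 16.0  # rate quoted in the PR description

if __name__ == "__main__":
    # Growth reported for the unpatched run: 134 -> 545 MB in 28 min.
    print(f"observed: {observed_rate(134.0, 545.0, 28.0):.1f} MB/min")
    for headroom in (2048.0, 1024.0, 512.0):  # assumed free RAM, not measured
        minutes = minutes_to_exhaust(headroom, LEAK_MB_PER_MIN)
        print(f"{headroom:6.0f} MB free -> {minutes:5.1f} min")
```

At 16 MB/min, anything under about 960 MB of free RAM (16 × 60) is used up in less than an hour, so the time to exhaustion depends heavily on how much memory the rest of the system already holds.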
🤖 Generated with Claude Code