feat(contract): selfImprove forwards the full loop surface + fail-loud default (0.82.0) #226
Merged
…d default

selfImprove dropped reps/promoteTopK/labeledStore/captureSource/expectUsage/analyzeGeneration/findings when forwarding to runImprovementLoop, so a product collapsing onto it would silently lose capture, replicates, and the analyst loop, and would run with a weaker integrity guard. Forward all of them. A product agent now collapses its entire loop harness onto one selfImprove call with zero regression.

expectUsage now defaults to 'assert' (selfImprove is the real-run path; a stub fails loud rather than scoring a clean 0). Offline callers set 'off'.

chore(release): 0.82.0 (lockstep npm + pyproject + python __version__).
tangletools (Contributor) approved these changes on Jun 5, 2026 and left a comment:
Completes selfImprove as a one-call surface (forwards reps/promoteTopK/labeledStore/captureSource/expectUsage/analyzeGeneration/findings); fail-loud assert default. 1891 green, no other caller relied on a stub. Approving.
Makes selfImprove a complete one-call surface, so a product agent can collapse its entire hand-rolled loop harness onto it with zero regression. This is the prerequisite for propagating the eval-campaign scaffold to the product agents.

Gap this closes

selfImprove dropped these when forwarding to runImprovementLoop: reps, promoteTopK, labeledStore, captureSource, expectUsage, analyzeGeneration, findings. A product collapsing onto it would silently lose labeled-example capture, replicates, and the analyst loop (EYES→HANDS), and run with a weaker backend-integrity guard. Now all are forwarded.

Fail-loud default

expectUsage now defaults to 'assert' (was effectively 'warn'). selfImprove is the real-run path, so a stub cell (one that produced an artifact but reported costUsd === 0 and zero tokens) fails loud rather than scoring a clean 0. Offline/replay callers opt out with expectUsage: 'off'.

Verification

- pnpm typecheck clean
- pnpm test: 1891 passed, 2 skipped (1889 prior + 2 new: the 'assert' default fails loud on a stub while 'off' resolves; analyzeGeneration fires between generations). The new default broke no other test, proof no other caller silently relied on a stub.
- expectUsage: 'off' (the only selfImprove caller; honest opt-out for a deterministic mock).

Release 0.82.0 (lockstep npm + PyPI). Tag v0.82.0 after merge publishes.
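For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the two behaviors described above: forwarding every option to the inner loop, and defaulting expectUsage to 'assert'. It is not the code in this PR. The real signatures of selfImprove and runImprovementLoop are not shown here, so selfImproveSketch, runLoopSketch, CellResult, LoopOptions, runCell, the shape of the analyzeGeneration callback, and the inner loop's 'warn' fallback are all hypothetical stand-ins; only the option names and the stub definition (artifact produced, costUsd === 0, zero tokens) come from the description.

```typescript
// Hypothetical stand-ins: none of these types are the library's real API.
type ExpectUsage = "assert" | "warn" | "off";

interface CellResult {
  artifact: string;
  costUsd: number;
  tokens: number;
}

interface LoopOptions {
  runCell: () => CellResult; // assumed: produces one scored cell
  reps?: number;
  promoteTopK?: number;
  expectUsage?: ExpectUsage;
  analyzeGeneration?: (results: CellResult[]) => void; // assumed callback shape
}

// A stub cell produced an artifact but reported zero cost and zero tokens.
function isStub(r: CellResult): boolean {
  return r.artifact.length > 0 && r.costUsd === 0 && r.tokens === 0;
}

// Stand-in for the inner loop; the 'warn' fallback here is an assumption.
function runLoopSketch(opts: LoopOptions): CellResult[] {
  const mode: ExpectUsage = opts.expectUsage ?? "warn";
  const results: CellResult[] = [];
  for (let i = 0; i < (opts.reps ?? 1); i++) {
    const r = opts.runCell();
    if (isStub(r)) {
      const msg = "stub cell: artifact produced with zero cost and zero tokens";
      if (mode === "assert") throw new Error(msg);
      if (mode === "warn") console.warn(msg);
    }
    results.push(r);
  }
  opts.analyzeGeneration?.(results);
  return results;
}

// Stand-in for the one-call surface: spread forwards every option rather than
// cherry-picking a subset, and only expectUsage gets the stricter default.
function selfImproveSketch(opts: LoopOptions): CellResult[] {
  return runLoopSketch({ ...opts, expectUsage: opts.expectUsage ?? "assert" });
}

// Usage: a deterministic mock must opt out explicitly.
const mockCell = (): CellResult => ({ artifact: "draft", costUsd: 0, tokens: 0 });
const offline = selfImproveSketch({ runCell: mockCell, reps: 2, expectUsage: "off" });
console.log(offline.length);
```

Spreading the caller's options is one way to keep a wrapper from silently dropping fields as the inner surface grows; whether the actual change forwards fields this way or lists them explicitly is not stated in this description.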