when we get #369 we will get much larger contracts like zero-knowledge proof verifiers with maybe 2-3 orders of magnitude more cashscript code to compile which will surface optimized compiler speeds
first thing i ran into when analyzing with claude:
compileString runs the bytecode optimizer twice (optimiseBytecodeOld and optimiseBytecode) and then compares them by
converting both to ASM strings (scriptToAsm(...) !== scriptToAsm(...)) : a dev sanity-check left in the hot path plus source-map/debug/artifact generation we discard.
then in follow up research claude said:
The optimizer is 98% of the time (932 ms of 952 ms) — parse/codegen/typecheck are noise. So 1.3–1.6× is not the limit; the real target is inside
optimiseBytecode.
replaceOps (line 285+) doesn't work on the script array — it stringifies to ASM and does regex rewriting, and for every match it counts whitespace over the growing processedAsm ([...processedAsm.matchAll(/\s+/g)].length) to recover the script index. That's O(n²) per optimization, per match, per pass — string work, not the actual rewriting. Let me confirm it's quadratic by measuring optimize-time vs contract size.
follow up:
it's not O(n²) in op-count — it's driven by ASM-string length. The pairing chunks bake dozens of 32–40-byte BN254 field constants; each becomes a long hex run in the ASM string. And replaceOps works on that string: for every optimization pattern, every match, it counts whitespace over the growing processedAsm ([...processedAsm.matchAll(/\s+/g)].length) to recover the script index — O(asm-length) per match, so the cost explodes with the size of the baked constants, not the number of ops.
claude concluded and suggested:
The optimizer is 98% of compile time, and it's a string/regex rewriter that scales with ASM-string length. The double-optimize removal you already did is the small lever; the big lever is the optimizer's design. A replaceOps that works on the script array (match opcode/data-element sequences directly, splice the array, keep a parallel index map for the location/source-map data) would skip the stringify-and-regex entirely — and would speed up both paths (full and lean), since both call optimiseBytecode. Given the per-op cost is ~10× higher on the constant-heavy pairing chunks, that's plausibly the multi-× ("5×"-ish) win you're asking about, and it grows with how many big field constants a contract bakes.
Cheaper, lower-risk wins (each speeds up both paths):
- Array fixed-point check — replace scriptToAsm(old) === scriptToAsm(new) with an element-wise array compare (no stringify per pass).
- Hoist per-pattern work — the regex objects and pattern/replacement length counts are recomputed per match; compute once.
But those are incremental. The real headroom is killing the ASM-string rewriting in replaceOps in favor of array-based matching
when we get #369 we will get much larger contracts like zero-knowledge proof verifiers with maybe 2-3 orders of magnitude more cashscript code to compile which will surface optimized compiler speeds
first thing i ran into when analyzing with claude:
then in follow up research claude said:
follow up:
claude concluded and suggested: