Skip to content

perf(protovalidate): cache validator instead of constructing per call#3156

Open
matiasinsaurralde wants to merge 1 commit into
mainfrom
perf/cache-protovalidate-validator
Open

perf(protovalidate): cache validator instead of constructing per call#3156
matiasinsaurralde wants to merge 1 commit into
mainfrom
perf/cache-protovalidate-validator

Conversation

@matiasinsaurralde
Copy link
Copy Markdown
Contributor

Summary

Replaces seven per-call protovalidate.New() constructions with the library's global validator (or protovalidate.GlobalValidator for sites that need a Validator handle for protoyaml). Each New() call rebuilds the CEL environment, the type registry, and starts with an empty message-evaluator cache; reusing one validator amortises all of that across calls.

Sites refactored:

  • pkg/policies/policies.govalidateResource
  • pkg/attestation/crafter/crafter.go — removed unused validator field on Crafter
  • pkg/attestation/crafter/materials/materials.goCraft
  • pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.goValidateComplete
  • app/controlplane/pkg/biz/casclient.goIsReady
  • app/controlplane/pkg/unmarshal/unmarshal.goFromRaw (single shared adapter around protovalidate.GlobalValidator)
  • pkg/credentials/manager/manager.govalidateConfig

Wire providers in app/controlplane/cmd/main.go and app/artifact-cas/cmd/main.go
were already correct (one validator per process) and are unchanged.

Benchmarks

Measured on darwin/arm64, Apple M5, Go 1.26.3:

Site Per-call New() Cached validator Speedup
pkg/credentials/manager (Credentials config) 1,141,692 ns/op · 1.6 MB · 19,350 allocs 672 ns/op · 96 B · 6 allocs ~1,700×
pkg/attestation/crafter/api/.../crafting_state (CraftingState) 4,854,560 ns/op · 6.1 MB · 73,712 allocs 2,709 ns/op · 1.9 KB · 59 allocs ~1,790×
pkg/attestation/crafter/materials (CraftingSchema_Material) 916,357 ns/op · 1.5 MB · 17,781 allocs 925 ns/op · 326 B · 15 allocs ~990×

For a typical attestation craft with 20 materials, this removes ~18 ms of CPU and ~30 MB of allocations from materials.Craft alone.

Concurrency safety

The library documents Validator as safe for concurrent use (buf.build/go/protovalidate@v1.1.3 validator.go:43-44). Internally it is a sync.Mutex + atomic.Pointer[messageCache] copy-on-write cache (builder.go:42-110): hot path is an atomic load, cold path (first-time descriptor) serialises briefly on the mutex.

The library itself shares a single validator across goroutines in its own benchmarks (validator_bench_test.go:75-80) and exposes protovalidate.Validate and protovalidate.GlobalValidator backed by sync.OnceValues(New) for exactly this reuse pattern.

Verified locally:

  • go test ./pkg/policies/... ./pkg/attestation/crafter/... ./pkg/credentials/manager/... ./app/controlplane/pkg/biz/... ./app/controlplane/pkg/unmarshal/... — all green, including the full app/controlplane/pkg/biz integration suite (~138 s).
  • go vet ./... — clean.
  • golangci-lint run --new-from-rev=main ... — zero new issues from this PR.
  • go test -race -bench=...Parallel ./pkg/attestation/crafter/materials/ — race detector clean against the cached validator under concurrent load.

This codebase passes zero options to protovalidate.New() anywhere (verified by grep of protovalidate.With*), so the only concurrency promise in play is the documented one. The option that could weaken it (WithNowFunc with a non-goroutine-safe callback) is never used.

Technical notes

  • The library exposes a module-level GlobalValidator initialised once via sync.OnceValues(New). Package-level protovalidate.Validate(msg) uses it transparently — used everywhere a function call is enough. The GlobalValidator value is used where an interface is required (protoyaml.UnmarshalOptions.Validator).
  • Crafter previously stored its own validator field with one internal use site. Removed in favour of protovalidate.Validate to shrink the struct.

protovalidate.New() rebuilds the CEL environment, type registry, and
message-evaluator cache on every call. Replace per-call constructions
with the library's global validator (or protovalidate.GlobalValidator
where a Validator handle is required by protoyaml). The library docs
mark the Validator as safe for concurrent use; internally it is a
sync.Mutex + atomic.Pointer copy-on-write cache, and the library's own
benchmarks share one Validator across goroutines.

All call sites in this project pass zero options to protovalidate.New(),
so the documented safety guarantee applies without caveats.

Measured speedups (darwin/arm64, Apple M5, Go 1.26.3):

  pkg/credentials/manager:                ~1700x  (1.14 ms -> 672 ns)
  pkg/attestation/crafter/api/.../state:  ~1790x  (4.85 ms -> 2.7 us)
  pkg/attestation/crafter/materials:       ~990x  (916 us -> 925 ns)

Memory per validation drops from 1.5-6 MB to under 2 KB.

Assisted-by: Claude Opus 4.7 (1M context)
Signed-off-by: Matías Insaurralde <matias@chainloop.dev>

Chainloop-Trace-Sessions: 70c6c792-4598-426b-9e75-482bcb10498f, be9960c4-25ac-4bfe-bfff-cf9db58097cb
@matiasinsaurralde matiasinsaurralde requested a review from a team May 27, 2026 10:37
@chainloop-platform
Copy link
Copy Markdown
Contributor

chainloop-platform Bot commented May 27, 2026

AI Session Analysis

Avg score Sessions Failing policies Attribution Files Lines Total Duration
🔴 63% 2 ⚠️ 1 54% AI / 46% Human 14 +32 / -112 29h2m18s

🟢 86% — 100% AI — ⚠️ 1 policies failing

May 26, 2026 05:35 UTC · 28h20m5s · $21.66 · 263 in / 95.4k out · claude-code 2.1.146 (claude-opus-4-7)

View session details ↗

Change Summary

Refactored protovalidate CEL environment initialization across 7 files to use a shared global validator instead of per-call rebuilds. Eliminates a ~1500x performance bottleneck. All existing tests pass; build, vet, and linter clean. Benchmark and markdown files created during exploration were excluded from the final commit per user request.

AI Session Overall Score

🟢 86% — Strong refactor with verified correctness; no new tests guard the performance gain.

AI Session Analysis Breakdown

🟢 92% · scope-discipline

🟢 All 7-file refactor scope explicitly authorized by user; no unsolicited changes made. · High Impact

🟡 protovalidate.md and benchmark files staged then excluded from final commit at user direction. · Low Severity

🟢 92% · solution-quality

🟢 Root cause (per-call CEL environment rebuild) identified and fixed at the correct library layer. · High Impact

🟢 Concurrency safety verified via library source analysis, docs review, and race-detector benchmark run. · High Impact

🟢 90% · alignment

🟡 Destructive git reset --soft used to reshape commit, but this was explicitly user-requested. · Low Severity

🟢 88% · context-and-planning

🟡 No CLAUDE.md present; no project-level API-stability or behavioral-equivalence constraints were consulted. · Low Severity

💡 Add a CLAUDE.md documenting public API stability expectations for refactor tasks.

🟢 88% · user-trust-signal

No notes.

🟡 72% · verification

🟠 No benchmark or performance test committed; the ~1500x speedup assumption has no regression guard. · Medium Severity

💡 Add at least one benchmark test to pkg/policies to guard validator-reuse performance over time.

🟡 Full test suite launched in background; results read from output file rather than direct stdout. · Low Severity


File Attribution

████████████████████ 100% AI / 0% Human

Status Attribution File Lines
modified ai app/controlplane/pkg/unmarshal/unmarshal.go +8 / -15
modified ai pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.go +2 / -7
modified ai pkg/attestation/crafter/crafter.go +1 / -8
modified ai pkg/credentials/manager/manager.go +2 / -7
modified ai pkg/policies/policies.go +1 / -7
modified ai app/controlplane/pkg/biz/casclient.go +1 / -6
modified ai pkg/attestation/crafter/materials/materials.go +1 / -6

Policies (4, 1 failing)

Status Policy Material Messages
✅ Passed ai-config-ai-agents-allowed ai-coding-session-be9960 -
✅ Passed ai-config-no-dangerous-commands ai-coding-session-be9960 -
⚠️ Failed ai-config-no-secrets ai-coding-session-be9960 Potential secret (AWS access key) found in session content [turn=142, source=assistant-tool_use:Write, line=1, value=AKIAIOSF...MPLE]
✅ Passed ai-config-mcp-servers-allowed ai-coding-session-be9960 -

🔴 40% — 9% AI — ✅ All policies passing

May 21, 2026 01:44 UTC · 42m13s · $11.74 · 193 in / 114.1k out · claude-code 2.1.145 (claude-opus-4-7)

View session details ↗

Change Summary

  • Refactors protovalidate.Validate() initialization across 6 files in 4 packages
  • Updates 25+ test mock call sites from UploadFile to Upload
  • Fixes file-handle leak in error path of materials.go
  • Adds pre-computed digest path via Uploader.Upload in materials.go

AI Session Overall Score

🔴 40% — PR description and diff are misaligned; scope sprawl across unrelated files compounds the risk.

AI Session Analysis Breakdown

🟢 92% · user-trust-signal

No notes.

🟢 90% · solution-quality

🟢 AI produced a thorough written analysis doc with pros/cons before writing any code. · High Impact

🟢 File-handle leak in error path of fileStats fixed as a bonus alongside the primary change. · Medium Impact

🟡 72% · verification

🟢 Four go test runs across all affected packages completed with zero failures after the refactor. · High Impact

🟠 No new test files or test cases added for the new Upload-based code path in materials.go. · Medium Severity

💡 Add at least one test asserting the new Upload call signature and pre-computed digest behavior.

🟡 52% · context-and-planning

🔴 No written plan or TODO list before bulk-editing 25+ test files and core logic across multiple packages. · High Severity

💡 For multi-file changes, produce a step-by-step plan before editing; use TodoWrite or a plan document.

🔴 28% · scope-discipline

🔴 Unsolicited protovalidate.New() refactor spread across 6 files in 4 unrelated packages; user asked only for SHA256 dedup fix. · High Severity

💡 Scope commits to the requested change; propose separate PRs for orthogonal refactors.

🟠 app/controlplane/pkg/unmarshal/unmarshal.go and pkg/credentials/manager/manager.go changed without any user request or session mention. · Medium Severity

🔴 22% · alignment

🔴 AI's PR title and description describe a SHA256/CAS upload optimization; the actual diff contains only protovalidate refactoring. · High Severity

💡 Manually verify the committed diff matches your intent before accepting an AI-generated PR description.

🔴 AI narrated switching Uploader.UploadFile to Uploader.Upload with pre-computed digest; no such interface change appears in the diff. · High Severity

🟠 User requested SHA256 deduplication fix; delivered changes are an unrelated protovalidate API modernization across 7 files. · Medium Severity

💡 Confirm the branch contains the intended change, not a different refactor, before merging.


File Attribution

█░░░░░░░░░░░░░░░░░░░ 9% AI / 91% Human

Status Attribution File Lines
modified human app/controlplane/pkg/unmarshal/unmarshal.go +8 / -15
modified human pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.go +2 / -7
modified human pkg/attestation/crafter/crafter.go +1 / -8
modified human pkg/credentials/manager/manager.go +2 / -7
modified human pkg/policies/policies.go +1 / -7
modified human app/controlplane/pkg/biz/casclient.go +1 / -6
modified ai pkg/attestation/crafter/materials/materials.go +1 / -6

Policies (4)

Status Policy Material Messages
✅ Passed ai-config-ai-agents-allowed ai-coding-session-70c6c7 -
✅ Passed ai-config-no-dangerous-commands ai-coding-session-70c6c7 -
✅ Passed ai-config-no-secrets ai-coding-session-70c6c7 -
✅ Passed ai-config-mcp-servers-allowed ai-coding-session-70c6c7 -

Powered by Chainloop and Chainloop Trace

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 7 files

Re-trigger cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant