perf(protovalidate): cache validator instead of constructing per call by matiasinsaurralde · Pull Request #3156 · chainloop-dev/chainloop

matiasinsaurralde · 2026-05-27T10:37:14Z

Summary

Replaces seven per-call protovalidate.New() constructions with the library's global validator (or protovalidate.GlobalValidator for sites that need a Validator handle for protoyaml). Each New() call rebuilds the CEL environment, the type registry, and starts with an empty message-evaluator cache; reusing one validator amortises all of that across calls.

Sites refactored:

pkg/policies/policies.go — validateResource
pkg/attestation/crafter/crafter.go — removed unused validator field on Crafter
pkg/attestation/crafter/materials/materials.go — Craft
pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.go — ValidateComplete
app/controlplane/pkg/biz/casclient.go — IsReady
app/controlplane/pkg/unmarshal/unmarshal.go — FromRaw (single shared adapter around protovalidate.GlobalValidator)
pkg/credentials/manager/manager.go — validateConfig

Wire providers in app/controlplane/cmd/main.go and app/artifact-cas/cmd/main.go
were already correct (one validator per process) and are unchanged.

Benchmarks

Measured on darwin/arm64, Apple M5, Go 1.26.3:

Site	Per-call `New()`	Cached validator	Speedup
`pkg/credentials/manager` (`Credentials` config)	1,141,692 ns/op · 1.6 MB · 19,350 allocs	672 ns/op · 96 B · 6 allocs	~1,700×
`pkg/attestation/crafter/api/.../crafting_state` (`CraftingState`)	4,854,560 ns/op · 6.1 MB · 73,712 allocs	2,709 ns/op · 1.9 KB · 59 allocs	~1,790×
`pkg/attestation/crafter/materials` (`CraftingSchema_Material`)	916,357 ns/op · 1.5 MB · 17,781 allocs	925 ns/op · 326 B · 15 allocs	~990×

For a typical attestation craft with 20 materials, this removes ~18 ms of CPU and ~30 MB of allocations from materials.Craft alone.

Concurrency safety

The library documents Validator as safe for concurrent use (buf.build/go/protovalidate@v1.1.3 validator.go:43-44). Internally it is a sync.Mutex + atomic.Pointer[messageCache] copy-on-write cache (builder.go:42-110): hot path is an atomic load, cold path (first-time descriptor) serialises briefly on the mutex.

The library itself shares a single validator across goroutines in its own benchmarks (validator_bench_test.go:75-80) and exposes protovalidate.Validate and protovalidate.GlobalValidator backed by sync.OnceValues(New) for exactly this reuse pattern.

Verified locally:

go test ./pkg/policies/... ./pkg/attestation/crafter/... ./pkg/credentials/manager/... ./app/controlplane/pkg/biz/... ./app/controlplane/pkg/unmarshal/... — all green, including the full app/controlplane/pkg/biz integration suite (~138 s).
go vet ./... — clean.
golangci-lint run --new-from-rev=main ... — zero new issues from this PR.
go test -race -bench=...Parallel ./pkg/attestation/crafter/materials/ — race detector clean against the cached validator under concurrent load.

This codebase passes zero options to protovalidate.New() anywhere (verified by grep of protovalidate.With*), so the only concurrency promise in play is the documented one. The option that could weaken it (WithNowFunc with a non-goroutine-safe callback) is never used.

Technical notes

The library exposes a module-level GlobalValidator initialised once via sync.OnceValues(New). Package-level protovalidate.Validate(msg) uses it transparently — used everywhere a function call is enough. The GlobalValidator value is used where an interface is required (protoyaml.UnmarshalOptions.Validator).
Crafter previously stored its own validator field with one internal use site. Removed in favour of protovalidate.Validate to shrink the struct.

protovalidate.New() rebuilds the CEL environment, type registry, and message-evaluator cache on every call. Replace per-call constructions with the library's global validator (or protovalidate.GlobalValidator where a Validator handle is required by protoyaml). The library docs mark the Validator as safe for concurrent use; internally it is a sync.Mutex + atomic.Pointer copy-on-write cache, and the library's own benchmarks share one Validator across goroutines. All call sites in this project pass zero options to protovalidate.New(), so the documented safety guarantee applies without caveats. Measured speedups (darwin/arm64, Apple M5, Go 1.26.3): pkg/credentials/manager: ~1700x (1.14 ms -> 672 ns) pkg/attestation/crafter/api/.../state: ~1790x (4.85 ms -> 2.7 us) pkg/attestation/crafter/materials: ~990x (916 us -> 925 ns) Memory per validation drops from 1.5-6 MB to under 2 KB. Assisted-by: Claude Opus 4.7 (1M context) Signed-off-by: Matías Insaurralde <matias@chainloop.dev> Chainloop-Trace-Sessions: 70c6c792-4598-426b-9e75-482bcb10498f, be9960c4-25ac-4bfe-bfff-cf9db58097cb

chainloop-platform · 2026-05-27T10:37:24Z

AI Session Analysis

Avg score	Sessions	Failing policies	Attribution	Files	Lines	Total Duration
🔴 63%	2	⚠️ 1	54% AI / 46% Human	14	+32 / -112	29h2m18s

🟢 86% — 100% AI — ⚠️ 1 policies failing

May 26, 2026 05:35 UTC · 28h20m5s · $21.66 · 263 in / 95.4k out · claude-code 2.1.146 (claude-opus-4-7)

View session details ↗

Change Summary

Refactored protovalidate CEL environment initialization across 7 files to use a shared global validator instead of per-call rebuilds. Eliminates a ~1500x performance bottleneck. All existing tests pass; build, vet, and linter clean. Benchmark and markdown files created during exploration were excluded from the final commit per user request.

AI Session Overall Score

🟢 86% — Strong refactor with verified correctness; no new tests guard the performance gain.

AI Session Analysis Breakdown

🟢 92% · scope-discipline

🟢 All 7-file refactor scope explicitly authorized by user; no unsolicited changes made. · High Impact

🟡 protovalidate.md and benchmark files staged then excluded from final commit at user direction. · Low Severity

🟢 92% · solution-quality

🟢 Root cause (per-call CEL environment rebuild) identified and fixed at the correct library layer. · High Impact

🟢 Concurrency safety verified via library source analysis, docs review, and race-detector benchmark run. · High Impact

🟢 90% · alignment

🟡 Destructive git reset --soft used to reshape commit, but this was explicitly user-requested. · Low Severity

🟢 88% · context-and-planning

🟡 No CLAUDE.md present; no project-level API-stability or behavioral-equivalence constraints were consulted. · Low Severity

💡 Add a CLAUDE.md documenting public API stability expectations for refactor tasks.

🟢 88% · user-trust-signal

No notes.

🟡 72% · verification

🟠 No benchmark or performance test committed; the ~1500x speedup assumption has no regression guard. · Medium Severity

💡 Add at least one benchmark test to pkg/policies to guard validator-reuse performance over time.

🟡 Full test suite launched in background; results read from output file rather than direct stdout. · Low Severity

File Attribution

████████████████████ 100% AI / 0% Human

Status	Attribution	File	Lines
modified	ai	`app/controlplane/pkg/unmarshal/unmarshal.go`	+8 / -15
modified	ai	`pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.go`	+2 / -7
modified	ai	`pkg/attestation/crafter/crafter.go`	+1 / -8
modified	ai	`pkg/credentials/manager/manager.go`	+2 / -7
modified	ai	`pkg/policies/policies.go`	+1 / -7
modified	ai	`app/controlplane/pkg/biz/casclient.go`	+1 / -6
modified	ai	`pkg/attestation/crafter/materials/materials.go`	+1 / -6

Policies (4, 1 failing)

Status	Policy	Material	Messages
✅ Passed	`ai-config-ai-agents-allowed`	`ai-coding-session-be9960`	-
✅ Passed	`ai-config-no-dangerous-commands`	`ai-coding-session-be9960`	-
⚠️ Failed	`ai-config-no-secrets`	`ai-coding-session-be9960`	Potential secret (AWS access key) found in session content [turn=142, source=assistant-tool_use:Write, line=1, value=AKIAIOSF...MPLE]
✅ Passed	`ai-config-mcp-servers-allowed`	`ai-coding-session-be9960`	-

🔴 40% — 9% AI — ✅ All policies passing

May 21, 2026 01:44 UTC · 42m13s · $11.74 · 193 in / 114.1k out · claude-code 2.1.145 (claude-opus-4-7)

View session details ↗

Change Summary

Refactors protovalidate.Validate() initialization across 6 files in 4 packages
Updates 25+ test mock call sites from UploadFile to Upload
Fixes file-handle leak in error path of materials.go
Adds pre-computed digest path via Uploader.Upload in materials.go

AI Session Overall Score

🔴 40% — PR description and diff are misaligned; scope sprawl across unrelated files compounds the risk.

AI Session Analysis Breakdown

🟢 92% · user-trust-signal

No notes.

🟢 90% · solution-quality

🟢 AI produced a thorough written analysis doc with pros/cons before writing any code. · High Impact

🟢 File-handle leak in error path of fileStats fixed as a bonus alongside the primary change. · Medium Impact

🟡 72% · verification

🟢 Four go test runs across all affected packages completed with zero failures after the refactor. · High Impact

🟠 No new test files or test cases added for the new Upload-based code path in materials.go. · Medium Severity

💡 Add at least one test asserting the new Upload call signature and pre-computed digest behavior.

🟡 52% · context-and-planning

🔴 No written plan or TODO list before bulk-editing 25+ test files and core logic across multiple packages. · High Severity

💡 For multi-file changes, produce a step-by-step plan before editing; use TodoWrite or a plan document.

🔴 28% · scope-discipline

🔴 Unsolicited protovalidate.New() refactor spread across 6 files in 4 unrelated packages; user asked only for SHA256 dedup fix. · High Severity

💡 Scope commits to the requested change; propose separate PRs for orthogonal refactors.

🟠 app/controlplane/pkg/unmarshal/unmarshal.go and pkg/credentials/manager/manager.go changed without any user request or session mention. · Medium Severity

🔴 22% · alignment

🔴 AI's PR title and description describe a SHA256/CAS upload optimization; the actual diff contains only protovalidate refactoring. · High Severity

💡 Manually verify the committed diff matches your intent before accepting an AI-generated PR description.

🔴 AI narrated switching Uploader.UploadFile to Uploader.Upload with pre-computed digest; no such interface change appears in the diff. · High Severity

🟠 User requested SHA256 deduplication fix; delivered changes are an unrelated protovalidate API modernization across 7 files. · Medium Severity

💡 Confirm the branch contains the intended change, not a different refactor, before merging.

File Attribution

█░░░░░░░░░░░░░░░░░░░ 9% AI / 91% Human

Status	Attribution	File	Lines
modified	human	`app/controlplane/pkg/unmarshal/unmarshal.go`	+8 / -15
modified	human	`pkg/attestation/crafter/api/attestation/v1/crafting_state_validations.go`	+2 / -7
modified	human	`pkg/attestation/crafter/crafter.go`	+1 / -8
modified	human	`pkg/credentials/manager/manager.go`	+2 / -7
modified	human	`pkg/policies/policies.go`	+1 / -7
modified	human	`app/controlplane/pkg/biz/casclient.go`	+1 / -6
modified	ai	`pkg/attestation/crafter/materials/materials.go`	+1 / -6

Policies (4)

Status	Policy	Material	Messages
✅ Passed	`ai-config-ai-agents-allowed`	`ai-coding-session-70c6c7`	-
✅ Passed	`ai-config-no-dangerous-commands`	`ai-coding-session-70c6c7`	-
✅ Passed	`ai-config-no-secrets`	`ai-coding-session-70c6c7`	-
✅ Passed	`ai-config-mcp-servers-allowed`	`ai-coding-session-70c6c7`	-

Powered by Chainloop and Chainloop Trace

cubic-dev-ai

No issues found across 7 files

_{Re-trigger cubic}

matiasinsaurralde requested a review from a team May 27, 2026 10:37

cubic-dev-ai Bot reviewed May 27, 2026

View reviewed changes

Conversation

matiasinsaurralde commented May 27, 2026

Summary

Benchmarks

Concurrency safety

Technical notes

Uh oh!

chainloop-platform Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Session Analysis

🟢 86% — 100% AI — ⚠️ 1 policies failing

Change Summary

AI Session Overall Score

AI Session Analysis Breakdown

🟢 92% · scope-discipline

🟢 92% · solution-quality

🟢 90% · alignment

🟢 88% · context-and-planning

🟢 88% · user-trust-signal

🟡 72% · verification

File Attribution

Policies (4, 1 failing)

🔴 40% — 9% AI — ✅ All policies passing

Change Summary

AI Session Overall Score

AI Session Analysis Breakdown

🟢 92% · user-trust-signal

🟢 90% · solution-quality

🟡 72% · verification

🟡 52% · context-and-planning

🔴 28% · scope-discipline

🔴 22% · alignment

File Attribution

Policies (4)

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

chainloop-platform Bot commented May 27, 2026 •

edited

Loading