tangle-network · drewstone · Jun 3, 2026 · Jun 3, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -66,9 +66,9 @@ The parser rejects absolute paths, `..`, control characters, and writes outside
 
 ## Eval Boundary
 
-Use `runKnowledgeBaseOptimization()` when comparing candidate knowledge bases on an actual task corpus. It delegates to `@tangle-network/agent-eval` multi-shot optimization, so single-turn and multi-turn agents share the same path.
+Compare candidate knowledge bases on an actual task corpus by running an `@tangle-network/agent-eval` improvement loop (`runImprovementLoop`) over the variants; each run is scored into a `RunRecord`.
 
-Use `knowledgeReleaseReportFromOptimization()` before promotion. It projects optimizer traces and `RunRecord` rows into `agent-eval` release confidence evidence.
+Use `knowledgeReleaseReport()` before promotion. It folds the candidate and baseline `RunRecord[]` (plus optional traces and the gate decision) into `agent-eval` release confidence evidence.
 
 ## Integration Boundaries
 

diff --git a/README.md b/README.md
@@ -29,7 +29,7 @@ Two ways in, depending on what you're doing:
 - **Drive it from an agent** → pick the primitive by intent:
   - *"Does the agent have enough context to run?"* → [`buildEvalKnowledgeBundle`](#agent-eval-integration) (block / ask / acquire before execution).
   - *"Grow the KB as a researcher"* → [`runKnowledgeResearchLoop`](#research-loop) (deterministic mechanics; your agent owns judgment) or the sandbox [researcher profile](#researcher-profile) for `runLoop`.
-  - *"Does this candidate KB actually improve task success?"* → `runKnowledgeBaseOptimization` ([Agent-Eval integration](#agent-eval-integration)).
+  - *"Does this candidate KB actually improve task success?"* → run an [agent-eval improvement loop](#agent-eval-integration) over KB variants, then `knowledgeReleaseReport` for the promotion decision.
   - *"Keep live authorities fresh"* → [pluggable sources](#pluggable-knowledge-sources) + `detectChanges` → eval re-runs.
 
 Storage stays consumer-owned via `KbStore` (`MemoryKbStore`, `FileSystemKbStore`, or your own D1/Postgres). Every primitive below is source-grounded: claims cite immutable source records, and lint fails on un-grounded citations.
@@ -98,7 +98,7 @@ from `@tangle-network/agent-knowledge`.
   hit *in this result set* (top hit = 1, others = score / topScore) — use it
   when comparing against natural confidence thresholds. The normalization is
   within-set ranking, not a cross-query absolute confidence.
-- Optimization uses `@tangle-network/agent-eval` internally instead of reimplementing eval gates.
+- Release confidence uses `@tangle-network/agent-eval` release gates (`evaluateReleaseConfidence`) instead of reimplementing them.
 - `buildEvalKnowledgeBundle()` maps wiki/search evidence into
   `agent-eval` `KnowledgeRequirement`, `KnowledgeBundle`, and
   `KnowledgeReadinessReport` contracts so control loops can block, ask, or
@@ -108,9 +108,9 @@ The `/viz` subpath exports graph insight helpers without UI dependencies.
 
 ## Agent-Eval Integration
 
-Use `runKnowledgeBaseOptimization()` when the question is whether a candidate knowledge base actually improves agent task success. The candidate is passed through `runMultiShotOptimization`, so `n=1` single-turn tasks and variable-length multi-turn traces use the same path.
+To answer whether a candidate knowledge base actually improves agent task success, run an `@tangle-network/agent-eval` improvement loop (`runImprovementLoop`) over your KB variants on a real task corpus; each run is scored into a `RunRecord`.
 
-Use `knowledgeReleaseReportFromOptimization()` to turn optimizer output into release confidence evidence using `agent-eval` release gates and `RunRecord` validation.
+Use `knowledgeReleaseReport()` before promotion. Pass the candidate and baseline `RunRecord[]` (plus optional `ReleaseTraceEvidence` and the gate decision); it folds them into a `ReleaseConfidenceScorecard` and a `KnowledgeRelease` using `agent-eval`'s release gates and `RunRecord` validation.
 
 Use `buildEvalKnowledgeBundle()` before execution when the question is whether
 the agent has enough task-world context to run:

diff --git a/docs/architecture.md b/docs/architecture.md
@@ -9,7 +9,7 @@ It does not try to be a vector database, a RAG framework, or a product-specific
 - claims with source references
 - deterministic indexing, graph construction, search, and lint
 - safe LLM write proposals
-- eval-gated optimization through `@tangle-network/agent-eval`
+- eval-gated release confidence through `@tangle-network/agent-eval`
 - visualization DTOs under the `/viz` subpath
 - storage contracts with memory/filesystem reference adapters
 - discovery worker/dispatcher contracts
@@ -18,7 +18,7 @@ It does not try to be a vector database, a RAG framework, or a product-specific
 
 ## Boundaries
 
-`agent-eval` owns traces, ASI, multi-shot optimization, run records, and promotion gates.
+`agent-eval` owns traces, ASI, improvement loops, run records, and promotion gates.
 
 `agent-knowledge` owns sources, claims, pages, graph/search/lint, and knowledge base candidates. It calls `agent-eval` instead of reimplementing evaluation.
 
@@ -34,7 +34,7 @@ Core does not own a D1 schema or fleet dispatcher. Apps wire `KbStore` and `Know
 4. Validate paths, citations, links, and schema.
 5. Index generated knowledge pages.
 6. Search and graph-lint the knowledge base.
-7. Evaluate candidate KB variants with `runKnowledgeBaseOptimization`.
+7. Evaluate candidate KB variants with an `agent-eval` improvement loop, then fold the resulting run records into release confidence with `knowledgeReleaseReport`.
 8. Promote only variants that pass downstream gates.
 
 ## CLI

diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@tangle-network/agent-knowledge",
-  "version": "1.5.2",
+  "version": "1.6.0",
   "description": "Source-grounded, eval-gated knowledge growth primitives for agents.",
   "homepage": "https://github.com/tangle-network/agent-knowledge#readme",
   "repository": {
@@ -63,21 +63,25 @@
     "format": "biome format --write src tests"
   },
   "dependencies": {
-    "@tangle-network/agent-eval": "^0.42.0",
-    "@tangle-network/agent-runtime": "^0.25.0",
+    "@tangle-network/agent-eval": "^0.77.0",
+    "@tangle-network/agent-runtime": "^0.44.0",
     "zod": "^4.3.6"
   },
   "devDependencies": {
     "@biomejs/biome": "^2.4.15",
-    "@tangle-network/sandbox": "^0.3.0",
+    "@tangle-network/sandbox": "^0.4.0",
     "@types/node": "^25.6.0",
     "tsup": "^8.0.0",
     "typescript": "^5.7.0",
     "vitest": "^3.0.0"
   },
   "pnpm": {
     "minimumReleaseAge": 4320,
-    "minimumReleaseAgeExclude": []
+    "minimumReleaseAgeExclude": [
+      "@tangle-network/agent-eval",
+      "@tangle-network/agent-runtime",
+      "@tangle-network/sandbox"
+    ]
   },
   "engines": {
     "node": ">=20"

diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
diff --git a/src/release.ts b/src/release.ts
@@ -21,9 +21,8 @@ export interface KnowledgeReleaseReport {
  * loop) supplies the candidate/baseline `RunRecord[]` (e.g. via
  * `campaignToRunRecords`) + optional per-instance `ReleaseTraceEvidence` + the
  * gate decision; this folds them into a `ReleaseConfidenceScorecard` + a
- * `KnowledgeRelease`. Decoupled from any optimizer result shape — agent-eval's
- * legacy multi-shot orchestration (and its `MultiShotOptimizationResult`) was
- * removed in 0.42; release confidence is computed from records + traces.
+ * `KnowledgeRelease`. Release confidence is computed from run records + traces,
+ * independent of any optimizer result shape.
  */
 export interface KnowledgeReleaseInput {
   candidateId: string