diff --git a/examples/README.md b/examples/README.md
index 980d6a7c..567b849e 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -20,6 +20,7 @@ service keys.
 | [Swap an OCR model with one identifier change](./document-ocr) | Driving recognition (VLM-OCR), structured extraction (Donut), and zero-shot NER (GLiNER) through the same `extract` call by swapping the model ID | `extract` | Docker Compose plus Node UI, no API key required, hosted version on [Hugging Face Spaces](https://huggingface.co/spaces/superlinked/document-ocr) | Runnable demo |
 | [A Stripe Link checkout with an SIE fraud-risk gate](./stripe-link-fraud) | Wiring all three SIE primitives into a pre-authorization fraud-risk gate that runs in the same round-trip as the Stripe PaymentIntent | `extract`, `encode`, `score` | Docker Compose plus Node UI; Stripe test-mode keys optional (runs in mock mode without them) | Runnable demo |
 | [Vision-first document RAG](./vision-doc-rag) | Retrieving and answering questions over a multi-tenant page corpus by looking at page images — including scanned drawings — with OCR kept out of the score path | `encode`, `chat/completions`, `score` (optional) | GPU SIE deployment required: ColQwen2.5 retriever + Qwen3.5-4B answer model (runs on the generation bundle) | Runnable demo |
+| [Multi-model contract review with the OpenAI Agents SDK](./contract-review-agent) | Running an OpenAI Agents SDK agent whose every model call — triage, orchestration, vision, OCR, embeddings, rerank, entity extraction, text-to-SQL, reasoning, and a safety guardrail — is served by one SIE cluster, each step on the right catalog model, with per-model observability | `chat/completions`, `encode`, `score`, `extract` | GPU SIE deployment required; standalone `uv` project; real contracts fetched from CUAD (CC BY 4.0) | Runnable demo |
 
 For docs publishing, lead with the quickest runnable demos, then use the
 benchmark and evaluation examples for deeper technical users.
diff --git a/examples/contract-review-agent/.env.example b/examples/contract-review-agent/.env.example
new file mode 100644
index 00000000..b99497bb
--- /dev/null
+++ b/examples/contract-review-agent/.env.example
@@ -0,0 +1,4 @@
+# Point these at any SIE deployment that serves /v1/chat/completions (a GPU
+# cluster — see README "Run it"). Defaults match a local CUDA container.
+SIE_CLUSTER_URL=http://localhost:8080
+SIE_API_KEY=
diff --git a/examples/contract-review-agent/.gitignore b/examples/contract-review-agent/.gitignore
new file mode 100644
index 00000000..ea00f029
--- /dev/null
+++ b/examples/contract-review-agent/.gitignore
@@ -0,0 +1,8 @@
+.venv/
+__pycache__/
+.env
+*.log
+uv.lock
+.ruff_cache/
+# Generated sample artifacts (recreate with `uv run make-sample`)
+contract_review_agent/data/generated/
diff --git a/examples/contract-review-agent/README.md b/examples/contract-review-agent/README.md
new file mode 100644
index 00000000..db9b8ab7
--- /dev/null
+++ b/examples/contract-review-agent/README.md
@@ -0,0 +1,111 @@
+# Contract review with the OpenAI Agents SDK, on one SIE cluster
+
+A multi-agent contract reviewer built with the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/) where **every model call is served by SIE** — no `api.openai.com`, no per-token bill. An **investigator** agent autonomously calls tools to gather grounded facts, then a **synthesizer** agent turns them into a structured review — each step running on the **right model from the SIE catalog**: a fast triage model, a vision model that reads the scanned signature page, a reasoning sub-agent for clause risk, a text-to-SQL specialist, an OCR model, embedding + reranker models for clause search, a zero-shot entity extractor, and a safety guardrail. Ten specialized jobs, one cluster, one request.
+
+This is the "one cluster powers every model your agent calls" idea from the [SIE landing page](https://superlinked.com), made real and runnable.
+
+## The catalog: the right model for each job
+
+Every value below is a real model in the [SIE catalog](https://superlinked.com/models). Swap any line in `config.yaml` to try another — nothing else changes.
+
+| Role in the agent | SIE model | SIE function |
+|---|---|---|
+| Triage — classify the document type | `Qwen/Qwen3-0.6B` | chat |
+| **Orchestrator** — plan, call tools, assemble the review | `Qwen/Qwen3-4B-Instruct-2507` (alias `code`) | chat + tools + JSON schema |
+| Vision — read the scanned signature page | `Qwen/Qwen3.5-4B` | chat + image |
+| Reasoning sub-agent — clause-risk analysis | `Qwen/Qwen3-4B-Instruct-2507` (↑ `Qwen3.5-4B` / `Qwen3.6-27B` where served) | chat |
+| Text-to-SQL — query the obligations DB | `defog/sqlcoder-7b-2` | completions |
+| Guardrail — safety / prompt-injection | `ibm-granite/granite-guardian-3.0-2b` (alias `guard`) | chat |
+| OCR — scanned page → markdown | `lightonai/LightOnOCR-2-1B` | extract |
+| Clause search — dense embeddings | `BAAI/bge-m3` | encode |
+| Clause rerank — cross-encoder | `Qwen/Qwen3-Reranker-4B` | score |
+| Entity extraction — parties, dates, amounts | `urchade/gliner_large-v2.1` | extract |
+
+## How it works
+
+The whole trick is one idea: **the Agents SDK speaks the OpenAI wire protocol, and SIE serves an OpenAI-compatible `/v1` endpoint.** So we point the SDK at SIE and force chat completions (`contract_review_agent/runtime.py`):
+
+```python
+client = AsyncOpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
+set_default_openai_client(client)        # every agent talks to SIE...
+set_default_openai_api("chat_completions")  # ...over chat completions, not the Responses API...
+set_tracing_disabled(True)               # ...and we never phone home with traces.
+```
+
+After that, each `Agent` just names the SIE model it should run on:
+
+```python
+Agent(name="Risk Analyst", model=OpenAIChatCompletionsModel("Qwen/Qwen3-4B-Instruct-2507", openai_client=client), ...)
+```
+
+The flow is **two agents** (which is what keeps a small open model reliable):
+
+1. An **investigator** (on `Qwen3-4B-Instruct`) with seven tools and **no** structured `output_type` — so it can't short-circuit to a hallucinated answer and instead must call tools to learn anything about the contract:
+   - `classify_document` (triage) · `read_signature_page` (vision) · `analyze_clause_risks` (delegates to the reasoning **sub-agent**) — generative LLMs
+   - `ocr_signature_page` · `extract_entities` (`extract`), `search_clauses` (`encode` + `score`), `query_obligations_db` (`completions`) — retrieval & extraction
+   - a `granite-guardian` **input guardrail** screens the request first (and fails open, logged, if the guard model is unavailable).
+2. A **synthesizer** (structured `output_type=ContractReview`, no tools) turns the investigator's grounded findings into the final review — parties, dates, governing law, executed?, key obligations, risk flags with severity + redlines, recommendation — via SIE's JSON-schema-constrained generation.
+
+> Why two agents? With a structured `output_type`, a small model tends to emit the schema immediately and skip the tools (it will even hallucinate the fields). Splitting "gather with tools" from "format the result" keeps the fan-out real and the output grounded.
+
+## Run it
+
+You need Python 3.12 and a **GPU-backed SIE deployment** — the generative models run on SIE's generation bundle (CUDA), so the `latest-cpu-default` image can't serve them.
+
+```bash
+# 1. SIE on a local NVIDIA GPU, or point SIE_CLUSTER_URL / SIE_API_KEY at a managed GPU cluster.
+docker run --gpus all -p 8080:8080 -v sie-hf-cache:/app/.cache/huggingface \
+  ghcr.io/superlinked/sie-server:latest-cuda12-default
+
+cd examples/contract-review-agent
+cp .env.example .env          # edit SIE_CLUSTER_URL / SIE_API_KEY if not localhost
+uv sync
+
+# 2. Fetch a handful of real contracts from CUAD (CC BY 4.0). Downloads a ~18 MB archive once.
+uv run fetch-contracts                 # or: uv run make-sample  (offline synthetic contracts)
+
+# 3. Review the first contract and watch the model fan-out.
+uv run review                          # uv run review --list   to see available contracts
+uv run review --contract <slug>        # review a specific one
+```
+
+> **GPU sizing.** `reasoning` defaults to `Qwen/Qwen3-4B-Instruct-2507` (reliable, fast) so the demo
+> runs on a single mid-size GPU; swap in the newer `Qwen/Qwen3.5-4B` or the stronger `Qwen/Qwen3.6-27B` (H100/RTX PRO 6000) where the cluster serves them. A cold
+> cluster pays a one-time load per model on first use; the agent retries the "still
+> provisioning" responses under `cluster.provision_timeout_s`. Keep bundles warm
+> (`minReplicas: 1`) to skip the wait — and any model the cluster can't serve degrades
+> gracefully (logged in the ledger) instead of failing the run.
+
+## What you'll see
+
+`uv run review` prints the model catalog, runs the agent, then prints the structured review **plus a per-model observability ledger** — each step's model, SIE function, **cold-start warm-up**, warm latency, data sent, and **warm throughput (tokens/s)** — so you can watch one cluster fan a single request across the catalog and see how each model performed. (Warm-up is shown separately from throughput for the generative calls; the `encode`/`score`/`extract` calls go through the SIE SDK, which provisions internally, so those show total latency.) Try `--instruction "..."` to change the ask, or feed the guardrail a malicious prompt to watch `granite-guardian` trip the tripwire.
+
+## Swapping models (the point of the catalog)
+
+`config.yaml` maps each role to a model id. Change a string, rerun — no code edits:
+
+```yaml
+models:
+  reasoning: "Qwen/Qwen3.6-27B"               # default 4B runs anywhere; bump to 27B on an H100-class cluster
+  ocr: "opendatalab/MinerU2.5-Pro-2604-1.2B"  # try a different OCR model
+```
+
+Alternatively, resolve roles **server-side** with SIE's gateway aliases — set
+`SIE_GATEWAY_MODEL_ALIASES='{"vision":"Qwen/Qwen3.5-4B","ocr":"lightonai/LightOnOCR-2-1B"}'`
+and reference `vision` / `ocr` (the built-ins `code`, `sql`, `guard` already ship).
+
+## Data
+
+The default corpus is **[CUAD](https://www.atticusprojectai.org/cuad/)** (Contract Understanding Atticus Dataset) — 510 real commercial contracts filed with the SEC, released by The Atticus Project under **CC BY 4.0**. `fetch-contracts` downloads CUAD's ~18 MB archive once (from the [Atticus Project repo](https://github.com/TheAtticusProject/cuad)), parses the SQuAD-format contract text, writes a curated handful as the corpus, renders one page to an image for the OCR/vision step, and seeds a small SQLite obligations database that references the contracts pulled.
+
+> CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review. Dan Hendrycks, Collin Burns, Anya Chen, Spencer Ball. arXiv:2103.06268. Licensed CC BY 4.0.
+
+`uv run make-sample` builds a fully synthetic, offline alternative (an Acme MSA, an NDA, and an SOW) so the demo runs with no network.
+
+## Notes
+
+- Chat completions, tool calling, JSON-schema structured output, vision, and `/v1/completions` (for `sqlcoder`) are all served over SIE's OpenAI-compatible API.
+- `sqlcoder-7b-2` is a completion model used with its native text-to-SQL template; for higher accuracy you can instead point `sql` at the `code`-aliased instruct model.
+- This is a demo of inference orchestration, **not legal advice**.
+
+Apache-2.0, like the rest of SIE.
diff --git a/examples/contract-review-agent/config.yaml b/examples/contract-review-agent/config.yaml
new file mode 100644
index 00000000..fc16f707
--- /dev/null
+++ b/examples/contract-review-agent/config.yaml
@@ -0,0 +1,36 @@
+# SIE deployment. /v1/chat/completions is served by SIE's generation bundle
+# (GPU), so use a CUDA image locally:
+#   docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cuda12-default
+# or override with SIE_CLUSTER_URL / SIE_API_KEY to target a managed GPU cluster.
+cluster:
+  url: "http://localhost:8080"
+  api_key: ""
+  gpu: ""                  # only set for managed multi-GPU clusters (e.g. "l4-spot"); ignored locally
+  provision_timeout_s: 900 # cold clusters scale from zero — first call to each model pays a load
+
+# ── The heart of the demo: one cluster, many models — the right one for each job. ──
+# Every value is a real model in the SIE catalog (https://superlinked.com/models).
+# Swap any line to try a different model; no other code changes.
+models:
+  # Generative LLMs — the agent "brains" (chat / tools / structured output):
+  triage: "Qwen/Qwen3-0.6B"                          # fast, cheap doc-type classifier (no tools)
+  orchestrator: "Qwen/Qwen3-4B-Instruct-2507"        # plans, calls tools, assembles output (alias: code)
+  vision: "Qwen/Qwen3.5-4B"                          # reads scanned / signature pages (text + image)
+  reasoning: "Qwen/Qwen3-4B-Instruct-2507"           # clause-risk analysis (reliable, fast); swap to newer Qwen/Qwen3.5-4B or stronger Qwen/Qwen3.6-27B where the cluster serves them
+  sql: "defog/sqlcoder-7b-2"                         # text-to-SQL specialist (completions + native template)
+  guard: "ibm-granite/granite-guardian-3.0-2b"       # safety / prompt-injection guardrail (alias: guard)
+  # Retrieval + extraction tools (encode / score / extract):
+  ocr: "lightonai/LightOnOCR-2-1B"                   # scanned PDF / image -> markdown (latest OCR model)
+  embed: "BAAI/bge-m3"                               # dense embeddings for clause search
+  rerank: "Qwen/Qwen3-Reranker-4B"                   # cross-encoder rerank of retrieved clauses
+  entities: "urchade/gliner_large-v2.1"              # zero-shot entity extraction (parties, dates, amounts)
+
+# Tunables for the SIE-backed tools.
+search:
+  top_k_candidates: 12   # clauses retrieved by embedding similarity
+  top_k_results: 4       # clauses kept after rerank
+
+guard:
+  # granite-guardian emits a "yes" (unsafe) / "no" (safe) verdict. Trip the
+  # guardrail when P(unsafe) clears this threshold. 0.5 is recall-biased.
+  threshold: 0.5
diff --git a/examples/contract-review-agent/contract_review_agent/__init__.py b/examples/contract-review-agent/contract_review_agent/__init__.py
new file mode 100644
index 00000000..af59174f
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/__init__.py
@@ -0,0 +1,6 @@
+"""Contract review with the OpenAI Agents SDK, served entirely by SIE.
+
+One orchestrator agent drives specialist sub-agents and SIE-backed tools, each
+running on a different model from the SIE catalog — the "one cluster powers
+every model your agent calls" story, made runnable.
+"""
diff --git a/examples/contract-review-agent/contract_review_agent/app.py b/examples/contract-review-agent/contract_review_agent/app.py
new file mode 100644
index 00000000..837de71d
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/app.py
@@ -0,0 +1,114 @@
+"""Assemble the multi-agent app: an orchestrator on one model, a risk-analyst
+sub-agent on another, SIE-backed tools, a safety guardrail, and a structured
+output type."""
+
+from __future__ import annotations
+
+from typing import Any
+
+from agents import Agent, RunResult, Runner
+from pydantic import BaseModel
+
+from .guardrails import safety_guardrail
+from .runtime import AppContext, model_for
+from .tools import ALL_TOOLS
+
+
+class RiskFlag(BaseModel):
+    clause: str
+    issue: str
+    severity: str  # low | medium | high
+    suggested_redline: str
+
+
+class ContractReview(BaseModel):
+    """The structured deliverable the orchestrator must produce."""
+
+    document_type: str
+    parties: list[str]
+    effective_date: str  # "unknown" if not stated
+    renewal_terms: str
+    governing_law: str  # "unknown" if not stated
+    executed: bool  # is the signature page signed and dated?
+    key_obligations: list[str]
+    risk_flags: list[RiskFlag]
+    recommendation: str
+
+
+# The investigator has NO output_type on purpose: a structured output_type gives a
+# weak model an escape hatch to emit the schema immediately instead of using tools.
+# With only tools available, it must call them to do its job.
+_INVESTIGATOR_INSTRUCTIONS = """\
+You are a contract investigator. You have NO prior knowledge of this contract — the
+ONLY way to learn anything is to CALL YOUR TOOLS. Investigate thoroughly: call EVERY
+one of these tools, one after another, before you write anything.
+
+- classify_document() — the document type
+- ocr_signature_page() — read the executed signature page (signatories, titles, date)
+- extract_entities() — parties, dates, amounts, governing law
+- read_signature_page("Are both parties' signatures present and dated?") — visual execution check
+- search_clauses("automatic renewal"), then search_clauses("limitation of liability"),
+  then search_clauses("indemnification"), then search_clauses("termination")
+- analyze_clause_risks(<the clause text you found>) — risk analysis with severities
+- query_obligations_db("upcoming obligations with due dates and amounts") — deadlines
+
+Do NOT write your report until you have called them all. Then write a thorough,
+factual findings report that cites ONLY what the tools returned. Never invent a party,
+date, number, or clause — if a tool failed, say so."""
+
+_SYNTHESIZER_INSTRUCTIONS = """\
+You turn a contract investigator's findings into a structured ContractReview. Use
+ONLY the findings provided — never add facts. If the findings don't establish a
+field, use "unknown" (or false for `executed`). Make key_obligations and risk_flags
+specific and grounded in the findings, and give a clear recommendation."""
+
+
+def build_reasoning_agent(cfg: dict[str, Any], client: Any) -> Agent:
+    return Agent(
+        name="Risk Analyst",
+        instructions=(
+            "You are a senior contracts attorney. Given contract clauses, identify "
+            "risks to the Customer. For each, state the clause, the issue, a severity "
+            "(low/medium/high), and a concrete one-line redline. Be specific and brief."
+        ),
+        model=model_for(cfg["models"]["reasoning"], client),
+    )
+
+
+def build_investigator(cfg: dict[str, Any], client: Any) -> Agent:
+    """Autonomous tool-using agent (no output_type) that gathers grounded findings."""
+    return Agent(
+        name="Contract Investigator",
+        instructions=_INVESTIGATOR_INSTRUCTIONS,
+        model=model_for(cfg["models"]["orchestrator"], client),
+        tools=ALL_TOOLS,
+        input_guardrails=[safety_guardrail],
+    )
+
+
+def build_synthesizer(cfg: dict[str, Any], client: Any) -> Agent:
+    """Structured-output agent (no tools) that formats the findings into a review."""
+    return Agent(
+        name="Contract Reviewer",
+        instructions=_SYNTHESIZER_INSTRUCTIONS,
+        model=model_for(cfg["models"]["orchestrator"], client),
+        output_type=ContractReview,
+    )
+
+
+async def run_review(
+    app: AppContext, investigator: Agent, synthesizer: Agent, instruction: str
+) -> tuple[RunResult, RunResult]:
+    """Investigate with tools (autonomous fan-out), then synthesize the structured review."""
+    gather = await Runner.run(
+        investigator,
+        f"{instruction}\n\nInvestigate the contract using your tools, then report your findings.",
+        context=app,
+        max_turns=20,
+    )
+    synth = await Runner.run(
+        synthesizer,
+        f"Investigator findings:\n\n{gather.final_output}\n\nProduce the ContractReview.",
+        context=app,
+    )
+    return gather, synth
diff --git a/examples/contract-review-agent/contract_review_agent/cli.py b/examples/contract-review-agent/contract_review_agent/cli.py
new file mode 100644
index 00000000..eeb310ce
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/cli.py
@@ -0,0 +1,270 @@
+"""Run the contract-review agent over one contract and show the model fan-out."""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+import json
+import time
+from pathlib import Path
+
+from agents import InputGuardrailTripwireTriggered
+from rich.console import Console
+from rich.panel import Panel
+from rich.table import Table
+from sie_sdk import SIEAsyncClient
+
+from .app import ContractReview, build_investigator, build_reasoning_agent, build_synthesizer, run_review
+from .config import load_config
+from .data import CUAD_DIR, GENERATED_DIR, MANIFEST_PATH, make_sample
+from .runtime import AppContext, Ledger, chat_once, configure_runtime, make_openai_client
+
+console = Console()
+
+# (config role, human job, SIE function) — drives the catalog table.
+ROLE_INFO = [
+    ("orchestrator", "Plan, call tools, assemble the review", "chat + tools"),
+    ("triage", "Classify the document type (fast)", "chat"),
+    ("vision", "Read the scanned signature page", "chat + image"),
+    ("reasoning", "Clause-risk specialist (sub-agent)", "chat"),
+    ("sql", "Text-to-SQL over the obligations DB", "completions"),
+    ("guard", "Safety / prompt-injection guardrail", "chat"),
+    ("ocr", "Scanned page → markdown", "extract"),
+    ("embed", "Clause search (embeddings)", "encode"),
+    ("rerank", "Rerank retrieved clauses", "score"),
+    ("entities", "Entity extraction (parties, dates, $)", "extract"),
+]
+
+
+def _print_catalog(cfg: dict) -> None:
+    table = Table(title="One SIE cluster · the right model for each job", title_style="bold")
+    table.add_column("Role", style="cyan")
+    table.add_column("SIE catalog model", style="green")
+    table.add_column("SIE function", style="magenta")
+    table.add_column("Job")
+    for role, job, fn in ROLE_INFO:
+        table.add_row(role, cfg["models"][role], fn, job)
+    console.print(table)
+
+
+def _print_ledger(ledger: Ledger) -> None:
+    table = Table(title="Per-model observability (in call order)", title_style="bold")
+    table.add_column("#", justify="right")
+    table.add_column("Step")
+    table.add_column("Model", style="green")
+    table.add_column("SIE fn", style="magenta")
+    table.add_column("Warm-up", justify="right")
+    table.add_column("Latency", justify="right")
+    table.add_column("Sent", justify="right")
+    table.add_column("Got", justify="right")
+    table.add_column("Throughput", justify="right")
+    warmup_total = 0.0
+    call_total = 0.0
+    for i, e in enumerate(ledger.entries, 1):
+        warmup_total += e.warmup_s
+        call_total += e.latency_s
+        table.add_row(
+            str(i), e.step, e.model.split("/")[-1], e.sie_fn,
+            f"{e.warmup_s:.1f}s" if e.warmup_s else "—",
+            f"{e.latency_s:.2f}s" if e.latency_s else "—",
+            e.sent or "—", e.got or "—", e.throughput or "—",
+        )
+    console.print(table)
+    used = {e.model for e in ledger.entries}
+    console.print(f"[bold]{len(used)} distinct SIE models[/] handled this request — "
+                  f"{warmup_total:.0f}s cold-start (model warm-up) + {call_total:.0f}s warm calls.")
+
+
+def _print_summary(cfg: dict, usage, wall_s: float) -> None:
+    parts = [f"end-to-end wall time [bold]{wall_s:.1f}s[/]"]
+    reqs = getattr(usage, "requests", None)
+    if reqs is not None:
+        it = getattr(usage, "input_tokens", 0) or 0
+        ot = getattr(usage, "output_tokens", 0) or 0
+        orch = cfg["models"]["orchestrator"].split("/")[-1]
+        parts.append(f"investigator {orch}: {reqs} LLM calls, {it:,} in / {ot:,} out tok")
+    console.print("Run summary — " + " · ".join(parts))
+
+
+def _print_review(review: ContractReview) -> None:
+    lines = [
+        f"[bold]Document type[/]: {review.document_type}",
+        f"[bold]Parties[/]: {', '.join(review.parties) or '—'}",
+        f"[bold]Effective date[/]: {review.effective_date or '—'}",
+        f"[bold]Governing law[/]: {review.governing_law or '—'}",
+        f"[bold]Executed[/]: {review.executed}",
+        f"[bold]Renewal terms[/]: {review.renewal_terms}",
+        "",
+        "[bold]Key obligations[/]:",
+        *[f"  • {o}" for o in review.key_obligations],
+        "",
+        "[bold]Recommendation[/]:",
+        f"  {review.recommendation}",
+    ]
+    console.print(Panel("\n".join(lines), title="Contract review", border_style="blue"))
+
+    if review.risk_flags:
+        risks = Table(title="Risk flags", title_style="bold red")
+        risks.add_column("Severity", style="bold")
+        risks.add_column("Clause")
+        risks.add_column("Issue")
+        risks.add_column("Suggested redline")
+        sev_color = {"high": "red", "medium": "yellow", "low": "green"}
+        for f in review.risk_flags:
+            color = sev_color.get(f.severity.lower(), "white")
+            risks.add_row(f"[{color}]{f.severity}[/]", f.clause, f.issue, f.suggested_redline)
+        console.print(risks)
+
+
+def _resolve_corpus(args) -> tuple[str, str, str, str]:
+    """Return (contract_text, scan_path, db_path, label)."""
+    # Explicit file path wins.
+    if args.contract and Path(args.contract).is_file():
+        p = Path(args.contract)
+        scan = args.scan or str(GENERATED_DIR / "acme-msa-signature.png")
+        return p.read_text(), scan, str(GENERATED_DIR / "obligations.db"), p.name
+
+    if MANIFEST_PATH.exists():  # real CUAD corpus
+        manifest = json.loads(MANIFEST_PATH.read_text())
+        slug = args.contract or manifest["primary"]
+        text = (CUAD_DIR / f"{slug}.txt").read_text()
+        scan = args.scan or str(GENERATED_DIR / manifest["scan_path"])
+        return text, scan, str(GENERATED_DIR / manifest["db_path"]), f"CUAD · {slug}"
+
+    # Offline fallback: synthetic corpus (generate it if missing).
+    if not (GENERATED_DIR / "acme-msa.md").exists():
+        console.print("[yellow]No corpus found — generating the synthetic one. "
+                      "Run `uv run fetch-contracts` for real CUAD contracts.[/]")
+        make_sample.main()
+    name = args.contract or "acme-msa"
+    text = (GENERATED_DIR / f"{name}.md").read_text()
+    scan = args.scan or str(GENERATED_DIR / "acme-msa-signature.png")
+    return text, scan, str(GENERATED_DIR / "obligations.db"), f"synthetic · {name}"
+
+
+def _list_contracts() -> None:
+    if MANIFEST_PATH.exists():
+        manifest = json.loads(MANIFEST_PATH.read_text())
+        console.print(f"[bold]CUAD corpus[/] ({manifest['license']}):")
+        for c in manifest["contracts"]:
+            console.print(f"  {c['slug']}  [dim]{c['type']} · {c['char_len']:,} chars[/]")
+    elif (GENERATED_DIR / "acme-msa.md").exists():
+        console.print("[bold]Synthetic corpus[/]: acme-msa, mutual-nda, acme-sow")
+    else:
+        console.print("No corpus yet. Run `uv run fetch-contracts` or `uv run make-sample`.")
+
+
+async def _warm(app: AppContext) -> None:
+    """Provision the generative models before the run.
+
+    The orchestrator and reasoning sub-agent call models through the Agents SDK,
+    which has no cold-start retry — so on a scale-from-zero cluster the first
+    call would fail. Touch each model once (our helpers retry while it loads).
+    """
+    m = app.cfg["models"]
+    # Only the orchestrator and reasoning sub-agent run through the Agents SDK,
+    # which has no cold-start retry — so only these must be warm before the run.
+    # Triage, vision, guard, SQL, and the encode/score/extract tools all retry
+    # while their model provisions, so they load lazily on first use.
+    for model in dict.fromkeys([m["orchestrator"], m["reasoning"]]):
+        with console.status(f"Warming {model} (first call provisions it on a cold cluster)..."):
+            try:
+                await chat_once(app, model, [{"role": "user", "content": "ok"}], max_tokens=1)
+            except Exception as exc:  # warm-up is best-effort; the run will retry
+                console.print(f"[yellow]warm-up: {model} not ready ({type(exc).__name__}); will retry during the run.[/]")
+    console.print("[green]Warm-up done.[/]\n")
+
+
+async def _run(args) -> None:
+    cfg = load_config()
+    text, scan_path, db_path, label = _resolve_corpus(args)
+
+    # Tool calls use our own provisioning-retry, so their client shouldn't also retry
+    # (max_retries=0). Agents-SDK calls can't be wrapped, so that client retries hard
+    # to survive a model being evicted/reloaded mid-run on a busy cluster.
+    tool_client = make_openai_client(cfg["cluster"]["url"], cfg["cluster"]["api_key"], max_retries=0)
+    agent_client = make_openai_client(cfg["cluster"]["url"], cfg["cluster"]["api_key"], max_retries=12, timeout_s=180)
+    configure_runtime(agent_client)
+
+    _print_catalog(cfg)
+    console.print(f"Reviewing [bold]{label}[/] against SIE at "
+                  f"[bold]{cfg['cluster']['url']}[/]\n")
+
+    async with SIEAsyncClient(cfg["cluster"]["url"], api_key=cfg["cluster"]["api_key"] or None) as sie:
+        ledger = Ledger()
+        app = AppContext(
+            sie=sie,
+            oai=tool_client,
+            cfg=cfg,
+            ledger=ledger,
+            contract_text=text,
+            scan_path=scan_path,
+            db_path=db_path,
+            reasoning_agent=build_reasoning_agent(cfg, agent_client),
+        )
+        investigator = build_investigator(cfg, agent_client)
+        synthesizer = build_synthesizer(cfg, agent_client)
+        if not args.no_warm:
+            await _warm(app)
+        t0 = time.monotonic()
+        try:
+            gather, result = await run_review(app, investigator, synthesizer, args.instruction)
+        except InputGuardrailTripwireTriggered:
+            console.print(Panel("Request blocked by the granite-guardian safety guardrail.",
+                                border_style="red", title="Guardrail tripped"))
+            _print_ledger(ledger)
+            await agent_client.close()
+            await tool_client.close()
+            return
+        except Exception as exc:  # a model the SDK calls (investigator/sub-agent) was unreachable
+            console.print(Panel(f"{type(exc).__name__}: {exc}", border_style="red",
+                                title="Run failed (model unavailable)"))
+            _print_ledger(ledger)
+            await agent_client.close()
+            await tool_client.close()
+            return
+        wall = time.monotonic() - t0
+        await agent_client.close()
+        await tool_client.close()
+
+        try:
+            review = result.final_output_as(ContractReview)
+        except Exception:
+            review = result.final_output
+
+        console.print()
+        if isinstance(review, ContractReview):
+            _print_review(review)
+        else:
+            console.print(Panel(str(review), title="Agent output (unstructured)"))
+        console.print()
+        _print_ledger(ledger)
+        usage = getattr(getattr(gather, "context_wrapper", None), "usage", None)
+        _print_summary(cfg, usage, wall)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Review a contract with a multi-model SIE agent.")
+    parser.add_argument("--contract", default=None,
+                        help="contract slug (CUAD), synthetic name, or a path to a .txt/.md file")
+    parser.add_argument("--scan", default=None, help="path to a signature-page image (png/jpg)")
+    parser.add_argument(
+        "--instruction",
+        default="Review this contract. Identify the parties and key terms, flag the "
+        "biggest risks to the Customer with severity and redlines, confirm it is "
+        "executed, and surface upcoming obligations and deadlines.",
+        help="what to ask the agent to do",
+    )
+    parser.add_argument("--list", action="store_true", help="list available contracts and exit")
+    parser.add_argument("--no-warm", action="store_true",
+                        help="skip pre-warming models (faster when the cluster is already warm)")
+    args = parser.parse_args()
+
+    if args.list:
+        _list_contracts()
+        return
+    asyncio.run(_run(args))
+
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/contract-review-agent/contract_review_agent/config.py b/examples/contract-review-agent/contract_review_agent/config.py
new file mode 100644
index 00000000..068526f4
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/config.py
@@ -0,0 +1,27 @@
+"""Load `config.yaml` and resolve the SIE endpoint from env / `.env`.
+
+Precedence for the cluster URL and key: real env var > `.env` file > `config.yaml`.
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+from typing import Any
+
+import yaml
+from dotenv import dotenv_values
+
+PROJECT_ROOT = Path(__file__).resolve().parent.parent
+
+
+def load_config() -> dict[str, Any]:
+    cfg = yaml.safe_load((PROJECT_ROOT / "config.yaml").read_text())
+    env = dotenv_values(PROJECT_ROOT / ".env")
+
+    cluster = cfg.setdefault("cluster", {})
+    cluster["url"] = (
+        os.environ.get("SIE_CLUSTER_URL") or env.get("SIE_CLUSTER_URL") or cluster.get("url") or "http://localhost:8080"
+    )
+    cluster["api_key"] = os.environ.get("SIE_API_KEY") or env.get("SIE_API_KEY") or cluster.get("api_key") or ""
+    return cfg
diff --git a/examples/contract-review-agent/contract_review_agent/data/__init__.py b/examples/contract-review-agent/contract_review_agent/data/__init__.py
new file mode 100644
index 00000000..202a8791
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/data/__init__.py
@@ -0,0 +1,15 @@
+"""Self-contained sample data for the contract-review demo.
+
+Run ``uv run make-sample`` (or ``python -m contract_review_agent.data.make_sample``)
+to generate everything under ``data/generated/``: the contract markdown, the
+"scanned" signature-page image, and the SQLite obligations database. Nothing is
+downloaded — it is all synthetic and safe to ship.
+"""
+
+from pathlib import Path
+
+DATA_DIR = Path(__file__).resolve().parent
+GENERATED_DIR = DATA_DIR / "generated"
+CUAD_DIR = GENERATED_DIR / "cuad"  # real contracts fetched from CUAD
+MANIFEST_PATH = GENERATED_DIR / "manifest.json"
+DB_PATH = GENERATED_DIR / "obligations.db"
diff --git a/examples/contract-review-agent/contract_review_agent/data/fetch_contracts.py b/examples/contract-review-agent/contract_review_agent/data/fetch_contracts.py
new file mode 100644
index 00000000..189d9309
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/data/fetch_contracts.py
@@ -0,0 +1,205 @@
+"""Fetch real contracts from CUAD to build the corpus the agent reviews.
+
+CUAD (Contract Understanding Atticus Dataset) is 510 real commercial contracts
+filed with the SEC, released by The Atticus Project under CC BY 4.0 — the
+dataset built specifically for contract review. We download the dataset's small
+(~18 MB) archive once, parse the SQuAD-format JSON, write a curated handful of
+full contracts to ``data/generated/cuad/``, render one page to an image for the
+OCR/vision step, and seed an obligations database that references them.
+
+    Dan Hendrycks, Collin Burns, Anya Chen, Spencer Ball. "CUAD: An
+    Expert-Annotated NLP Dataset for Legal Contract Review." arXiv:2103.06268.
+    Dataset: https://www.atticusprojectai.org/cuad — CC BY 4.0.
+
+Run: ``uv run fetch-contracts`` (offline alternative: ``uv run make-sample``).
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import sqlite3
+import sys
+import tempfile
+import zipfile
+from pathlib import Path
+
+import httpx
+
+from . import CUAD_DIR, DB_PATH, GENERATED_DIR, MANIFEST_PATH
+from .make_sample import SCHEMA_DDL
+from .render import render_text_page
+
+CUAD_ZIP_URL = "https://raw.githubusercontent.com/TheAtticusProject/cuad/main/data.zip"
+CITATION = (
+    "CUAD (Contract Understanding Atticus Dataset), The Atticus Project, "
+    "CC BY 4.0. Hendrycks et al., arXiv:2103.06268."
+)
+
+_TYPE_KEYWORDS = [
+    ("LICENSE", "License"),
+    ("RESELLER", "Reseller"),
+    ("DISTRIBUTOR", "Distribution"),
+    ("HOSTING", "Hosting"),
+    ("MAINTENANCE", "Maintenance"),
+    ("SUPPLY", "Supply"),
+    ("SERVICE", "Services"),
+    ("CONSULTING", "Consulting"),
+    ("DEVELOPMENT", "Development"),
+    ("MARKETING", "Marketing"),
+]
+
+
+def _slug(title: str) -> str:
+    s = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
+    return (s[:60] or "contract").rstrip("-")
+
+
+def _infer_type(title: str) -> str:
+    upper = title.upper()
+    for needle, label in _TYPE_KEYWORDS:
+        if needle in upper:
+            return label
+    return "Agreement"
+
+
+def _short_name(title: str) -> str:
+    # CUAD titles are long EDGAR filenames; keep a readable counterparty-ish label.
+    return re.sub(r"[_\-]+", " ", title).strip()[:48]
+
+
+def _download_archive() -> Path:
+    """Download CUAD's data.zip once, cached under data/generated/.cache/."""
+    dest = GENERATED_DIR / ".cache" / "cuad-data.zip"
+    if dest.exists():
+        return dest
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    print(f"Downloading CUAD archive (~18 MB) from {CUAD_ZIP_URL} ...", file=sys.stderr)
+    with tempfile.NamedTemporaryFile(delete=False, dir=dest.parent, suffix=".tmp") as tmp:
+        tmp_path = Path(tmp.name)
+        try:
+            with httpx.stream("GET", CUAD_ZIP_URL, follow_redirects=True, timeout=180) as resp:
+                resp.raise_for_status()
+                for chunk in resp.iter_bytes():
+                    tmp.write(chunk)
+        except BaseException:
+            tmp_path.unlink(missing_ok=True)
+            raise
+    tmp_path.rename(dest)
+    return dest
+
+
+def fetch(n: int, match: str | None) -> list[dict]:
+    """Return up to ``n`` full contracts from CUAD as {title, text, ...} dicts."""
+    archive = _download_archive()
+    with zipfile.ZipFile(archive) as z:
+        names = z.namelist()
+        json_name = next((m for m in names if m.lower().endswith("cuadv1.json")), None) or next(
+            m for m in names if m.lower().endswith(".json")
+        )
+        squad = json.loads(z.read(json_name))
+
+    contracts: list[dict] = []
+    for entry in squad["data"]:
+        title = (entry.get("title") or "").strip()
+        paragraphs = entry.get("paragraphs") or []
+        if not title or not paragraphs:
+            continue
+        if match and match.lower() not in title.lower():
+            continue
+        text = (paragraphs[0].get("context") or "").strip()
+        if not text:
+            continue
+        contracts.append(
+            {
+                "title": title,
+                "slug": _slug(title),
+                "type": _infer_type(title),
+                "counterparty": _short_name(title),
+                "text": text,
+                "char_len": len(text),
+            }
+        )
+        if len(contracts) >= n:
+            break
+    return contracts
+
+
+# Synthetic obligations attached to the real contracts, so the text-to-SQL tool
+# has deadlines/payments to query about the contracts actually in the corpus.
+_OBLIGATION_PLANS = [
+    ("Send renewal / non-renewal notice before term end", "us", "2026-09-15", None, "open"),
+    ("Annual subscription / license fee", "us", "2026-07-01", 120000.0, "open"),
+    ("Quarterly compliance attestation", "counterparty", "2026-06-30", None, "open"),
+    ("Prior-period true-up payment", "us", "2026-04-30", 45000.0, "done"),
+    ("Audit / records inspection response", "counterparty", "2026-06-10", None, "overdue"),
+]
+
+
+def build_obligations(contracts: list[dict]) -> int:
+    DB_PATH.unlink(missing_ok=True)
+    rows = []
+    for i, c in enumerate(contracts):
+        for j in range(3):
+            obligation, owner, due, amount, status = _OBLIGATION_PLANS[(i + j) % len(_OBLIGATION_PLANS)]
+            rows.append((c["counterparty"], c["type"], obligation, owner, due, amount, status))
+    conn = sqlite3.connect(DB_PATH)
+    try:
+        conn.execute(SCHEMA_DDL)
+        conn.executemany(
+            "INSERT INTO obligations "
+            "(counterparty, contract_type, obligation, owner, due_date, amount_usd, status) "
+            "VALUES (?, ?, ?, ?, ?, ?, ?)",
+            rows,
+        )
+        conn.commit()
+    finally:
+        conn.close()
+    return len(rows)
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Fetch real contracts from CUAD (CC BY 4.0).")
+    parser.add_argument("--contracts", type=int, default=4, help="how many contracts to pull")
+    parser.add_argument("--match", default=None, help="only titles containing this substring")
+    args = parser.parse_args()
+
+    CUAD_DIR.mkdir(parents=True, exist_ok=True)
+    contracts = fetch(args.contracts, args.match)
+    if not contracts:
+        raise SystemExit("No contracts matched. Try a different --match or run `uv run make-sample`.")
+
+    for c in contracts:
+        (CUAD_DIR / f"{c['slug']}.txt").write_text(c["text"])
+
+    primary = contracts[0]
+    scan_path = CUAD_DIR / f"{primary['slug']}-page.png"
+    render_text_page(primary["text"], scan_path, title=primary["title"])
+
+    n_obligations = build_obligations(contracts)
+
+    manifest = {
+        "source": "CUAD v1",
+        "license": "CC BY 4.0",
+        "citation": CITATION,
+        "primary": primary["slug"],
+        "scan_path": str(scan_path.relative_to(GENERATED_DIR)),
+        "db_path": str(DB_PATH.relative_to(GENERATED_DIR)),
+        "contracts": [
+            {k: c[k] for k in ("title", "slug", "type", "counterparty", "char_len")} for c in contracts
+        ],
+    }
+    MANIFEST_PATH.write_text(json.dumps(manifest, indent=2) + "\n")
+
+    print(f"Wrote {len(contracts)} contracts to {CUAD_DIR}:")
+    for c in contracts:
+        flag = "  (primary)" if c["slug"] == primary["slug"] else ""
+        print(f"  {c['type']:12s} {c['slug']}.txt  ({c['char_len']:,} chars){flag}")
+    print(f"Rendered scan page : {scan_path.name}")
+    print(f"Obligations DB     : {DB_PATH.name} ({n_obligations} rows)")
+    print(f"Source: {CITATION}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/contract-review-agent/contract_review_agent/data/make_sample.py b/examples/contract-review-agent/contract_review_agent/data/make_sample.py
new file mode 100644
index 00000000..9a1bd991
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/data/make_sample.py
@@ -0,0 +1,290 @@
+"""Generate the self-contained sample corpus for the demo.
+
+Produces, under ``data/generated/``:
+  * ``acme-msa.md``, ``mutual-nda.md``, ``acme-sow.md`` — contracts to review/index
+  * ``acme-msa-signature.png`` — the executed signature page, as a "scan" image
+  * ``obligations.db`` — a SQLite database the text-to-SQL tool queries
+
+Everything here is fictional. The MSA is seeded with genuinely risk-bearing
+clauses (auto-renewal, uncapped indemnity, vendor-friendly termination) so the
+risk-analysis agent has real findings to surface.
+"""
+
+from __future__ import annotations
+
+import sqlite3
+import textwrap
+
+from PIL import Image, ImageDraw, ImageFont
+
+from contract_review_agent.data import GENERATED_DIR
+
+# Treat this as "today" for the seeded obligation due dates.
+TODAY = "2026-06-22"
+
+# Single source of truth for the obligations schema: used both to create the
+# table and to prompt the text-to-SQL model (the inline comments help it).
+SCHEMA_DDL = """\
+CREATE TABLE obligations (
+    id            INTEGER PRIMARY KEY,
+    counterparty  TEXT NOT NULL,   -- legal name of the other party
+    contract_type TEXT NOT NULL,   -- one of: MSA, NDA, SOW
+    obligation    TEXT NOT NULL,   -- description of what is owed/required
+    owner         TEXT NOT NULL,   -- who must act: 'us' or 'counterparty'
+    due_date      TEXT NOT NULL,   -- ISO-8601 date (YYYY-MM-DD)
+    amount_usd    REAL,            -- dollar amount if a payment, else NULL
+    status        TEXT NOT NULL    -- one of: open, done, overdue
+)"""
+
+ACME_MSA = """\
+# Master Services Agreement
+
+This Master Services Agreement (the "Agreement") is entered into between
+Northwind Analytics, Inc. ("Customer") and Acme Cloud Services, LLC ("Provider").
+
+## 1. Definitions
+"Services" means the cloud data-processing services described in one or more
+Order Forms. "Order Form" means an ordering document executed by both parties
+that references this Agreement.
+
+## 2. Term and Renewal
+The initial term of this Agreement is twelve (12) months from the Effective
+Date. Thereafter, this Agreement automatically renews for successive twelve (12)
+month terms unless Customer provides written notice of non-renewal at least
+ninety (90) days before the end of the then-current term. Fees for each renewal
+term may increase by up to ten percent (10%) over the prior term.
+
+## 3. Fees and Payment
+Customer shall pay all fees within thirty (30) days of the invoice date.
+Late amounts accrue interest at 1.5% per month. All fees are non-refundable,
+including fees for partially used subscription periods.
+
+## 4. Termination
+Provider may terminate this Agreement or any Order Form for convenience upon
+thirty (30) days written notice. Customer may terminate only for Provider's
+uncured material breach, after a sixty (60) day cure period. Upon termination,
+Customer shall pay all fees due through the end of the then-current term.
+
+## 5. Limitation of Liability
+EXCEPT FOR CUSTOMER'S PAYMENT OBLIGATIONS AND CUSTOMER'S INDEMNIFICATION
+OBLIGATIONS, EACH PARTY'S TOTAL AGGREGATE LIABILITY SHALL NOT EXCEED THE FEES
+PAID BY CUSTOMER IN THE THREE (3) MONTHS PRECEDING THE CLAIM. PROVIDER SHALL NOT
+BE LIABLE FOR ANY INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES.
+
+## 6. Indemnification
+Customer shall defend, indemnify, and hold harmless Provider from any and all
+claims arising out of Customer's use of the Services, without limitation. This
+indemnity is uncapped and survives termination of this Agreement.
+
+## 7. Confidentiality
+Each party shall protect the other's Confidential Information using at least a
+reasonable standard of care, and shall not disclose it except to personnel with
+a need to know who are bound by confidentiality obligations.
+
+## 8. Data Protection
+Provider shall process Customer personal data only on Customer's documented
+instructions and shall maintain appropriate technical and organizational
+security measures. Provider may engage sub-processors, provided Provider remains
+liable for their performance.
+
+## 9. Service Levels
+Provider targets 99.9% monthly availability. If availability falls below the
+target, Customer's sole and exclusive remedy is a service credit equal to 5% of
+the monthly fee per 0.1% below target, capped at 25% of the monthly fee.
+
+## 10. Governing Law
+This Agreement is governed by the laws of the State of Delaware, without regard
+to its conflict-of-laws principles. The parties consent to the exclusive
+jurisdiction of the state and federal courts located in Wilmington, Delaware.
+
+## 11. Assignment
+Customer may not assign this Agreement without Provider's prior written consent.
+Provider may assign this Agreement freely, including to an acquirer of all or
+substantially all of its assets.
+
+## 12. Entire Agreement
+This Agreement, together with all Order Forms, constitutes the entire agreement
+between the parties and supersedes all prior agreements on its subject matter.
+"""
+
+MUTUAL_NDA = """\
+# Mutual Non-Disclosure Agreement
+
+This Mutual Non-Disclosure Agreement is entered into between Northwind
+Analytics, Inc. and Globex Corporation (each a "Party").
+
+## 1. Purpose
+The Parties wish to explore a potential business relationship and may disclose
+confidential information to one another for that purpose.
+
+## 2. Confidential Information
+"Confidential Information" means any non-public information disclosed by a Party,
+whether orally, in writing, or by inspection of tangible objects, that is
+designated confidential or that reasonably should be understood to be
+confidential.
+
+## 3. Obligations
+The receiving Party shall use Confidential Information solely for the Purpose,
+shall protect it with the same degree of care it uses for its own confidential
+information (but no less than reasonable care), and shall not disclose it to
+third parties without the disclosing Party's prior written consent.
+
+## 4. Term
+This Agreement remains in effect for two (2) years from the Effective Date.
+Confidentiality obligations survive for three (3) years after disclosure.
+
+## 5. Return of Materials
+Upon written request, the receiving Party shall promptly return or destroy all
+Confidential Information in its possession.
+
+## 6. Governing Law
+This Agreement is governed by the laws of the State of New York.
+"""
+
+ACME_SOW = """\
+# Statement of Work No. 1
+
+This Statement of Work ("SOW") is issued under and incorporates the Master
+Services Agreement between Northwind Analytics, Inc. and Acme Cloud Services,
+LLC.
+
+## 1. Scope
+Provider will deliver a managed data-ingestion pipeline, including connectors,
+transformation jobs, and a monitoring dashboard, as further described in
+Exhibit A.
+
+## 2. Deliverables and Milestones
+Milestone 1 (design) is due 30 days after the SOW Effective Date. Milestone 2
+(implementation) is due 90 days after. Milestone 3 (acceptance) is due 120 days
+after.
+
+## 3. Fees
+Fees for this SOW total USD 240,000, invoiced 25% per milestone and 25% on final
+acceptance. Travel expenses are billed at cost with prior written approval.
+
+## 4. Acceptance
+Customer has fifteen (15) business days after delivery of each milestone to
+accept or reject deliverables in writing. Deliverables are deemed accepted if
+Customer does not respond within that period.
+
+## 5. Personnel
+Provider shall assign a named technical lead for the duration of this SOW and
+shall not reassign that person without Customer's consent, not to be
+unreasonably withheld.
+
+## 6. Governing Law
+This SOW is governed by the Master Services Agreement, including its governing
+law provision.
+"""
+
+CONTRACTS = {
+    "acme-msa": ACME_MSA,
+    "mutual-nda": MUTUAL_NDA,
+    "acme-sow": ACME_SOW,
+}
+
+# The executed signature page — rendered to an image so the OCR and vision
+# models have a real "scan" to read (these executed details are intentionally
+# NOT in the template body above; the agent must read them off the scan).
+SIGNATURE_PAGE = """\
+EXECUTION PAGE
+
+MASTER SERVICES AGREEMENT
+
+Effective Date: March 1, 2026
+
+IN WITNESS WHEREOF, the parties have executed this Agreement as of the
+Effective Date.
+
+CUSTOMER:                          PROVIDER:
+Northwind Analytics, Inc.          Acme Cloud Services, LLC
+
+By: /s/ Dana Whitfield             By: /s/ Marcus Reyes
+Name: Dana Whitfield               Name: Marcus Reyes
+Title: Chief Operating Officer     Title: VP, Customer Success
+Date: March 1, 2026                Date: March 1, 2026
+
+Governing Law: State of Delaware
+Notices to Customer: legal@northwind-analytics.example
+"""
+
+# (counterparty, contract_type, obligation, owner, due_date, amount_usd, status)
+OBLIGATIONS = [
+    ("Acme Cloud Services, LLC", "MSA", "Send non-renewal notice (90 days before term end) to avoid auto-renewal", "us", "2026-11-30", None, "open"),
+    ("Acme Cloud Services, LLC", "MSA", "Annual subscription fee (renewal term)", "us", "2026-03-01", 180000.0, "done"),
+    ("Acme Cloud Services, LLC", "SOW", "Milestone 1 (design) payment", "us", "2026-04-15", 60000.0, "done"),
+    ("Acme Cloud Services, LLC", "SOW", "Milestone 2 (implementation) payment", "us", "2026-07-14", 60000.0, "open"),
+    ("Acme Cloud Services, LLC", "SOW", "Milestone 3 (acceptance) payment", "us", "2026-08-13", 60000.0, "open"),
+    ("Acme Cloud Services, LLC", "MSA", "Quarterly security attestation delivery", "counterparty", "2026-06-30", None, "open"),
+    ("Globex Corporation", "NDA", "Return or destroy confidential materials on request", "us", "2026-05-10", None, "overdue"),
+    ("Globex Corporation", "NDA", "Confidentiality survival period ends", "us", "2027-09-01", None, "open"),
+    ("Initech LLC", "MSA", "Send non-renewal notice (60 days before term end)", "us", "2026-07-02", None, "open"),
+    ("Initech LLC", "MSA", "Annual subscription fee", "us", "2026-09-01", 95000.0, "open"),
+    ("Umbrella Health, Inc.", "MSA", "Data processing audit response", "counterparty", "2026-06-18", None, "overdue"),
+    ("Umbrella Health, Inc.", "MSA", "Annual subscription fee", "us", "2026-12-01", 220000.0, "open"),
+]
+
+
+def _load_font(size: int) -> ImageFont.ImageFont:
+    """A scalable font with no external file dependency (Pillow >= 10.1)."""
+    try:
+        return ImageFont.load_default(size=size)
+    except TypeError:  # very old Pillow without sized default
+        return ImageFont.load_default()
+
+
+def render_signature_page(out_path) -> None:
+    width, height, margin = 1000, 1300, 70
+    img = Image.new("RGB", (width, height), "white")
+    draw = ImageDraw.Draw(img)
+    font = _load_font(22)
+    y = margin
+    for raw_line in SIGNATURE_PAGE.splitlines():
+        # Preserve the two-column layout by not wrapping; just render each line.
+        draw.text((margin, y), raw_line, fill="black", font=font)
+        y += 34
+    img.save(out_path)
+
+
+def build_obligations_db(out_path) -> None:
+    out_path.unlink(missing_ok=True)
+    conn = sqlite3.connect(out_path)
+    try:
+        conn.execute(SCHEMA_DDL)
+        conn.executemany(
+            "INSERT INTO obligations "
+            "(counterparty, contract_type, obligation, owner, due_date, amount_usd, status) "
+            "VALUES (?, ?, ?, ?, ?, ?, ?)",
+            OBLIGATIONS,
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
+def main() -> None:
+    GENERATED_DIR.mkdir(parents=True, exist_ok=True)
+
+    for name, body in CONTRACTS.items():
+        (GENERATED_DIR / f"{name}.md").write_text(body)
+
+    sig_path = GENERATED_DIR / "acme-msa-signature.png"
+    render_signature_page(sig_path)
+
+    db_path = GENERATED_DIR / "obligations.db"
+    build_obligations_db(db_path)
+
+    print(
+        textwrap.dedent(
+            f"""\
+            Sample corpus written to {GENERATED_DIR}:
+              contracts : {", ".join(f"{n}.md" for n in CONTRACTS)}
+              scan      : {sig_path.name}
+              database  : {db_path.name} ({len(OBLIGATIONS)} obligations)
+            """
+        )
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/examples/contract-review-agent/contract_review_agent/data/render.py b/examples/contract-review-agent/contract_review_agent/data/render.py
new file mode 100644
index 00000000..b98f2b9e
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/data/render.py
@@ -0,0 +1,53 @@
+"""Render contract text to a page image.
+
+Used to produce the "scanned page" the OCR and vision models read. (For a true
+scan, point ``--scan`` at a real PDF/PNG instead — see the README.)
+"""
+
+from __future__ import annotations
+
+import textwrap
+from pathlib import Path
+
+from PIL import Image, ImageDraw, ImageFont
+
+
+def load_font(size: int) -> ImageFont.ImageFont:
+    """A scalable font with no external file dependency (Pillow >= 10.1)."""
+    try:
+        return ImageFont.load_default(size=size)
+    except TypeError:  # very old Pillow without sized default
+        return ImageFont.load_default()
+
+
+def render_text_page(
+    text: str,
+    out_path: Path,
+    *,
+    title: str | None = None,
+    width: int = 1000,
+    line_height: int = 30,
+    font_size: int = 20,
+    margin: int = 60,
+    max_lines: int = 46,
+    wrap: int = 92,
+) -> None:
+    lines: list[str] = []
+    for raw in text.splitlines():
+        lines.extend(textwrap.wrap(raw, width=wrap) or [""])
+    lines = lines[:max_lines]
+
+    n = len(lines) + (2 if title else 0)
+    height = max(margin * 2 + line_height * n, 400)
+    img = Image.new("RGB", (width, height), "white")
+    draw = ImageDraw.Draw(img)
+    body_font = load_font(font_size)
+
+    y = margin
+    if title:
+        draw.text((margin, y), title[:80], fill="black", font=load_font(font_size + 4))
+        y += line_height * 2
+    for line in lines:
+        draw.text((margin, y), line, fill="black", font=body_font)
+        y += line_height
+    img.save(out_path)
diff --git a/examples/contract-review-agent/contract_review_agent/guardrails.py b/examples/contract-review-agent/contract_review_agent/guardrails.py
new file mode 100644
index 00000000..94bebbd3
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/guardrails.py
@@ -0,0 +1,60 @@
+"""Input guardrail backed by ibm-granite/granite-guardian.
+
+Before the orchestrator sees a request, granite-guardian screens it for unsafe
+content / prompt-injection. The model emits a "yes" (unsafe) / "no" (safe)
+verdict; we trip the guardrail on "yes". This is the Agents SDK's guardrail
+hook wired to a dedicated safety model in the SIE catalog — the same "guard
+content" job the landing page describes.
+"""
+
+from __future__ import annotations
+
+import time
+from typing import Any
+
+from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, input_guardrail
+
+from .runtime import AppContext, chat_once
+
+
+def _input_text(data: Any) -> str:
+    """Flatten the guardrail input (a string, or a list of input items) to text."""
+    if isinstance(data, str):
+        return data
+    parts: list[str] = []
+    for item in data or []:
+        content = item.get("content") if isinstance(item, dict) else None
+        if isinstance(content, str):
+            parts.append(content)
+        elif isinstance(content, list):
+            parts.extend(c["text"] for c in content if isinstance(c, dict) and isinstance(c.get("text"), str))
+    return "\n".join(parts)
+
+
+@input_guardrail
+async def safety_guardrail(
+    ctx: RunContextWrapper[AppContext], agent: Agent, data: Any
+) -> GuardrailFunctionOutput:
+    app = ctx.context
+    model = app.cfg["models"]["guard"]
+    t0 = time.monotonic()
+    try:
+        res = await chat_once(app, model, [{"role": "user", "content": _input_text(data)[:6000]}], max_tokens=8, timeout_s=25)
+    except Exception as exc:
+        # Guard model unavailable: fail OPEN (allow the run) but make it visible.
+        # A stricter deployment might fail closed — that's a policy choice.
+        app.ledger.record("Safety guardrail (granite-guardian)", model, "chat",
+                          warmup_s=time.monotonic() - t0, got=f"unavailable: {type(exc).__name__}")
+        return GuardrailFunctionOutput(output_info={"error": str(exc), "model": model}, tripwire_triggered=False)
+    verdict = res.text.strip()
+    app.ledger.record(
+        "Safety guardrail (granite-guardian)", model, "chat",
+        warmup_s=res.provision_s, latency_s=res.gen_s,
+        sent=f"{res.prompt_tokens:,} tok" if res.prompt_tokens else "—",
+        got=verdict or "—",
+    )
+    unsafe = verdict.lower().startswith("yes")
+    return GuardrailFunctionOutput(
+        output_info={"verdict": verdict, "model": model},
+        tripwire_triggered=unsafe,
+    )
diff --git a/examples/contract-review-agent/contract_review_agent/runtime.py b/examples/contract-review-agent/contract_review_agent/runtime.py
new file mode 100644
index 00000000..68fbdd7c
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/runtime.py
@@ -0,0 +1,235 @@
+"""Wire the OpenAI Agents SDK to a SIE cluster.
+
+The one idea that makes this whole example work: the Agents SDK speaks the
+OpenAI wire protocol, and SIE serves an OpenAI-compatible ``/v1`` endpoint. So
+we hand every agent an ``AsyncOpenAI`` client whose ``base_url`` points at SIE,
+force the *chat completions* API (SIE doesn't implement the newer Responses
+API), and disable tracing (so nothing is shipped to api.openai.com). After that,
+each agent just names the SIE catalog model it should run on.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import time
+from dataclasses import dataclass, field
+from typing import Any, Awaitable, Callable, TypeVar
+
+from agents import (
+    OpenAIChatCompletionsModel,
+    set_default_openai_api,
+    set_default_openai_client,
+    set_tracing_disabled,
+)
+from openai import APIConnectionError, APIStatusError, AsyncOpenAI
+from sie_sdk import SIEAsyncClient
+
+T = TypeVar("T")
+
+
+def make_openai_client(
+    cluster_url: str, api_key: str, *, max_retries: int = 2, timeout_s: float | None = None
+) -> AsyncOpenAI:
+    """An OpenAI client pointed at SIE's OpenAI-compatible endpoint.
+
+    ``max_retries`` matters because Agents-SDK-driven calls can't be wrapped in our
+    own provisioning retry — a client with generous retries survives a model being
+    evicted and reloaded mid-run on a busy cluster.
+    """
+    base_url = cluster_url.rstrip("/") + "/v1"
+    kwargs: dict[str, Any] = {"base_url": base_url, "api_key": api_key or "not-needed", "max_retries": max_retries}
+    if timeout_s is not None:
+        kwargs["timeout"] = timeout_s
+    # SIE ignores the key locally; a managed cluster reads it as a Bearer token.
+    return AsyncOpenAI(**kwargs)
+
+
+def configure_runtime(client: AsyncOpenAI) -> None:
+    """Point the whole Agents SDK at SIE instead of api.openai.com."""
+    set_default_openai_client(client)  # every agent talks to SIE...
+    set_default_openai_api("chat_completions")  # ...over chat completions, not Responses...
+    set_tracing_disabled(True)  # ...and we never phone home with traces.
+
+
+def model_for(model_id: str, client: AsyncOpenAI) -> OpenAIChatCompletionsModel:
+    """Bind one SIE catalog model id to an Agents-SDK model an Agent can use."""
+    return OpenAIChatCompletionsModel(model=model_id, openai_client=client)
+
+
+@dataclass
+class GenResult:
+    """One generation's text plus the metrics we log for observability.
+
+    ``provision_s`` is time spent waiting for the model to be ready — the cold
+    start, measured as the failed 503/504 retries before the request that
+    actually ran. ``gen_s`` is the duration of that successful (warm) call, so
+    throughput is measured warm instead of being blended with the cold start.
+    """
+
+    text: str
+    provision_s: float
+    gen_s: float
+    prompt_tokens: int | None = None
+    completion_tokens: int | None = None
+
+    @property
+    def latency_s(self) -> float:
+        return self.provision_s + self.gen_s
+
+    @property
+    def tokens_per_s(self) -> float | None:
+        if self.completion_tokens and self.gen_s > 0:
+            return self.completion_tokens / self.gen_s
+        return None
+
+
+@dataclass
+class LedgerEntry:
+    step: str
+    model: str
+    sie_fn: str
+    warmup_s: float = 0.0  # cold-start / provisioning wait (0 if warm or N/A)
+    latency_s: float = 0.0  # the call itself (warm), excluding warm-up
+    sent: str = ""
+    got: str = ""
+    throughput: str = ""
+
+
+@dataclass
+class Ledger:
+    """Per-call observability for one agent run.
+
+    Every tool, guardrail, and sub-agent records the model it used plus latency,
+    how much data it sent, and throughput — so a normal run prints not just
+    *which* models the cluster fanned the request across, but how each performed.
+    """
+
+    entries: list[LedgerEntry] = field(default_factory=list)
+
+    def record(
+        self,
+        step: str,
+        model: str,
+        sie_fn: str,
+        *,
+        warmup_s: float = 0.0,
+        latency_s: float = 0.0,
+        sent: str = "",
+        got: str = "",
+        throughput: str = "",
+    ) -> None:
+        self.entries.append(LedgerEntry(step, model, sie_fn, warmup_s, latency_s, sent, got, throughput))
+
+
+@dataclass
+class AppContext:
+    """Shared dependencies handed to every tool via ``RunContextWrapper``."""
+
+    sie: SIEAsyncClient
+    oai: AsyncOpenAI
+    cfg: dict[str, Any]
+    ledger: Ledger
+    contract_text: str  # the contract body we have on file (template/clauses)
+    scan_path: str  # the executed signature page, delivered as a scan image
+    db_path: str  # the SQLite obligations database the SQL tool queries
+    reasoning_agent: Any = None  # the risk-analyst sub-agent (set during build)
+    # Cache the clause embeddings in a shared mutable dict, not a reassigned
+    # attribute: the Agents SDK hands each tool call a shallow copy of the context,
+    # so mutating a shared object persists but reassigning `app.x = ...` does not.
+    clause_cache: dict[str, Any] = field(default_factory=dict)
+
+    @property
+    def provision_timeout_s(self) -> float:
+        return float(self.cfg["cluster"].get("provision_timeout_s", 900))
+
+
+async def with_provisioning_retry(
+    make_call: Callable[[], Awaitable[T]], deadline: float
+) -> tuple[T, float, float]:
+    """Retry an OpenAI-client call while SIE scales a model from zero.
+
+    A cold cluster answers 503/504/202 until the model is resident; we retry until
+    ``deadline`` (monotonic seconds) before giving up. Returns
+    ``(result, provision_s, call_s)``: ``provision_s`` is the time spent in failed
+    retries before the attempt that succeeded (the cold start), ``call_s`` is the
+    duration of that successful call — so callers can report warm-up and warm
+    throughput separately.
+    """
+    start = time.monotonic()
+    while True:
+        attempt_start = time.monotonic()
+        try:
+            result = await make_call()
+            return result, attempt_start - start, time.monotonic() - attempt_start
+        except APIConnectionError:
+            if time.monotonic() < deadline:
+                await asyncio.sleep(5)
+                continue
+            raise
+        except APIStatusError as exc:
+            # 202 accepted (warming), 502/503 unavailable, 504 first_chunk_timeout —
+            # all transient while SIE scales a model from zero.
+            if exc.status_code in (202, 502, 503, 504) and time.monotonic() < deadline:
+                await asyncio.sleep(5)
+                continue
+            raise
+
+
+async def chat_once(
+    app: AppContext,
+    model: str,
+    messages: list[dict[str, Any]],
+    *,
+    max_tokens: int = 512,
+    temperature: float = 0.0,
+    timeout_s: float | None = None,
+    **extra: Any,
+) -> GenResult:
+    """One OpenAI-compatible chat completion against SIE, with cold-start retry.
+
+    ``timeout_s`` overrides how long to keep retrying a provisioning model (the
+    guardrail uses a short budget so it fails open fast rather than blocking).
+    """
+    deadline = time.monotonic() + (timeout_s if timeout_s is not None else app.provision_timeout_s)
+    resp, provision_s, gen_s = await with_provisioning_retry(
+        lambda: app.oai.chat.completions.create(
+            model=model, messages=messages, max_tokens=max_tokens, temperature=temperature, **extra
+        ),
+        deadline,
+    )
+    usage = resp.usage
+    return GenResult(
+        text=resp.choices[0].message.content or "",
+        provision_s=provision_s,
+        gen_s=gen_s,
+        prompt_tokens=getattr(usage, "prompt_tokens", None) if usage else None,
+        completion_tokens=getattr(usage, "completion_tokens", None) if usage else None,
+    )
+
+
+async def complete_once(
+    app: AppContext,
+    model: str,
+    prompt: str,
+    *,
+    max_tokens: int = 256,
+    temperature: float = 0.0,
+    stop: list[str] | None = None,
+) -> GenResult:
+    """One OpenAI-compatible *text completion* against SIE (for completion-only
+    models like sqlcoder that expect a raw prompt, not a chat transcript)."""
+    deadline = time.monotonic() + app.provision_timeout_s
+    resp, provision_s, gen_s = await with_provisioning_retry(
+        lambda: app.oai.completions.create(
+            model=model, prompt=prompt, max_tokens=max_tokens, temperature=temperature, stop=stop
+        ),
+        deadline,
+    )
+    usage = resp.usage
+    return GenResult(
+        text=resp.choices[0].text or "",
+        provision_s=provision_s,
+        gen_s=gen_s,
+        prompt_tokens=getattr(usage, "prompt_tokens", None) if usage else None,
+        completion_tokens=getattr(usage, "completion_tokens", None) if usage else None,
+    )
diff --git a/examples/contract-review-agent/contract_review_agent/tools.py b/examples/contract-review-agent/contract_review_agent/tools.py
new file mode 100644
index 00000000..014e6ff0
--- /dev/null
+++ b/examples/contract-review-agent/contract_review_agent/tools.py
@@ -0,0 +1,321 @@
+"""The agent's tools — each one a different model from the SIE catalog.
+
+Every tool takes the shared :class:`AppContext` (injected by the Agents SDK via
+``RunContextWrapper``, never shown to the model) and records to the ledger the
+model it used, the latency, how much data it sent, and the throughput — so a
+normal end-to-end run doubles as per-model observability.
+"""
+
+from __future__ import annotations
+
+import base64
+import re
+import sqlite3
+import time
+from pathlib import Path
+
+import numpy as np
+from agents import RunContextWrapper, Runner, function_tool
+from sie_sdk import Item
+
+from .data.make_sample import SCHEMA_DDL, TODAY
+from .runtime import AppContext, GenResult, chat_once, complete_once
+
+
+def _tok(n: int | None) -> str:
+    return f"{n:,} tok" if n else "—"
+
+
+def _tps(res: GenResult) -> str:
+    return f"{res.tokens_per_s:.0f} tok/s" if res.tokens_per_s else "—"
+
+
+def _split_clauses(text: str, target: int = 900) -> list[str]:
+    """Chunk a contract into ~``target``-char passages for retrieval.
+
+    Works on real contract text (plain, no markdown) as well as the synthetic
+    markdown body: split on blank-line paragraphs (falling back to lines), greedily
+    pack to ``target`` chars, then hard-split anything still oversized.
+    """
+    text = text.replace("\r\n", "\n")
+    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
+    if len(blocks) < 3:  # e.g. single-spaced contract text
+        blocks = [ln.strip() for ln in text.split("\n") if ln.strip()]
+
+    chunks: list[str] = []
+    current = ""
+    for block in blocks:
+        if current and len(current) + len(block) + 2 > target:
+            chunks.append(current)
+            current = block
+        else:
+            current = f"{current}\n\n{block}" if current else block
+    if current:
+        chunks.append(current)
+
+    out: list[str] = []
+    for chunk in chunks:
+        while len(chunk) > target * 1.6:
+            out.append(chunk[:target])
+            chunk = chunk[target:]
+        out.append(chunk)
+    return out
+
+
+async def _clause_index(app: AppContext, embed_model: str) -> tuple[list[str], np.ndarray]:
+    """Embed the contract's clauses once and cache (clauses, matrix)."""
+    if "matrix" in app.clause_cache:
+        return app.clause_cache["clauses"], app.clause_cache["matrix"]
+    clauses = _split_clauses(app.contract_text)
+    t0 = time.monotonic()
+    results = await app.sie.encode(
+        embed_model,
+        [Item(id=str(i), text=c) for i, c in enumerate(clauses)],
+        output_types=["dense"],
+        wait_for_capacity=True,
+        provision_timeout_s=app.provision_timeout_s,
+    )
+    dt = time.monotonic() - t0
+    matrix = np.vstack([np.asarray(r["dense"], dtype=np.float32) for r in results])
+    app.ledger.record(
+        "Embed clauses (index)", embed_model, "encode",
+        latency_s=dt, sent=f"{len(clauses)} clauses", got=f"{matrix.shape[1]}-dim",
+        throughput=f"{len(clauses) / dt:.1f} items/s" if dt > 0 else "—",
+    )
+    app.clause_cache["clauses"] = clauses
+    app.clause_cache["matrix"] = matrix
+    return clauses, matrix
+
+
+# ──────────────────────────────────────────────────────────────────────────
+# Generative-LLM tools
+# ──────────────────────────────────────────────────────────────────────────
+@function_tool
+async def classify_document(ctx: RunContextWrapper[AppContext]) -> str:
+    """Classify the contract under review as MSA, NDA, SOW, or Other, with a
+    one-line reason. A fast first-pass triage over the loaded contract."""
+    app = ctx.context
+    model = app.cfg["models"]["triage"]
+    res = await chat_once(
+        app, model,
+        [
+            {"role": "system", "content": "You label legal documents. Reply with the type only "
+             "(MSA, NDA, SOW, or Other) followed by a short reason."},
+            {"role": "user", "content": app.contract_text[:1500]},
+        ],
+        max_tokens=60,
+    )
+    app.ledger.record("Triage: classify document", model, "chat",
+                      warmup_s=res.provision_s, latency_s=res.gen_s,
+                      sent=_tok(res.prompt_tokens), got=_tok(res.completion_tokens), throughput=_tps(res))
+    return res.text.strip()
+
+
+@function_tool
+async def read_signature_page(ctx: RunContextWrapper[AppContext], question: str) -> str:
+    """Ask a vision model a question about the scanned signature-page image
+    (e.g. 'Are both parties' signatures present and dated?'). Use this for
+    visual checks that OCR text alone cannot answer."""
+    app = ctx.context
+    model = app.cfg["models"]["vision"]
+    data = base64.b64encode(Path(app.scan_path).read_bytes()).decode()
+    res = await chat_once(
+        app, model,
+        [
+            {"role": "system", "content": "You are a meticulous contracts paralegal. Answer only from what is visible in the image."},
+            {"role": "user", "content": [
+                {"type": "text", "text": question},
+                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{data}"}},
+            ]},
+        ],
+        max_tokens=220,
+    )
+    app.ledger.record("Read signature page (vision)", model, "chat + image",
+                      warmup_s=res.provision_s, latency_s=res.gen_s,
+                      sent=f"{Path(app.scan_path).stat().st_size // 1024} KB img + {_tok(res.prompt_tokens)}",
+                      got=_tok(res.completion_tokens), throughput=_tps(res))
+    return res.text.strip()
+
+
+@function_tool
+async def analyze_clause_risks(ctx: RunContextWrapper[AppContext], clauses: str) -> str:
+    """Delegate deep legal risk analysis of specific clauses to a specialist
+    reasoning agent (the largest model). Pass the clause text to analyze; get
+    back per-clause issues with severity and suggested redlines."""
+    app = ctx.context
+    model = app.cfg["models"]["reasoning"]
+    t0 = time.monotonic()
+    result = await Runner.run(
+        app.reasoning_agent,
+        "Analyze the following contract clauses for risk to the Customer. For each "
+        "risk, give: the clause, the issue, a severity (low/medium/high), and a "
+        f"one-line suggested redline.\n\n{clauses}",
+        context=app,
+    )
+    dt = time.monotonic() - t0
+    usage = getattr(getattr(result, "context_wrapper", None), "usage", None)
+    out_tok = getattr(usage, "output_tokens", None)
+    app.ledger.record("Clause-risk analysis (specialist sub-agent)", model, "chat",
+                      latency_s=dt, sent=_tok(getattr(usage, "input_tokens", None)), got=_tok(out_tok),
+                      throughput=f"{out_tok / dt:.0f} tok/s" if out_tok and dt > 0 else "—")
+    return str(result.final_output)
+
+
+# ──────────────────────────────────────────────────────────────────────────
+# Retrieval / extraction tools (encode · score · extract)
+# ──────────────────────────────────────────────────────────────────────────
+@function_tool
+async def ocr_signature_page(ctx: RunContextWrapper[AppContext]) -> str:
+    """OCR the executed signature page (a scanned image) into markdown text.
+    Use this to recover who signed, their titles, and the execution date —
+    details that exist only on the scan, not in the contract body."""
+    app = ctx.context
+    model = app.cfg["models"]["ocr"]
+    t0 = time.monotonic()
+    res = await app.sie.extract(
+        model, Item(images=[app.scan_path]), wait_for_capacity=True, provision_timeout_s=app.provision_timeout_s
+    )
+    dt = time.monotonic() - t0
+    entities = res.get("entities") or []
+    markdown = entities[0]["text"] if entities else "(no text recognized)"
+    app.ledger.record("OCR signature page → markdown", model, "extract",
+                      latency_s=dt, sent=f"{Path(app.scan_path).stat().st_size // 1024} KB img",
+                      got=f"{len(markdown):,} chars md", throughput=f"{len(markdown) / dt:.0f} chars/s" if dt > 0 else "—")
+    return markdown
+
+
+@function_tool
+async def extract_entities(ctx: RunContextWrapper[AppContext]) -> str:
+    """Extract structured entities (parties, dates, monetary amounts, governing
+    law, notice periods) from the loaded contract using zero-shot NER."""
+    app = ctx.context
+    model = app.cfg["models"]["entities"]
+    labels = ["party", "effective date", "renewal date", "termination notice period",
+              "monetary amount", "governing law", "term length"]
+    payload = app.contract_text[:6000]
+    t0 = time.monotonic()
+    res = await app.sie.extract(
+        model, Item(text=payload), labels=labels, wait_for_capacity=True, provision_timeout_s=app.provision_timeout_s
+    )
+    dt = time.monotonic() - t0
+    entities = res.get("entities") or []
+    app.ledger.record("Extract entities (zero-shot NER)", model, "extract",
+                      latency_s=dt, sent=f"{len(payload):,} chars", got=f"{len(entities)} entities",
+                      throughput=f"{len(entities) / dt:.1f} ent/s" if dt > 0 else "—")
+    lines = [f"- {e['label']}: {e['text']} (score {e.get('score', 0):.2f})" for e in entities]
+    return "\n".join(lines) if lines else "(no entities found)"
+
+
+@function_tool
+async def search_clauses(ctx: RunContextWrapper[AppContext], query: str) -> str:
+    """Find the clauses most relevant to a topic (e.g. 'automatic renewal',
+    'limitation of liability', 'indemnification'). Dense-embedding retrieval
+    followed by a cross-encoder rerank; returns the top clauses verbatim."""
+    app = ctx.context
+    embed_model = app.cfg["models"]["embed"]
+    rerank_model = app.cfg["models"]["rerank"]
+    k_cand = int(app.cfg["search"]["top_k_candidates"])
+    k_res = int(app.cfg["search"]["top_k_results"])
+
+    clauses, matrix = await _clause_index(app, embed_model)
+    q = await app.sie.encode(
+        embed_model, Item(text=query), output_types=["dense"], is_query=True,
+        wait_for_capacity=True, provision_timeout_s=app.provision_timeout_s,
+    )
+    qv = np.asarray(q["dense"], dtype=np.float32)
+    denom = np.linalg.norm(matrix, axis=1) * (np.linalg.norm(qv) + 1e-9) + 1e-9
+    sims = (matrix @ qv) / denom
+    candidate_idx = np.argsort(-sims)[: min(k_cand, len(clauses))]
+    candidates = [clauses[i] for i in candidate_idx]
+
+    t0 = time.monotonic()
+    scored = await app.sie.score(
+        rerank_model, Item(text=query), [Item(id=str(i), text=c) for i, c in enumerate(candidates)],
+        wait_for_capacity=True, provision_timeout_s=app.provision_timeout_s,
+    )
+    dt = time.monotonic() - t0
+    app.ledger.record("Rerank candidate clauses", rerank_model, "score",
+                      latency_s=dt, sent=f"{len(candidates)} docs", got=f"top {k_res}",
+                      throughput=f"{len(candidates) / dt:.1f} docs/s" if dt > 0 else "—")
+    ranked = sorted(scored["scores"], key=lambda s: s["rank"])[:k_res]
+    top = [candidates[int(s["item_id"])] for s in ranked]
+    return "\n\n---\n\n".join(top) if top else "(no relevant clauses found)"
+
+
+# ──────────────────────────────────────────────────────────────────────────
+# Text-to-SQL tool (completion-only specialist model)
+# ──────────────────────────────────────────────────────────────────────────
+_SQLCODER_PROMPT = """### Task
+Generate a SQLite SQL query to answer [QUESTION]{question}[/QUESTION]
+
+### Instructions
+- Use SQLite syntax only. Today's date is {today}.
+- Dates are ISO-8601 text; use SQLite date functions (e.g. date('now'), date(due_date)).
+
+### Database Schema
+The query will run on a database with this schema:
+{schema}
+
+### Answer
+Given the database schema, here is the SQLite query that answers [QUESTION]{question}[/QUESTION]:
+"""
+
+
+def _clean_sql(raw: str) -> str:
+    sql = raw.strip()
+    for fence in ("```sql", "```sqlite", "```"):
+        if sql.startswith(fence):
+            sql = sql[len(fence) :].strip()
+    sql = sql.split("```")[0]
+    if "[SQL]" in sql:
+        sql = sql.split("[SQL]", 1)[1]
+    return sql.strip().rstrip(";").strip()
+
+
+def _run_select(db_path: str, sql: str) -> tuple[list[str], list[tuple]]:
+    conn = sqlite3.connect(db_path)
+    try:
+        cur = conn.execute(sql)
+        cols = [d[0] for d in cur.description] if cur.description else []
+        return cols, cur.fetchmany(50)
+    finally:
+        conn.close()
+
+
+@function_tool
+async def query_obligations_db(ctx: RunContextWrapper[AppContext], question: str) -> str:
+    """Answer a question about tracked contract obligations and deadlines by
+    generating SQL (a text-to-SQL specialist model) and running it against the
+    obligations database. Good for 'which obligations are due soon?' or 'total
+    outstanding payments by counterparty'."""
+    app = ctx.context
+    model = app.cfg["models"]["sql"]
+    prompt = _SQLCODER_PROMPT.format(question=question, schema=SCHEMA_DDL, today=TODAY)
+    res = await complete_once(app, model, prompt, max_tokens=256, stop=[";", "```", "\n\n\n"])
+    app.ledger.record("Text-to-SQL", model, "completions",
+                      warmup_s=res.provision_s, latency_s=res.gen_s,
+                      sent=_tok(res.prompt_tokens), got=_tok(res.completion_tokens), throughput=_tps(res))
+
+    sql = _clean_sql(res.text)
+    if not sql.lower().startswith("select"):
+        return f"Generated query was not a SELECT, so it was not run:\n{sql}"
+    try:
+        cols, rows = _run_select(app.db_path, sql)
+    except sqlite3.Error as exc:
+        return f"SQL error: {exc}\nQuery was:\n{sql}"
+    if not rows:
+        return f"Query ran but returned no rows.\nSQL: {sql}"
+    header = " | ".join(cols)
+    body = "\n".join(" | ".join("" if v is None else str(v) for v in row) for row in rows)
+    return f"SQL: {sql}\n\n{header}\n{body}"
+
+
+ALL_TOOLS = [
+    classify_document,
+    ocr_signature_page,
+    extract_entities,
+    search_clauses,
+    read_signature_page,
+    query_obligations_db,
+    analyze_clause_risks,
+]
diff --git a/examples/contract-review-agent/pyproject.toml b/examples/contract-review-agent/pyproject.toml
new file mode 100644
index 00000000..9761bc1f
--- /dev/null
+++ b/examples/contract-review-agent/pyproject.toml
@@ -0,0 +1,38 @@
+[project]
+name = "contract-review-agent"
+version = "0.1.0"
+description = "Contract review with the OpenAI Agents SDK, every model served from one SIE cluster"
+readme = "README.md"
+requires-python = ">=3.12,<3.13"
+dependencies = [
+    # openai-agents >=0.14 pins websockets>=15 (for its realtime/voice features,
+    # unused here) which conflicts with sie-sdk's websockets<15; 0.13.x is the
+    # newest line compatible with the SIE SDK and has the same agent API.
+    "openai-agents>=0.13,<0.14",
+    "openai>=1.40",
+    "sie-sdk>=0.6.8",
+    "httpx>=0.27",
+    "numpy>=1.26",
+    "pydantic>=2.7",
+    "python-dotenv>=1.0",
+    "pyyaml>=6.0",
+    "pillow>=10.3",
+    "rich>=13.7",
+]
+
+[project.scripts]
+fetch-contracts = "contract_review_agent.data.fetch_contracts:main"
+make-sample = "contract_review_agent.data.make_sample:main"
+review = "contract_review_agent.cli:main"
+
+[dependency-groups]
+dev = [
+    "ruff>=0.11.0",
+]
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.ruff]
+target-version = "py312"