ayhammouda · May 30, 2026 · May 29, 2026
diff --git a/.coderabbit.yaml b/.coderabbit.yaml
@@ -0,0 +1,24 @@
+# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
+language: "en-US"
+
+reviews:
+  profile: "chill"
+  request_changes_workflow: false
+  high_level_summary: true
+  review_status: true
+  path_filters:
+    - "src/**"
+    - "tests/**"
+  path_instructions:
+    - path: "src/**"
+      instructions: |
+        Focus review on correctness, MCP tool behavior, runtime compatibility,
+        cache/index compatibility, packaging impact, and security boundaries.
+        Avoid comments about planning docs, release docs, or repository process
+        unless the changed source code makes those docs materially inaccurate.
+    - path: "tests/**"
+      instructions: |
+        Focus review on meaningful assertions, regression coverage, fixture
+        correctness, deterministic behavior, and avoiding network- or
+        environment-dependent tests unless the test is explicitly marked as an
+        integration smoke test.
@@ -0,0 +1,49 @@
+# CODEOWNERS — forces maintainer review on forbidden-territory paths.
+#
+# Source of truth: AGENT-EXECUTION-PIPELINE.md §2 (Forbidden Territory),
+# required by §10 (Pre-flight Checklist).
+#
+# For these rules to be ENFORCED, branch protection on `main` must enable
+# "Require review from Code Owners". CODEOWNERS alone only requests review;
+# branch protection is what blocks merge.
+#
+# Autonomous agents may NOT modify these paths without explicit human approval
+# (pipeline §2). Any agent PR touching them must add the `🛑 needs-human-review`
+# label and stop short of requesting merge (pipeline §7).
+
+# --- Project identity, dependencies, classifiers (only `version` is agent-editable) ---
+/pyproject.toml                                   @ayhammouda
+
+# --- Permanent commitments and trust posture ---
+/LICENSE                                          @ayhammouda
+/SECURITY.md                                      @ayhammouda
+
+# --- Load-bearing brand assets ---
+/README.md                                        @ayhammouda
+/.planning/POSITIONING.md                         @ayhammouda
+
+# --- Release history (adding entries is fine; rewriting history is not) ---
+/CHANGELOG.md                                     @ayhammouda
+
+# --- CI/CD and supply chain (release path especially) ---
+# The single /.github/ rule covers workflows and release.yml. The last matching
+# pattern wins in CODEOWNERS, so narrower entries below with the same owner
+# would be no-ops today and would silently *override* this rule if its owners
+# ever changed. We therefore keep ownership of /.github/ uniform.
+/.github/                                         @ayhammouda
+
+# --- Index schema and migrations (rebuilds existing user indexes) ---
+# NOTE: the retrieved-docs *cache* table lives in
+# src/mcp_server_python_docs/services/persistent_cache.py and is NOT covered
+# here — it is best-effort, fingerprint-scoped, and agent-editable per
+# decision 5.7. Only the canonical *index* schema is forbidden territory.
+**/storage/schema.sql                             @ayhammouda
+**/migrations/                                    @ayhammouda
+
+# --- Archival roadmap history ---
+/.planning/ROADMAP.md                             @ayhammouda
+
+# --- Governing policy + strategy documents ---
+/AGENT-EXECUTION-PIPELINE.md                      @ayhammouda
+/OPENCLAW-FORGE-PROTOCOL.md                       @ayhammouda
+/STRATEGIC-ROADMAP-2026-05-29.md                  @ayhammouda
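
The last-match rule described in the `/.github/` comment above can be illustrated with a minimal sketch. This is not GitHub's implementation: `matches` is a simplified matcher that handles only the pattern shapes used in this file (root-anchored file, root-anchored directory, and a `**/` prefix), and the `@release-team` owner and the narrower `/.github/workflows/` rule are hypothetical.

```python
def matches(pattern: str, path: str) -> bool:
    """Simplified matcher for the pattern shapes used in this CODEOWNERS file.

    Assumption: this is NOT GitHub's full gitignore-style matching.
    """
    if pattern.startswith("**/"):
        tail = pattern[3:]
        if tail.endswith("/"):
            # Directory at any depth: "<tail>" must start at a path segment.
            return ("/" + path).find("/" + tail) != -1
        # File at any depth.
        return path == tail or path.endswith("/" + tail)
    if pattern.startswith("/"):
        anchored = pattern[1:]
        if anchored.endswith("/"):
            # Root-anchored directory: everything underneath it matches.
            return path.startswith(anchored)
        # Root-anchored file: exact match only.
        return path == anchored
    return False


def owners_for(rules, path):
    """Return the owners of the LAST rule that matches, or [] if none match."""
    owners = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners  # a later match replaces an earlier one
    return owners


uniform = [("/.github/", ["@ayhammouda"])]
# Hypothetical narrower entry with a different owner, placed below the broad rule.
split = uniform + [("/.github/workflows/", ["@release-team"])]
```

Under `split`, `owners_for(split, ".github/workflows/release.yml")` resolves to the narrower rule's owners only, so the broad `/.github/` owner would no longer be requested for that path.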
@@ -0,0 +1,108 @@
+name: Autonomous Agent Task
+description: A task spec scoped for unattended execution by an autonomous coding agent (Claude Code or similar).
+title: "[vX.Y.Z] <scope> — <verb> <thing>"
+body:
+  - type: markdown
+    attributes:
+      value: |
+        This template enforces the issue structure required by
+        `AGENT-EXECUTION-PIPELINE.md` §3 (in the repo root). An issue missing
+        any required section is **not** agent-ready and will not pass the §10
+        pre-flight checklist. Do not apply the `agent-ready` label from this
+        template; a maintainer applies it only after reading the completed
+        issue end-to-end. Read the pipeline doc and
+        `STRATEGIC-ROADMAP-2026-05-29.md` before filling this out.
+  - type: textarea
+    id: context
+    attributes:
+      label: Context (self-containment)
+      description: Link to the per-issue context file, the pipeline doc, the roadmap, any relevant ADR or `.planning/phases/0X-*` directory, and prior related issues.
+      value: |
+        - Per-issue context file: `.planning/agent-context/<issue-slug>.md` (read this first)
+        - Pipeline: `AGENT-EXECUTION-PIPELINE.md`
+        - Roadmap: `STRATEGIC-ROADMAP-2026-05-29.md` §<section>
+        - Related issues:
+    validations:
+      required: true
+  - type: textarea
+    id: goal
+    attributes:
+      label: Goal (one sentence)
+      description: The single outcome that counts as success.
+    validations:
+      required: true
+  - type: textarea
+    id: acceptance
+    attributes:
+      label: Acceptance criteria (testable checkbox list)
+      description: Each criterion must be testable, atomic, achievable without touching forbidden territory, and verifiable in <5 minutes (pipeline §4). Prefer exact commands and expected output.
+      value: |
+        - [ ] `<exact command>` <expected result>
+        - [ ] `<exact command>` <expected result>
+    validations:
+      required: true
+  - type: textarea
+    id: scope
+    attributes:
+      label: Scope boundaries
+      description: Explicit In scope / Out of scope. Out-of-scope work is a stop-and-comment trigger, never silent expansion.
+      value: |
+        **In scope:**
+        -
+
+        **Out of scope:**
+        -
+    validations:
+      required: true
+  - type: textarea
+    id: forbidden
+    attributes:
+      label: Forbidden-territory reminders
+      description: Repeat the AGENT-EXECUTION-PIPELINE.md §2 items relevant to THIS issue. If the task appears to require touching any of them, stop and comment.
+    validations:
+      required: true
+  - type: textarea
+    id: validation
+    attributes:
+      label: Validation commands (pipeline §5 gate)
+      description: The exact canonical gate, in order, plus any change-type-specific gates. Must pass before any PR is opened.
+      value: |
+        ```bash
+        uv run ruff check src/ tests/
+        uv run pyright src/
+        uv run pytest --tb=short -q
+        uv run python-docs-mcp-server doctor
+        ```
+    validations:
+      required: true
+  - type: textarea
+    id: pr-and-recovery
+    attributes:
+      label: PR requirements & recovery
+      description: What the PR description must include (pipeline §6) and where to go if blocked (pipeline §8).
+      value: |
+        - PR title matches this issue title verbatim; body uses
+          `.github/PULL_REQUEST_TEMPLATE/agent.md`.
+        - Branch: `agent/<issue-number>-<kebab-summary>`.
+        - If blocked: stop, write `WORKING-NOTES.md` on the branch, comment on
+          this issue per pipeline §8. **No PR, no auto-merge, ever.**
+    validations:
+      required: true
+  - type: input
+    id: effort
+    attributes:
+      label: Effort estimate (hours)
+      description: Rough hours. Agent must bail and escalate if work exceeds 2× this estimate (pipeline §8).
+    validations:
+      required: true
+  - type: checkboxes
+    id: acknowledgements
+    attributes:
+      label: Agent acknowledgements
+      options:
+        - label: I will work on a branch, never on `main`, and will not auto-merge.
+          required: true
+        - label: I will stop and comment rather than silently expand scope or touch forbidden territory.
+          required: true
+        - label: I will add `🛑 needs-human-review` if any pipeline §7 trigger fires.
+          required: true
@@ -0,0 +1,42 @@
+<!--
+Autonomous-agent PR template. Enforces AGENT-EXECUTION-PIPELINE.md §6.
+PR title MUST match the issue title verbatim. Never request auto-merge.
+-->
+
+Closes #<issue-number>
+
+## Acceptance criteria
+<!-- Copy every criterion from the issue. Check the box only when satisfied,
+     and add one line of evidence (command + observed result) per item. -->
+- [ ] <criterion 1> — <evidence>
+- [ ] <criterion 2> — <evidence>
+
+## Validation gate output
+<!-- Paste the tail of each gate command. All must be green before opening this PR. -->
+```text
+uv run ruff check src/ tests/
+uv run pyright src/
+uv run pytest --tb=short -q
+uv run python-docs-mcp-server doctor
+```
+<!-- Plus any change-type-specific gates from pipeline §5 (stdio smoke,
+     validate-corpus, uv lock --check) that applied to this change. -->
+
+## CodeRabbit review
+<!-- After CodeRabbit comments, summarize findings as:
+     - Blocking: <items or None>
+     - Follow-up: <items or None>
+     - False positive: <items or None>
+     If CodeRabbit has not run yet, write "Pending." Do not treat the absence
+     of comments as a passing review. -->
+Pending.
+
+## Why this approach
+<!-- One paragraph max. If the issue fully prescribed the approach, say so.
+     If you cite a design choice NOT in the issue, that is a §7 trigger. -->
+
+## Why this triggered human review
+<!-- List any pipeline §7 triggers and explain each. If none, write "None."
+     If any fired: this PR is opened for review only — do NOT request merge,
+     and ensure the `🛑 needs-human-review` label is applied. -->
+None.
diff --git a/.planning/agent-context/adr-001-source-adapters.md b/.planning/agent-context/adr-001-source-adapters.md
@@ -0,0 +1,49 @@
+# Agent Context — ADR-001 (Source Adapters)
+
+> One-read working context for issue `[v0.3.0] docs — write ADR-001 (Source Adapters)`.
+> A **writing** task. Every claim must match the code — verify before you assert.
+
+## 1. Roadmap excerpts (the principles you are recording)
+
+- **Principle 2.1:** Canonical source only. CPython at a pinned tag for stdlib
+  docs; PyPI metadata API for package URLs. No scraped mirrors. No third-party indexers.
+- **Principle 2.2:** Offline-first *runtime*. No network access at query time.
+- **Principle 2.7:** Layered design with stable contracts; the **source
+  connector** is layer 1 of 8 and is what makes the pattern cloneable.
+
+## 2. The two source adapters that exist today (describe these)
+
+1. **CPython documentation source** (`src/mcp_server_python_docs/ingestion/`):
+   - `cpython_versions.py` — pinned build targets (`CPYTHON_DOCS_BUILD_CONFIG`:
+     per-version `tag` + `sphinx_pin`). Five versions: 3.10–3.14.
+   - `__main__.py` `build-index` path — `git clone --depth 1 --branch <tag>` of
+     `python/cpython`, builds docs with `sphinx-build -b json` in a dedicated venv.
+   - `sphinx_json.py` — parses the Sphinx JSON output into the index; also loads
+     `synonyms.yaml`. `inventory.py` — parses `objects.inv` for exact symbol resolution.
+2. **PyPI metadata source** (`src/mcp_server_python_docs/services/package_docs.py`):
+   - Backs `lookup_package_docs`. A **controlled** PyPI metadata lookup
+     (`GET /pypi/<project>/json`) that returns only project/docs/homepage/source
+     URLs — not a generic web fetch, not scraped docs.
+
+## 3. The one documented exception to "offline-first"
+
+- `lookup_package_docs` performs a network call to PyPI's metadata API. This is
+  **not** a docs-*query*-time call against the canonical stdlib index — it is a
+  controlled, narrowly-scoped metadata lookup. The ADR must state this exception
+  explicitly so the offline-first invariant (2.2) stays honest. (See README's
+  "Why not Context7" section and `SECURITY.md` scope for the existing framing.)
+
+## 4. Known pitfalls
+
+- **Verify, don't assume.** Open each cited file and confirm the behavior before
+  writing it into the ADR. An ADR that misstates current behavior is worse than none.
+- Don't document adapters that don't exist (Rust/Go) beyond a single "future
+  adopters clone this contract" sentence — that's the cloneability point, not a claim.
+- No code, schema, or workflow changes — writing only.
+- Keep it factual; "reference architecture" is not claimed externally (5.6).
+
+## 5. Decision log
+
+- File path:
+- Claims you verified against code (file:line):
+- Anything ambiguous about the layer contract that you flagged for the maintainer:
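
The "controlled" PyPI lookup in §2 of the context file above returns only project/docs/homepage/source URLs. A minimal sketch of that filtering step follows. It is an illustration, not the code in `package_docs.py`: the function name `extract_doc_urls` and the list of `project_urls` labels it checks are assumptions (labels are chosen by each project), and the network fetch is deliberately left out so that only the narrowing of the payload is shown. The `info` keys used (`package_url`, `project_url`, `docs_url`, `home_page`, `project_urls`) are fields of PyPI's JSON metadata response.

```python
from typing import Any, Optional


def extract_doc_urls(payload: dict[str, Any]) -> dict[str, Optional[str]]:
    """Reduce a PyPI JSON metadata payload to four URL fields and nothing else."""
    info = payload.get("info") or {}
    project_urls = info.get("project_urls") or {}  # may be null in the payload
    lowered = {str(key).lower(): value for key, value in project_urls.items()}

    def first(*labels: str) -> Optional[str]:
        # Return the first non-empty URL among the candidate labels.
        for label in labels:
            if lowered.get(label):
                return lowered[label]
        return None

    return {
        "project": info.get("package_url") or info.get("project_url"),
        "docs": info.get("docs_url") or first("documentation", "docs"),
        "homepage": info.get("home_page") or first("homepage", "home"),
        # Assumed label spellings; real projects vary.
        "source": first("source", "source code", "repository"),
    }
```

Everything else in the payload (release files, descriptions, other links) is dropped, which is the property that keeps the lookup narrowly scoped.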
diff --git a/.planning/agent-context/adr-006-serialization.md b/.planning/agent-context/adr-006-serialization.md
@@ -0,0 +1,52 @@
+# Agent Context — ADR-006 (Serialization)
+
+> One-read working context for issue `[v0.3.0] docs — write ADR-006 (Serialization)`.
+> This is a **writing** task. You are recording locked decisions, not making new ones.
+
+## 1. Roadmap excerpts (the decisions you are recording — verbatim)
+
+- **Principle 2.5:** Wire format is explicit and pluggable on structured tools
+  only. Compact JSON default; TOON opt-in *if and only if* the empirical study
+  supports it. `get_docs` stays markdown. *Token economy is empirical, not architectural.*
+- **Principle 2.7:** Layered design with stable contracts — eight layers, the
+  **serializer** being one of them.
+- **Decision 5.3:** Storage stays SQLite + markdown. **TOON-as-storage killed.**
+- **Decision 5.4:** Empirical Claude-tokenizer study **gates** the `format="toon"` decision.
+- **Decision 5.5:** `format` parameter on `search_docs`, `list_versions`,
+  `compare_versions` **only**. JSON default; TOON opt-in. `get_docs` stays markdown.
+- **Decision 5.8:** The study measures **client-side rewrap**, not just raw
+  payload tokens; reports tokens AND latency per tool family.
+
+## 2. Code touch-points (for accuracy — describe, do NOT change)
+
+- Tool results are Pydantic models in `src/mcp_server_python_docs/models.py`
+  (e.g. `GetDocsResult`); tools live in `server.py` and return those models,
+  which FastMCP serializes. The "serializer layer" is the conceptual seam where
+  a structured result becomes a wire string — that's what the `format` parameter
+  will eventually parameterize. You are documenting that seam, not building it.
+- `get_docs` returns markdown content (`GetDocsResult.content`) — this is why it
+  is carved out of the `format` parameter (markdown is already the canonical body).
+
+## 3. Pattern to follow
+
+- There is no `docs/architecture/` ADR yet — you are establishing the house
+  style. Use the exact skeleton embedded in the issue. Keep it tight (1–2 pages).
+- Number/name the file `docs/architecture/ADR-006-serialization.md` to match the
+  roadmap's ADR numbering (ADR-001 and ADR-006 are the first two written).
+
+## 4. Known pitfalls
+
+- **Do not invent.** If you find yourself making a serialization choice that is
+  not in §1 above, that's a pipeline §7 trigger ("cites a design choice not in
+  the issue"): stop and comment.
+- **Do not implement `format`.** That is v0.3.x and is gated by the study.
+- Don't claim a TOON token win — the study hasn't run. The ADR records that TOON
+  is *opt-in and gated*, with the bar being "win holds after client rewrap" (5.8).
+- "Reference architecture" is **not** claimed externally (decision 5.6) — keep
+  the ADR factual, not promotional.
+
+## 5. Decision log
+
+- Final file path:
+- Any wording you were unsure mapped to a locked decision (and how you resolved it):
+- Open follow-ups (e.g. link to TOKEN-STUDY.md once it exists):
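
For readers of the context file above, the "serializer seam" in §2 can be pictured with a small sketch. It is illustrative only and is not part of the task, which forbids implementing `format`. The project uses Pydantic models serialized by FastMCP; this sketch substitutes a stdlib dataclass so it stays self-contained, and `SearchHit` and `serialize` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SearchHit:
    """Hypothetical stand-in for a structured tool result model."""
    title: str
    url: str
    score: float


def serialize(result, format: str = "json") -> str:
    """Turn a structured result into a wire string (the conceptual seam)."""
    if format == "json":
        # Compact JSON: no whitespace after "," or ":".
        return json.dumps(asdict(result), separators=(",", ":"), ensure_ascii=False)
    if format == "toon":
        # Opt-in and gated by the token study (decisions 5.4 and 5.8);
        # intentionally left unimplemented in this sketch.
        raise NotImplementedError("TOON output is gated on the empirical study")
    raise ValueError(f"unknown format: {format!r}")
```

The point of the seam is that the tool body builds the structured result once, and only this last step varies with `format`.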
diff --git a/.planning/agent-context/cpython-source-sha-pin.md b/.planning/agent-context/cpython-source-sha-pin.md
@@ -0,0 +1,67 @@
+# Agent Context — CPython source SHA pin
+
+> One-read working context for issue `[v0.3.0] ingestion — pin CPython source by commit SHA`.
+> PARTIAL issue: you do the pin + verification; the human writes the SECURITY.md prose.
+
+## 1. Roadmap excerpt
+
+> **Build-time supply-chain hardening** (roadmap §4, v0.3.0): Pin CPython source
+> by SHA, not by tag. Document the threat model in SECURITY.md (the `build-index`
+> CPython clone is the largest non-runtime attack surface). Verify Sphinx-build
+> environment isolation.
+>
+> **Decision 5.10 (locked):** Build-time supply chain (the `build-index` CPython
+> clone) is an explicit risk area; threat model documented in SECURITY.md;
+> CPython source pinned by SHA.
+
+## 2. Code touch-points
+
+- `src/mcp_server_python_docs/ingestion/cpython_versions.py`
+  - `CPythonDocsBuildConfig(TypedDict)` — add `sha: str`.
+  - `CPYTHON_DOCS_BUILD_CONFIG` — five entries, currently `{"tag": ..., "sphinx_pin": ...}`:
+    `3.10→v3.10.20`, `3.11→v3.11.15`, `3.12→v3.12.13`, `3.13→v3.13.13`, `3.14→v3.14.4`.
+    Add the resolved SHA to each. Resolve with:
+    `git ls-remote https://github.com/python/cpython.git refs/tags/<tag>`
+    (use the dereferenced commit — the `<tag>^{}` line — not the annotated-tag object).
+- `src/mcp_server_python_docs/__main__.py:210–226` — the clone:
+  `git clone --depth 1 --branch <config["tag"]> https://github.com/python/cpython.git <clone_dir>`.
+  After it, run `git -C <clone_dir> rev-parse HEAD`; if the output differs from `config["sha"]`,
+  log a clear error and **abort this version's build** (raise or skip-with-failure,
+  matching the existing error-handling style in this function; do not silently continue).
+- `tests/test_ingestion.py:53` — existing assertion
+  `config["tag"].startswith(f"v{version}.")`. Add a sibling assertion that
+  `config["sha"]` matches `^[0-9a-f]{40}$`.
+
+## 3. Patterns to follow
+
+- `tests/test_ingestion.py` iterates `CPYTHON_DOCS_BUILD_CONFIG.items()` for the
+  tag assertion — extend that same loop for the SHA assertion. No new fixtures.
+- The clone block already uses `subprocess.run([...], check=True, capture_output=True, text=True)`
+  — reuse that idiom for the `rev-parse` call.
+
+## 4. Known pitfalls
+
+- **`git clone --branch` accepts only branch or tag names, not a raw SHA.**
+  Keep the tag-based shallow fetch; make the **SHA a post-clone
+  verification gate**, not the fetch ref. That is the integrity win: a moved/re-tagged
+  tag now fails the build instead of silently changing canonical content.
+- Use the **dereferenced commit SHA** (peeled tag), not the annotated tag object's
+  own SHA — `rev-parse HEAD` after checkout gives the commit; match that.
+- **Do not edit `SECURITY.md`** (forbidden). Draft the threat-model paragraph in
+  the PR body + decision log below for a human to paste.
+- A full `build-index` clones over the network and takes minutes — do not gate the
+  PR on it. The unit tests cover the config + verification logic offline.
+- Don't bump any tag to a newer CPython point release; pin the SHA of the
+  **current** tag only.
+
+## 5. Decision log
+
+- Resolved SHAs (tag → 40-hex commit), one line each:
+  - 3.10 / v3.10.20 →
+  - 3.11 / v3.11.15 →
+  - 3.12 / v3.12.13 →
+  - 3.13 / v3.13.13 →
+  - 3.14 / v3.14.4 →
+- Where/how the verification aborts on mismatch:
+- **Draft SECURITY.md threat-model paragraph (for human to paste):**
+  >
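
The pin-and-verify flow in the context file above (resolve the peeled commit for a tag, then check `rev-parse HEAD` after the clone) can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in `__main__.py`: the function names `peeled_sha` and `verify_clone_sha`, the injectable `run` parameter, and the choice of `RuntimeError` are all hypothetical, and the real function should follow its existing error-handling style. `peeled_sha` assumes it is given `git ls-remote` output that includes the `^{}` line for annotated tags.

```python
import re
import subprocess

SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def peeled_sha(ls_remote_output: str, tag: str) -> str:
    """Pick the dereferenced commit SHA for `tag` from `git ls-remote` output."""
    refs = {}
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            sha, ref = parts
            refs[ref] = sha
    # Annotated tags expose a peeled "<ref>^{}" entry pointing at the commit;
    # lightweight tags only have the plain ref, which already is the commit.
    sha = refs.get(f"refs/tags/{tag}^{{}}") or refs.get(f"refs/tags/{tag}")
    if sha is None or not SHA_RE.match(sha):
        raise ValueError(f"no commit SHA found for tag {tag}")
    return sha


def verify_clone_sha(clone_dir: str, expected_sha: str, run=subprocess.run) -> None:
    """Raise if the commit checked out in clone_dir is not the pinned SHA."""
    result = run(
        ["git", "-C", clone_dir, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    actual = result.stdout.strip()
    if actual != expected_sha:
        # Hypothetical failure mode: the caller decides whether to raise further
        # or record a skip-with-failure for this version.
        raise RuntimeError(
            f"CPython clone is at {actual}, expected pinned {expected_sha}"
        )
```

Passing `run` as a parameter is only a convenience for offline tests: a fake runner can return a canned `stdout`, so the verification logic is exercised without cloning anything.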