Skip to content

[codex] Implement roadmap hardening release#1

Merged
chrishonson merged 7 commits into
mainfrom
codex/roadmap-hardening-release
Jun 15, 2026
Merged

[codex] Implement roadmap hardening release#1
chrishonson merged 7 commits into
mainfrom
codex/roadmap-hardening-release

Conversation

@chrishonson

Copy link
Copy Markdown
Owner

Summary

Ships the first MetaCortex roadmap hardening release:

  • Adds Firestore TTL-ready fields for write fingerprints and memory events, plus dry-run/write backfill tooling and a production TTL deployment script.
  • Cleans the public MCP contract by returning summary-only search_context matches and accepting document_id as a fetch_context compatibility alias with conflict validation.
  • Repairs Gemini multimodal defaults from the shut-down preview model to stable gemini-3.1-flash-lite, adds live model validation, and keeps generation on Vertex AI global while embeddings stay in us-central1.
  • Hardens smoke tooling for scoped clients, image-file multimodal tests, and MCP error detection.
  • Updates README, deployment/security docs, env examples, tests, and the roadmap status.

Validation

  • npm --prefix functions test
  • npm --prefix functions run build
  • npm --prefix functions run validate:models
  • ./scripts/deploy-session-preflight.sh
  • Firebase production deploy to my-brain-88870: indexes and functions deployed.
  • TTL backfill dry-run/write/follow-up dry-run completed; follow-up reported update_count: 0 for fingerprints and memory events.
  • Firestore TTL policies verified ACTIVE for memory_events/expires_at and memory_vectors_write_fingerprints/expires_at.
  • Production smoke validation passed for admin read/write, Claude scoped read/write, ChatGPT multimodal read/write, and installed connector fetch_context with document_id.

@chrishonson chrishonson marked this pull request as ready for review June 11, 2026 20:31
@chrishonson

Copy link
Copy Markdown
Owner Author

ping

@chrishonson

Copy link
Copy Markdown
Owner Author

hi @claude

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Claude finished @chrishonson's task in 28s —— View job


Hi Nick! 👋

I'm Claude, ready to help with this PR. I can see you've got a substantial hardening release here — 1219 additions across 25 files covering TTL fields, MCP contract cleanup, Gemini model fixes, and smoke test improvements.

What would you like me to help with? Here are a few things I can do:

  • Code review — I can review the full diff for bugs, security issues, and potential improvements
  • Answer questions — about any specific part of the implementation
  • Implement changes — if you'd like me to make any fixes or additions

Just let me know what you need!


@chrishonson

Copy link
Copy Markdown
Owner Author

review it

@chrishonson

Copy link
Copy Markdown
Owner Author

@claude review this

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Claude finished @chrishonson's task in 3m 35s —— View job


Code Review

  • Gather context and understand the request
  • Read changed files and analyze the diff
  • Review TTL backfill and deployment scripts
  • Review MCP contract changes (search/fetch)
  • Review Gemini model fixes and embeddings
  • Review observability and repository changes
  • Review test coverage

Overall this is a solid hardening release. The TTL architecture, document_id alias, and model repair are all well-executed. A few correctness issues and some minor papercuts are noted below.


Bugs

1. consolidateContext doesn't deduplicate source_ids — medium severity

service.ts:179-193 maps source_ids directly without deduplication. If a caller passes ["a", "a"], the length check at line 199 passes (2 ≥ 2), the LLM merges the same memory with itself, and then deprecate() is called twice on the same document. The second deprecation silently overwrites with the same superseded_by, so it's non-destructive, but the merge result is wrong.

// service.ts:179 — no dedup before mapping
sources = (fetched as NonNullable<...>[]).map(doc => ({ id: doc.id, content: doc.content }));

The MCP schema description says "At least 2 ids required" but neither the Zod schema nor the service enforces uniqueness. Fix this →


2. fetch_context branch state check after document fetch — minor

In mcpServer.ts:284-299, the allowedFilterStates check runs after service.fetchContext() returns. A scoped token holder can distinguish "document doesn't exist" (404 from the service) from "document exists but you can't see it" (403 from the state check). This leaks document existence to clients who shouldn't know about deprecated/wip records.

// mcpServer.ts:287-299
const fetched = await service.fetchContext(args);      // 404 if missing
if (!config.allowedFilterStates.includes(fetched.item.metadata.branch_state)) {
  throw new HttpError(403, `branch_state '...' is not allowed`);  // 403 if wrong state
}

For a private memory system the exposure is limited, but 403/404 discrimination is a recognized information-disclosure pattern. Fix this →


Code Quality

3. createVertexClient duplicated across embeddings.ts and merging.ts

embeddings.ts:206-221 and merging.ts:85-100 are byte-for-byte identical. This is the only shared pattern between the two files that isn't abstracted. A shared gemini.ts utility would eliminate the duplication and keep the GEMINI_API_KEY env-var manipulation in one place. Fix this →

4. validate-models.mjs does manual GEMINI_API_KEY manipulation

validate-models.mjs:40-63 manually deletes and restores process.env.GEMINI_API_KEY at the module top-level, whereas embeddings.ts and merging.ts wrap this logic in createVertexClient(). The script predates the extraction, but since it already lives in the scripts directory alongside the rest, it's inconsistent. Low risk (it's a dev-time script, runs sequentially), but worth noting.

5. GeminiMergeClient reuses config.multimodalModel

runtime.ts:65-76 instantiates GeminiMergeClient with config.multimodalModel (defaulting to gemini-3.1-flash-lite). The merge operation is text-to-text and is a high-value call — it produces the canonical consolidated memory. Using the same (lightweight) model as image normalization might be acceptable now, but if consolidation quality becomes a concern, a separate GEMINI_MERGE_MODEL env var would give you independent control without touching the other defaults.


Scale / Operational

6. backfill-firestore-ttl.mjs reads entire collection into memory

backfillFingerprints and backfillEvents both call collection.get() (lines 69, 123) which fetches all documents in one shot. The batchSize parameter controls only write batching, not read pagination. This is fine at current scale (the follow-up dry-run already reported update_count: 0, meaning it ran successfully). If memory_vectors_write_fingerprints or memory_events grows to 100k+ docs, this would need pagination via query.limit(batchSize).startAfter(lastDoc). Worth a comment in the script. Fix this →


Documentation

7. CLAUDE.md GEMINI_MULTIMODAL_MODEL default is stale

CLAUDE.md (env vars table) still shows gemini-3.1-flash-lite-preview as the default, but config.ts:268 now defaults to gemini-3.1-flash-lite. This is one of the core fixes in the PR and the doc didn't get updated. Fix this →


What's solid

  • TTL field split (dedupe_expires_at number + expires_at Date): clean separation of concerns between the dedup window and the Firestore TTL policy. The getDedupeExpiresAt fallback for legacy numeric expires_at values is correct.
  • document_id alias: the conflict validation (id !== document_id → error) in both the Zod superRefine and resolveFetchContextId is thorough.
  • search_context summary truncation: returning 220-char summaries in search and full content only on fetch_context is the right pattern for token-efficient LLM clients.
  • createVertexClient env-var mutex: the delete/finally-restore pattern is correct for the single-threaded Node.js init path.
  • Backfill dry-run: the --write flag defaulting to dry-run is good operational safety.
  • Smoke test requireSuccessfulToolResult: checking both result.isError and payload.error catches both MCP-layer and application-layer failures.
  • Test coverage: memoryRepository.test.ts covering both the new fingerprint fields and the legacy numeric expires_at fallback is exactly the right thing to test here.

@chrishonson

Copy link
Copy Markdown
Owner Author

@claude review feedback addressed. verify it

@claude

claude Bot commented Jun 15, 2026

Copy link
Copy Markdown

Claude finished @chrishonson's task in 2m 8s —— View job


Review Feedback Verification

  • Gather context and read current source files
  • Bug 1: consolidateContext source_ids deduplication (service.ts + mcpServer.ts)
  • Bug 2: fetch_context 403/404 information disclosure (mcpServer.ts)
  • Quality 3: createVertexClient duplication (embeddings.ts + merging.ts)
  • Quality 4: validate-models.mjs GEMINI_API_KEY manipulation
  • Quality 5: GeminiMergeClient uses multimodalModel
  • Scale 6: backfill-firestore-ttl.mjs full-collection reads
  • Docs 7: CLAUDE.md stale GEMINI_MULTIMODAL_MODEL default

Results

5 of 7 items addressed — 2 outstanding

✅ Fixed

Bug 1 — consolidateContext source_ids deduplication
Both layers addressed correctly:

  • service.ts:180: const uniqueSourceIds = [...new Set(input.source_ids)]; — dedup before fetch and deprecate
  • mcpServer.ts:131-144: superRefine added that rejects duplicate ids at the Zod input layer with "source_ids must be unique"

Bug 2 — fetch_context 403/404 information disclosure
mcpServer.ts:325-329 now throws new HttpError(404, "Document not found") when allowedFilterStates rejects the document's state — matches the 404 from a missing document, closing the existence-leak.

Scale 6 — backfill-firestore-ttl.mjs full-collection reads
Comments added to both backfillFingerprints (line 69) and backfillEvents (line 125) noting the unpaged read and pointing to the limit/startAfter pattern for large collections. Matches the requested change precisely.


❌ Not addressed

Quality 3 — createVertexClient duplicated in embeddings.ts and merging.ts
Both files (embeddings.ts:206-222 and merging.ts:85-101) still carry identical createVertexClient implementations. The extraction to a shared file (gemini.ts or similar) has not been done. Fix this →

Docs 7 — CLAUDE.md GEMINI_MULTIMODAL_MODEL default is stale
CLAUDE.md:144 still shows gemini-3.1-flash-lite-preview as the default. config.ts and validate-models.mjs:25 both now default to gemini-3.1-flash-lite (stable). The doc update was missed. Fix this →


ℹ️ Not changed (observations, no action required)

Quality 4 — validate-models.mjs GEMINI_API_KEY manipulation: The inline delete/restore at lines 40-63 is still present and not using createVertexClient. Low risk for a dev-time script; was flagged as informational only.

Quality 5 — GeminiMergeClient uses config.multimodalModel: runtime.ts:75 still passes config.multimodalModel to GeminiMergeClient. Acceptable for now; was flagged as an architectural note for if consolidation quality becomes a concern.


@chrishonson

Copy link
Copy Markdown
Owner Author

@claude please recheck

@claude

claude Bot commented Jun 15, 2026

Copy link
Copy Markdown

Claude finished @chrishonson's task in 1m 4s —— View job


Re-verification of Outstanding Items

  • Check Quality 3: createVertexClient extraction to shared gemini.ts
  • Check Docs 7: CLAUDE.md GEMINI_MULTIMODAL_MODEL default

Results

1 of 2 items now addressed — 1 still outstanding

✅ Fixed

Quality 3 — createVertexClient duplication
Correctly resolved. functions/src/gemini.ts now exists and exports createVertexClient with the VertexClientOptions interface. Both consumers import from it:

  • embeddings.ts:4: import { createVertexClient } from "./gemini.js";
  • merging.ts:3: import { createVertexClient } from "./gemini.js";

Neither file carries the duplicate inline implementation anymore. Clean extraction.


❌ Still not addressed

Docs 7 — CLAUDE.md GEMINI_MULTIMODAL_MODEL default is stale

CLAUDE.md:144 still reads:

| `GEMINI_MULTIMODAL_MODEL` | `gemini-3.1-flash-lite-preview` | Multimodal normalization model |

The correct stable default (matching config.ts and validate-models.mjs:25) is gemini-3.1-flash-lite. The -preview suffix is still present. Fix this →


@chrishonson chrishonson merged commit 0037659 into main Jun 15, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant