feat(discovery): canonicalize same-origin checkouts in workspace scan#32

Merged
saagpatel merged 1 commit into main from feat/discovery-canon on Jun 3, 2026

@saagpatel
Owner

Collapse linked worktrees + stray duplicate clones (same origin) to one canonical project. Portfolio count 162→132; catalog-completeness gap de-noised 79→58 (now accurate). 6 new tests, full truth suite green, ruff clean.

Linked git worktrees and stray duplicate clones (e.g. <repo>-security-fix
left by multi-repo sweeps) share one origin but were each counted as a
distinct project — inflating the portfolio count (162→132) and dragging
catalog-completeness toward zero (79→58 flagged, now an accurate gap).

Dedupe discovered checkouts by repo_full_name, keeping one canonical per
origin (basename match > shortest name > alphabetical); origin-less local
projects are never collapsed. 6 new tests.
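
The selection rule described above can be sketched roughly as follows. This is an illustration, not the merged code: the function name `canonicalize_checkouts` and the exact shape of the project dicts are assumptions, and only the `name` and `repo_full_name` keys and the three-part tie-breaker come from the PR text and the reviewed lines.

```python
from collections import defaultdict
from typing import Any


def canonicalize_checkouts(projects: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse checkouts sharing an origin to one canonical project (hypothetical helper)."""
    by_origin: dict[str, list[dict[str, Any]]] = defaultdict(list)
    kept: list[dict[str, Any]] = []

    for project in projects:
        full_name = str(project.get("repo_full_name") or "")
        if full_name:
            by_origin[full_name.lower()].append(project)
        else:
            # Origin-less local projects are never collapsed.
            kept.append(project)

    for full_name, group in by_origin.items():
        # Assumption: repo_full_name looks like "owner/repo".
        repo_base = full_name.rsplit("/", 1)[-1]
        kept.append(
            min(
                group,
                key=lambda p: (
                    str(p.get("name", "")).lower() != repo_base,  # basename match first
                    len(str(p.get("name", ""))),                  # then shortest name
                    str(p.get("name", "")).lower(),               # then alphabetical
                ),
            )
        )
    return kept
```

Because `False` sorts before `True`, a checkout whose directory name equals the repository basename wins the first tuple slot; the remaining slots only matter when no checkout (or more than one) matches.
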
saagpatel merged commit 8d1cac9 into main on Jun 3, 2026 (2 checks passed)
saagpatel deleted the feat/discovery-canon branch on June 3, 2026 04:50

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e57085225f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +138 to +141
key=lambda p: (
str(p.get("name", "")).lower() != repo_base,
len(str(p.get("name", ""))),
str(p.get("name", "")).lower(),

P2: Prefer cataloged paths before collapsing checkouts

When same-origin checkouts include a path-cataloged project plus a stray basename clone, this tie-breaker always keeps the basename directory even if the catalog only has a contract for the nested path. _build_truth_project later resolves catalog entries from raw_project["path"] first, so a workspace with ITPRJsViaClaude/IncidentWorkbench and a root IncidentWorkbench sharing the same origin would drop the explicit ITPRJsViaClaude/IncidentWorkbench contract and report the wrong group/completeness metadata. Consider preferring checkouts whose path has an explicit catalog entry before falling back to basename/length.
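
One way the suggested ordering might look is sketched below, with a catalog check placed ahead of the existing tie-breakers. The helper name `pick_canonical`, the `cataloged_paths` set, and the assumption that each project dict carries a `path` key comparable to catalog keys are all hypothetical; the real catalog lookup in `_build_truth_project` may work differently.

```python
from typing import Any


def pick_canonical(
    group: list[dict[str, Any]],
    repo_base: str,
    cataloged_paths: set[str],
) -> dict[str, Any]:
    """Choose one checkout per origin, preferring explicitly cataloged paths (sketch)."""
    return min(
        group,
        key=lambda p: (
            # Assumption: a checkout is "cataloged" when its path is a catalog key.
            str(p.get("path", "")) not in cataloged_paths,
            str(p.get("name", "")).lower() != repo_base,
            len(str(p.get("name", ""))),
            str(p.get("name", "")).lower(),
        ),
    )
```

With an empty `cataloged_paths` the first slot is `True` for every checkout, so the ordering falls back to the basename, length, and alphabetical rule unchanged.
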

Useful? React with 👍 / 👎.

saagpatel added a commit that referenced this pull request Jun 3, 2026

* chore(ci): pin all GitHub Actions to commit SHAs + add concurrency to audit.yml

Pin every `uses:` ref across all 5 workflows (audit, ci, codeql, pypi,
release) from floating major tags to the commit SHA their current tag
resolves to, annotated with the resolved version. Removes the moving-tag
supply-chain surface: a compromised or repointed tag can no longer swap
action code into our CI silently.

Add a concurrency group to audit.yml (ci.yml and codeql.yml already had
one) with cancel-in-progress: false, so overlapping scheduled audit runs
serialize instead of racing on the history cache.

Pins are re-derived fresh against canonical's current action majors
(checkout v6, setup-python v6, upload/download-artifact v7, action-gh-release
v3); supersedes the stale archived attempt that targeted older v4/v5 majors.

* feat(discovery): skip transient non-project dirs in workspace scan

Sibling to the same-origin canonicalization fix (#32). The workspace scan
admitted three classes of scratch directory as real projects, dragging them
into the catalog-completeness gate as permanently-unfixable flags:

  - NoGoPRJs/*          operator-flagged never-pursued projects
  - *-smoke-export/*    generated AuraForge signed-smoke-export bundles
  - *-tmp-<timestamp>   transient clones left by tooling runs

Add _is_ignored_project_dir (token + regex ignore-list, sibling to SKIP_DIRS),
applied in discover_workspace_projects and _discover_nested_projects to skip a
directory and its subtree. Live-workspace check: 132 -> 129 projects; the three
residual catalog-flags drop to 0 while real repos (incl. ResumeEvolver, whose
-tmp clone is filtered) are retained.
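
A token plus regex ignore-list of the kind described here could be sketched as below. The constant names, the public function name, and in particular the timestamp pattern (assumed to be a run of digits) are guesses for illustration; the actual `_is_ignored_project_dir` may match different forms.

```python
import re

# Assumption: exact directory names that are always skipped.
IGNORED_DIR_NAMES = frozenset({"NoGoPRJs"})

# Assumption: "<timestamp>" is a run of digits; the real format is not given in the PR.
IGNORED_DIR_PATTERNS = (
    re.compile(r".+-smoke-export"),
    re.compile(r".+-tmp-\d+"),
)


def is_ignored_project_dir(name: str) -> bool:
    """Return True when a directory name marks a transient, non-project dir (sketch)."""
    if name in IGNORED_DIR_NAMES:
        return True
    return any(pattern.fullmatch(name) for pattern in IGNORED_DIR_PATTERNS)
```

A scanner would call this on each directory name before descending, so a match skips the directory and its whole subtree while a real repo such as `ResumeEvolver` is still admitted.
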

* test(ci): realign distribution-policy assertions with SHA-pinned workflows

The pin-all-actions-to-SHAs change (94dca6d) replaced tag refs like
codeql-action/init@v4 and gh-action-pypi-publish@release/v1 with SHA pins
carrying '# vN.x' comments, but these policy tests still asserted the old tag
literals — so they fail on main. Update the assertions to verify the action is
present and pinned to the intended major line (# v4 / # v1) instead of an exact
tag, preserving the security intent and tolerating future patch bumps.
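
As a rough illustration of such a major-line assertion, the check below accepts any 40-character hex SHA as long as the trailing comment names the intended major. The helper name `is_pinned_to_major` and the sample workflow line are hypothetical, not taken from the repository's tests.

```python
import re


def is_pinned_to_major(workflow_text: str, action: str, major: str) -> bool:
    """Check that `action` is pinned to a full commit SHA annotated with `major` (sketch)."""
    pattern = re.compile(
        rf"uses:\s*{re.escape(action)}@[0-9a-f]{{40}}\s+#\s*{re.escape(major)}(?!\d)"
    )
    return bool(pattern.search(workflow_text))


# Hypothetical workflow line; the SHA is a placeholder, not a real commit.
sample = "      - uses: github/codeql-action/init@" + "a" * 40 + " # v4.31.2\n"
pinned = is_pinned_to_major(sample, "github/codeql-action/init", "v4")
```

The negative lookahead keeps `# v4` from also matching a comment such as `# v40.1`, while patch bumps within the same major still pass.
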