
fix(explore): keep multi-term backend files from being buried by a denser frontend layer #917

Open
colbymchenry wants to merge 1 commit into main from fix/explore-corroboration-ranking


@colbymchenry


What

codegraph_explore now keeps a query-relevant backend file in the response when a larger, denser frontend layer would otherwise crowd it out of a cross-layer monorepo.

Why

The explore file sort is primarily driven by Random-Walk-with-Restart (RWR) graph-centrality mass, seeded from the query's text matches. In a cross-layer monorepo (e.g. an api/ server alongside a much larger, internally dense app/ frontend that mirrors the same domain words), RWR mass concentrates in the bigger layer. A backend service/handler that matches several distinct query terms (often the #1 search hit, with many callers) is call-isolated from the frontend seed cluster, so it accrues little RWR mass, sorts below hits=0 frontend files, and gets truncated out of the response. The agent then has to read the file back itself.
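To make the skew concrete, here is a minimal sketch of RWR by power iteration on a toy graph. It is not codegraph's implementation: the `rwr` function, the restart probability, the edge directions, and every file path below are hypothetical, chosen only to reproduce the shape of the problem (six frontend seeds sharing one hub versus one call-isolated backend seed).

```typescript
// Toy Random-Walk-with-Restart via power iteration (illustrative only).
type Graph = Map<string, string[]>; // node -> outgoing neighbours

function rwr(graph: Graph, seeds: string[], restart = 0.15, iterations = 100): Map<string, number> {
  const nodes = Array.from(graph.keys());
  const seedShare = 1 / seeds.length;

  // Start with all mass spread evenly over the seed nodes.
  let mass = new Map<string, number>();
  for (const n of nodes) mass.set(n, seeds.includes(n) ? seedShare : 0);

  for (let i = 0; i < iterations; i++) {
    // Restart step: a fixed fraction of the mass jumps back to the seeds.
    const next = new Map<string, number>();
    for (const n of nodes) next.set(n, seeds.includes(n) ? restart * seedShare : 0);

    // Walk step: the remaining mass flows along outgoing edges.
    for (const n of nodes) {
      const out = graph.get(n) ?? [];
      const walking = (1 - restart) * (mass.get(n) ?? 0);
      const targets = out.length > 0 ? out : seeds; // dangling nodes return mass to the seeds
      for (const t of targets) next.set(t, (next.get(t) ?? 0) + walking / targets.length);
    }
    mass = next;
  }
  return mass;
}

// Hypothetical monorepo: a dense frontend around one hub, plus an isolated backend pair.
const frontend = ["app/a.vue", "app/b.vue", "app/c.vue", "app/d.vue", "app/e.vue", "app/f.vue"];
const graph: Graph = new Map<string, string[]>();
for (const f of frontend) graph.set(f, ["app/store.ts"]);
graph.set("app/store.ts", frontend);
graph.set("api/service.ts", ["api/db.ts"]);
graph.set("api/db.ts", ["api/service.ts"]);

// Text matches seed the walk; app/store.ts matches nothing (hits=0).
const seeds = frontend.concat("api/service.ts");
const mass = rwr(graph, seeds);
const ranked = Array.from(mass.entries()).sort((a, b) => b[1] - a[1]);
for (const [path, m] of ranked) console.log(path, m.toFixed(3));
```

In this toy setup the unmatched hub `app/store.ts` collects the mass flowing out of all six frontend seeds, while `api/service.ts` keeps only its own restart share plus what cycles back from `api/db.ts`, so a pure mass ordering puts the hits=0 hub first.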

How

Adds a multi-term corroboration tier above the graph signal: a file that is BOTH (a) an entry/central file (search root, named seed, or graph-central hub) AND (b) matched by ≥2 distinct query terms ranks above the pure graph ordering.

The entry/central guard is what makes it safe: an incidental multi-term file that is neither entry nor central (a type/util file matching a common word but not on the flow) is NOT promoted, so it can't displace a graph-central answer file. A blunt hits≥2-only tier regressed exactly that case: it floated unrelated type/util files above the real central renderer in a React render flow. Single-layer repos with one cluster are unaffected. Gated by CODEGRAPH_RANK_NO_MULTITERM=1.
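The tier and its guard can be sketched as a two-level comparator. This illustrates the rule as described above, not the code in this PR: `FileCandidate`, `isCorroborated`, `rankFiles`, and all field names are hypothetical. It also assumes, from the variable's name only, that `CODEGRAPH_RANK_NO_MULTITERM=1` switches the tier off. The environment is passed in as a plain object to keep the sketch self-contained.

```typescript
// Hypothetical shape of a ranked file; field names are assumptions, not codegraph's types.
interface FileCandidate {
  path: string;
  rwrMass: number;           // graph-centrality mass from the random walk
  matchedTerms: Set<string>; // distinct query terms that hit this file
  isSearchRoot: boolean;
  isNamedSeed: boolean;
  isGraphCentral: boolean;
}

type Env = Record<string, string | undefined>;

// Promote only files that are entry/central AND matched by >= 2 distinct terms.
function isCorroborated(file: FileCandidate, env: Env): boolean {
  // Assumption: setting the variable to "1" disables the tier.
  if (env["CODEGRAPH_RANK_NO_MULTITERM"] === "1") return false;
  const entryOrCentral = file.isSearchRoot || file.isNamedSeed || file.isGraphCentral;
  return entryOrCentral && file.matchedTerms.size >= 2;
}

// Corroborated files sort first; within a tier, the graph ordering is unchanged.
function rankFiles(files: FileCandidate[], env: Env = {}): FileCandidate[] {
  return files.slice().sort((a, b) => {
    const tierA = isCorroborated(a, env) ? 1 : 0;
    const tierB = isCorroborated(b, env) ? 1 : 0;
    if (tierA !== tierB) return tierB - tierA;
    return b.rwrMass - a.rwrMass;
  });
}

// Made-up candidates: a dense frontend hub, a multi-term backend search root,
// and an incidental multi-term type file that is neither entry nor central.
const candidates: FileCandidate[] = [
  { path: "app/store.ts", rwrMass: 0.39, matchedTerms: new Set<string>(),
    isSearchRoot: false, isNamedSeed: false, isGraphCentral: true },
  { path: "api/service.ts", rwrMass: 0.08, matchedTerms: new Set(["items", "permissions"]),
    isSearchRoot: true, isNamedSeed: false, isGraphCentral: false },
  { path: "shared/types.ts", rwrMass: 0.01, matchedTerms: new Set(["items", "permissions"]),
    isSearchRoot: false, isNamedSeed: false, isGraphCentral: false },
];

const withTier = rankFiles(candidates).map((f) => f.path);
const withoutTier = rankFiles(candidates, { CODEGRAPH_RANK_NO_MULTITERM: "1" }).map((f) => f.path);
console.log(withTier);
console.log(withoutTier);
```

With the tier active, the backend service moves ahead of the frontend hub, while `shared/types.ts` stays last because it fails the entry/central check. With the gate set, the order falls back to pure mass.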

Validation

Deterministic, on real indexes (directus / n8n / excalidraw):

  • directus backend-flow queries: the api/ service moves from absent/mentioned to sourced in the explore output (was buried under app/*.vue + unrelated packages).
  • excalidraw render-flow query: no regression — all three render files stay sourced (the entry/central guard prevents the blunt-tier regression).
  • n8n: unchanged.
  • All existing explore/context/ranking tests pass; adds __tests__/explore-corroboration-ranking.test.ts.

🤖 Generated with Claude Code

…nser frontend layer

codegraph_explore's file sort is primarily driven by Random-Walk-with-Restart
graph-centrality mass, seeded from the query's text matches. In a cross-layer
monorepo (an API server alongside a much larger, internally dense frontend that
mirrors the same domain words), that mass skews to the bigger layer — so a
backend service/handler that genuinely matches several query terms, even when
it's the #1 search hit, sorts below hits=0 frontend files and gets truncated out
of the response, and the agent reads it back.

Add a corroboration tier above the graph signal: a file that is BOTH an
entry/central file AND matched by >=2 distinct query terms is kept in. The
entry/central guard prevents an incidental multi-term file (a type/util file
that isn't the flow) from displacing a graph-central answer file — a blunt
hits-only tier regressed that case. Single-layer repos are unaffected. Gated by
CODEGRAPH_RANK_NO_MULTITERM=1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>