build(docs-gen): add ast raw data extractor #1147

Draft
makhnatkin wants to merge 3 commits into main from codex/docs-gen-ast-raw-data

makhnatkin (Collaborator) commented Jun 14, 2026

Summary by Sourcery

Add a TypeScript AST-based extractor pipeline to generate raw extension metadata and integrate it into the docs-gen tooling.

New Features:

  • Introduce a CLI command to extract raw extension metadata from configured editor extension entry points into JSON and Markdown outputs.
  • Add a pluggable TypeScript AST extraction framework to derive schema, actions, keymaps, plugins, options, and examples from extension source and test files.
  • Expose configuration for docs generation and extension extraction, including paths, categories, entry points, and preset definitions.
  • Provide Markdown generators and pipeline documentation for the new extraction flow.

Enhancements:

  • Refactor docs generation paths and constants into a shared config module and update the existing docs build script to use it.

Build:

  • Extend the docs-gen package scripts to run the new extractor and its tests, and add TypeScript as a dependency.

Documentation:

  • Document the docs-gen package layout and the extension extraction pipeline, including module responsibilities and CLI usage.

Tests:

  • Add unit tests for AST-based extractors and option parsing used by the extension metadata pipeline.

sourcery-ai Bot commented Jun 14, 2026

Reviewer's Guide

Adds a new AST-based extension metadata extraction pipeline to docs-gen, centralizes docs configuration paths, and wires a CLI plus tests for generating raw extension data alongside the existing Diplodoc build.

File-Level Changes

Change | Details | Files
Centralize docs-gen configuration (paths, regexes, extension metadata contract) and adjust existing docs build to use it.
  • Introduce src/config.mjs to hold repo paths, docs directories, GitHub URL regex, header regex, extension entry points, blacklist, and field config.
  • Refactor generate-docs.mjs to import DOCS_DIR, DOCS_SRC_DIR, GITHUB_RAW_RE, and HEADER_RE from config instead of computing paths locally.
  • Rename OUT_DIR usages to DOCS_SRC_DIR and update file operations for cleaning, writing docs, toc.yaml, index.md, assets, and .yfm.
infra/docs-gen/src/config.mjs
infra/docs-gen/src/generate-docs.mjs
Add a CLI and orchestrator for extracting raw extension metadata into JSON and Markdown.
  • Implement extract-extension-data.mjs CLI to parse options (--editor-pkg, --out-dir, --only, --help), resolve paths from repo root, and invoke ExtensionExtractor.
  • Create ExtensionExtractor class to collect extension refs, scan extensions, enrich with preset membership, and write outputs to tmp/docs-gen.
  • Wire new package scripts: extract (runs the CLI) and test (runs node --test over the extractor tests).
infra/docs-gen/src/extract-extension-data.mjs
infra/docs-gen/src/extractor/index.mjs
infra/docs-gen/package.json
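
To make the CLI contract above concrete, here is a minimal sketch of an option parser for the four flags listed (--editor-pkg, --out-dir, --only, --help). The function name `parseCliArgs`, the comma-separated format for --only, the null defaults, and the extension names in the usage line are assumptions for illustration, not taken from the PR.

```javascript
// Hypothetical sketch: flag names come from the PR description; everything
// else (defaults, comma-separated --only, error messages) is assumed.
function parseCliArgs(argv) {
    const options = {editorPkg: null, outDir: null, only: [], help: false};
    for (let i = 0; i < argv.length; i++) {
        const arg = argv[i];
        if (arg === '--help') {
            options.help = true;
            continue;
        }
        const eq = arg.indexOf('=');
        const flag = eq === -1 ? arg : arg.slice(0, eq);
        // The value is either inline ("--flag=value") or the next argv entry.
        const value = eq === -1 ? argv[++i] : arg.slice(eq + 1);
        if (value === undefined) {
            throw new Error(`Missing value for ${flag}`);
        }
        switch (flag) {
            case '--editor-pkg':
                options.editorPkg = value;
                break;
            case '--out-dir':
                options.outDir = value;
                break;
            case '--only':
                // Repeated --only flags accumulate; empty entries are dropped.
                options.only.push(...value.split(',').filter(Boolean));
                break;
            default:
                throw new Error(`Unknown option: ${flag}`);
        }
    }
    return options;
}

// Example invocation with made-up extension names.
const parsed = parseCliArgs(['--out-dir', 'tmp/docs-gen', '--only=bold,italic']);
```

Throwing on unknown flags is one possible design choice; the actual CLI may instead ignore them or print the help text.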
Implement extension discovery, filtering, and filesystem utilities for the extractor.
  • Add config-driven EXTENSION_ENTRY_POINTS, categories, internal and external blacklists, and a helper to apply CLI overrides to entry points.
  • Implement extension-refs.mjs to build extension references from category directories and single-extension packages, then apply blacklist/--only filters.
  • Add shared filesystem helpers (readText, listDirs, findFiles, readAllTsFiles) used by extractor modules.
infra/docs-gen/src/config.mjs
infra/docs-gen/src/extractor/extension-refs.mjs
infra/docs-gen/src/utils.mjs
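
The blacklist/--only filtering step described above could look roughly like the sketch below. The ref shape (`{name, category}`), the function name `filterExtensionRefs`, and the rule that an empty --only list keeps everything are assumptions; the sample names are made up.

```javascript
// Hypothetical sketch of filtering extension refs; not the PR's actual code.
function filterExtensionRefs(refs, {blacklist = [], only = []} = {}) {
    const blocked = new Set(blacklist);
    const wanted = new Set(only);
    return refs.filter((ref) => {
        if (blocked.has(ref.name)) return false;
        // An empty --only list is treated as "keep everything not blacklisted".
        return wanted.size === 0 || wanted.has(ref.name);
    });
}

// Made-up refs for illustration.
const refs = [
    {name: 'Bold', category: 'markdown'},
    {name: 'Italic', category: 'markdown'},
    {name: 'Internal', category: 'base'},
];
const kept = filterExtensionRefs(refs, {blacklist: ['Internal'], only: ['Bold', 'Internal']});
```

Applying the blacklist before --only means a blacklisted extension stays excluded even when it is requested explicitly; the real implementation may order these checks differently.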
Implement TypeScript AST infrastructure and focused scanners for actions, schema, keymaps, plugins, input rules, markdown-it plugins, and serializer hints.
  • Create ast/core.mjs with generic AST helpers (parseSource, forEachNode, unwrapExpression, unique, getExpressionName, getStaticPropertyName, getStringValue, getCallPropertyName).
  • Add ast/builder.mjs to recognize extension builder call chains and extract first arguments for specific methods.
  • Add specialized scanners: actions.mjs, schema.mjs, keymaps.mjs, input-rules.mjs, plugins.mjs, md-plugins.mjs, serializer.mjs, and factory.mjs, re-exported via ast.mjs as a barrel.
infra/docs-gen/src/extractor/ast/core.mjs
infra/docs-gen/src/extractor/ast/builder.mjs
infra/docs-gen/src/extractor/ast/actions.mjs
infra/docs-gen/src/extractor/ast/schema.mjs
infra/docs-gen/src/extractor/ast/keymaps.mjs
infra/docs-gen/src/extractor/ast/input-rules.mjs
infra/docs-gen/src/extractor/ast/plugins.mjs
infra/docs-gen/src/extractor/ast/md-plugins.mjs
infra/docs-gen/src/extractor/ast/serializer.mjs
infra/docs-gen/src/extractor/ast/factory.mjs
infra/docs-gen/src/extractor/ast.mjs
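
As a rough illustration of what helpers such as forEachNode, unwrapExpression, and unique might do, the sketch below walks a tree of plain objects. This is a stand-in only: the real helpers operate on TypeScript compiler AST nodes, while the node shape (`kind`, `children`, `expression`) and the method names `addAction` and `addNode` used here are assumptions chosen so the example runs without the compiler.

```javascript
// Stand-in node shape: {kind, children?, expression?, ...}. Assumed, not the TS AST.

// Depth-first traversal that calls visit() on every node, parents first.
function forEachNode(node, visit) {
    visit(node);
    for (const child of node.children ?? []) {
        forEachNode(child, visit);
    }
}

// Strip wrappers that do not change the value of an expression.
function unwrapExpression(node) {
    let current = node;
    while (current.kind === 'Parenthesized' || current.kind === 'AsExpression') {
        current = current.expression;
    }
    return current;
}

// De-duplicate while preserving first-seen order.
function unique(values) {
    return [...new Set(values)];
}

// Collect the distinct method names called anywhere in a made-up tree.
const tree = {
    kind: 'SourceFile',
    children: [
        {kind: 'Call', name: 'addAction', children: [{kind: 'Call', name: 'addNode'}]},
        {kind: 'Call', name: 'addAction'},
    ],
};
const called = [];
forEachNode(tree, (node) => {
    if (node.kind === 'Call') called.push(node.name);
});
const methods = unique(called);
```

Keeping traversal, unwrapping, and de-duplication as separate small helpers lets each focused scanner reuse them without knowing about the others.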
Add constant and options resolvers to translate TypeScript declarations into flat, usable metadata.
  • Implement constants.mjs to extract string-valued consts, enums, scalar object members, resolve aliases, and provide resolveConstant/resolveAllConstants helpers.
  • Implement options.mjs plus options/declarations.mjs, options/fields.mjs, and options/resolve.mjs to parse *Options declarations (interfaces/types), handle Pick/Omit, intersections, and nested objects into field lists.
  • Use these resolvers in schema.mjs and record-fields.mjs to resolve schema names, action IDs, and options against constants and declarations.
infra/docs-gen/src/extractor/constants.mjs
infra/docs-gen/src/extractor/options.mjs
infra/docs-gen/src/extractor/options/declarations.mjs
infra/docs-gen/src/extractor/options/fields.mjs
infra/docs-gen/src/extractor/options/resolve.mjs
infra/docs-gen/src/extractor/schema.mjs
infra/docs-gen/src/extractor/record-fields.mjs
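
The Pick/Omit and intersection handling mentioned above can be sketched over a simplified declaration model. The model below (`object`, `ref`, `pick`, `omit`, `intersection` kinds), the function name `resolveFields`, and the sample `BaseOptions` declaration are all hypothetical; the PR derives the equivalent information from TypeScript declarations.

```javascript
// Hypothetical type-expression model:
//   {kind: 'object', fields: [{name, type}]}
//   {kind: 'ref', name}
//   {kind: 'pick' | 'omit', target, keys}
//   {kind: 'intersection', parts}
function resolveFields(expr, declarations) {
    switch (expr.kind) {
        case 'object':
            return expr.fields;
        case 'ref': {
            // Unknown references resolve to no fields instead of failing.
            const decl = declarations[expr.name];
            return decl ? resolveFields(decl, declarations) : [];
        }
        case 'pick':
            return resolveFields(expr.target, declarations).filter((f) => expr.keys.includes(f.name));
        case 'omit':
            return resolveFields(expr.target, declarations).filter((f) => !expr.keys.includes(f.name));
        case 'intersection': {
            // A later part replaces an earlier field with the same name.
            const byName = new Map();
            for (const part of expr.parts) {
                for (const field of resolveFields(part, declarations)) {
                    byName.set(field.name, field);
                }
            }
            return [...byName.values()];
        }
        default:
            return [];
    }
}

// Models: type MyOptions = Omit<BaseOptions, 'b'> & {d: boolean}
const declarations = {
    BaseOptions: {
        kind: 'object',
        fields: [
            {name: 'a', type: 'string'},
            {name: 'b', type: 'number'},
            {name: 'c', type: 'string'},
        ],
    },
};
const myOptions = {
    kind: 'intersection',
    parts: [
        {kind: 'omit', target: {kind: 'ref', name: 'BaseOptions'}, keys: ['b']},
        {kind: 'object', fields: [{name: 'd', type: 'boolean'}]},
    ],
};
const fieldNames = resolveFields(myOptions, declarations).map((f) => f.name);
```

This sketch has no guard against self-referencing declarations and does not flatten nested objects; a real resolver would need both.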
Provide higher-level source grouping and per-extension scanning to feed the field extractors.
  • Implement extension-sources.mjs to read all TS/TSX files for an extension and separate production from test files, plus build schema-specific content.
  • Implement source-files.mjs to classify files (source vs tests, specs, serializer, root/specs index) and expose helpers for joining content.
  • Implement scan.mjs to glue everything: read sources, extract constants, build schema content, and create a final extension record from EXTENSION_DOC_FIELD_CONFIG.
infra/docs-gen/src/extractor/extension-sources.mjs
infra/docs-gen/src/extractor/source-files.mjs
infra/docs-gen/src/extractor/scan.mjs
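
Separating production files from test files, as described above, might be done with a filename check like the one sketched here. The `.test.ts`/`.test.tsx` convention, the file record shape (`{path, content}`), and the names `groupSources` and `joinContent` are assumptions for illustration.

```javascript
// Assumed convention: files ending in .test.ts or .test.tsx are tests.
const TEST_FILE_RE = /\.test\.tsx?$/;

// Split file records into production sources and tests.
function groupSources(files) {
    const groups = {source: [], tests: []};
    for (const file of files) {
        (TEST_FILE_RE.test(file.path) ? groups.tests : groups.source).push(file);
    }
    return groups;
}

// Concatenate file contents so a scanner can treat a group as one text.
function joinContent(files) {
    return files.map((file) => file.content).join('\n');
}

// Made-up file records.
const groups = groupSources([
    {path: 'Bold/index.ts', content: 'export const Bold = 1;'},
    {path: 'Bold/Bold.test.ts', content: 'same("**x**");'},
    {path: 'Bold/const.ts', content: 'export const boldType = "strong";'},
]);
const productionText = joinContent(groups.source);
```

Keeping tests in a separate group lets the example extractor read test files without their contents leaking into schema or action detection.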
Add extraction of serializer examples, presets, and generation of Markdown output for each extension.
  • Implement examples.mjs to parse test files, collect local string bindings, and resolve literals/template strings/joins used in same(...) calls into markdown examples.
  • Implement presets.mjs to scan editor preset files, accumulate inherited extension membership, and expose getPresetsForExtension.
  • Implement markdown-gen.mjs and output.mjs to render per-extension raw Markdown (frontmatter, sections, tables, examples) and write extensions.json plus raw/*.md files.
infra/docs-gen/src/extractor/examples.mjs
infra/docs-gen/src/extractor/presets.mjs
infra/docs-gen/src/extractor/markdown-gen.mjs
infra/docs-gen/src/extractor/output.mjs
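
Accumulating inherited preset membership can be sketched as a small recursive resolution over a preset map. The preset model (`{extends, extensions}`), the function name `collectPresetMembership`, the two-argument form of `getPresetsForExtension`, and the sample preset and extension names are assumptions; the PR obtains this data by scanning the editor's preset files.

```javascript
// Hypothetical preset model: {extends?: string, extensions: string[]}.
function collectPresetMembership(presets) {
    const resolved = new Map();
    const resolve = (name, seen = new Set()) => {
        if (resolved.has(name)) return resolved.get(name);
        if (seen.has(name)) throw new Error(`Cyclic preset inheritance at ${name}`);
        seen.add(name);
        const preset = presets[name];
        if (!preset) throw new Error(`Unknown preset: ${name}`);
        // A preset contains everything its parent contains, plus its own extensions.
        const inherited = preset.extends ? resolve(preset.extends, seen) : new Set();
        const all = new Set([...inherited, ...preset.extensions]);
        resolved.set(name, all);
        return all;
    };
    for (const name of Object.keys(presets)) resolve(name);
    return resolved;
}

// List every preset whose resolved membership includes the extension.
function getPresetsForExtension(membership, extension) {
    return [...membership]
        .filter(([, extensions]) => extensions.has(extension))
        .map(([name]) => name);
}

// Made-up preset chain: zero <- commonmark <- yfm.
const membership = collectPresetMembership({
    zero: {extensions: ['BaseSchema']},
    commonmark: {extends: 'zero', extensions: ['Bold']},
    yfm: {extends: 'commonmark', extensions: ['YfmNote']},
});
const boldPresets = getPresetsForExtension(membership, 'Bold');
```

Memoizing resolved presets in a Map avoids re-walking the inheritance chain for every extension lookup.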
Document the docs-gen and extractor architecture and wire a lightweight logger.
  • Add top-level README.md under infra/docs-gen describing commands, main files, extractor modules, and outputs.
  • Add extractor/README.md and EXTRACTION_PIPELINE.md with an English/Russian overview and a Mermaid diagram of the extraction pipeline.
  • Introduce logger.mjs as a tiny console logger used by CLI and extractor.
  • Update pnpm-lock.yaml for the new TypeScript dependency.
infra/docs-gen/README.md
infra/docs-gen/src/extractor/README.md
infra/docs-gen/EXTRACTION_PIPELINE.md
infra/docs-gen/src/logger.mjs
pnpm-lock.yaml
Add unit tests for key extractor behaviors and wire a test script.
  • Introduce Node test files for actions, CLI args, constants, examples, keymaps, schema, and options extraction to validate behavior against real repo sources.
  • Update package.json to add a test script that runs node --test over src/extractor/*.test.mjs.
infra/docs-gen/src/extractor/actions.test.mjs
infra/docs-gen/src/extractor/cli.test.mjs
infra/docs-gen/src/extractor/constants.test.mjs
infra/docs-gen/src/extractor/examples.test.mjs
infra/docs-gen/src/extractor/keymaps.test.mjs
infra/docs-gen/src/extractor/options.test.mjs
infra/docs-gen/src/extractor/schema.test.mjs
infra/docs-gen/package.json

gravity-ui Bot commented Jun 14, 2026

Storybook Deployed

gravity-ui Bot commented Jun 14, 2026

🎭 Playwright Report