feat: scaffold Backstage application #1

Open
scotwells wants to merge 12 commits into main from feat/initial-backstage-scaffold

@scotwells scotwells commented Jun 25, 2026

Summary

Scaffolds Datum's Backstage developer portal and software catalog, and wires up CI to publish the backend image plus a kustomize deployment bundle via the shared datum-cloud/actions reusable workflows, gated by a kind-based end-to-end test that actually boots Backstage.

What's included

  • A standard Backstage monorepo generated with the official @backstage/create-app scaffolder: packages/app (frontend), packages/backend (backend), plus app-config.yaml, app-config.production.yaml, example entities/template, and the backend Dockerfile.
  • A Datum-specific README.md with local-dev (yarn install && yarn dev) and build instructions.

Image + bundle publishing (shared actions)

  • .github/workflows/publish.yaml replaces the bespoke build-image.yaml and uses the datum-cloud/actions reusable workflows pinned at v1.16.0:
    • publish-docker.yaml builds and pushes the backend image to ghcr.io/datum-cloud/backstage (context ., dockerfile packages/backend/Dockerfile, linux/amd64).
    • publish-kustomize-bundle.yaml then publishes the config/ tree as an OCI artifact to oci://ghcr.io/datum-cloud/backstage-kustomize, pinning the freshly built image into config/base via kustomize edit set image.
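
As a rough illustration of the chaining described above, publish.yaml could look something like the sketch below. The job names, the `needs` ordering, and the `permissions` blocks are assumptions, and the triggers mirror the gating described under Security hardening. The `with:` inputs are deliberately left out because the input names are defined by datum-cloud/actions and are not stated in this description.

```yaml
# .github/workflows/publish.yaml (illustrative sketch, not the committed file)
name: Publish

on:
  push:
    branches: [main]
    tags: ['v*']
  release:
    types: [published]

jobs:
  publish-image:                 # hypothetical job name
    permissions:
      contents: read
      packages: write            # assumed: required to push to ghcr.io
    uses: datum-cloud/actions/.github/workflows/publish-docker.yaml@v1.16.0
    # with: the image name (ghcr.io/datum-cloud/backstage), build context (.),
    # Dockerfile path (packages/backend/Dockerfile), and platform (linux/amd64)
    # are passed here; the exact input names are not given in this PR.

  publish-bundle:                # hypothetical job name
    needs: publish-image         # assumed: the bundle pins the image built above
    permissions:
      contents: read
      packages: write
    uses: datum-cloud/actions/.github/workflows/publish-kustomize-bundle.yaml@v1.16.0
    # with: the bundle target (oci://ghcr.io/datum-cloud/backstage-kustomize)
    # and the config/ path are passed here; input names are likewise unknown.
```

Keeping the bundle job dependent on the image job is what would let the publisher pin a tag that already exists in the registry.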

Dockerfile switched to self-contained multi-stage

  • The scaffold shipped Backstage's host-build Dockerfile, which only COPYs a pre-built skeleton.tar.gz/bundle.tar.gz and assumes yarn install && yarn tsc && yarn build:backend already ran on the host. The shared publish workflow and the e2e job both run a plain docker build with no host yarn step, so that variant would fail.
  • The Dockerfile is now the official multi-stage variant, which runs yarn install --immutable, yarn tsc, and the backend build inside the image from the repo-root context. A plain docker build -f packages/backend/Dockerfile . now succeeds (verified locally end-to-end).
  • The scaffold also shipped a stale Yarn v1 format yarn.lock that made yarn install --immutable fail under Yarn 4. It has been regenerated as a valid Yarn 4 (Berry) lockfile, and .dockerignore no longer excludes packages/*/src or plugins so the in-image build has its sources.

Deployment bundle (config/)

  • config/base/: namespace, service account, a ClusterIP service on 7007, and a Deployment running the backend image with readiness/liveness probes on the new-backend health endpoints (/.backstage/health/v1/readiness and /.backstage/health/v1/liveness). The kustomization carries an images entry for ghcr.io/datum-cloud/backstage so the bundle publisher can pin the built tag. Database and app-config come from the overlay, not hardcoded here.
  • config/test/: an e2e overlay that boots Backstage without GCP secrets or SSO. It adds an ephemeral postgres:16-alpine, an app-config.e2e.yaml ConfigMap (guest auth, default auth policy disabled, pg database pointed at the in-cluster postgres, github integrations/providers removed, a small static example catalog), and a Deployment patch that mounts the config, wires the DB env, and pins the image to the kind-loaded backstage:e2e.
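
The layout above can be sketched roughly as follows. This is a minimal sketch rather than the committed manifests: the file names, the `backstage` namespace, the ConfigMap name, the use of `configMapGenerator`, the environment variable names, and the catalog file path are all assumptions made for illustration.

```yaml
# config/base/kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: backstage               # assumed namespace name
resources:                         # assumed file names
  - namespace.yaml
  - serviceaccount.yaml
  - service.yaml
  - deployment.yaml
images:
  # The entry that `kustomize edit set image` rewrites at publish time.
  - name: ghcr.io/datum-cloud/backstage
    newTag: latest                 # placeholder until the publisher pins the built tag
---
# config/test/kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
  - postgres.yaml                  # ephemeral postgres:16-alpine (assumed file name)
configMapGenerator:                # assumed; a plain ConfigMap resource would also work
  - name: backstage-app-config
    files:
      - app-config.e2e.yaml
patches:
  # Mounts the config, wires the DB env, and pins the image to backstage:e2e.
  - path: deployment-patch.yaml
---
# config/test/app-config.e2e.yaml (illustrative excerpt)
backend:
  auth:
    dangerouslyDisableDefaultAuthPolicy: true   # default auth policy disabled
  database:
    client: pg
    connection:                    # env var names are assumptions
      host: ${POSTGRES_HOST}
      port: ${POSTGRES_PORT}
      user: ${POSTGRES_USER}
      password: ${POSTGRES_PASSWORD}
auth:
  providers:
    guest:
      dangerouslyAllowOutsideDevelopment: true  # guest auth outside dev mode
catalog:
  locations:
    - type: file
      target: ./examples/entities.yaml          # assumed path to the static example catalog
```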

Kind e2e gate

  • .github/workflows/e2e.yaml runs on pull_request and pushes to main. It builds the backend image (buildx with GitHub Actions layer caching so the slow yarn install / tsc / build:backend layers are reused across runs), loads it into a kind cluster, applies config/test, waits for the postgres and backstage rollouts, and asserts the readiness, liveness, and catalog API endpoints all return HTTP 200. On failure it dumps pods, the deployment description, and backend logs.
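
The flow described above could be sketched along these lines. Everything not named in this description is an assumption: the job name, the kind cluster name, the `backstage` namespace and resource names, postgres running as a Deployment, the timeouts, `mode=max`, and `/api/catalog/entities` as the catalog endpoint being checked. `<sha>` stands in for the pinned commit SHAs and is not a real revision.

```yaml
# .github/workflows/e2e.yaml (illustrative sketch, not the committed file)
name: e2e

on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  kind-e2e:                          # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<sha>
      - uses: docker/setup-buildx-action@<sha>
      - name: Build backend image
        uses: docker/build-push-action@<sha>
        with:
          context: .
          file: packages/backend/Dockerfile
          load: true                 # keep the image in the local Docker daemon
          tags: backstage:e2e
          # A dedicated scope keeps PR layers out of other builds' caches.
          cache-from: type=gha,scope=backstage-e2e
          cache-to: type=gha,mode=max,scope=backstage-e2e
      - uses: helm/kind-action@<sha>
        with:
          cluster_name: e2e          # assumed cluster name
      - name: Deploy the test overlay
        run: |
          kind load docker-image backstage:e2e --name e2e
          kubectl apply -k config/test
          kubectl -n backstage rollout status deployment/postgres --timeout=180s
          kubectl -n backstage rollout status deployment/backstage --timeout=300s
      - name: Assert endpoints return 200
        run: |
          kubectl -n backstage port-forward service/backstage 7007:7007 &
          sleep 5
          for path in /.backstage/health/v1/readiness /.backstage/health/v1/liveness /api/catalog/entities; do
            code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:7007${path}")
            test "$code" = "200" || { echo "${path} returned ${code}"; exit 1; }
          done
      - name: Dump diagnostics
        if: failure()
        run: |
          kubectl -n backstage get pods -o wide
          kubectl -n backstage describe deployment backstage
          kubectl -n backstage logs deployment/backstage --tail=200
```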

Verified locally

  • docker build -f packages/backend/Dockerfile . succeeds and produces a runnable image.
  • A real kind cluster came up, deployed config/test, Backstage rolled out against postgres, the health endpoints returned 200, and the catalog API served the static example entities.

Notes

  • Dependencies are not committed; node_modules are installed in CI / inside the image build.
  • The published backend image is ghcr.io/datum-cloud/backstage; the published bundle is ghcr.io/datum-cloud/backstage-kustomize.
  • The catalog org-model and production deploy wiring for this app live in the datum-cloud/infra repo, not here.

Design proposal: https://github.com/datum-cloud/infra/blob/main/docs/enhancements/backstage-service-catalog/README.md

Branding

Applies Datum's brand to the app (new frontend system, theme extensions):

  • Datum-branded light and dark themes in packages/app/src/theme/datum.ts (createUnifiedTheme): Midnight Fjord #0C1D31 primary and navy sidebar, Pine Forge #4D6356 secondary, Aurora Moss #E6F59F selected-indicator (paired with dark/white text, never white text on the lime), flat navy page headers (no gradient), and status colors. Registered via ThemeBlueprint in a new packages/app/src/modules/theme module; the built-in Backstage themes are disabled in app-config.yaml and Datum Light is the default.
  • Datum logos inlined into packages/app/src/modules/nav/LogoFull.tsx (lime mark + white wordmark for the navy sidebar) and LogoIcon.tsx (lime mark for the collapsed sidebar). Added packages/app/public/favicon.svg (navy mark) referenced from index.html, with the generic Backstage favicon.ico left as fallback. App title set to "Datum Portal".
  • Fonts: the theme lists the brand families (Canela Text for headings, Alliance No.1 for body) first for forward compatibility, but the app currently renders with the fallback stacks (Georgia serif for headings, system sans for body), which render cleanly on their own. The licensed brand fonts are intentionally not bundled because this is a public repo, so no .woff2 files are committed. The brand fonts can be added out-of-band later.
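
For orientation, the app-config.yaml side of the branding might look roughly like the excerpt below. The two extension IDs are an assumption about how the built-in light and dark themes are named in the new frontend system, and the excerpt does not show how Datum Light is made the default.

```yaml
# app-config.yaml (illustrative excerpt)
app:
  title: Datum Portal
  extensions:
    # Assumed IDs of the built-in Backstage theme extensions being disabled.
    - theme:app/light: false
    - theme:app/dark: false
organization:
  name: Datum
```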

Security hardening

Applies a batch of security findings. Auth/guest config is intentionally untouched — an OIDC proxy will front the app at deploy time.

  • Gate publishing (H-2): publish.yaml no longer triggers on pull requests or arbitrary branch pushes. The image and kustomize bundle now ship only on main, v* tags, and published releases. PR validation is covered by the e2e build.
  • Isolate PR build cache (M-1): e2e.yaml builds untrusted PR code, so its buildx GHA cache now uses its own scope=backstage-e2e for both cache-from and cache-to, preventing it from poisoning the trusted publish build's cache.
  • e2e least-privilege (L-1): Added a top-level permissions: { contents: read } to e2e.yaml.
  • SHA-pin actions (L-2): Every third-party marketplace action in e2e.yaml (actions/checkout, docker/setup-buildx-action, docker/build-push-action, helm/kind-action) is pinned to a full commit SHA with a trailing version comment, matching the org convention. First-party datum-cloud/actions/...@v1.16.0 reusable workflows stay tag-pinned by convention.
  • Deployment hardening (M-2): config/base/deployment.yaml now disables service-account token automount, runs as non-root uid/gid 1000 with fsGroup and the RuntimeDefault seccomp profile, drops all capabilities, disables privilege escalation, and mounts the root filesystem read-only. Writable emptyDirs are mounted at /tmp and /home/node/.cache so the backend can write under readOnlyRootFilesystem. CPU/memory requests and a CPU limit were added alongside the memory limit. Validated end-to-end on kind: the backend rolls out with zero restarts, readOnlyRootFilesystem holds, and readiness/liveness/catalog all return 200.
  • Snyk scanning: New snyk-scan.yaml calls the shared datum-cloud/actions reusable workflow on pushes to main, pull requests, and a weekly schedule, with the permissions needed for SARIF upload. Prerequisite: this requires a SNYK_TOKEN org secret to be present; the workflow will fail authentication until it is added.
  • Dependabot: New .github/dependabot.yml covering the npm (root, weekly, grouped minor/patch) and github-actions (weekly) ecosystems.
  • Dependency advisories (M-4): Added root resolutions to force-bump the three high-severity packages to patched releases within their compatible majors (undici 7.24.7 → 7.28.0, scoped to the 7.x descriptor; protobufjs → 7.6.4; tar → 7.5.16) plus a yarn dedupe. High-severity advisories dropped from 3 affected packages (14 advisory entries) to 0; typecheck, app build, and the kind e2e all stay green.
  • Repo settings: Secret scanning and push protection are enabled on the repo.
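
To make the Deployment hardening item (M-2) concrete, a hardened base Deployment might look something like the sketch below. The resource names, labels, fsGroup value, volume names, and all CPU/memory figures are placeholders chosen for illustration, not values taken from the committed manifest.

```yaml
# config/base/deployment.yaml (illustrative sketch; names and resource values are assumptions)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage
  namespace: backstage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backstage
  template:
    metadata:
      labels:
        app: backstage
    spec:
      serviceAccountName: backstage
      automountServiceAccountToken: false   # no API token mounted into the pod
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000                       # assumed to match the uid/gid
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: backstage
          image: ghcr.io/datum-cloud/backstage
          ports:
            - name: http
              containerPort: 7007
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          readinessProbe:
            httpGet:
              path: /.backstage/health/v1/readiness
              port: http
          livenessProbe:
            httpGet:
              path: /.backstage/health/v1/liveness
              port: http
          resources:                        # placeholder values
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
          volumeMounts:                     # writable scratch space under a read-only root
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /home/node/.cache
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir: {}
```

The two emptyDir mounts are what allow `readOnlyRootFilesystem: true` to hold without the backend failing on scratch or cache writes.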

Next steps

  • Wire app-config catalog providers, auth, and Kubernetes integration for production.
  • Add the production overlay / deploy wiring and populate the catalog org-model in datum-cloud/infra.
  • Add the SNYK_TOKEN org secret so the Snyk scan can authenticate.

https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

scotwells added 12 commits June 25, 2026 10:44

Generate Datum's Backstage developer portal and software catalog using the
official @backstage/create-app scaffolder. The repo root is the Backstage
monorepo with packages/app (frontend) and packages/backend (backend).

Dependencies are installed in CI rather than committed (--skip-install), so
node_modules are not part of the repository.

Add a CI workflow that builds the backend image with the host-build flow and
publishes it to ghcr.io/datum-cloud/backstage on pushes to main and on tags.
Replace the default README with Datum-specific run/build instructions and
links to the design proposal.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Replace the host-build backend Dockerfile, which only COPYs a pre-built
skeleton.tar.gz/bundle.tar.gz and assumes yarn install/tsc/build:backend
already ran on the host, with the official self-contained multi-stage
Dockerfile. The new image runs yarn install --immutable, tsc, and the
backend build inside the build stage from the repo-root context, so CI can
build it with a plain `docker build -f packages/backend/Dockerfile .`.

Key changes:
- Regenerate yarn.lock as a valid Yarn 4 (Berry) lockfile; the scaffold
  shipped a stale Yarn v1 format lockfile that made `yarn install
  --immutable` fail inside the image.
- Stop excluding packages/*/src and plugins from the build context in
  .dockerignore so the in-image build has the sources it needs.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Drop the bespoke build-image workflow and use the datum-cloud/actions
reusable workflows. publish-docker.yaml builds and pushes the backend
image; publish-kustomize-bundle.yaml then publishes the config/ tree to
oci://ghcr.io/datum-cloud/backstage-kustomize with the freshly built image
pinned into config/base.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Add the config/ tree that is published as the backstage-kustomize OCI
bundle and consumed by the e2e gate.

Key changes:
- config/base: namespace, service account, ClusterIP service on 7007, and
  a Deployment running the backend image with readiness/liveness probes on
  the new-backend health endpoints (/.backstage/health/v1/readiness and
  /.backstage/health/v1/liveness). The kustomization carries an images
  entry for ghcr.io/datum-cloud/backstage so the bundle publisher can pin
  the built tag.
- config/test: e2e overlay that boots Backstage without GCP secrets or
  SSO. Adds an ephemeral postgres:16-alpine, an app-config.e2e.yaml
  ConfigMap (guest auth, default auth policy disabled, pg database, github
  integrations/providers removed, a small static example catalog), and a
  Deployment patch wiring the DB env and config mount. The image is pinned
  to backstage:e2e so kind uses the locally loaded image.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Add an e2e workflow that proves the deployment manifests actually boot
Backstage. On pull_request and push to main it builds the backend image,
loads it into a kind cluster, applies config/test, waits for the postgres
and backstage rollouts, and asserts the readiness, liveness, and catalog
API endpoints all return HTTP 200.

Key changes:
- Build with docker/build-push-action and GitHub Actions layer caching
  (cache-from/cache-to type=gha) so the expensive yarn install / tsc /
  build:backend layers are reused across PR runs.
- Dump pods, deployment description, and backend logs on failure.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Apply Datum's brand palette and typography to the Backstage app using the
new frontend system theme extensions.

Key changes:
- Add packages/app/src/theme/datum.ts with createUnifiedTheme light/dark
  themes: Midnight Fjord primary, Pine Forge secondary, Aurora Moss sidebar
  indicator, flat navy page headers, and brand status colors
- Register both themes via ThemeBlueprint in a new theme frontend module and
  disable the built-in Backstage themes in app-config.yaml
- Set app title to "Datum Portal" and organization to "Datum"
- Reference brand font-family names (Canela Text headings, Alliance No.1
  body) with solid fallbacks; the app renders with the fallback fonts because
  the licensed brand fonts are intentionally not bundled in this public repo

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Replace the placeholder logos with the official Datum mark and wordmark.

Key changes:
- LogoFull renders the Datum dark wordmark (lime mark, white text) for the
  expanded navy sidebar
- LogoIcon renders the Datum mark recolored to Aurora Moss lime for the
  collapsed sidebar

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Key changes:
- Add packages/app/public/favicon.svg with the navy Datum mark and reference
  it from index.html (the generic favicon.ico stays as fallback)
- Default the document title to "Datum Portal" and set theme-color to
  Midnight Fjord

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Restrict the publish workflow so the image and kustomize bundle only ship
on main, version tags, and releases -- never on arbitrary branch pushes or
pull requests. PR validation is covered by the e2e build.

Harden e2e.yaml, which builds untrusted PR code:
- Give the buildx GHA cache its own scope (backstage-e2e) so it cannot be
  cross-restored into the trusted publish build.
- Add a top-level least-privilege permissions block (contents: read).
- SHA-pin all third-party marketplace actions with a trailing version
  comment, matching the datum-cloud org convention.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Apply pod and container security hardening to the base Deployment:
- Pod: disable service account token automount, run as non-root uid/gid
  1000, set fsGroup and the RuntimeDefault seccomp profile.
- Container: drop all capabilities, disable privilege escalation, and
  mount the root filesystem read-only.
- Mount writable emptyDir volumes at /tmp and /home/node/.cache so the
  backend can write scratch and cache data under readOnlyRootFilesystem.
- Add CPU/memory requests and a CPU limit alongside the memory limit.

Validated end-to-end on kind: the backend rolls out clean with no
restarts and readiness, liveness, and catalog endpoints return 200.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Add a Snyk security scan workflow calling the shared datum-cloud reusable
workflow on pushes to main, pull requests, and a weekly schedule, with the
permissions required for SARIF upload to GitHub code scanning. Requires a
SNYK_TOKEN org secret to be present.

Add Dependabot config for the npm and github-actions ecosystems, both on a
weekly cadence with grouped minor/patch npm updates.
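
A Dependabot configuration matching this description might look roughly like the sketch below. The group name and the catch-all pattern are assumptions; only the ecosystems, the weekly cadence, and the minor/patch grouping come from the text above.

```yaml
# .github/dependabot.yml (illustrative sketch)
version: 2
updates:
  - package-ecosystem: npm
    directory: /
    schedule:
      interval: weekly
    groups:
      minor-and-patch:             # hypothetical group name
        patterns:
          - "*"                    # assumed: group every npm dependency
        update-types:
          - minor
          - patch
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
```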

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG

Force-bump the three packages carrying high-severity advisories to patched
releases within their compatible majors via root resolutions, and run yarn
dedupe to collapse duplicate ranges:
- undici 7.24.7 -> 7.28.0 (scoped to the 7.x descriptor; 6.x untouched)
- protobufjs -> 7.6.4
- tar -> 7.5.16

Drops the high-severity advisory count from 3 affected packages (14
advisory entries) to 0. Typecheck, app build, and the kind e2e all pass
with the bumped dependencies.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG