feat: scaffold Backstage application #1
Open
scotwells wants to merge 12 commits into
Generate Datum's Backstage developer portal and software catalog using the official @backstage/create-app scaffolder. The repo root is the Backstage monorepo with packages/app (frontend) and packages/backend (backend). Dependencies are installed in CI rather than committed (--skip-install), so node_modules are not part of the repository. Add a CI workflow that builds the backend image with the host-build flow and publishes it to ghcr.io/datum-cloud/backstage on pushes to main and on tags. Replace the default README with Datum-specific run/build instructions and links to the design proposal. Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Replace the host-build backend Dockerfile, which only COPYs a pre-built skeleton.tar.gz/bundle.tar.gz and assumes yarn install/tsc/build:backend already ran on the host, with the official self-contained multi-stage Dockerfile. The new image runs yarn install --immutable, tsc, and the backend build inside the build stage from the repo-root context, so CI can build it with a plain `docker build -f packages/backend/Dockerfile .`.

Key changes:
- Regenerate yarn.lock as a valid Yarn 4 (Berry) lockfile; the scaffold shipped a stale Yarn v1 format lockfile that made `yarn install --immutable` fail inside the image.
- Stop excluding packages/*/src and plugins from the build context in .dockerignore so the in-image build has the sources it needs.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Drop the bespoke build-image workflow and use the datum-cloud/actions reusable workflows. publish-docker.yaml builds and pushes the backend image; publish-kustomize-bundle.yaml then publishes the config/ tree to oci://ghcr.io/datum-cloud/backstage-kustomize with the freshly built image pinned into config/base. Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
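As a rough illustration of the image-pinning step described above, a `config/base/kustomization.yaml` could look like the sketch below. Only the image name `ghcr.io/datum-cloud/backstage` and the `deployment.yaml` file name come from this PR; the other resource file names and the tag value are placeholders, and the real file may be laid out differently.

```yaml
# config/base/kustomization.yaml (sketch; resource file names other than
# deployment.yaml are assumed for illustration)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - serviceaccount.yaml
  - service.yaml
  - deployment.yaml
images:
  # The bundle publisher is expected to rewrite this entry, for example with:
  #   kustomize edit set image ghcr.io/datum-cloud/backstage=ghcr.io/datum-cloud/backstage:<tag>
  - name: ghcr.io/datum-cloud/backstage
    newTag: v0.0.0 # placeholder tag, replaced at publish time
```

Keeping an `images` entry in the base means the Deployment can reference the bare image name while the tag is set in one place at publish time.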
Add the config/ tree that is published as the backstage-kustomize OCI bundle and consumed by the e2e gate.

Key changes:
- config/base: namespace, service account, ClusterIP service on 7007, and a Deployment running the backend image with readiness/liveness probes on the new-backend health endpoints (/.backstage/health/v1/readiness and /.backstage/health/v1/liveness). The kustomization carries an images entry for ghcr.io/datum-cloud/backstage so the bundle publisher can pin the built tag.
- config/test: e2e overlay that boots Backstage without GCP secrets or SSO. Adds an ephemeral postgres:16-alpine, an app-config.e2e.yaml ConfigMap (guest auth, default auth policy disabled, pg database, github integrations/providers removed, a small static example catalog), and a Deployment patch wiring the DB env and config mount. The image is pinned to backstage:e2e so kind uses the locally loaded image.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
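To make the e2e overlay's settings concrete, the ConfigMap could be shaped roughly as follows. This is a sketch, not the committed file: the ConfigMap name, the `POSTGRES_*` environment variable names, and the catalog file path are assumptions, and the guest-provider flag is shown only because guest sign-in is normally limited to development builds.

```yaml
# config/test ConfigMap (sketch; object name, env var names, and the catalog
# target path are assumptions)
apiVersion: v1
kind: ConfigMap
metadata:
  name: backstage-app-config-e2e
data:
  app-config.e2e.yaml: |
    backend:
      auth:
        # Disables the default auth policy so the e2e checks can call the
        # catalog API without credentials
        dangerouslyDisableDefaultAuthPolicy: true
      database:
        client: pg
        connection:
          # Values are substituted from the env wired in by the Deployment patch
          host: ${POSTGRES_HOST}
          port: ${POSTGRES_PORT}
          user: ${POSTGRES_USER}
          password: ${POSTGRES_PASSWORD}
    auth:
      providers:
        guest:
          # Allows guest sign-in in a production-mode build
          dangerouslyAllowOutsideDevelopment: true
    catalog:
      locations:
        - type: file
          target: ./examples/entities.yaml
```

These settings are only suitable for a throwaway test cluster; the base manifests deliberately leave database and app-config to the overlay.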
Add an e2e workflow that proves the deployment manifests actually boot Backstage. On pull_request and push to main it builds the backend image, loads it into a kind cluster, applies config/test, waits for the postgres and backstage rollouts, and asserts the readiness, liveness, and catalog API endpoints all return HTTP 200.

Key changes:
- Build with docker/build-push-action and GitHub Actions layer caching (cache-from/cache-to type=gha) so the expensive yarn install / tsc / build:backend layers are reused across PR runs.
- Dump pods, deployment description, and backend logs on failure.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
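The flow above can be sketched as a single workflow job. The job name, the `backstage` namespace, the Deployment and Service names, the timeouts, the `mode=max` cache setting, and the use of `/api/catalog/entities` as the catalog check are all assumptions made for illustration; the action versions are shown as major tags here rather than the pinned references the real workflow uses.

```yaml
# .github/workflows/e2e.yaml (sketch; names, namespace, and timeouts are assumed)
name: e2e
on:
  pull_request:
  push:
    branches: [main]
jobs:
  kind-e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Build backend image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: packages/backend/Dockerfile
          tags: backstage:e2e
          load: true # keep the image in the local Docker daemon for kind
          cache-from: type=gha
          cache-to: type=gha,mode=max # mode=max also caches build-stage layers
      - uses: helm/kind-action@v1
        with:
          cluster_name: e2e
      - name: Deploy test overlay
        run: |
          kind load docker-image backstage:e2e --name e2e
          kubectl apply -k config/test
          kubectl -n backstage rollout status deployment/postgres --timeout=180s
          kubectl -n backstage rollout status deployment/backstage --timeout=300s
      - name: Assert endpoints return 200
        run: |
          kubectl -n backstage port-forward svc/backstage 7007:7007 &
          sleep 5
          for path in /.backstage/health/v1/readiness /.backstage/health/v1/liveness /api/catalog/entities; do
            code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:7007${path}")
            echo "${path} -> ${code}"
            test "${code}" = "200"
          done
      - name: Dump diagnostics
        if: failure()
        run: |
          kubectl -n backstage get pods -o wide
          kubectl -n backstage describe deployment backstage
          kubectl -n backstage logs deployment/backstage --tail=200
```

Loading the image with `kind load docker-image` avoids needing a registry for the test, which is why the overlay pins the image to the local `backstage:e2e` tag.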
Apply Datum's brand palette and typography to the Backstage app using the new frontend system theme extensions.

Key changes:
- Add packages/app/src/theme/datum.ts with createUnifiedTheme light/dark themes: Midnight Fjord primary, Pine Forge secondary, Aurora Moss sidebar indicator, flat navy page headers, and brand status colors
- Register both themes via ThemeBlueprint in a new theme frontend module and disable the built-in Backstage themes in app-config.yaml
- Set app title to "Datum Portal" and organization to "Datum"
- Reference brand font-family names (Canela Text headings, Alliance No.1 body) with solid fallbacks; the app renders with the fallback fonts because the licensed brand fonts are intentionally not bundled in this public repo

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Replace the placeholder logos with the official Datum mark and wordmark.

Key changes:
- LogoFull renders the Datum dark wordmark (lime mark, white text) for the expanded navy sidebar
- LogoIcon renders the Datum mark recolored to Aurora Moss lime for the collapsed sidebar

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Key changes:
- Add packages/app/public/favicon.svg with the navy Datum mark and reference it from index.html (the generic favicon.ico stays as fallback)
- Default the document title to "Datum Portal" and set theme-color to Midnight Fjord

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Restrict the publish workflow so the image and kustomize bundle only ship on main, version tags, and releases -- never on arbitrary branch pushes or pull requests. PR validation is covered by the e2e build.

Harden e2e.yaml, which builds untrusted PR code:
- Give the buildx GHA cache its own scope (backstage-e2e) so it cannot be cross-restored into the trusted publish build.
- Add a top-level least-privilege permissions block (contents: read).
- SHA-pin all third-party marketplace actions with a trailing version comment, matching the datum-cloud org convention.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
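The trigger restriction and the e2e hardening could look roughly like the two fragments below. This is a sketch under assumptions: the job name is hypothetical, `<full-commit-sha>` and the version comment are placeholders rather than real pins, and `mode=max` is not stated in this PR.

```yaml
# .github/workflows/publish.yaml: triggers only (sketch)
on:
  push:
    branches: [main]
    tags: ["v*"]
  release:
    types: [published]
---
# .github/workflows/e2e.yaml: hardening-related fragments (sketch)
permissions:
  contents: read # least privilege for a job that builds untrusted PR code
jobs:
  kind-e2e: # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # <full-commit-sha> is a placeholder; the real workflow pins a 40-character SHA
      - uses: docker/build-push-action@<full-commit-sha> # v6.x.y
        with:
          context: .
          file: packages/backend/Dockerfile
          tags: backstage:e2e
          load: true
          # A dedicated scope keeps PR-built layers out of the publish build's cache
          cache-from: type=gha,scope=backstage-e2e
          cache-to: type=gha,mode=max,scope=backstage-e2e
```

With no `pull_request` trigger and no wildcard branch filter, the publish workflow never runs for PRs or feature branches.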
Apply pod and container security hardening to the base Deployment:
- Pod: disable service account token automount, run as non-root uid/gid 1000, set fsGroup and the RuntimeDefault seccomp profile.
- Container: drop all capabilities, disable privilege escalation, and mount the root filesystem read-only.
- Mount writable emptyDir volumes at /tmp and /home/node/.cache so the backend can write scratch and cache data under readOnlyRootFilesystem.
- Add CPU/memory requests and a CPU limit alongside the memory limit.

Validated end-to-end on kind: the backend rolls out clean with no restarts and readiness, liveness, and catalog endpoints return 200.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
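Put together, the hardened Deployment could be shaped like the sketch below. The security settings, probe paths, port, and mount paths follow the description above; the object name, labels, volume names, the `fsGroup` value, and all resource quantities are assumptions chosen for illustration.

```yaml
# config/base/deployment.yaml (sketch; names, labels, fsGroup value, and
# resource quantities are assumed)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage
spec:
  selector:
    matchLabels:
      app: backstage
  template:
    metadata:
      labels:
        app: backstage
    spec:
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: backstage
          image: ghcr.io/datum-cloud/backstage
          ports:
            - containerPort: 7007
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          readinessProbe:
            httpGet:
              path: /.backstage/health/v1/readiness
              port: 7007
          livenessProbe:
            httpGet:
              path: /.backstage/health/v1/liveness
              port: 7007
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
          volumeMounts:
            # Writable scratch space, needed because the root filesystem is read-only
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /home/node/.cache
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir: {}
```

The two `emptyDir` mounts are what let `readOnlyRootFilesystem: true` hold without the backend failing on its first write.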
Add a Snyk security scan workflow calling the shared datum-cloud reusable workflow on pushes to main, pull requests, and a weekly schedule, with the permissions required for SARIF upload to GitHub code scanning. Requires a SNYK_TOKEN org secret to be present. Add Dependabot config for the npm and github-actions ecosystems, both on a weekly cadence with grouped minor/patch npm updates. Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
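For reference, a Dependabot config matching the description above might look like this sketch. The group name `minor-and-patch` is an assumption; the ecosystems, weekly cadence, and minor/patch grouping come from the commit message.

```yaml
# .github/dependabot.yml (sketch; the group name is assumed)
version: 2
updates:
  - package-ecosystem: npm
    directory: /
    schedule:
      interval: weekly
    groups:
      minor-and-patch:
        update-types:
          - minor
          - patch
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
```

Grouping minor and patch updates keeps the weekly npm run to a single PR in most cases, while major bumps still arrive individually.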
Force-bump the three packages carrying high-severity advisories to patched releases within their compatible majors via root resolutions, and run yarn dedupe to collapse duplicate ranges:
- undici 7.24.7 -> 7.28.0 (scoped to the 7.x descriptor; 6.x untouched)
- protobufjs -> 7.6.4
- tar -> 7.5.16

Drops the high-severity advisory count from 3 affected packages (14 advisory entries) to 0. Typecheck, app build, and the kind e2e all pass with the bumped dependencies.

Claude-Session: https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG
Summary
Scaffolds Datum's Backstage developer portal and software catalog, and wires up CI to publish the backend image plus a kustomize deployment bundle via the shared `datum-cloud/actions` reusable workflows, gated by a kind-based end-to-end test that actually boots Backstage.

What's included
- Backstage monorepo generated with the official `@backstage/create-app` scaffolder: `packages/app` (frontend), `packages/backend` (backend), plus `app-config.yaml`, `app-config.production.yaml`, example entities/template, and the backend `Dockerfile`.
- Datum-specific `README.md` with local-dev (`yarn install && yarn dev`) and build instructions.

Image + bundle publishing (shared actions)
`.github/workflows/publish.yaml` replaces the bespoke `build-image.yaml` and uses the `datum-cloud/actions` reusable workflows pinned at `v1.16.0`:
- `publish-docker.yaml` builds and pushes the backend image to `ghcr.io/datum-cloud/backstage` (context `.`, dockerfile `packages/backend/Dockerfile`, `linux/amd64`).
- `publish-kustomize-bundle.yaml` then publishes the `config/` tree as an OCI artifact to `oci://ghcr.io/datum-cloud/backstage-kustomize`, pinning the freshly built image into `config/base` via `kustomize edit set image`.

Dockerfile switched to self-contained multi-stage
- The scaffolded host-build Dockerfile only COPYs a pre-built `skeleton.tar.gz`/`bundle.tar.gz` and assumes `yarn install && yarn tsc && yarn build:backend` already ran on the host. The shared publish workflow and the e2e job both run a plain `docker build` with no host yarn step, so that variant would fail.
- The new Dockerfile runs `yarn install --immutable`, `yarn tsc`, and the backend build inside the image from the repo-root context. A plain `docker build -f packages/backend/Dockerfile .` now succeeds (verified locally end-to-end).
- The scaffold shipped a stale Yarn v1 `yarn.lock` that made `yarn install --immutable` fail under Yarn 4. It has been regenerated as a valid Yarn 4 (Berry) lockfile, and `.dockerignore` no longer excludes `packages/*/src` or `plugins` so the in-image build has its sources.

Deployment bundle (`config/`)
- `config/base/`: namespace, service account, a ClusterIP service on 7007, and a Deployment running the backend image with readiness/liveness probes on the new-backend health endpoints (`/.backstage/health/v1/readiness` and `/.backstage/health/v1/liveness`). The kustomization carries an `images` entry for `ghcr.io/datum-cloud/backstage` so the bundle publisher can pin the built tag. Database and app-config come from the overlay, not hardcoded here.
- `config/test/`: an e2e overlay that boots Backstage without GCP secrets or SSO. It adds an ephemeral `postgres:16-alpine`, an `app-config.e2e.yaml` ConfigMap (guest auth, default auth policy disabled, `pg` database pointed at the in-cluster postgres, github integrations/providers removed, a small static example catalog), and a Deployment patch that mounts the config, wires the DB env, and pins the image to the kind-loaded `backstage:e2e`.

Kind e2e gate
`.github/workflows/e2e.yaml` runs on `pull_request` and pushes to `main`. It builds the backend image (buildx with GitHub Actions layer caching so the slow yarn install / tsc / build:backend layers are reused across runs), loads it into a kind cluster, applies `config/test`, waits for the postgres and backstage rollouts, and asserts the readiness, liveness, and catalog API endpoints all return HTTP 200. On failure it dumps pods, the deployment description, and backend logs.

Verified locally
- `docker build -f packages/backend/Dockerfile .` succeeds and produces a runnable image.
- On a kind cluster with `config/test` applied, Backstage rolled out against postgres, the health endpoints returned 200, and the catalog API served the static example entities.

Notes
- Dependencies are not committed; `node_modules` are installed in CI / inside the image build.
- The published image is `ghcr.io/datum-cloud/backstage`; the published bundle is `ghcr.io/datum-cloud/backstage-kustomize`.
- `datum-cloud/infra` repo, not here.

Design proposal: https://github.com/datum-cloud/infra/blob/main/docs/enhancements/backstage-service-catalog/README.md
Branding
Applies Datum's brand to the app (new frontend system, theme extensions):
- Theme: `packages/app/src/theme/datum.ts` (`createUnifiedTheme`): Midnight Fjord `#0C1D31` primary and navy sidebar, Pine Forge `#4D6356` secondary, Aurora Moss `#E6F59F` selected-indicator (paired with dark/white text, never white text on the lime), flat navy page headers (no gradient), and status colors. Registered via `ThemeBlueprint` in a new `packages/app/src/modules/theme` module; the built-in Backstage themes are disabled in `app-config.yaml` and Datum Light is the default.
- Logos and favicon: `packages/app/src/modules/nav/LogoFull.tsx` (lime mark + white wordmark for the navy sidebar) and `LogoIcon.tsx` (lime mark for the collapsed sidebar). Added `packages/app/public/favicon.svg` (navy mark) referenced from `index.html`, with the generic Backstage `favicon.ico` left as fallback. App title set to "Datum Portal".
- Typography: the font stacks list the brand font-family names (`Canela Text` headings, `Alliance No.1` body) first for forward-compatibility, but the app renders with the fallback fonts. The licensed brand fonts are intentionally not bundled because this is a public repo, so no `.woff2` is committed, and the fallback stacks (Georgia serif for headings, system sans for body) render cleanly on their own. The brand fonts can optionally be added out-of-band later.

Security hardening
Applies fixes for a batch of security findings. Auth/guest config is intentionally untouched because an OIDC proxy will front the app at deploy time.
- `publish.yaml` no longer triggers on pull requests or arbitrary branch pushes. The image and kustomize bundle now ship only on `main`, `v*` tags, and published releases. PR validation is covered by the e2e build.
- `e2e.yaml` builds untrusted PR code, so its buildx GHA cache now uses its own `scope=backstage-e2e` for both `cache-from` and `cache-to`, preventing it from poisoning the trusted publish build's cache.
- Added a top-level least-privilege `permissions: { contents: read }` to `e2e.yaml`.
- Every third-party marketplace action in `e2e.yaml` (`actions/checkout`, `docker/setup-buildx-action`, `docker/build-push-action`, `helm/kind-action`) is pinned to a full commit SHA with a trailing version comment, matching the org convention. First-party `datum-cloud/actions/...@v1.16.0` reusable workflows stay tag-pinned by convention.
- `config/base/deployment.yaml` now disables service-account token automount, runs as non-root uid/gid 1000 with `fsGroup` and the `RuntimeDefault` seccomp profile, drops all capabilities, disables privilege escalation, and mounts the root filesystem read-only. Writable `emptyDir`s are mounted at `/tmp` and `/home/node/.cache` so the backend can write under `readOnlyRootFilesystem`. CPU/memory requests and a CPU limit were added. Validated end-to-end on kind: the backend rolls out with zero restarts, `readOnlyRootFilesystem` holds, and readiness/liveness/catalog all return 200.
- `snyk-scan.yaml` calls the shared `datum-cloud/actions` reusable workflow on pushes to `main`, pull requests, and a weekly schedule, with the permissions needed for SARIF upload. Prerequisite: this requires a `SNYK_TOKEN` org secret to be present; the workflow will fail authentication until it is added.
- Added `.github/dependabot.yml` covering the `npm` (root, weekly, grouped minor/patch) and `github-actions` (weekly) ecosystems.
- Added root `resolutions` to force-bump the three high-severity packages to patched releases within their compatible majors (`undici` 7.24.7 → 7.28.0, scoped to the 7.x descriptor; `protobufjs` → 7.6.4; `tar` → 7.5.16) plus a `yarn dedupe`. High-severity advisories dropped from 3 affected packages (14 advisory entries) to 0; typecheck, app build, and the kind e2e all stay green.

Next steps
- `app-config` catalog providers, auth, and Kubernetes integration for production.
- `datum-cloud/infra`.
- Add the `SNYK_TOKEN` org secret so the Snyk scan can authenticate.

https://claude.ai/code/session_01NMSkwUcaTmZr7S5XmV2aFG