Skip to content

feat(installer): reorder — sign in + provision the client before Helm (#838)#279

Merged
LukasWodka merged 3 commits into
developfrom
feat/838-provision-reorder
Jun 24, 2026
Merged

feat(installer): reorder — sign in + provision the client before Helm (#838)#279
LukasWodka merged 3 commits into
developfrom
feat/838-provision-reorder

Conversation

@LukasWodka

Copy link
Copy Markdown
Contributor

What

RFC-0001 #838 (epic backend#830): reorder the installer so the client is provisioned before Helm — the minted credential + derived namespace feed the chart.

New flow in install-k8s.sh:

create_cluster → provision_client (Step 3) → install_client_helm (Step 4/5)

provision.sh (new) — provision_client

  • Browser-auth path: install the CLI (now FATAL — it mints the credential, vs the old non-fatal post-Helm Step 5) → tracebloc login (device flow) → tracebloc client create --credential-file <f>source the 0600 file → export TRACEBLOC_CLIENT_ID + TRACEBLOC_CLIENT_PASSWORD + TB_NAMESPACE into install_client_helm's existing non-interactive env contract. The credential is never printed; the transient file is removed after sourcing and on any error/signal (a failed mint never leaves a secret-bearing file behind).
  • Adopt (re-run): no fresh credential (the existing one stands) — hands only TB_NAMESPACE and lets install_client_helm reconcile from the local values.yaml.
  • Dual-mode preserved: TRACEBLOC_VALUES_FILE or pre-supplied TRACEBLOC_CLIENT_ID+PASSWORD skip sign-in entirely (still install the CLI, non-fatal) — exactly the prior behavior. Unattended installs use this path.

Also: install.sh fetches provision.sh; common.sh print_roadmap renumbered to match (3 sign-in/provision · 4 install · 5 connect) and install_cleanup shreds the transient credential on exit; the step headers + the trailing CLI install were updated/removed.

Tests

scripts/tests/provision.bats (8): dual-mode skip (creds + values), mint hands all three env vars, adopt hands only the namespace, fatal on missing CLI / failed login / no credential file, and a failed create leaves no credential file behind. The install-cli, install-client-helm, and common (install_cleanup) suites stay green; shellcheck clean on provision.sh. (The 2 suite failures — validate_config: valid config passes and _extract_yaml '' escape — are pre-existing on develop, unrelated to this change.)

Gate + deferred

  • Runs end-to-end once the CLI release ships login + create + --credential-file (cli#104 merged; install-cli.sh fetches releases/latest).
  • Adversarially reviewed (reorder/dual-mode safety, credential-handoff security, bats+integration); the should-fix items (cred-file cleanup on error, stale roadmap) are folded in.
  • Deferred (noted, follow-ups): a non-interactive enroll-token login for fully unattended browser-auth installs; the R7 orphan-resume edge (adopt on a rebuilt host with no local values.yaml → it currently falls through to the existing credential prompt).

🤖 Generated with Claude Code

… (#838)

RFC-0001 §6.4 / #838: move CLI install + `tracebloc login` + `tracebloc client
create` to run BEFORE the Helm install, so the minted credential + derived namespace
feed the chart. A new Step 3 (provision_client) sits between create_cluster and
install_client_helm.

- scripts/lib/provision.sh (new): provision_client + _provisioning_preset.
  - Browser-auth path: install the CLI (now FATAL — it mints the credential, vs the
    old non-fatal post-Helm Step 5), `tracebloc login` (device flow), `client create
    --credential-file` -> source the 0600 file -> export TRACEBLOC_CLIENT_ID/PASSWORD/
    TB_NAMESPACE into install_client_helm's existing non-interactive env contract. The
    credential is never printed; the transient file is removed after sourcing AND on
    any error/signal (install_cleanup backstop via _PROVISION_CRED_FILE) so a failed
    mint never leaves a secret-bearing file behind.
  - Adopt (re-run): no fresh credential — hand only TB_NAMESPACE and let
    install_client_helm reconcile the existing release from values.yaml.
  - DUAL-MODE preserved: TRACEBLOC_VALUES_FILE / pre-supplied TRACEBLOC_CLIENT_ID+
    PASSWORD skip sign-in (still installs the CLI, non-fatal). Unattended installs use
    this path (device-flow login is interactive; an enroll-token login is a follow-up).
- install-k8s.sh: source provision.sh (guarded for stale bootstraps); call
  provision_client before install_client_helm; drop the trailing non-fatal CLI install
  (now done in Step 3). install.sh: fetch provision.sh.
- common.sh: renumber the print_roadmap steps to match (3 sign-in/provision, 4 install,
  5 connect); install_cleanup removes the transient credential file on any exit.
- Renumbered the step headers (provision=3, install=4, connect=5); dropped the step
  heading from install_tracebloc_cli (the caller owns it now).

Tests: scripts/tests/provision.bats (8) — dual-mode skip (creds + values), mint hands
all three env vars, adopt hands only namespace, fatal on missing CLI / failed login /
no credential file, and a failed create leaves no credential file behind. install-cli +
install-client-helm + common (install_cleanup) suites still green (the 2 pre-existing
local-env failures — validate_config + _extract_yaml '' — fail on develop too,
unrelated). shellcheck clean on provision.sh.

Depends on cli `client create --credential-file` (cli#104, merged) + a CLI release that
ships login+create; install-cli.sh fetches releases/latest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@saadqbal saadqbal left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary

Read the diff + surrounding code, ran the suite and shellcheck locally (isolated worktree), and verified the cross-repo CLI contract. Careful, well-tested PR with honest disclosure of its gates. All 8 provision.bats pass, install-cli/common suites green, provision.sh shellcheck-clean, and the 2 suite failures it calls out are genuinely pre-existing (the diff doesn't touch validate_config / _extract_yaml_value).

Verified good

  • Env handoff matches install_client_helm's existing _noninteractive_creds path (ID + PASSWORD → =1; adopt unsets both → values.yaml path).
  • CLI contract matches exactly on cli develop: mint→ID+PASSWORD+TB_NAMESPACE (stdout suppressed), adopt→ID+TB_NAMESPACE+ADOPTED=1 (no password), file 0600, parent dir auto-created.
  • No ordering bug: setup_log_file() mkdir's HOST_DATA_DIR early in main(), before provision_client writes the cred file there.
  • install_tracebloc_cli always returns 0 → the non-fatal claim holds under set -e; the browser path's has tracebloc || error is the real fatal gate.
  • Credential cleanup: explicit rm on success + error paths, unset after, EXIT-trap backstop. "failed create leaves no file" test confirms it.
  • Step renumber (1,2 → 3 → 4,5) consistent; install_tracebloc_cli step-framing correctly relocated to the caller, no orphaned caller.

Two items flagged inline (release sequencing for main; legacy-machine namespace divergence).

Minor / nits

  • verify_credentials on a freshly-minted credential is fatal on invalid(400)/inactive(401) — worth confirming create→auth is read-your-writes consistent on the backend.
  • trap install_cleanup EXIT is EXIT-only; "removed on any error/signal" is best-effort (SIGINT/SIGTERM in the brief mint→rm window is bash-version-dependent). EXIT INT TERM would make the claim true. Low severity.
  • Defense-in-depth: ( umask 077; tracebloc client create … ) would keep the file 0600 even against a future CLI regression. Optional — cli#104 guarantees 0600 today.
  • Test gap: the new _PROVISION_CRED_FILE removal inside install_cleanup (trap backstop) isn't directly tested; provision.bats covers the explicit rm paths only.
  • Adopt comment says it "reconciles from the local values.yaml," but that path actually re-prompts interactively ("Use previous settings as defaults?" + Client ID/password with values.yaml defaults). Functional, but the comment oversells it as silent.

Net: good for develop once the namespace question is settled, with the CLI-release gate as a hard precondition for main.

Reviewed with Claude Code.

Comment thread scripts/lib/provision.sh
Comment thread scripts/lib/provision.sh
saadqbal and others added 2 commits June 24, 2026 20:22
…istency

Folds PR #279 review hardening into the provision-before-Helm reorder:

- provision.sh: run `client create` under `umask 077` so the credential file
  lands 0600 even if a future CLI build regresses on its explicit chmod
  (defense in depth; cli#104 already sets 0600).
- provision.sh: clear TRACEBLOC_CLIENT_ID/PASSWORD/TB_NAMESPACE/ADOPTED before
  sourcing the credential file. The mint case does not write ADOPTED, so a
  stale ADOPTED=1 in the environment would otherwise misroute a fresh mint into
  the adopt branch and drop the just-minted credential. Regression test added.
- install-k8s.sh: trap SIGINT/SIGTERM -> exit so the EXIT-trap credential shred
  always runs on Ctrl-C; the brief mint->source window no longer risks leaving
  the 0600 secret on disk.
- install-k8s.sh: fix leftover "Step 1/4"/"Step 2/4" headers to /5 (the step()
  calls already pass 5; the #838 renumber missed these two).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-place namespace reconcile

Addresses the two review comments on the provision-before-Helm reorder:

- provision.sh (release-sequencing gate): probe the installed CLI for `login`
  + `client create` (cli#104) before taking the browser-auth path. The CLI is
  pulled from the latest RELEASE, which can lag this installer; if it's too old,
  fall back to the proven manual-credential path (install_client_helm prompts)
  instead of hard-failing on an unknown `tracebloc login`. Makes the reorder safe
  regardless of CLI-release timing — the browser flow auto-engages once the
  release lands. Regression test added.

- install-client-helm.sh (duplicate-release on re-run): when the SAME client is
  already installed under a different namespace (e.g. an old installer's fixed
  `tracebloc` ns vs. the #838 minted slug), upgrade that release in place instead
  of forking a second one under the new namespace — a silent fork would
  double-book host capacity and orphan the original. Reuses the guard's
  existing_ns. Regression test added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@LukasWodka LukasWodka merged commit 35462b4 into develop Jun 24, 2026
21 checks passed
@LukasWodka

Copy link
Copy Markdown
Contributor Author

CLI v0.4.0 is published — release-ordering gate cleared. Validated the released binary against this PR's contract on darwin/arm64:

  • tracebloc version0.4.0; login, client create, and --credential-file all present.
  • The _cli_supports_provisioning probe (40ad207) passes against v0.4.0 → provision_client takes the browser-auth path, not the old-CLI fallback (it correctly fell back on v0.3.1).
  • client create --credential-file is auth-gated: without a session it errors not signed in and writes no credential file — no secret leak on the unauthenticated path.

So the "don't promote to main until the CLI release lands" note is satisfied (releases/latest = v0.4.0 ships login + create + --credential-file).

Not validated here (needs a human): the full live install runs create_cluster (k3d) → tracebloc login (device flow — requires a human to approve in a browser; there's no token/non-interactive login yet) → client create (real backend client) → helm install. That end-to-end browser-auth provision is a manual FR on dev. The dual-mode path (TRACEBLOC_CLIENT_ID/PASSWORD) remains the unattended route.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants