Make playbook idempotent #243

Open
mandre wants to merge 3 commits into main from idempotency

mandre (Contributor) commented Jun 23, 2026

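Fix three idempotency issues that show up when the playbook is re-run on an already-deployed system: route replay in nmstatectl apply, the TripleO deploy itself, and SSL certificate regeneration.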

mandre added 3 commits June 23, 2026 11:54
Remove route capture/replay from nmstatectl apply. On re-run after
TripleO deployment, captured routes reference interfaces that nmstate
cannot manage:
- br-hostonly: nmstate considers it an unsupported tun type
- eno1: enslaved to the br-ex OVS bridge after deployment

Without a routes section, nmstatectl apply leaves existing routes
untouched, which is correct for both first run and re-runs.
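
As a rough illustration of the point above, the sketch below builds a desired state that carries no routes key at all. The `strip_routes` helper and the `eno2` interface entry are hypothetical and not taken from the playbook; only the top-level `interfaces` and `routes` keys are meant to mirror the nmstate state layout.

```python
import copy


def strip_routes(state):
    """Return a copy of an nmstate-style state dict without its 'routes' section."""
    desired = copy.deepcopy(state)
    desired.pop("routes", None)
    return desired


# Hypothetical captured state: the route points at an interface that
# nmstate can no longer manage after deployment.
captured = {
    "interfaces": [{"name": "eno2", "type": "ethernet", "state": "up"}],
    "routes": {
        "config": [
            {"destination": "0.0.0.0/0", "next-hop-interface": "eno1"},
        ]
    },
}

desired = strip_routes(captured)
```

With no routes section in the desired state there is nothing that refers to br-hostonly or eno1, so the apply no longer depends on interfaces nmstate cannot manage.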

Also fix checkpoint extraction to use regex_search across the full
output instead of fragile last-line parsing that assumed the checkpoint
was always on the last stdout/stderr line.

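
The difference between the two extraction strategies can be sketched in plain Python, which is roughly what a regex search over the whole output amounts to. The pattern assumes the checkpoint shows up as a NetworkManager D-Bus object path, and the sample output line is made up for illustration; the real nmstatectl output format may differ.

```python
import re

# Assumption: the checkpoint appears as a NetworkManager D-Bus object path.
CHECKPOINT_RE = re.compile(r"/org/freedesktop/NetworkManager/Checkpoint/\d+")


def extract_checkpoint(stdout, stderr=""):
    """Return the first checkpoint path found anywhere in the output, or None."""
    match = CHECKPOINT_RE.search(stdout + "\n" + stderr)
    return match.group(0) if match else None


# Hypothetical output in which the checkpoint is NOT on the last line.
sample_stdout = (
    "Checkpoint: NetworkManager|/org/freedesktop/NetworkManager/Checkpoint/7\n"
    "interfaces: []\n"
)

checkpoint = extract_checkpoint(sample_stdout)
```

Searching the full text works wherever the checkpoint line lands, whereas taking the last line of `sample_stdout` would return `interfaces: []` and miss it.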
On re-run, 'openstack tripleo deploy' fails because the ephemeral
heat-all server becomes unresponsive during stack processing on a
system already running hundreds of OpenStack containers. The single-
threaded eventlet server can't handle API polling requests while
creating nested heat stacks under this load, causing the tripleoclient
to time out with 'Remote end closed connection without response'.

Fix by checking for /etc/openstack/clouds.yaml (created only after a
successful TripleO deployment) and skipping the deploy when it exists.
This is safe because:
- 'make destroy' removes /etc/openstack/, so destroy + redeploy works
- The deploy is inherently non-incremental (tripleo deploy creates a
  full heat stack from scratch each time), so re-running it on an
  already-deployed system provides no benefit
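
The guard described above boils down to a file-existence check. A minimal sketch, assuming the marker path from the commit message; the `should_deploy` name is hypothetical and the playbook itself presumably expresses this as a stat task plus a condition.

```python
from pathlib import Path

# Marker named in the commit message; present only after a successful deploy.
CLOUDS_YAML = Path("/etc/openstack/clouds.yaml")


def should_deploy(marker=CLOUDS_YAML):
    """Deploy only when the marker left by a previous successful run is absent."""
    return not marker.exists()
```

Because 'make destroy' removes /etc/openstack/, the check flips back to "deploy" after a teardown without any extra bookkeeping.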

Also harden the stale heat process cleanup (needed for partial deploy
retries) by using SIGKILL instead of SIGTERM and waiting for processes
to fully terminate before proceeding.

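
The kill-then-wait pattern can be sketched as follows. This is an illustration in Python rather than the playbook's actual tasks; `is_alive` and `kill_and_wait` are hypothetical helpers, and how the stale heat PIDs are discovered is left out.

```python
import os
import signal
import time


def is_alive(pid):
    """Report whether a process with this PID still exists."""
    try:
        # If the process is our own child, reap it so it does not linger as a zombie.
        reaped, _ = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            return False
    except ChildProcessError:
        pass  # not our child; fall back to the signal-0 probe below
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True


def kill_and_wait(pids, timeout=30.0, interval=0.2):
    """SIGKILL every PID, then poll until all are gone or the timeout expires.

    Returns the PIDs that are still alive at the end (empty on success).
    """
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already exited
    deadline = time.monotonic() + timeout
    remaining = [pid for pid in pids if is_alive(pid)]
    while remaining and time.monotonic() < deadline:
        time.sleep(interval)
        remaining = [pid for pid in remaining if is_alive(pid)]
    return remaining
```

SIGKILL cannot be caught or ignored, so an unresponsive server cannot delay its own shutdown. The polling loop covers the short window in which the process has been signalled but has not yet disappeared.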
When install_stack is re-run on an already-deployed system, the simpleca
role generates a new CA in a temp directory (always fresh), and the new
CA cert overwrites /etc/pki/ca-trust/source/anchors/simpleca.crt. But
the TripleO deploy is skipped (clouds.yaml already exists), so the
running services still serve the old server cert signed by the old CA.
This breaks all openstack CLI calls with CERTIFICATE_VERIFY_FAILED.

Fix by moving the tripleo_deployed check before the SSL generation
blocks and guarding them with 'not tripleo_deployed.stat.exists', so
we only generate and install new certs when actually deploying.
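
The ordering fix can be pictured as one check guarding both steps. In this sketch `deployed` stands in for `tripleo_deployed.stat.exists`, and the two callables are hypothetical placeholders for the SSL generation blocks and the deploy task.

```python
def install_stack(deployed, generate_certs, deploy):
    """Run cert generation and deploy together, or skip both.

    Returns the names of the steps that ran, in order.
    """
    if deployed:
        # Keep the CA that the running services' certs were signed with.
        return []
    generate_certs()
    deploy()
    return ["generate_certs", "deploy"]
```

Evaluating the check before cert generation keeps the trusted CA and the served certificates in step: either both are replaced by a fresh deploy, or neither is touched.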