[bundle] Report deploy operations to Deployment Metadata Service (step 5)#5356
Closed
shreyas-goenka wants to merge 6 commits into
Closed
[bundle] Report deploy operations to Deployment Metadata Service (step 5)#5356shreyas-goenka wants to merge 6 commits into
shreyas-goenka wants to merge 6 commits into
Conversation
Pure code-movement refactor. Wraps the existing workspace-filesystem lock
behavior behind a DeploymentLock interface so a follow-up PR can introduce
an alternative metadata-service-backed lock implementation without
touching deploy/destroy/bind callers again.
What changed:
- New bundle/deploy/lock/lock.go: DeploymentLock interface, Goal enum
(moved from release.go), DeploymentStatus enum, and a
NewDeploymentLock factory that unconditionally returns the workspace
filesystem implementation.
- New bundle/deploy/lock/workspace_filesystem.go: workspaceFilesystemLock
struct that implements DeploymentLock. Preserves the historical
behavior of the deleted acquire.go / release.go mutators: lock-disabled
short-circuit, locker.CreateLocker initialization, the
permissions.ReportPossiblePermissionDenied branch on fs.ErrPermission
/ fs.ErrNotExist, and the destroy-mode locker.AllowLockFileNotExist
unlock quirk.
- Deleted bundle/deploy/lock/acquire.go and bundle/deploy/lock/release.go.
- Updated bundle/phases/{deploy,destroy,bind}.go to construct the lock
once via NewDeploymentLock and call Acquire / Release directly instead
of through bundle.ApplyContext. The deferred Release now reports
DeploymentSuccess / DeploymentFailure based on logdiag.HasError so a
future DMS-backed implementation can record the outcome.
Behavior is preserved end-to-end: lock-related acceptance goldens
(pipelines/{deploy,destroy}/force-lock, bundle/help/bundle-{deploy,
destroy}) all pass unchanged.
…ent lock Wires up a new env-var gate (DATABRICKS_BUNDLE_MANAGED_STATE) that switches the deployment lock from the workspace filesystem to the bundle deployment metadata service (DMS). The env var accepts the usual boolean spellings (true/false, 1/0, yes/no, on/off); the historical filesystem lock is the default. The DMS-backed lock uses the SDK's databricks-sdk-go/service/bundle client (merged via databricks#5311, available since v0.135.0) — no hand-rolled DMS client. Acquire calls CreateDeployment / CreateVersion and starts a background heartbeat goroutine; Release stops the heartbeat and calls CompleteVersion (plus DeleteDeployment on successful destroy). Bind and Unbind are not yet supported under DMS and return an error at lock construction. Deployment ID persistence lives inside the lock package for now (managed_service.json in the workspace state dir). In step 4 of the DMS split it will move to bundle/statemgmt so it can be shared with the state-from-DMS path. Co-authored-by: Isaac
Step 4 of the DMS CLI integration plan. When DATABRICKS_BUNDLE_MANAGED_STATE is set, resource state is now fetched from the deployment metadata service on the server rather than the workspace-side resources.json file: - statemgmt/state_dms.go: new LoadStateFromDMS reads resources via the SDK Bundle.ListResourcesAll and seeds the in-memory state DB via DeploymentState.OpenWithData (no local file is touched). - statemgmt/state_pull.go: when DMS is active, pull only the deployment_id pointer from managed_service.json and short-circuit. The full state is loaded later via LoadStateFromDMS so file-state lineage/serial logic stays off the DMS path entirely. - statemgmt/state_push.go: no-op under DMS — the server owns the state, there is nothing useful to upload. - cmd/bundle/utils/process.go: route through LoadStateFromDMS instead of StateDB.Open when DMS is on; reject --plan under DMS because plan-vs-state lineage checks don't translate to the server's version-based locking. - bundle/bundle.go: add Bundle.DeploymentID, the in-memory carrier for the DMS deployment record id (populated by either state pull or lock acquire). - statemgmt/managed_service_json.go: promote managed_service.json constant and struct out of the lock package so the state-read path can share them. - deploy/lock/deployment_metadata_service.go: switch to the exported names and publish b.DeploymentID after acquiring the lock so subsequent reads in the same process see it without re-reading the file. Operation reporting (writing per-resource operations back to DMS during deploy) is intentionally deferred to step 5; today the resources list returned by ListResources is empty until that lands. The release-lock-error acceptance fixture loses the "Updating deployment state..." line because PushResourcesState is now a no-op under DMS. A new plan-and-summary acceptance test exercises the full read path end-to-end and verifies both bundle plan and bundle summary hit ListResources.
…ient
Adds three table-style tests:
* NoDeploymentID — empty b.DeploymentID short-circuits without any RPC and
leaves the in-memory state DB safe to read (ExportState returns empty).
* PopulatesFromList — resources land under "resources.<key>" with their
ID and State preserved verbatim; the nil-State path is exercised too.
* ListError — DMS-side failures are wrapped with the package-level prefix
so callers can tell them apart from local-filesystem errors.
Together these cover the three branches a code reviewer would otherwise have
to read the source to verify.
Wire per-resource Create/Update/Delete outcomes from the direct engine
through to the deployment metadata service (DMS), completing step 5 of
the DMS CLI integration plan.
* bundle/deploy/lock/operation_reporter.go — new asyncReporter ships
operationEvents to DMS from a single sender goroutine fed by a buffered
channel (size matches direct.defaultParallelism so apply workers
rarely block). planActionToOperationAction maps the deploy-plan action
enum onto sdkbundle.OperationActionType.
* bundle/deploy/lock/operation_reporter_test.go — exhaustive mapping
test covering every ActionType (incl. Skip → empty, Undefined → error).
* bundle/deploy/lock/deployment_metadata_service.go — start the reporter
on Acquire and Close() it in Release before CompleteVersion so all
per-resource events land under the right version_id.
* bundle/direct/pkg.go — new OperationReporter type and DeploymentBundle
field; left nil when DMS is off so the file-state path is unaffected.
* bundle/direct/bundle_apply.go — call the reporter inline after each
Create/Update/Delete (success or failure); skip-actions are filtered
out at the sender, deletes report resource_id only, failures carry the
error message and no state payload.
* bundle/deployplan/plan.go, bundle/direct/bundle_plan.go — comment-only
notes that Lineage/Serial don't apply under DMS.
* acceptance/bundle/dms/{add-resources,sequential-deploys,deploy-error}
— three new acceptance tests exercising CREATE, DELETE, and FAILED
operations end-to-end against the fake DMS server.
* acceptance/bundle/dms/plan-and-summary — updated golden output now
that the deploy reports a CREATE, so plan/summary see the resource
already deployed.
* acceptance/bin/print_requests.py — open recorded-requests file with
utf-8 to handle non-ASCII bytes in the trace.
Co-authored-by: Isaac
Migrate mode rewrites local state in-place and never calls resource adapters, so there are no operations to report. Without this check, a DMS-enabled migrate would silently leave the server-side resource inventory empty. Co-authored-by: Isaac
Contributor
|
An authorized user can trigger integration tests manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success. |
Contributor
Author
|
Superseded — reopening from the upstream branch instead of the fork so CI's JFrog OIDC handshake works (fork PRs lose |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Step 5 of the DMS CLI integration plan (see #4856 for the kitchen-sink reference). Stacked on top of #5355 (state-read from DMS) — this branch was cut from that PR's HEAD and should be reviewed after it merges.
When
DATABRICKS_BUNDLE_MANAGED_STATEis set, the direct engine now reports per-resource Create/Update/Delete operations to the deployment metadata service inline with apply. With this in place the DMS server has the full picture of which resources exist for a deployment, which closes the loop with the state-read path from step 4.bundle/deploy/lock/operation_reporter.go— newasyncReportershipsoperationEvents to DMS from a single sender goroutine fed by a buffered channel (buffer size =direct.defaultParallelismso apply workers rarely block).planActionToOperationActionmaps the deploy-plan action enum ontosdkbundle.OperationActionType; Skip → empty (no DMS operation), unknowns → error.bundle/deploy/lock/operation_reporter_test.go— exhaustive mapping test covering everyActionTypeincludingSkip,Undefined, and an unknown garbage value.bundle/deploy/lock/deployment_metadata_service.go— start the async reporter onAcquire,Close()it inReleasebeforeCompleteVersionso every event lands under the rightversion_id.bundle/direct/pkg.go— newOperationReportertype andDeploymentBundlefield; left nil when DMS is off so the file-state path is unaffected.bundle/direct/bundle_apply.go— call the reporter inline after each Create/Update/Delete (success or failure). Delete reportsresource_idonly (no state payload), failed operations report the error message withOPERATION_STATUS_FAILED. Migrate+DMS combo is rejected up-front.bundle/deployplan/plan.go,bundle/direct/bundle_plan.go— comment-only notes thatLineage/Serialdon't apply under DMS.acceptance/bundle/dms/{add-resources,sequential-deploys,deploy-error}— three new acceptance tests exercising CREATE, DELETE, and FAILED operations end-to-end against the fake DMS server.acceptance/bundle/dms/plan-and-summary— updated golden output now that the deploy reports a CREATE, so plan/summary see the resource as already-deployed.acceptance/bin/print_requests.py— open the recorded-requests file withencoding="utf-8"to handle non-ASCII bytes in the trace.Error handling
Operation reporting matches the heartbeat behaviour from step 3 (PR #5331): DMS-side delivery errors are logged and the deploy continues. The reporter only fails the deploy when the deploy context is cancelled mid-send. The buffer (size 10) gives the sender a small cushion against transient DMS slowness without serializing the deploy.
Out of scope
--planunder DMS — server-side version ordering already exists, but the lineage/serial bridge isn't built; deferred per the split plan.Test plan
go test ./bundle/direct/... ./bundle/deploy/lock/... ./bundle/deployplan/...green (incl. newTestPlanActionToOperationAction).go test ./acceptance -run "TestAccept/bundle/dms"green across all five DMS acceptance tests (add-resources,sequential-deploys,deploy-error,plan-and-summary,release-lock-error).This pull request and its description were written by Isaac.