vmm-cli update --compose with --env-file silently drops compose changes when allowed_envs differs

## Summary

`vmm-cli.py update <vm_id> --compose new.yaml --env-file new.env --kms-url ...` silently drops the `--compose` update when the env-file's keys differ from the VM's current `allowed_envs`. The resulting VMM-stored `compose_file` keeps the **old** `docker_compose_file` but with the **new** `allowed_envs`. `vmm-cli update` exits 0 and reports success.

## Reproduction

Any combined update in which `--env-file` adds or removes an env var changes `allowed_envs` and therefore triggers the bug. For us this surfaced when adding `LAUNCHER_CHANNEL` to the env list alongside a new service in the compose YAML: the new service was silently dropped on two hosts.

## Root cause

`vmm/src/vmm-cli.py`, `update_vm()` (current master, lines 1051–1124): two unrelated branches both write to `upgrade_params["compose_file"]`, and the env-file branch runs last:

```python
# Branch 1 — compose update (line 1051)
if needs_compose_update:
    vm_configuration = vm_info_response["info"].get("configuration") or {}
    compose_file_content = vm_configuration.get("compose_file")
    app_compose = json.loads(compose_file_content) if compose_file_content else {}
    if docker_compose_content:
        app_compose["docker_compose_file"] = docker_compose_content   # ← inserts NEW YAML
    ...
    upgrade_params["compose_file"] = json.dumps(app_compose, ...)

# Branch 2 — env-file (line 1088)
if env_file:
    envs = parse_env_file(env_file)
    if envs:
        ...
        if compose_file_content:
            app_compose = json.loads(compose_file_content)            # ← RE-READS ORIGINAL (no new YAML)
            ...
            if app_compose.get("allowed_envs") != allowed_envs:
                app_compose["allowed_envs"] = allowed_envs
                compose_changed = True
            ...
            if compose_changed:
                upgrade_params["compose_file"] = json.dumps(app_compose, ...)   # ← OVERWRITES branch 1's result
```

Branch 2 reloads `compose_file_content` from `vm_info_response` (pre-update state) instead of continuing to mutate the `app_compose` dict already built by branch 1. When `allowed_envs` differs, `compose_changed=True` and branch 2's `upgrade_params["compose_file"] = json.dumps(app_compose, ...)` clobbers the new YAML.
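
The overwrite can be reproduced without a VMM. The following is a minimal sketch that mimics only the two writes described above; `buggy_update` and its parameters are hypothetical names for illustration, not the real `update_vm()` code:

```python
import json


def buggy_update(stored_compose_file, new_yaml, env_keys):
    """Mimic the two branches described above (illustrative, not the real update_vm())."""
    upgrade_params = {}

    # Branch 1: parse the stored JSON and insert the new YAML.
    app_compose = json.loads(stored_compose_file)
    app_compose["docker_compose_file"] = new_yaml
    upgrade_params["compose_file"] = json.dumps(app_compose)

    # Branch 2: parse the stored JSON again, so the new YAML is absent here.
    app_compose = json.loads(stored_compose_file)
    if app_compose.get("allowed_envs") != env_keys:
        app_compose["allowed_envs"] = env_keys
        # This write replaces the value produced by branch 1.
        upgrade_params["compose_file"] = json.dumps(app_compose)

    return json.loads(upgrade_params["compose_file"])


stored = json.dumps({"docker_compose_file": "old yaml", "allowed_envs": ["A"]})

# Env keys changed: the new YAML is lost, the new allowed_envs is kept.
lost = buggy_update(stored, "new yaml", ["A", "LAUNCHER_CHANNEL"])
# Env keys unchanged: branch 2 does not write, so the new YAML survives.
kept = buggy_update(stored, "new yaml", ["A"])
print(lost["docker_compose_file"], "|", kept["docker_compose_file"])
```

The second call shows why the bug only appears when `allowed_envs` differs: with identical keys, branch 2 never writes to `upgrade_params["compose_file"]`.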

## Why it's hard to notice

- `vmm-cli update` exits 0 and prints success
- The resulting `compose_file` still has the new `allowed_envs`, so subsequent env operations look correct
- The KMS hash registered by the operator (computed from `app-compose.json`) matches what VMM stores — both are wrong-but-internally-consistent
- The CVM boots fine; the missing service simply… never existed
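
Since none of these signals reveal the problem, one way to catch it is to compare the stored compose against the file that was passed to `--compose` after each update. A minimal sketch, assuming the stored `compose_file` is the JSON string shown above and that VMM keeps the YAML verbatim under `docker_compose_file` (the helper name `compose_was_applied` is hypothetical):

```python
import json


def compose_was_applied(stored_compose_file: str, expected_yaml: str) -> bool:
    """Return True if the stored app-compose JSON carries the expected YAML.

    Assumes an exact string match is meaningful, i.e. that the YAML is
    stored verbatim; that is an assumption, not confirmed behavior.
    """
    app_compose = json.loads(stored_compose_file)
    return app_compose.get("docker_compose_file") == expected_yaml


# Example with hand-written data standing in for the VMM response.
stored = json.dumps({"docker_compose_file": "old yaml", "allowed_envs": ["A", "B"]})
print(compose_was_applied(stored, "new yaml"))
```

In practice `stored_compose_file` would be the `compose_file` value found under `info.configuration` in the VM info response, and `expected_yaml` the contents of `new.yaml`.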

## Suggested fix

Have branch 2 reuse the `app_compose` dict built by branch 1 instead of reloading from `vm_configuration`. Sketch:

```python
app_compose = None  # accumulated across both branches

if needs_compose_update or env_file:
    vm_info_response = self.rpc_call("GetInfo", {"id": vm_id})
    ...

if needs_compose_update:
    vm_configuration = vm_info_response["info"].get("configuration") or {}
    compose_file_content = vm_configuration.get("compose_file")
    try:
        app_compose = json.loads(compose_file_content) if compose_file_content else {}
    except json.JSONDecodeError:
        app_compose = {}

    if docker_compose_content:
        app_compose["docker_compose_file"] = docker_compose_content
        updates.append("docker compose")
    # ... prelaunch_script, swap_size ...
    upgrade_params["compose_file"] = json.dumps(app_compose, ...)

if env_file:
    envs = parse_env_file(env_file)
    if envs:
        ...
        # Reuse the in-flight app_compose if branch 1 ran;
        # otherwise load from current VMM state.
        if app_compose is None:
            vm_configuration = vm_info_response["info"].get("configuration") or {}
            compose_file_content = vm_configuration.get("compose_file")
            try:
                app_compose = json.loads(compose_file_content) if compose_file_content else {}
            except json.JSONDecodeError:
                app_compose = {}

        compose_changed = False
        allowed_envs = list(envs.keys())
        if app_compose.get("allowed_envs") != allowed_envs:
            app_compose["allowed_envs"] = allowed_envs
            compose_changed = True
        # ... launch_token_hash ...
        if compose_changed or needs_compose_update:
            upgrade_params["compose_file"] = json.dumps(app_compose, ...)
```

Two key changes: (a) `app_compose` is shared across both branches; (b) when branch 1 ran, always re-serialize the merged result so the env updates don't drop the compose changes.
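
The same idea can be isolated as a pure function, which also makes it easy to unit-test. This is only a sketch of the merge logic under the assumptions above: `merge_compose` is a hypothetical helper, and it leaves out `prelaunch_script`, `swap_size` and `launch_token_hash`:

```python
import json


def merge_compose(stored_compose_file, new_yaml=None, env_keys=None):
    """Return the compose_file JSON string to upload, or None if nothing changed.

    Both updates are applied to one dict, so neither can overwrite the other.
    """
    try:
        app_compose = json.loads(stored_compose_file) if stored_compose_file else {}
    except json.JSONDecodeError:
        app_compose = {}

    changed = False
    if new_yaml is not None:
        app_compose["docker_compose_file"] = new_yaml
        changed = True
    if env_keys is not None and app_compose.get("allowed_envs") != env_keys:
        app_compose["allowed_envs"] = env_keys
        changed = True

    return json.dumps(app_compose) if changed else None


stored = json.dumps({"docker_compose_file": "old yaml", "allowed_envs": ["A"]})
merged = json.loads(merge_compose(stored, "new yaml", ["A", "LAUNCHER_CHANNEL"]))
print(merged)
```

Serializing once, after both updates have been applied, removes the ordering dependency between the two branches.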

## Workaround (no upstream change needed)

Split the single update into two sequential `vmm-cli update` calls:

1. `vmm-cli update <vm_id> --env-file new.env --kms-url ...`: settles `allowed_envs` and `encrypted_env`.
2. `vmm-cli update <vm_id> --compose new.yaml --vcpu ... --image ... --kms-url ...`: applies the new compose against an already-matching `allowed_envs`, so branch 2 sees `compose_changed=False` and does not overwrite branch 1's result.
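
For scripted deployments, the two steps above can be wrapped so they always run in this order. A minimal sketch: the helper names and the default executable name are assumptions, only the flags already shown in this issue are used, and any further options (such as `--vcpu` or `--image`) are passed through unchanged by the caller:

```python
import subprocess


def build_split_update(vm_id, env_file, compose_file, kms_url,
                       extra_compose_args=(), cli="vmm-cli.py"):
    """Build the two argv lists for the workaround: env-file first, compose second."""
    base = [cli, "update", vm_id]
    env_step = base + ["--env-file", env_file, "--kms-url", kms_url]
    compose_step = base + ["--compose", compose_file,
                           *extra_compose_args, "--kms-url", kms_url]
    return [env_step, compose_step]


def run_split_update(*args, **kwargs):
    """Run both steps in order and stop if the first one fails."""
    for cmd in build_split_update(*args, **kwargs):
        subprocess.run(cmd, check=True)


# Inspect the commands without running them; the KMS URL is a placeholder.
cmds = build_split_update("my-vm-id", "new.env", "new.yaml", "https://kms.example")
for cmd in cmds:
    print(" ".join(cmd))
```

Keeping command construction separate from execution lets the ordering be checked in a test without touching a real VMM.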

## Environment

Reproduced on a downstream install (`/usr/bin/vmm-cli.py`, md5 `da37c6fecd4219363e4c43076ca4fc30`); upstream master at `vmm/src/vmm-cli.py` has the same code path. Hosts in question were built from a dstack release using `dstack-nvidia-0.5.5`.
