Fix nvidia fallback services failing to start #475

Open
vigh-m wants to merge 3 commits into bottlerocket-os:develop from vigh-m:nvidia/unit-fix
@vigh-m vigh-m commented Jun 26, 2026

Contributor

Issue number:

Closes #464

Description of changes:

  • Add the %build sections to kmod-nvidia specfiles
  • Fix a typo in the DevID for g4dn instances that caused the open driver to be loaded instead of the tesla driver. With this change, we are in line with what the EKS optimized AMI does.
  • Change open-gpu-license-fallback.service and tesla-license-fallback.service to depend on nvidia-gridd.service, which itself depends on preconfigured.target, ensuring that all of them are started on all variants.
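
For readers unfamiliar with how such a dependency is usually wired, the last item can be pictured with a unit fragment. This is only a sketch and not the unit shipped in this PR: the unit names come from the description above, but the use of `After=` and `WantedBy=` to express "depend on" is an assumption, and the actual units may use different directives.

```ini
# Hypothetical fragment of tesla-license-fallback.service (illustrative only).
# Assumption: the dependency on nvidia-gridd.service is expressed with
# After= and WantedBy=; the real change may use other directives.
[Unit]
# Order this unit after nvidia-gridd.service.
After=nvidia-gridd.service

[Install]
# When enabled, the unit is pulled in whenever nvidia-gridd.service starts,
# instead of being tied to a Kubernetes-only unit.
WantedBy=nvidia-gridd.service
```

Under that assumption, the fallback service no longer relies on a unit that exists only on Kubernetes variants, which is the failure mode described in the linked issue.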

Testing done:

  • Built and tested a combination of ecs and k8s variants on the 6.1, 6.12, and 6.18 kernels with the 580 driver

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

vigh-m added 3 commits June 29, 2026 16:31
Signed-off-by: Vighnesh Maheshwari <vighmah@amazon.com>
Signed-off-by: Vighnesh Maheshwari <vighmah@amazon.com>
Signed-off-by: Vighnesh Maheshwari <vighmah@amazon.com>
@vigh-m vigh-m force-pushed the nvidia/unit-fix branch from 3bcf796 to 8534e33 Compare June 29, 2026 16:32

@ginglis13 ginglis13 left a comment

Contributor

LGTM - nit: missing space after ":" in the commit message "kmod-nvidia:fallback.service to install on all variants"

Development

Successfully merging this pull request may close these issues.

[bug] open-gpu-license-fallback.service and tesla-license-fallback.service never run on non-Kubernetes hosts due to k8s-only WantedBy

2 participants