feat(sandbox): narrow GPU procfs permissions and surface runtime additions #1628

@elezar

Problem Statement

Parent: #1444

CUDA initialization writes to /proc/<pid>/task/<tid>/comm to set thread names.
The current GPU filesystem policy fix restores CUDA compatibility by allowing
GPU sandboxes to include /proc in filesystem_policy.read_write. That behavior
is explicit and visible to users inspecting the effective policy, but it is
broader than the specific access CUDA appears to need.
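
For reference, the access in question is an ordinary write-only open of a procfs file. The sketch below shows that operation using only the Rust standard library. `comm_path` and `set_thread_name` are hypothetical helper names used for illustration; this is not CUDA's code, and it assumes Linux with procfs mounted at /proc.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::PathBuf;

/// Builds the procfs path that holds a thread's name.
fn comm_path(pid: u32, tid: u32) -> PathBuf {
    PathBuf::from(format!("/proc/{pid}/task/{tid}/comm"))
}

/// Sets a thread name by opening the comm file write-only (no truncate flag)
/// and writing the new name. The kernel silently truncates names longer
/// than 15 bytes.
fn set_thread_name(pid: u32, tid: u32, name: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .open(comm_path(pid, tid))?;
    file.write_all(name.as_bytes())
}
```

Calling `set_thread_name(std::process::id(), std::process::id(), "worker")` targets the main thread, whose thread ID equals the process ID on Linux. Under a policy that leaves /proc read-only, the open is expected to fail with a permission error.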

A narrower Landlock rule that grants only AccessFs::WriteFile under procfs may
be sufficient for CUDA thread-name writes, including descendant CUDA processes.
However, if the narrower permission is applied outside the declarative policy,
users and operators may see /proc as read-only while the runtime grants an
additional procfs write capability. That creates an auditability gap.

This issue tracks a follow-up to evaluate and implement a least-privilege procfs
permission model for GPU sandboxes while ensuring runtime-added permissions are
reported clearly.

Proposed Design

Investigate replacing the broad GPU /proc read-write baseline with a narrower
runtime permission for CUDA procfs writes.

The implementation should address both enforcement and visibility:

  1. Add a GPU-only Linux Landlock rule that grants the minimum access required
    for CUDA thread-name writes, likely AccessFs::WriteFile under /proc.
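  2. Confirm that the rule covers descendant CUDA processes, not only the initial
    sandbox command process. A rule rooted at /proc/self/task is likely too
    narrow: /proc/self is resolved once, when the rule is created, so the rule
    covers only the creating process's /proc/<pid>/task subtree, while each
    descendant writes under its own /proc/<pid>.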
  3. Keep non-GPU sandboxes unchanged.
  4. Ensure the effective runtime permissions are visible to users and operators.
    Options include:
    • an effective-policy field for runtime filesystem additions,
    • a diagnostics section in openshell policy get --full or sandbox inspect
      output,
    • OCSF config-state-change events that identify the runtime-added procfs
      permission,
    • or another explicit reporting surface that makes the narrower permission
      auditable.
  5. Update docs to explain the distinction between declarative policy paths and
    runtime-enforced baseline exceptions.
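
One way to picture the first visibility option in item 4 is an effective-policy structure with a separate field for runtime additions. The sketch below is illustrative only: `EffectivePolicy`, `RuntimeAddition`, `effective_policy`, and `render` are hypothetical names, the rendered layout is invented for the example, and none of it reflects the existing OpenShell schema or CLI output.

```rust
/// Access kinds the sketch distinguishes; not the real Landlock flag set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    WriteFile,
}

/// A permission the runtime adds outside the declarative policy.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RuntimeAddition {
    path: String,
    access: Access,
    reason: String,
}

/// What an operator would see when inspecting a sandbox (hypothetical shape).
#[derive(Debug, Default)]
struct EffectivePolicy {
    read_only: Vec<String>,
    read_write: Vec<String>,
    runtime_additions: Vec<RuntimeAddition>,
}

/// Builds the effective policy; only GPU sandboxes get the procfs addition.
fn effective_policy(gpu_enabled: bool) -> EffectivePolicy {
    let mut policy = EffectivePolicy {
        read_only: vec!["/proc".to_string()],
        ..Default::default()
    };
    if gpu_enabled {
        policy.runtime_additions.push(RuntimeAddition {
            path: "/proc".to_string(),
            access: Access::WriteFile,
            reason: "CUDA thread-name writes to /proc/<pid>/task/<tid>/comm".to_string(),
        });
    }
    policy
}

/// Renders the policy so runtime additions appear next to declared paths.
fn render(policy: &EffectivePolicy) -> String {
    let mut out = String::new();
    out.push_str(&format!("read_only: {:?}\n", policy.read_only));
    out.push_str(&format!("read_write: {:?}\n", policy.read_write));
    for add in &policy.runtime_additions {
        out.push_str(&format!(
            "runtime_addition: {} {:?} ({})\n",
            add.path, add.access, add.reason
        ));
    }
    out
}
```

Keeping the addition in its own field, rather than folding it into read_write, lets /proc stay listed as read-only while the narrower write grant remains visible to anyone inspecting the policy.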

Alternatives Considered

Keep /proc in filesystem_policy.read_write

This is broader than required but fully visible in the effective policy. It is a
reasonable short-term fix while the narrower model and reporting surface are
designed.

Add a hidden Landlock exception only

This minimizes kernel permissions, but it is not acceptable on its own because
operators inspecting the policy would see /proc as read-only while the runtime
allows a write capability.

Use /proc/self/task instead of /proc

This is narrower, but early testing showed it does not cover CUDA workloads that
spawn descendant processes after Landlock is enforced. Descendant processes have
their own /proc/self resolution, so the rule can miss the procfs paths CUDA
uses.
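
The coverage gap can be modeled with plain path prefixes. This is a simplification: Landlock attaches a rule to the directory opened when the rule is created rather than to a path string, but the prefix model gives the same answer for this case. `rule_covers`, `resolved_self_task`, and `comm` are hypothetical helpers, and the PIDs are made up.

```rust
use std::path::Path;

/// Returns true when `target` lies beneath the hierarchy a rule is rooted at.
fn rule_covers(rule_root: &Path, target: &Path) -> bool {
    target.starts_with(rule_root)
}

/// What /proc/self/task resolves to for the process that creates the rule.
fn resolved_self_task(creator_pid: u32) -> String {
    format!("/proc/{creator_pid}/task")
}

/// The comm file a given thread writes its name to.
fn comm(pid: u32, tid: u32) -> String {
    format!("/proc/{pid}/task/{tid}/comm")
}
```

With a rule created by PID 100, `rule_covers(Path::new(&resolved_self_task(100)), Path::new(&comm(100, 101)))` holds for a thread of the same process, but the same rule does not cover `comm(200, 200)` for a descendant with PID 200. A rule rooted at /proc covers both.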

Agent Investigation

During review of #1522, the branch briefly implemented a GPU-only Landlock
WriteFile exception for procfs. That prototype demonstrated the narrower
shape, but review raised the auditability tradeoff:

  • broad /proc read-write policy is explicit but wider,
  • narrow Landlock WriteFile is tighter but implicit unless surfaced elsewhere.

The same review identified that CUDA descendant processes need coverage beyond
/proc/self/task.

The follow-up should preserve least privilege without hiding effective runtime
permissions from policy inspection and security logs.

Definition of Done

  • GPU CUDA validation passes with the narrower procfs permission model.
  • Non-GPU sandboxes do not receive the procfs write permission.
  • Descendant CUDA processes are covered by tests.
  • The runtime-added permission is visible in an operator-facing surface.
  • OCSF or equivalent diagnostics identify when the GPU procfs permission is
    added.
  • Published docs explain the effective behavior.
  • The issue or PR explicitly documents why the reporting surface is
    sufficient for auditability.
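
The auditability criteria could be checked mechanically by comparing what is enforced against what is declared or reported. The sketch below assumes grants can be represented as (path, access) pairs; `Grant`, `grant`, and `unreported_grants` are hypothetical names, and the access strings are placeholders rather than real policy values.

```rust
use std::collections::BTreeSet;

/// A (path, access) pair; a stand-in for whatever the real policy model uses.
type Grant = (String, String);

/// Convenience constructor for a grant.
fn grant(path: &str, access: &str) -> Grant {
    (path.to_string(), access.to_string())
}

/// Returns enforced grants that appear neither in the declarative policy
/// nor in the operator-facing report. A non-empty result is an audit gap.
fn unreported_grants(
    declared: &BTreeSet<Grant>,
    reported: &BTreeSet<Grant>,
    enforced: &BTreeSet<Grant>,
) -> Vec<Grant> {
    enforced
        .iter()
        .filter(|g| !declared.contains(*g) && !reported.contains(*g))
        .cloned()
        .collect()
}
```

A test for the GPU case would expect an empty result once the procfs write grant is surfaced, and a test for the hidden-exception alternative would expect the /proc write grant to be returned.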

Metadata

Assignees: none
Labels: area:policy (Policy engine and policy lifecycle work)
Status: Todo