4 changes: 2 additions & 2 deletions Makefile
@@ -27,8 +27,8 @@ build: ## Build all binaries (scheduler needs flux-sched; helpers are pure Go)

.PHONY: python
python:
docker build -f python/Dockerfile -t ghcr.io/converged-computing/fluence-sidecar:latest ./python
docker push ghcr.io/converged-computing/fluence-sidecar:latest
docker build -f python/Dockerfile -t vanessa/fluence-sidecar:latest ./python
docker push vanessa/fluence-sidecar:latest
# kind load docker-image ghcr.io/converged-computing/fluence-sidecar:latest

.PHONY: test
118 changes: 60 additions & 58 deletions README.md
@@ -194,10 +194,10 @@ ceiling. Types come from the same config as the graph, so they can't drift.

### `sidecars/` — quantum coordination sidecars

Vendor-specific sidecar containers injected by the webhook into leader pods
of quantum workflow groups. Each sidecar discovers the QPU task submitted by
the leader, polls the vendor queue, and ungates worker pods when the task
reaches position==1.
Vendor-specific sidecar containers injected by the webhook into the producer pod
of a shared quantum workflow group. Each sidecar discovers the QPU task submitted
by the producer, polls the vendor queue, and ungates the consumer pods when the
task reaches position==1.
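
The polling step described above can be pictured with a small sketch. It is an illustration only, not the sidecar's actual code: `wait_for_front_of_queue`, `get_position`, and `ungate` are hypothetical names, and the vendor queue lookup is passed in as a plain callable so the example stays self-contained.

```python
import time
from typing import Callable, Optional


def wait_for_front_of_queue(
    get_position: Callable[[], Optional[int]],
    ungate: Callable[[], None],
    poll_interval: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the task reaches queue position 1, then release the consumers.

    Returns True if the gate was released, False if max_polls ran out first.
    All names here are hypothetical; the real sidecar may differ.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        position = get_position()  # hypothetical vendor queue lookup
        if position == 1:
            ungate()  # hypothetical: remove the scheduling gate on consumers
            return True
        polls += 1
        sleep(poll_interval)
    return False
```

Injecting `sleep` and `get_position` keeps the loop testable without a real vendor queue; `max_polls` is an assumed safety bound, not something the README specifies.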

```console
sidecars/
@@ -221,7 +221,7 @@ spec:
```

Fluence creates the PodGroup, injects the sidecar, creates per-namespace
RBAC, and gates all non-leader pods. See `sidecars/braket/design.md` for
RBAC, and gates the consumer pods. See `sidecars/braket/design.md` for
the full design including the SDK interceptor, queue position polling, and
the two-queue problem motivation.

@@ -369,88 +369,90 @@ Submission is **not** done by the scheduler — the workload container holds the
user's credentials and submits via qrmi-go. Fluence only schedules and hands off
the backend. (When we control local quantum devices this will change.)

### 3. Quantum workflow groups (leader + workers)
### 3. Quantum workflow groups (producer + consumers)

A quantum workflow group is one pod that **submits** quantum work (the leader)
plus N pods that **wait** for the result (the workers). All pods share a group
label; Fluence co-schedules them, gives the leader a sidecar that watches the
vendor queue, and gates the workers so they consume no node resources during the
(long, variable) QPU queue wait — releasing them only when the task reaches
`queue_position == 1`.
A quantum workflow group is a gang whose members share **one** quantum task:
one pod **submits** the work (the producer) and N−1 pods **wait** for the result
(the consumers). All pods share a group label and run the *same* image; Fluence
co-schedules them, gives the producer a sidecar that watches the vendor queue, and
gates the consumers so they consume no node resources during the (long, variable)
QPU queue wait — releasing them only when the task reaches `queue_position == 1`.

```yaml
# Every pod in the group carries the same group label + schedulerName: fluence
# Every pod in the group carries the same group label + schedulerName: fluence,
# and opts into shared coordination.
metadata:
labels:
fluence.flux-framework.org/group: my-qaoa-workflow
annotations:
fluence.flux-framework.org/coordination: shared
spec:
schedulerName: fluence
```

#### How the leader is chosen — two mechanisms
#### Coordination modes

There are two ways Fluence decides which pod is the leader. They are mutually
exclusive per group; pick the one that matches how your workload is built.
`fluence.flux-framework.org/coordination` selects how the gang is coordinated; it
defaults to `independent`.

**(a) Explicit role (recommended for leader/worker workflows).** Each pod
declares its role with an annotation. This is **authoritative**: admission order
is never consulted, and the same value is injected into the container as
`FLUENCE_ROLE` so your application reads the exact role Fluence used — the two
can never disagree.
- **`shared`** — the gang shares ONE quantum task. Fluence promotes one member to
producer and gates the rest as consumers (see below). Use this for a coordinated
workflow where the classical post-processing should start together as the single
result lands.
- **`independent`** (default) — every member does its own quantum work: its own
real submit, its own queue wait, no gating. N members run N tasks. This is the
honest default; Fluence never invents coordination you did not ask for, and
never dedups tasks meant to be distinct.
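
As a rough sketch of the rule above, the mode can be read from a pod's annotations with `independent` as the fallback. The helper names are hypothetical, and rejecting unknown values is an assumption of this example; the README does not say how Fluence treats them.

```python
COORDINATION_KEY = "fluence.flux-framework.org/coordination"
VALID_MODES = ("shared", "independent")


def coordination_mode(annotations):
    """Return the coordination mode, defaulting to 'independent' when unset."""
    mode = (annotations or {}).get(COORDINATION_KEY, "independent")
    if mode not in VALID_MODES:
        # Assumption: this sketch rejects unknown values outright.
        raise ValueError(f"unknown coordination mode: {mode!r}")
    return mode


def quantum_tasks_submitted(mode, members):
    """shared: the gang shares one task; independent: one task per member."""
    return 1 if mode == "shared" else members
```

For a gang of four, `shared` yields one real submission and `independent` yields four.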

```yaml
metadata:
labels:
fluence.flux-framework.org/group: my-qaoa-workflow
annotations:
fluence.flux-framework.org/role: leader # or: worker
```
#### How the producer is chosen

Use this when the leader and workers are **different** (the leader submits the
quantum task and runs the sidecar; workers process results). The leader gets the
interceptor + sidecar; workers are gated. Because the decision is declared, it is
race-free regardless of which pod the API server admits first. Your container can
branch on `$FLUENCE_ROLE` (e.g. `leader` → submit; `worker` → wait).
In `shared` mode the producer is the member the Job controller stamps with
`batch.kubernetes.io/job-completion-index: "0"` — so an **indexed Job** gives
deterministic, race-free election from a single identical template (every pod has
the same image and group label; only the index differs). This serves two contracts
with no extra configuration:

**(b) Admission order (default when no role annotation is present).** If pods
carry the group label but **no** role annotation, the **first pod admitted**
becomes the leader and every subsequent pod is a worker. This suits a
*homogeneous* pod-template gang (Deployment/Job/StatefulSet) where every replica
is byte-identical — any one of them can lead, so "first admitted" is a fine
tiebreaker. It is **not** suitable for a heterogeneous leader/worker workflow:
since admission order is nondeterministic, a worker pod could be admitted first
and wrongly elected leader. Use mechanism (a) for that case.
- an **explicit-role script** that branches on the completion index (index 0
submits; others wait and consume the result), and
- an **identical script** where every pod calls submit — the producer's submit is
real, and each consumer's submit is transparently returned the producer's task
(the shared-result dedup), so the code need not branch at all.

> Rule of thumb: identical replicas → admission order is fine. Distinct
> leader/worker pods → use the explicit `role` annotation.
For loose pods with no completion index, the first pod admitted claims the producer
slot; an indexed Job is recommended when you need determinism.
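
An explicit-role script might branch like the sketch below. It relies on Kubernetes exposing the completion index of an indexed Job to the container as the `JOB_COMPLETION_INDEX` environment variable; the function name and the role strings are hypothetical, chosen only for illustration.

```python
import os


def role_from_completion_index(env=None):
    """Map the indexed-Job completion index to a role.

    Returns 'producer' for index 0, 'consumer' for any other index, and None
    when no index is present (a loose pod, whose role is decided by admission
    order and cannot be inferred from inside the container).
    """
    env = os.environ if env is None else env
    index = env.get("JOB_COMPLETION_INDEX")
    if index is None:
        return None
    return "producer" if int(index) == 0 else "consumer"
```

A workload would call `role_from_completion_index()` at startup and submit the quantum task only when it returns `"producer"`.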

#### What Fluence does

Regardless of mechanism, the leader gets the sidecar and a PodGroup is created
(`minCount: 1`); workers get a `quantum.braket/ready` scheduling gate and consume
no node resources during the QPU queue wait. When the sidecar observes
`queue_position == 1`, it patches the task ARN onto each worker's annotations and
removes their gates atomically with setting the `fluence-quantum-classical`
priority class so they reschedule promptly.
In `shared` mode the producer gets the interceptor (real mode) + sidecar and its
own group-of-one PodGroup `<group>-producer` (`minCount: 1`), so it schedules
alone and runs the single real submit; it is never gated. The consumers join the
`<group>` gang (`minCount: N−1`), get a `quantum.braket/ready` scheduling gate, and
consume no node resources during the QPU queue wait. When the sidecar observes
`queue_position == 1`, it stamps the producer's task id onto each consumer
(surfaced as `FLUENCE_QUANTUM_JOB_ID`) and removes their gates atomically with
setting the `fluence-quantum-classical` priority class so they reschedule promptly.
The producer is one of the N members, so the application runs exactly N times —
never N+1, and there is no separate submitter pod.
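
The PodGroup sizing above reduces to simple arithmetic, sketched here. `plan_pod_groups` is a hypothetical helper, and requiring at least two members is an assumption of this example rather than a documented constraint.

```python
def plan_pod_groups(group, members):
    """Return {PodGroup name: minCount} for a shared gang of `members` pods."""
    if members < 2:
        # Assumption: a shared gang needs one producer and at least one consumer.
        raise ValueError("a shared gang needs at least two members")
    return {
        f"{group}-producer": 1,  # producer schedules alone, never gated
        group: members - 1,      # consumers are gated until position 1
    }
```

The two `minCount` values always sum to N, matching the statement that the application runs exactly N times.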

Per-namespace RBAC (`fluence-sidecar` ServiceAccount/Role/RoleBinding) and the
interceptor ConfigMap are created automatically by the webhook on first use — no
interceptor staging are created automatically by the webhook on first use — no
manual setup required.

```bash
# Just apply your pods with the group label (+ optional role annotation) and
# Apply your pods with the group label + coordination annotation +
# schedulerName: fluence. RBAC is created for you.
kubectl apply -f my-quantum-workflow.yaml
```

#### A note on the homogeneous "all submit" case
#### A note on the independent "all submit" case

A group where *every* pod submits its own quantum task (no leader/worker split)
is possible but rarely what you want: N independent submissions land in the
vendor queue and run at uncoordinated times, so there is no coordination benefit
from grouping them — you would just have N standalone quantum pods. For a single
quantum submission, use a standalone pod (no group label, see §2). For a
coordinated workflow, use the leader/worker form above with an explicit role.
`coordination: independent` (the default) means *every* pod submits its own
quantum task: N independent submissions land in the vendor queue and run at
uncoordinated times. That is correct and sometimes exactly what you want (N
distinct circuits), but it offers no coordination benefit from grouping — it is
equivalent to N standalone quantum pods. For a single quantum submission, use a
standalone pod (no group label, see §2). For a coordinated workflow that shares
one result, use `coordination: shared` above.


### Notes
31 changes: 16 additions & 15 deletions deploy/fluence-pull-test.yaml
@@ -138,8 +138,7 @@ spec:
containers:
- name: fluence
image: vanessa/fluence:test
# Allows for kind load
imagePullPolicy: Always
imagePullPolicy: Always
command:
- /bin/fluence
- --config=/etc/fluence/scheduler-config.yaml
@@ -148,13 +147,6 @@ spec:
# Without these its PodGroup/GangScheduling plugin is inactive, pods
# schedule with no gang semantics, and PodGroup status stays Pending.
- --feature-gates=GenericWorkload=true,GangScheduling=true
# Re-attempt unschedulable pods more often than the 5m default. In the
# contention experiment a gang that loses the initial race for nodes is
# marked Unschedulable; this is how soon it is re-tried after capacity
# frees (the event-driven QueueingHint is best-effort; this is the
# backstop that bounds worst-case requeue latency). 30s keeps contended
# gangs draining promptly without thrashing the queue.
- --pod-max-in-unschedulable-pods-duration=30s
- --v=4
env:
# Path to the resources config (e.g. quantum backends). Unset/empty
@@ -194,19 +186,29 @@ spec:
containers:
- name: webhook
image: vanessa/fluence:test
# Allows for kind load
imagePullPolicy: Always
command: ["/bin/fluence-webhook"]
# The webhook derives the FLUXION_* env contract (FLUXION_VENDOR,
# FLUXION_QRMI_TYPE, ...) from the resource graph's attribute keys, so
# it needs the same graph the scheduler and device plugin read. Without
# this it injects only FLUXION_BACKEND, and the sidecar can't route to
# a provider (which keys on qrmi_type).
env:
# Use busybox as sidecar image in tests — avoids pulling the real
# sidecar image which is large and not cached in CI.
- name: FLUENCE_SIDECAR_IMAGE
value: "busybox:latest"
- name: FLUENCE_RESOURCES
value: /etc/fluence/resources.yaml
ports:
- containerPort: 8443
readinessProbe:
httpGet: {path: /healthz, port: 8443, scheme: HTTPS}
initialDelaySeconds: 2
volumeMounts:
- name: config
mountPath: /etc/fluence
volumes:
- name: config
projected:
sources:
- configMap: {name: fluence-resources, optional: true}
---
apiVersion: v1
kind: Service
@@ -247,7 +249,6 @@ webhooks:
- key: kubernetes.io/metadata.name
operator: NotIn
values: ["kube-system"]
---
# fluence-sidecar.yaml
#
# RBAC and supporting resources for the Fluence quantum sidecar.