feat(api): accept caller-supplied per-frame detections on /predict by Chouffe · Pull Request #47 · pyronear/temporal-model

Chouffe · 2026-06-11T11:59:53Z

Summary

The RPi edge devices already run a YOLO detector and hold per-frame smoke bboxes when the alert-api calls /predict; today the API re-runs its bundled YOLO over every frame. This adds an optional detections field to POST /predict so callers can supply those boxes and skip the in-API detector pass (~600 ms/request on CPU).
When detections is present, the bundled detector and its cache are bypassed entirely (no read, no write). The supplied boxes are converted to internal xywhn Detections and fed through the existing predict(frame_detections=...) injection seam. Tube building, ROI filtering, cropping, classification, and calibration run unchanged — the calibrator sees genuine per-tube mean_conf, log_len, and n_tubes from the real per-frame boxes. Core is untouched.
Spec: docs/specs/2026-06-11-api-supplied-detections-design.md. Documented but unvalidated risk: the calibrator was fit on bundled-detector boxes, so edge-detector boxes may shift its input features and therefore the returned probability. This is to be validated at alert-api integration time.
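To make the conversion step concrete, here is a minimal sketch of turning one supplied xyxyn box into a center/size (xywhn) detection. The `Detection` dataclass and the function name are hypothetical stand-ins, not the types in core; only the corner-to-center arithmetic and `class_id=0` come from this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for the internal detection type (normalized, center-based)."""
    class_id: int
    cx: float
    cy: float
    w: float
    h: float
    confidence: float


def xyxyn_to_detection(xyxyn, confidence):
    """Convert a normalized corner box [x1, y1, x2, y2] to a center/size box."""
    x1, y1, x2, y2 = xyxyn
    return Detection(
        class_id=0,  # single smoke class, matching the flowchart's class_id=0
        cx=(x1 + x2) / 2,
        cy=(y1 + y2) / 2,
        w=x2 - x1,
        h=y2 - y1,
        confidence=confidence,
    )


det = xyxyn_to_detection([0.25, 0.5, 0.75, 1.0], 0.41)
print(det)
```

Both formats are normalized to [0, 1], so the conversion needs no image dimensions.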

The two paths

flowchart LR
    A["POST /predict"] --> B{"detections<br/>in request?"}
    B -- "no (today's behavior)" --> C["detection cache<br/>(read + write)"]
    C --> D["bundled YOLO<br/>on cache misses"]
    D --> E["tube building"]
    B -- "yes" --> F["convert xyxyn → xywhn<br/>(class_id=0, cache bypassed)"]
    F --> E
    E --> G["ROI filter<br/>(roi_xyxyn, optional)"]
    G --> H["crop + stabilize"]
    H --> I["ViT classifier"]
    I --> J["calibrator"]
    J --> K["{ is_smoke, probability }"]

Intended deployment flow:

sequenceDiagram
    participant RPi as RPi (pyro-engine)
    participant P as alert-api
    participant API as temporal-model API
    RPi->>P: alert + per-frame bboxes (xyxyn + conf)
    P->>API: POST /predict { frames, detections }
    Note over API: YOLO skipped — tubes built<br/>from the supplied boxes
    API-->>P: { is_smoke, probability }

Request format

Today's call — detector path (unchanged)

Omit detections (or send null) and the API behaves exactly as before: frames are fetched from S3 and the bundled YOLO runs on every frame (with the detection cache):

POST /predict
{
  "frames": [
    "seq9711/000_det134188_2026-06-01T14-13-19.164516Z.jpg",
    "seq9711/001_det134190_2026-06-01T14-17-21.300641Z.jpg",
    "seq9711/002_det134186_2026-06-01T14-17-22.351349Z.jpg"
  ],
  "bucket": "frames",
  "roi_xyxyn": [0.30, 0.35, 0.50, 0.55]
}

(bucket and roi_xyxyn optional, as before.) With ?verbose=true, the response reports "detections_source": "detector" and the detector stage shows up in profiling.

New call — caller-supplied detections (detector bypassed)

One entry per frame, index-aligned with frames. An explicit [] means "the detector ran on this frame and saw nothing" (becomes a gap for tube building); null entries and partial coverage are rejected.

POST /predict
{
  "frames": [
    "seq9711/000_det134188_2026-06-01T14-13-19.164516Z.jpg",
    "seq9711/001_det134190_2026-06-01T14-17-21.300641Z.jpg",
    "seq9711/002_det134186_2026-06-01T14-17-22.351349Z.jpg"
  ],
  "detections": [
    [
      { "xyxyn": [0.137, 0.437, 0.147, 0.454], "confidence": 0.41 },
      { "xyxyn": [0.369, 0.391, 0.431, 0.462], "confidence": 0.31 }
    ],
    [],
    [
      { "xyxyn": [0.371, 0.394, 0.434, 0.465], "confidence": 0.35 }
    ]
  ]
}
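On the caller side, a body like the one above could be assembled roughly as follows. This is only a sketch: `build_predict_payload` and its argument shapes are hypothetical, and only the field names (`frames`, `detections`, `xyxyn`, `confidence`, `bucket`, `roi_xyxyn`) come from this PR.

```python
import json


def build_predict_payload(frames, boxes_per_frame, bucket=None, roi_xyxyn=None):
    """Assemble a /predict body with one detections entry per frame.

    boxes_per_frame holds one item per frame; each item is a list of
    (xyxyn, confidence) tuples, and an empty list means "saw nothing".
    """
    if len(frames) != len(boxes_per_frame):
        raise ValueError("detections must be index-aligned with frames")
    payload = {
        "frames": list(frames),
        "detections": [
            [{"xyxyn": list(xyxyn), "confidence": conf} for xyxyn, conf in boxes]
            for boxes in boxes_per_frame
        ],
    }
    # Optional fields are only sent when the caller sets them.
    if bucket is not None:
        payload["bucket"] = bucket
    if roi_xyxyn is not None:
        payload["roi_xyxyn"] = list(roi_xyxyn)
    return payload


payload = build_predict_payload(
    frames=["seq/000.jpg", "seq/001.jpg"],
    boxes_per_frame=[[((0.137, 0.437, 0.147, 0.454), 0.41)], []],
)
print(json.dumps(payload, indent=2))
```

Frames with no boxes are kept as explicit empty lists rather than dropped, so the index alignment with frames survives.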

detections composes with bucket and roi_xyxyn; omitting it (or sending null) gives exactly today's behavior. The response shape is unchanged:

{
  "is_smoke": true,
  "probability": 0.952,
  "model": { "name": "vit_dinov2_finetune", "version": "0.1.0" }
}

With ?verbose=true, the details block now carries provenance — details.preprocessing.detections_source is "request" when the boxes came from the caller, "detector" when the bundled YOLO produced them.

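Validation (400 invalid_request): length mismatch with frames; null or non-list entries; coordinates outside [0, 1]; inverted or zero-area boxes (x2 <= x1 or y2 <= y1), a check that also fails closed on accidental xywhn input whenever a width or height is no larger than the corresponding center coordinate; confidence outside [0, 1]; missing fields.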
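The rules above can be sketched in plain Python as follows. This is an illustration of the checks, not the schema code in this PR; the function names and error strings are made up.

```python
def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_detections(frames, detections):
    """Return a list of error strings; an empty list means the payload is acceptable."""
    if not isinstance(detections, list) or len(detections) != len(frames):
        return ["detections must be a list with one entry per frame"]
    errors = []
    for i, entry in enumerate(detections):
        if not isinstance(entry, list):
            errors.append(f"frame {i}: entry must be a list (use [] for 'saw nothing')")
            continue
        for j, det in enumerate(entry):
            where = f"frame {i}, detection {j}"
            if not isinstance(det, dict) or "xyxyn" not in det or "confidence" not in det:
                errors.append(f"{where}: missing xyxyn or confidence")
                continue
            box, conf = det["xyxyn"], det["confidence"]
            if not (isinstance(box, list) and len(box) == 4 and all(map(_is_number, box))):
                errors.append(f"{where}: xyxyn must be four numbers")
                continue
            x1, y1, x2, y2 = box
            if not all(0.0 <= v <= 1.0 for v in box):
                errors.append(f"{where}: coords outside [0, 1]")
            if x2 <= x1 or y2 <= y1:
                errors.append(f"{where}: inverted or zero-area box")
            if not (_is_number(conf) and 0.0 <= conf <= 1.0):
                errors.append(f"{where}: confidence outside [0, 1]")
    return errors


ok = validate_detections(
    ["a.jpg", "b.jpg"],
    [[{"xyxyn": [0.1, 0.2, 0.3, 0.4], "confidence": 0.5}], []],
)
# A center/size box sent by mistake: read as corners, x2 < x1 and y2 < y1.
xywhn_by_mistake = validate_detections(
    ["a.jpg"], [[{"xyxyn": [0.5, 0.5, 0.1, 0.1], "confidence": 0.5}]]
)
null_entry = validate_detections(["a.jpg"], [None])
print(ok, xywhn_by_mistake, null_entry)
```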

Relationship to #46

#46 explores the same goal (skip the in-API detector) with a simpler contract: one static bbox_xyxyn + one bbox_confidence stamped on every frame. That shape loses exactly the information the downstream stages need:

- Tube building becomes a no-op. The same box on every frame trivially yields one full-length, gap-free tube. Real sequences have boxes that move or grow per frame, multiple simultaneous boxes, and frames with none. With a static box the crops don't track the smoke, which is off-distribution for the ViT (trained on crops following per-frame detector boxes).
- The calibrator's features are all distorted. Its feature row is [logit, log_len, mean_conf, n_tubes] (core/logistic_calibrator.py:106). A forced single box pins mean_conf to one constant (defaulting to 1.0, far above real YOLO confidence distributions), log_len to the full sequence length, and n_tubes to 1, so the returned probability comes from feature values the regressor never saw during fitting.
- No "saw nothing" signal. Frames where the edge detector found nothing still get a fabricated detection.
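A toy illustration of the second point, under stated assumptions: log_len is taken to be the natural log of the tube length in frames and mean_conf the arithmetic mean of the tube's box confidences. Neither definition is confirmed here, the helper is hypothetical, and the per-frame confidences are invented for the example.

```python
import math
from statistics import mean


def tube_features(tube_confs, n_tubes):
    """Tube-derived part of the feature row: (log_len, mean_conf, n_tubes).

    Assumption: log_len = ln(tube length in frames), mean_conf = mean confidence.
    """
    return (math.log(len(tube_confs)), mean(tube_confs), n_tubes)


n_frames = 7
# Static-box contract: the same box, confidence 1.0, stamped on every frame.
static = tube_features([1.0] * n_frames, n_tubes=1)
# Per-frame contract: a shorter tube with made-up confidences, two tubes in total.
per_frame = tube_features([0.41, 0.35, 0.38], n_tubes=2)

print("static box:", static)
print("per-frame :", per_frame)
```

With the static box, all three values are fixed by the request shape alone (sequence length and the default confidence), whatever the edge detector actually saw.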

This PR keeps the per-frame boxes and confidences instead, and the exact-equivalence check below shows that shape preserves the model's behavior bit-for-bit. The plumbing from #46 (injection seam, cache bypass, validation reuse) follows the same approach here.

Test Plan

- make -C api lint && make -C api test: 150 passed, 1 skipped
- make -C core test: 245 passed, core has zero diffs
- Exact-equivalence check (MinIO + native uvicorn, CPU): ran the bundled detector outside the API via BboxTubeTemporalModel.detect(), converted its boxes to xyxyn, and sent them as detections. The response is identical to the in-API detector path (probability equal to all 16 digits, full verbose payload deep-equal; only detections_source and timings differ)
- Local e2e on scratch/annot_seq_9711 (7 real frames):
  - detector path: is_smoke=true, p=0.870, 3 tubes, detector stage 612 ms, detections_source="detector"
  - supplied detections from the sequence's label files (per-frame box counts 3/0/2/4/1/3/0): is_smoke=true, p=0.952, 2 tubes, no detector stage in profiling, detections_source="request"
  - all-empty detections: is_smoke=false, p=0.0, 0 tubes
  - 19 malformed-payload variants all return 400 invalid_request; detections: null falls back to the detector path
  - detector-path request after a supplied-detections request still gets 7/7 cache hits; supplied boxes never entered the cache
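The deep-equality part of the exact-equivalence check could be reproduced with a comparison like the one below. It is a sketch only: details.preprocessing.detections_source is the one path named in this PR, while the "profiling" key used for timings is an assumed name.

```python
import copy

# Fields expected to differ between the two paths. Only the first path is named
# in this PR; ("details", "profiling") is an assumed location for stage timings.
VOLATILE_PATHS = [
    ("details", "preprocessing", "detections_source"),
    ("details", "profiling"),
]


def strip_volatile(payload):
    """Return a deep copy of a verbose response without the fields expected to differ."""
    out = copy.deepcopy(payload)
    for path in VOLATILE_PATHS:
        node = out
        for key in path[:-1]:
            node = node.get(key, {}) if isinstance(node, dict) else {}
        if isinstance(node, dict):
            node.pop(path[-1], None)
    return out


def responses_equivalent(a, b):
    """True when two verbose responses match on everything except volatile fields."""
    return strip_volatile(a) == strip_volatile(b)


detector_resp = {
    "is_smoke": True,
    "probability": 0.87,
    "details": {
        "preprocessing": {"detections_source": "detector", "n_tubes": 3},
        "profiling": {"detector_ms": 612},
    },
}
request_resp = copy.deepcopy(detector_resp)
request_resp["details"]["preprocessing"]["detections_source"] = "request"
request_resp["details"]["profiling"] = {}

print(responses_equivalent(detector_resp, request_resp))
```

Comparing the probability with `==` rather than a tolerance is deliberate here, since the claim being checked is bit-for-bit equality.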

…d-bboxes

Conflicts:
- api/README.md
- api/src/temporal_model/api/app.py
- api/src/temporal_model/api/model_runner.py
- api/src/temporal_model/api/schemas.py
- api/tests/test_app.py
- api/tests/test_model_runner.py
- api/tests/test_schemas.py

The merge of #47 with the compute_trigger flag (#51) left the supplied-detections fast path dropping the flag: ?compute_trigger=true with caller-supplied boxes would silently skip the first-crossing search. Thread it through and pin the composition with a test.

Chouffe added 11 commits June 11, 2026 13:28

docs: spec for caller-supplied detections on /predict (detector bypass)

eea518c

docs: make full-coverage requirement explicit in supplied-detections …

d98d342

…spec

docs: spell out per-detection validation in supplied-detections spec

54430a7

docs: implementation plan for caller-supplied detections on /predict

7508181

feat(api): accept per-frame supplied detections in PredictRequest

9d7430c

feat(api): bypass detector and cache when detections are supplied

b7fe2e7

feat(api): thread supplied detections through /predict with provenance

04ab495

docs(api): document the detections field on /predict

bf4f72a

docs: drop the executed implementation plan

423572c

style: ruff format the runner tests

f05802d

docs: name the /predict caller alert-api, not platform

17a12ba

Chouffe requested a review from MateoLostanlen June 11, 2026 12:23

Chouffe mentioned this pull request Jun 11, 2026

feat(api): accept a caller-supplied bbox to skip YOLO detection #46

Draft

Chouffe added 2 commits June 12, 2026 08:51

Chouffe wants to merge 13 commits into main from arthur/feat-api-thread-bboxes

1 participant
