Add patina_sre crate (System Recovery Environment boot orchestrator)#97
Add patina_sre crate (System Recovery Environment boot orchestrator)#97kat-perez wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new patina_sre crate that implements patina_boot::BootOrchestrator to drive the normal boot path on platforms that ship a System Recovery Environment. The crate covers controller-connect/dispatch interleaving, EndOfDxe + ReadyToBoot signalling, console discovery, Boot#### enumeration with a main_os_path fallback, and a stub for the boot-partition write-lock pending issue #61. Re-exports DevicePathBuf/EndEntire ensure callers share trait coherence with the patina source used internally.
Changes:
- Introduce
SreBootManagerorchestrator with interleaved connect+dispatch, BDS signalling,Boot####iteration, andmain_os_pathfallback. - Re-export
DevicePathBuf/EndEntirefrompatina_sreso downstream consumers stay on the samepatinasource. - Add crate scaffolding (Cargo.toml, README, rust-toolchain.toml, rustfmt.toml, .gitignore) and host-side unit tests for the interleave loop and trait bounds.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| uefi/crates/patina_sre/src/sre_boot_manager.rs | Core orchestrator implementation + unit tests for interleave loop and trait conformance. |
| uefi/crates/patina_sre/src/lib.rs | Crate root with no_std/feature flags and re-exports. |
| uefi/crates/patina_sre/README.md | Crate description and usage snippet. |
| uefi/crates/patina_sre/Cargo.toml | Package manifest with git deps for patina/patina_boot. |
| uefi/crates/patina_sre/rust-toolchain.toml | Pinned nightly-2025-12-12 with UEFI targets. |
| uefi/crates/patina_sre/rustfmt.toml | max_width = 120. |
| uefi/crates/patina_sre/.gitignore | Ignore target/ and Cargo.lock. |
c16d71c to
3f2983c
Compare
Adds patina_sre at uefi/crates/patina_sre/ — implements
patina_boot::BootOrchestrator for platforms shipping a System Recovery
Environment alongside the main OS. The skeleton implements the normal
boot path:
1. interleave connect+dispatch (10-round cap)
2. extra connect_all before EndOfDxe so PartitionDxe can bind GPT
child handles during the open driver-binding window
3. signal EndOfDxe
4. discover console devices
5. boot-partition write-lock (currently a log::warn! stub pending
odp-platform-common#61's patina_boot::partition helper)
6. discover_boot_options + iterate each Boot#### entry through
signal_ready_to_boot + boot_from_device_path; logs the device path
and underlying error on each failure
7. fall back to the constructor-provided main_os_path if discovery
yields no entries OR fails
8. return EfiError::NotFound once every attempt is exhausted
The crate re-exports DevicePathBuf + EndEntire from its own patina
source so callers (e.g. surface_patina_intel/patina_bin) can construct
the constructor's device-path arguments without picking up a different
patina (which would break trait coherence).
Hotkey-to-SRE entry, WIM-to-RAM-disk boot, and capsule pre-boot hook
are tracked separately and will layer onto this skeleton.
Verified end-to-end on Maa Intel Surface hardware (Kioxia KBG8 NVMe,
NVMe 2.0, CAP.BPS=1) via the paired surface_patina_intel feature
branch: BootDispatcher dispatches, SreBootManager.execute() runs the
full BDS phase, discover_boot_options finds the Windows Boot Manager
Boot####, expand_device_path resolves the short-form HD(GPT,GUID) path
against the live device topology, and bootmgfw.efi loads + starts
cleanly.
Closes OpenDevicePartnership#91.
… -> RAM disk -> chainload)
Layers the SRE entry path onto the existing normal-boot skeleton. When the
Vol-Up + Power chord is registered at power-on (read via Surface
MS_BUTTON_SERVICES_PROTOCOL), SreBootManager::execute branches into the
BP recovery flow:
1. ConnectAll (priority-boot dispatch may run before BdsConnectAll)
2. Locate EFI_NVM_EXPRESS_PASS_THRU_PROTOCOL
3. Probe Identify Controller MDTS to size LID 0x15 chunks; clamp to
[64 KiB, 512 KiB]; fall back to 64 KiB on probe failure
4. Read 1 GiB of BP1 via chunked Get Log Page LID=0x15 into a fresh
DRAM allocation (LPOL = 16 + bp_offset to skip the controller's
header preamble)
5. Register the allocation as a RAM disk via EFI_RAM_DISK_PROTOCOL
(VIRTUAL_DISK_GUID)
6. Connect the RAM disk handle so PartitionDxe/FAT bind
7. Build a full device path = RAM disk DP + FilePath(\EFI\Boot\bootx64.efi)
+ END, LoadImage + StartImage
On any failure the flow returns and execute falls through to the
existing normal boot path. BP write protection (lock/unlock) is
intentionally not touched; that's owned by the FMP capsule update flow.
The three protocol FFI shims (NvmExpressPassThru, RamDisk,
MsButtonServices) are inlined in bp_recovery.rs rather than pulled
from patina_boot helpers that aren't yet published (issues OpenDevicePartnership#63
HotkeySource trait, OpenDevicePartnership#65 RAM-disk binding, OpenDevicePartnership#66 partition file-read,
OpenDevicePartnership#67 install_ram_disk). The intent is to extract these into
patina_boot once the API shape is proven on hardware.
Tests: existing 6 unit tests still pass; both x86_64-unknown-uefi
and host builds compile clean.
Mirrors the hardware-validated C-side SRE flow shipped via Devices
PR 392564 (commit 48a7a6159c on platform/maa_900/sre).
… testability - 2 pure-function tests for build_file_path_node (byte layout) and device_path_size (END_ENTIRE walk). - 3 tests for detect_sre_hotkey covering protocol-absent (outer) and pressed / not-pressed / check-errors (inner via synthetic button protocol). - 4 tests for probe_max_transfer covering MDTS=7 -> 512 KiB cap, MDTS=0 -> READ_CHUNK_MAX, MDTS too low -> READ_CHUNK_MIN clamp, Identify failure -> READ_CHUNK_DEFAULT. - 2 tests for read_bp_via_log_page: single-chunk LPOL=header-offset + LSP/LID layout, and pass-thru failure propagation. - 3 tests for register_ram_disk: outer locate failure, inner success (base/size captured + sentinel DP returned), inner register failure. To make the protocol-touching functions testable, split detect_sre_hotkey and register_ram_disk into outer (BootServices-based locate) + inner (takes raw protocol pointer) functions. Mirrors the pattern already used in patina_boot::partition for the lock_partition_write helpers. All 23 tests pass; both x86_64-unknown-uefi and host targets compile.
3f2983c to
6dfb633
Compare
f8fa118 to
fc84019
Compare
| // to the same handle. | ||
| // SAFETY: state_ptr points at a leaked RamDiskBlockIo whose first field | ||
| // is the standard Block IO protocol layout. The protocol struct lives | ||
| // for the rest of this boot. | ||
| let handle = unsafe { | ||
| boot_services.install_protocol_interface_unchecked(None, &block_io::PROTOCOL_GUID, state_ptr as *mut c_void) | ||
| } | ||
| .map_err(|s| { | ||
| log::error!("InstallProtocolInterface(BlockIo): {:?}", s); | ||
| EfiError::from(s) | ||
| })?; | ||
|
|
||
| // SAFETY: handle returned by install above; device path bytes are leaked | ||
| // and well-formed (Vendor node + End). | ||
| unsafe { | ||
| boot_services.install_protocol_interface_unchecked( | ||
| Some(handle), | ||
| &efi::protocols::device_path::PROTOCOL_GUID, | ||
| dp_ptr, | ||
| ) | ||
| } | ||
| .map_err(|s| { | ||
| log::error!("InstallProtocolInterface(DevicePath): {:?}", s); | ||
| EfiError::from(s) | ||
| })?; |
| let length = unsafe { (*p.add(2) as u16) | ((*p.add(3) as u16) << 8) } as usize; | ||
| p = unsafe { p.add(length) }; |
| // SAFETY: dereferencing the returned interface only via raw pointer calls below. | ||
| let passthru = unsafe { boot_services.locate_protocol_unchecked(&nvme_pass_thru::PROTOCOL_GUID, ptr::null_mut()) } | ||
| .map_err(|e| { | ||
| log::error!("LocateProtocol(NvmExpressPassThru): {:?}", e); | ||
| EfiError::from(e) | ||
| })? as *mut nvme_pass_thru::Protocol; | ||
|
|
| // SAFETY: dereferencing the returned interface only via raw pointer calls below. | ||
| let passthru = | ||
| match unsafe { boot_services.locate_protocol_unchecked(&nvme_pass_thru::PROTOCOL_GUID, ptr::null_mut()) } { | ||
| Ok(p) => p as *mut nvme_pass_thru::Protocol, | ||
| Err(e) => { | ||
| log::warn!("bp_has_sre_payload: NvmExpressPassThru not available: {:?}", e); | ||
| return false; | ||
| } | ||
| }; |
| const READ_CHUNK_DEFAULT: usize = 64 * 1024; | ||
|
|
||
| /// Path chainloaded from the registered RAM disk's FAT volume. | ||
| const CHAINLOAD_FILE_PATH: &str = "\\EFI\\Boot\\bootx64.efi"; |
| // Append `\EFI\Boot\BOOTX64.EFI` as a FilePath node. Patina's | ||
| // `LoadImage` does not apply the UEFI removable-media auto-resolve | ||
| // rule (which would otherwise pick up the default fallback | ||
| // bootloader when handed a bare SFS-handle path), so we must | ||
| // construct the explicit path ourselves or `LoadImage` returns | ||
| // `NotFound`. | ||
| // | ||
| // SAFETY: dp_ptr is a valid device path terminated by END_ENTIRE. | ||
| let base_total = unsafe { bp_recovery::device_path_size(dp_ptr as *const u8) }; | ||
| if base_total < 4 { | ||
| continue; | ||
| } | ||
| let prefix_size = base_total - 4; // strip END_ENTIRE | ||
| // SAFETY: dp_ptr is valid for `base_total` bytes; we slice the | ||
| // prefix that excludes the terminating END_ENTIRE node. | ||
| let prefix = unsafe { core::slice::from_raw_parts(dp_ptr as *const u8, prefix_size) }; | ||
| let mut bytes = alloc::vec::Vec::<u8>::with_capacity(base_total + 32); | ||
| bytes.extend_from_slice(prefix); | ||
| bytes.extend_from_slice(&bp_recovery::build_file_path_node("\\EFI\\Boot\\BOOTX64.EFI")); |
| //! SRE recovery boot path: read the recovery payload from NVMe Boot | ||
| //! Partition 1 via Get Log Page LID=0x15, install a synthesized | ||
| //! EFI_BLOCK_IO_PROTOCOL handle backed by the in-memory buffer, and | ||
| //! chainload \EFI\Boot\bootx64.efi from the resulting FAT volume. |
| /// `PartitionDxe` and `FatDxe` cooperate to install SFS specifically on | ||
| /// the partition hosting a recognizable volume; (2) `LoadImage` on a | ||
| /// path terminating in an SFS handle auto-resolves `\EFI\Boot\BOOTX64.EFI`. | ||
| /// Picking a whole-device `BlockIo` handle instead gives a path that |
| } | ||
| } | ||
|
|
||
| // TODO(odp-platform-common#61): boot-partition write-lock helper isn't in |
There was a problem hiding this comment.
You don't need to do any sort of lock in this driver. I found the 2 locks you mentioned in the spec.
The first type of lock to put it into write mode enables a separate write area to receive data and it is actually not really writing to the BP. Then when you issue the 'flush', it goes back into read only mode and synchronizes that written data into the partition at one shot.
The 2nd lock is the write once bit that can only unlock the device after a power reset which is what goes into the FMP driver's lock function. On exit boot services all FMP drivers get a callback to make sure their FW is fully locked down. The FMP for the SRE will handle this official locking after handoff similar to how we lock the SPINOR.
| // so we can both filter USB entries (which would re-run the SRE | ||
| // flashing tool that committed the WIM) and dispatch run_sre_flow as | ||
| // the final fallback. Cost: one 512-byte LID 0x15 head read of BP1. | ||
| let bp_has_sre = self.bp_sre_fallback && bp_recovery::bp_has_sre_payload(boot_services); |
There was a problem hiding this comment.
Have you thought about how you want to handle the A/B recovery?
Not something for this PR, just a thought. I am putting in the FMP code some hooks to update both partitions so it is there when we need it. Plus we probably need to do a hash of the data which there will also be a hash value in the image.
| # patina_boot lives in OpenDevicePartnership/patina-components and is not yet | ||
| # published to crates.io. Pinning to main here; swap to a `version = ...` line | ||
| # once it publishes. | ||
| patina_boot = { git = "https://github.com/OpenDevicePartnership/patina-components", branch = "main" } |
There was a problem hiding this comment.
Should we do a commit ID so we don't accidentally break it when we start doing changes later?
There was a problem hiding this comment.
Yeah, I actually meant to update this now that we have it merged into odp-platform-common. I'll update to pin it to a commit ID
| # | ||
|
|
||
| [toolchain] | ||
| channel = "nightly-2025-12-12" |
There was a problem hiding this comment.
Don't have to change it yet, just FYI that it's coming. We are switching the Patina build to use a stable toolchain with the RUSTC_BOOTSTRAP environment variable set to 1 then have a whitelist of what features are needed.
246918e to
59f29da
Compare
Port of Surface's C-side SRE entry behavior (DeviceBootManagerLib.c
SrePriorityBoot) onto the Patina-native BootOrchestrator. Lets a
Patina-BDS-equipped platform reproduce the Vol-Up -> SRE app and
Vol-Down -> USB-first / FrontPage hotkey semantics that C BdsDxe
provided.
What lands:
- MS_BUTTON_SERVICES_PROTOCOL FFI module + probe_sre_hotkey helper.
Locates the Surface vendor button-services protocol at BDS entry
and reads the SAM-latched Vol-Up + Power / Vol-Down + Power
states. Falls back to SreHotkey::None when the protocol isn't
published, preserving "no Surface buttons" platform behavior.
- SreHotkey { None, VolumeUp, VolumeDown } public enum.
- SreBootManager builder additions:
- with_sre_app_path(DevicePathBuf)
- with_frontpage_app_path(DevicePathBuf)
Both optional; constructor surface stays additive per the
existing crate contract.
- Vol-Up dispatch: when sre_app_path is set, signal ReadyToBoot
then boot_from_device_path on the configured path. Falls through
to normal Boot#### discovery on failure.
- Vol-Down dispatch: SimpleFileSystem-based USB enumeration in
find_first_usb_block_io_device_path (name kept for legibility;
semantics are "first SFS handle whose device path contains a
USB messaging node"). Falls through to frontpage_app_path when
no USB is present. SFS is required, not raw BlockIo: LoadImage
auto-resolves \EFI\Boot\BOOTX64.EFI only when the device path
terminates at a mounted FAT volume, and picking a whole-device
BlockIo handle gives a path ending at the USB messaging node
which LoadImage rejects with NotFound.
- Device-path helpers:
- fv_file_device_path(file_guid) -> FvFile(...)/End
- fv_volume_file_device_path(fv_guid, file_guid) ->
Fv(...)/FvFile(...)/End
Patina's LoadImage requires the explicit FV node to expand an
FvFile reference -- bare FvFile returns NotFound. Hardware
verification on Maa confirmed this.
- gMsStartOfBdsNotifyGuid signal added in execute() after
signal_bds_phase_entry. Surface boot-policy components subscribe
to it; matches C BdsDxe's DeviceBootManagerBdsEntry order.
- gDfciStartOfBdsNotifyGuid intentionally NOT signaled (see long
comment in execute()). When fired, SettingsManagerDxe installs
gDfciSettingAccessProtocolGuid which triggers DfciManager's
SettingAccessCallback -> ProcessMailBoxes, which null-derefs
because mApplyIdentityProtocol / mApplyPermissionsProtocol /
mApplySettingsProtocol stayed NULL at DfciManager entry (the
producer IdentityAndAuthManager hadn't installed them yet).
Replicating the C BdsDxe ordering from a Patina BootOrchestrator
without porting DfciManager + IdentityAndAuthManager +
SettingsManager to Patina components is brittle. Until those
are ported the signal stays off; DFCI-dependent UI
(SurfaceFrontPage) won't function on Patina builds. Tracked as
a follow-up.
End-to-end verified on a Maa 900 unit:
- Vol-Up + Power -> SRE app (BpRecoveryLoaderApp) launches.
- Vol-Down + USB -> SFS enumeration finds the stick and LoadImage
hands off to \EFI\Boot\BOOTX64.EFI (when BOOT_ON_FLASH_UPDATE
PEI state doesn't disable USB ports first - unrelated PEI issue).
- Plain power-on -> normal Boot#### dispatch, no regression.
- DfciManager null-deref no longer reproducible (signal removed).
59f29da to
b496c1b
Compare
|
Integration on Intel device surfaced 8 missing patina_boot primitives + platform-extension API gaps. Tracking the productionization work in Boot Manager Modernization (project #45). This PR will resume when BMM Epic 2 (C FFI proxy bridge) lands and Epic 3 (PlatformBootHooks trait) gives SreBootManager a cleaner factoring. Code in this PR is preserved as a reference implementation and will be refactored against the new traits when they exist. |
|
I attempted production integration of a Rust-native SRE boot orchestrator on an Surface Intel device. The integration exposed that patina_boot — the underlying framework — wasn't yet production-ready for platforms with EDK2-accumulated assumptions. Rather than patch around each gap individually, I restructured the work into Boot Manager Modernization — a systematic project covering measurement infrastructure, the missing primitives, a platform-extension API, and OEM-facing perf/BOM deliverables. patina_sre returns to the integration path once Boot Manager Modernization makes it production-viable on the first reference platform (estimated mid-August). |
Summary
Adds
patina_sreatuefi/crates/patina_sre/— apatina_boot::BootOrchestratorfor platforms that ship a System Recovery Environment alongside the main OS.SreBootManager::execute()runs the BDS phase: interleaved connect+dispatch → EndOfDxe → start-of-BDS notify → console discovery →discover_boot_options→ try eachBoot####, fall back to a constructor-providedmain_os_path. On hotkey, dispatches a configured recovery or fallback app, or runsbp_recovery::run_sre_flow(NVMe LID read → synthesized Block IO RAM disk → chainload).Additive builder surface:
with_sre_app_path(DevicePathBuf),with_frontpage_app_path(DevicePathBuf), plusfv_file_device_path/fv_volume_file_device_pathhelpers.Re-exports
DevicePathBuf,EndEntire,SreHotkey, and path helpers so callers stay on a singlepatina(trait-coherence safety).The large BP buffer uses
Vec::try_reserve_exactand surfacesOUT_OF_RESOURCESon alloc failure rather than panicking.Dependency chain
This PR consumes patina_boot at a fork URL temporarily, because the r-efi 7 chain isn't fully landed upstream yet:
patina_bootpin →kat-perez/odp-platform-common @ 8eadb3b(PR patina_boot: bump to r-efi 7 #106 — bumps patina_boot to r-efi 7)patinapin →kat-perez/patina branch=patina-boot-base(reset to upstream main + r-efi 7)OpenDevicePartnership/...URLs once patina_boot: bump to r-efi 7 #106 merges andpatina-boot-baseis no longer needed.Test plan
cargo test— 16 unit tests passcargo clippy --all-targetsclean (workspace denies suspicious/correctness/perf/style)x86_64-unknown-uefipatina_boot+patina_tianocore(nopatina_srejob in the matrix yet — needs follow-up workflow update)Follow-ups
patina_sreto the repo CI matrix (fmt/clippy/build-uefi/doc)OpenDevicePartnership/odp-platform-commonURLpartition::lock_partition_writeLoadedImage.DeviceHandleHotkeyProvidertrait for OEM-specific button mechanismsCloses #91.