Optimize critical_section: use MSR instruction #576

sw · 2025-01-20T20:30:33Z

(Marking as draft because I'm a Rust beginner who's probably doing things all wrong, and because it changes the API).

To avoid branches and unnecessary code in the acquire and release functions, simply write back the original value to the PRIMASK register using the MSR instruction. This reduces the number of cycles that the processor runs with interrupts disabled, improving interrupt latency.

Unfortunately due to how the critical-section dependency is implemented (extern "Rust"), there is still a branch to and return from the acquire and release functions.

Disassembly with --release:

Before - thumbv6m-none-eabi

<_critical_section_1_0_acquire>:
               	movs	r0, #0x1
               	mrs	r1, primask
               	ands	r1, r0
               	rsbs	r0, r1, #0
               	adcs	r0, r1
               	cpsid i
               	bx	lr

<_critical_section_1_0_release>:
               	cmp	r0, #0x0
               	beq	0x80002d6 <_critical_section_1_0_release+0x6> @ imm = #0x0
               	cpsie i
               	bx	lr

Before - thumbv7m-none-eabi

<_critical_section_1_0_acquire>:
               	mrs	r0, primask
               	movs	r1, #0x1
               	cpsid i
               	bic.w	r0, r1, r0
               	bx	lr

<_critical_section_1_0_release>:
               	cmp	r0, #0x0
               	it	eq
               	bxeq	lr
               	cpsie i
               	bx	lr

With this PR, the generated code is the same on thumbv6m and thumbv7m:

<_critical_section_1_0_acquire>:
               	mrs	r0, primask
               	cpsid i
               	bx	lr

<_critical_section_1_0_release>:
               	msr	primask, r0
               	bx	lr

This changes the critical-section dependency to use u32 as the RawRestoreState type instead of bool.

The register::primask module now defines a struct(pub u32) instead of an enum, and additionally a write function for use with the critical section. This is a breaking change to the API if you were using the enum values instead of the is_(in)active functions.

Is there a better way to achieve this without changing the API?

To avoid branches and unnecessary code in the acquire and release functions, simply write back the original value to the PRIMASK register using the MSR instruction. This changes the critical-section dependency to use `u32` as the `RawRestoreState` type instead of `bool`. The `register::primask` module now defines a `struct(pub u32)` instead of an `enum`, and additionally a `write` function for use with the critical section.

sw · 2025-01-21T10:46:44Z

Since the release function is now a single instruction consisting of 4 bytes, it should be possible for the linker to replace the branch with the MSR instruction. The Arm linker can apparently do that, but I wasn't able to get LLVM/LLD to do it.

adamgreig · 2025-01-22T02:09:07Z

Thanks for this PR, it sounds like a good idea!

I think it would be worth avoiding the breaking change to PRIMASK so we can release this in the 0.7 version family. We could do this by adding a new interface to the primask module that deals with a u32, or (I think better) by having the critical section implementation read/write primask directly.

Just a reference for future readers: the current implementation masks just the bottom bit of the PRIMASK register to get the current interrupt status, stores that, then uses cpsie to selectively re-enable interrupts without ever writing back to PRIMASK. The proposed new version reads the whole PRIMASK register and writes it all back to restore state. The other 31 bits of PRIMASK are reserved and SBZP in ARMv6/7/8-M. Writing back a value read from the field is valid, so there shouldn't be any issue with this approach.

Did you investigate https://doc.rust-lang.org/rustc/linker-plugin-lto.html ?

sw · 2025-01-22T09:22:20Z

I think it would be worth avoiding the breaking change to PRIMASK so we can release this in the 0.7 version family. We could do this by adding a new interface to the primask module that deals with a u32, or (I think better) by having the critical section implementation read/write primask directly.

Good point. I will revert the primask module and use asm! in critical_section.rs to read and write the PRIMASK register.

Did you investigate https://doc.rust-lang.org/rustc/linker-plugin-lto.html ?

There is a feature listed here, but I can't find out what it does:

cortex-m/cortex-m/Cargo.toml

Line 34 in b02ec57

linker-plugin-lto = []

Looks like some code in xtask/src/main.rs related to that was removed?

Anyway, thanks for pointing me to the right compiler flag. Adding "-C", "linker-plugin-lto" to rustflags in cargo.toml will make both acquire and release inline. Is this something the users of this crate will need to do manually? Does that need documentation? Or is there really code missing related to the crate feature I saw? I'm assuming that LTO would be beneficial for most users but it's not something this crate should force.

adamgreig · 2025-01-22T10:46:49Z

That feature is from the days before stable inline assembly, when the cortex-m crate shipped a pre-built binary blob per target that would be linked in to provide the assembly calls. We also shipped a pre-built xLTO version which could be inlined at link time if the feature was enabled. Since moving to inline asm for all users, we don't ship any pre-built blobs, so the feature doesn't do anything, but we leave it in place for backwards compatibility in 0.7 releases.

I'm glad to hear it helps inline the c-s impl though. Yes, I think users will have to do this manually, so we should document it where we talk about the critical-section-single-core feature. There's no way to force it from our crate as far as I know.

adamgreig

Thanks, looks good!

Do you want to add a note to the documentation for the critical-section-single-core feature here about enabling linker-plugin-lto for better performance?

adamgreig · 2025-01-23T00:34:25Z

cortex-m/src/register/primask.rs

+/// Note that bits [31:1] are reserved and SBZP (Should-Be-Zero-or-Preserved)
+#[cfg(cortex_m)]
+#[inline]
+pub fn write_raw(r: u32) {


This needs to be unsafe because it can be used to enable interrupts, to match interrupt::enable(). If it was safe to call, user code could enable interrupts when other unsafe functions depend on them being disabled for memory safety, leading to UB.

Fixed. Do you require safety comments for the unsafe blocks? That didn't look like it is a policy in this crate.

Not for the unsafe blocks inside functions, but we should add a # Safety note to the docs for that function. I suggest:

# Safety This method is unsafe as other unsafe code may rely on interrupts remaining disabled, for example during a critical section, and being able to safely re-enable them would lead to undefined behaviour. Do not call this function in a context where interrupts are expected to remain disabled - for example, in the midst of a critical section or `interrupt::free()` call.

It's a bit vague and hard to comply with, but I think that's the nature of the problem. Happy to hear other suggestions.

That sounds good, I've added it.

adamgreig

thanks! i'll see about backporting this to cortex-m 0.7 too.

sw added 2 commits January 22, 2025 13:51

also for interrupt::free; revert register::primask API changes

d5591d0

appease clippy

7f99b9a

sw marked this pull request as ready for review January 22, 2025 18:51

sw requested a review from a team as a code owner January 22, 2025 18:51

adamgreig previously approved these changes Jan 22, 2025

View reviewed changes

Add LTO hint to critical-section-single-core docs

78f7968

sw dismissed adamgreig’s stale review via 78f7968 January 22, 2025 21:27

adamgreig reviewed Jan 23, 2025

View reviewed changes

sw added 2 commits January 23, 2025 11:41

Review: write_raw() must be unsafe

8279b63

Added safety comment to write_raw()

e451dc3

adamgreig approved these changes Jan 27, 2025

View reviewed changes

adamgreig added this pull request to the merge queue Jan 27, 2025

Merged via the queue into rust-embedded:master with commit de61182 Jan 27, 2025
11 checks passed

sw deleted the cs-msr branch January 28, 2025 07:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize critical_section: use MSR instruction #576

Optimize critical_section: use MSR instruction #576

sw commented Jan 20, 2025

sw commented Jan 21, 2025 •

edited

Loading

adamgreig commented Jan 22, 2025

sw commented Jan 22, 2025

adamgreig commented Jan 22, 2025

adamgreig left a comment

adamgreig Jan 23, 2025

sw Jan 23, 2025

adamgreig Jan 24, 2025

sw Jan 24, 2025

adamgreig left a comment

Optimize critical_section: use MSR instruction #576

Optimize critical_section: use MSR instruction #576

Conversation

sw commented Jan 20, 2025

sw commented Jan 21, 2025 • edited Loading

adamgreig commented Jan 22, 2025

sw commented Jan 22, 2025

adamgreig commented Jan 22, 2025

adamgreig left a comment

Choose a reason for hiding this comment

adamgreig Jan 23, 2025

Choose a reason for hiding this comment

sw Jan 23, 2025

Choose a reason for hiding this comment

adamgreig Jan 24, 2025

Choose a reason for hiding this comment

sw Jan 24, 2025

Choose a reason for hiding this comment

adamgreig left a comment

Choose a reason for hiding this comment

sw commented Jan 21, 2025 •

edited

Loading