Optimize critical_section: use MSR instruction #576
To avoid branches and unnecessary code in the acquire and release functions, simply write back the original value to the PRIMASK register using the MSR instruction. This changes the critical-section dependency to use `u32` as the `RawRestoreState` type instead of `bool`. The `register::primask` module now defines a `struct(pub u32)` instead of an `enum`, and additionally a `write` function for use with the critical section.
Since the release function is now a single instruction consisting of 4 bytes, it should be possible for the linker to replace the branch with the MSR instruction. The Arm linker can apparently do that, but I wasn't able to get LLVM/LLD to do it.
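For readers following along, here is a minimal host-side sketch of the acquire/release shape described above. It is not the code from this PR: the `PRIMASK` static and the `read_primask`, `write_primask` and `disable_interrupts` helpers are hypothetical stand-ins for the MRS, MSR and `cpsid i` instructions, so that the logic can be run off-target.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical stand-in for the PRIMASK register. On a real Cortex-M core
// these helpers would be MRS / MSR / CPSID instructions, not a static.
static PRIMASK: AtomicU32 = AtomicU32::new(0);

fn read_primask() -> u32 {
    PRIMASK.load(Ordering::SeqCst)
}

fn write_primask(value: u32) {
    PRIMASK.store(value, Ordering::SeqCst)
}

// Models `cpsid i`, which sets PRIMASK bit 0 and so masks interrupts.
fn disable_interrupts() {
    PRIMASK.store(1, Ordering::SeqCst)
}

/// Acquire: remember the raw register value, then mask interrupts.
pub fn acquire() -> u32 {
    let saved = read_primask();
    disable_interrupts();
    saved
}

/// Release: write the saved value straight back. There is no branch on
/// "were interrupts enabled before?", which is the point of using a `u32`
/// restore state instead of a `bool`.
pub fn release(saved: u32) {
    write_primask(saved);
}
```

Nesting works because an inner `acquire` saves a value with bit 0 already set, so the matching `release` leaves interrupts masked until the outermost `release` restores the original value.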
Thanks for this PR, it sounds like a good idea! I think it would be worth avoiding the breaking change to PRIMASK so we can release this in the 0.7 version family. We could do this by adding a new interface to the primask module that deals with a `u32`, or (I think better) by having the critical section implementation read/write primask directly.
Just a reference for future readers: the current implementation masks just the bottom bit of the PRIMASK register to get the current interrupt status, stores that, then uses
Did you investigate https://doc.rust-lang.org/rustc/linker-plugin-lto.html ?
Good point. I will revert the primask module and use
There is a feature listed here, but I can't find out what it does: Line 34 in b02ec57
Looks like some code in xtask/src/main.rs related to that was removed? Anyway, thanks for pointing me to the right compiler flag. Adding
That feature is from the days before stable inline assembly, when the cortex-m crate shipped a pre-built binary blob per target that would be linked in to provide the assembly calls. We also shipped a pre-built xLTO version which could be inlined at link time if the feature was enabled. Since moving to inline asm for all users, we don't ship any pre-built blobs, so the feature doesn't do anything, but we leave it in place for backwards compatibility in 0.7 releases. I'm glad to hear it helps inline the c-s impl though. Yes, I think users will have to do this manually, so we should document it where we talk about the `critical-section-single-core` feature. There's no way to force it from our crate as far as I know.
Thanks, looks good!
Do you want to add a note to the documentation for the `critical-section-single-core` feature here about enabling `linker-plugin-lto` for better performance?
cortex-m/src/register/primask.rs
/// Note that bits [31:1] are reserved and SBZP (Should-Be-Zero-or-Preserved)
#[cfg(cortex_m)]
#[inline]
pub fn write_raw(r: u32) {
This needs to be `unsafe` because it can be used to enable interrupts, to match `interrupt::enable()`. If it was safe to call, user code could enable interrupts when other unsafe functions depend on them being disabled for memory safety, leading to UB.
Fixed. Do you require safety comments for the `unsafe` blocks? That didn't look like it is a policy in this crate.
Not for the unsafe blocks inside functions, but we should add a `# Safety` note to the docs for that function. I suggest:
# Safety
This method is unsafe as other unsafe code may rely on interrupts remaining disabled, for example during a critical section, and being able to safely re-enable them would lead to undefined behaviour. Do not call this function in a context where interrupts are expected to remain disabled - for example, in the midst of a critical section or `interrupt::free()` call.
It's a bit vague and hard to comply with, but I think that's the nature of the problem. Happy to hear other suggestions.
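To illustrate where such a note would sit, here is a hedged sketch of an `unsafe fn` carrying a `# Safety` section. It is an assumption for illustration rather than the crate's source: the `SIM_PRIMASK` static replaces the real register so the example builds on a host, and the doc wording is abbreviated from the suggestion above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Simulated PRIMASK for host builds (hypothetical; the real function would
// execute an MSR instruction). It starts at 1, meaning interrupts are masked.
static SIM_PRIMASK: AtomicU32 = AtomicU32::new(1);

/// Reads the raw (simulated) PRIMASK value.
pub fn read_raw() -> u32 {
    SIM_PRIMASK.load(Ordering::SeqCst)
}

/// Writes the raw (simulated) PRIMASK value.
///
/// # Safety
///
/// Clearing bit 0 re-enables interrupts. Other unsafe code may rely on
/// interrupts remaining disabled, so do not call this where interrupts are
/// expected to stay disabled, for example inside a critical section.
pub unsafe fn write_raw(r: u32) {
    SIM_PRIMASK.store(r, Ordering::SeqCst)
}
```

Marking the function `unsafe` moves the obligation to the caller, who has to write `unsafe { write_raw(saved) }` and so explicitly takes responsibility for the interrupt state.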
That sounds good, I've added it.
thanks! i'll see about backporting this to cortex-m 0.7 too.
(Marking as draft because I'm a Rust beginner who's probably doing things all wrong, and because it changes the API).
To avoid branches and unnecessary code in the acquire and release functions, simply write back the original value to the PRIMASK register using the MSR instruction. This reduces the number of cycles that the processor runs with interrupts disabled, improving interrupt latency.
Unfortunately, due to how the critical-section dependency is implemented (`extern "Rust"`), there is still a branch to and return from the acquire and release functions.
Disassembly with `--release`:
Before - thumbv6m-none-eabi
Before - thumbv7m-none-eabi
With this PR, the generated code is the same on thumbv6m and thumbv7m:
This changes the critical-section dependency to use `u32` as the `RawRestoreState` type instead of `bool`.
The `register::primask` module now defines a `struct(pub u32)` instead of an `enum`, and additionally a `write` function for use with the critical section. This is a breaking change to the API if you were using the enum values instead of the `is_(in)active` functions.
Is there a better way to achieve this without changing the API?
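As a rough illustration of the newtype mentioned in this description, the sketch below wraps the raw register value and derives the two predicates from bit 0. The exact derives and method bodies are assumptions. Only the `struct(pub u32)` shape and the `is_active` / `is_inactive` names come from the text above.

```rust
/// Raw PRIMASK value wrapped in a newtype (sketch; derives are assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Primask(pub u32);

impl Primask {
    /// Interrupts are active when bit 0 is clear.
    pub fn is_active(self) -> bool {
        !self.is_inactive()
    }

    /// Interrupts are masked (inactive) when bit 0 is set.
    /// Bits [31:1] are reserved and ignored here.
    pub fn is_inactive(self) -> bool {
        self.0 & 1 == 1
    }
}
```

Code that only calls the predicates keeps working with this shape, while code that matched on the old enum variants would need to change, which is the API break the description refers to.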