-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using FreeRTOS-Kernel-Heap3 library leads to deadlock across cores, race condition on core spinlock #1646
Comments
I didn't make it clear, but I am seeing this on an actual project, so it's not just a hypothetical. |
- There's a race condition with heap_3 FreeRTOS between heap_3 and pico-sdk mutex implementation. See raspberrypi/pico-sdk#1646 for more information.
Can you post what version of freertos you're using? |
This is the last commit of a personal project before I switched to heap_4 to work around the issue: EDIT: use this commit, and switch from heap_4 to heap_3 in the CMakeLists.txt: gemarcano/snes_controllers_to_usb@aa4e8ff . The original commit didn't have the new/malloc calls in place yet. I'm using this branch of FreeRTOS (from this PR that fixes a bug I found: FreeRTOS/FreeRTOS-Kernel#991) which should be pretty close to FreeRTOS master. I'm also using a patch from #1530 to the pico-sdk master branch so it can work with FreeRTOS SMP support (that PR really needs to get merged). And I also found a bug in tinyusb, and I have a (crappy) PR pending here that I'm also using. Seeing that the bug is a race condition, I'm not sure how to reduce my project to something sane that might still trigger the bug. It seems pretty reliable to trigger at home (watchdog is firing all the time unless I disable it), but it's still a race. As I can reproduce it pretty reliably with that commit, I can dump any data structures using GDB from both cores. In the FreeRTOS forum post, I already posted the backtrace from both cores, showing the deadlock. If you'd like anything else, I can probably provide it. |
possibly the same issue as #1453 |
From reading that issue, it does sound like the underlying issue is at least similar (deadlock on a spinlock due to interactions between FreeRTOS and pico-sdk locking). The only thing that is different is that this is being triggered not just by pico-sdk code. Part of the cause in my case is the heap_3 implementation is grabbing the FreeRTOS task lock (via |
Could this bug cause also cause the following assertion? |
if you |
this should be fixed with the latest SDK and FreeRTOS-Kernel (https://github.com/raspberrypi/FreeRTOS-Kernel |
I describe the full issue here: https://forums.freertos.org/t/multicore-smp-rp2040-port-deadlock-when-using-heap-3/19255/
In summary, FreeRTOS heap_3 calls
vTaskSuspendAll
which grabs the task lock before calling malloc/free. At the same time, if another task is using good-ol' malloc/free, a race takes place. Under FreeRTOS, the rp2040 port uses some macro magic to inject FreeRTOS support into the mutex functions. This implementation tries to grab the task lock also. heap_3 takes the FreeRTOS task lock before grabbingmalloc_mutex
, while a normal call tomalloc
takes the core synchronization spinlock first before grabbing the FreeRTOS task lock. This leads to a deadlock between the two tasks.For clarity, let's say
core 0
is calling into FreeRTOS heap_3, andcore 1
is callingmalloc
(which is wrapped by pico-sdk). This is the sequence of events:core 0
:vTaskSuspendAll
invPortFree
before callingfree
core 1
: callsmalloc
core 1
: inmutex_enter_blocking
withmalloc_mutex
, grabs core synchronization spinlockcore 0
callsfree
core 0
: inmutex_exit
withmalloc_mutex
, blocks waiting for core synchronization spinlockcore 1
: callslock_internal_spin_unlock_with_wait
which is a macro provided by FreeRTOS rp2040 portcore 1
: tries to grab task lock with an eventual call tovTaskEnterCritical
after macro hellcore 1
is holding onto the core synchronization spinlock,core 0
has the task lock.This happens because FreeRTOS heap_3 is trying to be multi task safe, but the pico-sdk's malloc and free wrappers are already providing this. I don't know of a good way to fix/clean this up, as it will likely need better cooperation between the rp2040 port and FreeRTOS itself.
My current workaround is to waste more SRAM and just use heap_4 and let FreeRTOS manage its own heap for its own objects, and let tasks use the "regular" heap that uses pico-sdk locking.
The text was updated successfully, but these errors were encountered: