russian wrote: Now I am less sure. Something funny is going on:
My firmware is stuck, I am pausing and resuming execution. Note how current thread changes, but p_current is always the same?
#define ON_UNLOCK_HOOK onUnlockHook()
#define dbg_leave_lock() {dbg_lock_cnt = 0;ON_UNLOCK_HOOK;}
void onUnlockHook(void) {
    uint64_t t = getTimeNowNt() - lastLockTime;
    if (t > maxLockTime) {
        maxLockTime = t;
    }
}
void onUnlockHook(void) {
801b520: b580 push {r7, lr}
801b522: b082 sub sp, #8
801b524: af00 add r7, sp, #0
uint64_t t = getTimeNowNt() - lastLockTime;
801b526: f003 fa0b bl 801e940 <getTimeNowNt>
801b52a: f24f 5360 movw r3, #62816 ; 0xf560
801b52e: f2c2 0300 movt r3, #8192 ; 0x2000
801b532: e9d3 2300 ldrd r2, r3, [r3]
801b536: 1a82 subs r2, r0, r2
801b538: eb61 0303 sbc.w r3, r1, r3
801b53c: e9c7 2300 strd r2, r3, [r7]
if (t > maxLockTime) {
801b540: f24f 5368 movw r3, #62824 ; 0xf568
801b544: f2c2 0300 movt r3, #8192 ; 0x2000
801b548: 681b ldr r3, [r3, #0]
801b54a: 4618 mov r0, r3
801b54c: f04f 0100 mov.w r1, #0
801b550: e9d7 2300 ldrd r2, r3, [r7]
801b554: 4299 cmp r1, r3
...
...
...
Perhaps there is something wrong with the lock-free code. But I think the alternative code that uses lockAnyContext() might have a problem too... I could be wrong, but as I try to follow the code, it seems like the following could be happening:
Some ChibiOS code (the ISR epilogue or _port_switch_from_isr) calls dbg_check_unlock(). This calls dbg_leave_lock(), which is a macro that expands to {dbg_lock_cnt = 0;ON_UNLOCK_HOOK;}. chconf.h defines ON_UNLOCK_HOOK to call onUnlockHook(). error_handling.cpp defines onUnlockHook(), which calls getTimeNowNt(). getTimeNowNt() is defined in engine_controller.cpp and calls Overflow64Counter::get(), which is defined in efilib2.cpp and calls lockAnyContext(). lockAnyContext() is defined in console_io.c, and it calls isLocked(), which checks whether dbg_lock_cnt is greater than zero. It is zero at this point, because the dbg_leave_lock() macro already assigned dbg_lock_cnt = 0 earlier in this call chain. So the code takes a lock based on whether it is in ISR context: if (dbg_isr_cnt > 0) chSysLockFromIsr(); else chSysLock().
I don't understand ChibiOS well enough to know whether the above process is acceptable and valid. But I think the next part is the problem. Since dbg_lock_cnt == 0, Overflow64Counter::get() will have alreadyLocked == false, and it will call unlockAnyContext() before returning. But that will cause a recursive call to onUnlockHook()! It goes like this: unlockAnyContext() -> chSysUnlockFromIsr() -> dbg_check_unlock_from_isr() -> dbg_leave_lock() -> {dbg_lock_cnt = 0;ON_UNLOCK_HOOK;} -> onUnlockHook() -> getTimeNowNt() -> Overflow64Counter::get() -> lockAnyContext() -> chSysLockFromIsr(), chSysUnlockFromIsr() -> infinite recursion.
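To make the suspected cycle easier to follow, here is a small stand-alone C model of it. This is only a sketch of my reading of the call chain, not the actual rusEfi or ChibiOS source: sys_lock(), sys_unlock(), get_time_now(), on_unlock_hook(), simulate_unlock() and the MAX_DEPTH cap are all hypothetical stand-ins, and the cap exists only so the model terminates (the real firmware would have no such limit). The use_guard flag shows one possible way the cycle could be broken, a re-entrancy flag in the hook; I am not claiming that is the right fix for the real code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEPTH 8              /* hypothetical cap so this model terminates */

static int  dbg_lock_cnt   = 0;  /* stand-in for the ChibiOS debug lock counter */
static int  depth          = 0;  /* current nesting of on_unlock_hook() */
static int  max_depth_seen = 0;  /* deepest nesting observed in one run */
static bool in_hook        = false;
static bool use_guard      = false;

static void on_unlock_hook(void);

/* Stand-in for chSysLock()/chSysLockFromIsr(). */
static void sys_lock(void) { dbg_lock_cnt = 1; }

/* Stand-in for chSysUnlock()/chSysUnlockFromIsr(): like dbg_leave_lock(),
 * it clears the counter first and then runs the unlock hook. */
static void sys_unlock(void) {
    dbg_lock_cnt = 0;
    on_unlock_hook();
}

/* Stand-in for getTimeNowNt() -> Overflow64Counter::get(). */
static void get_time_now(void) {
    bool already_locked = dbg_lock_cnt > 0;  /* models isLocked() */
    if (!already_locked) sys_lock();
    /* ... the 64-bit counter would be read here ... */
    if (!already_locked) sys_unlock();       /* re-enters the hook */
}

/* Stand-in for onUnlockHook(). */
static void on_unlock_hook(void) {
    if (use_guard && in_hook) return;        /* optional re-entrancy guard */
    if (depth >= MAX_DEPTH) return;          /* model-only safety cap */
    in_hook = true;
    depth++;
    assert(depth <= MAX_DEPTH);
    if (depth > max_depth_seen) max_depth_seen = depth;
    get_time_now();                          /* counter is 0 here, so this locks and unlocks again */
    depth--;
    in_hook = (depth > 0);
}

/* Runs one outermost unlock and returns the deepest hook nesting reached. */
int simulate_unlock(bool guarded) {
    dbg_lock_cnt = 1;
    depth = 0;
    max_depth_seen = 0;
    in_hook = false;
    use_guard = guarded;
    sys_unlock();
    return max_depth_seen;
}
```

In this model, simulate_unlock(false) nests all the way to the cap, which is the recursion described above, while simulate_unlock(true) enters the hook only once.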
Maybe I'm missing something, but it looks like that could be the case.