I've been encountering an issue with two threads that are on the same priority level, in tick-less mode, using calls to chThdYield() to allow each other to operate. I'm running on ChibiOS 16.1.9.
One thread receives UDP messages using LWIP (netconn_recv) and runs at the same priority as a second thread that transmits messages taken from a mailbox. The transmit thread calls chMbFetch() with a 100 ms time-out at the beginning of its loop to check whether any messages are ready to transmit.
The problem, as originally observed, is that every once in a while the receive loop stops running for up to a few seconds while the transmit loop continues to send messages. I noticed that the transmit loop did not call chThdYield(), so I added that call at the end of the loop. The problem now happens less often but still persists.
I originally thought that the two threads are round-robin, and that (barring preemptions from higher priority threads) one will operate until it calls chThdYield(), then the other will wake up and operate until a call to chThdYield(), and so on. After reading another post here (Giovanni's reply to: http://www.chibios.com/forum/viewtopic.php?t=4168) I'm realizing that this may be how it works after 2017, but that in the version I'm running there is other logic.
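To make my mental model concrete, here is a toy sketch in plain C. It is not ChibiOS code, and every name in it (toy_thread_t, toy_ready_behind, toy_ready_ahead, toy_pick_next) is made up for illustration. It only models the two reinsertion policies I can imagine for a thread going back into the ready list: behind its equal-priority peers (what I assumed a yield does) or ahead of them (what I suspect might happen in other cases). Whether 16.1.9 actually behaves like either one is exactly what I'm asking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: NOT the ChibiOS scheduler. All names are hypothetical. */
#define MAX_READY 8

typedef struct {
    int id;    /* arbitrary label for the thread */
    int prio;  /* higher number = higher priority */
} toy_thread_t;

typedef struct {
    toy_thread_t slots[MAX_READY]; /* kept sorted by descending priority */
    size_t count;
} toy_ready_list_t;

/* Shared helper: open a gap at pos and store t there. */
static int toy_insert_at(toy_ready_list_t *rl, size_t pos, toy_thread_t t)
{
    if (rl->count >= MAX_READY) {
        return -1; /* list full */
    }
    for (size_t i = rl->count; i > pos; i--) {
        rl->slots[i] = rl->slots[i - 1];
    }
    rl->slots[pos] = t;
    rl->count++;
    return 0;
}

/* Policy A: go behind every thread of equal or higher priority.
   This is the round-robin behavior I assumed for a yield. */
int toy_ready_behind(toy_ready_list_t *rl, toy_thread_t t)
{
    size_t pos = 0;
    while (pos < rl->count && rl->slots[pos].prio >= t.prio) {
        pos++;
    }
    return toy_insert_at(rl, pos, t);
}

/* Policy B: go ahead of threads of equal priority, behind higher ones.
   With this policy the same thread keeps the CPU among its peers. */
int toy_ready_ahead(toy_ready_list_t *rl, toy_thread_t t)
{
    size_t pos = 0;
    while (pos < rl->count && rl->slots[pos].prio > t.prio) {
        pos++;
    }
    return toy_insert_at(rl, pos, t);
}

/* Remove and return the head of the list (the next thread to run). */
int toy_pick_next(toy_ready_list_t *rl, toy_thread_t *out)
{
    if (rl->count == 0) {
        return -1; /* nothing ready */
    }
    *out = rl->slots[0];
    for (size_t i = 1; i < rl->count; i++) {
        rl->slots[i - 1] = rl->slots[i];
    }
    rl->count--;
    return 0;
}
```

In this model, if the transmit thread is always reinserted with policy B, the receive thread at the same priority never reaches the head of the list until the transmit thread blocks, which would look a lot like the stall I'm seeing.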
I'd like to understand what's happening in the yield call vs calls that cause the thread to wait, and inquire about methods I can use to monitor the thread's context switches to find out what is preempting the receive thread now.
- 1) Why did the code work at all while there was no yield call in the transmit thread? Was the receive thread able to run while the transmit thread was waiting in chMbFetch(), even though it was behind in the round robin? Does this mean that when the transmit thread stops waiting, it preempts the receive thread to take back the CPU?
- 2) What is the logic behind chThdYield(), chThdWait() (or other operations that cause waits), and same-priority scheduling in tickless mode, both pre-2017 and post-2017? I haven't seen any documented changes on this besides the post linked above.
- 3) Is there any way to profile threads in a way that either shows all context switches, or that displays at what points in time threads hit certain watch-points? I would like to "graph" when threads are running vs. waiting to find out what is preempting my receive thread. I've tried setting/clearing pins and connecting to an oscilloscope but I'm not getting the results that I expected (nor can I see context switches). I think that the CH_DBG_ENABLE_TRACE option may be helpful here but haven't found documentation on how to use it yet.
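For reference, this is roughly the kind of logging I have in mind for question 3, written as a host-side sketch in plain C rather than real ChibiOS code. All names (toy_trace_t, toy_trace_record, toy_trace_longest_gap) are hypothetical. My assumption is that something like toy_trace_record() could be called from the CH_CFG_CONTEXT_SWITCH_HOOK macro in chconf.h with a timestamp from a free-running counter, but I have not verified that this is safe or that it is the intended approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy trace buffer: hypothetical names, not part of ChibiOS. */
#define TRACE_LEN 64u

typedef struct {
    uint32_t time;    /* timestamp in arbitrary units (assumed free-running) */
    uint8_t  from_id; /* thread being switched out */
    uint8_t  to_id;   /* thread being switched in */
} toy_trace_event_t;

typedef struct {
    toy_trace_event_t ev[TRACE_LEN];
    size_t next;  /* index of the slot to overwrite next */
    size_t total; /* number of events recorded since start */
} toy_trace_t;

/* Record one context switch, overwriting the oldest entry when full. */
void toy_trace_record(toy_trace_t *tr, uint32_t now,
                      uint8_t from_id, uint8_t to_id)
{
    tr->ev[tr->next].time = now;
    tr->ev[tr->next].from_id = from_id;
    tr->ev[tr->next].to_id = to_id;
    tr->next = (tr->next + 1u) % TRACE_LEN;
    tr->total++;
}

/* Longest interval between thread `id` being switched out and switched
   back in, considering only the events still held in the buffer. */
uint32_t toy_trace_longest_gap(const toy_trace_t *tr, uint8_t id)
{
    size_t stored = (tr->total < TRACE_LEN) ? tr->total : TRACE_LEN;
    size_t idx = (tr->total < TRACE_LEN) ? 0u : tr->next; /* oldest entry */
    uint32_t longest = 0u;
    uint32_t out_time = 0u;
    int is_out = 0;

    for (size_t n = 0; n < stored; n++) {
        const toy_trace_event_t *e = &tr->ev[idx];
        if (e->from_id == id) {
            out_time = e->time;
            is_out = 1;
        } else if (e->to_id == id && is_out) {
            uint32_t gap = e->time - out_time; /* unsigned math survives wrap */
            if (gap > longest) {
                longest = gap;
            }
            is_out = 0;
        }
        idx = (idx + 1u) % TRACE_LEN;
    }
    return longest;
}
```

Dumping the buffer over a serial port after a stall would give me the sequence of switches leading up to it, and the gap helper would tell me whether the receive thread really was off the CPU for seconds. If CH_DBG_ENABLE_TRACE already provides this, a pointer to how to read its buffer would be even better.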