Hello,
We discovered an issue with function chMBReset(), it affects all ChibiOS/RT versions. The problem is triggered by the following sequence:
1) Thread A fetches from an empty mailbox, it goes to sleep on the semaphore "fullsem".
2) Thread B with higher priority posts on the mailbox, A is woken and goes "ready" with message MSG_OK.
3) Thread C with higher priority resets the mailbox using chMBReset(), A is already ready so it does not get the MSG_RESET message.
4) B and C go in sleep making A runnable.
5) A takes the MSG_OK path is chMBFetch() reading garbage from the mailbox that has been reset.
The recommended workaround is to not use chMBReset().
From our analysis fixing this bug would require making mailboxes slower and/or larger, the fix will be to remove chMBReset() from the API. It should not be an issue because it is a marginal functionality.
Giovanni
[BUG] Problem with chMBReset() Topic is solved
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
Re: [BUG] Problem with chMBReset()
Hi,
Update, I decided for a complete rework of the mailboxes code.
Changes:
1) Mailboxes no more use semaphores but lower-level thread queues.
2) Now resetting a mailbox brings it in a "reset state", all subsequent operations will fail with MSG_RESET. This ensures that no threads can enter the mailbox so it can be safely disposed.
3) The reset state can be exited by calling chMBResumeX(), new API.
4) Everything else works as usual.
The code is in the trunk repository for review, the test suite fails because it has not yet been adapted to handle the reset state, tomorrow I will fix it.
If successful the new implementation will be back-ported to 16.1.x, sorry for holding back releases but this was important.
Giovanni
Update, I decided for a complete rework of the mailboxes code.
Changes:
1) Mailboxes no more use semaphores but lower-level thread queues.
2) Now resetting a mailbox brings it in a "reset state", all subsequent operations will fail with MSG_RESET. This ensures that no threads can enter the mailbox so it can be safely disposed.
3) The reset state can be exited by calling chMBResumeX(), new API.
4) Everything else works as usual.
The code is in the trunk repository for review, the test suite fails because it has not yet been adapted to handle the reset state, tomorrow I will fix it.
If successful the new implementation will be back-ported to 16.1.x, sorry for holding back releases but this was important.
Giovanni
- alex31
- Posts: 379
- Joined: Fri May 25, 2012 10:23 am
- Location: toulouse, france
- Has thanked: 38 times
- Been thanked: 62 times
- Contact:
Re: [BUG] Problem with chMBReset()
Hello,
I have tested the new mailboxes implementation, and it fails on some code that worked just before the change.
It fails on the first call to chMBPost in my application :
The issue seems to be that chSysUnlock is called twice.
For the record, it's on a stm32f7, systick mode, but it probably does not matter.
Alexandre
I have tested the new mailboxes implementation, and it fails on some code that worked just before the change.
It fails on the first call to chMBPost in my application :
The issue seems to be that chSysUnlock is called twice.
For the record, it's on a stm32f7, systick mode, but it probably does not matter.
Code: Select all
chSysUnlock () at ../../../ChibiOS_next/os/rt/include/chsys.h:389
389 _stats_stop_measure_crit_thd();
(gdb)
_stats_stop_measure_crit_thd () at ../../../ChibiOS_next/os/rt/src/chstats.c:105
105 chTMStopMeasurementX(&ch.kernel_stats.m_crit_thd);
(gdb)
chTMStopMeasurementX (tmp=0x20001628 <ch+168>) at ../../../ChibiOS_next/os/rt/src/chtm.c:129
129 tm_stop(tmp, chSysGetRealtimeCounterX(), ch.tm.offset);
(gdb)
port_rt_get_counter_value () at ../../../ChibiOS_next/os/common/ports/ARMCMx/chcore_v7m.h:708
708 return DWT->CYCCNT;
(gdb)
709 }
(gdb)
tm_stop (tmp=0x20001628 <ch+168>, now=2405613119, offset=40) at ../../../ChibiOS_next/os/rt/src/chtm.c:57
57 tmp->n++;
(gdb)
58 tmp->last = (now - tmp->last) - offset;
(gdb)
59 tmp->cumulative += (rttime_t)tmp->last;
(gdb)
60 if (tmp->last > tmp->worst) {
(gdb)
61 tmp->worst = tmp->last;
(gdb)
63 if (tmp->last < tmp->best) {
(gdb)
66 }
(gdb)
chTMStopMeasurementX (tmp=0x20001628 <ch+168>) at ../../../ChibiOS_next/os/rt/src/chtm.c:130
130 }
(gdb)
_stats_stop_measure_crit_thd () at ../../../ChibiOS_next/os/rt/src/chstats.c:106
106 }
(gdb)
chSysUnlock () at ../../../ChibiOS_next/os/rt/include/chsys.h:395
395 chDbgAssert((ch.rlist.queue.next == (thread_t *)&ch.rlist.queue) ||
(gdb)
chSysHalt (reason=0x8027c48 <__func__.7460> "chSysUnlock") at ../../../ChibiOS_next/os/rt/src/chsys.c:194
194 port_disable();
Alexandre
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
Re: [BUG] Problem with chMBReset()
Hi,
Thanks for finding, still work in progress but good to know
Giovanni
Thanks for finding, still work in progress but good to know
Giovanni
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
Re: [BUG] Problem with chMBReset()
Hi,
Just tested with the state checker but the test suite passes and the function makes a single call to chSysUnlock().
Are your IRQ priorities in the range 3..15? it could be IRQ code executed within the kernel critical zones.
Giovanni
Just tested with the state checker but the test suite passes and the function makes a single call to chSysUnlock().
Are your IRQ priorities in the range 3..15? it could be IRQ code executed within the kernel critical zones.
Giovanni
- alex31
- Posts: 379
- Joined: Fri May 25, 2012 10:23 am
- Location: toulouse, france
- Has thanked: 38 times
- Been thanked: 62 times
- Contact:
Re: [BUG] Problem with chMBReset()
Are your IRQ priorities in the range 3..15? it could be IRQ code executed within the kernel critical zones.
Yes, minimum priority is 5 :
Code: Select all
grep IRQ_PRIORITY mcuconf.h | sort -n -k 3
#define STM32_I2C_I2C1_IRQ_PRIORITY 5
Tomorrow i will try to write a small example that exhibit the problem.
Thanks
Alexandre
- alex31
- Posts: 379
- Joined: Fri May 25, 2012 10:23 am
- Location: toulouse, france
- Has thanked: 38 times
- Been thanked: 62 times
- Contact:
Re: [BUG] Problem with chMBReset()
Hello,
i have been able to reproduce the error with a simple program which just post in main and fetch in a thread.
All goes fine if both main and thread priorities are equals, but it fails if not :
Alexandre
i have been able to reproduce the error with a simple program which just post in main and fetch in a thread.
All goes fine if both main and thread priorities are equals, but it fails if not :
Code: Select all
#include <ch.h>
#include <hal.h>
static msg_t mbBuf[2];
static MAILBOX_DECL(mb, mbBuf, sizeof(mbBuf)/sizeof(mbBuf[0]));
static THD_WORKING_AREA(waBlinker, 384); // declaration de la pile du thread blinker
static void blinker (void *arg) // fonction d'entrée du thread blinker
{
(void)arg;
chRegSetThreadName("blinker");
msg_t from;
while (true) {
chMBFetch (&mb, (msg_t *) &from, TIME_INFINITE);
}
}
int main(void) {
halInit();
chSysInit();
// works OK if prio is NORMALPRIO
chThdCreateStatic(waBlinker, sizeof(waBlinker), NORMALPRIO+2, &blinker, NULL);
while (true) {
chMBPost (&mb, (msg_t) 0, TIME_IMMEDIATE);
chThdSleepMilliseconds(100);
palToggleLine (LINE_C00_LED1);
}
}
Alexandre
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
Re: [BUG] Problem with chMBReset()
I fixed A problem in new mailboxes, not sure if it the same you experienced, could you give it a try?
Giovanni
Giovanni
- alex31
- Posts: 379
- Joined: Fri May 25, 2012 10:23 am
- Location: toulouse, france
- Has thanked: 38 times
- Been thanked: 62 times
- Contact:
Re: [BUG] Problem with chMBReset()
hello,
problem fixed in the small example and in the real application,
thanks !
Alexandre
problem fixed in the small example and in the real application,
thanks !
Alexandre
Who is online
Users browsing this forum: No registered users and 28 guests