Priority order violation - Assertion fails (topic is solved)

Report here problems in any of ChibiOS components. This forum is NOT for support.
fotis
Posts: 32
Joined: Mon Apr 04, 2016 7:04 pm
Has thanked: 1 time
Been thanked: 1 time

Postby fotis » Sun Sep 30, 2018 12:49 pm

Hi Giovanni,

Are you sure these are the only cases that may cause this issue?

As you pointed out, 1 is not the case.

But after much searching, I am almost sure that 2 does not cause the problem either.
Apart from a manual search, I also searched my whole source code, both with Eclipse and with the ack command, for anything containing either

Code:

I(

or

Code:

I (

and I couldn't find anything... I believe this would have found any call to an I-class function.

Is there any case where such a function may be called implicitly from another ChibiOS function?


My startup code is custom, not the one provided with ChibiOS. However, it is fairly simple; you may have a look here.

Giovanni
Site Admin
Posts: 14444
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1074 times
Been thanked: 921 times
Postby Giovanni » Sun Sep 30, 2018 1:22 pm

You should make sure that all the initialization performed in our startup files is done in the same way. A dual stack is required.

In general, try to use the provided setup before attempting changes.

Closing as it is likely not a bug, feel free to open a topic in a support section.

Giovanni

Postby fotis » Sun Sep 30, 2018 2:39 pm

Just cross-checked it, the initialization is very similar. Essentially I do the exact same steps, in the same order. I just use C instead of assembly. I do use the dual stack model.

Using the ChibiOS startup code in my project would be a very big effort, so I don't think I can do this test.

It seems to me that it is not related to the startup code.

Anything else to check?

Postby Giovanni » Sun Sep 30, 2018 5:10 pm

You are doing your startup in a C function; this way you don't see what the compiler does with the stack in the prologue.

No other ideas.

Giovanni

Postby fotis » Mon Feb 18, 2019 7:31 pm

After careful examination of the source code and the Chibios traces, I have some more information on this issue, and a possible cause.

Let's say that I have the following threads:
Thread A, priority 128
Thread B, priority 129

And the following sequence:
1. Thread B executes (higher priority)
2. Thread B gets to WAITING state (call to chThdSleep())
3. Thread A starts execution.
4. SysTick Handler preempts Thread A.
5. Due to increase in system counter, Thread B gets in READY state.
6. Execution returns to Thread A. (WRONG!!!)
7. Priority order violation gets triggered.
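For illustration, the sequence above can be replayed in a small host-side C model. This is only a sketch under assumptions: the type and function names (model_thread_t, priority_order_ok, replay_sequence) are hypothetical and not ChibiOS APIs, and the invariant checked here (the running thread must be at least as high in priority as every ready thread) is my reading of what the assertion is meant to protect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread record; not the ChibiOS thread_t. */
typedef struct {
    const char *name;
    int         prio;   /* larger value = higher priority, as in the post */
    bool        ready;  /* true while the thread waits in the ready list */
} model_thread_t;

/* Invariant assumed by this sketch: no ready thread may have a higher
   priority than the thread that is currently running. */
static bool priority_order_ok(const model_thread_t *current,
                              const model_thread_t *threads, size_t n)
{
    assert(current != NULL && threads != NULL);
    for (size_t i = 0; i < n; i++) {
        if (threads[i].ready && threads[i].prio > current->prio) {
            return false;   /* a higher-priority thread is left waiting */
        }
    }
    return true;
}

/* Replays steps 3 to 6. isr_reschedules selects whether the tick ISR
   switches to thread B on exit. Returns whether the invariant holds. */
static bool replay_sequence(bool isr_reschedules)
{
    model_thread_t t[2] = {
        { "A", 128, false },    /* running after B went to sleep */
        { "B", 129, false },    /* sleeping */
    };
    const model_thread_t *current = &t[0];

    t[1].ready = true;          /* step 5: the tick makes B ready */
    if (isr_reschedules) {
        t[0].ready = true;      /* A goes back to the ready list */
        t[1].ready = false;
        current = &t[1];        /* B runs, as expected */
    }
    return priority_order_ok(current, t, 2);
}
```

In this model replay_sequence(true) keeps the invariant, while replay_sequence(false) breaks it, which corresponds to step 6 above.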

After this, it became clear to me that, for some reason, the SysTick handler does not perform a context switch that it should have performed.

My SysTick handler follows. I can't see anything wrong with it. Compared to the handlers in the examples provided with ChibiOS, it seems identical.

Code:

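/* SysTick interrupt handler as used in the project. */
void SysTick_Handler(void)
{
   CH_IRQ_PROLOGUE();

   /* Advance the system time; this may make sleeping threads ready. */
   chSysLockFromISR();
   chSysTimerHandlerI();
   chSysUnlockFromISR();

   /* The reschedule check is performed here, on ISR exit. */
   CH_IRQ_EPILOGUE();
}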


I started digging into the ChibiOS source code. I found that CH_IRQ_EPILOGUE() calls _port_irq_epilogue(), which is ultimately responsible for the context switch.

Within _port_irq_epilogue there is the following condition check:

Code:

if ((SCB->ICSR & SCB_ICSR_RETTOBASE_Msk) != 0U)

According to this, the context switch will only take place if the SysTick handler is the only interrupt being executed. If SysTick has preempted another lower priority (non-system) interrupt, there will be no context switch. I am not sure whether I would consider this a bug or a feature.

In my case there are non-system / fast-IRQs (i.e. without CH_IRQ_PROLOGUE() and CH_IRQ_EPILOGUE() guards), of lower priority than SysTick. I believe that such a handler gets preempted by SysTick, which results in no switching.
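To make the suspected failure concrete, here is a minimal host-side model of the RETTOBASE gating. It is a sketch, not the port code: model_cpu_t and the model_* functions are hypothetical names, the active counter stands in for the hardware exception nesting, and model_rettobase() imitates the RETTOBASE bit (set when no other active exception has been preempted).

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side model only; none of these names are ChibiOS or CMSIS APIs. */
typedef struct {
    int  active;          /* number of exception handlers currently active */
    bool resched_needed;  /* an ISR made a higher-priority thread ready */
    bool switched;        /* an epilogue performed the context switch */
} model_cpu_t;

/* Imitates RETTOBASE: true when no preempted exception is still active. */
static bool model_rettobase(const model_cpu_t *cpu)
{
    return cpu->active <= 1;
}

static void model_enter(model_cpu_t *cpu)
{
    cpu->active++;
}

/* Exit of a handler that ends with the OS epilogue: the switch is only
   performed when this handler is the last one in the return chain. */
static void model_exit_with_epilogue(model_cpu_t *cpu)
{
    assert(cpu->active > 0);
    if (model_rettobase(cpu) && cpu->resched_needed) {
        cpu->switched = true;
        cpu->resched_needed = false;
    }
    cpu->active--;
}

/* Exit of a fast IRQ without prologue/epilogue: no reschedule check. */
static void model_exit_fast(model_cpu_t *cpu)
{
    assert(cpu->active > 0);
    cpu->active--;
}

/* Runs one tick. When preempts_fast_irq is true, SysTick interrupts a
   lower-priority fast IRQ. Returns whether a context switch happened. */
static bool model_systick(bool preempts_fast_irq)
{
    model_cpu_t cpu = { 0, false, false };

    if (preempts_fast_irq) {
        model_enter(&cpu);          /* fast IRQ is running */
    }
    model_enter(&cpu);              /* SysTick handler starts */
    cpu.resched_needed = true;      /* the tick makes thread B ready */
    model_exit_with_epilogue(&cpu);
    if (preempts_fast_irq) {
        model_exit_fast(&cpu);      /* last in the chain, but no epilogue */
    }
    return cpu.switched;
}
```

In this model model_systick(false) switches, while model_systick(true) returns to the interrupted thread without switching, which matches the behaviour seen in the traces.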

Giovanni, do you believe that this is the case? Must all fast IRQs be of higher priority than all system IRQs?

Postby Giovanni » Mon Feb 18, 2019 8:27 pm

Hi,

I think you nailed it, excellent analysis. If there is an ISR without macros that is the last in the "return chain", then the reschedule is not executed: in ChibiOS it is always the last ISR in the chain that performs the final switch. If this cannot be trusted, then the mechanism fails.

Not sure about what to do about this except document it. A new rule should be added: any non-ChibiOS IRQ should have higher priority than any ChibiOS IRQ.
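As a rough illustration of how such a rule could be checked against a project's interrupt table, here is a small sketch. The irq_cfg_t table and irq_priorities_follow_rule() are hypothetical and not part of ChibiOS; the only hardware fact assumed is the Cortex-M convention that a lower numeric priority value means a higher priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one configured interrupt. */
typedef struct {
    const char *name;
    unsigned    nvic_prio;      /* Cortex-M: lower value = higher priority */
    bool        uses_os_macros; /* wrapped in CH_IRQ_PROLOGUE/EPILOGUE */
} irq_cfg_t;

/* Returns true when every IRQ without the OS macros has a strictly higher
   priority (lower numeric value) than every IRQ that uses them. The strict
   comparison follows the wording of the rule; it is an assumption here. */
static bool irq_priorities_follow_rule(const irq_cfg_t *cfg, size_t n)
{
    assert(cfg != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cfg[i].uses_os_macros) {
            continue;
        }
        for (size_t j = 0; j < n; j++) {
            if (cfg[j].uses_os_macros &&
                cfg[i].nvic_prio >= cfg[j].nvic_prio) {
                return false;   /* fast IRQ could be preempted by an OS IRQ */
            }
        }
    }
    return true;
}
```

A table with SysTick at priority 8 and a macro-less timer IRQ at priority 10 would be rejected by this check; moving the timer IRQ to priority 2 would make it pass.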

Giovanni

Postby fotis » Mon Feb 18, 2019 8:38 pm

Hi,

Sure, it is not possible to satisfy everyone, of course... But consider adding an option to enable or disable this check.
For example, in my case:
1. I want (some) non-ChibiOS IRQs to have lower priority, to reduce the system jitter.
2. I do not want to add these macros to all non-relevant IRQs, to reduce the overhead.

Of course, others will argue that this check is correctly placed there, so the switch will take place only once, increasing performance.



On the other hand, maybe at least add a run-time check. My thought was to have a global "interrupt nesting variable".
1. On every prologue increase the counter.
2. On every epilogue decrease the counter.
3. Check in the epilogue: if the counter is 0 (this is supposed to be the last IRQ) but the RETTOBASE bit in SCB->ICSR is 0, then trigger an error.

Something like:

Code:

  chDbgAssert(irq_nesting != 0 || ((SCB->ICSR & SCB_ICSR_RETTOBASE_Msk) != 0U), "context switch error");
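The three steps can be tried out on a host with a small model. This is a sketch only: irq_nesting is the hypothetical global from the proposal, model_prologue() and model_epilogue_check() are made-up names, and the rettobase parameter stands in for reading the RETTOBASE bit from SCB->ICSR. The check passes unless the counter says "last IRQ" while the hardware says another exception is still active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global nesting counter, as proposed in the post. */
static int irq_nesting;

/* Step 1: every OS-aware prologue increments the counter. */
static void model_prologue(void)
{
    irq_nesting++;
}

/* Steps 2 and 3: decrement, then check for consistency.
   rettobase stands in for (SCB->ICSR & SCB_ICSR_RETTOBASE_Msk) != 0U.
   Returns true when the state is consistent, false when an error
   should be triggered. */
static bool model_epilogue_check(bool rettobase)
{
    assert(irq_nesting > 0);
    irq_nesting--;
    /* Error only if this is the last OS-aware ISR (counter is 0) and
       the hardware still reports a preempted active exception. */
    return (irq_nesting != 0) || rettobase;
}
```

With this model, a single OS-aware ISR returning to base passes, the inner of two nested OS-aware ISRs passes, and an OS-aware ISR that has preempted a macro-less ISR is reported as an error.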

Postby Giovanni » Mon Feb 18, 2019 8:49 pm

Hi,

This is a good idea. In addition, we already have that counter when the state checker is enabled (ch.dbg.isr_cnt), so only the assertion is required. You may try to add the assertion and verify the problem, but I am pretty sure it is as you described it.

Giovanni

Postby fotis » Mon Feb 18, 2019 8:52 pm

Thinking of it again, doesn't this create a race-condition?

Suppose the IRQ epilogue is called and then, before the IRQ has the opportunity to return, it gets preempted by another interrupt. The second interrupt's epilogue will see from SCB->ICSR that another IRQ has been preempted, so it will not switch. Upon return to the first IRQ, however, the switch will not take place either, as its epilogue has already been executed.

See below:

1. Enter IRQ1
2. Prologue
3. Do IRQ stuff...
4. Epilogue
5. New interrupt fires. Preempt IRQ1.
6. Enter IRQ2
7. Prologue
8. Do IRQ stuff...
9. Epilogue
10. Exit IRQ2
11. Exit IRQ1

Postby Giovanni » Mon Feb 18, 2019 8:57 pm

Hi,

The last ISR returns with interrupts disabled, note this in _port_irq_epilogue():

Code:

    /* Note, returning without unlocking is intentional, this is done in
       order to keep the rest of the context switch atomic.*/
    return;


Then the switch is executed running from process stack and, finally, an exception is re-entered (PendSV or SVC) where interrupts are enabled again.
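The effect of returning with interrupts locked can be illustrated with a small host-side model. It is a sketch of the explanation above, not the port code: model_state_t and the model_* functions are hypothetical, and the model only covers the case where the last ISR has decided that a switch is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side model only; these names are hypothetical, not ChibiOS APIs. */
typedef struct {
    bool locked;        /* interrupts are masked */
    bool pending;       /* a request arrived while interrupts were masked */
    bool switched;      /* the context switch has completed */
    int  taken_before;  /* interrupts serviced before the switch */
    int  taken_after;   /* interrupts serviced after the switch */
} model_state_t;

static void model_service(model_state_t *s)
{
    if (s->switched) {
        s->taken_after++;
    } else {
        s->taken_before++;
    }
}

/* Epilogue of the last ISR: it locks and returns without unlocking. */
static void model_epilogue_last(model_state_t *s)
{
    s->locked = true;
}

/* A new interrupt request: serviced at once only if not masked. */
static void model_raise_irq(model_state_t *s)
{
    if (s->locked) {
        s->pending = true;
    } else {
        model_service(s);
    }
}

/* The switch runs, then interrupts are enabled again; only at that
   point is a pending request serviced. */
static void model_switch_then_unlock(model_state_t *s)
{
    assert(s->locked);
    s->switched = true;
    s->locked = false;
    if (s->pending) {
        s->pending = false;
        model_service(s);
    }
}

/* Replays the race attempt: an interrupt fires right after the epilogue. */
static model_state_t model_race_attempt(void)
{
    model_state_t s = { false, false, false, 0, 0 };

    model_epilogue_last(&s);
    model_raise_irq(&s);
    model_switch_then_unlock(&s);
    return s;
}
```

In this model the interrupt that fires between the epilogue and the exit is only serviced after the switch, so step 5 of the sequence in the previous post cannot occur at that point.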

About ch.dbg.isr_cnt, there is a problem: it is not decremented atomically with _port_irq_epilogue(), so another counter should be added. I will think about it.

Giovanni

