another i2c interrupt storm bug [patch]

Report here problems in any of ChibiOS components. This forum is NOT for support.
tridge
Posts: 141
Joined: Mon Sep 25, 2017 8:27 am
Location: Canberra, Australia
Has thanked: 10 times
Been thanked: 20 times
Contact:

Re: another i2c interrupt storm bug [patch]

Postby tridge » Mon Dec 16, 2019 6:32 am

Thanks. I've now just found another I2Cv1 case of an unhandled interrupt. This one is SB set without BUSY set. I've got a reproducible test case where it can happen if I unplug an I2C compass while it is actively transmitting.
In this case the storm doesn't seem to persist forever. Most I've seen is a couple of hundred interrupts then it stops, but I would not like to rely on that given past experience.
Patch attached.
Note that I have assumed we should send the address byte under this condition. We really have two choices, we can send the address byte and treat this as normal (after all we did get SB set, so it is asking for an address), or we could reset the peripheral. It isn't at all clear to me from the datasheet what the correct action is. I do know we have to respond, as otherwise we just get the ISR again.
The most fundamental issue with the structure of the I2Cv1 driver is it is based on matching of specific combinations of interrupt condition bits. That means if an unhandled combination happens we get a storm. If we instead structured it to check each bit separately, and then within handling of one bit we check for other bits, with all paths leading to handling of the ISR then the whole issue would never have happened.
Cheers, Tridge
Attachments
0001-I2Cv1-handle-SB-set-without-BUSY.zip
(905 Bytes) Downloaded 134 times

tridge
Posts: 141
Joined: Mon Sep 25, 2017 8:27 am
Location: Canberra, Australia
Has thanked: 10 times
Been thanked: 20 times
Contact:

Re: another i2c interrupt storm bug [patch]

Postby tridge » Mon Dec 16, 2019 7:08 am

Giovanni wrote:I applied the first two patches on trunk, I also added some extra DMA disable on another MSG_TIMEOUT exit path. In addition I made similar changes to I2Cv2 and I2Cv3, I think the DMA-disable-on-timeout problem existed there too.

I've had a look at your changes. I am concerned that they limit the DMA disable on failure to MSG_TIMEOUT. In my msg on Nov 26th I described a situation where it was a MSG_RESET that led to the memory corruption, so at the very least I think we ned a DMA disable for both MSG_TIMEOUT and MSG_RESET, but really I'd prefer it to be unconditional.
Others patches are too different and cannot be applied. About the flag "in_transaction", it is not required, the driver state can be checked (i2cp->state), it is handled in the HLD

It isn't clear to me that handling it in the HLD is good enough. We want the ISR to be rejected at any time before we set I2C_CR1_START, but the state variable is changed well before that point, so it seems to be the wrong variable to be using. I agree it would make the code neater, but I'm not convinced it is correct. I deliberately set in_transaction after we take the system lock and just before set set START.
Let's rebase the discussion starting from the current trunk state. Please, one change at time or it is difficult, to evaluate changes and decide what has to be ported to other drivers (I am slow, I know...).

I can try, but it is a bit tricky as our code base is significantly different from ChibiOS trunk. I'll need to see if I can rebase to test.
I'd also say that from a software engineering point of view this patches, forum and svn approach for dealing with an issue like this really isn't great. You're missing out on a whole generation of good review and comment tools. It always feels like a trip back to the 90s when dealing with ChibiOS patches.
I prefer to handle the IRQ storm separately, I still want a general solution if possible, better fix everything else first then do this.

I would definitely like to see this. I'd also like to see us get away from the "handle the following specific combinations of interrupt state bits" approach to drivers.

User avatar
Giovanni
Site Admin
Posts: 14444
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1074 times
Been thanked: 921 times
Contact:

Re: another i2c interrupt storm bug [patch]

Postby Giovanni » Sat May 23, 2020 9:10 am

I committed the latest patch (new IRQ storm instance) to the 3 active branches.

Fixed as bug #1064.

Lets continue the discussion based on current trunk.

About the driver state, Tridge, I don't understand your point, what is the problem in using the driver state, it is meant for that kind of things.

Giovanni


Last bumped by Giovanni on Sat May 23, 2020 9:10 am.


Return to “Bug Reports”

Who is online

Users browsing this forum: No registered users and 17 guests