Hi,
Some news: I added the base of a new "SE" product under ./os/se. It is meant to be a collection of code and best practices for people who have to deal with functional safety. Architecturally, it sits on top of the RTOS (RT or NIL).
Right now the module implements:
- Safe register writes with verify.
- Safe waiting loops with timeouts.
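A minimal sketch of what the two items above could look like. The helper names, the "true means failure" return convention, and the poll-count timeout are assumptions made for illustration, not the actual SE module API. A real implementation would more likely bound the wait in system time than in loop iterations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: write a register, read it back, and compare only
   the bits selected by 'mask'. Returns true on mismatch (failure). */
static inline bool safe_write_verify(volatile uint32_t *reg, uint32_t mask,
                                     uint32_t value) {
  *reg = value;
  return ((*reg ^ value) & mask) != 0U;
}

/* Hypothetical helper: poll until (*reg & mask) == expected, retrying at
   most 'max_polls' times after the first check. Returns true on timeout. */
static inline bool safe_wait_bits(volatile const uint32_t *reg, uint32_t mask,
                                  uint32_t expected, uint32_t max_polls) {
  while ((*reg & mask) != expected) {
    if (max_polls-- == 0U) {
      return true;
    }
  }
  return false;
}
```

The mask matters because many peripheral registers contain reserved or self-clearing bits that never read back as written.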
I am also thinking of adding:
- Read with debouching.
- Edge detection with debouching.
- Triple voted read.
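For illustration, the three planned items could be sketched as below. The type and function names are hypothetical, and counting consecutive samples is only one of several common ways to filter an input.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter state: the raw level must differ from the stable
   level for 'threshold' consecutive samples before it is accepted. */
typedef struct {
  bool    stable;     /* Last accepted (filtered) level. */
  uint8_t count;      /* Consecutive samples that differ from 'stable'. */
  uint8_t threshold;  /* Samples required to accept a new level. */
} debounce_t;

/* Feed one raw sample. Returns true when the stable level just changed,
   which doubles as filtered edge detection. */
static bool debounce_update(debounce_t *db, bool raw) {
  if (raw == db->stable) {
    db->count = 0U;   /* A glitch restarts the count. */
    return false;
  }
  if (++db->count >= db->threshold) {
    db->stable = raw;
    db->count = 0U;
    return true;
  }
  return false;
}

/* Bitwise 2-out-of-3 majority vote over three reads of the same source. */
static uint32_t vote3(uint32_t a, uint32_t b, uint32_t c) {
  return (a & b) | (a & c) | (b & c);
}
```

Calling debounce_update() at a fixed sampling period and reading db.stable gives the filtered read, while the return value flags an edge.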
Feedback, more ideas?
Giovanni
[INFO] Functional Safety Elements
- Giovanni
- Site Admin
- Posts: 14457
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
- alex31
- Posts: 379
- Joined: Fri May 25, 2012 10:23 am
- Location: toulouse, france
- Has thanked: 38 times
- Been thanked: 62 times
- Contact:
Re: [INFO] Functional Safety Elements
Hello,
Nice idea. Better to have well-written, tested code than to reinvent the wheel!
Do you mean debouncing instead of debouching?
Alexandre
-
- Posts: 77
- Joined: Sat Mar 19, 2016 8:07 pm
- Been thanked: 17 times
Re: [INFO] Functional Safety Elements
I'm not working in an industry with specific functional safety requirements, so I'm not that experienced with such requirements and standards. But I still like to improve the reliability of the stuff I build if it doesn't take too long to implement.
Here are some more ideas in this area from me:
- Flash CRC check on startup and periodically in the idle thread.
The build process should make it easy to put a CRC32 at a specific fixed flash location, which is then compared to the calculated value on boot. It should also be easy to hook a periodic check (for example every hour) into the idle thread. The frequency should be easily adjustable, so battery-powered devices could do this at a much reduced frequency.
Flipped bits in flash really do happen. At work we support thousands of ATmega-based devices. I put a CRC check into the bootloader: it refuses to boot faulty firmware and shows a specific LED blink code instead. Every year or two a customer calls because this has happened. I then reflash the firmware via remote support and the problem is gone for that customer, so it really is a bit flip and not faulty hardware.
- Enhanced internal monitoring of the scheduler
The idea is to improve self-detection of scheduling bugs or deadlocks. I'm not really sure what the best way to do this is. Maybe a counter that lets you read out how many ticks ago a specific thread was last running? A dedicated thread could periodically read these values, compare them to programmed worst-case values, and reload the watchdog timer only if the values are within the expected bands.
- Complete clear of the SRAM during early init to support the SRAM parity check option some STM32s have.
If there is any read from a non-initialized memory position, you get a parity error and NMI when the parity check in the option byte is enabled.
- Support in HAL for controlling the break input that some STM32 timers have in their PWM modes
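The flash CRC idea from the first item could be sketched roughly as follows. This assumes the common reflected CRC-32 (polynomial 0xEDB88320); the struct and function names are hypothetical, and the check is split into slices so a low-priority thread never has to scan the whole image in one go.

```c
#include <stddef.h>
#include <stdint.h>

/* Update a running CRC-32 with 'len' more bytes. Start from 0xFFFFFFFF
   and invert the final value to get the conventional result. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len) {
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      crc = (crc & 1U) ? ((crc >> 1) ^ 0xEDB88320U) : (crc >> 1);
    }
  }
  return crc;
}

/* Hypothetical checker state for one flash range. */
typedef struct {
  const uint8_t *image;     /* Start of the checked range. */
  size_t         size;      /* Length of the range in bytes. */
  uint32_t       expected;  /* CRC stored at the fixed flash location. */
  size_t         offset;    /* Progress of the current pass. */
  uint32_t       crc;       /* Running CRC, initially 0xFFFFFFFF. */
} flash_check_t;

typedef enum { CHECK_BUSY, CHECK_OK, CHECK_FAILED } check_result_t;

/* Process up to 'chunk' bytes. Returns CHECK_BUSY until a pass completes,
   then reports the verdict and rearms itself for the next pass. */
static check_result_t flash_check_step(flash_check_t *fc, size_t chunk) {
  size_t left = fc->size - fc->offset;
  size_t n = (chunk < left) ? chunk : left;

  fc->crc = crc32_update(fc->crc, fc->image + fc->offset, n);
  fc->offset += n;
  if (fc->offset < fc->size) {
    return CHECK_BUSY;
  }

  uint32_t result = ~fc->crc;
  fc->offset = 0U;
  fc->crc = 0xFFFFFFFFU;
  return (result == fc->expected) ? CHECK_OK : CHECK_FAILED;
}
```

The chunk size and the interval between steps are the knobs a battery-powered device would turn down.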
Re: [INFO] Functional Safety Elements
Hi,
Good ideas. In general, the SE module should contain generic, stand-alone code; some of your suggestions are better seen as improvements to other modules (which is still possible).
Some comments:
> Flash CRC check on startup and periodically in the idle thread.
This can be implemented as a periodic, low-priority monitor thread. I do this in some projects.
> The bootloader prevents booting a faulty firmware
Here I like the idea of creating some kind of standard safe/secure bootloader as part of ChibiOS. A bootloader is almost always required in a commercial project, and we should have one.
How would we generate a CRC/checksum at link time? If that is not possible, it would require some kind of tool invoked after the linker in the build process. Just an idea: such a tool could generate CRCs, but potentially also signatures using a secret key and similar security-related things. Perhaps it is time to learn some Python.
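The post-link step could look roughly like the sketch below, written in C here although the post floats Python for such a tool. It assumes a hypothetical image layout in which the last 4 bytes are reserved for a little-endian CRC-32 of everything before them; the function names are made up, and reading and writing the binary file is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional reflected CRC-32 (polynomial 0xEDB88320) of a buffer. */
static uint32_t image_crc32(const uint8_t *data, size_t len) {
  uint32_t crc = 0xFFFFFFFFU;
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      crc = (crc & 1U) ? ((crc >> 1) ^ 0xEDB88320U) : (crc >> 1);
    }
  }
  return ~crc;
}

/* Build-time side: compute the CRC of the payload and store it in the
   reserved trailing 4 bytes. Returns -1 if the image is too small. */
static int image_patch_crc(uint8_t *image, size_t size) {
  if (size < 4U) {
    return -1;
  }
  uint32_t crc = image_crc32(image, size - 4U);
  for (size_t i = 0; i < 4U; i++) {
    image[size - 4U + i] = (uint8_t)(crc >> (8U * i));
  }
  return 0;
}

/* Boot-time side: recompute the CRC and compare with the stored value. */
static bool image_verify_crc(const uint8_t *image, size_t size) {
  if (size < 4U) {
    return false;
  }
  uint32_t stored = 0U;
  for (size_t i = 0; i < 4U; i++) {
    stored |= (uint32_t)image[size - 4U + i] << (8U * i);
  }
  return image_crc32(image, size - 4U) == stored;
}
```

A signature scheme would keep the same shape, with the CRC replaced by a keyed signature over the payload.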
> Enhanced internal monitoring of the scheduler
This is probably coming in RT7 in the form of "thread watchdogs".
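Independently of how RT7 ends up doing it, the quoted idea can be sketched with per-thread heartbeat records. Everything here is hypothetical (names, fields, tick source); it only illustrates the "last seen versus worst-case gap" comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread record. */
typedef struct {
  uint32_t last_seen;  /* Tick at which the thread last checked in. */
  uint32_t max_age;    /* Programmed worst-case gap, in ticks. */
} heartbeat_t;

/* Called by each monitored thread from its main loop. */
static void heartbeat_checkin(heartbeat_t *hb, uint32_t now) {
  hb->last_seen = now;
}

/* Called by the monitor thread. Returns true only if every thread checked
   in recently enough. The unsigned subtraction stays correct when the
   tick counter wraps around. */
static bool heartbeat_all_alive(const heartbeat_t *hbs, size_t n,
                                uint32_t now) {
  for (size_t i = 0; i < n; i++) {
    if ((uint32_t)(now - hbs[i].last_seen) > hbs[i].max_age) {
      return false;
    }
  }
  return true;
}
```

The monitor thread would reload the hardware watchdog only when heartbeat_all_alive() returns true, so a single stuck thread eventually causes a reset.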
> Complete clear of the SRAM during early init to support the SRAM parity check option some STM32s have.
This is already possible: you can insert initialization code between the reset vector and the CRT0 entry point by redefining a weak symbol. It is not in CRT0 because this kind of initialization is platform-specific, while CRT0 is generic code.
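The clearing itself is a simple fill loop, sketched below with a hypothetical function name. On a real target it would be called from the early hook with bounds taken from linker-script symbols, and it must not depend on the RAM being cleared (the stack in particular), which is why such code is often written in assembly.

```c
#include <stdint.h>

/* Write zero to every word in [start, end) so that each location has been
   written at least once before anything reads it. */
static void ram_clear(volatile uint32_t *start, volatile uint32_t *end) {
  while (start < end) {
    *start++ = 0U;
  }
}
```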
> Support in HAL for controlling the break input that some STM32 timers have in their PWM modes
This is a HAL PWM driver improvement. We could discuss it in that context.