Stack guard pages on Cortex-M4
Posted: Mon Feb 11, 2019 1:08 am
Hi:
I'm still working on a hobby project using the Nordic Semi nRF52840, which uses a Cortex-M4 core. Since it includes MPU support, I decided to experiment with the PORT_ENABLE_GUARD_PAGES feature. I got it to work, but I ran into a couple of unusual issues. I'm not sure if they're bugs per se, but at the very least they seem like inconsistencies.
My configuration:
OS: ChibiOS 18.2.1
Compiler: GCC 8.2.0
CPU: Nordic nRF52840 Cortex-M4 with FPU enabled
RAM: 256KB
Flash: 1MB
First, in os/common/ports/ARMCMx/chcore_v7m.h, there is the following macro:
#define PORT_INT_REQUIRED_STACK 64
I understand that you need to reserve some space on the stack to hold exception frames, but I'm curious as to where this specific number came from. I got the ARM Cortex-M4 user guide from here:
http://infocenter.arm.com/help/topic/co ... UI0553.pdf
Figure 2-3 on page 2-27 shows that there are two possible exception frame layouts: one with preserved floating point context and one without. When there is no floating point context the exception frame is 8 words in size (32 bytes), and when there is floating point context it's 26 words (104 bytes).
That makes the worst case size 104 bytes, which is larger than the 64 bytes that ChibiOS currently reserves. I would expect the definition to be something more like:
#if CORTEX_USE_FPU == TRUE
#define PORT_INT_REQUIRED_STACK 104
#else
#define PORT_INT_REQUIRED_STACK 32
#endif
Is there any particular reason why it's not done this way? I can confirm that in my setup the CPU does occasionally save FP context. (Disclaimer: I don't know the exception frame formats of all ARMv7-M processors, so maybe there is one that's 64 bytes in size.)
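For reference, here is the arithmetic behind those two numbers as a small C sketch. The word counts follow the frame layouts described above (8 words without FP context, 26 with it); the macro and function names are my own and are not ChibiOS identifiers, and any extra alignment padding the core may insert is not counted.

```c
#include <stddef.h>
#include <stdint.h>

/* Words stacked by hardware on exception entry (Cortex-M4). */
#define BASIC_FRAME_WORDS    8U   /* R0-R3, R12, LR, PC, xPSR */
#define FP_EXTRA_WORDS       18U  /* S0-S15, FPSCR, one reserved word */
#define EXTENDED_FRAME_WORDS (BASIC_FRAME_WORDS + FP_EXTRA_WORDS)

/* Hypothetical helper: size in bytes of the hardware exception frame.
   Pass nonzero when floating point context is stacked. */
static size_t exception_frame_bytes(int fp_context_saved) {
  size_t words = fp_context_saved ? EXTENDED_FRAME_WORDS : BASIC_FRAME_WORDS;
  return words * sizeof(uint32_t);
}

/* Hypothetical helper: does a reserved amount cover the worst-case frame? */
static int reserve_covers_frame(size_t reserved_bytes, int fp_context_saved) {
  return reserved_bytes >= exception_frame_bytes(fp_context_saved);
}
```

With these definitions, a 64-byte reserve covers the basic frame but not the extended one, which is the mismatch I am asking about.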
The second inconsistency has to do with alignment of the stacks. The exception stack and the main thread stack are allocated via the linker scripts, in particular os/common/startup/ARMCMx/compilers/GCC/ld/rules_stacks.ld.
The MPU requires that a protected region's base address be aligned on a boundary that matches the requested region size. The smallest size you can specify is 32 bytes, which is the size that ChibiOS chooses for its guard pages, so the base address of a guard page must be aligned on a 32-byte boundary: 0x20002000 is ok, but 0x20002010 is not. If you specify the latter address, the MPU will treat it as 0x20002000.
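To make the alignment rule concrete, here is a minimal C sketch of the check and of the masking effect, assuming power-of-two region sizes. The function names are made up for illustration and are not part of ChibiOS or CMSIS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a valid base for a region of the given size.
   The size must be a power of two (32 is the smallest, per the text). */
static bool mpu_base_is_aligned(uint32_t addr, uint32_t size) {
  assert(size != 0U && (size & (size - 1U)) == 0U);
  return (addr & (size - 1U)) == 0U;
}

/* Base address the region effectively starts at when the low
   address bits are ignored. */
static uint32_t mpu_effective_base(uint32_t addr, uint32_t size) {
  assert(size != 0U && (size & (size - 1U)) == 0U);
  return addr & ~(size - 1U);
}
```

For the two addresses above, mpu_base_is_aligned() accepts 0x20002000 and rejects 0x20002010, and mpu_effective_base(0x20002010, 32) gives 0x20002000.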
In os/common/ports/ARMCMx/chcore_v7m.h, we have:
#define PORT_WORKING_AREA_ALIGN (PORT_ENABLE_GUARD_PAGES == TRUE ? \
32U : PORT_STACK_ALIGN)
This enforces the correct alignment for all stacks declared with THD_WORKING_AREA().
The problem is that this macro doesn't apply to the main thread stack and exception stack since they're declared via linker script instead. In os/common/startup/ARMCMx/compilers/GCC/ld/rules_stacks.ld, it says:
/* Special section for exceptions stack.*/
.mstack :
{
    . = ALIGN(8);
    __main_stack_base__ = .;
    . += __main_stack_size__;
    . = ALIGN(8);
    __main_stack_end__ = .;
} > MAIN_STACK_RAM

/* Special section for process stack.*/
.pstack :
{
    __process_stack_base__ = .;
    __main_thread_stack_base__ = .;
    . += __process_stack_size__;
    . = ALIGN(8);
    __process_stack_end__ = .;
    __main_thread_stack_end__ = .;
} > PROCESS_STACK_RAM
Here the enforced alignment is only 8. I suppose if you are lucky you might end up with the main and exception stacks aligned on a 32-byte boundary, but I was not lucky. This meant that the protected region for the main thread stack overlapped the start of the exception stack, and I got a memory management fault as soon as the first exception happened after the main thread was created.
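Here is a rough C sketch of how I understand the overlap. It assumes the process stack is placed directly after the exception stack in RAM and that the guard region is programmed at the base of the main thread stack; the addresses in the comments are invented for illustration and are not taken from my map file.

```c
#include <stdint.h>

#define GUARD_SIZE 32U  /* guard page size used by ChibiOS, per the text */

/* Hypothetical helper: number of bytes below stack_base that a guard
   region ends up covering when stack_base is not GUARD_SIZE-aligned.
   Those bytes belong to whatever sits just below, which under the
   assumption above is the top of the exception stack. */
static uint32_t guard_overlap_bytes(uint32_t stack_base) {
  uint32_t effective_base = stack_base & ~(GUARD_SIZE - 1U);
  return stack_base - effective_base;
}

/* Example: a base of 0x20000408 is 8-byte aligned but not 32-byte
   aligned, so the guard effectively starts at 0x20000400 and covers
   the 8 bytes at 0x20000400..0x20000407. */
```

Since the stack grows downward, those covered bytes are the first ones the exception stack uses, which would explain why the fault shows up on the first exception.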
I ended up doing the following workarounds:
- In chconf.h:
. define PORT_INT_REQUIRED_STACK to 104 (I have the FPU enabled)
. define PORT_ENABLE_GUARD_PAGES to TRUE
. define CH_DBG_ENABLE_STACK_CHECK to TRUE
- In my project's custom linker script:
. change ALIGN(8) to ALIGN(32)
Then everything worked as expected.