threads crashes with STM32F746 when using STM32F746xG_MAX.ld link script Topic is solved


Postby Abusous2000 » Mon May 25, 2020 7:55 pm

I am using the STM32F746G-Discovery board with the latest ChibiOS. The default linker script (STM32F746xG.ld) uses the ram3 section for BSS_RAM, and whenever I switch to the STM32F746xG_MAX.ld linker script (which places BSS_RAM in ram0) my threads start crashing or malfunctioning.
I managed to replicate this issue in one of your demo projects (RT-STM32-LWIP-FATFS-USB): just switch the linker script to STM32F746xG_MAX.ld in /make/stm32f746_discovery.make and the FATFS file system won't mount!
I am running very low on memory, and the 64k of RAM in ram3 isn't enough to run all my threads. I need to switch BSS_RAM to ram0, but whenever I do, I run into many problems.
I would appreciate any help on this one. I am aware that by default you use ram3 for all threads to eliminate cache coherence issues.


Postby Giovanni » Mon May 25, 2020 8:41 pm

Hi,

The default script puts the BSS into uncached memory. If you put it in other RAM areas, then you need to handle cache consistency yourself for all drivers using DMA.

Giovanni


Postby Abusous2000 » Mon May 25, 2020 9:01 pm

Thanks for the quick reply Giovanni. Yes, I am aware of the DMA and cache coherence issues with STM32F7xx boards.
However, how can I make the demo project RT-STM32-LWIP-FATFS-USB work with the STM32F746xG_MAX.ld linker script?
When I use STM32F746xG_MAX.ld, FATFS won't even mount. But when I switch back to the default linker script (STM32F746xG.ld), it works right away.
Any suggestions?
Many thanks in advance


Postby Giovanni » Mon May 25, 2020 9:05 pm

Hi,

The demo is not designed to work with that linker script; it relies on the BSS not being cacheable.

Giovanni


Postby Abusous2000 » Mon May 25, 2020 9:14 pm

I hear you Giovanni. I have an application that uses FATFS, but there are several other threads all crammed into the ram3 section, which is limited to only 64k. This makes the board much less useful. On the other hand, with the STM32F769i, I have more leeway.

If you have any idea how to make it work, I would appreciate it.
Thanks


Postby Giovanni » Mon May 25, 2020 9:19 pm

You need to make a custom script and make sure your DMA buffers go in TCM; the rest can go in the other RAMs.
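A minimal sketch of what placing a DMA buffer in TCM could look like on the application side. It assumes the custom linker script maps ram3 to the 64k TCM and collects input sections named ".ram3" there, as the stock ChibiOS scripts appear to do; the DMA_BUFFER macro and the dma_scratch buffer are hypothetical names used only for illustration. Whether such a buffer is zeroed at startup depends on the script.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the linker script places input sections named ".ram3" in the
   64k TCM. Check your own script before relying on this name. */
#if defined(__arm__)
#define DMA_BUFFER __attribute__((section(".ram3"), aligned(32)))
#else
/* Host build: keep only the alignment so the example still compiles. */
#define DMA_BUFFER __attribute__((aligned(32)))
#endif

/* Hypothetical 512-byte scratch buffer handed to a DMA-based driver. */
static uint8_t dma_scratch[512] DMA_BUFFER;

static_assert(sizeof(dma_scratch) == 512U, "unexpected buffer size");
```

Everything that is not tagged this way would follow the default placement chosen by the script, for example BSS in ram0.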

Giovanni


Postby wpaul » Wed May 27, 2020 8:43 pm

There are actually two additional options for addressing this issue, each with its own advantages and disadvantages.

Option 1: Use the MPU to make the SRAM uncached

To do this, you need to make sure the MPU support is included. I think the simplest way to do that is to turn on the stack guard page feature. You can do this by adding the following to your halconf.h:

#define PORT_ENABLE_GUARD_PAGES TRUE


Then you need to use mpuConfigureRegion() to activate a region and set its properties. You should add the following code to your main() at a point before you start using your DMA-capable controller devices:

        mpuConfigureRegion (MPU_REGION_1, 0x20000000,
            MPU_RASR_ATTR_AP_RW_RW | MPU_RASR_ATTR_SHARED_DEVICE |
            MPU_RASR_SIZE_512K | MPU_RASR_ENABLE);


0x20000000 is the base address of the internal RAM.

The MPU supports 8 regions. When stack guard pages are enabled, ChibiOS uses region 7, so I used region 1 here. According to the Cortex-M7 manual, when two MPU regions overlap, the one with the highest region number takes precedence, so the stack guard page feature should still work even with this extra region enabled.

The "SHARED_DEVICE" attribute tells the CPU that this region will be accessed by both CPU and another bus master peripheral, which disables caching.

Note that if you use the FSMC controller and the external SDRAM that's available on the STM32F746-DISCO board, I think the memory is uncached by default. You can use the MPU to make it cached. I did it like this:

        mpuConfigureRegion (MPU_REGION_3, FSMC_Bank5_MAP_BASE,
            MPU_RASR_ATTR_AP_RW_RW | MPU_RASR_ATTR_CACHEABLE_WB_WA |
            MPU_RASR_SIZE_8M | MPU_RASR_ENABLE);


Note that support for the FSMC controller is provided in the ChibiOS-Contrib repo (see the community subdirectory in the official ChibiOS releases).

Caveats:

The MPU doesn't allow you to set arbitrary sizes: you have to use a power of 2 selector.

Also, the MPU has alignment constraints on its base address value: the address must also be aligned on the same boundary as the size. So if you want to define a 64K region, its base address must be on a 64K boundary.
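The two constraints above can be checked before calling mpuConfigureRegion(). The following is a sketch only: mpu_region_is_valid() and mpu_size_field() are hypothetical helpers, not part of ChibiOS. It relies on the ARMv7-M rule that the RASR SIZE field encodes a region of 2^(SIZE+1) bytes, with 32 bytes as the minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Smallest region the ARMv7-M MPU can describe, in bytes. */
#define MPU_MIN_REGION_SIZE 32U

/* True when size is a power of two of at least 32 bytes and base is a
   multiple of size (hypothetical helper, not a ChibiOS API). */
static bool mpu_region_is_valid(uint32_t base, uint32_t size) {
  bool pow2 = (size >= MPU_MIN_REGION_SIZE) && ((size & (size - 1U)) == 0U);
  return pow2 && ((base & (size - 1U)) == 0U);
}

/* RASR SIZE field value for a valid size: the region spans 2^(SIZE+1)
   bytes, so the field is log2(size) - 1. */
static uint32_t mpu_size_field(uint32_t size) {
  uint32_t log2 = 0U;
  assert((size >= MPU_MIN_REGION_SIZE) && ((size & (size - 1U)) == 0U));
  while ((size >> log2) > 1U) {
    log2++;
  }
  return log2 - 1U;
}
```

With these, a 256K region based at 0x20010000 is rejected, while a 512K region based at 0x20000000 is accepted.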

Here I used 512KB as the size, because the TCM RAM starts at 0x20000000 and ends at 0x2000FFFF (64K size), and the 256KB of SRAM starts at 0x20010000 and ends at 0x2004FFFF. The 256KB of SRAM is not aligned on a 256KB boundary, and the next power of 2 up from 256K is 512K, so I used that with a base of 0x20000000. The region therefore also covers the TCM and extends past the end of the SRAM to 0x2007FFFF. It's not entirely correct, but it's harmless. If you want, you can use a collection of smaller regions to map just the actual SRAM and avoid the overflow.
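As an illustration of the "collection of smaller regions" idea, here is a sketch that splits an address range into naturally aligned power-of-two blocks, each of which would be a legal MPU region. The mpu_block_t type and mpu_split_range() function are hypothetical names, not ChibiOS APIs, and the sketch only computes the blocks; it does not program the MPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint32_t base;  /* Region base address, aligned to size. */
  uint32_t size;  /* Region size in bytes, a power of two >= 32. */
} mpu_block_t;

/* Splits [start, end) into aligned power-of-two blocks, taking the largest
   legal block at each step. Returns the number of blocks written to out,
   or 0 if more than max blocks would be needed. */
static size_t mpu_split_range(uint32_t start, uint32_t end,
                              mpu_block_t *out, size_t max) {
  size_t n = 0U;
  assert((start % 32U == 0U) && (end % 32U == 0U) && (start < end));
  while (start < end) {
    uint32_t size = 32U;
    /* Double the block while it stays aligned at start and still fits. */
    while ((size < 0x80000000U) &&
           ((start & ((size << 1) - 1U)) == 0U) &&
           ((size << 1) <= (end - start))) {
      size <<= 1;
    }
    if (n == max) {
      return 0U;
    }
    out[n].base = start;
    out[n].size = size;
    n++;
    start += size;
  }
  return n;
}
```

For the SRAM range 0x20010000 to 0x2004FFFF (passed as start 0x20010000, end 0x20050000) this yields three blocks: 64K at 0x20010000, 128K at 0x20020000 and 64K at 0x20040000. Each block would need its own MPU region number.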

Option 2: Sync the cache manually with the cacheBufferFlush() and cacheBufferInvalidate() routines when doing DMA transfers

This requires modifying either your application code or the drivers to perform the flush and invalidate operations as applicable. Before you do a memory to peripheral transfer, you must flush the source buffer. After you complete a peripheral to memory transfer, you must invalidate the destination buffer.

Caveats:

Flush/invalidate granularity is the size of a cache line, which for this CPU is 32 bytes. Special care must be taken to avoid having your source and destination buffers share a cache line with another buffer or variable, because flushing or invalidating the line affects everything stored in it.

Also, you need to be reasonably familiar with the driver or application code in order to make the right code changes.
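One way to respect the cache line granularity is to align each DMA buffer to 32 bytes and pad its size to a multiple of 32, so that it never shares a line with anything else. The sketch below uses standard C11 only; DMA_LINE_SIZE, DMA_ROUND_UP, rx_buf and dma_lines_touched() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Data cache line size on this CPU, in bytes. */
#define DMA_LINE_SIZE 32U

/* Rounds a byte count up to a whole number of cache lines. */
#define DMA_ROUND_UP(n) \
  (((n) + DMA_LINE_SIZE - 1U) & ~(DMA_LINE_SIZE - 1U))

/* A 100-byte receive buffer, padded to 128 bytes and aligned to a line. */
static _Alignas(32) uint8_t rx_buf[DMA_ROUND_UP(100U)];

static_assert(sizeof(rx_buf) % DMA_LINE_SIZE == 0U,
              "DMA buffer must occupy whole cache lines");

/* Number of cache lines touched by a buffer of len bytes (len > 0)
   starting at addr. */
static size_t dma_lines_touched(uintptr_t addr, size_t len) {
  uintptr_t mask = ~(uintptr_t)(DMA_LINE_SIZE - 1U);
  uintptr_t first = addr & mask;
  uintptr_t last = (addr + len - 1U) & mask;
  return (size_t)((last - first) / DMA_LINE_SIZE) + 1U;
}
```

A 32-byte buffer that starts 16 bytes into a line touches two lines, which is exactly the sharing hazard described above. With a buffer laid out like rx_buf, the driver or application would invalidate it after a peripheral to memory transfer completes, for example with cacheBufferInvalidate(rx_buf, sizeof rx_buf), assuming the routine takes an address and a byte count.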

The advantage to option 1 is that it requires very little in the way of code changes. The disadvantage is that by marking all of the SRAM uncached, you sacrifice all of the performance that the data cache is supposed to give you. It's up to you as the designer to decide if that matters or not for your application.

The advantage to option 2 is that it gives you reasonably fine-grained control over cache synchronization (buffers which are not used for DMA can remain cached and you will retain the data cache performance gains). The disadvantage is that it takes a bit more knowledge and experience to make the necessary code changes.

-Bill
Last edited by wpaul on Wed May 27, 2020 8:59 pm, edited 1 time in total.


Postby Giovanni » Wed May 27, 2020 8:56 pm

Hi,

Just a few notes:

PORT_ENABLE_GUARD_PAGES TRUE

This should go in chconf.h; there is a dedicated section at the bottom of the file. You also need to enable stack checking in there. If you place it in halconf.h, it is not seen by all modules.
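A sketch of the two chconf.h settings involved. The name CH_DBG_ENABLE_STACK_CHECK is taken from the stock chconf.h debug options and is an assumption about your ChibiOS version; in an existing chconf.h you would change the definition that is already there rather than add a second one.

```c
/* chconf.h, debug options: stack checking, required by guard pages.
   Assumption: this is the option name in your ChibiOS version. */
#define CH_DBG_ENABLE_STACK_CHECK           TRUE

/* chconf.h, port-specific settings near the end of the file. */
#define PORT_ENABLE_GUARD_PAGES             TRUE
```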

In the latest versions the region used is now 7. It makes no difference if you use region 1, but region 0 is available too.

I am adding a port section to the ChibiOS book with this kind of low level details.

About cache handling, I would avoid it if possible; it is complex, especially if you are inexperienced with the HAL. The simplest and safest approach is to use the MPU to make a memory region not cacheable, as suggested. TCM is never cached, so it is convenient for DMA buffers.

Giovanni
