Where to start STM32H7 support
Moderators: RoccoMarco, barthess
-
- Posts: 141
- Joined: Mon Sep 25, 2017 8:27 am
- Location: Canberra, Australia
- Has thanked: 10 times
- Been thanked: 20 times
- Contact:
small fix for ADCSEL bug on H7
Here is a small fix for the ADCSEL values on H7
- Attachments
-
- STM32-H7-ADCSEL-fix.zip
- (672 Bytes) Downloaded 218 times
-
memory domains and DMA on H7
Hi All,
One of the more interesting aspects of the H7 is how memory is split into domains, with different DMA controllers able to talk to different domains.
I thought I'd explain how I've got this set up, and hope that others will post their strategies as well.
First the basics. There are 7 built-in memory banks:
- ITCM at 0x00000000 of size 64k
- DTCM at 0x20000000 of size 128k
- AXI SRAM at 0x24000000 of size 512k
- SRAM1, SRAM2 and SRAM3 starting at 0x30000000 of total size 288k
- SRAM4 at 0x38000000 of size 64k
Peripherals and memory are attached to one of 3 domains called D1, D2 and D3.
With all of the connections between the various buses it isn't completely obvious (at least to me) from the reference manual which peripherals can do DMA from which memory regions. I ended up testing it experimentally, and found the following for the peripherals I use:
- can't use ITCM or DTCM for any DMA
- SPI1 to SPI5 can use AXI SRAM, SRAM1 to SRAM3 and SRAM4 for DMA
- SPI6, I2C4 and ADC3 can use SRAM4 on BDMA (I didn't actually test ADC3)
- UARTs can use AXI SRAM, SRAM1 to SRAM3 and SRAM4 for DMA
- I2C1, I2C2 and I2C3 can use AXI SRAM, SRAM1 to SRAM3 and SRAM4 with DMA
- timers can use AXI SRAM, SRAM1 to SRAM3 and SRAM4 with DMA
- ADC12 can use AXI SRAM, SRAM1 to SRAM3 and SRAM4
- SDMMC can use AXI SRAM, SRAM1 to SRAM3 with IDMA (cannot use SRAM4)
The last one is the most unfortunate, as I was hoping to find a memory region that was universal for DMA. As you can see, everything can use SRAM4 except for SDMMC, and as SPI6 needs to use SRAM4 we can't have a single universal DMA memory pool.
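The findings above can be written down as a small lookup, which is handy for asserting on buffers before starting a transfer. This is only a sketch: the addresses and sizes are the ones listed in this post, the three peripheral classes (`PERIPH_DMA`, `PERIPH_BDMA`, `PERIPH_SDMMC`) and all function names are made up for illustration, and the rules encode the experimental results reported here rather than anything taken from the reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Memory regions as listed in the post (SRAM1..SRAM3 treated as one block). */
typedef enum { MEM_ITCM, MEM_DTCM, MEM_AXI, MEM_SRAM123, MEM_SRAM4, MEM_NONE } mem_region_t;

typedef struct {
    uint32_t base;
    uint32_t size;
} mem_range_t;

static const mem_range_t mem_ranges[] = {
    [MEM_ITCM]    = {0x00000000u,  64u * 1024u},
    [MEM_DTCM]    = {0x20000000u, 128u * 1024u},
    [MEM_AXI]     = {0x24000000u, 512u * 1024u},
    [MEM_SRAM123] = {0x30000000u, 288u * 1024u},
    [MEM_SRAM4]   = {0x38000000u,  64u * 1024u},
};

/* Hypothetical grouping: general DMA peripherals, BDMA-only peripherals
 * (SPI6, I2C4, ADC3 in the post) and SDMMC with its IDMA. */
typedef enum { PERIPH_DMA, PERIPH_BDMA, PERIPH_SDMMC } periph_class_t;

/* Returns the region that fully contains [addr, addr + len), or MEM_NONE. */
static mem_region_t region_of(uint32_t addr, uint32_t len)
{
    for (int r = MEM_ITCM; r < MEM_NONE; r++) {
        uint32_t base = mem_ranges[r].base;
        uint32_t size = mem_ranges[r].size;
        if (addr >= base && (addr - base) < size && len <= size - (addr - base)) {
            return (mem_region_t)r;
        }
    }
    return MEM_NONE;
}

/* True if the buffer may be used for DMA by this class of peripheral,
 * according to the experimental results in the post. */
static bool dma_safe(periph_class_t p, uint32_t addr, uint32_t len)
{
    assert(len > 0u);
    switch (region_of(addr, len)) {
    case MEM_AXI:
    case MEM_SRAM123:
        return p != PERIPH_BDMA;   /* BDMA peripherals were only seen working with SRAM4 */
    case MEM_SRAM4:
        return p != PERIPH_SDMMC;  /* SDMMC IDMA could not use SRAM4 */
    default:
        return false;              /* ITCM, DTCM or outside any bank */
    }
}
```

For example, `dma_safe(PERIPH_SDMMC, 0x38000000u, 512u)` is false while the same buffer is fine for the other two classes, which is exactly why no single pool works for everything.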
So I ended up with this setup:
- data, stack and primary heap on DTCM
- all DMA except for SDMMC on SRAM4
- SDMMC uses AXI SRAM for IDMA
- secondary heaps available on SRAM1 to SRAM3, SRAM4, AXI SRAM and ITCM so the full 1Mbyte is available
With ITCM I chopped out the first 1k so that NULL pointer checks can be used.
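As a rough illustration of that layout, here is a table of heap regions with a first-fit allocator over it. It is a sketch under several assumptions: the region order is hypothetical, DTCM is listed at its full size even though data and stacks share it in practice, the allocator only hands out addresses and never frees, and none of the names come from ChibiOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KIB(n) ((uint32_t)(n) * 1024u)

typedef struct {
    const char *name;
    uint32_t start;  /* first usable address */
    uint32_t size;   /* usable bytes */
    uint32_t used;   /* bytes handed out so far */
} heap_region_t;

/* Order of preference (an assumption). ITCM skips its first 1k so that
 * address 0 is never returned and NULL checks keep working. */
static heap_region_t heaps[] = {
    {"DTCM",     0x20000000u,          KIB(128),         0u},
    {"ITCM",     0x00000000u + KIB(1), KIB(64) - KIB(1), 0u},
    {"AXI SRAM", 0x24000000u,          KIB(512),         0u},
    {"SRAM1-3",  0x30000000u,          KIB(288),         0u},
    {"SRAM4",    0x38000000u,          KIB(64),          0u},
};

#define HEAP_COUNT (sizeof(heaps) / sizeof(heaps[0]))

/* Sum of all usable heap bytes across the regions. */
static uint32_t heap_total_bytes(void)
{
    uint32_t total = 0u;
    for (size_t i = 0; i < HEAP_COUNT; i++) {
        total += heaps[i].size;
    }
    return total;
}

/* First-fit allocation of n bytes (rounded up to 8); returns 0 on failure. */
static uint32_t region_alloc(size_t n)
{
    assert(n > 0u);
    if (n > KIB(512)) {
        return 0u;  /* larger than the largest single region */
    }
    uint32_t need = ((uint32_t)n + 7u) & ~7u;
    for (size_t i = 0; i < HEAP_COUNT; i++) {
        if (heaps[i].size - heaps[i].used >= need) {
            uint32_t addr = heaps[i].start + heaps[i].used;
            heaps[i].used += need;
            return addr;
        }
    }
    return 0u;
}
```

With the first 1k of ITCM removed the table adds up to 1055k, which is the "full 1Mbyte" mentioned above.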
In the above I haven't worried about the efficiency in terms of memory bandwidth and crossing domain boundaries with DMA. That doesn't really worry me for my application as I'm only doing about 200kbytes/sec of DMA in total, so I care more about keeping the code simple. If you wanted to move a lot more data then you'd want to match up peripherals to memory more carefully.
I've also ignored DMA for USB as ChibiOS doesn't seem to use that yet.
Also note that if you try to do DMA between a peripheral and a memory region that its DMA controller can't reach, you'll get a transfer error bit set in the DMA status register. If you use a blocking API (such as spiExchange()) then your code will just hang, as the DMA won't complete.
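One way to avoid that hang is to check the buffers before calling the blocking API and to go through a small bounce buffer when they are not reachable. The sketch below shows the idea off-target: `guarded_exchange()`, the two hook types and the host-side stand-ins are all hypothetical, and on real hardware the bounce buffers would have to be placed in a region that is DMA-capable for the bus in question, which is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOUNCE_SIZE 64u

/* On a real target these would live in a DMA-capable region for the bus
 * in use (placement, e.g. via a linker section, is not shown). */
static uint8_t bounce_tx[BOUNCE_SIZE];
static uint8_t bounce_rx[BOUNCE_SIZE];

/* Hypothetical hooks: a region check and a blocking exchange. */
typedef bool (*dma_safe_fn)(const void *buf, size_t n);
typedef void (*exchange_fn)(size_t n, const void *tx, void *rx);

/* Runs the exchange directly when both buffers are DMA safe, otherwise via
 * the bounce buffers. Returns false rather than starting a transfer that
 * could never complete. */
static bool guarded_exchange(dma_safe_fn is_safe, exchange_fn xfer,
                             size_t n, const void *tx, void *rx)
{
    assert(is_safe != NULL && xfer != NULL && tx != NULL && rx != NULL);
    if (is_safe(tx, n) && is_safe(rx, n)) {
        xfer(n, tx, rx);
        return true;
    }
    if (n > BOUNCE_SIZE) {
        return false;
    }
    memcpy(bounce_tx, tx, n);
    xfer(n, bounce_tx, bounce_rx);
    memcpy(rx, bounce_rx, n);
    return true;
}

/* Host-side stand-ins so the sketch can be exercised without hardware. */
static bool only_bounce_is_safe(const void *buf, size_t n)
{
    (void)n;
    return buf == (const void *)bounce_tx || buf == (const void *)bounce_rx;
}

static void invert_exchange(size_t n, const void *tx, void *rx)
{
    const uint8_t *in = tx;
    uint8_t *out = rx;
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)~in[i];  /* fake "device" replies with inverted bytes */
    }
}

/* Sends two bytes from a stack buffer that the check rejects; returns the
 * first reply byte, or -1 if the exchange was refused. */
static int demo_guarded_exchange(void)
{
    uint8_t tx[2] = {0x01u, 0x02u};
    uint8_t rx[2] = {0u, 0u};
    if (!guarded_exchange(only_bounce_is_safe, invert_exchange, 2u, tx, rx)) {
        return -1;
    }
    return rx[0];
}
```

The extra copy costs memory bandwidth, so this only makes sense for small transfers or as a debugging aid.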
- Giovanni
- Site Admin
- Posts: 14455
- Joined: Wed May 27, 2009 8:48 am
- Location: Salerno, Italy
- Has thanked: 1076 times
- Been thanked: 922 times
- Contact:
Re: Where to start STM32H7 support
I think SRAM4 is not very efficient for anything except BDMA. Probably an option should be added to drivers to use DMA instead of BDMA if/where it is possible (to be checked); this will probably mean even more preprocessor hell in the code.
Giovanni
-
Re: Where to start STM32H7 support
Giovanni wrote:I think SRAM4 is not very efficient for anything except BDMA
What impact does this have? Does it mean it uses more memory bandwidth? Stalls DMAs? Affects the cpu usage? Uses more power?
I can see from the bus diagram that it means memory transfers are crossing a bunch of interconnects, but as long as it doesn't affect DTCM memory or the CPU speed then it's probably OK for what I'm doing, but I am curious what the actual impact is. If I know what to look for then maybe I could write a benchmark which explores the impact.
Giovanni wrote:Probably an option should be added to drivers to use DMA instead of BDMA if/where it is possible (to be checked) probably this will mean even more preprocessor hell in code
I'd actually prefer to be able to use BDMA more if I could. I'm using all of the 16 DMA streams, with a lot of sharing going on, whereas I'm only using 4 of the 8 BDMA streams. I could reduce the amount of sharing if I could move some more peripherals to BDMA.
Current DMA map for Pixhawk4Pro: https://gist.github.com/tridge/4059c500 ... 8f9c276a42
As for cpp macro complexity, I agree on that. I don't have a solution though.
- Giovanni
Re: Where to start STM32H7 support
I think that crossing domains has an impact: you take bandwidth in all the matrices you cross and add wait states. See figure 1 in the RM; all the connections are there.
It is best to use a DMA closer to your destination RAM.
Giovanni
-
Re: Where to start STM32H7 support
Giovanni wrote:I think that crossing domains has an impact, you take bandwidth in all matrices you cross and add wait states.
If the CPU is mostly using data in DTCM and ITCM then I expected it to not add any wait states, as those don't go via the shared matrices. So basically I was relying on putting all the performance sensitive data in DTCM (all stacks, main heap, all static data, bss). Then ITCM is used as 2nd choice heap for anything that needs to be fast.
Giovanni wrote:It is best to use a DMA closer to your destination RAM.
Yes, it is just a matter of managing the complexity in the code. I may revisit and make each peripheral have a preferred heap, but I'd like to know that the complexity would gain something substantial.
- Giovanni
Re: Where to start STM32H7 support
About the SPI patch posted before, I am not sure it is correct: it disables SPI_CFG1_RXDMAEN and SPI_CFG1_TXDMAEN, which are never re-enabled. Subsequent operations would fail.
What is it trying to fix?
ADC problem fixed as bug #1016.
Giovanni
-
fix dummy byte handling in SPIv3 for H7
I attach a patch which fixes an issue with spiSend() and spiReceive() for SPIv3 on H7. The problem is that the dummyrx and dummytx variables may not be DMA safe, depending on where static data is declared and which SPI bus you are using.
The patch adds dummyrx and dummytx pointers in the SPIv3 config, which the caller needs to set up to point at 4 bytes of DMA safe memory suitable for the SPI device number. If the caller doesn't set them up then the old variables are used.
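The fallback logic described above could look roughly like this. It is not the patch itself: `spi_cfg_sketch_t` is a cut-down, hypothetical stand-in for the real config structure, the helper names are invented, and the `spi6_*` words stand for memory that the caller would place in a DMA-safe region for that bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the SPIv3 config; the real structure
 * has many more fields and the patch may name these differently. */
typedef struct {
    uint32_t *dummytx;  /* 4 bytes of DMA-safe memory, or NULL for the default */
    uint32_t *dummyrx;  /* 4 bytes of DMA-safe memory, or NULL for the default */
} spi_cfg_sketch_t;

/* Stand-ins for the driver's existing static dummy variables. */
static uint32_t default_dummytx = 0xFFFFFFFFu;
static uint32_t default_dummyrx;

/* Word the TX DMA reads from during a receive-only transfer. */
static uint32_t *tx_dummy_for(const spi_cfg_sketch_t *cfg)
{
    assert(cfg != NULL);
    return cfg->dummytx != NULL ? cfg->dummytx : &default_dummytx;
}

/* Word the RX DMA writes to during a send-only transfer. */
static uint32_t *rx_dummy_for(const spi_cfg_sketch_t *cfg)
{
    assert(cfg != NULL);
    return cfg->dummyrx != NULL ? cfg->dummyrx : &default_dummyrx;
}

/* Caller-provided words; on target these would be placed in memory the
 * bus's DMA controller can reach (for SPI6 in this thread, SRAM4). */
static uint32_t spi6_dummy_tx = 0xFFFFFFFFu;
static uint32_t spi6_dummy_rx;

static const spi_cfg_sketch_t spi6_cfg   = { &spi6_dummy_tx, &spi6_dummy_rx };
static const spi_cfg_sketch_t legacy_cfg = { NULL, NULL };
```

A config that leaves both pointers NULL behaves as before, so existing code would keep working unchanged.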
I know we'll need to change the examples as well. I'm happy to do that if you think that this approach is OK.
Previously I just avoided spiSend() and spiReceive() on H7 to avoid this issue, but that was costing extra memory bandwidth and it made the code more complex as I want to use them on F4 and F7.
Cheers, Tridge
- Attachments
-
- SPIv3-dummy.zip
- (1.21 KiB) Downloaded 210 times
-
Re: Where to start STM32H7 support
Giovanni wrote:About the SPI patch posted before, I am not sure it is correct: it disables SPI_CFG1_RXDMAEN and SPI_CFG1_TXDMAEN, which are never re-enabled. Subsequent operations would fail.
Do you mean the patch to fix spi_lld_polled_exchange()? That disables SPI_CFG1_RXDMAEN | SPI_CFG1_TXDMAEN, but only when doing a polled exchange. When polling you don't want DMA enabled.
It does mean you need to do a new spi_lld_start() if you are mixing polled and DMA transfers. I guess some users will want to do that, so they can use polled for small transfers and DMA for larger ones, so yes, I see what you mean about breaking that. In that case we could either re-enable the DMA bits after a polled transfer, or we could always set them before a DMA transfer. Which would you prefer?
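The second option (always set the bits before a DMA transfer) is sketched below against a fake register so it can be run on a host. The register struct and function names are hypothetical, and the bit positions are written from memory of the H7 SPI_CFG1 layout, so treat them as illustrative and take the real masks from the device header.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; use SPI_CFG1_RXDMAEN / SPI_CFG1_TXDMAEN from
 * the device header in real code. */
#define SKETCH_RXDMAEN  (1u << 14)
#define SKETCH_TXDMAEN  (1u << 15)
#define SKETCH_DMA_BITS (SKETCH_RXDMAEN | SKETCH_TXDMAEN)

/* Hypothetical stand-in for the SPI register block (only CFG1 shown). */
typedef struct {
    volatile uint32_t CFG1;
} spi_regs_sketch_t;

/* Before a polled exchange: make sure the SPI raises no DMA requests.
 * Returns the new CFG1 value. */
static uint32_t spi_prepare_polled(spi_regs_sketch_t *spi)
{
    assert(spi != 0);
    spi->CFG1 &= ~SKETCH_DMA_BITS;
    return spi->CFG1;
}

/* Before a DMA exchange: always set both request enables, so it does not
 * matter whether a polled exchange ran in between. Returns the new CFG1. */
static uint32_t spi_prepare_dma(spi_regs_sketch_t *spi)
{
    assert(spi != 0);
    spi->CFG1 |= SKETCH_DMA_BITS;
    return spi->CFG1;
}

/* Fake register block: some unrelated low bits set, DMA requests enabled. */
static spi_regs_sketch_t demo_spi = { 0x00000007u | SKETCH_DMA_BITS };
```

Setting the bits unconditionally before each DMA transfer keeps the polled path self-contained, at the cost of one extra register write per transfer.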
Giovanni wrote:What is it trying to fix?
It is fixing polled transfers. They didn't work at all. I was using polled transfers for a while before I got things fixed up with DMA.
Giovanni wrote:ADC problem fixed as bug #1016.
Thanks!
- Giovanni
Re: Where to start STM32H7 support
The SPI driver should allow for mixed DMA and non-DMA operations. Is leaving those 2 bits enabled a problem? The DMA channels are disabled. Apparently it is not a problem for other SPI implementations.
Giovanni