1,000,000 Sample/Sec ADC Logger

wgreiman
Posts: 33
Joined: Sun Oct 23, 2011 3:21 pm
Been thanked: 2 times

1,000,000 Sample/Sec ADC Logger

Postby wgreiman » Thu Jul 14, 2016 7:48 pm

I have been testing a version of my SdFat library on ChibiOS. If there is interest, I will post a preview version of the data logger.

Here is data from a test using a NUCLEO-F411RE.

million.png

Notice, one data point every microsecond.

I use TIM3 to trigger ADC1 a million times per second. I capture the data in 512 byte buffers in a high priority thread and write them to the SD in a lower priority thread.

I create a large contiguous file on an SD connected to SPI1. I write data to the file using a single raw SD multi-block write.

I don't use SDIO because the HAL has no method for a single large multi-block write. Modern high-performance SD cards don't perform well unless you pipeline large writes.

Here is output from the test program:
ch> log demo.bin 1000000
Type any character to stop.
Done
ch> stats
STM32_SYSCLK 96000000, CH_CFG_ST_FREQUENCY 10000
STM32_ADCCLK 24000000
run time : 256053 msec
block write count : 1000000
min write latency : 94 usec
max write latency : 1987 usec
avg write latency : 95 usec
max used buffers : 10
total buffer count : 128
ch> ls
2016-07-04 12:34:32 4096000000 big.bin
2016-07-14 06:13:20 512000000 demo.bin

This program creates a file with one million 512 byte blocks and logs 256,000,000 samples to the file in 256 seconds.

The average time to write a block is 95 microseconds. The max time was 1,987 microseconds. Notice that only 10 of the 128 block buffers were used.
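The buffer numbers above can be checked with a little arithmetic. The following is a minimal sketch, not code from the logger; the function names are made up for illustration, and the 2 bytes per sample figure is inferred from 512,000,000 bytes holding 256,000,000 samples.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to fill one block at a given sample rate. */
uint32_t block_fill_usec(uint32_t block_bytes, uint32_t bytes_per_sample,
                         uint32_t samples_per_sec) {
  assert(bytes_per_sample > 0 && samples_per_sec > 0);
  uint32_t samples_per_block = block_bytes / bytes_per_sample;
  return (uint32_t)(((uint64_t)samples_per_block * 1000000u) / samples_per_sec);
}

/* Blocks that arrive while one write is stalled (rounded up). */
uint32_t blocks_during_stall(uint32_t stall_usec, uint32_t fill_usec) {
  assert(fill_usec > 0);
  return (stall_usec + fill_usec - 1) / fill_usec;
}

/* Longest stall the whole buffer pool can absorb before an overrun. */
uint32_t pool_tolerance_usec(uint32_t buffer_count, uint32_t fill_usec) {
  return buffer_count * fill_usec;
}
```

With the values from the log, a 512 byte block fills in 256 microseconds, the 1,987 microsecond worst-case write spans about 8 block times, and 128 buffers would tolerate a stall of roughly 32.8 ms. That is in the same range as the 10 buffers reported, since other buffers may already be queued when a stall begins.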

I also logged data from the six "Arduino analog pins" on the NUCLEO-F411RE, PA0, PA1, PA4, PB0, PC1, PC0.

I trigger ADC1 to read the six-pin group 200,000 times per second. This results in 1,200,000 ADC values/sec. I connected all pins to a 10 kHz signal. Notice the skew in the plot below, since it takes a large fraction of a microsecond to digitize each value.
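The size of that skew can be estimated from the ADC clock. This is a rough sketch under stated assumptions: a 12-bit conversion on the STM32F4 ADC takes the sampling time plus 12 ADC clock cycles, and the shortest sampling time (3 cycles) is assumed here because the post does not say which setting was used. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: shortest sampling time; the actual setting is not stated. */
enum { ASSUMED_SAMPLING_CYCLES = 3, CONVERSION_CYCLES_12BIT = 12 };

/* Time for one conversion in nanoseconds at the given ADC clock. */
uint32_t conversion_ns(uint32_t adcclk_hz) {
  assert(adcclk_hz > 0);
  uint32_t cycles = ASSUMED_SAMPLING_CYCLES + CONVERSION_CYCLES_12BIT;
  return (uint32_t)(((uint64_t)cycles * 1000000000u) / adcclk_hz);
}

/* Delay of channel N (0-based) relative to the first channel in the group. */
uint32_t channel_skew_ns(uint32_t channel_index, uint32_t adcclk_hz) {
  return channel_index * conversion_ns(adcclk_hz);
}
```

With STM32_ADCCLK at 24 MHz this gives 625 ns per conversion, so the sixth pin is sampled about 3.1 microseconds after the first. On a 10 kHz signal (100 microsecond period) that is about 3% of a cycle, and six conversions still fit inside the 5 microsecond group period.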

Here is output from the program:
ch> log six.bin 100000
Type any character to stop.
Done
ch> ls
2016-07-04 12:34:32 4096000000 big.bin
2016-07-14 06:13:20 512000000 demo.bin
2016-07-14 06:37:34 51200000 six.bin
ch> stats
STM32_SYSCLK 96000000, CH_CFG_ST_FREQUENCY 10000
STM32_ADCCLK 24000000
run time : 21007 msec
block write count : 100000
min write latency : 94 usec
max write latency : 1548 usec
avg write latency : 95 usec
max used buffers : 8
total buffer count : 128
ch> view six.bin
1274,1232,1199,1157,1117,1074
956,923,895,865,820,804
754,728,693,681,666,657
625,617,616,607,606,618
627,636,647,658,678,688
746,776,794,826,856,887
976,1016,1046,1088,1116,1165
1317,1341,1372,1434,1468,1516
1654,1707,1755,1810,1845,1897
2046,2094,2140,2188,2226,2280
2411,2449,2490,2529,2580,2610
2719,2751,2789,2812,2832,2868
2962,2965,2980,3009,3011,3032
3063,3061,3073,3072,3071,3074
3050,3043,3030,3023,3011,2990
2929,2909,2881,2860,2830,2799
2697,2670,2628,2588,2557,2501
2364,2348,2292,2259,2206,2159
2011,1966,1914,1873,1824,1775
1635,1586,1535,1497,1443,1396
1263,1223,1186,1145,1104,1065
958,926,896,865,845,801
711,727,695,681,668,656
626,616,615,607,606,619
625,636,646,656,679,688
744,776,795,826,856,890
976,1017,1046,1086,1128,1162
1320,1348,1375,1432,1470,1519
1656,1706,1755,1809,1846,1896
2044,2096,2137,2187,2228,2279
2405,2451,2489,2529,2582,2611
2720,2751,2792,2812,2845,2865
2962,2966,2981,3007,3010,3031
3061,3063,3071,3073,3075,3076
3051,3043,3033,3023,3012,2994
2929,2910,2882,2859,2830,2801
2696,2669,2630,2589,2550,2507
2367,2342,2294,2259,2206,2157
2011,1965,1913,1875,1826,1776
1634,1584,1535,1495,1443,1397
1263,1226,1188,1147,1102,1065
958,926,895,864,834,810
Done
ch> bin2csv six.bin six.csv
Type any character to stop
52%
Done
Attachments
six.png

Giovanni
Site Admin
Posts: 14444
Joined: Wed May 27, 2009 8:48 am
Location: Salerno, Italy
Has thanked: 1074 times
Been thanked: 921 times

Re: 1,000,000 Sample/Sec ADC Logger

Postby Giovanni » Thu Jul 14, 2016 9:27 pm

Interesting results.

About SDIO, the ioblock interface should be able to write large blocks. Where exactly is the limitation? Do you mean FatFS?

Giovanni

Re: 1,000,000 Sample/Sec ADC Logger

Postby wgreiman » Thu Jul 14, 2016 10:38 pm

I am not using FatFS. I am using a totally rewritten version of a FAT library I wrote many years ago for Arduino. I only use the file API to create a large contiguous file of up to 4 GB.

I then write the file as a single multi-block write. I issue a single CMD25, WRITE_MULTIPLE_BLOCK, then write the whole file as one big multi-block write. I end by sending a 'Stop Tran' token. I then truncate the file if not all blocks have been used.
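The byte framing of such a transfer in SPI mode can be sketched as follows. This is an illustration only, not the SdFat driver: `spi_capture_t` and `spi_send` are hypothetical stand-ins for a real SPI layer, the card is assumed to be block addressed (SDHC/SDXC), and response handling is reduced to comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
  SD_CMD25             = 25,   /* WRITE_MULTIPLE_BLOCK */
  SD_TOKEN_WRITE_MULTI = 0xFC, /* start token for each block after CMD25 */
  SD_TOKEN_STOP_TRAN   = 0xFD, /* 'Stop Tran' token ends the transfer */
  SD_BLOCK_SIZE        = 512
};

/* Hypothetical SPI layer: records bytes instead of driving a bus. */
typedef struct {
  uint8_t bytes[4096];
  size_t  len;
} spi_capture_t;

void spi_send(spi_capture_t *spi, const uint8_t *data, size_t n) {
  assert(spi->len + n <= sizeof spi->bytes);
  memcpy(spi->bytes + spi->len, data, n);
  spi->len += n;
}

/* Issue CMD25 once for the whole run of blocks. */
void sd_begin_multi_write(spi_capture_t *spi, uint32_t start_block) {
  uint8_t cmd[6] = {
    (uint8_t)(0x40 | SD_CMD25),
    (uint8_t)(start_block >> 24), (uint8_t)(start_block >> 16),
    (uint8_t)(start_block >> 8),  (uint8_t)start_block,
    0xFF /* dummy CRC, assuming CRC checking is off in SPI mode */
  };
  spi_send(spi, cmd, sizeof cmd);
  /* A real driver would now read the R1 response and check for errors. */
}

/* Send one 512 byte block inside the open transfer. */
void sd_write_block(spi_capture_t *spi, const uint8_t *block) {
  uint8_t token = SD_TOKEN_WRITE_MULTI;
  uint8_t crc[2] = { 0xFF, 0xFF }; /* dummy CRC16 */
  spi_send(spi, &token, 1);
  spi_send(spi, block, SD_BLOCK_SIZE);
  spi_send(spi, crc, sizeof crc);
  /* A real driver would read the data response token, then wait while
     the card signals busy before sending the next block. */
}

/* Close the transfer with the Stop Tran token. */
void sd_end_multi_write(spi_capture_t *spi) {
  uint8_t token = SD_TOKEN_STOP_TRAN;
  spi_send(spi, &token, 1);
  /* A real driver would wait for busy to clear here as well. */
}
```

The point of the structure is that the command overhead is paid once, and the card sees an uninterrupted stream of blocks that it can buffer and program in its native flash block size.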

I am only using the HAL for SPI. I use my own SD SPI driver program.

High end SD cards have 32 KiB or larger flash blocks and only emulate 512 byte blocks. These cards have sufficient RAM buffer so the write latency is very low in this mode.

If you interrupt this pipelined transfer, you can have a delay of many milliseconds. At over 2 MB/sec this causes overruns.

Here are some results for SDIO writes with a NUCLEO-F411RE. I am able to use clock bypass on this chip for 48 MHz SDIO clock.

For one-block writes it is hopeless.
SDIO_CLK : 48000000 Hz
write size : 512 bytes
write count : 100000
min write latency : 33 usec
max write latency : 200000 usec
avg write latency : 1706 usec
avg write rate : 300 KB/sec


Better with 8,192 byte writes but not good enough. The 76 ms write latency would cause an overrun.
SDIO_CLK : 48000000 Hz
write size : 8192 bytes
write count : 100000
min write latency : 1211 usec
max write latency : 75926 usec
avg write latency : 1916 usec
avg write rate : 4275 KB/sec


65,536 byte writes are good enough but you can't have a buffer pool.
SDIO_CLK : 48000000 Hz
write size : 65536 bytes
write count : 100000
min write latency : 3845 usec
max write latency : 21937 usec
avg write latency : 4537 usec
avg write rate : 14444 KB/sec


What is needed for SDIO is a set of calls to begin a multi-block write, write a block, and end the write. I don't see this in the HAL API.

The future is an implementation of exFAT that preallocates space and uses multi-block write in a sophisticated way. New versions of FatFS have exFAT but it is sort of patched in and does not use the exFAT features very well. The Microsoft patent is also an issue.

Re: 1,000,000 Sample/Sec ADC Logger

Postby Giovanni » Fri Jul 15, 2016 8:26 am

Interesting.

We have something like that in the MMC-SPI driver already.

  bool mmcStartSequentialRead(MMCDriver *mmcp, uint32_t startblk);
  bool mmcSequentialRead(MMCDriver *mmcp, uint8_t *buffer);
  bool mmcStopSequentialRead(MMCDriver *mmcp);
  bool mmcStartSequentialWrite(MMCDriver *mmcp, uint32_t startblk);
  bool mmcSequentialWrite(MMCDriver *mmcp, const uint8_t *buffer);
  bool mmcStopSequentialWrite(MMCDriver *mmcp);


Are those not functional for some reason? I think something similar could be added to SDIO too.

Giovanni

Re: 1,000,000 Sample/Sec ADC Logger

Postby wgreiman » Fri Jul 15, 2016 1:30 pm

The MMC-SPI driver would probably work OK but I use versions of my SD SPI driver in many systems like Particle.io boards which are based on STM32F2 with WI-FI or Cellular. It is cleaner for me to interface at the SPI level.

I am experimenting with the idea of presenting a simple block read/write interface to the file system layer but keep state in the SD driver so large contiguous writes and reads use multi block transfers. The first call to access a block on the SD would start a multi-block transfer and the next call would continue the transfer if possible.
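A minimal sketch of that idea is shown below. It is not the SdFat code: the `ll_*` hooks are hypothetical placeholders that only count calls, where a real driver would send CMD25, the data blocks, and the Stop Tran token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  bool     writing;    /* true while a multi-block write is open */
  uint32_t next_block; /* block number the open transfer expects next */
  uint32_t starts;     /* how many transfers were started (illustration) */
  uint32_t stops;      /* how many transfers were ended (illustration) */
} sd_state_t;

/* Hypothetical low-level hooks; real ones would talk to the card. */
static void ll_start_write(sd_state_t *s, uint32_t block) {
  s->starts++;
  s->writing = true;
  s->next_block = block;
}

static void ll_send_block(sd_state_t *s, const uint8_t *data) {
  (void)data;
  s->next_block++;
}

static void ll_stop_write(sd_state_t *s) {
  s->stops++;
  s->writing = false;
}

/* Plain single-block interface as seen by the file system layer.
   Consecutive block numbers continue the open transfer; anything
   else ends it and starts a new one. */
void sd_stateful_write(sd_state_t *s, uint32_t block, const uint8_t *data) {
  assert(s != 0 && data != 0);
  if (s->writing && block != s->next_block) {
    ll_stop_write(s);
  }
  if (!s->writing) {
    ll_start_write(s, block);
  }
  ll_send_block(s, data);
}

/* Close any open transfer, e.g. before a read or when the file is closed. */
void sd_sync(sd_state_t *s) {
  if (s->writing) {
    ll_stop_write(s);
  }
}
```

In this sketch, writing 100 consecutive blocks costs one start and no stop until the file system touches a non-adjacent block or calls `sd_sync`. A complete driver would need the same bookkeeping for reads and would have to close an open write before any read.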

I already have two caches in the file system layer, one for FAT/directory blocks and one for data blocks. This way the SD is only accessed once every 128 times to allocate another cluster. The file system is a general FAT file system and I have used it with USB flash drives and USB hard drives.

You can understand why this is important if you think about what happens when you do a true single 512 byte block write to a card with 32 KiB flash blocks.

Data from the existing 32 KiB flash block must be combined with the changed 512 bytes and programmed into a new flash block. That's why a card that is capable of writes at 50 MB/sec slows to the 300 KB/sec in the test in my previous post.

I would love to see what kind of performance would be possible with SDIO using these ideas. I was thinking about trying to modify your SDC driver for tests. Actually a layer between your HAL and the abstract block interface to the file system might be better.

In the future it would also be nice to allow the STM32 clock bypass mode for chips that don't have the bypass erratum. I allow it on the boards I have.

I know these chips have the problem and many are popular:
STM32F205xx, STM32F207xx, STM32F215xx, STM32F217xx,
STM32F415xx, STM32F417xx, STM32F405xx, STM32F407xx

Re: 1,000,000 Sample/Sec ADC Logger

Postby wgreiman » Fri Jul 15, 2016 8:05 pm

I did a first cut of keeping state in the SD SPI driver so I can use the max number of multi-block transfers.

The improvement was astounding. This loop in my file system test program had a factor of ten speedup for 512 byte writes!

  for (i = 0; i < n; i++) {
    chTMStartMeasurementX(&tm_sd);
    if (file.write(buf, nb) != (int)nb) {
      chprintf(chp, "write failed\r\n");
      file.close();
      return;
    }
    chTMStopMeasurementX(&tm_sd);
  }


Old way, in use by thousands of users:
ch> bench 512
file size : 8388608 bytes
write count : 16384
write size : 512 bytes
write time : 16727 msec
avg write rate : 501 KB/sec
min write latency : 103 usec
max write latency : 25592 usec
avg write latency : 1021 usec

New way with optimized use of multi-block writes:
ch> bench 512
file size : 8388608 bytes
write count : 16384
write size : 512 bytes
write time : 1589 msec
avg write rate : 5279 KB/sec
min write latency : 95 usec
max write latency : 6973 usec
avg write latency : 97 usec


Even small writes where data is cached are amazingly fast. Here is a test of 50 byte file writes. 167,772 50 byte writes in just over two seconds.
file size : 8388600 bytes
write count : 167772
write size : 50 bytes
write time : 2105 msec
avg write rate : 3985 KB/sec
min write latency : 3 usec
max write latency : 7249 usec
avg write latency : 13 usec


I have been writing SD libraries for small systems for about eight years and this idea just occurred to me in the last week. Talk about a long time for an aha moment.

