In this tutorial, we explore memory management in STM32 Cortex-M7 processors using CubeIDE. You’ll learn about different internal memory regions (like TCM, D1, D2, D3 SRAM), how memory domains and DMA access vary, and why proper memory placement is crucial for performance. Ideal for embedded developers working on real-time or high-performance STM32 applications!
🔗 Download memory.c and supporting code: https://controllerstech.com/wp-content/uploads/2021/06/memory.c
00:00Hello everyone. Welcome to Controllers Tech. As the title says, this video will cover
00:16memory management in STM32. This is a rather long topic, so there are going to be
00:23a few more videos on it. By memory management I mean the different RAMs available in STM32,
00:30the external memories like SDRAM, eMMC, etc., and we will also see how to use the memory
00:37protection unit, aka MPU. These videos will be more inclined towards the Cortex-M7 processors,
00:47but you can use the same logic in any other STM32 device as well. Today we will start with something
00:54very simple: how we can use the different internal RAMs to write and read data.
01:01In this video, I am going to explain a little about the different memories in Cortex-M7 processors.
01:08And then we will see how we can use these memories to store data, or load data from them.
01:15Now why is this important? Well, one of the main reasons is the DMA.
01:21The regular DMA doesn't have access to the tightly coupled memories, so we can't read or write data in
01:28TCM using the DMA. Another reason is large data buffers. Let's say you want to save some picture
01:37data into the RAM. If the main memory is not enough, we can use the other available memories to
01:42do so. You will understand this in a while. Let's take a look at the reference manual.
01:59This is the system architecture for the H745 series. This is as complicated as it can get,
02:07and others will be somewhat easier than this. Here I have highlighted the internal RAMs that are available.
02:19As you can see, there are three different domains available,
02:23and in each domain we have some memory, along with other things.
02:34In the D2 domain, there are three SRAMs of different sizes.
02:38In the D1 domain, there is a single AXI SRAM of 512 KB.
02:50We have two tightly coupled memories in direct contact with the CPU.
02:55And in the D3 domain, we have a 64 KB SRAM.
03:06Tightly coupled memories are of two types: data and instruction.
03:14These memories have zero wait cycles for data processing, and that's why they are used for
03:19real-time operations, where we can't afford to delay the processing. Also, only the MDMA can access them,
03:27and other DMAs like DMA1 or DMA2 don't have access to these memories.
03:37We have some inter-domain buses, where the data can be accessed between the different domains.
03:43Like D2 to D1 bus, where the masters from the D2 domain can access the resources in the D1 domain.
03:59By masters, I mean all these components, which are placed vertically.
04:08These components can access all these resources, which are placed horizontally.
04:13We also have D1 to D2 bus, where the masters from D1 domain can access the resources from the D2 domain.
04:28Similarly there are other buses. One thing to note here is that there are buses D1 to D3,
04:35and D2 to D3, but not in the reverse direction, so the masters from D3 domain cannot access anything
04:42other than the D3 domain. The master there is the BDMA, and it can only access SRAM4.
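The inter-domain rules described above can be written down as a small lookup table. This is only a sketch of what the video says about the H745 bus bridges (D1 to D2, D2 to D1, D1 to D3 and D2 to D3 exist, nothing leads out of D3). The names `domain_t`, `reach` and `domain_can_access` are made up for illustration, and the table works at domain level only: individual masters can have tighter limits, such as DMA1 and DMA2 not reaching the TCMs.

```c
#include <stdbool.h>

/* Bus domains of the STM32H745 as described in the video. */
typedef enum { DOMAIN_D1 = 0, DOMAIN_D2, DOMAIN_D3, DOMAIN_COUNT } domain_t;

/* reach[master][target] is true when a master sitting in the row domain can
 * access resources in the column domain. Every domain reaches itself.
 * Domain-level simplification; check the reference manual for each master. */
static const bool reach[DOMAIN_COUNT][DOMAIN_COUNT] = {
    /* target:   D1     D2     D3   */
    /* D1 */ { true,  true,  true },
    /* D2 */ { true,  true,  true },
    /* D3 */ { false, false, true },
};

/* Returns true if a master in `master` can access a resource in `target`. */
bool domain_can_access(domain_t master, domain_t target)
{
    return reach[master][target];
}
```

For example, `domain_can_access(DOMAIN_D3, DOMAIN_D2)` is false, which is the reason the BDMA is limited to SRAM4.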
04:54You can read more about these buses here.
04:56We will take a look at the DMA description. As I said earlier, the regular DMA cannot access the TCM
05:09memories, and that's what is mentioned here.
05:11This is where the first reason comes up.
05:31Here we have the addresses for the memories.
05:33DTCM starts at 0x20000000. SRAM 1, 2 and 3 are part of the D2 domain, and they start at 0x30000000.
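To make the address map easier to work with, here is a minimal sketch of a lookup table for the H745 internal RAMs. The DTCM and D2 start addresses, the 512 KB AXI SRAM and the 64 KB SRAM4 come from the video. The remaining sizes and base addresses (128 KB DTCM, the SRAM1/2/3 split, SRAM4 at 0x38000000) are assumptions taken from memory of the reference manual, so verify them against the manual for your part. `mem_region_t`, `h745_regions` and `find_region` are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    base;
    uint32_t    size;   /* in bytes */
} mem_region_t;

/* Internal RAMs of the STM32H745. Values not stated in the video are
 * assumptions; check them against the reference manual. */
static const mem_region_t h745_regions[] = {
    { "DTCM",     0x20000000u, 128u * 1024u },  /* size assumed */
    { "AXI SRAM", 0x24000000u, 512u * 1024u },  /* D1 domain */
    { "SRAM1",    0x30000000u, 128u * 1024u },  /* D2 domain, size assumed */
    { "SRAM2",    0x30020000u, 128u * 1024u },  /* D2 domain, assumed */
    { "SRAM3",    0x30040000u,  32u * 1024u },  /* D2 domain, assumed */
    { "SRAM4",    0x38000000u,  64u * 1024u },  /* D3 domain, base assumed */
};

/* Returns the region containing `addr`, or NULL if it is in none of them. */
const mem_region_t *find_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof h745_regions / sizeof h745_regions[0]; ++i) {
        const mem_region_t *r = &h745_regions[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return r;
        }
    }
    return NULL;
}
```

With a table like this, an address read from the build's memory details view can be mapped back to a RAM name instead of being compared against the manual by eye.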
05:47Let's take a look at some other MCU to understand the difference.
06:10This is the manual for H723 series.
06:14The system architecture looks pretty much similar to what we saw in the H745 series.
06:22There is some difference in the number of SRAMs and their sizes, but it's not that big of an issue.
06:36Now let's see the F76 series.
06:44Here the architecture is completely different.
06:49It does not have any domains, so that's a relief.
06:53It has SRAM 1, SRAM 2, and the tightly coupled memories.
07:00Also note one thing: we can access the DTCM using the AHBS, which is basically connected to every master.
07:09Even DMA1 and DMA2.
07:16So I think here we can access the DTCM using those DMAs, which was not the case with the other architectures we saw.
07:23This is very important, and you should check the architecture of your MCU to see if the DMA
07:30can access these memories or not.
07:41Here it is mentioned that the DMAs do have access to the DTCM.
07:46I don't have the means to test the F7-related cases, but I will clarify this in the next video.
07:55So we saw all the architectures, now let's see it working.
07:59I have the H745 Discovery board.
08:06Give some name to the project, and click Finish.
08:12Let's clear the pinout first.
08:14I am selecting external crystal for the clock.
08:20Set up the clock as per your MCU.
08:22This PK0 is the LCD backlight, and I like to keep it off.
08:42Now in the Cortex-M7, we will enable the instruction cache, and the data cache.
08:47Enabling the cache will improve the performance of the MCU by a significant amount, but this data cache is the
08:54source of all the issues that we will be handling in these videos.
08:59Even knowing this, you should still use the cache and handle the memory issues.
09:05The MPU part will be covered later.
09:08That's all we need, just the basic setup.
09:12I have disabled the M4 core, and I am only using the M7 for now.
09:18You can check my other video to see how to do that.
09:22Here at the memory details, we can see all the types of internal memories available to us.
09:28The RAM D2 is the total RAM in D2 domain, and it consists of SRAM 1, 2 and 3.
09:34RAM D3 is the RAM in D3 domain, and that is SRAM 4.
09:43You can also check these details in the flash linker script.
09:47The main RAM is at 0x24000000, and that is the AXI SRAM.
10:08Now this is where things get harder.
10:14Here I have the linker scripts from the other MCUs.
10:19This one is for the H735, and if you see here, the main RAM starts at 0x20000000.
10:25If you check this address in the manual, you will find that it's the address of the DTCM RAM.
10:38This is something we should be worried about; you will see why in a while.
10:42The RAM D1 starts at 0x24000000, and it's the AXI SRAM. The other RAMs are as usual.
11:03Even in the F746, the main RAM starts at 0x20000000.
11:24And that's the address of the DTCM again.
11:27Now why should we be worried about the DTCM being the main RAM?
11:34Let's see this.
11:37I am not running the second core, so I will comment out these functions.
11:48Let's create a buffer of 1 KB in size.
11:52Now I will fill this buffer with some random data.
11:55Let's build it.
11:59Also pay attention to the used RAM size.
12:03It has increased by 1 KB.
12:06You can also check the memory details to see where the buffer is stored.
12:13The buffer is placed near the start of the main RAM, at an address ending in 0x2C.
12:17This is not an issue for me, but if you have the H735, this address will be in the DTCM RAM.
12:25And that's where you need to be careful.
12:28Since the DMA cannot access the DTCM, you can't perform any operation on this buffer using DMA.
12:35Now we will see how we can move this buffer into some other RAM.
12:40To do this, we need to define a section.
12:51We'll call it buffer.
12:52And put the variable along with it.
13:00Now we will modify the linker script, so that this section gets placed in some other RAM.
13:05Here ALIGN(1) means the data will be aligned on a 1-byte boundary.
13:21Like here we have 8-byte alignment, and here it is 4 bytes.
13:25And now push this to RAM D2.
13:32You can see here, 1 KB of RAM D2 is now used.
13:36You can check the memory details to see its location.
13:43It's not present in RAM anymore, and here we have it in the D2 RAM.
13:49Notice that this is at the beginning of the D2 RAM.
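The steps above can be sketched in C as follows. The section name `.buffer`, the macro `PLACE_IN` and the function `fill_buffer` are assumptions for illustration; the video only says the section is called "buffer" and that it is pushed to RAM D2. The linker script fragment is shown as a comment and is a reconstruction, not the author's exact text. The macro falls back to nothing on non-ELF toolchains so the file still compiles on a desktop, where the placement has no meaning anyway.

```c
#include <stdint.h>
#include <string.h>

/* GCC and Clang on ELF targets (arm-none-eabi-gcc included) support the
 * section attribute. Elsewhere the macro expands to nothing. */
#if defined(__GNUC__) && defined(__ELF__)
#define PLACE_IN(name) __attribute__((section(name)))
#else
#define PLACE_IN(name)
#endif

#define BUFFER_SIZE 1024u

/* Matching linker script entry (sketch, names assumed):
 *
 *   .buffer :
 *   {
 *     . = ALIGN(1);
 *     *(.buffer)
 *   } >RAM_D2
 */
PLACE_IN(".buffer") uint8_t buffer[BUFFER_SIZE];

/* Fills the whole buffer with one byte value. */
void fill_buffer(uint8_t value)
{
    memset(buffer, value, sizeof buffer);
}
```

If the section name in the attribute and the one in the linker script do not match exactly, the linker will place the array wherever its default rules decide, so it is worth confirming the final address in the memory details after every change.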
13:52But what if we want to save it at some other location inside the D2 RAM itself?
14:00We need to modify the linker script again.
14:03This time we will use the ABSOLUTE keyword, and give the memory location here.
14:08We need to assign this location to a subsection, and we will call it TXBuffer.
14:17I am creating one more memory location for the RXBuffer.
14:26It won't show up just yet.
14:28We need to point the variable to that location.
14:38And now if you see, we have the TXBuffer in the new memory location.
14:49Let's define the RXBuffer also.
14:59RXBuffer is not showing, because we aren't performing any action to load the data into it.
15:08I want to store different data into the TXBuffer every time.
15:14And now let's copy the TXBuffer into the RXBuffer.
15:20Since we are performing the operations on the RXBuffer, it will show up in the D2 RAM.
15:29There we have RXBuffer in that particular memory location.
15:33Let's run this program now.
15:36Here I am watching only one element of the array, since all of them will have the same data anyway.
15:43The data is updating every second in both the buffers.
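A minimal sketch of the two-buffer test described above, using the `memset` and `memcpy` calls the video mentions. The section names `.TxBuffer` and `.RxBuffer`, the addresses in the linker comment, and the names `PLACE_IN` and `update_buffers` are hypothetical; the actual addresses used in the video are not stated in the transcript.

```c
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && defined(__ELF__)
#define PLACE_IN(name) __attribute__((section(name)))
#else
#define PLACE_IN(name)
#endif

#define BUFFER_SIZE 1024u

/* Linker script side (sketch; the addresses are placeholders inside RAM_D2):
 *
 *   .buffer :
 *   {
 *     . = ABSOLUTE(0x30000100);
 *     *(.TxBuffer)
 *     . = ABSOLUTE(0x30000600);
 *     *(.RxBuffer)
 *   } >RAM_D2
 */
PLACE_IN(".TxBuffer") uint8_t TxBuffer[BUFFER_SIZE];
PLACE_IN(".RxBuffer") uint8_t RxBuffer[BUFFER_SIZE];

/* One iteration of the demo: write new data to TxBuffer, then copy it to
 * RxBuffer with the CPU. Both buffers end up holding `value` in every byte. */
void update_buffers(uint8_t value)
{
    memset(TxBuffer, value, sizeof TxBuffer);
    memcpy(RxBuffer, TxBuffer, sizeof RxBuffer);
}
```

On the target, this would typically be called from the main loop with an incrementing value and a one-second delay such as `HAL_Delay(1000)`, which matches the once-per-second update seen in the debugger. Because the code actually reads and writes both arrays, the linker keeps them and they show up in the memory details.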
15:49Let's check the memory locations now.
15:52This is the memory location for the RXBuffer, and you can see the data here.
16:06It's also updating, when the new data gets copied from the TXBuffer.
16:10I will do one more thing here.
16:26This section is for the DTCM RAM, and let's call it BufferDT.
16:30Now let's put the RXBuffer in this location.
16:36So the DTCM RAM has one kilobyte occupied now.
16:49Let's run this also.
17:06We can check the memory location of DTCM RAM.
17:17Here we have the data, and it's updating properly.
17:20Now this is working because the CPU can access any memory, so the memcpy and memset functions can work on any RAM.
17:30But it wouldn't have worked if I had tried to copy the data using the DMA.
17:35I am putting a lot of stress on the DMA, and there is a good reason for it.
17:40Most of the peripherals, like the LCD, SD card, SDRAM, and others, use DMA by default, and that's why we need to be prepared for those situations where we simply can't use the CPU to copy the data around.
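Since the CPU copy succeeds silently even when a buffer sits in the DTCM, a small guard before starting a DMA transfer can catch the mistake early. The sketch below assumes the H7 layout with ITCM at 0x00000000 (64 KB) and DTCM at 0x20000000 (128 KB); the sizes are assumptions to verify in the reference manual, and `buffer_avoids_tcm` is a made-up helper name. It is a necessary check only: passing it does not prove that a given DMA controller can reach the address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed TCM windows for an STM32H7; verify against the reference manual. */
#define ITCM_BASE 0x00000000u
#define ITCM_SIZE (64u * 1024u)
#define DTCM_BASE 0x20000000u
#define DTCM_SIZE (128u * 1024u)

/* True if [addr, addr + len) shares at least one byte with
 * [base, base + size). 64-bit math avoids wrap-around at 4 GB. */
static bool overlaps(uint32_t addr, uint32_t len, uint32_t base, uint32_t size)
{
    uint64_t start  = addr;
    uint64_t end    = (uint64_t)addr + len;
    uint64_t r_base = base;
    uint64_t r_end  = (uint64_t)base + size;
    return len != 0u && start < r_end && r_base < end;
}

/* True if the buffer does not touch ITCM or DTCM, i.e. it is at least not
 * in a region that DMA1/DMA2 are known to be unable to access. */
bool buffer_avoids_tcm(uint32_t addr, uint32_t len)
{
    return !overlaps(addr, len, ITCM_BASE, ITCM_SIZE) &&
           !overlaps(addr, len, DTCM_BASE, DTCM_SIZE);
}
```

On the target this could be called as `buffer_avoids_tcm((uint32_t)(uintptr_t)TxBuffer, sizeof TxBuffer)` before handing the buffer to a DMA-based driver, where `TxBuffer` stands for whichever array is being transferred.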
17:55This was the case for the H745, where the default RAM is the AXI SRAM, so even if I simply create a variable, it will be created in the AXI SRAM itself.
18:05But this is not the case for other microcontrollers.
18:11I will quickly show an example for F746.
18:20Here I have created a project for F746.
18:25As you can see, the main RAM starts at 0x20000000.
18:29Now let's try to simply create an array.
18:47Here you can see in the memory details, the TX buffer is stored at address 0x20000028.
18:54And if we check the datasheet, this is an address in the DTCM RAM.
19:00Now for some applications, let's say we can't have the buffer in the DTCM, so we will store it in SRAM1.
19:11Same method will be used here also.
19:13This is the address, where the TX buffer will be stored.
19:33Here it shows up in the new memory location, which is the start of the SRAM1.
19:50This is how we can store the data in the different memory locations in the M7 processors.
19:56This was the introductory video, and it was an important topic to cover.
20:01In the next video, I will cover the DMA part.
20:05We will see the limitations, and the solutions.
20:09And later in the series, I will cover MPU also.
20:14This is it for the video.
20:16You can download the code from the link in the description.