Enable D-Cache for Cortex-M7 #1222

vishwamartur · 2024-11-09T16:32:56Z

Closes #485

Enable D-Cache for Cortex-M7 devices.

Enable the D-Cache in src/modm/platform/core/cortex/startup.c.in by adding SCB_EnableDCache() after SCB_EnableICache().
Add a comment explaining the D-Cache enablement and the need for manual invalidation on certain operations.
Update the documentation in docs/src/reference/build-systems.md to reflect the D-Cache enablement for Cortex-M7 devices.
Add a note in the documentation about the need for manual invalidation on certain operations.

salkinium · 2024-11-09T17:59:39Z

Mostly worried about our DMA code, but I think @chris-durand already enabled the D-Cache on M7?

docs/src/reference/build-systems.md

Co-authored-by: Vishwanath Martur <[email protected]>

salkinium

DMA will probably fall over at some point, but that won't get fixed if it doesn't break.

chris-durand · 2024-11-11T19:30:31Z

> DMA will probably fall over at some point, but that won't get fixed if it doesn't break.

Sorry for not looking at this earlier. DMA is for sure broken on H7 with the D-Cache enabled if buffers are placed in cacheable memory regions.

@salkinium Could we make this an option? I'd rather avoid breaking working user code by default. I'm using an H723 with D-Cache, DMA and modm at work, but so far this is only possible with custom code.

I'm not even sure that there is a practical way to implement the appropriate cache maintenance operations in the peripheral drivers alone that would work with modm device drivers in their current state.

The granularity of cache maintenance operations is a 32-byte cache line. If a device driver contains a small buffer, that buffer will share cache lines with other memory. There are lots of edge cases that cause correctness issues when performing cache maintenance operations on those shared cache lines.

For example, a write-back from cache to RAM can corrupt data that DMA is writing to RAM at the same time. Even if you clean and invalidate the memory before starting the DMA transaction, any unrelated modification to the cache line during the DMA operation fetches the line from RAM again. That line can then be evicted from the cache, written back, and overwrite the data written by DMA.
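To make the 32-byte granularity concrete, here is a minimal host-side sketch that computes which bytes a maintenance operation on a buffer would actually touch. The names `cacheSpan` and `sharesCacheLines` are hypothetical helpers for illustration, not modm or CMSIS functions, and the addresses are arbitrary examples.

```cpp
#include <cstddef>
#include <cstdint>

// D-Cache line size in bytes, as stated in the discussion above.
constexpr std::uintptr_t CacheLineSize = 32;

struct CacheSpan
{
    std::uintptr_t start;  // buffer address rounded down to a line boundary
    std::size_t length;    // multiple of CacheLineSize covering the buffer
};

// Hypothetical helper: smallest cache-line-aligned range that contains
// the bytes [address, address + size).
constexpr CacheSpan
cacheSpan(std::uintptr_t address, std::size_t size)
{
    return {
        address & ~(CacheLineSize - 1),
        static_cast<std::size_t>(
            ((address + size + CacheLineSize - 1) & ~(CacheLineSize - 1))
            - (address & ~(CacheLineSize - 1)))
    };
}

// True if cleaning or invalidating the buffer would also affect
// neighbouring bytes that do not belong to the buffer.
constexpr bool
sharesCacheLines(std::uintptr_t address, std::size_t size)
{
    return cacheSpan(address, size).start != address
        || cacheSpan(address, size).length != size;
}
```

With these definitions, an 8-byte buffer at address `0x2000001C` straddles two lines, so maintenance on it touches 64 bytes, 56 of which belong to something else. Only a buffer that starts on a line boundary and has a length that is a multiple of 32 avoids the sharing.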

I'm not aware of a practical way to fix all of those issues in the general case without reserving exclusive cache lines for DMA buffers. That would be a non-trivial change to modm device drivers.
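As a rough illustration of what "reserving exclusive cache lines" could look like at the type level, the sketch below pads and aligns a buffer to whole 32-byte lines. `DmaBuffer` and `roundUpToCacheLine` are hypothetical names, not existing modm types, and this says nothing about how the existing drivers would adopt such a type.

```cpp
#include <cstddef>
#include <cstdint>

// D-Cache line size in bytes, as stated in the discussion above.
constexpr std::size_t CacheLineSize = 32;

// Round a byte count up to the next multiple of the cache line size.
constexpr std::size_t
roundUpToCacheLine(std::size_t size)
{
    return (size + CacheLineSize - 1) / CacheLineSize * CacheLineSize;
}

// Hypothetical buffer type (an assumption for illustration): it starts on
// a cache line boundary and occupies whole lines, so no unrelated variable
// can end up in the same line.
template<std::size_t Size>
struct alignas(CacheLineSize) DmaBuffer
{
    static_assert(Size > 0, "buffer must not be empty");
    std::uint8_t data[roundUpToCacheLine(Size)];
};

static_assert(alignof(DmaBuffer<5>) == CacheLineSize, "starts on a line boundary");
static_assert(sizeof(DmaBuffer<5>) == CacheLineSize, "padded to one whole line");
static_assert(sizeof(DmaBuffer<33>) == 2 * CacheLineSize, "padded to two whole lines");
```

The cost is visible in the asserts: a 5-byte buffer occupies 32 bytes. That overhead, multiplied across every small buffer inside every driver, is part of why this is a non-trivial change.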

Others are struggling with the same issues. Zephyr also has no good solution to this problem. Another way to solve this is to allocate DMA buffers exclusively in non-cacheable memory regions, but that also wouldn't be enforceable for buffers inside modm device drivers, which you can instantiate anywhere.

This is clearly a non-trivial problem to solve, and not one we will fully fix now. In my opinion the default caching setting should either be safe to use with DMA, inhibit certain DMA use, or at least warn the user. Of course there should be an option to override this if you know what you're doing and put buffers into non-cacheable memory, etc.

chris-durand · 2024-11-11T19:33:34Z

src/modm/platform/core/cortex/module.md

+### Cache Initialization
+
+For Cortex-M7 devices, both the I-Cache and D-Cache are enabled by default with
+a write-through policy to significantly improve performance. However, it is


The default SRAM cache policy on STM32 M7 devices is write-back write-allocate, not write-through.

From AN4839:

Also write-through is broken on half of the H7 devices so we shouldn't change the code to enable it:

salkinium · 2024-11-11T19:48:15Z

> Could we make this an option?

We could enable it only if the :platform:dma module is not present, otherwise issue a warning.
I'll open a PR and also fix the description.

I think I would prefer marking a part of SRAM as non-cacheable with the MPU and having some kind of memcpy for small buffers or a fast non-cacheable block allocator for bigger buffers. Ideally in a way that's backwards compatible with in-place allocation (some template stuff or macro magic). I think that could work.

chris-durand · 2024-11-11T20:08:00Z

> I think I would prefer marking a part of SRAM as non-cacheable with the MPU and having some kind of memcpy for small buffers or a fast non-cacheable block allocator for bigger buffers.

Keep in mind that some DMA units in H7s can't access all SRAMs, e.g. the BDMA on an H72x/3x is restricted to SRAM4.

salkinium · 2024-11-11T20:11:56Z

> Keep in mind that some DMA units in H7s can't access all SRAMs, e.g. the BDMA on an H72x/3x is restricted to SRAM4.

Hm ok, so the device driver will have to ask the DMA driver for some non-cacheable memory. But then we could also use a cache-line-aligned block allocator, dish out 32B blocks, and manage the cache invalidation there without the MPU?

chris-durand · 2024-11-11T20:48:25Z

> Hm ok, so the device driver will have to ask the DMA driver for some non-cacheable memory. But then we could also use a cache-line-aligned block allocator, dish out 32B blocks, and manage the cache invalidation there without the MPU?

Something like that could work. The cache can be managed if the DMA memory is properly aligned and isn't shared with anything else.
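A minimal sketch of such a cache-line-aligned block allocator is shown below, assuming a fixed pool and first-fit search. `CacheLineBlockAllocator` is a hypothetical name, not an existing modm class; placing the pool in a region the DMA unit can reach, and the actual cache maintenance calls, are deliberately left out.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-pool allocator (an assumption for illustration).
// Every allocation starts on a 32-byte boundary and occupies whole blocks,
// so two allocations never share a cache line.
template<std::size_t BlockCount>
class CacheLineBlockAllocator
{
public:
    static constexpr std::size_t BlockSize = 32;

    // Returns nullptr if size is zero or no contiguous run of free
    // blocks is large enough (first-fit search).
    void*
    allocate(std::size_t size)
    {
        if (size == 0) return nullptr;
        const std::size_t needed = (size + BlockSize - 1) / BlockSize;
        for (std::size_t first = 0; first + needed <= BlockCount; ++first)
        {
            std::size_t run = 0;
            while (run < needed && !used[first + run]) ++run;
            if (run == needed)
            {
                for (std::size_t i = 0; i < needed; ++i) used[first + i] = true;
                return pool + first * BlockSize;
            }
            first += run;  // jump to the used block; ++first then skips it
        }
        return nullptr;
    }

    // The caller passes the same size that was given to allocate().
    void
    deallocate(void* pointer, std::size_t size)
    {
        if (pointer == nullptr || size == 0) return;
        const auto* bytes = static_cast<const std::uint8_t*>(pointer);
        if (bytes < pool || bytes >= pool + sizeof(pool)) return;
        const std::size_t first = static_cast<std::size_t>(bytes - pool) / BlockSize;
        const std::size_t needed = (size + BlockSize - 1) / BlockSize;
        for (std::size_t i = 0; i < needed && first + i < BlockCount; ++i)
            used[first + i] = false;
    }

private:
    alignas(BlockSize) std::uint8_t pool[BlockCount * BlockSize];
    std::bitset<BlockCount> used;
};
```

Because each allocation owns its cache lines exclusively, cleaning before a memory-to-peripheral transfer and invalidating after a peripheral-to-memory transfer cannot affect unrelated data. Whether that maintenance belongs in the allocator or in the peripheral driver is exactly the open question discussed next.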

It seems to me the cache management operations are best handled in downstream peripheral drivers (like UART DMA, SPI DMA, I2S etc). Any more advanced access scheme would otherwise need to be hard-coded inside the DMA drivers. If one wanted to use UART DMA in circular mode with a half-transfer interrupt, the DMA driver would need to include special handling to do the right cache flushing and invalidation on each half of the buffer. The same goes for double-buffering and other features.

Furthermore, it would prevent implementing custom drivers, e.g. with non-cacheable buffers in user code, without copying and modifying the whole modm DMA implementation.
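For the circular-mode case mentioned above, the peripheral driver knows which half of the buffer the DMA has just finished writing, and only that half needs to be invalidated. A minimal sketch of that bookkeeping follows; `DmaEvent`, `Region`, `completedHalf` and `halvesOwnTheirCacheLines` are hypothetical names, and the buffer is assumed to start on a cache line boundary.

```cpp
#include <cstddef>

// D-Cache line size in bytes, as stated in the discussion above.
constexpr std::size_t CacheLineSize = 32;

// Hypothetical event type: which DMA interrupt fired.
enum class DmaEvent { HalfTransfer, TransferComplete };

struct Region
{
    std::size_t offset;  // byte offset into the circular buffer
    std::size_t length;  // number of bytes in the completed half
};

// The half of the circular buffer that DMA has finished writing, which is
// the part the CPU must invalidate before reading it.
constexpr Region
completedHalf(std::size_t bufferSize, DmaEvent event)
{
    return event == DmaEvent::HalfTransfer
        ? Region{0, bufferSize / 2}
        : Region{bufferSize / 2, bufferSize - bufferSize / 2};
}

// Both halves consist of whole cache lines only if the total size is a
// multiple of two lines (and the buffer itself is line-aligned).
constexpr bool
halvesOwnTheirCacheLines(std::size_t bufferSize)
{
    return bufferSize != 0 && bufferSize % (2 * CacheLineSize) == 0;
}
```

On the target, the driver's interrupt handler would then pass the resulting address and length to a by-address invalidate such as CMSIS `SCB_InvalidateDCache_by_Addr` before touching the data. A generic DMA driver cannot derive this region on its own, since it depends on the access scheme the peripheral driver chose.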

vishwamartur force-pushed the enable-dcache branch from 5b83f5b to ad8bd7b Compare November 9, 2024 16:35

salkinium requested a review from chris-durand November 9, 2024 17:59

salkinium added enhancement 🌈 advanced 🤯 labels Nov 9, 2024

salkinium reviewed Nov 9, 2024

[core] Enable D-Cache for Cortex-M7 devices

28c87e4

Co-authored-by: Vishwanath Martur <[email protected]>

salkinium force-pushed the enable-dcache branch from ad8bd7b to 28c87e4 Compare November 10, 2024 21:44

salkinium approved these changes Nov 10, 2024

salkinium merged commit 28c87e4 into modm-io:develop Nov 10, 2024
12 checks passed

chris-durand reviewed Nov 11, 2024

salkinium mentioned this pull request Nov 11, 2024

[core] Only enable D-Cache without :platform:dma module #1225

Merged
