Questions regarding the problem with atomics #9

Thom-de-Jong · 2024-11-14T13:09:53Z

Good day all,

Not an issue but a question: What exactly is the problem with atomics?
I have read the discussion #59

Is this confirmed by WCH, or by the test mentioned in #59?
Is this a hardware bug in all chips?
~~How does one know it is the atomic operation causing a problem?~~
Is it enough to put the atomic operation in a critical section by using the portable-atomics crate for example?

Just curious about the recent changes.
Thank you in advance for the explanation!

Codetector1374 · 2024-11-14T21:27:02Z

Yeah, we never really had time to run the machine test suite. But we did run into an actual / real problem with embassy executor. Where the executor enter a bad state, the task is marked as woken up but not in the ready list.

@Dummyc0m and I have reviewed the code in the embassy executor and can not come up with reasonable possibility for us to be entering that state if load reserve store conditional actually work as expected. And by disabling atomic (it will fall back to use the critical section based executor data structures) we can no longer reproduce the problem as expected.

Regarding your question, using portable atomic and placing access into critical section would solve the problem. Because we are practically no long use lr/sc instructions anymore.

If anyone have the time / interest in running through the litmus test suite to actually confirm it would be extra nice.

Usioumeo · 2024-11-19T08:05:44Z

I tried to run the litmus test suite. The problem is that it is not meant to run on bare metal, and it requires threads and a real operating system underneath.
The best I can think of is to parse the litmus tests with a new parser, generate some code (even rust based on this lib, with the test included as embedded assembly), and to simulate traps by threads we could use timer interrupts. The problem with this approach is that interrupts have a priority, so one "thread" will always be interrupted, while the other will never be.
We could swap after a test run, but I don't know if that's a good idea for all tests.

Let me know what you think, and if it is worth spend time on it.

Codetector1374 · 2024-11-19T15:13:15Z

I was thinking adapting something like free rtos and slap it on there as a few threads with preemptive scheduling. I personally don’t really have a huge interest in confirming it is broken, since losing atomic on an embedded & single core platform like this is honestly not a huge deal. And we confirmed by using critical section we can get embassy to work as expected .

Thom-de-Jong · 2024-11-19T15:40:58Z

@Codetector1374, Thank you for the explanation.

For me, not using atomics solves all the instability I encountered in my firmware on the CH32V203.
Atomics make things easy but it is not a dealbreaker as portable-atomics will fill the gap in my case. (And if you are open to depend on an extra crate)
I'm not experienced enough to help you further investigate this issue and would have never guessed the cause myself.
Huge thanks to you all!

andelf assigned Dummyc0m and Codetector1374 Nov 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Questions regarding the problem with atomics #9

Questions regarding the problem with atomics #9

Thom-de-Jong commented Nov 14, 2024 •

edited

Loading

Codetector1374 commented Nov 14, 2024 •

edited

Loading

Usioumeo commented Nov 19, 2024

Codetector1374 commented Nov 19, 2024

Thom-de-Jong commented Nov 19, 2024

Questions regarding the problem with atomics #9

Questions regarding the problem with atomics #9

Comments

Thom-de-Jong commented Nov 14, 2024 • edited Loading

Codetector1374 commented Nov 14, 2024 • edited Loading

Usioumeo commented Nov 19, 2024

Codetector1374 commented Nov 19, 2024

Thom-de-Jong commented Nov 19, 2024

Thom-de-Jong commented Nov 14, 2024 •

edited

Loading

Codetector1374 commented Nov 14, 2024 •

edited

Loading