[hal] Fix potential race in CANAPI #6819

Merged: 4 commits, Jul 29, 2024

@rzblue rzblue commented Jul 9, 2024

Currently, the call to HAL_CAN_SendMessage is not synchronized with updates to periodicSends (which represents the internal state of the netcomm sender thread).

Now, the mutex is locked before HAL_CAN_SendMessage is called, so the send and the periodicSends update happen as a single atomic step.
periodicSends and receives also now have their own mutexes to reduce unnecessary contention between send and receive functions.
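
A minimal sketch of the locking pattern described above, not the actual HAL code. `FakeNetcomm`, `CanStorage`, `WritePacketRepeating`, `StopPacketRepeating`, and `RecordReceive` are hypothetical names introduced for illustration, and the map value types are assumptions; only `periodicSends`, `receives`, and the idea of one mutex per map come from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for the netcomm sender thread's internal state.
// In the real HAL this state sits behind HAL_CAN_SendMessage.
struct FakeNetcomm {
  std::map<int32_t, int32_t> repeating;  // apiId -> period in ms

  void SendMessage(int32_t apiId, int32_t periodMs) {
    if (periodMs > 0) {
      repeating[apiId] = periodMs;  // start (or retime) the repeating packet
    } else {
      repeating.erase(apiId);  // stop the repeating packet
    }
  }
};

// Hypothetical per-device storage (field types are assumptions).
struct CanStorage {
  FakeNetcomm netcomm;
  std::mutex periodicSendsMutex;  // guards periodicSends AND the send call
  std::map<int32_t, int32_t> periodicSends;
  std::mutex receivesMutex;  // separate lock, so receives never wait on sends
  std::map<int32_t, int64_t> receives;  // apiId -> last receive timestamp
};

void WritePacketRepeating(CanStorage& can, int32_t apiId, int32_t repeatMs) {
  assert(repeatMs > 0);
  // Lock first, then send, then record: no other sender can slip in between.
  std::lock_guard<std::mutex> lock(can.periodicSendsMutex);
  can.netcomm.SendMessage(apiId, repeatMs);
  can.periodicSends[apiId] = repeatMs;
}

void StopPacketRepeating(CanStorage& can, int32_t apiId) {
  std::lock_guard<std::mutex> lock(can.periodicSendsMutex);
  can.netcomm.SendMessage(apiId, 0);
  can.periodicSends.erase(apiId);
}

void RecordReceive(CanStorage& can, int32_t apiId, int64_t timestamp) {
  // Takes only the receive-side lock.
  std::lock_guard<std::mutex> lock(can.receivesMutex);
  can.receives[apiId] = timestamp;
}
```

Because the send and the map update share one critical section, whichever thread takes the lock last determines both netcomm's state and the recorded state, so the two cannot disagree.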

Example:
1. Thread A calls HAL_StopCANPacketRepeating with apiId 0
2. Thread B calls HAL_WriteCANPacketRepeating with apiId 0 and repeatMs 10
3. Inside HAL_StopCANPacketRepeating, Thread A calls HAL_CAN_SendMessage, which updates netcomm's state to not repeat the packet
4. Thread A is paused
5. Inside HAL_WriteCANPacketRepeating, Thread B calls HAL_CAN_SendMessage, which updates netcomm's state to repeat the packet
6. Thread B locks the mutex
7. Thread B updates the map to indicate the new state (packet is repeating)
8. Thread B exits HAL_WriteCANPacketRepeating and unlocks the mutex
9. Thread A resumes
10. Thread A locks the mutex
11. Thread A updates the map with what it thinks the new state is (packet is not repeating)
12. Thread A exits HAL_StopCANPacketRepeating and unlocks the mutex
13. Thread A calls HAL_CleanCAN, which doesn't stop the repeating packet because the state has diverged.
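
The interleaving above can be replayed deterministically in a single thread. This is a sketch under assumptions: `Model`, `Send`, `Record`, `Clean`, and the two replay functions are hypothetical, and modeling the map as "apiId present means repeating" is an assumption rather than the HAL's actual representation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical model: netcomm holds what actually repeats on the bus,
// periodicSends is the HAL-side bookkeeping that cleanup relies on.
struct Model {
  std::map<int32_t, int32_t> netcomm;        // apiId -> period in ms
  std::map<int32_t, int32_t> periodicSends;  // apiId -> period in ms
};

// Stands in for HAL_CAN_SendMessage: only netcomm's state changes.
void Send(Model& m, int32_t apiId, int32_t periodMs) {
  if (periodMs > 0) {
    m.netcomm[apiId] = periodMs;
  } else {
    m.netcomm.erase(apiId);
  }
}

// Stands in for the locked map update that follows the send.
void Record(Model& m, int32_t apiId, int32_t periodMs) {
  if (periodMs > 0) {
    m.periodicSends[apiId] = periodMs;
  } else {
    m.periodicSends.erase(apiId);
  }
}

// Stands in for cleanup: stops only the packets the map knows about.
void Clean(Model& m) {
  for (const auto& entry : m.periodicSends) {
    Send(m, entry.first, 0);
  }
  m.periodicSends.clear();
}

// Pre-fix ordering: each send happens outside the lock, so A's and B's
// sends and records can interleave. Returns true if the packet leaks.
bool ReplayRacyInterleaving() {
  Model m;
  Send(m, 0, 0);     // Thread A: netcomm told to stop
  Send(m, 0, 10);    // Thread B: netcomm told to repeat
  Record(m, 0, 10);  // Thread B: map says repeating
  Record(m, 0, 0);   // Thread A: map says stopped
  assert(m.periodicSends.count(0) == 0);  // bookkeeping believes it stopped
  Clean(m);
  return m.netcomm.count(0) == 1;  // still repeating after cleanup
}

// Post-fix ordering: each thread's send and record are one atomic unit.
bool ReplayLockedInterleaving() {
  Model m;
  Send(m, 0, 0);     // Thread A, under the lock
  Record(m, 0, 0);
  Send(m, 0, 10);    // Thread B, under the lock
  Record(m, 0, 10);
  Clean(m);
  return m.netcomm.count(0) == 1;
}
```

In this model `ReplayRacyInterleaving()` reports a leaked repeating packet, while `ReplayLockedInterleaving()` does not, whichever thread takes the lock first.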

@rzblue rzblue requested a review from a team as a code owner July 9, 2024 18:51
Comment on lines 115 to 117
if (*status != 0) {
  return;
}
Member Author

Question for @ThadHouse: will netcomm still update its internal state if it returns an error?

Member

I am unsure. But I do seem to remember a mention from NI that those functions basically can't fail when any of the periodic flags are set.

@PeterJohnson PeterJohnson requested a review from ThadHouse July 13, 2024 14:53
Member Author

rzblue commented Jul 27, 2024

I've removed the status check on the tx functions, on the assumption that the periodic state will still have been updated even if the function returns a bad status.
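
A rough sketch of what that choice looks like, with hypothetical names throughout (`FlakySender`, `Device`, `WriteRepeatingNoStatusCheck`). It encodes the same unverified assumption as the comment above: the sender may apply the periodic update even when it reports an error.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical sender that can report an error while still applying the
// periodic update. Whether netcomm behaves this way is an assumption.
struct FlakySender {
  int32_t forcedStatus = 0;              // status to report from Send
  std::map<int32_t, int32_t> repeating;  // apiId -> period in ms

  int32_t Send(int32_t apiId, int32_t periodMs) {
    repeating[apiId] = periodMs;  // state changes regardless of status
    return forcedStatus;
  }
};

struct Device {
  FlakySender sender;
  std::mutex periodicSendsMutex;
  std::map<int32_t, int32_t> periodicSends;
};

void WriteRepeatingNoStatusCheck(Device& dev, int32_t apiId, int32_t repeatMs,
                                 int32_t* status) {
  assert(status != nullptr);
  std::lock_guard<std::mutex> lock(dev.periodicSendsMutex);
  *status = dev.sender.Send(apiId, repeatMs);
  // No early return on a bad status: record the packet as repeating anyway,
  // so a later cleanup pass still knows to stop it.
  dev.periodicSends[apiId] = repeatMs;
}
```

The trade-off in this sketch: a packet recorded as repeating when it is not costs at most a redundant stop at cleanup, whereas a packet that repeats without being recorded would never be stopped.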

@rzblue rzblue requested a review from a team as a code owner July 27, 2024 07:27
@PeterJohnson
Member

Needs conflicts resolved.

@rzblue rzblue force-pushed the canapi-periodic-race branch from 40fc634 to 1a0c448 Compare July 28, 2024 18:52
@PeterJohnson PeterJohnson merged commit 8c06ef6 into wpilibsuite:main Jul 29, 2024
36 checks passed
@rzblue rzblue deleted the canapi-periodic-race branch August 23, 2024 04:45
3 participants