While experimenting with USB benchmarking, I realized that there was a significant performance bump when switching from "receive then transmit" to "receive in one task, hand data off to other task, other task sends". postcard-rpc currently does all of the following in one task:
- Receive bytes off the wire
- Buffer them into a frame
- When a frame is complete, attempt to deserialize
- If deserialize succeeds, attempt to dispatch
- If dispatching succeeds, attempt to serialize the response
- If serializing succeeds, send the response frame one packet at a time
We would likely see some improvement by breaking the process into three stages:
- One that handles buffering of an entire frame
- One that does deserialization, dispatch, and serialization
- One that does fragmentation and sending of the frame as parts
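The three stages above can be sketched on a host machine with plain `std` threads and bounded channels. This is only an illustration of the pipelining idea, not postcard-rpc code: `MAX_PACKET`, the stage function names, and the byte-incrementing "handler" are all hypothetical, and the sketch assumes a frame ends at the first packet shorter than `MAX_PACKET` (the USB bulk short-packet convention). On a real target the channels would be replaced by bbqueue or embassy-sync primitives and the threads by async tasks.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Hypothetical packet size, kept tiny so the demo is easy to follow.
const MAX_PACKET: usize = 4;

/// Stage 1: buffer packets until a short packet marks the end of a frame.
fn rx_stage(packets: Receiver<Vec<u8>>, frames: SyncSender<Vec<u8>>) {
    let mut frame = Vec::new();
    for pkt in packets {
        let last = pkt.len() < MAX_PACKET;
        frame.extend_from_slice(&pkt);
        if last {
            // Hand the whole frame off, then go straight back to receiving.
            if frames.send(std::mem::take(&mut frame)).is_err() {
                return;
            }
        }
    }
}

/// Stage 2: stand-in for deserialize -> dispatch -> serialize.
/// The "handler" here just adds one to every byte (an assumption for the demo).
fn handler_stage(requests: Receiver<Vec<u8>>, responses: SyncSender<Vec<u8>>) {
    for req in requests {
        let resp: Vec<u8> = req.iter().map(|b| b.wrapping_add(1)).collect();
        if responses.send(resp).is_err() {
            return;
        }
    }
}

/// Stage 3: fragment each response frame into packets.
fn tx_stage(responses: Receiver<Vec<u8>>, wire: SyncSender<Vec<u8>>) {
    for resp in responses {
        for chunk in resp.chunks(MAX_PACKET) {
            if wire.send(chunk.to_vec()).is_err() {
                return;
            }
        }
        // A frame whose length is a multiple of MAX_PACKET needs an empty
        // packet so the receiver can see where it ends.
        if resp.len() % MAX_PACKET == 0 && wire.send(Vec::new()).is_err() {
            return;
        }
    }
}

/// Wire the stages together, feed `input` packets in, collect output packets.
fn run_pipeline(input: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let (pkt_tx, pkt_rx) = sync_channel(4);
    let (req_tx, req_rx) = sync_channel(4);
    let (resp_tx, resp_rx) = sync_channel(4);
    let (wire_tx, wire_rx) = sync_channel(4);

    let rx = thread::spawn(move || rx_stage(pkt_rx, req_tx));
    let mid = thread::spawn(move || handler_stage(req_rx, resp_tx));
    let tx = thread::spawn(move || tx_stage(resp_rx, wire_tx));
    // Feed from a separate thread so the bounded channels cannot deadlock.
    let feeder = thread::spawn(move || {
        for pkt in input {
            if pkt_tx.send(pkt).is_err() {
                break;
            }
        }
    });

    // Ends once every stage has finished and dropped its sender.
    let out: Vec<Vec<u8>> = wire_rx.iter().collect();
    feeder.join().unwrap();
    rx.join().unwrap();
    mid.join().unwrap();
    tx.join().unwrap();
    out
}
```

Calling `run_pipeline` with the packets `[1, 2, 3, 4]` and `[5, 6]` yields one six-byte request frame and a response sent as two packets. Because each stage only blocks on its own channel, stage 1 can accept the next frame while stage 2 is still working on the previous one.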
IMO this would be a very good use case for a pair of bbqueue or bbq2 queues, as they are already async, shareable, and offer a framed interface. This would allow receiving to not be blocked by the relatively larger middle steps, or by the time it takes to respond. Data can also be received directly into and sent directly out of the bbqueue, meaning that we do not pay any extra copying costs when passing frames between the pipelined stages.
Here are the graphs from testing simple bulk frame loopback, first using just a single task:
And here's what it looks like when I used a basic embassy-sync channel to split sending and receiving into separate tasks:
We should keep an eye on how much complexity this adds. It may be possible to handle this transparently in the server stack by implementing the WireRx and WireTx traits using bbqueue handles instead of directly driving the USB endpoints. This may also be useful for other protocols.
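As a rough sketch of what "transparent" could mean here: if the server loop is written only against a pair of receive/send traits, a queue-backed implementation can be swapped in without the loop changing. The `FrameRx` and `FrameTx` traits below are hypothetical, simplified, synchronous stand-ins and are not the real `WireRx`/`WireTx` signatures. `std` channels also stand in for bbqueue handles.

```rust
use std::sync::mpsc::{Receiver, SyncSender};

/// Hypothetical stand-in for a "receive one frame" trait.
trait FrameRx {
    /// Returns `None` once the other side has gone away.
    fn receive(&mut self) -> Option<Vec<u8>>;
}

/// Hypothetical stand-in for a "send one frame" trait.
trait FrameTx {
    /// Returns `false` if the frame could not be handed off.
    fn send(&mut self, frame: Vec<u8>) -> bool;
}

/// Receive half backed by a queue instead of a USB endpoint.
struct QueueRx(Receiver<Vec<u8>>);

/// Send half backed by a queue instead of a USB endpoint.
struct QueueTx(SyncSender<Vec<u8>>);

impl FrameRx for QueueRx {
    fn receive(&mut self) -> Option<Vec<u8>> {
        self.0.recv().ok()
    }
}

impl FrameTx for QueueTx {
    fn send(&mut self, frame: Vec<u8>) -> bool {
        self.0.send(frame).is_ok()
    }
}

/// Server loop that only knows about the traits, so it does not care
/// whether frames come from an endpoint or from another task's queue.
fn serve<R: FrameRx, T: FrameTx>(mut rx: R, mut tx: T, handle: impl Fn(&[u8]) -> Vec<u8>) {
    while let Some(req) = rx.receive() {
        if !tx.send(handle(&req)) {
            break;
        }
    }
}
```

With this shape, the tasks that actually drive the USB endpoints sit on the other end of the two queues, and `serve` itself is unchanged.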