Transmit directly from connection tasks #1729
Hmm, this needs a mechanism to broadcast wakeups when the UDP socket is backpressured... edit: solved by duplicating ( |
Looks cool -- initial round of feedback trying to make sense of it all.
I've replaced the original |
Hi, curious to try this out on various other platforms as well. How have you been running benchmarks to compare this? I guess any perf loss due to losing sendmmsg is offset by the perf gain of not doing the userspace-multiplexing. sendmmsg does give some small benefit when you manage to fill the pipe to many peers in one syscall, but I'd have to spend more time figuring out how often quinn really manages to do that.
I haven't benchmarked this very rigorously yet, since I don't have convenient access to a machine that's both running Linux (for our most optimized UDP backend) and not a laptop (subject to unpredictable throttling). We could use more data on that, if you're interested. I do plan to do some benchmarking on Windows, but we have neither sendmmsg nor GSO there just yet, so it might be a bit unfair to the status quo. For a single connection, sendmmsg is strictly worse than GSO, and for multiple connections I think the near-linear speedup from parallelism will be a much bigger win in any case, so I'm not too concerned about regressions, though it's always possible we might miss something silly.
Rebased.
Looking good!
Allows multiple tasks to concurrently wait for writability on the same socket. See the [tokio] and [async-io] docs for details.

[tokio]: https://docs.rs/tokio/latest/tokio/net/struct.UdpSocket.html
[async-io]: https://docs.rs/async-io/latest/async_io/struct.Async.html#concurrent-io
Dropping these has low cost because they're not associated with any connection.
`UdpState` was removed some time ago, and various other interface details have changed. As a result, the build would fail on platforms without native support.
As we no longer buffer multiple transmits in memory, this complexity is unused. GSO is expected to account for most, if not all, of the performance benefit.
Because we no longer buffer transmits for unpredictable periods, there's no need to share ownership of their contents.
This didn't impact any tests, but was confusing.
We no longer need to share ownership of this memory, so we should use a simpler type to reflect our simpler requirements.
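As a rough illustration of what sending without shared ownership can look like, here is a minimal sketch using only the standard library. The `Transmit` struct and `send_now` helper are hypothetical, simplified stand-ins invented for this example; they are not quinn's or quinn-udp's actual types. The point is only that a datagram sent immediately can borrow its bytes from a reusable buffer instead of holding a reference-counted copy.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Hypothetical, simplified transmit descriptor (not quinn-udp's real type).
/// It only borrows the datagram bytes, so building one never allocates.
struct Transmit<'a> {
    destination: SocketAddr,
    contents: &'a [u8],
}

/// Copies `payload` into the caller's reusable buffer and sends it right away.
/// Because nothing is queued, the borrow of `buf` ends when this returns and
/// the same buffer can be reused for the next datagram.
fn send_now(
    socket: &UdpSocket,
    destination: SocketAddr,
    buf: &mut Vec<u8>,
    payload: &[u8],
) -> io::Result<usize> {
    buf.clear();
    buf.extend_from_slice(payload);
    let transmit = Transmit {
        destination,
        contents: buf.as_slice(),
    };
    socket.send_to(transmit.contents, transmit.destination)
}
```

A caller would typically keep one `Vec<u8>` per sending task and pass it to `send_now` for every datagram, so steady-state sending performs no per-packet allocation.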
* feat(bin): use quinn-udp crates.io release instead of git ref

  `neqo-bin` has been importing `quinn-udp` as a git reference, in order to include quinn-rs/quinn#1765. The quinn project has since released `quinn-udp` `v0.5.0`. This commit upgrades `neqo-bin` to use `quinn-udp` `v0.5.0`.

  `quinn-udp` now takes a data reference (`&[u8]`) instead of owned data (`bytes::Bytes`) on its send path, so `neqo-bin` no longer needs to convert and can simply pass a reference. See quinn-rs/quinn#1729 (comment) for details.

  `quinn-udp` has dropped `sendmmsg` support in the `v0.5.0` release (quinn-rs/quinn@ee08826). `neqo-bin` does not (yet) use `sendmmsg`. This might change in the future (#1693).

* remove impl From<Datagram> for Vec<u8>
This allows outgoing data to parallelize perfectly. Initial informal testing suggests a performance improvement for bulk data, likely due to reduced allocation and cross-task messaging. A larger performance benefit should be expected for endpoints hosting numerous connections.
I'd left the original userspace-multiplexed transmit strategy in place for a long time because I assumed the UDP socket had a mutex-guarded send buffer, so leaning into task-parallelism for sending would just lead to contention. This turns out to be complete nonsense, at least on Linux: outgoing UDP datagrams are buffered by the kernel with dynamically allocated memory, and the primitive NIC queuing operation is apparently scalable, as borne out by empirical testing. The following minimal test scales almost linearly with physical parallelism on Linux:
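The test itself is not reproduced in this text. As a rough sketch of what a test of this kind could look like (an assumption for illustration, not the code that produced the result above), the function below has several OS threads call `send_to` on one shared `std::net::UdpSocket` and counts how many datagrams the kernel accepted. The name `parallel_send`, the 1200-byte payload, and the loopback destination are all choices made for this example; loopback in particular does not exercise a physical NIC queue, so a real measurement would target another host.

```rust
use std::net::UdpSocket;
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Sends datagrams from `threads` OS threads sharing a single socket for
/// roughly `duration`, returning the total number of successful sends.
fn parallel_send(threads: usize, duration: Duration) -> std::io::Result<u64> {
    // A local socket that is never read from; it only provides a valid
    // destination address. Excess datagrams are dropped by the kernel.
    let sink = UdpSocket::bind("127.0.0.1:0")?;
    let dest = sink.local_addr()?;

    // One sending socket shared by every thread; no user-space lock is held.
    let socket = Arc::new(UdpSocket::bind("127.0.0.1:0")?);
    let payload = [0u8; 1200];
    let deadline = Instant::now() + duration;

    let workers: Vec<_> = (0..threads)
        .map(|_| {
            let socket = Arc::clone(&socket);
            thread::spawn(move || {
                let mut sent = 0u64;
                while Instant::now() < deadline {
                    if socket.send_to(&payload, dest).is_ok() {
                        sent += 1;
                    }
                }
                sent
            })
        })
        .collect();

    // Sum the per-thread counts once every sender has finished.
    Ok(workers
        .into_iter()
        .map(|w| w.join().expect("sender thread panicked"))
        .sum())
}
```

Calling `parallel_send` with 1, 2, 4, ... threads for the same duration and comparing the returned counts gives a crude view of how send throughput changes with thread count on a given platform.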
It'd be interesting to see how this compares on other major platforms, but Linux's drastic improvement and the simplification of Quinn's internals are enough to satisfy me that we should go this way.