Implement acknowledgement frequency #1543

Closed
wants to merge 2 commits into from

Conversation

@aochagavia aochagavia (Contributor) commented Apr 19, 2023

Note: this PR is no longer a draft, but I've left the text below untouched so people can trace how the PR was developed. You will probably want to jump to this comment for an overview of what this PR actually achieves.


This is a draft PR for now, because it is missing the following stuff:

  • Regularly sending ACK_FREQUENCY frames to actively control the ACK frequency of the peer (I'm waiting for Tie up PLPMTUD's loose ends #1529 to be merged to avoid conflicts around the congestion control code)
  • Polishing out-of-order packet detection (more on that below, but the current code to handle out-of-order ack-eliciting packets is probably wrong)
  • Adding a few more tests

As to the things that are properly implemented:

  • min_ack_delay transport parameter
  • ACK_FREQUENCY and IMMEDIATE_ACK frames
  • Logic around delaying ACKs up to max_ack_delay, properly tweaked upon receiving an ACK_FREQUENCY frame
  • Receiving an IMMEDIATE_ACK causes an ACK to be sent in the next poll_transmit call, even if that results in an ACK-only packet.
  • If the remote peer supports it, we send an IMMEDIATE_ACK in the following cases:
    • When a PMTU probe is sent (see section 8.3 of the draft)
    • When a client wants to validate the path after connection migration (see section 8.4 of the draft)
    • When a loss probe is sent (see section 5 of the draft)
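
Taken together, the delayed-ACK rules above amount to roughly the following. This is a standalone sketch, not quinn's actual types: `PendingAcksSketch` and all of its fields are illustrative names, and the real implementation in the PR tracks considerably more state.

```rust
use std::time::{Duration, Instant};

/// Illustrative sketch of the delayed-ACK decision: an ACK becomes due once
/// the number of unacked ack-eliciting packets exceeds a threshold, when an
/// IMMEDIATE_ACK frame arrives, or when max_ack_delay has elapsed.
struct PendingAcksSketch {
    unacked_ack_eliciting: u64,
    ack_eliciting_threshold: u64, // tweaked by incoming ACK_FREQUENCY frames
    max_ack_delay: Duration,      // likewise tweaked by ACK_FREQUENCY frames
    earliest_unacked: Option<Instant>,
    immediate_ack_requested: bool,
}

impl PendingAcksSketch {
    fn on_ack_eliciting_packet(&mut self, now: Instant) {
        self.unacked_ack_eliciting += 1;
        // Remember when the oldest pending ACK started waiting
        self.earliest_unacked.get_or_insert(now);
    }

    fn on_immediate_ack_frame(&mut self) {
        self.immediate_ack_requested = true;
    }

    /// Should the next poll_transmit include an ACK, even if that
    /// results in an ACK-only packet?
    fn ack_required(&self, now: Instant) -> bool {
        self.immediate_ack_requested
            || self.unacked_ack_eliciting > self.ack_eliciting_threshold
            || self
                .earliest_unacked
                .map_or(false, |t| now.duration_since(t) >= self.max_ack_delay)
    }
}
```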

That being said, here are a few questions:

  1. What do you think of the general direction of the PR? Do you see anything that seems out of place?
  2. Is there a way to access unencrypted packet frames in tests that simulate a connection (quinn-proto/src/test/mod.rs)? It would be useful to ensure the ACK delay value in ACK frames is being properly set. Unit tests don’t give the same degree of confidence and have a higher maintenance cost. If this doesn't exist yet, would you welcome a commit adding it?
  3. I'm confused about the way out-of-order packets are detected in the ACK frequency draft, which seems different from the approach in RFC 9000. The draft gives an example that, to my understanding, is wrong (which probably means my understanding is wrong). I posted a detailed question in the quicdev Slack to get more clarity, so please have a look if you think you can help.

@aochagavia aochagavia force-pushed the delayed-ack branch 2 times, most recently from 8830d95 to 5a1e0c0 Compare April 19, 2023 10:23
@Ralith Ralith (Collaborator) commented Apr 19, 2023

Is there a way to access unencrypted packet frames in tests that simulate a connection (quinn-proto/src/test/mod.rs)?

Not presently. We've generally relied on making assertions about resulting connection state. You could do that here by looking at the RTT estimate, but that is a bit indirect. I don't see anything fundamentally wrong with asserting about the transmitted plaintext (we do something morally similar in the version negotiation tests, after all), but it's not immediately obvious to me how it might be done gracefully.

@aochagavia aochagavia (Contributor, Author)

I don't see anything fundamentally wrong with asserting about the transmitted plaintext (we do something morally similar in the version negotiation tests, after all), but it's not immediately obvious to me how it might be done gracefully.

I just pushed a commit prototyping how this could work. It introduces a Connection::decode_packet function that returns the decoded bytes of the packet's payload, meant to be used from tests. It duplicates about 70 lines of code, though, and it requires deriving Clone for two structs (we could make the latter conditional for testing, if desired). It feels a bit awkward at the moment because of the code duplication, but with a bit of effort I might be able to extract that code into a dedicated module that can be shared. Do you think this is a good direction?

By the way, I have had no feedback on the implementer's Slack about detection of out-of-order packets in the ACK frequency draft, so my current plan is to go through the draft again next week and implement my own interpretation of it. I also plan to look around for open implementations of the current version of the draft that I can compare against.

@Ralith Ralith (Collaborator) commented Apr 27, 2023

Do you think this is a good direction?

Seems reasonable in general (especially with #[cfg(test)]), but yeah factoring out the common logic would be nice. I have no problem with adding Clone to internal types; its absence generally isn't principled, we just haven't needed it before.

I have had no feedback on the implementer's Slack about detection of out-of-order packets in the ACK frequency draft

I saw; that was disappointing. Maybe open an issue (or PR with a correction) on https://github.com/quicwg/ack-frequency? The WG has been good about engaging on github historically. I definitely wouldn't be shocked by an error in an example.

@aochagavia aochagavia changed the base branch from main to plpmtud-2 May 2, 2023 14:09
@aochagavia aochagavia force-pushed the delayed-ack branch 3 times, most recently from d011899 to bbbd8b7 Compare May 2, 2023 15:09
@aochagavia aochagavia marked this pull request as ready for review May 2, 2023 15:10
@aochagavia aochagavia (Contributor, Author) commented May 2, 2023

I just pushed a first non-draft version of this PR, and modified the PR's base to the plpmtud-2 branch to get us a better diff.

Here is an overview of the feature:

  • The ack frequency extension allows one to control the ACK frequency of the remote peer.
  • Quinn provides an AckFrequencyConfig to control the parameters, which are sent to the peer in an ACK_FREQUENCY frame at the beginning of the connection. Technically, the extension allows modifying the ack frequency parameters multiple times during a connection, but that would require an algorithm to automatically tune the parameters, which IMO is out of scope for this first implementation.
  • On the other side of the connection, Quinn will receive and process the ACK_FREQUENCY frame, updating the local ack frequency parameters based on the new parameters sent by the peer.
  • If the local peer has requested a custom ack frequency from the remote peer, Quinn will send an IMMEDIATE_ACK frame (a new kind of frame that immediately elicits an ACK) every RTT if there are any unacked ack-eliciting packets. This is recommended by the draft as a way to ensure congestion control and loss detection work properly.
  • The draft is ambiguous when it comes to out-of-order packet detection. I wrote the specific details in this issue, hoping it gets clarified. In the meantime, I am sticking to the text of the draft and ignoring the parts in the examples that deviate from the draft's definitions.
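
The ACK_FREQUENCY handling in the bullets above boils down to applying the frame's fields while discarding out-of-date frames. Here is a standalone sketch: the field names follow the draft's ACK_FREQUENCY frame layout, but the surrounding types are illustrative, not quinn's actual API.

```rust
/// Sketch of the ACK_FREQUENCY frame fields (per draft-ietf-quic-ack-frequency)
/// and how a receiver updates its local ack parameters from them.
#[derive(Clone, Copy)]
struct AckFrequencyFrame {
    sequence: u64, // newer frames supersede older ones
    ack_eliciting_threshold: u64,
    request_max_ack_delay_us: u64,
    reordering_threshold: u64,
}

/// Illustrative holder for the receiver's current ack frequency parameters.
struct LocalAckParams {
    last_sequence: Option<u64>,
    ack_eliciting_threshold: u64,
    max_ack_delay_us: u64,
    reordering_threshold: u64,
}

impl LocalAckParams {
    /// Apply a peer's ACK_FREQUENCY frame; returns false if the frame is
    /// stale (its sequence number does not exceed the last one applied).
    fn apply(&mut self, frame: AckFrequencyFrame) -> bool {
        if self.last_sequence.map_or(false, |s| frame.sequence <= s) {
            return false; // reordered or duplicate frame: ignore it
        }
        self.last_sequence = Some(frame.sequence);
        self.ack_eliciting_threshold = frame.ack_eliciting_threshold;
        self.max_ack_delay_us = frame.request_max_ack_delay_us;
        self.reordering_threshold = frame.reordering_threshold;
        true
    }
}
```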

I have a bunch of questions / comments:

  1. I introduced two timers, one to be notified when max_ack_delay has elapsed, and one to be notified when an RTT has elapsed. I tried to come up with an approach that wouldn't require additional timers, but I'm not sure it is possible. It would be nice if you could double-check that, given your knowledge of the codebase.
  2. I introduced a reset_rtt_timer function that arms the Timer::Rtt for the first time. I wanted to arm it right after we have reached the Data space and have an RTT estimate. I ended up calling it inside process_payload. Please let me know if there is a more suitable place.
  3. I rewrote a long comment inside PendingAcks::acks_sent, which I found difficult to understand. Hopefully the new version says the same in clearer terms (otherwise let me know and I'll update it).
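
The two timers in point 1 can be pictured as follows. This is an illustrative sketch only: `AckTimersSketch` and `reset_rtt_timer`'s signature here are hypothetical stand-ins, not the PR's actual Timer variants.

```rust
use std::time::{Duration, Instant};

/// Sketch of the two extra deadlines: one firing when max_ack_delay elapses
/// for pending ACKs, and one firing once per RTT to trigger an IMMEDIATE_ACK
/// while unacked ack-eliciting packets exist.
struct AckTimersSketch {
    max_ack_delay_deadline: Option<Instant>,
    rtt_deadline: Option<Instant>,
}

impl AckTimersSketch {
    /// Earliest deadline the event loop should wake up for, if any.
    fn next_timeout(&self) -> Option<Instant> {
        match (self.max_ack_delay_deadline, self.rtt_deadline) {
            (Some(a), Some(b)) => Some(a.min(b)),
            (a, b) => a.or(b),
        }
    }

    /// Re-arm the RTT timer using the current RTT estimate, e.g. after it
    /// fires or once an estimate first becomes available.
    fn reset_rtt_timer(&mut self, now: Instant, rtt_estimate: Duration) {
        self.rtt_deadline = Some(now + rtt_estimate);
    }
}
```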

Tomorrow I'll do some benchmarking to ensure everything works properly.

@aochagavia aochagavia (Contributor, Author) commented May 3, 2023

And... here are the benchmarks!

Practical details

I'm running the server using cargo run --release --bin perf_server -- --no-protection and the client using cargo run --release --bin perf_client -- --no-protection --duration 15 (note that this makes the benchmarks misleading for real-world use cases, because encryption is disabled). As you can deduce from the cargo commands used, I'm running both benchmarks on the same machine.

For the scenarios with ACK frequency enabled I'm doing so through the following code:

```rust
let mut ack_freq = AckFrequencyConfig::default();
ack_freq.ack_eliciting_threshold(10);
transport.ack_frequency_config(Some(ack_freq));
```

Conclusion

There seems to be no significant performance difference in the benchmarks, though we probably need a more statistically rigorous approach to benchmarking if we want to be sure.

Baseline (before this PR)

      │ Upload Duration │ Download Duration │ FBL        │ Upload Throughput │ Download Throughput
──────┼─────────────────┼───────────────────┼────────────┼───────────────────┼────────────────────
 AVG  │        862.00µs │          811.00µs │    50.00µs │     1199.97 MiB/s │       1293.80 MiB/s
 P0   │        661.00µs │          534.00µs │     1.00µs │       32.34 MiB/s │         34.34 MiB/s
 P10  │        741.00µs │          678.00µs │     2.00µs │      986.50 MiB/s │       1089.00 MiB/s
 P50  │        810.00µs │          762.00µs │    47.00µs │     1235.00 MiB/s │       1313.00 MiB/s
 P90  │          1.01ms │          919.00µs │    79.00µs │     1350.00 MiB/s │       1475.00 MiB/s
 P100 │         30.91ms │           29.12ms │   470.00µs │     1513.00 MiB/s │       1873.00 MiB/s

ACK frequency disabled (threshold = 1)

      │ Upload Duration │ Download Duration │ FBL        │ Upload Throughput │ Download Throughput
──────┼─────────────────┼───────────────────┼────────────┼───────────────────┼────────────────────
 AVG  │        920.00µs │          805.00µs │    53.00µs │     1190.95 MiB/s │       1280.93 MiB/s
 P0   │        670.00µs │          134.00µs │     1.00µs │       33.44 MiB/s │         34.12 MiB/s
 P10  │        746.00µs │          699.00µs │     4.00µs │      996.50 MiB/s │       1113.00 MiB/s
 P50  │        819.00µs │          774.00µs │    49.00µs │     1222.00 MiB/s │       1292.00 MiB/s
 P90  │          1.00ms │          899.00µs │    82.00µs │     1341.00 MiB/s │       1431.00 MiB/s
 P100 │         29.92ms │           29.30ms │   847.00µs │     1493.00 MiB/s │       7464.00 MiB/s

ACK frequency enabled on the client (threshold = 10)

      │ Upload Duration │ Download Duration │ FBL        │ Upload Throughput │ Download Throughput
──────┼─────────────────┼───────────────────┼────────────┼───────────────────┼────────────────────
 AVG  │        866.00µs │          767.00µs │    48.00µs │     1187.26 MiB/s │       1479.18 MiB/s
 P0   │        686.00µs │           70.00µs │     1.00µs │       33.25 MiB/s │         32.25 MiB/s
 P10  │        753.00µs │          680.00µs │     3.00µs │     1020.50 MiB/s │       1198.00 MiB/s
 P50  │        821.00µs │          749.00µs │    46.00µs │     1219.00 MiB/s │       1336.00 MiB/s
 P90  │        980.00µs │          835.00µs │    77.00µs │     1329.00 MiB/s │       1471.00 MiB/s
 P100 │         30.08ms │           30.99ms │   421.00µs │     1458.00 MiB/s │      14288.00 MiB/s

ACK frequency enabled on the server (threshold = 10)

      │ Upload Duration │ Download Duration │ FBL        │ Upload Throughput │ Download Throughput
──────┼─────────────────┼───────────────────┼────────────┼───────────────────┼────────────────────
 AVG  │        890.00µs │          844.00µs │    51.00µs │     1182.28 MiB/s │       1202.34 MiB/s
 P0   │        684.00µs │          122.00µs │     1.00µs │       34.59 MiB/s │        100.25 MiB/s
 P10  │        766.00µs │          753.00µs │     2.00µs │     1016.50 MiB/s │       1064.00 MiB/s
 P50  │        826.00µs │          828.00µs │    49.00µs │     1211.00 MiB/s │       1208.00 MiB/s
 P90  │        984.00µs │          940.00µs │    86.00µs │     1306.00 MiB/s │       1329.00 MiB/s
 P100 │         28.90ms │            9.98ms │   427.00µs │     1462.00 MiB/s │       8200.00 MiB/s
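
As a side note on the statistics question raised in the conclusion: the P0..P100 rows in these tables are presumably nearest-rank percentiles over the per-run samples (an assumption about the perf tool's methodology, not taken from its source). A minimal sketch of that computation:

```rust
/// Nearest-rank percentile over an ascending-sorted slice of samples.
/// Assumption: this mirrors how the perf tool summarizes its samples;
/// it is not taken from the tool's actual source code.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty() && (0.0..=100.0).contains(&p));
    // Map p in [0, 100] onto an index in [0, len - 1]
    let rank = ((p / 100.0) * (sorted.len() - 1) as f64).round() as usize;
    sorted[rank]
}
```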

@djc djc deleted the branch quinn-rs:plpmtud-2 May 3, 2023 10:33
@djc djc closed this May 3, 2023
@aochagavia aochagavia (Contributor, Author)

@djc it looks like you closed this PR by accident

@aochagavia aochagavia (Contributor, Author)

For anyone following along: I just rewrote the above comment with the benchmark results, because it looks like my baseline was non-representative.

@djc djc (Member) commented May 3, 2023

Ahh, sorry: I deleted the branch for #1552, which apparently makes it impossible to reopen this PR. Can you try repushing this branch so we can reopen it? We could open a new PR, but it would be better to keep this one going.

@aochagavia aochagavia (Contributor, Author) commented May 3, 2023

Ah yes, because the PR was targeting that branch... I can no longer edit the target branch of the PR, even after force-pushing a rebased version, so it looks like I'll have to open a new one.

@aochagavia aochagavia (Contributor, Author)

And here is the new PR: #1553

3 participants