-
Notifications
You must be signed in to change notification settings - Fork 552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert application close frame to transport close frame during handshake. #4169
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4169 +/- ##
==========================================
- Coverage 85.59% 84.04% -1.56%
==========================================
Files 56 56
Lines 15358 15384 +26
==========================================
- Hits 13146 12929 -217
- Misses 2212 2455 +243 ☔ View full report in Codecov by Sentry. |
I had some more time to spend on this, and it turns out to be a slightly bigger rabbit hole than I originally thought: The way TLS abstraction on top of the CRYPTO stream works, is that when there is a pending validation, we keep "checked out" part of the buffer, and if we attempt to process future CRYPTO frame before the async validation is completed, an assert segfaults, which is a good thing I guess. However, this means that we have to pause processing CRYPTO data until the user sets the result for the async operation. This is inconvenient especially on client, as current MsQuic implementation sends the CONNECTION_CLOSE on 1-RTT level, because it has the keys, but client derives the keys only once the (asynchronous) cert validation completes (on Linux at least). So the client will defer the CONNECTION_CLOSE until the async validation completes. We should implement sending CC on "Current and previous levels" as RFC suggests to get the desired behavior, but I would prefer to do that as a separate change. |
Looks like we have down-level test failures, indicating we have changed behavior unexpectedly. |
13eb5b8
to
d9d4bd4
Compare
There seem to be some test failures in the kernel mode runs, @nibanks Do you have any suggestions how to debug those? For the others, there seems to be some nondeterminism in the added tests. And the best way to fix it seemed to me to bite the bullet and implement the sending of CC on both current and lower encryption levels, but that is going to break down-level tests because it introduces a behavioral change (the client is going to process CC in Handshake packets earlier than CC in 1-RTT packets, where previously we did not send CC in Handshake packets). What do you think? |
I'm not sure. I haven't had the time to dedicate to thinking about this recently, but I do suspect we will need to send both, and deal with fixing the down-level tests. |
400449b
to
4c69b4e
Compare
Looking at CI errors, there are many failures on different cases of
I think this PR creates a race condition. The An idea occurred to me, if we encounter transport close with APPLICATION_ERROR, we could delay the closing of the connection and check if there is a 1-RTT packet in the same datagram with an App close frame and report that close error instead. This would keep the same behavior as before (at least when running against other MsQuic servers of the same version). What do you think, @nibanks? |
I like the idea. I wonder if instead of delay, we treat it more like we allow the 1-RTT app close to overwrite the transport close. The change likely will just be that we need to keep processing the packets (and frames). |
This reverts commit 02ae60b.
770d900
to
7db08bb
Compare
@nibanks There are some test failures, but there are no logs attached in the downloadable artifacts, they might be the same as in #4169 (comment) but I can't say for sure. |
I think I see what is going on. In the failed runs, server locally confirms the handshake (since TLS handshake completed), and calls If client sends CC before receiving the HANDSHAKE_DONE frame, it should according to RFC send it in Handshake packet as well. For Long-header packets MsQuic does lookup based on the Remote hash Lines 1449 to 1463 in 226138d
but the connection was already removed from said lookup so both Handshake and 1-RTT packets with CC are discarded. |
The comment above the mentioned code suggests that at server-side, Destination CIDs are not under server control for long-header packets. I needed to refresh my memory by reading the RFC
So it seems to me that RFC-conforming client can only choose the DCID of the first Initial packet, so it seems safe to lookup Handshake packets by DCID as well. what do you think? |
Following quick and dirty attempt fixes the issue for my local repro - if (!Binding->ServerOwned || Packets->IsShortHeader) {
+ if (!Binding->ServerOwned || Packets->IsShortHeader || Packets->LH->Type == QUIC_HANDSHAKE_V1) {
Connection =
QuicLookupFindConnectionByLocalCid(
&Binding->Lookup,
Packets->DestCid,
Packets->DestCidLen);
} else { Proper fix would have to take QUIC Version-dependent header Type values to account ofc. What do you think, @nibanks? |
55147c2
to
97b770f
Compare
Client can (and do I believe) continue to use the initial DCID beyond the first round trip. AFAIK, there is nothing that prevents that. So, while it might work with MsQuic as a client, it doesn't necessarily work with all clients. You also need to consider if there is any (probably not) attack implication by allowing these types of packets to be routed.
You also need to make sure those fields have been validated already. Another thought I had though, is should we delay removing the DCID from the lookup table on client, until we have an ACK for the handshake confirmed frame? Would that fix this too? |
I thought that is against the very first sentence of the linked paragraph.
As for possible attacks, I don't see any, third parties should not have access to necessary handshake keys, and packets failing decryption/integrity check shouldn't affect the connection.
The fields are luckily already validated at this point (assert at the beginning of the containing function).
That would work as well, but the fix would be slightly more complicated. I can change it if you think this would be a better fix. |
…g handshake." This reverts commit 97b770f.
76aa2c1
to
53e6200
Compare
My fix broke 0RTT tests. Looks like your idea works better, so let's go with that :D |
QUIC_PATH* Path = &Connection->Paths[0]; | ||
CXPLAT_DBG_ASSERT(Path->Binding != NULL); | ||
QuicBindingOnConnectionHandshakeConfirmed(Path->Binding, Connection); | ||
} | ||
|
||
QuicCryptoDiscardKeys(Crypto, QUIC_PACKET_KEY_HANDSHAKE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to throw away the handshake key still? Or should this whole function just be delayed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to keep them, when TLS handshake finishes at server side, client already must have 1-RTT keys. The issue this fixes is only about getting the coalesced datagram routed to the connection.
Fixes #4166.
Description
This PR ensures we convert application CONNECTION_CLOSE to transport CONNECTION_CLOSE when the handshake is not confirmed yet (and the peer may not be able to process the frame yet).
Testing
New tests added.
Documentation
No change needed.