tls-eio: decide on improvements needed #464

Open · talex5 opened this issue Jan 4, 2023 · 9 comments

@talex5 (Contributor) commented Jan 4, 2023

In #460 (comment), @hannesm commented:

Since it seems both you @bikallem and @talex5 are working on TLS and EIO, maybe it'd make sense if both of you come to a common understanding what kind of semantics you'd like to have. [...] I already wonder whether I merged the tls-eio PR way too early.

I agree we're sending too many (half-baked) PRs. Let's first decide what behaviour we want and then write the code afterwards. But I think we really need @hannesm's input on this.

As I see it, there are 4 issues being discussed:

  1. tls-eio may perform concurrent writes on a flow, and Eio does not specify the behaviour in this case.
  2. Trying to use a TLS flow after it has failed produces confusing errors.
  3. Trying to use a TLS flow after cancelling an operation is not supported and produces confusing errors.
  4. Half-shutdown does not work, but the documentation doesn't mention this and the resulting behaviour is confusing.

As far as I can see, 1, 2 and 3 also affect tls-lwt (4 doesn't because the documentation doesn't claim to support it in the first place).

Concurrent writes

Reading from a TLS flow may cause a write. This means that performing a read and a write on the TLS flow at the same time (which should be allowed) may cause overlapping writes on the underlying flow (which isn't allowed).
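
To make the race concrete, here's an illustrative sketch (not tls-eio code) of the usage pattern that can trigger it, assuming a recent Eio:

    (* Illustrative only: one fiber reads from the TLS flow while another
       writes to it. If the read makes the TLS engine emit a response record
       (e.g. acknowledging a close_notify or a rekey), that record and the
       application's own write may interleave on the underlying socket. *)
    let read_and_write tls_flow =
      let buf = Cstruct.create 4096 in
      Eio.Fiber.both
        (fun () -> ignore (Eio.Flow.single_read tls_flow buf))
        (fun () -> Eio.Flow.copy_string "application data" tls_flow)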

I'm not sure why we haven't seen this problem in tls-lwt, which seems to be doing the same thing (Lwt_cs.write_full doesn't take a lock, even though it may do multiple writes on the FD).

This problem doesn't show up in the fuzz tests because the mock socket handles concurrent writes in order. But I tried modifying it to fail in the case of concurrent writes and it still didn't trigger. Do concurrent writes only happen on errors, perhaps?

In #458 (comment), @hannesm says:

the other side-effecting pieces do not need a mutex due to careful design

I propose that Eio should explicitly document that the behaviour of concurrent writes is unspecified (see ocaml-multicore/eio#387). It's possible that there's nothing to fix here, but I suspect we need some kind of lock when reporting errors at least. I can make a PR to do that, but I'd like to know whether this is really a problem and how to trigger it in the fuzz tests if so.
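
For concreteness, a minimal sketch of the locking idea, assuming we keep one Eio.Mutex.t per TLS flow and take it around every write to the underlying flow (the function and argument names are illustrative, not current tls-eio code):

    (* Hypothetical helper: serialise all writes to the underlying flow
       through one mutex, so a write triggered by a read cannot interleave
       with an application write. *)
    let locked_write write_lock write_to_socket bufs =
      Eio.Mutex.lock write_lock;
      Fun.protect
        ~finally:(fun () -> Eio.Mutex.unlock write_lock)
        (fun () -> write_to_socket bufs)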

Errors using a flow after failure

If an exception occurs while reading or writing the underlying flow, tls-eio moves into the Error state and all future operations fail with the same error.

This can be confusing. For example, if a read failed with Out_of_memory then all future reads will also claim to be out of memory. The stack-trace only indicates where the error was rethrown, not the original cause.

This also affects tls-lwt, but people don't expect to get reasonable backtraces from Lwt anyway.

I propose that:

  1. We record the backtrace when storing an exception (if backtraces are on) and include it when reraising.
  2. When reraising, we wrap the error as e.g. Tls_previously_failed _ to indicate that the error hasn't just happened.
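
A minimal sketch of (1) and (2), assuming we store the raw backtrace next to the exception and wrap it when re-raising (Tls_previously_failed is the name suggested above; the rest is illustrative):

    exception Tls_previously_failed of exn

    type failure = { exn : exn; bt : Printexc.raw_backtrace }

    (* Call inside the exception handler, before anything else can overwrite
       the current backtrace. *)
    let record_failure exn = { exn; bt = Printexc.get_raw_backtrace () }

    (* Re-raise with the original backtrace, wrapped so the caller can see
       that the error did not just happen. *)
    let reraise_failure { exn; bt } =
      Printexc.raise_with_backtrace (Tls_previously_failed exn) bt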

In #458, @bikallem instead proposes not storing errors at all. I think that could be confusing, however, since you will likely get bogus errors about protocol violations. In any case, I think the tls library as a whole should decide whether it wants to do this and tls-eio should do the same as tls-lwt.

Errors using a flow after cancellation

If a read or write operation is cancelled, this raises a Cancelled exception which is stored as noted above. This means that a TLS flow cannot be used again after being cancelled. The reported error is confusing because it looks like it has been cancelled again and doesn't mention the location of the original cancellation. This is a bigger problem than for other exceptions because the original cancellation exception will typically not have been shown to the user.

This should also affect Lwt, which also raises an exception to indicate cancellation.

There seem to be three reasonable options here:

  1. Using a flow after cancelling is never allowed (the current behaviour). With the improvements to re-raising errors above, this should be fairly clear.
  2. Reusing a cancelled flow is best-effort. If we cancel during a read then we ignore the error and the flow can continue to be used, but if you are unlucky and cancel during a write then reusing the flow will raise an exception, as before. This is the behaviour you get from #459 (tls-eio: allow cancelling reads).
  3. Make it safe to cancel and resume at any point. This would require changes to Eio to report partial writes, which we have debated doing.

Option 1 seems simplest here. I'm not convinced there's much use for cancelling a flow operation if you might want it later, and we've managed with this behaviour in tls-lwt so far without trouble.

Half-close / shutdown

tls-eio claims to implement the Eio.Flow.two_way signature, which allows performing a half-close. However, tls always closes both directions (and tls-eio behaves confusingly when you then try to read, simply returning end-of-file).

#460 tried to fix this by shutting down the underlying flow, but that won't work (both because TLS needs to write in order to read, and because the underlying flow is supposed to remain usable after a TLS shutdown).

It should be possible to support half-close properly (tracked in #452). @hannesm wrote:

Reading up on the RFCs and the mailing list, I guess we can actually have a unified close_notify semantics for all protocol versions. Still we need to track that close_notify state in the TLS engine to reject any "send_application_data" etc.

So the main question is what tls-eio should do until then. Some options are:

  1. Simply document in tls_eio.mli that shutdown closes both directions for now and note that we intend to improve this in future.
  2. As (1), but also raise an exception if you try to shut down the sending side only.
  3. As (1), but raise a helpful exception on future reads instead of end-of-file.
  4. Stop claiming to support generic POSIX-style shutdown and provide a separate close_tls operation as tls-lwt does.

Part of the problem is that some users may be satisfied with the current behaviour (i.e. they don't plan to read after shutting down the sending side), so (2) or (4) will annoy them unnecessarily. And changing the API in (4) and then changing it back once it's working seems annoying too. So I think I prefer either (1) or (3).
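
For illustration, a rough sketch of what (3) could look like (all names here are hypothetical; none of this exists in tls-eio today):

    (* Hypothetical: remember that we already sent close_notify and raise a
       descriptive exception on later reads instead of returning end-of-file. *)
    exception Tls_closed_by_local_shutdown

    type shutdown_state = { mutable close_notify_sent : bool }

    let read_checking_shutdown st real_read buf =
      if st.close_notify_sent then raise Tls_closed_by_local_shutdown
      else real_read buf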

@hannesm (Member) commented Feb 4, 2023

Hello @talex5,

Indeed there have been a lot of PRs and issues. I also suspect this issue is slightly mis-titled. From what I have learned so far:

  • the current close semantics of tls-mirage and tls-lwt are not optimal
  • using the same TLS context from different threads / tasks concurrently is not per se safe (this is certainly easier in a single-CPU world where, e.g. with Lwt, we can reason that there's a yield only at an Lwt.bind)

I go into more detail below. Please let me know whether this is useful for you.

Concurrent writes

Thinking about it, I fail to understand the use-case (application-wise) where you have a single TLS connection and write concurrently to it from multiple tasks. Usually (HTTP/DNS/...) you have a request-response mechanism with only a single task involved. Would you mind elaborating on a concrete, real application that writes concurrently from multiple tasks?

Lwt_cs.write_full doesn't take a lock

That could indeed be an issue in the Lwt backend, though I haven't seen a tls-lwt user that writes concurrently to a single fd.

Errors using flow after failure

Maybe we should distinguish some errors here: (a) out of memory, which in my opinion should never be intercepted (indeed, that should be fixed in tls-lwt); (b) any other exception. For (b) my assumption is that any (exposed) TLS alert (from the other side) is fatal -- i.e. no further communication is possible. Any local error (processing data from the other side) is fatal as well, with no way to recover from it. Any socket issue (e.g. ECONNRESET) is fatal as well.

So, leaving out of memory aside, I don't think it makes sense to use a flow once it has moved to the failed state.

Errors using a flow after cancellation

I agree, we can stick (at least for now) to "cancelling an operation on a flow cancels that flow". Again, I don't see a concrete use case to allow cancelling a read/write and afterwards using the flow for a different (or the same) purpose.

Half-close / shutdown

As said, this needs to be implemented in the tls core. There is also the consideration that closing a TLS connection does not imply the underlying transport is finished, i.e. you could re-use that TCP connection for e.g. another TLS handshake (though, to be honest, I don't know of real-world usage thereof).

"What EIO should be doing until half-close / shutdown is in tls?" I don't know, after all it's your framework.

hannesm added a commit to hannesm/ocaml-tls that referenced this issue Feb 6, 2023
as discussed in mirleft#464 (thanks to @talex5 for raising this)

@hannesm (Member) commented Feb 6, 2023

As @talex5 explained in #469

So, I think TLS should either store all unknown exceptions or none of them, and report it to the application in all cases.

The main tricky case is read, which may successfully read some data but then fail to write. Currently it reports success and then reports the write failure later.

Now I understand your "Errors using a flow after failure" point a bit better. The background: consider a TLS session where we made a request and receive a response. The server finishes the response and sends a close_notify message. Because of TCP it can happen that TLS receives <further data> <alert: close notify> in a single packet (a single OS read). The user of the TLS stack is likely interested in those last pieces of data, so returning `Eof immediately is not a good solution. But as you outline, deferring all exceptions to a potentially later point is not good either.

My current approach would be to replace the handle in read_react with:

    let handle tls buf =
      match Tls.Engine.handle_tls tls buf with
      | Ok (state', `Response resp, `Data data) ->
          let state' = match state' with
            | `Ok tls  -> `Active tls
            | `Eof     -> `Eof
            | `Alert a -> `Error (Tls_alert a)
          in
          t.state <- state' ;
          safely (resp |> when_some (write_t t)) >|= fun () ->
          (* HERE *)
          if data = None && state' <> `Active _ then state' else `Ok data

      | Error (alert, `Response resp) ->
          t.state <- `Error (Tls_failure alert) ;
          write_t t resp >>= fun () -> read_react t
    in

What do you think? This would return errors earlier, while keeping the last received bytes intact.

@talex5 (Contributor, Author) commented Feb 10, 2023

Thinking about it, I fail to understand the use-case (application-wise) where you have a single TLS connection and write concurrently from multiple tasks onto it.

I don't want to do concurrent writes at all. The problematic case is:

  1. The client starts a read operation, waiting for data.
  2. The client also performs a write. Tls asks the socket to send some encrypted data.
  3. Before that finishes, the client receives a message from the server. Tls asks the socket to write a reply (e.g. confirming shutdown).

Then the two writes from (2) and (3) may get interleaved.

The fuzz tests don't detect this because once the server has sent the close request it stops reading from the socket, so it doesn't matter if the data sent from the client gets corrupted.

I tried adding an assert to the fuzz tests to detect overlapping writes. Then, the client's read operation gets an assertion failure. But read_react ignores the exception because it's already at Eof. The next read then fails with Eof because the flow is closed, and so we never see the assertion failure.

Is sending a close confirmation the only case where a TLS read does a socket write? If so, using the TLS 1.3 close behaviour would make all of this complexity go away. If not, we need a lock around the writes.

What do you think? This would return errors earlier, but keeping the last received bytes in tact.

It doesn't compile for me (state' and `Ok data have different types).

@talex5 (Contributor, Author) commented Feb 10, 2023

I wonder how this close_tls system can work even for TLS 1.2. If you want to shut down a TLS connection but continue using the raw socket afterwards, then you'll still need to wait until you've received the client's confirmation of the shutdown. But doing a Tls read after sending the close notify just fails immediately with Eof.

@hannesm (Member) commented Feb 14, 2023

@talex5

I don't want to do concurrent writes at all. The problematic case is:

So you do a read and a write at the same time on the same flow. Would you mind pointing me to an application that does this?

Usually, a "read" in TLS doesn't invoke a "write" -- apart from the renegotiation / rekeying which may occur. Of course this can be tracked in the state as well, and then a "read" would never do a "write" (but only signal the state that further writes could use an updated crypto context). But, as said above, I'd be keen to see the concrete application / protocol that behaves as you described.

I wonder how this close_tls system can work even for TLS 1.2. If you want to shut down a TLS connection but continue using the raw socket afterwards, then you'll still need to wait until you've received the client's confirmation of the shutdown.

Indeed, to support such a use-case we would need to revise the API. I would wait to think about it until we find a protocol / application that requires such behaviour (re-using the TCP socket once the TLS session is finished).

@talex5 (Contributor, Author) commented Feb 14, 2023

So you do a read and write at the same time on the same flow. Would you mind to point me to an application thereof?

Most users of TLS will do this, I would think. For example, an HTTP client requesting multiple resources will want to write the requests while reading the responses. If it waits for each response before sending the next request it will be very slow, and if it waits until it has sent all the requests before starting to read then it's likely to deadlock (as the server may decide to stop reading more requests until the client has read some responses).

Other examples: capnp-rpc allows either party to send at any time; a chat application will typically allow users to send messages while also waiting for incoming ones; an HTTP connection using push notifications allows both parties to send at any time.

Usually, a "read" in TLS doesn't invoke a "write" -- apart from the renegotiation / rekeying which may occur.

OK, so that explains why the fuzz tests didn't find it then. close_notify isn't tested because we know we can't currently use a flow after sending close_notify, and we don't test renegotiation / rekeying yet.

@hannesm (Member) commented Feb 15, 2023

Most users of TLS will do this, I would think

On the other hand, in the eight years OCaml-TLS has been in use, there hasn't been such an issue with either Lwt or Async.

HTTP client requesting multiple resources

Interesting, I've not observed such code in the wild. But maybe that is a thing.

To me it sounds like the internal engine will need to be restructured then:

  • a Tls.Engine.state is only ever exposed once a handshake has been completed (as it is already done, no partial Tls state)
  • any read operation may update the state and return application data
  • any write operation may update the state

With this, a read never involves a write. If a new cryptographic context is established, it is preserved in the state, and the next write may use that information to send a KeyUpdate/ChangeCipherSpec message to the other peer and use the new keys. The invariant to preserve is that no read leads to a write (this is not the case at the moment), and no write leads to a read (this is already in place).
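
Roughly, the engine interface could end up with this kind of shape (a hypothetical sketch to illustrate the invariant, not the current Tls.Engine API):

    module type RESTRUCTURED_ENGINE = sig
      type state

      (* Handling incoming records only updates the state and may yield
         application data; it never produces bytes that must be sent
         immediately. *)
      val handle_read : state -> Cstruct.t -> (state * Cstruct.t option, exn) result

      (* Sending application data may also emit any pending KeyUpdate /
         ChangeCipherSpec records accumulated in the state. *)
      val send_data : state -> Cstruct.t -> (state * Cstruct.t, exn) result
    end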

Does that sound sensible?

@talex5 (Contributor, Author) commented Feb 17, 2023

On the other hand, in the eight years OCaml-TLS has been in use, there hasn't been such an issue with either Lwt or Async.

Yes, it's very unlikely this race would be a problem normally, since it only happens when rekeying (which I doubt many people do) or after closing.

HTTP client requesting multiple resources

Interesting, I've not observed such code in the wild. But maybe that is a thing.

Here's httpaf, for example: https://github.com/inhabitedtype/httpaf/blob/9cefa96f39712d8c8a4957c98970c2ac4a814ae2/lwt-unix/httpaf_lwt_unix.ml#L279-L283

It runs separate read and write threads, so it will read the next request while writing out a previous response.

To me it sounds like the internal engine will need to be restructured then:

I think there are two things that would be useful:

  1. We need a mutex or queue for write operations, so that an earlier write requested by the engine always finishes before the next one starts. This doesn't require a change to the engine.
  2. The engine should either return data from a read or request a write, but not both at the same time. This is so that the library doesn't need to report both success (of the read) and failure (of the write) together, which currently complicates error handling. Instead, the engine could just return the data it has and delay processing the next bit (until the client's next read).

We need to be able to do writes from read because (as I understand it) a rekey request needs a write, and the application is never required to do a write if it doesn't want to.
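
For (1), a minimal sketch of the queue variant, with a single writer fiber that owns the socket (illustrative names, not current tls-eio code):

    (* Hypothetical: every encoded TLS record, whether produced by a read or
       by an application write, is pushed onto one queue and written by a
       single fiber, so records can never interleave on the socket. *)
    let start_writer ~sw socket =
      let queue = Eio.Stream.create 16 in
      Eio.Fiber.fork ~sw (fun () ->
        let rec loop () =
          let record = Eio.Stream.take queue in
          Eio.Flow.copy_string record socket;
          loop ()
        in
        loop ());
      (* The returned function enqueues a record for sending. *)
      fun record -> Eio.Stream.add queue record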

@hannesm (Member) commented Mar 21, 2024

From what I can see:

The engine should either return data from a read or request a write, but not both at the same time.

This is not in scope at the moment. I worked on shutdown semantics in #488 and figured out that while a single TLS record won't lead to both a response (to the other peer) and data for the application, handling multiple records at a time makes this possible. This only happens during a handshake (or renegotiation) and is a relic of TLS < 1.3.

engine could just return the data it has and delay processing the next bit

You can model the delay in tls-eio if desired, to arrive at the semantics you think are necessary.
