-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proxy functionality broken in Aleph 0.7.0 and later #731
Comments
Hmmm. Do you see an IllegalArgumentException anywhere in the logs that says "Proxying HTTP/2 messages not supported yet"? I wonder if it's negotiating HTTP/2 and burying the exception somehow. I didn't think I changed anything with the HTTP/1 proxying; if so, that's an error. But I never used proxying, and never did much with it. It hasn't been implemented for HTTP/2 yet. |
No, I don't see such a message. I also don't think it's using HTTP/2, but to be sure: is there a pool config to force HTTP/1.1? Then I can retry with that.
Still, you can try to reproduce the errors with a local proxy, like I showed above. |
HTTP2 is explicitly opt-in, so unless you're setting it as a desired http version, it's http1 by default.
Oh, that sentence is more a note for Moritz, who's taken over as maintainer, that I changed virtually nothing with the proxy code itself, so the error probably lies elsewhere, like in the refactoring to support HTTP2, or the code path into the proxy code. |
Will these problems be addressed in the near future? We still depend on 0.6.4 in at least one service because of this issue. If not, it would be good if some dependency updates could be backported to the 0.6.x line. |
Ah sorry, I completely missed this issue for some reason 🙏 I'll have a look! |
OK so I am able to reproduce this (thanks for the excellent reproducer 🙏) and it looks indeed like an exception gets thrown behind the scenes which (AFAIUI) ends up removing the proxy handler from the pipeline:
I also tried with 0.6.4 but the same Netty version as 0.8.2 to (mostly) rule out that it is a bug in Netty itself. And indeed, it works fine with 0.6.4 even with Netty 4.1.115.Final. Digging in a bit deeper now! |
OK I tried a
I was confused for a moment because the proxy should forward the TLS traffic as-is but of course, the underlying TCP connection actually is with the proxy, so it made sense. However, in the new (failing) version, we get:
So apparently, the proxy is being bypassed for the TLS handshake! And it all makes sense now: With HTTP/2 support, we also had to introduce ALPN which is a TLS extension. So for ALPN to do its thing, we first need to do the TLS handshake. However, the proxy handler is then only added to the pipeline in I see two possible courses of action for addressing this now:
Any other ideas? I'll give 2. a shot now (with a dummy |
Ha, looks like it indeed Just Works ™️ with HTTP/2, as well! Now I'll only have to figure out how to deal with the error propagation there, as well... |
Since the introduction of HTTP/2 support in the client, proxy handlers (requested via proxy-options) would be installed only after the HTTP version was negotiated. In case of a HTTPS connection, this would be attempted via ALPN which means the TLS handshake would have already taken place with the actual destination host, bypassing the configured proxy. This resulted in an error (see #731). With this patch, the proxy handlers are installed earlier again so that a configured proxy is also used during the TLS handshake phase.
Indeed, that should explain why I got a different error in prod: in my local tests I can access the "proxied" https endpoint directly just fine, but of course not in prod (that's why we need a proxy in the first place). So in prod it probably fails at an earlier code path. |
@DerGuteMoritz Good detective work! |
Since updating from 0.6.4 to 0.8.0 I see strange exceptions for proxied calls.
Decoding the hex string yields:
So it seems it tries to interpret an http response as https. (I try to reach a https endpoint via an http proxy, but maybe it's making an unproxied http call instead while still expecting https.)
It's difficult for me, though, to reproduce the proxy settings we use in prod locally. Therefore, I will use a local proxy for this bug report here:
After this, we can confirm that Aleph 0.6.4 works as expected
Trying the same in 0.7.0 or higher yields
Or alternatively, after a timeout of a minute or so:
Trying http yields a strange 503:
If I try https afterwards it seems it still tries to intepret the result as http:
although, I sometimes can get this error right away, without trying http first, so it doesn't seem to be a matter of misusing a pool for different protocols (which we don't do in prod anyway).
So even though I can not reproduce the exact error I see in prod locally, I think we can conclude that there is a regression in 0.7.0 and later compared to 0.6.4.
The text was updated successfully, but these errors were encountered: