Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No ssl-context via SslContext in Aleph 0.7.0 and later #727

Closed
David-Ongaro opened this issue Jun 10, 2024 · 11 comments · Fixed by #730
Closed

No ssl-context via SslContext in Aleph 0.7.0 and later #727

David-Ongaro opened this issue Jun 10, 2024 · 11 comments · Fixed by #730

Comments

@David-Ongaro
Copy link
Contributor

David-Ongaro commented Jun 10, 2024

When trying to upgrade one of our services from Aleph 0.6.4 to 0.8.0 I saw this strange exception in the logs:

[...]
Caused by: clojure.lang.ExceptionInfo: Some desired HTTP versions are not part of ALPN config.
{:desired-http-versions [:http1], :alpn-protocols []}
 at aleph.http.common$assert_consistent_alpn_config_BANG_.invokeStatic (common.clj:139)
    aleph.http.common$assert_consistent_alpn_config_BANG_.invoke (common.clj:130)
    aleph.http.common$ensure_consistent_alpn_config.invokeStatic (common.clj:182)
    aleph.http.common$ensure_consistent_alpn_config.invoke (common.clj:143)
    aleph.http.client$client_ssl_context.invokeStatic (client.clj:698)
    aleph.http.client$client_ssl_context.invoke (client.clj:690)
    aleph.http.client$http_connection.invokeStatic (client.clj:799)
    aleph.http.client$http_connection.invoke (client.clj:752)
    aleph.http$create_connection.invokeStatic (http.clj:104)
    aleph.http$create_connection.invoke (http.clj:97)
    aleph.http$connection_pool$fn__21882.invoke (http.clj:239)
    aleph.flow$instrumented_pool$reify__12626.generate (flow.clj:47)
    io.aleph.dirigiste.Pool.addObject (Pool.java:273)
    io.aleph.dirigiste.Pool.acquire (Pool.java:466)
    aleph.flow$acquire$fn__12631.invoke (flow.clj:74)
    aleph.flow$acquire.invokeStatic (flow.clj:73)
    aleph.flow$acquire.invoke (flow.clj:68)
    aleph.http$eval21907$request__21911$fn__21915$fn__21916.invoke (http.clj:377)
    aleph.http$eval21907$request__21911$fn__21915.invoke (http.clj:371)
    aleph.http$eval21907$request__21911.invoke (http.clj:370)
[...]

When trying to pin this down, I could confirm that this already happens in Aleph 0.7.0 (even for the alpha versions) and it only happens when the pool is configured with an io.netty.handler.ssl.SslContext instance. I.e.

(def ssl-context (.build (io.netty.handler.ssl.SslContextBuilder/forClient)))

(def pool (http/connection-pool {:connection-options {:ssl-context ssl-context}}))

(http/get "https://example.com" {:pool pool}) =>
#<Deferred@346c5293: Error printing return value (ExceptionInfo) at aleph.http.common/assert-consistent-alpn-config! (common.clj:139).
Some desired HTTP versions are not part of ALPN config.

Note that the exception happens even before the deferred is dereferenced. Also note that I just use a plain io.netty.handler.ssl.JdkSslClientContext instance here without any special SSL config in this example, since the specific config doesn't seem to matter here.

If I use a plain map to initialize the ssl-context, e.g. say

(def ssl-context {:trust-store (io/as-file client-ca)})

(whereas io/as-file is clojure.java.io/as-file here, and client-ca a file path string), this doesn't happen. So that's a viable workaround. Still, I couldn't find any update in the documentation that says SslClientContext instances are not supported anymore, so this is at least a documentation "bug".


Addendum

As per the docs, instead of java.io.File instances, java.io.InputStream instances are also supported as keys for the :ssl-context map. But I can't figure out how this is supposed to be used, as I regularly get java.lang.IllegalArgumentException: Input stream does not contain valid certificates. exceptions. I.e., preparing a single request just works fine, but preparing them in quick succession may fail:

(def ssl-context {:trust-store (io/input-stream client-ca)})

(def pool (http/connection-pool {:connection-options {:ssl-context ssl-context}}))

(http/get "https://example.com" {:pool pool}) => #<Deferred@6fb6283a: :not-delivered>

[(http/get "https://example.com" {:pool pool}) (http/get "https://example.com" {:pool pool})] =>
[#<Deferred@58ac8429: Error printing return value (CertificateException) at io.netty.handler.ssl.PemReader/readCertificates (PemReader.java:114).
Error printing return value (CertificateException) at io.netty.handler.ssl.PemReader/readCertificates (PemReader.java:114).
found no certificates in input stream
            PemReader.java:  114  io.netty.handler.ssl.PemReader/readCertificates
           SslContext.java: 1263  io.netty.handler.ssl.SslContext/toX509Certificates
    SslContextBuilder.java:  276  io.netty.handler.ssl.SslContextBuilder/trustManager
                 netty.clj:  917  aleph.netty/eval23500/add-ssl-trust-manager!
                 netty.clj: 1026  aleph.netty/eval23500/ssl-client-context
                 netty.clj: 1193  aleph.netty/coerce-ssl-context
                 netty.clj: 1179  aleph.netty/coerce-ssl-context
                  core.clj: 2641  clojure.core/partial/fn
                client.clj:  699  aleph.http.client/client-ssl-context
                client.clj:  690  aleph.http.client/client-ssl-context
                client.clj:  799  aleph.http.client/http-connection
                client.clj:  752  aleph.http.client/http-connection
                  http.clj:  104  aleph.http/create-connection
                  http.clj:   97  aleph.http/create-connection
                  http.clj:  239  aleph.http/connection-pool/fn
                  flow.clj:   47  aleph.flow/instrumented-pool/reify
                 Pool.java:  273  io.aleph.dirigiste.Pool/addObject
                 Pool.java:  466  io.aleph.dirigiste.Pool/acquire
                  flow.clj:   74  aleph.flow/acquire/fn
                  flow.clj:   73  aleph.flow/acquire
                  flow.clj:   68  aleph.flow/acquire
                  http.clj:  377  aleph.http/eval30155/request/fn/fn
                  http.clj:  371  aleph.http/eval30155/request/fn
                  http.clj:  370  aleph.http/eval30155/request
                  http.clj:  481  aleph.http/req
                  http.clj:  477  aleph.http/req
                  core.clj: 2642  clojure.core/partial/fn

I didn't look into the implementation, but I suspect what's happening here is that when the first thread of the thread pool is initialized, it's exhausting the input-stream instance and this instance is reused during the initialization of a second thread. (At least that's what I hope is happening, since the alternative would be that each thread tries to reread the certificate on each request.)

So the question is, if this doesn't work, why is it even supported? But if this is indeed an issue, it probably should be handled in a separate ticket, since this behavior already applies to Aleph 0.6.4 and therefore can't be considered a regression.

@DerGuteMoritz
Copy link
Collaborator

Thanks for the detailed report! I'm investigating it closer now and will get back to you when I've fully figured out what's going on!

@DerGuteMoritz
Copy link
Collaborator

So regarding the main issue: The problem is that along with the introduction of HTTP/2 support in 0.7.0, we (accidentally?) made ALPN config for a custom SslContext mandatory, which obviously is a breaking change.

We could probably allow custom contexts without ALPN config again when only HTTP/1 is desired anyway (WDYT @KingMob?). Otherwise, this should to be documented, indeed.

@David-Ongaro In the meantime, here's how you can configure your custom context to make it work again:

(def ssl-context (.. (io.netty.handler.ssl.SslContextBuilder/forClient)
                     (applicationProtocolConfig (aleph.netty/application-protocol-config [:http1]))
                     (build)))

The reason why it "works" (modulo the issue you mention in your addendum, which I'll break out into a separate ticket) when using the options map instead of a custom SslContext is that in this case, we supply a default ALPN config based on the desired HTTP versions (which defaults to [:http1]). We can't do this in the custom SslContext case since it can't be modified anymore. Researching this made me realize that this behavior is only documented for the HTTP server but not for the client. This will also have to be fixed.

@arnaudgeiser
Copy link
Collaborator

We could probably allow custom contexts without ALPN config again when only HTTP/1 is desired anyway (WDYT @KingMob?). Otherwise, this should to be documented, indeed.

I'm missing a bit of context here, but I'm wondering if this, while a breaking change, has not been made on purpose : #687 (review)

@DerGuteMoritz
Copy link
Collaborator

I'm missing a bit of context here, but I'm wondering if this, while a breaking change, has not been made on purpose : #687 (review)

I suppose you're referring to this bit specifically:

However, once you pass in an SslContext, the code won't handle ALPN for you, but it will check that the ALPN config and http-versions match, so you then have to add an ALPN config.

That confirms that the consistency check for the ALPN config is intentional in this case but I don't think that breaking the old behavior (i.e. allow HTTP/1.1-only without ALPN config) was necessarily intended.

@KingMob
Copy link
Collaborator

KingMob commented Jun 13, 2024

Yeah, I don't think my old comment meant this was a deliberate change, but it's been a while.

I agree that ALPN should be optional for HTTP1-only contexts, and if not, that's a bug. But it should still be an error if they request HTTP2/3 and fail to supply an ALPN, since the ALPN fallback for HTTP when one peer doesn't have ALPN is always 1.1.

This:

    :else
    (do
      (assert-consistent-alpn-config!
       (some-> ^SslContext ssl-context
               (.applicationProtocolNegotiator)
               (.protocols))
       desired-http-versions)
      ssl-context)

should be something like:

    (some? (.applicationProtocolNegotiator ^SslContext ssl-context))
    (do
      (assert-consistent-alpn-config!
        (some-> ^SslContext ssl-context
                (.applicationProtocolNegotiator)
                (.protocols))
        desired-http-versions)
      ssl-context)

    :else
    (if (= #{:http1} desired-http-versions)
      ssl-context
      (throw (ex-info "No ALPN supplied, but requested non-HTTP/1 versions that require ALPN."
                      {:desired-http-versions desired-http-versions})))

Totally untested, but that's the idea.

@DerGuteMoritz
Copy link
Collaborator

Great, thanks for the quick feedback! I'll whip up a patch with a corresponding test case.

DerGuteMoritz added a commit that referenced this issue Jun 13, 2024
With the introduction of HTTP/2 support, Aleph started to require a matching ALPN config to be
present in custom `SslContext` objects. This needlessly broke existing HTTP/1.1-only uses. With this
change, we now allow custom `SslContext` objects without ALPN config again if only HTTP/1.1 is
desired (via the `http-versions` option). Since this still happens to be the default, existing uses
should just work again.

Fixes #727
@DerGuteMoritz
Copy link
Collaborator

@David-Ongaro Patch is ready at #730 - would you be able to give it a try to see whether it makes your original code work again? 🙏

@DerGuteMoritz
Copy link
Collaborator

OK went ahead and merged this already. Will cut a new release soon!

@arnaudgeiser
Copy link
Collaborator

Thank you @DerGuteMoritz !

@David-Ongaro
Copy link
Contributor Author

OK went ahead and merged this already. Will cut a new release soon!

I'm sorry I didn't have the time last week, but I'll test this as soon as the new release is out.

@David-Ongaro
Copy link
Contributor Author

David-Ongaro commented Jun 21, 2024

@David-Ongaro Patch is ready at #730 - would you be able to give it a try to see whether it makes your original code work again? 🙏

Ok, I could now try out Aleph 0.8.1 and can confirm that the issue is not occurring anymore! Thanks for fixing this!

Though, as long as the proxy config is broken (as mentioned in #731) at least one of our services is blocked from updating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants