Upgrade to Hadoop 3.4.1 and JDK 17 #68

Open
imjalpreet opened this issue May 19, 2025 · 0 comments · May be fixed by #65 or #67
Deprecation of Hadoop Config dfs.client.use.legacy.blockreader in Hadoop 3.x

Relevant Jira: HDFS-10548

Although the legacy block reader (BlockReaderRemote) is deprecated, we still needed it for SOCKS proxy support, which our product tests use, because SOCKS sockets have no associated channel. To move to BlockReaderRemote2 without losing that support, we introduced a custom adaptor that creates a channel-backed SOCKS socket.

Changes introduced due to this

  • Added ForwardingSocket.java and SocksSocketFactory.java
  • Removed legacy config dfs.client.use.legacy.blockreader from core-site.xml
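
To illustrate the underlying problem, here is a minimal, standard-library-only sketch. It is not the code in ForwardingSocket.java or SocksSocketFactory.java; the class and method names (SocksChannelSketch and its helpers) and the proxy address localhost:1080 are hypothetical. It only shows that a socket created through a SOCKS Proxy reports no channel, that a socket obtained from SocketChannel.open() does, and that a plain stream can be given a channel view with Channels.newChannel.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SocketChannel;

// Hypothetical illustration class; not part of the actual change.
final class SocksChannelSketch {
    private SocksChannelSketch() {}

    // A socket created through a Proxy is stream-only: getChannel() returns null.
    // The socket is never connected, so no proxy has to be running.
    static boolean proxySocketHasChannel() {
        Proxy proxy = new Proxy(
                Proxy.Type.SOCKS,
                InetSocketAddress.createUnresolved("localhost", 1080)); // assumed address
        try (Socket socket = new Socket(proxy)) {
            return socket.getChannel() != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A socket obtained from a SocketChannel is channel-backed.
    static boolean nioSocketHasChannel() {
        try (SocketChannel channel = SocketChannel.open()) {
            return channel.socket().getChannel() == channel;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Give a plain stream (such as a SOCKS socket's input stream) a channel view.
    static ReadableByteChannel asChannel(InputStream in) {
        return Channels.newChannel(in);
    }

    // Read once from the channel view and return how many bytes arrived.
    static int readOnce(byte[] payload, int capacity) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        try (ReadableByteChannel channel = asChannel(new ByteArrayInputStream(payload))) {
            return channel.read(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real adaptor has to go further than this, since code that calls Socket.getChannel() expects a SocketChannel rather than a bare ReadableByteChannel.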

Removed Direct Dependency on Native libsnappy Library

Relevant Jira: HADOOP-17125

The SnappyCodec now leverages the snappy-java library, eliminating reliance on native binaries.

Changes introduced due to this

  • Updated HadoopNative.java and TestHadoopNative.java
  • Removed native libsnappy binaries from shaded JARs for all architectures

Inlined KMSClientProvider from Hadoop 3.4.1

Since HADOOP-13988, KMSClientProvider uses UserGroupInformation.getLoginUser(), which is incompatible with Presto’s execution model. We’ve inlined and adapted the class to avoid this issue.

Inlined LineReader.java from Hadoop 3.4.1

Inlined FileSystem.java from Hadoop 3.4.1

In PR #2396, FileSystem.Cache was made final, breaking PrestoFileSystemCache which extends it. To maintain compatibility, we forked the class and safely removed the final keyword.

JDK 17+ Compatibility Adjustments

To ensure compatibility with JDK 17+ and avoid reflective hacks:

  • Exposed setCache() in the forked FileSystem class
  • Removed the final modifier from the static CACHE field

With these changes, Presto can install its own cache in the forked FileSystem on JDK 17+ without resorting to reflection.
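
The fork can be pictured with the following self-contained sketch. It is a simplified stand-in, not the forked Hadoop source: only the names Cache, CACHE, and setCache() come from the description above, while ForkedFileSystemSketch, PrestoLikeCache, the String-based entries, and the create() hook are hypothetical. It shows a non-final nested Cache being subclassed and a non-final static CACHE field being swapped through a plain setter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the forked FileSystem class.
class ForkedFileSystemSketch {
    // Not final, so a subclass such as PrestoLikeCache can extend it.
    static class Cache {
        private final Map<String, String> entries = new HashMap<>();

        String get(String uri) {
            // Create the entry on first access, then reuse it.
            return entries.computeIfAbsent(uri, this::create);
        }

        // Hypothetical hook that subclasses override.
        String create(String uri) {
            return "default:" + uri;
        }
    }

    // Not final, so it can be reassigned without reflection.
    static Cache CACHE = new Cache();

    // Plain setter that replaces the process-wide cache.
    static void setCache(Cache cache) {
        CACHE = Objects.requireNonNull(cache, "cache is null");
    }

    static String get(String uri) {
        return CACHE.get(uri);
    }
}

// Hypothetical analogue of PrestoFileSystemCache.
class PrestoLikeCache extends ForkedFileSystemSketch.Cache {
    @Override
    String create(String uri) {
        return "presto:" + uri;
    }
}
```

Calling ForkedFileSystemSketch.setCache(new PrestoLikeCache()) once at startup would route later lookups through the subclass. If CACHE were still static final, the swap would need reflection on a final field, which is the kind of hack these adjustments are meant to avoid.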
