Upgrade Hadoop to 3.4.1 #65

281 changes: 216 additions & 65 deletions pom.xml

Large diffs are not rendered by default.

90 changes: 90 additions & 0 deletions scripts/Linux-aarch64/Dockerfile
@@ -0,0 +1,90 @@
# Start from the specified base image
FROM imjalpreet/centos7-oj8:latest-arm
# MAINTAINER is deprecated; LABEL is the supported way to record this metadata
LABEL maintainer="Presto community <https://prestodb.io/community.html>"

WORKDIR /opt

# Install required tools
RUN yum clean all && \
yum makecache fast && \
yum install -y \
gcc gcc-c++ make automake autoconf libtool \
pkgconfig \
openssl-devel \
snappy snappy-devel \
java-1.8.0-openjdk-devel \
zlib zlib-devel \
git \
wget curl && \
yum clean all && \
rm -rf /var/cache/yum

# Set Java environment variables
ENV JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
ENV PATH=$JAVA_HOME/bin:$PATH

# Set OpenSSL root
ENV OPENSSL_ROOT=/usr/lib64
ENV OPENSSL_ROOT_DIR=$OPENSSL_ROOT
ENV OPENSSL_INCLUDE_DIR=$OPENSSL_ROOT/include
ENV OPENSSL_LIBRARIES=$OPENSSL_ROOT/lib
ENV PKG_CONFIG_PATH=$OPENSSL_ROOT/lib/pkgconfig
ENV LDFLAGS="-L$OPENSSL_ROOT/lib"
ENV CPPFLAGS="-I$OPENSSL_ROOT/include"
ENV CMAKE_PREFIX_PATH=$OPENSSL_ROOT

# Set Zlib environment
ENV ZLIB_HOME=/usr
ENV CMAKE_PREFIX_PATH=$ZLIB_HOME:$CMAKE_PREFIX_PATH
ENV C_INCLUDE_PATH=$ZLIB_HOME/include:$C_INCLUDE_PATH
ENV LIBRARY_PATH=$ZLIB_HOME/lib:$LIBRARY_PATH
# Prepend Zlib flags to the existing LDFLAGS and CPPFLAGS
ENV LDFLAGS="-L$ZLIB_HOME/lib $LDFLAGS"
ENV CPPFLAGS="-I$ZLIB_HOME/include $CPPFLAGS"

# Install Protobuf v3.21.12
# Ensure autogen.sh is executable
RUN curl -L https://github.com/protocolbuffers/protobuf/archive/refs/tags/v3.21.12.tar.gz -o protobuf-3.21.12.tar.gz && \
tar -zxvf protobuf-3.21.12.tar.gz && \
cd protobuf-3.21.12 && \
chmod +x autogen.sh && \
./autogen.sh && \
./configure --prefix=/usr/local && \
make -j$(nproc) && \
make install && \
ldconfig && \
cd .. && rm -rf protobuf-3.21.12 protobuf-3.21.12.tar.gz

# Install CMake 3.22.3 from source
RUN cd /tmp && \
wget https://github.com/Kitware/CMake/releases/download/v3.22.3/cmake-3.22.3.tar.gz && \
tar -zxvf cmake-3.22.3.tar.gz && \
cd cmake-3.22.3 && \
./bootstrap && \
make -j$(nproc) && \
make install && \
cd / && rm -rf /tmp/cmake-3.22.3*

# Install Apache Maven 3.9.6
ENV MAVEN_VERSION=3.9.6
ENV MAVEN_HOME=/opt/apache-maven-${MAVEN_VERSION}
ENV PATH=$MAVEN_HOME/bin:$PATH

RUN curl -fSL https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz -o /tmp/apache-maven.tar.gz && \
tar -xzf /tmp/apache-maven.tar.gz -C /opt && \
rm /tmp/apache-maven.tar.gz

# Clone Hadoop and check out the 3.4.1 branch
RUN git clone https://github.com/apache/hadoop.git /opt/hadoop && \
cd /opt/hadoop && \
git checkout branch-3.4.1

WORKDIR /opt/hadoop

# Build Hadoop Common with native libs
RUN mvn clean package -pl hadoop-common-project/hadoop-common -am \
-Pdist,native \
-DskipTests \
-Dtar \
-Drequire.snappy \
-Dmaven.javadoc.skip=true
90 changes: 90 additions & 0 deletions scripts/Linux-amd64/Dockerfile
@@ -0,0 +1,90 @@
# Start from the specified base image
FROM imjalpreet/centos7-oj8:latest
# MAINTAINER is deprecated; LABEL is the supported way to record this metadata
LABEL maintainer="Presto community <https://prestodb.io/community.html>"

WORKDIR /opt

# Install required tools
RUN yum clean all && \
yum makecache fast && \
yum install -y \
gcc gcc-c++ make automake autoconf libtool \
pkgconfig \
openssl-devel \
snappy snappy-devel \
java-1.8.0-openjdk-devel \
zlib zlib-devel \
git \
wget curl && \
yum clean all && \
rm -rf /var/cache/yum

# Set Java environment variables
ENV JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
ENV PATH=$JAVA_HOME/bin:$PATH

# Set OpenSSL root
ENV OPENSSL_ROOT=/usr/lib64
ENV OPENSSL_ROOT_DIR=$OPENSSL_ROOT
ENV OPENSSL_INCLUDE_DIR=$OPENSSL_ROOT/include
ENV OPENSSL_LIBRARIES=$OPENSSL_ROOT/lib
ENV PKG_CONFIG_PATH=$OPENSSL_ROOT/lib/pkgconfig
ENV LDFLAGS="-L$OPENSSL_ROOT/lib"
ENV CPPFLAGS="-I$OPENSSL_ROOT/include"
ENV CMAKE_PREFIX_PATH=$OPENSSL_ROOT

# Set Zlib environment
ENV ZLIB_HOME=/usr
ENV CMAKE_PREFIX_PATH=$ZLIB_HOME:$CMAKE_PREFIX_PATH
ENV C_INCLUDE_PATH=$ZLIB_HOME/include:$C_INCLUDE_PATH
ENV LIBRARY_PATH=$ZLIB_HOME/lib:$LIBRARY_PATH
# Prepend Zlib flags to the existing LDFLAGS and CPPFLAGS
ENV LDFLAGS="-L$ZLIB_HOME/lib $LDFLAGS"
ENV CPPFLAGS="-I$ZLIB_HOME/include $CPPFLAGS"

# Install Protobuf v3.21.12
# Ensure autogen.sh is executable
RUN curl -L https://github.com/protocolbuffers/protobuf/archive/refs/tags/v3.21.12.tar.gz -o protobuf-3.21.12.tar.gz && \
tar -zxvf protobuf-3.21.12.tar.gz && \
cd protobuf-3.21.12 && \
chmod +x autogen.sh && \
./autogen.sh && \
./configure --prefix=/usr/local && \
make -j$(nproc) && \
make install && \
ldconfig && \
cd .. && rm -rf protobuf-3.21.12 protobuf-3.21.12.tar.gz

# Install CMake 3.22.3 from source
RUN cd /tmp && \
wget https://github.com/Kitware/CMake/releases/download/v3.22.3/cmake-3.22.3.tar.gz && \
tar -zxvf cmake-3.22.3.tar.gz && \
cd cmake-3.22.3 && \
./bootstrap && \
make -j$(nproc) && \
make install && \
cd / && rm -rf /tmp/cmake-3.22.3*

# Install Apache Maven 3.9.6
ENV MAVEN_VERSION=3.9.6
ENV MAVEN_HOME=/opt/apache-maven-${MAVEN_VERSION}
ENV PATH=$MAVEN_HOME/bin:$PATH

RUN curl -fSL https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz -o /tmp/apache-maven.tar.gz && \
tar -xzf /tmp/apache-maven.tar.gz -C /opt && \
rm /tmp/apache-maven.tar.gz

# Clone Hadoop and check out the 3.4.1 branch
RUN git clone https://github.com/apache/hadoop.git /opt/hadoop && \
cd /opt/hadoop && \
git checkout branch-3.4.1

WORKDIR /opt/hadoop

# Build Hadoop Common with native libs
RUN mvn clean package -pl hadoop-common-project/hadoop-common -am \
-Pdist,native \
-DskipTests \
-Dtar \
-Drequire.snappy \
-Dmaven.javadoc.skip=true
44 changes: 44 additions & 0 deletions scripts/README.md
@@ -0,0 +1,44 @@
# Hadoop Native Library Generation (Linux)

This guide explains how to generate Hadoop native libraries for Linux using a Dockerfile.

## Prerequisites

- Docker installed on your machine
- A Dockerfile present in your working directory (one per architecture is provided under scripts/Linux-amd64 and scripts/Linux-aarch64)

## Steps to Build and Extract Hadoop Native Libraries

### 1. Build Docker Image

Run the following command in the same directory as your Dockerfile:

docker build -t hadoop-native .

This creates a Docker image named `hadoop-native`.
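
Since there are separate Dockerfiles for amd64 and aarch64, the build directory can be chosen from the host architecture. The following is a minimal sketch, not part of the original instructions: the function name `dockerfile_dir_for_arch` is hypothetical, and it assumes the command is run from the repository root.

```shell
# Hypothetical helper: map a machine architecture (as printed by `uname -m`)
# to the directory that holds the matching Dockerfile.
dockerfile_dir_for_arch() {
  case "$1" in
    x86_64|amd64)  echo "scripts/Linux-amd64" ;;
    aarch64|arm64) echo "scripts/Linux-aarch64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Usage, from the repository root:
#   dir=$(dockerfile_dir_for_arch "$(uname -m)") && docker build -t hadoop-native "$dir"
```

Passing the directory as the build context means Docker picks up the Dockerfile inside it, so no `cd` is needed.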

### 2. Run Docker Container

Start a container from the image and open an interactive shell:

docker run -it --name hadoop-native-check hadoop-native bash

This launches a container named hadoop-native-check and gives you access to its shell.

### 3. Locate the Generated Library

Once inside the container, the generated Hadoop native library can be found at:

/opt/hadoop/hadoop-common-project/hadoop-common/target/native/target/usr/local/lib/

The file name is `libhadoop.so`.
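
Before copying the file out, it can be worth confirming that the build actually produced a shared object at that path. The sketch below checks only for the ELF magic bytes at the start of the file; the helper name `is_elf_file` is hypothetical, and the check says nothing about whether the library loads correctly in Hadoop.

```shell
# Hypothetical helper: succeed only if the given file starts with the
# four ELF magic bytes (0x7f 'E' 'L' 'F').
is_elf_file() {
  [ -f "$1" ] || return 1
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "7f454c46" ]
}

# Usage, inside the container:
#   is_elf_file /opt/hadoop/hadoop-common-project/hadoop-common/target/native/target/usr/local/lib/libhadoop.so \
#     && echo "libhadoop.so looks like an ELF shared object"
```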

### 4. Copy the Library to Local System

Use the following command to copy the libhadoop.so file from the container to your local machine:

docker cp <container_id>:/opt/hadoop/hadoop-common-project/hadoop-common/target/native/target/usr/local/lib/libhadoop.so <local_path>

Replace:
- `<container_id>` with your container's ID or name (e.g., hadoop-native-check)
- `<local_path>` with your desired local directory
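
Steps 2 to 4 can also be done without an interactive shell, because `docker cp` works on a container that was created but never started. The following is a sketch under stated assumptions rather than the documented procedure: `extract_libhadoop` is a hypothetical helper, and the `-L` flag is included on the assumption that `libhadoop.so` may be a symbolic link to a versioned file.

```shell
# Library path taken from step 3 of this guide.
LIB_PATH=/opt/hadoop/hadoop-common-project/hadoop-common/target/native/target/usr/local/lib/libhadoop.so

# Hypothetical helper: copy libhadoop.so out of an image.
#   $1 = image name (for example hadoop-native)
#   $2 = local destination directory
extract_libhadoop() {
  image=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # `docker create` prepares a container without starting it and prints its ID.
  cid=$(docker create "$image") || return 1
  # -L follows symbolic links in the source path (assumption: the .so may be a link).
  docker cp -L "$cid:$LIB_PATH" "$dest/"
  status=$?
  # Remove the temporary container whether or not the copy succeeded.
  docker rm "$cid" >/dev/null
  return "$status"
}

# Usage:
#   extract_libhadoop hadoop-native ./native-libs
```

Creating a throwaway container and removing it afterwards avoids leaving a named container such as hadoop-native-check behind between builds.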