Merge pull request #3217 from metabrainz/source-profile
1. Upgrade the Spark cluster to 3.5.5, Hadoop to 3.4.1, Python to 3.13, and other Python dependencies.
2. Add a cleanup script to remove old Spark applications from workers.
3. Remove the use of deprecated SQLContext.
4. Use read_files_from_HDFS where possible.
5. Fix artist map stats broken due to null country values.
6. Add a try/except around each request consumer job to avoid kombu client crashes.
7. Disable readSideCharPadding to avoid OOMs during artist country data import.
Some of the changes for 1 and 2 are implemented in the recent commits of https://github.com/metabrainz/ansible-role-spark.
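As an illustration of item 6, here is a minimal sketch of guarding each job with a try/except so that one failing job does not take down the whole consumer. It is not the code from this PR: `run_job_safely`, `echo_job`, and `failing_job` are hypothetical names, and the kombu wiring (consumer callback, message acknowledgement) is deliberately left out so the example stays self-contained.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def run_job_safely(
    handler: Callable[[dict[str, Any]], Any],
    request: dict[str, Any],
) -> tuple[bool, Any]:
    """Run one job and return (ok, result) instead of letting errors propagate.

    Hypothetical helper: a real consumer would call something like this
    from its message callback, so an exception in one job is logged and
    the consumer keeps processing later messages.
    """
    try:
        return True, handler(request)
    except Exception:
        # Log the full traceback, then report failure to the caller.
        logger.exception("Job failed for request %r", request)
        return False, None


def echo_job(request: dict[str, Any]) -> Any:
    """Hypothetical job that succeeds."""
    return request["query"]


def failing_job(request: dict[str, Any]) -> Any:
    """Hypothetical job that always raises."""
    raise ValueError("bad input")


if __name__ == "__main__":
    print(run_job_safely(echo_job, {"query": "stats"}))
    print(run_job_safely(failing_job, {"query": "stats"}))
```

Returning a status flag rather than re-raising is one possible design; whether a failed message should then be acknowledged, rejected, or requeued depends on the consumer and is not shown here.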
docker/Dockerfile.spark.base (+6 -6)
@@ -1,4 +1,4 @@
-ARG PYTHON_BASE_IMAGE_VERSION=3.9-focal-20220315
+ARG PYTHON_BASE_IMAGE_VERSION=3.12-20241130
 FROM metabrainz/python:$PYTHON_BASE_IMAGE_VERSION
 
 ARG PYTHON_BASE_IMAGE_VERSION
@@ -26,9 +26,9 @@ RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSI
 
 WORKDIR /usr/local
 
-ENV JAVA_VERSION 11.0.21
+ENV JAVA_VERSION 11.0.26
 ENV JAVA_MAJOR_VERSION 11
-ENV JAVA_BUILD_VERSION 9
+ENV JAVA_BUILD_VERSION 4
 RUN wget https://github.com/adoptium/temurin${JAVA_MAJOR_VERSION}-binaries/releases/download/jdk-${JAVA_VERSION}%2B${JAVA_BUILD_VERSION}/OpenJDK${JAVA_MAJOR_VERSION}U-jdk_x64_linux_hotspot_${JAVA_VERSION}_${JAVA_BUILD_VERSION}.tar.gz \
     && tar xzf OpenJDK${JAVA_MAJOR_VERSION}U-jdk_x64_linux_hotspot_${JAVA_VERSION}_${JAVA_BUILD_VERSION}.tar.gz \
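The `RUN wget` line above builds the JDK download URL from the three `ENV` values, which is why `JAVA_BUILD_VERSION` has to change together with `JAVA_VERSION`. The sketch below mirrors that substitution in Python to show the resulting URL. `temurin_url` is a hypothetical helper written for illustration and is not part of the repository.

```python
def temurin_url(major: str, version: str, build: str) -> str:
    """Expand the URL template from the Dockerfile's RUN wget line.

    The release tag is "jdk-<version>+<build>", with "+" percent-encoded
    as "%2B"; the archive name repeats the version and build number.
    """
    return (
        f"https://github.com/adoptium/temurin{major}-binaries/releases/download/"
        f"jdk-{version}%2B{build}/"
        f"OpenJDK{major}U-jdk_x64_linux_hotspot_{version}_{build}.tar.gz"
    )


if __name__ == "__main__":
    # Values taken from the new side of the diff.
    print(temurin_url("11", "11.0.26", "4"))
```

With the values from the new side of the diff, both the release tag and the archive name change, so a mismatch between the version and the build number would produce a URL that points at a different, possibly nonexistent, release.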