
Bump org.apache.pdfbox to 3.0.4 and guard against empty unicode strings #3271


Open: dafriz wants to merge 2 commits into main

Conversation

dafriz (Contributor) commented May 21, 2025

Resolves the dependency convergence error with the pdfbox version pulled in by Tika.

Fixes #3265

Guards against empty Unicode strings.

Fixes #3054

dafriz added 2 commits May 21, 2025 20:59
Signed-off-by: David Frizelle <david.frizelle@gmail.com>
Signed-off-by: David Frizelle <david.frizelle@gmail.com>
dafriz changed the title from "Bump org.apache.pdfbox to 3.0.4" to "Bump org.apache.pdfbox to 3.0.4 and guard against empty unicode strings" on May 21, 2025
dafriz (Contributor, Author) commented May 22, 2025

Note that the update to PDFBox 3.0.4 also requires the guard against empty Unicode strings; otherwise the existing test case fails.
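For readers unfamiliar with this kind of failure, here is a minimal sketch of what such a guard can look like. It assumes the failing code reads the first character of a Unicode string that may come back empty, which this thread does not confirm. `UnicodeGuard`, `hasText`, and `firstCharOrDefault` are hypothetical names for illustration, not the code in this PR.

```java
// Hypothetical helper, not the change made in this PR: it only illustrates
// checking a string for emptiness before indexing into it.
final class UnicodeGuard {

    private UnicodeGuard() {
    }

    /** Returns true when the string has at least one character to read. */
    static boolean hasText(String unicode) {
        return unicode != null && !unicode.isEmpty();
    }

    /**
     * Returns the first character of the string, or the fallback when the
     * string is null or empty (cases where charAt(0) would throw).
     */
    static char firstCharOrDefault(String unicode, char fallback) {
        return hasText(unicode) ? unicode.charAt(0) : fallback;
    }

    public static void main(String[] args) {
        // Sample inputs, including the empty and null cases the guard covers.
        String[] samples = { "A", "", null, "fi" };
        for (String sample : samples) {
            System.out.println("[" + firstCharOrDefault(sample, ' ') + "]");
        }
    }
}
```

Falling back to a default character is only one option; skipping the empty entry entirely may fit better, depending on how the caller lays out the extracted text.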

@@ -280,7 +280,7 @@
<protobuf-java.version>3.25.2</protobuf-java.version>

<!-- readers/writer/stores dependencies-->
-<pdfbox.version>3.0.3</pdfbox.version>
+<pdfbox.version>3.0.4</pdfbox.version>


Is it worth trying to go to 3.0.5?

Contributor Author

3.0.4 resolves the current Maven convergence issue.

Tika 3.2.0, which uses 3.0.5, is due any day now, so I would recommend waiting a while and then updating both together.
