mathijs81/java-dataframes

Java dataframes test

This is the companion repository to the Medium post "Doing cool data science in Java: how 3 DataFrame libraries stack up".

Data

The data was extracted from Eurostat at the beginning of September 2018. I opened the extracted CSV in LibreOffice and saved it again because the Eurostat output contained some invalid UTF-8 characters that some CSV importers couldn't handle directly.
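For readers who would rather clean such a file in code than round-trip it through LibreOffice, here is a minimal sketch of one alternative. It is not what was done for this repository. The class name `Utf8Sanitizer` and its methods are hypothetical, and it assumes Java 11+ (for `Files.writeString`) and that replacing each malformed byte sequence with U+FFFD is acceptable for the data.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of this repository.
final class Utf8Sanitizer {

    /** Decodes bytes as UTF-8, substituting U+FFFD for malformed sequences. */
    static String decodeLenient(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            // Not expected: REPLACE substitutes instead of reporting errors.
            throw new IllegalStateException(e);
        }
    }

    /** Reads a file, repairs its encoding, and writes valid UTF-8 to another file. */
    static void sanitize(Path in, Path out) throws IOException {
        String cleaned = decodeLenient(Files.readAllBytes(in));
        Files.writeString(out, cleaned, StandardCharsets.UTF_8);
    }
}
```

Calling `Utf8Sanitizer.sanitize(inputPath, outputPath)` with paths of your choosing produces a file that strict CSV importers should be able to decode. The replaced characters are lost, so it is worth checking the affected rows afterwards.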

Results [June 2025]

| Library          | Maintained | Version   | Time (ms) |
|------------------|------------|-----------|-----------|
| DuckDB           | Y          | 1.3.0     | 93        |
| DFLib            | Y          | 1.3.0     | 226       |
| Kotlin DataFrame | Y          | 1.0-beta2 | 816       |
| Tablesaw         | Y          | 0.44.1    | 820       |
| Joinery          | N          | 1.9       | 1,478     |
| Krangl           | N          | 0.18.4    | 1,796     |
| Morpheus         | N          | 0.9.23    | *         |

\* Morpheus is no longer maintained and doesn't seem to work on later Java versions (error related to accessing sun.util.calendar.ZoneInfo).

Code

The code for the libraries is in the Test{libraryname}.java files. They all use CheckResult.java to run a basic correctness check on the top-growing cities.

As described in the Medium post, I couldn't find a good way to do the pivot step in DataVec, but I included the code I wrote up to that point.

About

A quick test of a couple of data frame libraries for Java
