Establish new standard of compressed netCDF output, retaining datetime64[ns] format for time #63
Comments
@ryjombari and I revisited this. The compressed file, as read with Python, does not report a change in the formatting of the time.
Versions of the libraries we are using: xarray==2025.1.2 |
Here are the two files we tested - both the raw and the compressed versions: Archive.zip |
Update – Danelle and I confirmed that the compressed netCDF file retains the datetime64[ns] time format, and that the compressed netCDF file provided to NCEI for testing actually held time as datetime64[ns]. So, the issue reported by a tester must have been due to the software they used to read the netCDF file. (They would have had the same issue with the uncompressed netCDF.) Here is what they wrote: "I did have to make a change to accommodate the datetime formatting (minutes since XXX), but other than that they seem to load fine! Data from the compressed and uncompressed NetCDF files are identical after I read them in, so the compression seems like a great idea." |
@carueda OK to move forward with adding the compressed versions in PBP. Here is the code snippet that was used to write the compressed version:
from pathlib import Path

import xarray as xr


def write_compressed_netcdf(ds: xr.Dataset, out_file: Path) -> None:
    enc = {}
    for k in ds.data_vars:
        # Only variables with two or more dimensions get compression settings.
        if ds[k].ndim < 2:
            continue
        enc[k] = {
            "zlib": True,
            "complevel": 3,
            "fletcher32": True,
            # Chunk shape is half the variable's length along each dimension.
            "chunksizes": tuple(map(lambda x: x // 2, ds[k].shape)),
        }
    ds.to_netcdf(out_file, format="NETCDF4", engine="h5netcdf", encoding=enc)
|
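To make the effect of those settings concrete, here is a minimal pure-Python sketch of the encoding dictionary the snippet would build for a few variable shapes. The helper name `preview_encoding` and the variable names and shapes are hypothetical, chosen only for illustration. It also adds one assumption that is not in the original snippet: a `max(1, ...)` guard, because a dimension of length 1 would otherwise halve to a chunk length of 0, which HDF5 does not accept as a chunk dimension.

```python
def preview_encoding(shapes, complevel=3):
    """Build a to_netcdf-style encoding dict from {variable name: shape}."""
    enc = {}
    for name, shape in shapes.items():
        if len(shape) < 2:
            continue  # 1-D variables (e.g. time) get no explicit encoding
        enc[name] = {
            "zlib": True,
            "complevel": complevel,
            "fletcher32": True,
            # Assumed guard (not in the original): keep each chunk length >= 1.
            "chunksizes": tuple(max(1, n // 2) for n in shape),
        }
    return enc


# Hypothetical variable names and shapes, for illustration only.
enc = preview_encoding({"time": (1440,), "psd": (1440, 1000), "effort": (1440, 1)})
print(sorted(enc))
print(enc["psd"]["chunksizes"], enc["effort"]["chunksizes"])
```

Since `time` is one-dimensional, it is skipped and keeps whatever default encoding xarray chooses on write.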
Thanks @danellecline @ryjombari : So, in conclusion:
Correct? |
IMHO that is up to @ryjombari. I think it's safe to conclude that multiple readers can read the compressed format, but I can see why keeping the option to save uncompressed could be helpful as well for backwards compatibility. |
@carueda I think you made the right call on this first pass:
Standing by to test on gizo whenever we are ready... and crank out the most recent deployments of MB05 and CH01. |
Ok, I've merged #65, which added the NetCDF compression plus the CLI option Some notes:
|
Our example compressed netCDF format was tested by multiple people at NOAA / NCEI. The only issue they had with the compressed file was that the format for time changed from datetime64[ns] in the uncompressed file to the following in the compressed file:
time
Size: 1440x1
Dimensions: time
Datatype: int64
Attributes:
units = "minutes since 2022-07-21 00:00:00"
calendar = "proleptic_gregorian"
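For readers whose software shows time as raw integers like this, the `units` attribute carries everything needed to recover timestamps. The following is a minimal standard-library sketch, not the code any tester actually used: the helper name `decode_cf_time` is hypothetical, it handles only fixed-length units, and it relies on Python's `datetime` using the proleptic Gregorian calendar, which matches the `calendar` attribute above.

```python
from datetime import datetime, timedelta

# Seconds per unit for the fixed-length time units handled by this sketch.
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_time(values, units):
    """Convert offsets like 'minutes since 2022-07-21 00:00:00' to datetimes."""
    unit, _, reference = units.partition(" since ")
    if unit not in _UNIT_SECONDS or not reference:
        raise ValueError(f"unsupported units string: {units!r}")
    epoch = datetime.fromisoformat(reference.strip())
    step = _UNIT_SECONDS[unit]
    return [epoch + timedelta(seconds=int(v) * step) for v in values]


# Offsets 0..1439 are assumed here to mirror the 1440x1 time variable above.
times = decode_cf_time(range(1440), "minutes since 2022-07-21 00:00:00")
print(times[0], times[-1])
```

Libraries such as xarray apply this decoding automatically when opening a file, which would explain why the same variable appears as datetime64[ns] there.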
Can we: