Memory issues loading pp files with default LOAD_POLICY #6404

@david-bentley

🐛 Bug Report

With Iris >= 3.11 and the introduction of LOAD_POLICY/COMBINE_POLICY, loading some UM pp files fails with an out-of-memory error. The same files load as expected with the 'legacy' LOAD_POLICY, but fail as shown below with the 'default' LOAD_POLICY.

I've tried running on our compute cluster with 64 GB of memory and still get the same error. I can point someone at the Met Office to the files if required.

Script to reproduce:

import glob
import iris

policy = "legacy"  # or "default"
iris.LOAD_POLICY.set(policy)

files = glob.glob("*.pp")

cubes = iris.load(files)
print(cubes)

default LOAD_POLICY

$ time python read.py
...
long traceback
...
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 15.6 GiB for an array with shape (71, 6, 1920, 2560, 2) and data type float32

real  38m48.660s
user  35m16.419s
sys   3m7.086s
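
As a sanity check on the error message, the reported size follows directly from the array shape and dtype. A minimal sketch using only the standard library; the helper name `allocation_gib` is hypothetical, and the 4-byte item size is the standard width of float32:

```python
import math


def allocation_gib(shape, itemsize):
    """Return the size in GiB of a dense array with this shape and item size in bytes."""
    return math.prod(shape) * itemsize / 2**30


# Shape and dtype taken from the _ArrayMemoryError above; float32 is 4 bytes per element.
failed = allocation_gib((71, 6, 1920, 2560, 2), itemsize=4)
print(f"{failed:.1f} GiB")  # 15.6 GiB
```

So the figure in the traceback is a single dense array of about 4.2 billion float32 values, which on its own takes roughly a quarter of the 64 GB available on the cluster node.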

legacy LOAD_POLICY

$ time python read.py
0: m01s03i318 / (unknown)              (pseudo_level: 5; time: 13; latitude: 1920; longitude: 2560)
1: high_type_cloud_area_fraction / (1) (time: 37; latitude: 1920; longitude: 2560)
2: land_binary_mask / (1)              (latitude: 1920; longitude: 2560)
3: land_binary_mask / (1)              (latitude: 1920; longitude: 2560)
4: low_type_cloud_area_fraction / (1)  (time: 37; latitude: 1920; longitude: 2560)
5: mass_fraction_of_cloud_ice_in_air / (kg kg-1) (time: 37; model_level_number: 71; latitude: 1920; longitude: 2560)
6: mass_fraction_of_cloud_liquid_water_in_air / (kg kg-1) (time: 37; model_level_number: 71; latitude: 1920; longitude: 2560)
7: medium_type_cloud_area_fraction / (1) (time: 37; latitude: 1920; longitude: 2560)
8: surface_altitude / (m)              (time: 37; latitude: 1920; longitude: 2560)
9: surface_altitude / (m)              (time: 37; latitude: 1920; longitude: 2560)
10: surface_snow_amount / (kg m-2)      (time: 37; latitude: 1920; longitude: 2560)
11: x_wind / (m s-1)                    (time: 37; latitude: 1921; longitude: 2560)
12: y_wind / (m s-1)                    (time: 37; latitude: 1921; longitude: 2560)

real  12m57.740s
user  9m32.780s
sys   2m51.927s
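
`time` reports only elapsed and CPU time, so the runs above do not show how much memory each policy actually used. A minimal sketch of recording peak memory from inside the script, assuming a Unix-like system where the standard-library `resource` module is available; `peak_rss_mib` is a hypothetical helper name, not part of Iris:

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


# Call this at the end of read.py, after iris.load(...) has returned.
print(f"peak RSS: {peak_rss_mib():.0f} MiB")
```

For the failing run, the same call could go in an `except MemoryError:` handler around the load. Because the failed 15.6 GiB allocation never succeeds, the peak reported there reflects only what was held before the error.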

How To Reproduce

Steps to reproduce the behaviour:

  1. Download some UM pp files from the MASS archive (about 8 GB in total):
$ du -ch *pp | grep total
8.0G  total
  2. Read the files without setting a policy, so the default LOAD_POLICY applies:
import glob
import iris

files = glob.glob("*.pp")

cubes = iris.load(files)
print(cubes)
  3. The script is killed with a memory error:
$ time python read.py
...
long traceback
...
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 15.6 GiB for an array with shape (71, 6, 1920, 2560, 2) and data type float32

real  38m48.660s
user  35m16.419s
sys   3m7.086s

Expected behaviour

These are pretty standard pp files coming out of the UM global model forecast, so I would expect that the files could be loaded successfully with the default LOAD_POLICY/COMBINE_POLICY.

Environment

  • OS & Version: RHEL7
  • Iris Version: >=3.11

Status

👀 In Review