
Crash caused by leaked semaphore objects #13

Open
GenevieveBuckley opened this issue Aug 10, 2022 · 3 comments
@GenevieveBuckley

I'm getting crashes when running label statistics on datasets of around (1000, 2048, 2048) pixels.

There's some sort of memory leak, which kills ipython completely. I'd understand if this were a truly giant dataset, or if I were computing excessively complicated label statistics instead of just label size, but that's not the case. I'm using a small, cropped subsection of my larger dataset, and I'd always considered one to two thousand pixels a fairly reasonable size to process in memory.

In [7]: zsh: killed     ipython
(napari-empanada) genevieb@192-168-1-103 ~ % /Users/genevieb/mambaforge/envs/napari-empanada/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
@haesleinhuepf
Owner

haesleinhuepf commented Aug 10, 2022

Hey @GenevieveBuckley ,

thanks for reporting this!

> size (1000, 2048, 2048) pixels

Assuming it's a uint32 label image, that image alone occupies about 16 GB of memory (1000 × 2048 × 2048 pixels × 4 bytes per pixel).

> two thousand pixels a fairly reasonable size to process in memory.

Well, your image has about 4.2 billion of them :-D
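
As a quick sanity check on those numbers (a minimal sketch; the uint32 dtype is an assumption, as noted above):

import numpy as np

shape = (1000, 2048, 2048)
n_pixels = np.prod(shape)        # 4_194_304_000, i.e. ~4.2 billion pixels
gb_uint32 = n_pixels * 4 / 1e9   # 4 bytes per uint32 pixel -> ~16.8 GB
print(n_pixels, gb_uint32)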

I wrote a little minimal working example to reproduce your issue.

import numpy as np
from skimage.measure import label
from napari_simpleitk_image_processing import label_statistics

# 100 slices of 2048 x 2048 random float64 intensities
image = np.random.random((100, 2048, 2048))
# threshold to a binary mask and run connected-component labeling
binary = image > 0.8
labels = label(binary)
# compute per-label statistics; this is the step that exhausts memory
stats = label_statistics(image, labels)

This code runs forever on my machine and fills up all 32 GB of memory, even though the image itself is just 1.6 GB.

I continued playing with the underlying code and suspect I found the problem that causes your issue. You'll see a PR in a minute, and a release of a new version a few minutes after. Let me know if that solves your issue!

And thanks again for reporting!

@haesleinhuepf
Owner

Quick benchmark, using the new release

pip install napari-simpleitk-image-processing==0.4.2

and using this image (note: just 100 slices):

image = np.random.random((100,2048,2048))

and this code:

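# 'labels' computed from 'image' as in the minimal working example above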
stats = label_statistics(image, labels, 
                         intensity=True,
                         shape=False,
                         perimeter=False,
                         position=False,
                         moments=False)

This takes about one minute on my computer. Increasing it to 1000 slices exceeds what my little laptop can do; I don't expect a 16 GB label image to be processable on a computer with 32 GB of memory. If you test it on a machine with more memory, I would be curious how long it takes. (-:
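
For a rough sense of why 1000 slices is out of reach, here is a back-of-the-envelope sketch (float64 for the intensity image from np.random.random, and uint32 for the labels, are assumptions):

for slices in (100, 1000):
    n = slices * 2048 * 2048
    intensity_gb = n * 8 / 1e9   # float64 intensity image
    labels_gb = n * 4 / 1e9      # uint32 label image (assumed)
    print(slices, round(intensity_gb + labels_gb, 1))
# 100 slices:  ~5 GB of raw arrays, before any working copies
# 1000 slices: ~50 GB, already beyond 32 GB of RAM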

@GenevieveBuckley
Author

Hm, ok. Those back-of-the-envelope calculations make sense; I guess even my "small" examples are still pretty big. It's difficult to get access to a machine with more memory at the moment: the supercomputing cluster is undergoing an upgrade (which should be good eventually, but quite a bit of it is offline right now).
