
investigate allocations #683

Closed
Tracked by #390
juliasloan25 opened this issue Mar 8, 2024 · 1 comment
Assignees
Labels
GPU 🍃 leaf Issue coupled to a PR

Comments

@juliasloan25
Member

When we try to run the DYAMOND configuration on central's P100 GPUs, it fails because there isn't enough memory available during the `atmos_init` call. The same run works fine on clima's A100 GPUs, but in `atmos_init` we see "Effective GPU memory usage: 87.32% (69.114 GiB/79.150 GiB)". Roughly 70 GiB of memory usage is a lot, so we need to look into where these allocations come from.

We can do this by placing `CUDA.memory_status` calls throughout the code to see where the allocations jump.
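A minimal sketch of this instrumentation, assuming CUDA.jl is loaded (`CUDA.memory_status` is CUDA.jl's built-in memory reporter; the `report_gpu_memory` helper and the surrounding call sites are hypothetical):

```julia
using CUDA

# Hypothetical helper: label each memory report so jumps between
# checkpoints are easy to spot in the log.
function report_gpu_memory(label::AbstractString)
    @info "GPU memory at: $label"
    CUDA.memory_status()  # prints "Effective GPU memory usage: ...% (... GiB/... GiB)"
end

# Sketch of bracketing a suspect call (atmos_init is named in the issue;
# `config` and the return value are placeholders):
# report_gpu_memory("before atmos_init")
# atmos_sim = atmos_init(config)
# report_gpu_memory("after atmos_init")
```

Comparing consecutive reports narrows down which call is responsible for the large allocation; the checkpoints can then be moved inside that call to bisect further.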

@juliasloan25
Member Author

Coupler output table shows very similar allocations between atmos-only and coupled simulations, as of 5/1.

On GPU:
- coupled simulation allocations: 3.361 GiB
- atmos-only simulation allocations: 3.255 GiB

On CPU:
- coupled `CoupledSimulation` object allocations: 0.196 GiB
- atmos-only `CoupledSimulation` object allocations: 0.195 GiB
