Stalls on Ampere #22

Open
govertb opened this issue Dec 7, 2021 · 2 comments

govertb commented Dec 7, 2021

When running on a GeForce RTX 3090 (Ampere architecture), the code stalls in the SortKernel. This is possibly because certain threads no longer co-reside on the same multiprocessor during shared memory reductions in the SummarizationKernel and/or SortKernel.
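
For readers unfamiliar with this class of problem, here is a minimal sketch of a shared memory reduction that does not depend on any implicit ordering between threads. It is not the project's SummarizationKernel or SortKernel; the names `blockSum`, `sumOnGpu` and `kThreads` are hypothetical. One known change that affects older reduction code is that, since Volta, threads within a warp are no longer guaranteed to execute in lockstep, so each step needs an explicit barrier. Whether that is related to the stall reported here is an unverified assumption.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // hypothetical block size; must be a power of two

// Each block sums up to kThreads inputs and writes one partial sum.
__global__ void blockSum(const int *in, int *out, int n) {
    __shared__ int buf[kThreads];
    const int tid = threadIdx.x;
    const int idx = blockIdx.x * blockDim.x + tid;
    buf[tid] = (idx < n) ? in[idx] : 0;
    __syncthreads();

    // Tree reduction with an explicit barrier after every step, including
    // the final steps that involve only a single warp. No warp-lockstep
    // behaviour is assumed.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) buf[tid] += buf[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = buf[0];
}

// Host wrapper (error checking omitted to keep the sketch short).
int sumOnGpu(const std::vector<int> &values) {
    const int n = static_cast<int>(values.size());
    if (n == 0) return 0;
    const int blocks = (n + kThreads - 1) / kThreads;

    int *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(int));
    cudaMalloc(&dOut, blocks * sizeof(int));
    cudaMemcpy(dIn, values.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    blockSum<<<blocks, kThreads>>>(dIn, dOut, n);

    // This blocking copy also waits for the kernel to finish.
    std::vector<int> partial(blocks);
    cudaMemcpy(partial.data(), dOut, blocks * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);

    int total = 0;
    for (int p : partial) total += p;  // finish the reduction on the host
    return total;
}

int main() {
    std::vector<int> ones(1000, 1);
    printf("sum = %d\n", sumOnGpu(ones));
    return 0;
}
```

The barrier inside the loop is reached by every thread in the block because the loop bounds depend only on `blockDim.x`, which keeps `__syncthreads()` out of divergent code.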

govertb added the bug label Oct 18, 2023

hubbardp commented Aug 3, 2024

I think I am encountering this problem too, on an A6000. If I comment out the call to SortKernel, the code finishes all the iterations without stalling. This bug is a shame, because the performance seems very good.

Owner Author

govertb commented Oct 26, 2024

From the paper on this project, I see we used CUDA 7.5 at the time. I checked the corresponding docs ('CUDA Compiler Driver NVCC'), and in CUDA 7.5 nvcc defaults to compiling with

--gpu-architecture=compute_20 --gpu-code=sm_20,compute_20

In recent versions of CUDA, nvcc defaults to compiling with

--gpu-architecture=compute_52 --gpu-code=sm_52,compute_52

Explicitly targeting a 'virtual architecture' (e.g. by explicitly passing the CUDA 7.5 defaults) would fix some things. I'm curious whether it would resolve the stalling, but unfortunately I don't have access to an Ampere GPU any more. Even if it does resolve the stalling, breaking changes between CUDA 7.5 and recent versions would need to be addressed as well.

Edit: I see support for Compute Capability 2.0 was dropped starting with CUDA 9.0.
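
For anyone with an Ampere GPU who wants to try this, a small check like the following might help confirm which architecture a kernel was actually compiled for versus what the device reports. This is only a sketch: the file name `arch_check.cu` and the names `compiledArch` and `toArchNumber` are hypothetical, and the build line reuses the recent-version defaults quoted above, since compute_20 is no longer accepted.

```cuda
// Build with explicit targets (hypothetical file name):
//   nvcc --gpu-architecture=compute_52 --gpu-code=sm_52,compute_52 arch_check.cu -o arch_check
#include <cstdio>
#include <cuda_runtime.h>

// Writes the architecture this kernel was compiled for, using the
// __CUDA_ARCH__ numbering (e.g. 520 for compute_52).
__global__ void compiledArch(int *arch) {
#ifdef __CUDA_ARCH__
    *arch = __CUDA_ARCH__;
#endif
}

// Converts a compute capability (major, minor) to the same numbering.
inline int toArchNumber(int major, int minor) {
    return major * 100 + minor * 10;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA device found\n");
        return 1;
    }

    int *dArch = nullptr;
    int hArch = 0;
    cudaMalloc(&dArch, sizeof(int));
    cudaMemcpy(dArch, &hArch, sizeof(int), cudaMemcpyHostToDevice);

    compiledArch<<<1, 1>>>(dArch);
    cudaError_t err = cudaGetLastError();  // reports launch failures, e.g. no compatible code embedded
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dArch);
        return 1;
    }

    cudaMemcpy(&hArch, dArch, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dArch);

    printf("device: %s, architecture %d\n", prop.name,
           toArchNumber(prop.major, prop.minor));
    printf("kernel compiled for architecture %d\n", hArch);
    return 0;
}
```

If the second number is lower than the first, the kernel is running from code generated for an older virtual architecture, which is the situation the explicit flags are meant to produce.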
