Usefulness of supports_atomics in its current form #594
Comments
👍 These functions are post-hoc additions that attempt to allow these queries without breaking compatibility, so we could add a more fine-grained interface that uses types as well.
What should the API look like? Essentially extending the What about partial support? For example, Metal has some 64-bit atomic support, but only for This is kind of just me dumping my thoughts while I try to better understand atomics and how they relate to GPU programming/KA.
Yeah, I think the granularity: We could, of course, make it much more fine-grained For weak vs non-weak. That ought to matter only for compare-and-swap? And the way you support this generally is to use a loop. IIRC Atomix provides that fallback loop? Whether that loop matches the progress requirements of the GPU is a different story.
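The loop-based fallback mentioned above can be illustrated on the CPU with Julia's built-in `Threads.Atomic`. This is only a minimal sketch of the general compare-and-swap retry pattern, not Atomix's implementation and not GPU code; the function name `atomic_update_via_cas!` is hypothetical.

```julia
using Base.Threads: Atomic, atomic_cas!

# Hypothetical helper: emulate an atomic read-modify-write for an arbitrary
# binary `op` using only compare-and-swap. Returns the value seen before the update.
function atomic_update_via_cas!(op, x::Atomic{T}, v::T) where {T}
    old = x[]
    while true
        desired = op(old, v)
        # atomic_cas! returns the value that was stored before the attempt.
        seen = atomic_cas!(x, old, desired)
        # If nobody changed the value in between, the swap succeeded.
        seen === old && return old
        # Otherwise retry with the freshly observed value.
        old = seen
    end
end

counter = Atomic{Int}(0)
Threads.@threads for i in 1:1000
    atomic_update_via_cas!(+, counter, 1)
end
```

The retry loop only terminates if some thread always makes progress, which is the forward-progress concern raised above for GPUs.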
I opened #595 for more concrete feedback. I feel like that may be all that's needed on the KA side.
What is the exact use case for this? Querying whether a backend supports a datatype - be it for atomics or normal use - means you're writing type-generic code anyway. If you're exposing granular capabilities from KA, a downstream library developer would then have to write checks in case a user wanted to use an unsupported data type - and in the end just throw an error. Is there anything else to do in such a case? Why not just have a pass at the KA level for unsupported operations that throws a useful error? E.g. right now, if you wanted to use a Float64 on Metal, you'd get a I suppose queryable capabilities only make sense if you wanted different codepaths depending on their result. We had to do that in AcceleratedKernels for Metal
Thanks for weighing in @anicusan. I was hoping you would, as you have practical experience using atomics with KA code. At the moment, it was an attempt to figure out the role of However, after reading your comment, I'm doubting its general usefulness. For example, Metal supports some atomics but not all of them, and we forgot to undefine the function after support was added, yet no one has complained. So maybe we should get rid of it and implement one of your suggestions. Maybe all this needs is good documentation outlining the atomics support of each backend? However, I can see that becoming stale quite quickly, so it may not be the best suggestion.
Thanks for
For 1., I think the differences between platforms are too subtle - and often not specified enough by the official docs themselves - to be generally useful; if we consider the 5 fundamental algorithms in AK (mapreduce, accumulate, sort, any, foreachindex) for 5 backends, only 2 of these 25 combinations had quirks that needed workarounds, so 8%, and even those only in extremely specific implementation details. I'm personally fine with finding them out as they come, and the workaround (via a kwarg) is simple enough that I don't think it's necessary to expose more KA surface area to maintain there. A helpful error message and/or some docs would be good, though they can also be in a "debugging" fashion rather than capability promises (e.g. AcceleratedKernels troubleshooting). I think 2. would be much more useful, and probably low-hanging fruit - for example, it would be great to query the maximum group size of a given device (many algorithms scale with the number of blocks to process, so it'd be good to just use the maximum, but we don't have that exposed - so most Also, deciding on which terminology to use would probably be good for consistency, between all block/group/thread/item/etc.
At the moment, supports_atomics returns a boolean, but different backends have different levels of support. For example, Metal essentially only supports 32-bit integers and floats, with 64-bit integer atomics being limited to min and max. Would it be useful/worth it to add granularity to the function?
This came up when fixing the histogram example for oneAPI and Metal, which don't support 64-bit atomics.
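One possible shape for a more granular query, sketched in plain Julia. Everything here is hypothetical: `SketchBackend`, `SketchCPU`, `SketchMetal`, `supports_atomics_sketch`, and `histogram_counter_type` are stand-ins invented for illustration and are not part of KernelAbstractions or any backend package. The Metal-like rules simply encode the description above (32-bit integers and floats, plus 64-bit integer min and max).

```julia
# Hypothetical stand-ins for real backend types (assumption, not the KA types).
abstract type SketchBackend end
struct SketchCPU <: SketchBackend end
struct SketchMetal <: SketchBackend end

# Coarse query in the style of the current boolean function.
supports_atomics_sketch(::SketchBackend) = true

# Finer-grained query on (backend, operation, element type).
# By default it falls back to the coarse answer, so existing backends keep working.
supports_atomics_sketch(b::SketchBackend, op, ::Type) = supports_atomics_sketch(b)

# Metal-like rules: all operations on 32-bit integers and floats.
supports_atomics_sketch(::SketchMetal, op, ::Type{T}) where {T<:Union{Int32,UInt32,Float32}} = true
# 64-bit integers only for min and max.
supports_atomics_sketch(::SketchMetal, op, ::Type{T}) where {T<:Union{Int64,UInt64}} =
    op === min || op === max
# Anything else is unsupported.
supports_atomics_sketch(::SketchMetal, op, ::Type) = false

# Example use: pick a histogram counter type the backend can atomically add to.
histogram_counter_type(b::SketchBackend) =
    supports_atomics_sketch(b, +, Int64) ? Int64 : Int32
```

With a query like this, the histogram example could choose a 32-bit counter on backends without 64-bit atomic addition instead of failing at compile time.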