-
Notifications
You must be signed in to change notification settings - Fork 49
Support ibv_reg_dmabuf_mr for buffer allocated by cuMemMalloc #513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/ib.cc
Outdated
if (cuMemAlloc) { | ||
int fd; | ||
cuMemGetHandleForAddressRange(&fd, dptr, pages * pageSize, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0); | ||
this->mr = IBVerbs::ibv_reg_dmabuf_mr2(pd, 0, pages * pageSize, addr, fd, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please double check the size here. I think we need to use another size instead of pages * pageSize
src/ib.cc
Outdated
bool cuMemAlloc = mscclpp::isCuMemMapAllocated((void*)dptr); | ||
if (cuMemAlloc) { | ||
int fd; | ||
cuMemGetHandleForAddressRange(&fd, dptr, pages * pageSize, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to use buff directly instead of dptr, please double-check the API doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second parameter should be CUdeviceptr dptr.
CUresult cuMemGetHandleForAddressRange ( void* handle, CUdeviceptr dptr, size_t size, CUmemRangeHandleType handleType, unsigned long long flags )
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM_1g51e719462c04ee90a6b0f8b2a75fe031
…cated always return flase
…b.com/microsoft/mscclpp into qinghuazhou/support_ibv_reg_dmabuf_mr
#if !defined(__HIP_PLATFORM_AMD__) | ||
MSCCLPP_CUTHROW(cuCtxGetDevice(&dev)); | ||
MSCCLPP_CUTHROW(cuDeviceGetAttribute(&dmaBufSupported, CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED, dev)); | ||
#endif // !defined(__HIP_PLATFORM_AMD__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this micro, we map hip function to cu function, maybe you can refer this file: https://github.com/microsoft/mscclpp/blob/main/include/mscclpp/gpu.hpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED is not supported by hip
https://rocm.docs.amd.com/projects/HIPIFY/en/latest/reference/tables/CUDA_Driver_API_functions_supported_by_HIP.html
|
||
int fd; | ||
MSCCLPP_CUTHROW(cuMemGetHandleForAddressRange(&fd, base + alignedOffset, alignedUserBufferSize, | ||
CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems we can use cuMemGetHandleForAddressRange(&fd, addr, pages, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0))
Don't need to call cuMemGetAddressRange
?
Fix issue #496
For buffer allocated by cuMemMalloc, use ibv_reg_dmabuf_mr to register a dma-buf based memory region.