
Support ibv_reg_dmabuf_mr for buffer allocated by cuMemMalloc #513


Open
wants to merge 11 commits into
base: main

Conversation

seagater (Contributor)

Fix issue #496
For buffers allocated by cuMemMalloc, use ibv_reg_dmabuf_mr to register a dma-buf based memory region.

@seagater seagater requested review from Binyang2014 and chhwang April 24, 2025 19:06
src/ib.cc Outdated
if (cuMemAlloc) {
int fd;
cuMemGetHandleForAddressRange(&fd, dptr, pages * pageSize, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);
this->mr = IBVerbs::ibv_reg_dmabuf_mr2(pd, 0, pages * pageSize, addr, fd,
Contributor


Please double-check the size here. I think we need to use another size instead of pages * pageSize.

src/ib.cc Outdated
bool cuMemAlloc = mscclpp::isCuMemMapAllocated((void*)dptr);
if (cuMemAlloc) {
int fd;
cuMemGetHandleForAddressRange(&fd, dptr, pages * pageSize, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);
Contributor


Maybe we need to use buff directly instead of dptr; please double-check the API doc.

Contributor Author


The second parameter should be CUdeviceptr dptr.
CUresult cuMemGetHandleForAddressRange ( void* handle, CUdeviceptr dptr, size_t size, CUmemRangeHandleType handleType, unsigned long long flags )
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html#group__CUDA__MEM_1g51e719462c04ee90a6b0f8b2a75fe031

#if !defined(__HIP_PLATFORM_AMD__)
MSCCLPP_CUTHROW(cuCtxGetDevice(&dev));
MSCCLPP_CUTHROW(cuDeviceGetAttribute(&dmaBufSupported, CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED, dev));
#endif // !defined(__HIP_PLATFORM_AMD__)
Contributor


Do we need this macro? We map HIP functions to CUDA driver functions; maybe you can refer to this file: https://github.com/microsoft/mscclpp/blob/main/include/mscclpp/gpu.hpp
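The aliasing approach the reviewer is pointing at could look roughly like this (a hypothetical header fragment illustrating the idea, not the actual contents of gpu.hpp; whether every symbol here has a HIP equivalent, e.g. CU_DEVICE_ATTRIBUTE_DMA_BUF_SUPPORTED, is exactly what the #if guard in the quoted code works around):

```cpp
// Hypothetical sketch: on the AMD platform, alias the CUDA driver-API
// symbols to their HIP counterparts so call sites need no #if guards.
#if defined(__HIP_PLATFORM_AMD__)
#define cuCtxGetDevice hipCtxGetDevice
#define cuDeviceGetAttribute hipDeviceGetAttribute
// ...one alias per driver-API symbol the code base uses
#endif
```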

Contributor Author



int fd;
MSCCLPP_CUTHROW(cuMemGetHandleForAddressRange(&fd, base + alignedOffset, alignedUserBufferSize,
CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0));
Contributor


Seems we can use cuMemGetHandleForAddressRange(&fd, addr, pages, CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0) directly. Do we still need to call cuMemGetAddressRange?
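Taken together, the registration path being discussed in this review is roughly the following (pseudocode sketch; the exact size argument for both calls and the need for cuMemGetAddressRange are the open questions above, and error handling is omitted):

```
// Pseudocode sketch of the dma-buf registration path.
// 1. Detect that the buffer came from the CUDA VMM (cuMemMap) allocator.
if (mscclpp::isCuMemMapAllocated(addr)) {
    // 2. Ask the driver for a dma-buf fd covering the page-aligned range.
    cuMemGetHandleForAddressRange(&fd, dptr, alignedSize,
                                  CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);
    // 3. Register the dma-buf with the HCA instead of pinning host-visible
    //    memory through ibv_reg_mr.
    mr = ibv_reg_dmabuf_mr(pd, /*offset=*/0, alignedSize,
                           /*iova=*/(uint64_t)addr, fd, accessFlags);
} else {
    // Fallback: ordinary memory registration.
    mr = ibv_reg_mr(pd, addr, size, accessFlags);
}
```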


3 participants