Bug description
When running a graph with an ops.matmul that operates on tensors with a batch size larger than 2^16-1 (65535), I get a CUDA error: CUDA call failed: CUDA_ERROR_INVALID_VALUE (invalid argument). The same graph executes fine on a CPU.
Steps to reproduce
The reproduction script is Python code that calls ops.matmul on a Bx1x1 tensor. It is run on a CPU and on a GPU with batch sizes B = 65535 and B = 65536.
When I run this on my machine (with an RTX 3090), I get output like:
running matmul on cpu with batch size 65535
running matmul on cpu with batch size 65536
running matmul on gpu with batch size 65535
running matmul on gpu with batch size 65536
...
ValueError: At Kernels/mojo/gpu/host/device_context.mojo:1685:26: CUDA call failed: CUDA_ERROR_INVALID_VALUE (invalid argument)
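The failing threshold, 2^16-1 = 65535, is the same as the maximum CUDA grid size in the y and z dimensions. One possible explanation, which this report does not confirm, is that the GPU kernel maps the batch dimension onto one of those grid dimensions. Under that assumption, a workaround could be to split the batch into chunks of at most 65535 entries and run each chunk separately. The sketch below shows only the chunking logic in plain Python; `batch_chunks`, `run_in_chunks`, and the `run_batch` callable are hypothetical names for illustration and are not part of the MAX API.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Maximum CUDA grid size in the y and z dimensions (2**16 - 1).
# ASSUMPTION: the failing kernel maps the batch dimension onto one of them.
CUDA_MAX_GRID_DIM_YZ = 2**16 - 1


def batch_chunks(
    batch_size: int, max_chunk: int = CUDA_MAX_GRID_DIM_YZ
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) ranges that cover range(batch_size) in order."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    for start in range(0, batch_size, max_chunk):
        yield start, min(start + max_chunk, batch_size)


def run_in_chunks(
    run_batch: Callable[[Sequence[float], Sequence[float]], List[float]],
    lhs: Sequence[float],
    rhs: Sequence[float],
    max_chunk: int = CUDA_MAX_GRID_DIM_YZ,
) -> List[float]:
    """Apply a batched operation chunk by chunk and concatenate the results.

    run_batch is a hypothetical stand-in for whatever executes the batched
    matmul on the device; here it only needs to accept two equal-length slices.
    """
    if len(lhs) != len(rhs):
        raise ValueError("lhs and rhs must have the same batch size")
    out: List[float] = []
    for start, stop in batch_chunks(len(lhs), max_chunk):
        out.extend(run_batch(lhs[start:stop], rhs[start:stop]))
    return out


def scalar_matmul(lhs: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Stand-in for a Bx1x1 batched matmul: each batch entry is one product."""
    return [x * y for x, y in zip(lhs, rhs)]


if __name__ == "__main__":
    # A batch of 65536 is split into one full chunk and one chunk of size 1.
    print(list(batch_chunks(65536)))
    result = run_in_chunks(scalar_matmul, [2.0] * 65536, [3.0] * 65536)
    print(len(result), result[0], result[-1])
```

In a real graph, `run_batch` would execute the matmul on the device for one slice of the batch. Whether chunking avoids the error depends on the unverified assumption about the cause.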
System information
$ magic info
Magic version: 0.7.0
System
------------
Pixi version: 0.41.3
Platform: linux-64
Virtual packages: __unix=0=0
                : __linux=6.8.0=0
                : __glibc=2.39=0
                : __cuda=12.8=0
                : __archspec=1=zen4
$ magic list max
Package     Version               Build    Size       Kind   Source
max         25.2.0.dev2025030205  release  9.7 KiB    conda  max
max-core    25.2.0.dev2025030205  release  238.3 MiB  conda  max-core
max-python  25.2.0.dev2025030205  release  117.9 MiB  conda  max-python