Complex struct as input to a kernel #574

Open

foglienimatteo opened this issue Feb 25, 2025 · 1 comment

@foglienimatteo

Hello everyone!

We are trying to use KernelAbstractions.jl to parallelise some nested integrals in a Julia code (GaPSE.jl, for those who are interested).

However, we encountered an issue we are not able to solve: we need to pass a complicated struct into the kernel to do the computations, but we get stuck at the error Argument XXX to your kernel function is of type YYY, which is not isbits. That exception is caused by a Vector{Float64} field, which is not an isbits type (even if I don't get why, because we don't use pointers and it only holds primitive, concrete Float64 values).
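
(For reference, the check can be reproduced on the host. A Vector is a mutable, heap-allocated object referenced through a pointer, so neither it nor any struct containing it is isbits:)

julia> isbitstype(Float64)
true

julia> isbitstype(Vector{Float64})   # mutable, heap-allocated, pointer-backed
false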

We created a minimal working example with oneAPI of what we are doing; I put it at the bottom.
The same behaviour can be reproduced with Metal after the replacements oneAPI.oneAPIBackend() -> Metal.MetalBackend() and Float64 -> Float32 (because Metal does not support doubles).

The part which bugs us the most: it IS possible to pass a Vector{Float64}, but only if:

  1. it is first transformed into the GPU array type (e.g. oneArray([1 2 3]));
  2. that vector is then given DIRECTLY to the kernel as an argument (see the sketch below).

My question is: are we doing something wrong/missing something, or is it really not possible to pass structs containing Vectors to the kernel?
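
A minimal sketch of that working case, for reference (hypothetical kernel name add_first!, assuming oneAPI):

using KernelAbstractions
using oneAPI

@kernel function add_first!(A, b)
    I = @index(Global)
    A[I] += b[1]
end

backend = oneAPI.oneAPIBackend()
A = KernelAbstractions.ones(backend, Float64, 16)
b = oneArray([1.0, 2.0, 3.0])                      # device vector...
add_first!(backend, 16)(A, b, ndrange=length(A))   # ...passed directly: works
KernelAbstractions.synchronize(backend)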


Some notes:

  • the input struct can be read-only, we don't want to modify it;
  • we cannot use StaticArrays, because we don't know the sizes at compile time, only at runtime;
  • we tried to replace Vector{Float64} with oneArray{Float64}, with no change (see the note after this list);
  • we tried to add @Const to the struct, with no change;
  • it would be very complicated and impractical for us to unpack the whole struct (have a look at GaPSE.Cosmology if you need an idea of its size); we thought about creating another struct from this one, removing the obviously non-isbits fields (e.g. Strings), but we absolutely need to bring the vectors along (GaPSE.MySpline is just another struct containing Vector{Float64}).
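
For what it's worth, swapping the field to oneArray{Float64} cannot help on its own: the host-side oneArray handle is itself not isbits. Only the top-level kernel arguments are converted to their isbits device-side counterparts at launch (via Adapt.jl, the same mechanism the answer below extends to custom structs), which is why case 2 above works:

julia> using oneAPI

julia> isbitstype(oneArray{Float64,1})   # host-side handle: a mutable struct
false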

Thank you in advance for the help; KernelAbstractions.jl is a great idea, and we would really like to make it work.




Here is the MWE:

using KernelAbstractions
using oneAPI


struct MyStruct
    a::Float64
    b::Vector{Float64}
end

MS = MyStruct(1.0, [2.0, 3.0])


@kernel function my_kernel(A, MS)
    I = @index(Global)
    A[I] = 2 * A[I] + MS.b[1]
end

#backend = CPU()                       # This works
backend = oneAPI.oneAPIBackend()     # This doesn't

A = KernelAbstractions.ones(backend, Float64, 1024, 1024)
ev = my_kernel(backend, 64)(A, MS, ndrange=size(A))
KernelAbstractions.synchronize(backend)

println( "Test: ", all(A .== 4.0) ? "Passed" : "Failed" )

which gives:

   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.7 (2024-11-26)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("./KernelAbstraction_struct.jl")     # with "backend = CPU()"
Test: Passed

julia> include("./KernelAbstraction_struct.jl")     # with "backend = oneAPI.oneAPIBackend()"

ERROR: LoadError: GPU compilation of MethodInstance for gpu_my_kernel(::KernelAbstractions.CompilerMetadata{…}, ::oneDeviceMatrix{…}, ::MyStruct) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type MyStruct, which is not isbits:
  .b is of type Vector{Float64} which is not isbits.


Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/validation.jl:92
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:128 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/Lw5SP/src/TimerOutput.jl:253 [inlined]
  [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:126
  [5] codegen
    @ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:115 [inlined]
  [6] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:111
  [7] compile
    @ ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:103 [inlined]
  [8] #58
    @ ~/.julia/packages/oneAPI/1GTs3/src/compiler/compilation.jl:81 [inlined]
  [9] JuliaContext(f::oneAPI.var"#58#59"{GPUCompiler.CompilerJob{GPUCompiler.SPIRVCompilerTarget, oneAPI.oneAPICompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:52
 [10] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/driver.jl:42
 [11] compile(job::GPUCompiler.CompilerJob)
    @ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/compilation.jl:80
 [12] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(oneAPI.compile), linker::typeof(oneAPI.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/execution.jl:128
 [13] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/execution.jl:103
 [14] macro expansion
    @ ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:203 [inlined]
 [15] macro expansion
    @ ./lock.jl:267 [inlined]
 [16] zefunction(f::typeof(gpu_my_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, oneDeviceMatrix{…}, MyStruct}}; kwargs::@Kwargs{})
    @ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:198
 [17] zefunction(f::typeof(gpu_my_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, oneDeviceMatrix{…}, MyStruct}})
    @ oneAPI ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:195
 [18] macro expansion
    @ ~/.julia/packages/oneAPI/1GTs3/src/compiler/execution.jl:66 [inlined]
 [19] (::KernelAbstractions.Kernel{…})(::oneArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
    @ oneAPI.oneAPIKernels ~/.julia/packages/oneAPI/1GTs3/src/oneAPIKernels.jl:89
@vchuravy
Member

You will need to add a rule with https://github.com/JuliaGPU/Adapt.jl to define how to translate your struct across the CPU-GPU boundary. The @adapt_structure macro may be sufficient.
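
A minimal sketch of what such a rule could look like for the MyStruct from the MWE, assuming the struct is redefined with a parametric field type so it can hold either a host vector or a device array:

using Adapt
using oneAPI

# Parametric field, so the same struct can wrap a Vector{Float64} on the
# host or the device-side array type inside the kernel.
struct MyStruct{V<:AbstractVector{Float64}}
    a::Float64
    b::V
end

# Defines Adapt.adapt_structure(to, x::MyStruct), which rebuilds the struct
# with every field passed through adapt(to, field).
Adapt.@adapt_structure MyStruct

MS = MyStruct(1.0, oneArray([2.0, 3.0]))   # b is a device array handle
# At kernel launch the struct is adapted once more: b becomes the isbits
# device-side array, so the whole struct is isbits and can be passed.

With this in place, the MWE's call my_kernel(backend, 64)(A, MS, ndrange=size(A)) should compile, since the launch-time adapt pass can now reach the inner array.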
