`docs/src/lecture_11/lab.md` (+28 −13)
````diff
@@ -75,8 +75,8 @@ using Metal
 x = randn(Float32, 60, 60)
 y = randn(Float32, 60, 60)
 
-mx = MtlArray(x)
-my = MtlArray(y)
+mx = CuArray(x)
+my = CuArray(y)
 
 @info "" x*y ≈ Matrix(mx*my)
````
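The hunk above only exercises `*`, but the same dispatch mechanism covers broadcasting and reductions as well. The following is a small sketch of that idea (my own example, assuming a working CUDA.jl installation; variable names are hypothetical):

```julia
using CUDA

x  = randn(Float32, 60, 60)
mx = CuArray(x)

# Broadcasting fuses into a single GPU kernel; no explicit kernel code needed.
my = sin.(mx) .^ 2 .+ cos.(mx) .^ 2

# Reductions such as `sum` also dispatch to GPU implementations.
total = sum(my)

# Compare against the CPU result after the device computation.
@info "" total ≈ sum(sin.(x) .^ 2 .+ cos.(x) .^ 2)
```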
````diff
@@ -86,7 +86,7 @@ my = MtlArray(y)
 This may not be anything remarkable, as such functionality is available in many other languages
 albeit usually with a less mathematical notation like `x.dot(y)`. With Julia's multiple dispatch, we
 can simply dispatch the multiplication operator/function `*` to a specific method that works on
-`MtlArray` type. You can check with `@code_typed`:
+`CuArray` type. You can check with `@code_typed`:
 ```julia
 julia> @code_typed mx * my
 CodeInfo(
````
````diff
@@ -124,7 +124,7 @@ Let's now explore what the we can do with this array programming paradigm on som
 # rgb_img = FileIO.load("image.jpeg");
 # gray_img = Float32.(Gray.(rgb_img));
 gray_img = rand(Float32, 10000, 10000)
-cgray_img = MtlArray(gray_img)
+cgray_img = CuArray(gray_img)
 ```
 
 **HINTS**:
````
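To illustrate what the array programming paradigm buys us on such an image, here is a hedged sketch of a purely broadcast-based device computation (brightness scaling is my own example, not taken from the lab):

```julia
using CUDA

gray_img  = rand(Float32, 10_000, 10_000)
cgray_img = CuArray(gray_img)

# Scale brightness and clamp to the valid [0, 1] range, entirely on the device,
# without writing any explicit kernel.
brightened = clamp.(cgray_img .* 1.5f0, 0.0f0, 1.0f0)

# Transfer back to the host, e.g. for saving with FileIO.
result = Array(brightened)
```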
````diff
@@ -222,7 +222,7 @@ In the next example we will try to solve a system of linear equations $Ax=b$, wh
 
 **BONUS 1**: Visualize the solution `x`. What may be the origin of our linear system of equations?
 
-**BONUS 2**: Use sparse matrix `A` to achieve the same thing. Can we exploit the structure of the matrix for a more effective solution?
+**BONUS 2**: Use sparse matrix `A` to achieve the same thing. Can we exploit the structure of the matrix for a more effective solution? Be aware though that `\` is not implemented for sparse structures by default.
 
 !!! details "Solution"
     ```julia
````
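Since `\` is not available for sparse GPU structures by default, one possible workaround (a sketch, not the lab's reference solution) is an iterative method built only from matvec, `dot`, and broadcasts, all of which dispatch to GPU methods. This assumes `A` is symmetric positive definite, as a discretized Laplacian would be:

```julia
using LinearAlgebra

# Conjugate gradients using only A*p, dot products and broadcasts, so the
# same code runs on plain Arrays, CuArrays, or CUSPARSE sparse matrices.
function cg(A, b; maxiter = 1000, tol = 1f-6)
    x  = zero(b)
    r  = copy(b)            # residual b - A*x with x = 0
    p  = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Ap = A * p
        α  = rs / dot(p, Ap)
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return x
end
```

On the GPU the same function should work when called with a `CUDA.CUSPARSE.CuSparseMatrixCSR(A)` and a `CuArray` right-hand side, since only `*`, `dot` and broadcasting are used.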
````diff
@@ -323,14 +323,28 @@ int main() {
 Compared to CUDA C the code is less bloated, while having the same functionality.[^4]
 ```julia
 function vadd(a, b, c)
-    # CUDA.jl
-    # i = (blockIdx().x-1) * blockDim().x + threadIdx().x
+    i = (blockIdx().x-1) * blockDim().x + threadIdx().x
+    c[i] = a[i] + b[i]
+    return
+end
 
-    # Metal.jl
+len = 100
+a = rand(Float32, len)
+b = rand(Float32, len)
+d_a = CuArray(a)
+d_b = CuArray(b)
+d_c = similar(d_a)
+@cuda threads = len vadd(d_a, d_b, d_c)
+c = Array(d_c)
+```
+
+In `Metal.jl` for Apple silicon
+```julia
+function vadd(a, b, c)
     i = thread_position_in_grid_1d()
-    c[i] = a[i] + b[i]
+    c[i] = a[i] + b[i]
 
-    return
+    return
 end
 
 len = 100
````
````diff
@@ -451,7 +465,8 @@ It's important to stress that we only schedule the kernel to run, however in ord
 - or a command to copy result to host (`Array(c)`), which always synchronizes kernels beforehand
 
 !!! warning "Exercise"
-    Fix the `vadd` kernel such that it can work with different launch configurations, such as
+    Fix the `vadd` kernel such that it can work with different launch configurations, i.e. even if the launch configuration does not correspond to the length of arrays, it will not crash.
````
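For reference, one common way to make a kernel safe under any launch configuration (a sketch of one possible fix, assuming CUDA.jl; the lab's own solution may differ) is a bounds check that retires out-of-range threads:

```julia
using CUDA

function vadd(a, b, c)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    # Guard: threads past the end of the arrays simply do nothing,
    # so any threads/blocks combination is safe.
    if i <= length(c)
        c[i] = a[i] + b[i]
    end
    return
end

len = 100
d_a = CuArray(rand(Float32, len))
d_b = CuArray(rand(Float32, len))
d_c = similar(d_a)

# 2 blocks × 64 threads = 128 threads for 100 elements — no crash.
@cuda threads = 64 blocks = 2 vadd(d_a, d_b, d_c)
```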