GPU Matrices
BIDMat currently has GPU versions of FMat, IMat and SMat, which are respectively GMat, GIMat and GSMat. Most operators and matrix functions defined for CPU matrices also work on GPU matrices. These operations should also be defined on the Mat superclass, which allows generic code to be written once and run on either the CPU host or the GPU, with either sparse or dense data.
e.g. the inner loop of LDA (Latent Dirichlet Allocation) looks like this:

```scala
def uupdate(sdata:Mat, user:Mat, ipass:Int):Unit = {
  for (i <- 0 until opts.uiter) {
    val preds = DDS(mm, user, sdata)
    val dc = sdata.contents
    val pc = preds.contents
    max(opts.weps, pc, pc)
    pc ~ dc / pc
    val unew = user ∘ (mm * preds) + opts.alpha
    if (opts.exppsi) exppsi(unew, unew)
    user <-- unew
  }
}
```
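The point about generic code can be sketched with a small example. Here `normalize` is a hypothetical helper (not part of BIDMat), written against the Mat superclass so the same code runs on an FMat or a GMat; this is a sketch under the assumption that BIDMat treats a 1x1 matrix result of `sum` as a scalar in operators:

```scala
import BIDMat.{Mat, FMat, GMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._

// Hypothetical helper: scales a matrix so its entries sum to 1.
// sum(sum(a)) is a 1x1 Mat holding the grand total of a's entries.
def normalize(a:Mat):Mat = a / sum(sum(a))

val cpu = rand(4, 6)        // an FMat on the host
val gpu = GMat(cpu)         // the same data on the GPU
val n1 = normalize(cpu)     // computed on the CPU
val n2 = normalize(gpu)     // identical code, computed on the GPU
```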
The matrix `sdata` can be either sparse or dense, and CPU- or GPU-based. `DDS` returns the product of `mm` and `user` at the non-zeros of `sdata`, which for a dense `sdata` is just the full product.
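As a minimal sketch of the two cases (the sizes below are illustrative assumptions: k topics, m features, n documents, with `mm` and `user` sharing their first dimension so the product is formed from inner products of their columns):

```scala
import BIDMat.{Mat, FMat, SMat, GMat, GSMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._

val k = 20; val m = 100; val n = 50
val mm      = rand(k, m)          // model matrix
val user    = rand(k, n)          // per-document topic weights
val sdense  = rand(m, n)          // dense data
val ssparse = sprand(m, n, 0.1)   // sparse data with ~10% non-zeros

val pdense  = DDS(mm, user, sdense)   // dense sdata: the full m x n product
val psparse = DDS(mm, user, ssparse)  // sparse sdata: the product only at its non-zeros
```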
You can convert to GPU matrices with constructors for each type, e.g. `GMat(a)` constructs a GMat from an FMat source (and returns its argument if `a` is already a GMat). Similarly, `GIMat(mi)` and `GSMat(s)` construct GIMats and GSMats respectively from IMat or SMat arguments. GPU matrices should have the same toString as the corresponding CPU matrix, so they look the same when returned from interactive commands, e.g.:
```
> val a = rand(4,6)
a: BIDMat.FMat =
   0.67636   0.15675   0.43748  0.081511   0.46293  0.097704
   0.31279   0.69485   0.91233   0.87120   0.12652   0.71330
   0.10547   0.88596   0.58793   0.90858   0.45308   0.45136
   0.83065   0.84817  0.080891  0.022294   0.73676   0.14168

> GMat(a)
res12: BIDMat.GMat =
   0.67636   0.15675   0.43748  0.081511   0.46293  0.097704
   0.31279   0.69485   0.91233   0.87120   0.12652   0.71330
   0.10547   0.88596   0.58793   0.90858   0.45308   0.45136
   0.83065   0.84817  0.080891  0.022294   0.73676   0.14168
```
You can access elements of GPU matrices with indices, e.g. `a(0,0)`, but element access is normally only useful for debugging or interactive exploration. Pulling single elements across the CPU/GPU boundary is very expensive, and will normally nullify any benefit from computing on the GPU.
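Because of that cost, the usual pattern is to do the whole computation on the GPU and pull the result back in a single bulk transfer. A hedged sketch, assuming an `FMat(g)` constructor that copies a GMat back to the host (mirroring the `GMat(a)` direction described above):

```scala
import BIDMat.{FMat, GMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._

val g = GMat(rand(4, 6))   // data on the GPU
val h = g * g.t            // compute entirely on the GPU
val back = FMat(h)         // one bulk GPU-to-CPU transfer
println(back(0, 0))        // element access is now cheap, on the host copy
```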
GIMats support block indexing, e.g. for integer (GIMat) matrices `ii` and `jj`, you can access a block of a GMat `aa` as `aa(ii,jj)`.
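A short sketch of block indexing (the sizes and index values are illustrative assumptions; `irow` builds an integer row vector on the CPU, which `GIMat(...)` then moves to the GPU):

```scala
import BIDMat.{GMat, GIMat, IMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._

val aa = GMat(rand(8, 8))        // an 8 x 8 GMat
val ii = GIMat(irow(0, 2, 4))    // row indices, as a GIMat
val jj = GIMat(irow(1, 3))       // column indices, as a GIMat
val block = aa(ii, jj)           // the 3 x 2 submatrix at those rows and columns
```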