Cheat sheet
__device__
: executed on the device. Callable from the device only.
__global__
: executed on the device. Callable from the host, or from the device on devices of compute capability 3.x or higher. Must have a void return type.
__host__
: executed on the host. Callable from the host only (equivalent to declaring the function without any qualifiers).
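The three qualifiers can be combined in one small file, as in the sketch below. The names square, add_one and transform are hypothetical. Putting __host__ and __device__ on the same function, which the list above does not cover, asks the compiler to build it for both sides.

```cuda
#include <cuda_runtime.h>

// Device-only helper: callable from kernels and other device functions.
__device__ float square(float x) { return x * x; }

// Both qualifiers together: compiled for the host and for the device.
__host__ __device__ float add_one(float x) { return x + 1.0f; }

// Kernel: runs on the device, is launched from the host, returns void.
__global__ void transform(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = add_one(square(data[i]));
    }
}
```

Calling square from host code would be rejected by the compiler, while add_one can be called from either side.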
gridDim.[x,y,z]
: 3-dimensional vector (type dim3) containing the dimensions of the grid, measured in blocks. It is constant for the duration of a kernel and is set at kernel launch time. Any dimension not set explicitly defaults to 1.
blockIdx.[x,y,z]
: 3-dimensional vector (type uint3) containing the index of the current block within the grid. Its value differs from block to block.
blockDim.[x,y,z]
: 3-dimensional vector (type dim3) containing the dimensions of the thread block, measured in threads. It is set at kernel launch time. Any dimension not set explicitly defaults to 1.
threadIdx.[x,y,z]
: 3-dimensional vector (type uint3) containing the index of the current thread within its thread block. Its value differs from thread to thread.
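The four built-in variables are most often combined to give each thread a unique global index. The kernel below is a minimal sketch of that pattern for a 1-dimensional launch, written as a grid-stride loop so it also works when the grid has fewer threads than elements. The name fill_with_index is hypothetical.

```cuda
#include <cuda_runtime.h>

// Each thread starts at its global index and advances by the total number
// of threads in the grid until the whole array has been covered.
__global__ void fill_with_index(int *out, int n) {
    int first  = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    int stride = gridDim.x * blockDim.x;                 // threads in the whole grid
    for (int i = first; i < n; i += stride) {
        out[i] = i;
    }
}
```

For 2-dimensional data the same formula is applied per axis, using the .y components for the row and the .x components for the column.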
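Kernel_name<<< gridsize, blocksize >>>(arg1, arg2, …);
: launches the kernel on a grid of gridsize blocks, each containing blocksize threads. No return type is written at the call site; gridsize and blocksize can be integers or dim3 values.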
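A common way to choose the two launch parameters is to fix the block size and round the grid size up so that every element gets a thread. The sketch below assumes n > 0 and a block size of 256, which is a frequent choice rather than a requirement. The names increment, blocks_for and launch_increment are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element of data.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {               // guard: the last block may have idle threads
        data[i] += 1;
    }
}

// Smallest number of blocks such that blocks * threads_per_block >= n.
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

// Launches increment over n elements that already live in device memory.
void launch_increment(int *d_data, int n) {
    int blocksize = 256;                       // assumed block size
    int gridsize  = blocks_for(n, blocksize);  // rounded up to cover n
    increment<<<gridsize, blocksize>>>(d_data, n);
    // For 2D or 3D shapes, dim3 values can be passed instead, for example
    // dim3 block(16, 16); components left out default to 1.
}
```

The bounds check in the kernel matters because rounding up usually launches more threads than there are elements.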
cudaMalloc( void **devPtr, size_t size );
: allocates size bytes of device memory and stores the resulting device pointer in *devPtr.
cudaFree( void *devPtr );
: frees device memory previously allocated with cudaMalloc.
cudaMemcpy( void *dst, const void *src, size_t size, enum cudaMemcpyKind kind );
: copies size bytes from src to dst, between host and device.
kind is an enum giving the direction of the copy. Common values:
- cudaMemcpyHostToDevice
- cudaMemcpyDeviceToHost
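The three memory calls are typically used together: allocate on the device, copy in, copy out, free. The function below is a minimal sketch of that round trip with no kernel in between. The name round_trip and the use of std::vector on the host side are illustrative choices.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Copies input to the device and back into output.
// Returns true only if every runtime call reported cudaSuccess.
bool round_trip(const std::vector<float> &input, std::vector<float> &output) {
    size_t bytes = input.size() * sizeof(float);
    float *d_buf = nullptr;

    // cudaMalloc writes the device address into d_buf.
    if (cudaMalloc((void **)&d_buf, bytes) != cudaSuccess) {
        return false;
    }

    output.resize(input.size());
    // Destination comes first, source second, as with memcpy.
    bool ok =
        cudaMemcpy(d_buf, input.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(output.data(), d_buf, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(d_buf);  // release the device buffer whether or not the copies worked
    return ok;
}
```

A device pointer such as d_buf must not be dereferenced in host code; it is only meaningful to the runtime calls and to kernels.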
cudaGetLastError(void);
: returns the last error produced by a runtime call and resets the error state to cudaSuccess.
const char* cudaGetErrorString( cudaError_t code );
: returns the description string for an error code.
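Since a kernel launch returns no error code, the two error functions are often wrapped in a small checking macro. The sketch below is one possible version. CUDA_CHECK, noop and checked_launch are hypothetical names, and cudaDeviceSynchronize, which is not listed above, is used here to wait for the kernel so that errors raised during execution are reported too.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: prints file, line and error text, then exits,
// if a runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",             \
                    __FILE__, __LINE__, cudaGetErrorString(err_));   \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Empty kernel, used only to have something to launch.
__global__ void noop() {}

void checked_launch() {
    noop<<<1, 1>>>();
    CUDA_CHECK(cudaGetLastError());       // reports errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran
}
```

The same macro can wrap cudaMalloc, cudaMemcpy and cudaFree, since they all return cudaError_t.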