

I’ve encountered something weird, and am unsure whether it is a bug, misuse of the hardware, or just a misunderstanding. I’ve searched for similar problems, but no suggestion seemed to help.

When attempting to run a kernel from the host, we get this:

    CUDA error while running kernel: /home/Velo/History/2013/.dat, err: out of memory

Now, I have no idea how it got that name for the kernel, but that seems beside the point. The following line fails, never reaching the first lines of code inside the kernel:

    CalculationKernel<<<...>>>(arrLen, arr, kernelData, results, log);

The point where the code fails is upon entry to the kernel itself, not within it.
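For reference, this is roughly how the failure surfaces at the call site (a minimal sketch; the argument types, wrapper function, and launch configuration are placeholders, not the real code):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Sketch of the call site; types and launch config are placeholders.
    void launchCalculation(int arrLen, float* arr, void* kernelData,
                           float* results, char* log, dim3 grid, dim3 block) {
        CalculationKernel<<<grid, block>>>(arrLen, arr, kernelData, results, log);

        // A failed launch (error 0x2 == cudaErrorMemoryAllocation) is reported
        // by cudaGetLastError immediately, before the kernel body ever runs.
        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA error while running kernel: %s\n",
                    cudaGetErrorString(err));
        }
    }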

The system worked prior to a few changes. Aside from processing logic, the changes affected the pre-allocated memory and added one statically defined boolean inside the kernel. All memory for the kernel is pre-allocated before the call. The error persists even when the amount of free memory, per cudaMemGetInfo, is up to 900 MB. I’m not at liberty to post the contents of the program, and it is very long.

When debugging, I also see the following error:

    Warning: Cuda API error detected: cudaLaunch returned (0x2)

According to the documentation, error 0x2 means the API call failed because it was unable to allocate enough memory to perform the requested operation.
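The 900 MB figure comes from a check along these lines (a minimal sketch of the cudaMemGetInfo call, not our exact logging code):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Report free vs. total device memory as seen by the runtime.
    void reportDeviceMemory() {
        size_t freeBytes = 0, totalBytes = 0;
        if (cudaMemGetInfo(&freeBytes, &totalBytes) == cudaSuccess) {
            printf("device memory: %zu MB free of %zu MB\n",
                   freeBytes / (1024 * 1024), totalBytes / (1024 * 1024));
        }
    }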
Therefore, I dug a little deeper and noticed that the error (0x2) might be caused by the kernel’s shared memory requirement, but I haven’t found any way to determine how much shared memory my kernel actually uses. I find it hard to believe our kernel managed to gobble up nearly a GB worth of space.
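If I understand the runtime API correctly, cudaFuncGetAttributes can answer this; a sketch, assuming the CalculationKernel symbol is visible in the same translation unit (compiling with nvcc --ptxas-options=-v also prints per-kernel shared memory and register usage at build time):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Query the kernel's compile-time resource usage. sharedSizeBytes is the
    // static shared memory per block; dynamic shared memory is whatever the
    // third <<<>>> launch parameter requests on top of that.
    void reportKernelResources() {
        cudaFuncAttributes attr;
        if (cudaFuncGetAttributes(&attr, CalculationKernel) == cudaSuccess) {
            printf("shared: %zu B, local: %zu B, const: %zu B, registers: %d\n",
                   attr.sharedSizeBytes, attr.localSizeBytes,
                   attr.constSizeBytes, attr.numRegs);
        }
    }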

As an aside on the cudaLaunch name: function names that begin with cuda are part of the CUDA runtime API, not the CUDA driver API. Functions that begin with cu (but not cuda) are part of the driver API. There is no need to explicitly link against the driver API/library to use cuda functions.
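To illustrate the naming split (a minimal sketch allocating the same buffer through both APIs; only the driver-API path needs cuInit and an explicit context):

    #include <cuda.h>          // driver API: cu* functions, link with -lcuda
    #include <cuda_runtime.h>  // runtime API: cuda* functions (nvcc links cudart by default)

    int main() {
        // Runtime API: context creation is implicit.
        void* runtimePtr = nullptr;
        cudaMalloc(&runtimePtr, 1 << 20);

        // Driver API: explicit initialization, device, and context.
        cuInit(0);
        CUdevice dev;
        cuDeviceGet(&dev, 0);
        CUcontext ctx;
        cuCtxCreate(&ctx, 0, dev);
        CUdeviceptr driverPtr;
        cuMemAlloc(&driverPtr, 1 << 20);
        return 0;
    }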
