среда, 1 октября 2025 г.

addresses of cuda kernel functions

 Quote from official document:

It is not allowed to take the address of a __device__ function in host code

I haven't been surprised for a long time that entire CUDA is made up of ridiculous restrictions. What if I told you that paranoid nvidia lies as usually and actually you can get addresses of kernel functions in your host code?

But first lets check what workarounds we can employ to have functions pointers. I don't know for what pedagogical purpose this code intentionally was written so poorly and does not free the allocated memory - and now millions of brainless artificial idiots will copy-paste it forever, so I made patched version. You can realize that attempt to read from early gathered with cudaMemcpyFromSymbol addresses will results error 1 (invalid argument)

Ok. but we could just return address of function directly from another kernel function, right? So I made quick & dirty hack
I brute-forced all combinations of cf1(__device__/__constant__) & variants of cudaMemcpyFromSymbol/cudaMemcpy - and with no luck
So it's time to run

cuda-gdb