Hello,
I’m currently developing a CUDA kernel to perform ray tracing over voxel data. The kernel works fine, and so far I have it writing the image into a simple array which is then copied back to the CPU and displayed with glDrawPixels().
Obviously all this memory copying ruins performance, so I want to generate the image on the GPU and leave it in graphics memory to be displayed directly.
Chapter 8 of the book CUDA by Example includes a basic example of this, which I tried to copy. However, when I run the code I get a segmentation fault and no useful error messages.
According to the CUDA documentation, the old OpenGL interoperability API is now deprecated, and the docs don’t describe a situation where this crash might happen.
Has anyone had this problem before and can share a solution? Or is there a better way to link CUDA and OpenGL code together?
I googled for a solution but couldn’t find one, especially as it’s just this one function causing problems. I also can’t find the newer API for CUDA interoperability with OpenGL.
Details:
GPU: NVIDIA 9800 GTX+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Wed_Nov__3_16:16:57_PDT_2010
Cuda compilation tools, release 3.2, V0.2.1221
build command: nvcc -c cuda_code.cu -arch sm_11
OpenGL: 2.1
OS: Ubuntu Linux, 64-bit
Relevant code:
pixels = new unsigned char[FLAGS_screenw*FLAGS_screenh]; // for copying GPU generated frame into
memset(pixels, 0, FLAGS_screenw*FLAGS_screenh);
cudaDeviceProp prop;
HANDLE_ERROR(cudaGetDevice(&dev));
LOG("ID of current CUDA device: %d", dev);
memset(&prop, 0, sizeof(cudaDeviceProp));
prop.major = 1;
prop.minor = 1;
HANDLE_ERROR(cudaChooseDevice(&dev, &prop));
HANDLE_ERROR(cudaGLSetGLDevice(dev)); // <---- from book
LOG("ID of CUDA device closest to revision %d.%d: %d", 1,1,dev);
HANDLE_ERROR(cudaSetDevice(dev));
LOG("Allocating memory in GPU");
HANDLE_ERROR(cudaMalloc((void**)&dev_world,
    sizeof(VOXEL) * kVOXELLENGTH * kVOXELHEIGHT * kVOXELWIDTH));
HANDLE_ERROR(cudaMalloc((void**)&dev_screen,
    sizeof(unsigned char) * FLAGS_screenw * FLAGS_screenh));
LOG("Allocating GL buffers");
glGenBuffers(1, &pixel_buffer);
LOGGLERROR(); // Macro for giving opengl errors
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pixel_buffer);
LOGGLERROR();
glBufferData(GL_PIXEL_UNPACK_BUFFER,
    sizeof(unsigned char) * FLAGS_screenw * FLAGS_screenh,
    NULL, GL_DYNAMIC_DRAW);
LOGGLERROR();
if (resource == NULL) LOG("NULL");
HANDLE_ERROR(cudaGraphicsGLRegisterBuffer(&resource, pixel_buffer,
    cudaGraphicsMapFlagsNone));
LOG("CUDA Init complete");
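For context, once registration succeeds my plan is to map the buffer each frame, let the kernel write into it, unmap it, and draw it with glDrawPixels(). Roughly this sketch (render_kernel, grid, and block are placeholder names, not my actual code):

```cuda
// Per-frame interop flow I am aiming for. Assumes 'resource' was
// successfully registered by the init code above.
unsigned char* dev_ptr;
size_t size;

// Map the PBO into CUDA's address space and get a device pointer.
HANDLE_ERROR(cudaGraphicsMapResources(1, &resource, 0));
HANDLE_ERROR(cudaGraphicsResourceGetMappedPointer((void**)&dev_ptr,
                                                  &size, resource));

// Kernel writes the frame directly into the pixel buffer object.
render_kernel<<<grid, block>>>(dev_ptr, dev_world,
                               FLAGS_screenw, FLAGS_screenh);

// Unmap before OpenGL touches the buffer again.
HANDLE_ERROR(cudaGraphicsUnmapResources(1, &resource, 0));

// With the PBO still bound to GL_PIXEL_UNPACK_BUFFER, the data
// argument to glDrawPixels is an offset into the buffer (0 here),
// not a host pointer.
glDrawPixels(FLAGS_screenw, FLAGS_screenh, GL_LUMINANCE,
             GL_UNSIGNED_BYTE, 0);
```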
The output I get is:
[cuda_code.cu:234] ID of current CUDA device: 0
[cuda_code.cu:240] ID of CUDA device closest to revision 1.1: 0
[cuda_code.cu:242] Allocating memory in GPU
[cuda_code.cu:247] Allocating GL buffers
[cuda_code.cu:255] NULL
Segmentation fault
As you can see, the cudaGraphicsGLRegisterBuffer() call fails with a segmentation fault, and I can’t get any error code out of it.
Thank you in advance for your help,
Josh
CUDA Documentation: CUDA Toolkit Documentation