Hello!
I’m a newbie in CUDA. How can I use several devices at the same time in CUDA code? I use the latest version, CUDA 4.0.
Thanks for your attention.
Hi,
One option is to use one thread per GPU.
Excuse me, it isn’t clear to me. Do you have an example? Suppose that I want to use ex1_kernel on 2 GPUs.
I do this, but it seems that the execution is sequential rather than simultaneous:
dim3 dimBlock(BlockSize);
dim3 dimGrid((n+BlockSize-1)/BlockSize);
dim3 dimGrida((nnz+BlockSize-1)/BlockSize);
dim3 dimGridTest(65535);
cudaGetDevice(0);
ex1_kernel<<<dimGridTest, dimBlock >>>(…variables allocated on device(0)…);
cutilSafeCall(cudaMemcpy(…cudaMemcpyDeviceToHost));
cudaGetDevice(1);
ex1_kernel<<<dimGridTest, dimBlock >>>(…variables allocated on device(1)…);
cutilSafeCall(cudaMemcpy(…cudaMemcpyDeviceToHost));
Thank you.
Hi,
When I say thread, I mean a host thread on the CPU side. Each thread handles its own GPU.
You need to pick a threading library that you want to use.
In CUDA 3.2 there is an SDK example, “Simple multi-GPU”, that could be helpful. Maybe the CUDA 4.0 SDK has the same.
In CUDA 4.0 I know that it is possible to switch contexts within the same thread, but I can’t give you any help in that case. I have not tried 4.0 yet.
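To make the one-thread-per-GPU idea concrete, here is a rough, untested sketch using pthreads as the thread library. ex1_kernel, the buffer names, and the sizes below are just placeholders (not your actual variables), and error checking is left out:

#include <cuda_runtime.h>
#include <pthread.h>
#include <stdio.h>

__global__ void ex1_kernel(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;                    // placeholder work
}

struct GpuJob { int device; int n; };

static void *worker(void *arg)
{
    GpuJob *job = (GpuJob *)arg;
    cudaSetDevice(job->device);                 // bind this host thread to its own GPU

    size_t bytes = job->n * sizeof(float);
    float *d_buf, *h_buf;
    cudaMalloc(&d_buf, bytes);
    h_buf = (float *)malloc(bytes);

    dim3 dimBlock(256);
    dim3 dimGrid((job->n + dimBlock.x - 1) / dimBlock.x);
    ex1_kernel<<<dimGrid, dimBlock>>>(d_buf, job->n);
    cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);   // copy result back

    cudaFree(d_buf);
    free(h_buf);
    return NULL;
}

int main(void)
{
    pthread_t threads[2];
    GpuJob jobs[2] = { {0, 1 << 20}, {1, 1 << 20} };

    // One host thread per GPU: both kernels run at the same time,
    // instead of one after the other as in a single-threaded loop.
    for (int i = 0; i < 2; ++i)
        pthread_create(&threads[i], NULL, worker, &jobs[i]);
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);

    printf("done\n");
    return 0;
}

The important part is that each host thread calls cudaSetDevice before doing any CUDA work, so each thread gets its own device and the two GPUs work concurrently.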
OK, thanks for your help.