multi-GPU parallel operation

Hi,

I just started testing on a dual-GPU box (a Dell H2C with dual 8800 GTX cards, 680i chipset, running 32-bit Ubuntu), and I can’t seem to get both GPUs to work in parallel. I have two threads, each working on a compute-bound task on a different device. However, the behavior I get is similar to what I see with a single GPU: each thread is busy about half the time and appears to be waiting for the GPU for the other half.

I did call cudaSetDevice() first in each thread, and I called cudaGetDevice() to confirm that each thread has a different device number. The task involves virtually no I/O off the card, although there are a number of device-to-device copies and many different kernels running in sequence.
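For reference, here is a stripped-down sketch of my thread setup (using pthreads; worker() and its body are just stand-ins for the real task):

#include <stdio.h>
#include <pthread.h>
#include <cuda_runtime.h>

/* Stand-in for my real per-device task. */
void *worker(void *arg)
{
    int want = *(int *)arg;
    int got  = -1;

    cudaSetDevice(want);   /* called before any other CUDA call in this thread */
    cudaGetDevice(&got);   /* confirm the binding took */
    printf("thread asked for device %d, got device %d\n", want, got);

    /* ... device-to-device copies and the kernel sequence go here ... */
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };

    pthread_create(&t[0], NULL, worker, &ids[0]);
    pthread_create(&t[1], NULL, worker, &ids[1]);
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}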

Is there any way to determine whether the two threads are really executing on two GPUs, as opposed to sharing one?

Are there any calls I could be making that inadvertently make one thread wait for the other?

If anyone has suggestions, please let me know.

I’d suggest querying the device properties for the two GPUs and verifying that the properties you get back match the two GPUs you’re using (if the cards aren’t identical models, it’d be easy to tell them apart).
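Something along these lines would do it (a quick sketch; which property fields you print doesn’t really matter):

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("device %d: %s, %lu bytes global mem, compute %d.%d\n",
               dev, prop.name, (unsigned long)prop.totalGlobalMem,
               prop.major, prop.minor);
    }
    return 0;
}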

Another thing to check is whether your thread code has a mutex in it that’s effectively preventing both threads from running concurrently. Another trick would be to have your first thread allocate all of the GPU memory, then query and print the amount of free GPU memory from both threads, as a way of determining whether each one got to the correct device, etc. Some of these you’d have to do with the driver API, but you get the idea…
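For the free-memory trick, a bare driver-API sketch might look like the following (cuMemGetInfo() reports free/total memory for the calling thread's current context; report() and the thread plumbing are just illustrative):

#include <stdio.h>
#include <pthread.h>
#include <cuda.h>

/* Each thread creates its own context on its own device and reports
   the free/total memory it sees there. If thread 0 has grabbed a big
   allocation first, the free counts should differ between devices,
   which tells you the contexts really landed on different GPUs. */
void *report(void *arg)
{
    int ordinal = *(int *)arg;
    CUdevice dev;
    CUcontext ctx;
    size_t free_b, total_b;   /* older toolkits use unsigned int here */

    cuDeviceGet(&dev, ordinal);
    cuCtxCreate(&ctx, 0, dev);
    cuMemGetInfo(&free_b, &total_b);
    printf("device %d: %lu bytes free of %lu\n",
           ordinal, (unsigned long)free_b, (unsigned long)total_b);
    cuCtxDestroy(ctx);
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };

    cuInit(0);
    pthread_create(&t[0], NULL, report, &ids[0]);
    pthread_create(&t[1], NULL, report, &ids[1]);
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}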

Cheers,

John Stone

Thanks for the suggestions. I have confirmed via cuMemGetInfo() that the two threads are allocating their memory on different devices. So it looks like the execution is split between the two GPUs as it’s supposed to be, except that the two GPUs aren’t both processing at the same time.

This would seem to indicate an inter-thread sync issue, but my two threads do not communicate with each other at all; they just get created at the start of execution and process separately, on two different data sets. That makes me think that CUDA is doing some kind of synchronization behind the scenes. Do any of the CUDA utility functions or macros (like cudaThreadSynchronize() or CUDA_SAFE_CALL()) actually sync across both GPUs instead of just the context from which they’re called? Or is this perhaps a driver issue?
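One direct test I could run: wall-clock the busy interval in each thread around a long dummy kernel. If the two intervals overlap, the GPUs really are running concurrently; if they come out back-to-back, something is serializing them. A rough sketch (spin(), worker(), and the iteration count are made up for illustration):

#include <stdio.h>
#include <pthread.h>
#include <sys/time.h>
#include <cuda_runtime.h>

/* Busy-loop kernel just to keep a GPU occupied for a while. */
__global__ void spin(float *x, int iters)
{
    float v = x[threadIdx.x];
    for (int i = 0; i < iters; ++i)
        v = v * 1.000001f + 0.5f;
    x[threadIdx.x] = v;
}

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

void *worker(void *arg)
{
    int dev = *(int *)arg;
    float *d = NULL;

    cudaSetDevice(dev);
    cudaMalloc((void **)&d, 256 * sizeof(float));

    double t0 = now();
    spin<<<1, 256>>>(d, 50000000);
    cudaThreadSynchronize();   /* should block on this thread's context only */
    double t1 = now();

    printf("device %d busy from %.3f to %.3f (%.3f s)\n",
           dev, t0, t1, t1 - t0);
    cudaFree(d);
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };

    pthread_create(&t[0], NULL, worker, &ids[0]);
    pthread_create(&t[1], NULL, worker, &ids[1]);
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}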

There shouldn’t be any syncing between CUDA contexts. Can you try your code without the cutil macros? A while ago I was able to drive four GPUs (two Tesla D870s) from a single box and didn’t run into the problem you’re describing.
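For example, a plain checking macro like this (names illustrative) keeps the error reporting but takes cutil out of the picture entirely:

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err = (call);                                 \
        if (err != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                    cudaGetErrorString(err));                     \
            exit(1);                                              \
        }                                                         \
    } while (0)

/* usage:
     CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice));
   and after a kernel launch:
     CHECK(cudaGetLastError());                                   */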

Paulius

I don’t have any of these problems either, and I’ve been doing multi-GPU code for over a year, starting with the early beta versions. I presume there’s something specific to his kernel that’s creating the problem…

John