I have 64 threads per block; more threads per block does not make sense for my problem. I use __syncthreads heavily. My kernel uses about 16 registers. I can use either 2000 bytes of shared memory or the full 16 KB (which gives faster performance).
Can somebody tell me the maximum number of blocks I can invoke? Is it limited by the total number of registers? That would mean 8192/16 = 512??
The maximum number of blocks is 65535 in each grid dimension (pg 74 of the programming guide). The CUDA driver will run as many of them simultaneously as possible, and how many that is depends on register and shared memory usage. So not all blocks are guaranteed to be running at the same time; some will be queued, waiting for other blocks to finish.
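For example, a launch like the following sketch is perfectly legal even though only a handful of those blocks can be resident at once (kernel name and sizes here are purely illustrative, not taken from the question):

```cpp
#include <cuda_runtime.h>

// Dummy kernel: each thread doubles one element.
__global__ void myKernel(float *data)
{
    int idx = blockIdx.y * gridDim.x * blockDim.x
            + blockIdx.x * blockDim.x + threadIdx.x;
    data[idx] = 2.0f * data[idx];
}

int main()
{
    dim3 block(64);        // 64 threads per block, as in the question
    dim3 grid(65535, 4);   // ~262k blocks total, far more than run concurrently
    float *d_data;
    cudaMalloc(&d_data, 65535u * 4u * 64u * sizeof(float));
    myKernel<<<grid, block>>>(d_data);
    cudaDeviceSynchronize();   // blocks are scheduled in waves until all finish
    cudaFree(d_data);
    return 0;
}
```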
I understand that the number of registers has an influence. But what about shared memory? Does that mean shared memory will be swapped, or will some of it be left unused? Remember, I use 64 threads per block.
Only the occupancy depends on the number of registers, block size, and shared memory usage. None of these influence the total number of blocks you can execute, which is 65535*65535.
Shared memory is not swapped. Each block gets its own exclusive section of shared memory from the start of the block’s execution to the end. With your 2000 bytes of shared memory usage, there may be some left unused: use the occupancy calculator spreadsheet to determine this.
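To put the arithmetic from this thread together, here is a rough back-of-the-envelope sketch (assuming an older SM with 8192 registers and 16 KB of shared memory, and ignoring allocation granularity and the hardware cap on resident blocks per SM, which is exactly what the occupancy spreadsheet accounts for):

```cpp
#include <stdio.h>

int main()
{
    // Assumed per-SM resources (compute 1.x class hardware, as implied by
    // the 8192-register / 16 KB figures in this thread).
    const int regsPerSM       = 8192;
    const int smemPerSM       = 16384;  // bytes
    // Numbers from the question.
    const int threadsPerBlock = 64;
    const int regsPerThread   = 16;
    const int smemPerBlock    = 2000;   // bytes (the 2000-byte variant)

    int byRegs = regsPerSM / (threadsPerBlock * regsPerThread); // 8192 / 1024 = 8
    int bySmem = smemPerSM / smemPerBlock;                      // 16384 / 2000 = 8
    int blocksPerSM = byRegs < bySmem ? byRegs : bySmem;

    printf("blocks per SM limited by registers:     %d\n", byRegs);
    printf("blocks per SM limited by shared memory: %d\n", bySmem);
    printf("concurrent blocks per SM (estimate):    %d\n", blocksPerSM);

    // With the full-16 KB-per-block variant, bySmem drops to 1, so only one
    // block is resident per SM and most of that SM's registers sit idle.
    return 0;
}
```

Note this is only an estimate: real hardware also limits the number of resident blocks and threads per SM, and register/shared memory allocations are rounded up to a granularity, so the spreadsheet remains the authoritative answer.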