Hi!
I have nested loops. I marked the inner loops with “!$acc loop seq”, but the compiler output looks as if the compiler ignores this directive. I was trying to write a small program to reproduce another bug I ran into:
program test
  integer, parameter :: SZ = 12800000
  integer :: i,j,k
  real*8 :: d(SZ)
  real*8 :: tmp(128)
  d(:) = 0.0
  !$acc kernels loop private(tmp) independent
  do i=0,((SZ/128)-1)
    !$acc loop seq
    do j=1,128
      tmp(j) = 1.0/j
    enddo
    !$acc loop seq
    do j=1,128
      d(i*128+j) = tmp(j)*3.1415
    enddo
  enddo
  !$acc end kernels
  print *, 'sum = ', sum(d)
end program
Output:
pgi$ pgfortran -acc -Minfo test.f90
test:
     13, Generating present_or_copy(d(:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     14, Loop is parallelizable
     16, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         16, CC 1.3 : 15 registers; 20 shared, 48 constant, 0 local memory bytes
             CC 2.0 : 17 registers; 0 shared, 48 constant, 0 local memory bytes
     20, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         20, CC 1.3 : 10 registers; 24 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 17 registers; 0 shared, 44 constant, 0 local memory bytes
     26, sum reduction inlined
The result is correct.
Any ideas? Should I use “acc parallel loop” instead of “kernels”?
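For reference, here is a minimal sketch of what the “acc parallel loop” variant of the same program might look like. It is untested with this compiler version. The program name test_parallel and the explicit copy(d) clause are my assumptions and are not part of the program above.

program test_parallel
  implicit none
  integer, parameter :: SZ = 12800000
  integer :: i, j
  real*8 :: d(SZ)
  real*8 :: tmp(128)

  d(:) = 0.0

  ! Assumption: "parallel loop" distributes only the outer i loop;
  ! private(tmp) is meant to give each executing thread its own copy of tmp.
  ! copy(d) is an assumed explicit data clause (copy in, then copy back out).
  !$acc parallel loop private(tmp) copy(d)
  do i = 0, (SZ/128) - 1
    ! Inner loops are requested to run sequentially within each outer iteration.
    !$acc loop seq
    do j = 1, 128
      tmp(j) = 1.0/j
    enddo
    !$acc loop seq
    do j = 1, 128
      d(i*128 + j) = tmp(j)*3.1415
    enddo
  enddo
  !$acc end parallel loop

  print *, 'sum = ', sum(d)
end program

Without -acc the directives are treated as comments and the program runs serially, so the printed sum can be compared between the two builds.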
Alexey