Hi,
I’m trying to compile a simple loop whose iteration bounds are not fixed at compile time:
#pragma acc region
{
    #pragma acc for
    for (int y = ytl; y < ybr; y++) {
        for (int x = xtl; x < xbr; x++) {
            float v1 = a[x + y*width];
            float v2 = b[x + y*width];
            sum += (v1 - v2);
        }
    }
}
However, the compiler exits with an internal compiler error:
PGC-F-0000-Internal compiler error. unknown reference 15 (test.c: 46)
PGC/x86-64 Linux 10.0-0: compilation aborted
The loop is inside a function that ends at line 46.
When I replace xtl with a constant, the compiler generates working CUDA code. Is there any way to work around this issue?
Best regards,
Richard
Hi Richard,
I tried to recreate the ICE but was unable to. If you could send example code that reproduces the error to PGI customer service ([email protected]), I would appreciate it.
What I did find was that I needed to specify the bounds of the a and b arrays in a copyin clause. You may need to do the same. For example:
% cat test.c
#include <stdio.h>
#include <accel.h>
float foo (float *a, float *b, int xtl, int ytl, int ybr, int xbr, int width) {
    float sum = 0.0f;  /* initialize the reduction variable */
    #pragma acc region copyin(a[0:xtl], b[0:xtl])
    {
        for (int y = ytl; y < ybr; y++) {
            for (int x = xtl; x < xbr; x++) {
                float v1 = a[x + y*width];
                float v2 = b[x + y*width];
                sum += (v1 - v2);
            }
        }
    }
    return sum;
}
% pgcc -c test.c -Msafeptr -ta=nvidia -V10.0 -Minfo=accel
foo:
      9, Generating copyin(b[:xtl])
         Generating copyin(a[:xtl])
     11, Loop is parallelizable
         Accelerator kernel generated
         11, #pragma acc for parallel, vector(256)
         15, Sum reduction generated for sum
     12, Loop is parallelizable
Hope this helps,
Mat
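A note on choosing the copyin bounds: the loop nest reads a and b at indices x + y*width, so the largest index touched is (xbr-1) + (ybr-1)*width, and the copied range probably needs to reach at least that far. The following is only a sketch under some assumptions: it assumes the PGI Accelerator C subarray notation a[lower:upper] with inclusive bounds (worth checking against the documentation for your compiler release), and the names last_index, diff_sum, and the USE_PGI_ACCEL macro are made up for illustration. The pragma is guarded so the same code also builds and runs on the host with an ordinary C compiler.

```c
#include <assert.h>

/* Hypothetical helper: largest flat index read by the loop nest
 *   for (y = ytl; y < ybr; y++) for (x = xtl; x < xbr; x++) a[x + y*width]
 * Assumes a non-empty rectangle that fits inside one row of `width`. */
static int last_index(int xbr, int ybr, int width)
{
    assert(xbr > 0 && ybr > 0 && width >= xbr);
    return (xbr - 1) + (ybr - 1) * width;
}

/* Hypothetical variant of foo() whose copyin range covers every element
 * the loops read. */
float diff_sum(const float *a, const float *b,
               int xtl, int ytl, int xbr, int ybr, int width)
{
    float sum = 0.0f;
    int last = last_index(xbr, ybr, width);
    (void)last;  /* only used by the pragma; avoids a warning in host builds */

/* USE_PGI_ACCEL is an assumed macro: define it when building with
 * an accelerator-enabled compiler, e.g. pgcc -ta=nvidia. */
#ifdef USE_PGI_ACCEL
#pragma acc region copyin(a[0:last], b[0:last])
#endif
    {
        for (int y = ytl; y < ybr; y++) {
            for (int x = xtl; x < xbr; x++) {
                sum += a[x + y * width] - b[x + y * width];
            }
        }
    }
    return sum;
}
```

For example, with width = 4 and the rectangle xtl = 1, ytl = 1, xbr = 3, ybr = 3, the loops read indices 5, 6, 9 and 10, and last_index(3, 3, 4) gives 10. Copying from index 0 moves more data than strictly needed; starting the range at xtl + ytl*width would tighten it.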
Hi Mat,
Using the copyin clause was the right hint!
I still had some problems because I stored some parameters in structs, which produced the same error message. Using plain variables instead solved the problem!
Richard
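The struct workaround described above could look roughly like this: copy the struct members into plain local scalars before the accelerator region, so that the loop bounds and the copyin clause refer only to scalars. The thread does not show the original struct, so the type rect_t, the function diff_sum_rect, and the USE_PGI_ACCEL macro are hypothetical, and the inclusive a[lower:upper] subarray notation is again an assumption to check against your compiler's documentation.

```c
#include <assert.h>

/* Hypothetical parameter struct; the original layout is not shown in the thread. */
typedef struct {
    int xtl, ytl;  /* top-left corner (inclusive) */
    int xbr, ybr;  /* bottom-right corner (exclusive) */
    int width;     /* row stride of the arrays */
} rect_t;

float diff_sum_rect(const float *a, const float *b, const rect_t *r)
{
    assert(r != 0);

    /* Copy the struct members into plain scalars first, so that nothing
     * inside the region dereferences the struct. */
    const int xtl = r->xtl, ytl = r->ytl;
    const int xbr = r->xbr, ybr = r->ybr;
    const int width = r->width;
    const int last = (xbr - 1) + (ybr - 1) * width;  /* largest index read */
    float sum = 0.0f;
    (void)last;  /* only used by the pragma; avoids a warning in host builds */

/* USE_PGI_ACCEL is an assumed macro, defined only for accelerator builds. */
#ifdef USE_PGI_ACCEL
#pragma acc region copyin(a[0:last], b[0:last])
#endif
    {
        for (int y = ytl; y < ybr; y++) {
            for (int x = xtl; x < xbr; x++) {
                sum += a[x + y * width] - b[x + y * width];
            }
        }
    }
    return sum;
}
```

A call would then pass the struct by pointer, for instance diff_sum_rect(a, b, &r) with r = {1, 1, 3, 3, 4}, while the region itself only ever sees the local scalars.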