Hi,
My group is working on porting a CFD code to GPU with OpenACC. As a first step, we wanted to compile and run the code on the CPU with nvfortran, but we ran into issues.
On a Linux x86 system, the code hits a segmentation fault when built with nvfortran -g -Ktrap=fp -r8 -O0. In some tests, it reports the line number where it encounters an array index out-of-bounds error, but the displayed array bounds make no sense to me:
0: Subscript out of range for array mhdflux_v (FaceFlux.f90: 2983)
subscript=2, lower bound=1640695287988, upper bound=1640695287990, dimension=1
In other tests, it does not show a line number, only messages like:
[ny01:09070] *** Process received signal ***
[ny01:09070] Signal: Segmentation fault (11)
[ny01:09070] Signal code: (128)
[ny01:09070] Failing at address: (nil)
The code that is causing this issue is related to the usage of derived types in Fortran. We have a large derived type declared in one module and used in another module, with a mixture of scalars and vectors that looks like
type, public :: Param
  integer :: iLeft, jLeft, kLeft
  integer :: iRight, jRight, kRight
  integer :: iBlockFace
  integer :: iDimFace
  integer :: iFluidMin = 1, iFluidMax = nFluid
  integer :: iVarMin = 1, iVarMax = nVar
  integer :: iEnergyMin = nVar+1, iEnergyMax = nVar + nFluid
  integer :: iFace, jFace, kFace
  real :: CmaxDt
  real :: Area2, AreaX, AreaY, AreaZ, Area = 0.0
  real :: DeltaBnL, DeltaBnR
  real :: DiffBb ! (1/4)(BnL-BnR)^2
  real :: StateLeft_V(nVar)
  real :: StateRight_V(nVar)
  real :: FluxLeft_V(nVar+nFluid), FluxRight_V(nVar+nFluid)
  real :: Normal_D(3), NormalX, NormalY, NormalZ
  real :: Tangent1_D(3), Tangent2_D(3)
  real :: B0n, B0t1, B0t2
  real :: UnL, Ut1L, Ut2L, B1nL, B1t1L, B1t2L
  real :: UnR, Ut1R, Ut2R, B1nR, B1t1R, B1t2R
  real :: MhdFlux_V(RhoUx_:RhoUz_)
  real :: MhdFluxLeft_V(RhoUx_:RhoUz_)
  real :: MhdFluxRight_V(RhoUx_:RhoUz_)
  real :: Enormal
  real :: Unormal_I(nFluid+1) = 0.0
  real :: UnLeft_I(nFluid+1)
  real :: UnRight_I(nFluid+1)
  real :: EtaJx, EtaJy, EtaJz, Eta
  real :: InvDxyz, HallCoeff
  real :: HallJx, HallJy, HallJz
  logical :: UseHallGradPe = .false.
  real :: BiermannCoeff, GradXPeNe, GradYPeNe, GradZPeNe
  real :: DiffCoef, EradFlux=0.0, RadDiffCoef
  real :: HeatFlux, IonHeatFlux, HeatCondCoefNormal
  real :: bCrossArea_D(3) = 0.0
  real :: B0x=0.0, B0y=0.0, B0z=0.0
  real :: ViscoCoeff
  logical :: IsBoundary
  real :: InvClightFace, InvClight2Face
  logical :: DoTestCell = .false.
  logical :: IsNewBlockVisco = .true.
  logical :: IsNewBlockGradPe = .true.
  logical :: IsNewBlockCurrent = .true.
  logical :: IsNewBlockHeatCond = .true.
  logical :: IsNewBlockIonHeatCond = .true.
  logical :: IsNewBlockRadDiffusion = .true.
  logical :: IsNewBlockAlfven = .true.
end type Param
An object of this derived type is passed between several subroutines to set the parameters and intermediate values.
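A reduced sketch of this pattern might look like the following. It is not the actual code: the module names, the routine set_flux, and the values chosen for nVar and nFluid are placeholders for illustration. Only the component names and the 2:4 range of MhdFlux_V are taken from the type above, and the sketch has not been confirmed to reproduce the failure.

```fortran
! Reduced sketch only. Module names, set_flux, and the parameter
! values below are hypothetical placeholders, not the real code.
! Possible build line with bounds checking (file name is a placeholder):
!   nvfortran -g -C -O0 repro.f90
module ModFaceParam
  implicit none
  private
  integer, parameter, public :: nVar = 8, nFluid = 1     ! assumed values
  integer, parameter, public :: RhoUx_ = 2, RhoUz_ = 4   ! gives the 2:4 range

  ! Cut-down version of Param: scalars and arrays mixed, some components
  ! default-initialized, and one array with a lower bound other than 1.
  type, public :: Param
     integer :: iVarMin = 1, iVarMax = nVar
     real    :: Area = 0.0
     real    :: StateLeft_V(nVar)
     real    :: MhdFlux_V(RhoUx_:RhoUz_)
     real    :: Unormal_I(nFluid+1) = 0.0
     logical :: IsNewBlockVisco = .true.
  end type Param
end module ModFaceParam

module ModFaceFlux
  use ModFaceParam
  implicit none
contains
  ! The object is passed in and one array component is accessed
  ! through an associate block, as in the failing code path.
  subroutine set_flux(p)
    type(Param), intent(inout) :: p
    integer :: iVar
    associate(MhdFlux_V => p%MhdFlux_V)
      do iVar = RhoUx_, RhoUz_
         MhdFlux_V(iVar) = real(iVar)
      end do
    end associate
  end subroutine set_flux
end module ModFaceFlux

program repro
  use ModFaceParam
  use ModFaceFlux
  implicit none
  type(Param) :: p

  call set_flux(p)
  ! A conforming compiler should report bounds 2 and 4 here.
  print *, lbound(p%MhdFlux_V, 1), ubound(p%MhdFlux_V, 1)
  print *, p%MhdFlux_V
end program repro
```

Since the associate name takes its lower bound from the selector, indexing MhdFlux_V from RhoUx_ to RhoUz_ inside the block stays within the declared range.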
One of the arrays with declared range 2:4 in this derived type caused the issue. I have tried several different approaches to resolve this issue:
- turn off OpenMP
- use local array (copy) instead of pointer to the vectors
- direct call with p%MhdFlux_V, etc., without using the associate block
- change vector range from 2:4 to 1:3
- move the vectors into a separate type declaration
However, none of these worked. An older version of this module that does not use derived types compiles and runs without issue, which suggests that the problem is tied to the use of the derived type.
With -O2 or above, the code does not generate a runtime error, but the results are wrong. We have confirmed that the same code has no issue with gfortran, nagfor and ifort. We have also run valgrind with gcc, and it showed no memory issues.