
Runtime error with TOUGH2-MP (Allocatable variable BCELEM)

Hello everyone,

I'm new to TOUGH2-MP, although we have had the TOUGH2 code for some time. We received the TOUGH2-MP code just a few weeks ago. I tried to compile it on our Linux cluster and had some difficulties.

I also don't know anything about METIS and Aztec. I hope compiling METIS (4.0.3) and Aztec (2.1) worked: I did get libaztec.a and libmetis.a, so I assume the builds succeeded.

Allocatable variable BCELEM

After successfully compiling TOUGH2-MP (Linux, Intel 12.0-64, OpenMPI 1.4.4) with the flags

ifort -O0 -fpp -r8 -i4 -check all -g -traceback

I get the following error message during runtime:

forrtl: severe (408): fort: (8): Attempt to fetch from allocatable variable BCELEM when it is not allocated

Image              PC                Routine            Line        Source
t2eos7_mp_intel    00000000007E1A8A  Unknown               Unknown  Unknown
t2eos7_mp_intel    00000000007E0605  Unknown               Unknown  Unknown
t2eos7_mp_intel    00000000007922C6  Unknown               Unknown  Unknown
t2eos7_mp_intel    0000000000750CA5  Unknown               Unknown  Unknown
t2eos7_mp_intel    00000000007510F9  Unknown               Unknown  Unknown
t2eos7_mp_intel    00000000005DB2E2  allreplicom_             2583  Paral_Subs.f
t2eos7_mp_intel    00000000004C04FC  cycit_                    240  Main_Comp.f
t2eos7_mp_intel    00000000004EF633  MAIN__                    477  TOUGH2.f
t2eos7_mp_intel    0000000000404A9C  Unknown               Unknown  Unknown
libc.so.6          00007F11A016AC36  Unknown               Unknown  Unknown
t2eos7_mp_intel    0000000000404999  Unknown               Unknown  Unknown

When I use GCC 4.3.4 instead of the Intel compiler, with the flags

gfortran -O0 -fdefault-real-8 -i4 -g -fbacktrace -Wall

I just get the error message:

Program received signal 11 (SIGSEGV): Segmentation fault.
../t2eos7_mp_gnu.sh: line 11: 28431 Segmentation fault

Any idea what the problem could be?

Holger

4 replies

    • George_Pau
    • 10 yrs ago

    Holger,

    Could you send the errors directly to Noel Keen (ndkeen@lbl.gov)? He can help you identify the problem.

    George

    • Holger_Seher
    • 10 yrs ago

    After some time I tried to replicate the error I described above. I did not manage to. Maybe they changed something on our cluster. I don't know. So I can't give a solution to this problem.

    Thanks for the help anyways.

    Holger

    • Holger_Seher
    • 10 yrs ago

    After testing different compiler flags last week, I found a solution to the problem from my original post. I also want to present some results that could be useful for both users and developers. It appears to me that the present code version needs some fixes for uninitialized values. It should also be made more robust against the choice of compiler options.

    Here are my findings:

    EOS7 and EOS7R with Intel compiler (12.0-64, Linux)

    The runtime error mentioned in my original post (Attempt to fetch from allocatable variable BCELEM when it is not allocated) occurs for both EOS7 and EOS7R if the flag

    -check all
    

    is set. With any of the following flag sets, TOUGH2-MP runs without error:

    ifort -O0 -fpp -r8 -i4 -g -traceback
    
    ifort -O0 -fpp -r8 -i4
    
    ifort -O0 -r8 -i4
    

    EOS7R with GCC (gfortran) compiler (4.3.6, Linux)

    A segmentation fault occurs regardless of which of these flag sets I use:

    gfortran -O0 -fdefault-real-8 -i4
    
    gfortran -O0 -fdefault-real-8 -i4 -g -fbacktrace
    
    gfortran -O0 -fdefault-real-8 -i4 -g -fbacktrace -Wall
    

    The segmentation fault disappears if the following compiler option is set:

    -fno-automatic
    

    EOS7 with GCC (gfortran) compiler (4.3.6, Linux)

    NaN values appear in the TOUGH output after the second time step:

        36   (    1, 1) ST = 0.100000E-08 DT = 0.100000E-08 DX1= 0.000000E+00 DX2= 0.000000E+00 T =  25.000 P =   100000. S = 0.100000E+01
    
         1   (    2, 3) ST = 0.200000E+01 DT = 0.200000E+01 DX1=          NaN DX2=          NaN T =     NaN P =       NaN S = 0.100000E-05
    

    The NaN errors disappear if either or both of the following compiler options are set:

    -fno-automatic
    
    -finit-local-zero
    

    Discussion

    Using the gfortran -Wall option, I noticed some "uninitialized value" warnings. Since EOS7 ran without error when built with the Intel compiler, this suggested that the two compilers might treat uninitialized values differently (for example, by setting them to 0 or to NaN).

    Solutions were found using the options -fno-automatic and -finit-local-zero. The compiler option -fno-automatic treats each program unit as if the SAVE statement were specified for every local variable and array referenced in it. Consequently, these variables get static storage, which is typically zero at program start, and they keep their values between calls. The option -finit-local-zero has a similar effect by a different route: it instructs the compiler to initialize local INTEGER, REAL, and COMPLEX variables to zero.

    With these compiler options, I seem to get the correct TOUGH behavior with correct calculation results. However, a user cannot tell which of the two initialization methods (SAVE or =0) is the intended one. I therefore suggest that the code be revised so that the model state is well defined regardless of the compiler options used.
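
The difference between the two options can be illustrated with a minimal Fortran sketch. It is not taken from the TOUGH2-MP sources; the program name init_demo, the function accumulate, and the variable work are hypothetical. As written, the local variable is initialized explicitly, so the result is the same with any compiler and any flags.

```fortran
! Hypothetical example, not taken from the TOUGH2-MP sources.
program init_demo
  implicit none
  integer :: i

  do i = 1, 3
     print *, 'call', i, ' sum =', accumulate(1.0d0)
  end do

contains

  function accumulate(x) result(total)
    double precision, intent(in) :: x
    double precision :: total
    double precision :: work   ! local scratch variable

    ! Explicit initialization. Without this line, the value of work is
    ! undefined on entry and the result depends on compiler and flags.
    work = 0.0d0

    work = work + x
    total = work
  end function accumulate

end program init_demo
```

If the line `work = 0.0d0` were removed, -finit-local-zero would be expected to zero work on every call, whereas SAVE semantics (as with -fno-automatic) would let the value carry over from one call to the next. The two options can therefore produce different results for the same code, which is why explicit initialization in the source is the more robust fix.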

    It also should be clarified why the Intel compiler produces a runtime error in debug mode.
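
A hedged observation on the Intel question: -check all enables ifort's runtime checks, including the one that reports a read from an allocatable array that has not been allocated. The message therefore probably points at a real access that goes unnoticed in builds without the check. The thread does not show the code at line 2583 of Paral_Subs.f, so the following is only a minimal sketch of the usual guard. The array name mirrors the error message; the program name alloc_guard, the counter nbc, and the allocation condition are assumptions for illustration.

```fortran
! Hypothetical example; only the array name is borrowed from the error
! message. The logic does not come from the TOUGH2-MP sources.
program alloc_guard
  implicit none
  integer, allocatable :: bcelem(:)
  integer :: nbc

  ! Assumption: the array is only allocated when there is something to store.
  nbc = 0
  if (nbc > 0) allocate(bcelem(nbc))

  ! Test the allocation status before reading the array.
  if (allocated(bcelem)) then
     print *, 'entries: ', size(bcelem)
  else
     print *, 'bcelem is not allocated, skipping'
  end if

end program alloc_guard
```

Whether the access in TOUGH2-MP is harmless or a real defect can only be decided by looking at the source at the location given in the traceback.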

    I also sent an email to Noel Keen with these results.

    Holger

    • George_Pau
    • 10 yrs ago

    Hi Holger,

    You are right. We always use a compiler flag that initializes all numbers to zero.  This initialization issue will be dealt with in the future.
