Error 'Program received signal SIGSEGV: Segmentation fault' after increasing MGTAB

Hi, 

I increased the MGTAB value in the T2 file to 9000 and recompiled the code. However, when running I receive this error message:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7FF622C8A667
#1  0x7FF622C8AC34
#2  0x7FF62224719F
#3  0x40C886 in multi_ at t2fm.f:4201
#4  0x408A5D in cycit_ at t2fm.f:3077
#5  0x4248FF in tough2 at t2fm.f:741

The problem only occurs in the dynamic run. The static run works without problems.

I don't know if this is relevant, but I compile the code (on a Linux machine) with:

 gfortran -o eos1_n3 -g -fdefault-real-8 -fno-align-commons eos1.f t2fm.f t2cg22.f meshm.f t2f.f t2solv.f

I do get some warnings of the type:

    Warning: Named COMMON block 'c9' at (1) shall be of the same size as elsewhere       (5 vs 3000000 bytes)
      t2fm.f:5573.21:

      COMMON/FMOLDIF/FDIF(1)
                     1

Does anyone have a suggestion on how to solve this?

Thanks a lot in advance!

Katrijn

  • Does nobody have any experience with this?

  • Katrijn,

    You may be exceeding the memory limit. Try adding the following to the compile options:

    -mcmodel=medium

    You can ignore the warnings about COMMON block size (make sure array bounds checking is disabled).

    Stefan

  • Dear Stefan,

    Thank you for your response. I added this option (and also tried -mcmodel=large), but unfortunately it didn't help.

    My impression is that the problem does not lie in building the executable with the enlarged MGTAB value (the new executable runs smaller problems without any errors), but is linked to the program itself.

    We are now working around this issue by splitting the model into a number of shorter time periods, using the output of each run as the initial conditions for the next run.

    Katrijn

  • Katrijn,

    Try to compile using the following options to get more information (line 4201 is not related to GENER):

    -g -fbacktrace -fdefault-real-8  -falign-commons -fno-automatic -finit-local-zero -mcmodel=medium

    and if it works, try to remove the -g.

    Stefan

  • Dear Stefan,

    Thank you for the suggestion. I added these compiler options. The exact compilation string I now use is:

    gfortran -o eos1_n -g -fbacktrace -fdefault-real-8 -falign-commons -fno-automatic -finit-local-zero -mcmodel=medium eos1.f t2fm.f t2cg22.f meshm.f t2f.f t2solv.f

    An executable was created (with some COMMON block warnings of the type 'Warning: Padding of 4 bytes required before 'rp' in COMMON 'rpcap' at (1); reorder elements or use -fno-align-commons').

    Running this executable again resulted in a segmentation fault (this time already in the static model), but with a different backtrace:

    Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

    Backtrace for this error:
    #0  0x2B627CB0C667
    #1  0x2B627CB0CC34
    #2  0x2B627D50719F
    #3  0x40D622 in multi_ at t2fm.f:4202
    #4  0x4090DB in cycit_ at t2fm.f:3077
    #5  0x42710F in tough2 at t2fm.f:741
    Segmentation fault

    When compiling without the -g, the error message when executing the code is the following:

    Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

    Backtrace for this error:
    #0  0x2AD36D99B667
    #1  0x2AD36D99BC34
    #2  0x2AD36E39619F
    #3  0x40D622 in multi_
    #4  0x4090DB in cycit_
    #5  0x42710F in MAIN__ at t2fm.f:0

    So no success yet.

    Katrijn

  • Katrijn,

    If I had the input file (and the time), I would debug this by going to t2fm.f and adding a write statement just before Line 4202 to figure out why N1 or N2 is either 0 or a huge number.

    Stefan

  • Dear Stefan,

    Before posting in this forum, I took this problem to our IT department, and they tried exactly that. However, if I remember correctly, the program quit just before this line, so we never managed to get a printout of this information.
    Do you perhaps have any tricks we could try to overcome this?

    Katrijn

  • Katrijn,

    If it crashes on Line 4202, it will get to the debug printout line, but you may not see it in the output file, because it's still in the output buffer and you lose it during the crash. To avoid this, add a line with

          STOP

    after the debug printout line but before Line 4202, so it will just stop (not crash) and give you the printout. Of course, if the issue is before Line 4202, you have to move backwards with this strategy.
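
    A minimal sketch of this write-then-stop pattern, assuming gfortran and assuming N1 and N2 are integers. The program name DBGDEMO and the assigned values are hypothetical stand-ins, not TOUGH2 code. The FLUSH statement (Fortran 2003, accepted by gfortran) is an additional way to push the buffered line out before the run ends:

```fortran
! Hypothetical stand-alone demo; not part of t2fm.f.
      PROGRAM DBGDEMO
      INTEGER N1, N2
! Placeholder values standing in for the real indices.
      N1 = 0
      N2 = 123456789
! Debug printout to unit 6 (standard output under gfortran).
      WRITE(6,*) 'DEBUG before 4202: N1 =', N1, ' N2 =', N2
! Force the buffered line out so a later crash cannot lose it.
      FLUSH(6)
! Stop cleanly instead of running on into the crashing line.
      STOP
      END
```

    In t2fm.f itself, only the WRITE, FLUSH and STOP lines would be needed, placed just before Line 4202. If the real printout goes to the output file rather than the screen, the unit number would have to be changed to match.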

    None of what is happening near Line 4202 has anything to do with MGTAB, except if memory is overwritten somewhere (which is hard to track in the old TOUGH2 version because you cannot check for array bound errors; this issue is resolved in TOUGH and iTOUGH2).

    Final wild guess:

    In case N1 and/or N2 turns out to be a funny number (i.e., below 1 or very high), try adding -finit-local-zero to the gfortran compile options (even though this would not help with a memory overwrite).

    If somebody has time to look at Katrijn's issue, please invite her to send you her input file.

    Good luck!

    Stefan
