pT+H compiling
Dear all,
We are trying to compile the parallel version of TOUGH+HYDRATE (pT+H). So far we have been able to build the executables on the supercomputer. The supercomputer provides MPICH rather than OpenMPI, and the compiler wrappers are cc and ftn instead of mpicc, mpifort, etc. However, when we run the test files (i.e., Test_2DX in the attachment), we get a runtime error.
Runtime error on the supercomputer:
At line 294 of file Parallel_subs.f
Fortran runtime error: Allocatable actual argument 'part' is not allocated
srun: error: nid00198: task 1: Exited with exit code 2
srun: Terminating job step 3624227.0
slurmstepd: error: *** STEP 3624227.0 ON nid00197 CANCELLED AT 2022-10-17T18:15:08 ***
srun: error: nid00197: task 0: Terminated
srun: Force Terminated job step 3624227.0
We get the same error on other supercomputers and on our Dell tower desktop (Linux) with OpenMPI. When we start a run, the executable reads the input file and generates the INCON, OUT, Plot Coord, and other related files, but the simulation then aborts without writing any data to Hydrate Status and the other output files. We tried different compilers and other options with our HPC team at USGS but got the same error as above.
Could you please help us with this runtime error?
11 replies
-
The error is pretty strange. I am pretty sure that "part" is allocated in the code. Did you accidentally delete the source line that allocates "part", and are you linking to METIS correctly? The error occurs in the section that partitions the mesh using METIS.
By the way, are you compiling the code in debug mode (the error message includes a source line number)? Debug mode is significantly slower than release mode.
-
You can add the following line to the file Parallel_subs.f right before line 294, then recompile the source code and try again:
if (myid .NE. iMaster) allocate(part(1))
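To illustrate why a placeholder allocation like this can silence the check, here is a minimal sketch, not taken from the pT+H source. It assumes the array is passed to a routine with a non-allocatable dummy argument. The routine name partition_mesh is hypothetical, MPI ranks are simulated with a plain loop, and the guard uses allocated(part) instead of the rank test above so that the sketch stays self-contained.

```fortran
! Minimal sketch (no MPI needed): myid and iMaster are plain integers here,
! and partition_mesh is a hypothetical stand-in for the routine called near line 294.
program part_alloc_sketch
  implicit none
  integer, parameter :: iMaster = 0
  integer :: myid
  integer, allocatable :: part(:)

  do myid = 0, 1                      ! pretend to be rank 0, then rank 1
    if (myid == iMaster) then
      allocate(part(4))               ! only the master holds the real partition array
      part = 0
    end if
    ! Placeholder so that non-master ranks also pass an allocated array
    if (.not. allocated(part)) allocate(part(1))
    call partition_mesh(part, myid)
    deallocate(part)
  end do

contains

  subroutine partition_mesh(p, rank)
    integer, intent(inout) :: p(:)    ! non-allocatable dummy: the actual argument must be allocated
    integer, intent(in)    :: rank
    if (rank == iMaster) p = 1        ! only the master touches the data
    print *, 'rank', rank, 'size(part) =', size(p)
  end subroutine partition_mesh

end program part_alloc_sketch
```

If the placeholder line is removed and the sketch is built with gfortran and -fcheck=all, the call on the simulated non-master rank should trigger the same kind of "is not allocated" runtime error.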
-
Can you remove the options "-fcheck=all -Wall" from the makefile and try again? The error is raised for arrays that are not used on certain CPUs and are therefore not allocated on those CPUs. The executable should not be forced to perform this check at run time.
-
I can run your example with my own version of the code on Google Cloud. Which version of the code are you using?
-
I am not familiar with V1.5. The TOUGH website lists V1.0 as the latest parallel version.
-
If you like, you can send me the package by email, and I will do the comparison. I am the original developer of pT+H, but I have not touched it for a while. My email is kzhang at LBL dot gov