AMBER Archive (2006)
Subject: AMBER: parallel pmemd with intel 9 fc
From: bala (bala_at_igib.res.in)
Dear Amber users,
I am using Amber8 and running simulations on a cluster with "pmemd" on 64 processors, submitting jobs through the bsub command. The simulation stops partway through the run. I have pasted below the errors I got in three different runs of the same job. I checked the Intel website for the runtime error code [forrtl: severe (174): SIGSEGV, segmentation fault occurred]. It says that this can happen if the program attempts an invalid memory reference. Kindly suggest how I can get rid of this problem.
My input file is given below:
&cntrl
imin = 0, irest = 1, ntx = 7,
ntb = 2, pres0 = 1.0, ntp = 1,
cut = 10, ntr = 0,
ntc = 2, ntf = 2,
tempi = 300.0, temp0 = 300.0,
ntt = 3, gamma_ln = 1.0,
nstlim = 250000, dt = 0.002,
ntpr = 100, ntwx = 100
/
Error files
Error-file 1:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libvapi.so 0000002A96BF74AF Unknown Unknown Unknown
srun: error: n31: task49: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
---------------------------------------------------------------------------------
Error-file 2:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
pmemd 00000000004480F6 Unknown Unknown Unknown
pmemd 000000000044A114 Unknown Unknown Unknown
pmemd 000000000045F1E3 Unknown Unknown Unknown
pmemd 00000000004051B6 Unknown Unknown Unknown
libc.so.6 0000002A95E20197 Unknown Unknown Unknown
pmemd 00000000004050EA Unknown Unknown Unknown
srun: error: n26: task41: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
-----------------------------------------------------------------------------------
Error-file 3:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
pmemd 00000000004480F6 Unknown Unknown Unknown
pmemd 000000000044A114 Unknown Unknown Unknown
pmemd 000000000045F1E3 Unknown Unknown Unknown
pmemd 00000000004051B6 Unknown Unknown Unknown
libc.so.6 0000002A95E20197 Unknown Unknown Unknown
pmemd 00000000004050EA Unknown Unknown Unknown
srun: error: n26: task41: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
-----------------------------------------------------------------------
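To compare the three error files, the srun lines can be tallied by node and task to see whether the failures cluster on particular nodes. The following is only a minimal Python sketch and was not part of the original runs; the function name `tally_failures` and the regular expression are assumptions based solely on the srun lines pasted above.

```python
import re
from collections import Counter

# Matches lines such as "srun: error: n31: task49: Exited with exit code 174".
# The pattern is an assumption derived from the srun lines in the error files.
SRUN_ERROR = re.compile(r"srun: error: (\w+): task(\d+): Exited with exit code (\d+)")


def tally_failures(log_text):
    """Count (node, task, exit_code) triples found in srun error lines."""
    return Counter(
        (node, int(task), int(code))
        for node, task, code in SRUN_ERROR.findall(log_text)
    )


# srun lines copied from the three error files, in order.
sample = """\
srun: error: n31: task49: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
srun: error: n26: task41: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
srun: error: n26: task41: Exited with exit code 174
srun: Terminating job
srun: error: n1: task0: Exited with exit code 174
"""

counts = tally_failures(sample)
for (node, task, code), n in sorted(counts.items()):
    print(f"{node} task{task} exit {code}: {n} run(s)")
```

In these three runs the tally shows task0 on n1 in every file, task41 on n26 in two files, and task49 on n31 in one. The task0 line follows "srun: Terminating job" each time, so it may only reflect the job being torn down rather than a separate crash.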