AMBER Archive (2007)

Subject: RE: AMBER: amber 9 sander crashed with "forrtl: severe (174): SIGSEGV, segmentation fault occurred"

From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Tue Dec 11 2007 - 15:32:43 CST


Hi Shuzhi

My first question is a simple one. Have you run the test cases both in
serial and in parallel? If so, do they all pass? Do your other simulations
run fine?

You need to do this before we can debug any further, since what you have
said so far suggests either a hardware problem (possibly an interconnect
failure, if it only happens in parallel) or a compiler bug.

Have you tried PMEMD? Does the same problem occur in both PMEMD and in
sander.MPI?

Also, what happens if you set ntpr=1 and ntwx=1? Does it still fail? You may
have a bad structure; sometimes this only shows up when you switch to
constant pressure. If you run with ntwx=1 and ntpr=1 you may be able to see
the structure start to blow up before a division by zero or a similar
infinite-energy problem leads to the segfault. However, the fact that it runs
okay in Amber 7, which was built with a different compiler, suggests that a
compiler bug is the most probable cause, and running the test cases might
help identify it.
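
With ntpr=1 the output file gets long, so it may be easier to scan it for the
first energy record that contains NaN than to read it by eye. A minimal Python
sketch, assuming the usual mdout layout in which every energy record starts
with an "NSTEP =" line; the function name first_nan_step is made up for
illustration and is not part of AMBER:

```python
import re
import sys
from typing import Optional

# Each energy record in a sander output file starts with "NSTEP = <n>".
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")


def first_nan_step(mdout_text: str) -> Optional[int]:
    """Return the NSTEP of the first energy record containing NaN, else None.

    Hypothetical helper: it only pattern-matches the text and knows nothing
    about AMBER beyond the "NSTEP =" record header.
    """
    current_step = None
    for line in mdout_text.splitlines():
        match = NSTEP_RE.search(line)
        if match:
            current_step = int(match.group(1))
        # Ignore any "NaN" that appears before the first energy record.
        if current_step is not None and "NaN" in line:
            return current_step
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_nan.py nit_600pol3_cube_md2.out
    with open(sys.argv[1]) as handle:
        step = first_nan_step(handle.read())
    print("no NaN found" if step is None else f"first NaN at NSTEP = {step}")
```

The last few trajectory frames written before that step (with ntwx=1) are the
ones worth looking at in a viewer.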

Good luck,
Ross

/\
\/
|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross_at_rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

> -----Original Message-----
> From: owner-amber_at_scripps.edu [mailto:owner-amber_at_scripps.edu] On Behalf Of Shuzhi Wang
> Sent: Tuesday, December 11, 2007 13:07
> To: amber_at_scripps.edu
> Cc: Shuzhi Wang
> Subject: AMBER: amber 9 sander crashed with "forrtl: severe (174): SIGSEGV, segmentation fault occurred"
>
> Dear all,
>
> (Sorry for the long email, but my problem is complicated and I cannot
> shorten it.)
>
> I am a new user of Amber, and I ran into a very frustrating problem on
> my first attempt at running Amber 9: SANDER keeps crashing after a
> varying number of steps, with the following error message:
> ----------error message with output context---------------
>  NSTEP =    17800   TIME(PS) =      37.800  TEMP(K) =   285.13  PRESS =  -656.4
>  Etot   =     -2390.0295  EKtot   =      1023.2938  EPtot      =     -3413.3233
>  BOND   =         1.2793  ANGLE   =         0.4961  DIHED      =         0.0002
>  1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =       209.2466
>  EELEC  =     -3624.3456  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>  EKCMT  =       506.8383  VIRIAL  =       996.4304  VOLUME     =     34547.6103
>                                                     Density    =         0.5226
>  Ewald error estimate:   0.3956E-03
>
>  ------------------------------------------------------------------------------
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> sander             0000000000548A0C  Unknown               Unknown  Unknown
> sander             00000000004FAB86  Unknown               Unknown  Unknown
> sander             00000000006BE194  Unknown               Unknown  Unknown
> sander             00000000004DBE6B  Unknown               Unknown  Unknown
> sander             00000000004ADF9E  Unknown               Unknown  Unknown
> sander             00000000004AA218  Unknown               Unknown  Unknown
> sander             0000000000404062  Unknown               Unknown  Unknown
> libc.so.6          0000003BA081D8A4  Unknown               Unknown  Unknown
> sander             0000000000403FA9  Unknown               Unknown  Unknown
>
>  NSTEP =    17900   TIME(PS) =      37.900  TEMP(K) =      NaN  PRESS =     NaN
>  Etot   =            NaN  EKtot   =            NaN  EPtot      =            NaN
>  BOND   =         1.5918  ANGLE   =         0.6282  DIHED      =         0.2988
>  1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =            NaN
>  EELEC  =            NaN  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>  EKCMT  =       532.8891  VIRIAL  =            NaN  VOLUME     =     34531.6889
>                                                     Density    =         0.5228
>  Ewald error estimate:          NaN
>
>  ------------------------------------------------------------------------------
>
> The whole situation is as follows:
>
> I want to run an NVT MD simulation at 300 K on a nitrate ion in a cubic
> box of 600 POL3 waters with periodic boundary conditions. I first
> generated the prmtop and inpcrd files using LEaP. I minimized the system,
> then heated it from 0 K to 300 K using NVT MD. In the third step, I ran
> NPT MD at 300 K to get the correct density (~1 g/cc). It was at this
> step that I found the problem. The input file is attached below, together
> with the command used to start the simulation:
> ---------------input-----------------
> NO3-.(H2O)600: 100ps MD NPT
> &cntrl
> imin = 0,
> irest = 1, ntx = 7,
> ntb = 2, pres0 = 0.7, ntp = 1, taup = 5.0,
> ipol = 0,
> cut = 12.0,
> ntc = 2, ntf = 2,
> tempi = 300.0, temp0 = 300.0,
> ntt = 3, gamma_ln = 1.0,
> nstlim = 100000, dt = 0.001
> ntpr = 100, ntwx = 100, ntwr = 1000
> /
> ---------bash script to run sander--------------
> sander -O -i nit_600pol3_cube_md2.in -o nit_600pol3_cube_md2.out \
>   -p nit_600pol3_cube.prmtop -c nit_600pol3_cube_md1.rst \
>   -r nit_600pol3_cube_md2.rst -x nit_600pol3_cube_md2.mdcrd
>
>
> I searched the mail archive and found only a similar problem involving
> DIVCON, which has already been corrected by an Amber 9 bugfix. This
> Amber 9 build was compiled with Intel Fortran compiler 10.0.023. All
> bugfixes for Amber 9 had been applied before compilation.
>
> I tried the following things:
> 1) Changing the parameters, which didn't help at all. Amber still
> crashed, although not after exactly the same number of steps.
> 2) Running the same simulation on H2O in a 600 POL3 water box (i.e. a
> 601 POL3 water box), in which the same problem occurred.
> 3) Using Amber 8 (compiled with Intel Fortran compiler v9) and Amber 7
> (compiled with some other Fortran compiler; I don't know which one).
> Amber 7 worked and finished the simulation, but it was slower than
> Amber 9, cannot do NTT=3 temperature scaling, and has no parallel
> sander that I can use. Amber 8 displayed the same problem as Amber 9.
>
> I wonder if anyone can kindly help me out of this frustrating
> situation.
>
> Thanks,
> Shuzhi "James" Wang
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
