Subject: AMBER: Amber 9 run error

From: Andrew Borgert
Date: Tue Apr 08 2008 - 16:28:07 CDT

Amber users,
I am running an MPI version of sander (Amber 9) on an SGI Linux
cluster using 16 processors. My simulation has been running fine in 2 ns
chunks for a total of ~50 ns; however, the current 2 ns run will not
execute properly. The run starts without error but ends shortly
thereafter. The script error file (returned by the queuing system, I
presume) contains a rather cryptic message:
forrtl: error (76): IOT trap signal
Timeout for rank 0 hostname 'cl1n246'. Job is not finalized there.
Cleaning up all processes ...
Some rank on 'cl1n246' exited without finalize.

The sander output file lists no errors; however, it appears to stop
while reading the restart file:

| Flags:
 getting new box info from bottom of inpcrd
| INFO: Old style inpcrd file read

This is the end of the 'out' file.

Upon examination, there appears to be nothing wrong with the restart
file. It is the same size as the previous restart files for this system
and contains the correct timestamp and number of atoms at the beginning
of the file, along with the correct box information at the end.
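A check that goes beyond file size and header is to parse every number in the restart file, since a single overflowed field (Fortran writes a run of asterisks when a value does not fit its format) leaves the size and header unchanged. The following is a minimal Python sketch, assuming a formatted (ASCII) restart with a title line, an atom-count line, and data written as 12-character fixed-width fields, followed by velocities and six box values. The names `check_restart`, `split_fields`, `has_velocities`, and `box_values` are hypothetical and not part of Amber.

```python
import math
import sys

FIELD_WIDTH = 12  # assumption: data lines use Fortran format 6F12.7


def split_fields(line, width=FIELD_WIDTH):
    """Cut one data line into fixed-width text fields (values may touch)."""
    line = line.rstrip("\r\n")
    chunks = [line[i:i + width] for i in range(0, len(line), width)]
    return [chunk for chunk in chunks if chunk.strip()]


def check_restart(lines, has_velocities=True, box_values=6):
    """Return a list of problems found in a formatted restart file."""
    problems = []
    if len(lines) < 3:
        return ["fewer than 3 lines: no coordinate data"]
    try:
        # Line 1 is the title; line 2 starts with the atom count.
        natom = int(lines[1].split()[0])
    except (IndexError, ValueError):
        return ["line 2: cannot read the atom count"]

    n_values = 0
    for lineno, line in enumerate(lines[2:], start=3):
        for field in split_fields(line):
            n_values += 1
            try:
                value = float(field)
            except ValueError:
                # Typically a field of asterisks from a format overflow.
                problems.append(f"line {lineno}: unreadable field {field!r}")
                continue
            if not math.isfinite(value):
                problems.append(f"line {lineno}: non-finite value {field.strip()!r}")

    # Coordinates (3 per atom), optional velocities (3 per atom), then the box.
    expected = 3 * natom * (2 if has_velocities else 1) + box_values
    if n_values != expected:
        problems.append(f"expected {expected} values for {natom} atoms, found {n_values}")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for problem in check_restart(handle.readlines()):
            print(problem)
```

Run against the restart file that fails to load, an empty result would suggest the problem lies elsewhere; any reported line is a place to compare against the previous, working restart file.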

I am running other systems with the same sander executable, so I know
the program is functioning properly.

Any thoughts?

For what it's worth, this is my input file:

EA2 MD initial run up
 &cntrl
  irest = 1,
  ntx = 7,
  ntb = 2, pres0 = 1.0, ntp = 1, taup = 2.0,
  cut = 10,
  ntr = 0,
  ntc = 2,
  ntf = 2,
  tempi = 300.0,
  temp0 = 300.0,
  ntt = 3,
  gamma_ln = 1.0,
  nstlim = 2000000, dt = 0.001,
  ntpr = 200, ntwx = 200, ntwr = 200
 /

Andrew Borgert

