AMBER Archive (2009)

Subject: Re: [AMBER] help with TIP4P and mpi pmemd

From: Hashem Taha (hashemt_at_gmail.com)
Date: Thu Dec 03 2009 - 20:26:57 CST


Hi Bob,

I have tried this tip4p system before with the same molecule, and it worked
fine (using serial sander, parallel sander and parallel pmemd). The same
exact input files were used in this case. There are no comment lines in the
input file before &cntrl.
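
For anyone who wants to check their own mdin for stray lines ahead of the namelist, here is a minimal Python sketch. It assumes the usual mdin layout in which the first line is a title and &cntrl follows it; the function name lines_before_cntrl is hypothetical, and the sample text is the start of the input file quoted in the original post further down.

```python
def lines_before_cntrl(mdin_text):
    """Return the non-blank lines that appear before the &cntrl namelist."""
    found = []
    for line in mdin_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("&cntrl"):
            return found
        if stripped:
            found.append(stripped)
    raise ValueError("no &cntrl namelist found")


# Start of the input file from the original post (quoted below in this thread).
sample = """\
Constant Volume Minimization
# Control section
 &cntrl
 ntwx = 50, ntpr = 1, ntwr = 1,
 &end
"""

before = lines_before_cntrl(sample)
# Assumption: the first line is the title, so anything after it is "extra".
extra = before[1:]
print("extra lines before &cntrl:", extra)
```

An empty list for extra would mean only the title line precedes &cntrl.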

I tried running the same job using a serial version of sander but I
encountered the same problem. I've recompiled sander using gcc with
debugging flags and this is what I get when I run sander in GDB:

(gdb) run -O -i minwat.in -o minwat.out -p alpha_ara_ome_tip4p.top -c
alpha_ara_ome_tip4p.crd -r minwat.rst -ref alpha_ara_ome_tip4p.crd
Starting program: /home/john/amber10/bin/sander -O -i minwat.in -o
minwat.out -p alpha_ara_ome_tip4p.top -c alpha_ara_ome_tip4p.crd -r
minwat.rst -ref alpha_ara_ome_tip4p.crd

Program received signal SIGSEGV, Segmentation fault.
0x00000000004bb74e in nb_adjust_ ()
(gdb) backtrace
#0 0x00000000004bb74e in nb_adjust_ ()
#1 0x00000000004bdd42 in ewald_force_ ()
#2 0x00000000005f8259 in force_ ()
#3 0x0000000000483797 in runmin_ ()
#4 0x00000000004734e3 in sander () at _sander.f:1296
#5 0x0000000000470124 in MAIN__ () at _multisander.f:291
#6 0x0000000000a2c6ae in main ()

I don't have much experience with gdb but from the looks of it the error is
originating from nb_adjust().
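
A small Python sketch for reading a backtrace like the one above: it splits each frame into its fields and lists the frames that carry no file:line information. The regular expression is only an assumption fitted to the lines shown here, not a general gdb output parser. Frames without source information usually mean the corresponding object file was built without debug symbols, so that is worth checking before trusting the exact location.

```python
import re

# Pattern fitted to the frames shown above (an assumption, not a full gdb grammar).
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+(?P<addr>0x[0-9a-fA-F]+) in (?P<func>\S+) \(\)"
    r"(?: at (?P<file>[^:\s]+):(?P<line>\d+))?$"
)


def parse_backtrace(text):
    """Return one dict per frame with index, addr, func, file and line."""
    frames = []
    for raw in text.splitlines():
        match = FRAME_RE.match(raw.strip())
        if match:
            frames.append(match.groupdict())
    return frames


backtrace = """\
#0 0x00000000004bb74e in nb_adjust_ ()
#1 0x00000000004bdd42 in ewald_force_ ()
#2 0x00000000005f8259 in force_ ()
#3 0x0000000000483797 in runmin_ ()
#4 0x00000000004734e3 in sander () at _sander.f:1296
#5 0x0000000000470124 in MAIN__ () at _multisander.f:291
#6 0x0000000000a2c6ae in main ()
"""

frames = parse_backtrace(backtrace)
# Frames for which gdb printed no source file and line number.
no_source = [f["func"] for f in frames if f["file"] is None]
print("innermost frame:", frames[0]["func"])
print("frames without source info:", no_source)
```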

I've tried recompiling sander and pmemd with different MPI libraries
(openmpi and mpich2) and with no MPI, with and without MKL, and using both
gfortran and ifort. All of these combinations resulted in a SIGSEGV.
However, I only added the debug flags to the gfortran, non-parallel build.

On Thu, Dec 3, 2009 at 3:37 PM, Robert Duke <rduke_at_email.unc.edu> wrote:

> Have you done this (tip4p) before? Try your prmtop/inpcrd/mdin with single
> processor sander, then single processor pmemd, and then pmemd mpi. I bet
> you have setup problems, or pmemd build problems, but this will sort that
> out. I will let others expound on setting up an extra points simulation if
> that is the problem. As an aside, why did you modify the elec and vdw
> screening parms for 1-4 interactions, scnb and scee? This is, I believe,
> generally not recommended, but maybe you are doing something I don't know
> about... Also, do you really have two comment lines in front of &cntrl? I
> have never tried that; maybe it is inconsequential, but I don't know...
> Because there are multiple reading passes, with namelist i/o combined with
> group i/o, I would not do anything nonstandard. It may work fine, but
> namelist read errors can be really obscure, especially in parallel, which is
> one reason to switch to a single processor test case if something weird
> happens.
> Regards - Bob Duke
> ----- Original Message ----- From: "Hashem Taha" <hashemt_at_gmail.com>
> To: <amber_at_ambermd.org>
> Sent: Thursday, December 03, 2009 5:16 PM
> Subject: [AMBER] help with TIP4P and mpi pmemd
>
>
>> I have a problem with trying to run some jobs using TIP4P water as the
>> solvent. I have tried running the exact same files with TIP3P water and
>> the calculations started and completed perfectly. However, upon changing
>> from TIP3P to TIP4P, my calculations stop for no apparent reason. The file
>> that I am trying to run is just a water minimization, and it results in the
>> following errors. The input file is also included below. The calculations
>> start, but after a few steps they come to a halt. Any help would be
>> appreciated, and if you require further information please let me know...
>>
>> HT
>>
>> the errors are:
>>
>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>> Image PC Routine Line Source
>> pmemd 000000000048265A Unknown Unknown Unknown
>> pmemd 00000000004777C3 Unknown Unknown Unknown
>> pmemd 00000000004AA1D5 Unknown Unknown Unknown
>> pmemd 00000000004CA1CE Unknown Unknown Unknown
>> pmemd 000000000040744C Unknown Unknown Unknown
>> libc.so.6 0000003F4D81D8B4 Unknown Unknown Unknown
>> pmemd 0000000000407359 Unknown Unknown Unknown
>> rank 7 in job 55 compute-0-8.local_45343 caused collective abort of all
>> ranks
>> exit status of rank 7: killed by signal 9
>>
>> the input file...
>>
>> Constant Volume Minimization
>> # Control section
>> &cntrl
>> ntwx = 50, ntpr = 1, ntwr = 1,
>> scnb = 1.0, scee = 1.0, nsnb = 25, dielc = 1, cut = 8.0,
>> ntb = 1,
>> maxcyc = 1000, ntmin = 0, dx0 = 0.01, drms = 0.0001,
>> ntp = 0,
>> ibelly = 0, ntr = 1,
>> imin = 1,
>> &end
>> Group Input for restrained atoms
>> 5.0
>> RES 1 2
>> END
>> END
>> _______________________________________________
>> AMBER mailing list
>> AMBER_at_ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber