AMBER Archive (2006)

Subject: Re: AMBER: AMBER parallel run bombs

From: Michael Crowley (crowley_at_scripps.edu)
Date: Mon Dec 04 2006 - 19:41:32 CST


Dear Asif,
Please try a 2-step dynamics run on 1 and 2 processors, and the same run
with serial sander, to make sure that your installation is working.
Do the answers for serial sander and 1-processor parallel sander match?
Does the 2-processor run work?
How big is the pairlist for 1 processor and for 2 processors?
What is the density of the system, and is it what you expect it to be?
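
A minimal sketch of how the three test runs might be set up, written in Python only to keep the pieces in one place. The input file name md2.in, the output names, cut = 8.0, and the `mpirun -np N` launcher syntax are all assumptions for illustration; use the cutoff from your own md.inp and whatever launcher your MPI installation provides.

```python
# Hypothetical helper: prints a 2-step MD input and the three commands to compare.
# File names, the cutoff, and the mpirun syntax are assumptions, not taken from the post.

MDIN = """2-step test run
 &cntrl
   imin = 0, nstlim = 2, dt = 0.001,
   ntb = 1, cut = 8.0,
   ntpr = 1,
 /
"""


def test_commands(top="new1.top", crd="new1.crd", mdin="md2.in"):
    """Return the serial, 1-process and 2-process command lines."""
    common = f"-O -i {mdin} -p {top} -c {crd}"
    return [
        f"sander {common} -o serial.out",
        f"mpirun -np 1 sander.MPI {common} -o mpi1.out",
        f"mpirun -np 2 sander.MPI {common} -o mpi2.out",
    ]


print(MDIN)
for cmd in test_commands():
    print(cmd)
```

With ntpr = 1 the energies are printed at every step, so the three output files can be compared line by line.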

The assumption for generating the pairlist array size (maxpairs) is that
your system is relatively homogeneous and that the size of the pairlist
for all regions of your system is relatively constant. If your system is
not homogeneous, then it is possible that one of the processors will get
a lot more than its fair share of the nonbond list and exceed the maximum
nonbond list size. It is assumed that you would not want to run a system
like that, so there is at present no provision to accommodate vacuous
regions in a periodic system.
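
To make the arithmetic concrete, here is a small Python sketch that takes the estimate quoted in the original message below at face value: natom * (cut + skin)**3 / 3, split evenly over the processes. The value cut + skin = 10.0 is an assumption chosen because it reproduces the 3200000 figure; the 40% share in the second part is purely hypothetical and only illustrates how an uneven distribution overflows a per-process limit.

```python
def maxpairs(natom, cut_plus_skin, nproc):
    """Total and per-process pairlist capacity, using the formula quoted in the post."""
    total = natom * cut_plus_skin ** 3 / 3.0
    return int(total), int(total // nproc)


def overflows(pairs_in_system, share, capacity):
    """True if a process holding `share` of all pairs exceeds its capacity."""
    return pairs_in_system * share > capacity


total, per_proc = maxpairs(natom=9600, cut_plus_skin=10.0, nproc=4)
print(total, per_proc)  # 3200000 800000

# Hypothetical: 2.4 million pairs in total, split evenly versus 40% on one process.
print(overflows(2_400_000, 0.25, per_proc))  # False
print(overflows(2_400_000, 0.40, per_proc))  # True
```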

If non-homogeneity is your problem, then you should either fix your
starting structure from the outset, or be sure to run some constant
pressure dynamics on 1 or 2 processors to get the system to shrink down
to a size that is physically reasonable with no voids and a reasonable
density (very close to 1.0).
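
A quick way to check the density by hand, sketched in Python. It assumes an orthorhombic box with edge lengths in Angstroms, the total mass in amu, and a target density in g/cm^3. The example numbers (3200 water molecules, box edges of 46 and 60 Angstroms) are hypothetical and not taken from Asif's system.

```python
# 1 amu per cubic Angstrom expressed in g/cm^3.
AMU_PER_A3_TO_G_PER_CM3 = 1.66054


def density_g_per_cm3(total_mass_amu, box_a, box_b, box_c):
    """Density of an orthorhombic periodic box (edges in Angstroms)."""
    volume_a3 = box_a * box_b * box_c
    return total_mass_amu * AMU_PER_A3_TO_G_PER_CM3 / volume_a3


# Hypothetical system: 3200 waters (9600 atoms) at 18.015 amu each.
mass = 3200 * 18.015

print(round(density_g_per_cm3(mass, 46.0, 46.0, 46.0), 2))  # close to 1: reasonable
print(round(density_g_per_cm3(mass, 60.0, 60.0, 60.0), 2))  # far below 1: box has voids
```

A value well below 1 for an aqueous system suggests the box is too large for its contents, which is the situation that constant pressure dynamics is meant to correct.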

If inhomogeneity is not the source of your troubles, then please report
back the answers to the above questions and we will see what we can do.
Best wishes
Mike

Rahaman, Asif wrote:
> Dear All,
>
> I am trying to run the MPI (parallel) version of AMBER. I have a total of 9600 atoms in the system and I use PBC. I get the following error when I try to run:
> _____________________________________________________________
> running /usr/local/amber9/exe/sander.MPI on 4 LINUX ch_gm processors
> Program binary is: /usr/local/amber9/exe/sander.MPI
> Machines file is /home/rasif/fast_verylarge/mach
> Shared memory for intra-nodes coms is enabled.
> gm receive mode used: polling.
> 4 processes will be spawned:
> Process 0 (/usr/local/amber9/exe/sander.MPI "-O" "-i" "md.inp" "-p" "new1.top" "-c" "new1.crd" "-o" "md.out" "-r" "md.rst" "-x" "md.crd" ) on node001
> Process 1 (/usr/local/amber9/exe/sander.MPI "-O" "-i" "md.inp" "-p" "new1.top" "-c" "new1.crd" "-o" "md.out" "-r" "md.rst" "-x" "md.crd" ) on node001
> Process 2 (/usr/local/amber9/exe/sander.MPI "-O" "-i" "md.inp" "-p" "new1.top" "-c" "new1.crd" "-o" "md.out" "-r" "md.rst" "-x" "md.crd" ) on node001
> Process 3 (/usr/local/amber9/exe/sander.MPI "-O" "-i" "md.inp" "-p" "new1.top" "-c" "new1.crd" "-o" "md.out" "-r" "md.rst" "-x" "md.crd" ) on node001
> Open a socket on head...
> Got a first socket opened on port 44225.
> Shared memory file: /tmp/gmpi_shmem-3027994:[0-9]*.tmp
> MPI Id 0 is using gm port 2, board 0 (MAC 0060dd47b81f).
> MPI Id 1 is using gm port 4, board 0 (MAC 0060dd47b81f).
> MPI Id 3 is using gm port 5, board 0 (MAC 0060dd47b81f).
> MPI Id 2 is using gm port 6, board 0 (MAC 0060dd47b81f).
> Received data from all 4 MPI processes.
> Sending mapping to MPI Id 0.
> Sending mapping to MPI Id 1.
> Sending mapping to MPI Id 2.
> Sending mapping to MPI Id 3.
> Data sent to all processes.
> Received valid abort message !
> Reap remote processes:
> * NB pairs 154 799894 exceeds capacity ( 800000) 2
> SIZE OF NONBOND LIST = 800000
> SANDER BOMB in subroutine nonbond_list
> Non bond list overflow!
> check MAXPR in locmem.f
> --------------------------------------------------------------
>
> As you can see, I am trying to run a 4-processor job, and it seems that the nonbonded list in sander is overflowing. I have changed MAXINT in /src/anal/sizes.h. There is no sizes.h file in /src/sander, and I could not find any place where I can assign the value of MAXPR in locmem.f. It seems to me that in locmem.f MAXPR is assigned or calculated as [# of atoms (9600)*(cut +scnb)**3/3 = 3200000], and for four processors it will be 800000.
>
> Could anybody please let me know what I should do to make the parallel version work?
> Or do I need to change the nonbond list? If so, what do I need to change, and where in sander should I make the change?
>
> Thank you in advance.
>
> With best regards, Asif
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
>

-- 
-----------------------------------------------------------------
Physical mail:   Dr. Michael F. Crowley
                  Department of Molecular Biology, TPC6
                  The Scripps Research Institute
                  10550 North Torrey Pines Road
                  La Jolla, California 92037

Electronic mail: crowley_at_scripps.edu
Telephone:       858/784-9290
Fax:             858/784-8688
-----------------------------------------------------------------