AMBER Archive (2002)

Subject: RE: P4_GLOBMEMSIZE problem

From: Ross Walker (r.c.walker_at_ic.ac.uk)
Date: Tue Nov 05 2002 - 17:04:03 CST


Hi Jean,

> Thanks for your email. Unfortunately it doesn't work :(
> a) I get the following error message:
> p0_17603: p4_error: exceeding max num of P4_MAX_SYSV_SHMIDS: 256

I've not seen this error before. You could try recompiling mpich with a
higher limit on shared memory segments, but first check whether
P4_MAX_SYSV_SHMIDS is a compile-time constant rather than something that
can be adjusted via an environment variable.
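
One way to make that check is to search the mpich source tree for the
symbol. This is only a sketch: the MPICH_SRC path below is a hypothetical
location for an unpacked source tree, and the function name is made up
for illustration.

```shell
# Assumption: MPICH_SRC points at an unpacked mpich source tree
# (hypothetical path; adjust to wherever your sources live).
MPICH_SRC="${MPICH_SRC:-$HOME/src/mpich-1.2.4}"

# Print every line that #defines the limit. A hit in a header suggests
# the value is fixed at compile time and needs a rebuild to change.
find_shmid_limit() {
    grep -rn "define[[:space:]]*P4_MAX_SYSV_SHMIDS" "$1"
}

find_shmid_limit "$MPICH_SRC" || echo "no #define found under $MPICH_SRC"
```

If the search only turns up a #define, editing that value and rebuilding
mpich is the likely route.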

Alternatively, I would recommend using the LAM implementation of MPI on a
Red Hat box, since it is optimised for Linux. I have found it to be about
10% faster on average than a default mpich installation on my Red Hat 7.3
boxes. It also comes as an RPM, making installation a breeze. You can
download it from
ftp://ftp.mirror.ac.uk/sites/ftp.redhat.com/pub/redhat/linux/7.2/en/os/i386/RedHat/RPMS/lam-6.5.4-1.i386.rpm

This will install scripts for compiling MPI-based f77 and C programs.
Just change the compile options in the amber machine file to mpif77 and
mpicc and then recompile amber. After that it's fairly easy to use: you
just create a lamdef file containing the ids of the machines over which
you want to run the job (you'll have to check the manual for scali and
myrinet compatibility). Then just run:

lamboot lamdef
mpirun -np 8 /usr/local/amber_mpi/exe/sander -O ........
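
As a rough sketch of the lamdef step, the snippet below writes one
hostname per line, which is the basic form of a LAM boot schema. The
hostnames and the write_lamdef helper are hypothetical; substitute the
machines in your own cluster.

```shell
# Hypothetical hostnames; replace with the machines in your cluster.
NODES="node01 node02 node03 node04"

# Write one hostname per line to the file named by the first argument.
write_lamdef() {
    out="$1"; shift
    printf '%s\n' "$@" > "$out"
}

# $NODES is left unquoted on purpose so it splits into separate names.
write_lamdef lamdef $NODES

# Then, with LAM installed (not executed here):
#   lamboot lamdef
#   mpirun -np 8 /usr/local/amber_mpi/exe/sander -O ........
```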

I hope this helps.
All the best
Ross

> b)if I set P4_GLOBMEMSIZE to 256MB, I get the same error message as
> before (w/ 16MB)
>
> Do you suggest to recompile MPICH w/ higher shared memory segments?
> Thanks.
>
> JC
>
>
> Walker wrote:
>
> >Try setting the value much much higher.
> >
> >On our cluster I have this set in /etc/bashrc to:
> >
> >export P4_GLOBMEMSIZE=536870912
> >
> >(=512MB).
> >
> >This should work.
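
For reference, the byte value quoted above is just 512 x 1024 x 1024.
A small sketch of that arithmetic follows; the mb_to_bytes helper is
made up for illustration, and printing shmmax next to it is only a
convenience for comparison, not a documented requirement.

```shell
# Convert a size in MB to bytes (1 MB taken as 1024*1024 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

P4_GLOBMEMSIZE=$(mb_to_bytes 512)
export P4_GLOBMEMSIZE
echo "P4_GLOBMEMSIZE=$P4_GLOBMEMSIZE"

# On Linux, show the kernel's per-segment shared memory limit if readable.
if [ -r /proc/sys/kernel/shmmax ]; then
    echo "shmmax=$(cat /proc/sys/kernel/shmmax)"
fi
```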
> >
> >All the best
> >Ross
> >
> >/\
> >\/
> >|\oss Walker
> >
> >| Imperial College of Science, Technology & Medicine |
> >| Department of Chemistry | Theoretical Division |
> >| Tel:- +44 20 759(45851) |
> >| EMail:- ross_at_rosswalker.co.uk | http://www.rosswalker.co.uk/ |
> >| PGP Key available on request |
> >
> >
> >-----Original Message-----
> >From: Jean-Christophe Ducom [mailto:jducom_at_nd.edu]
> >Sent: 04 November 2002 17:33
> >To: amber_at_heimdal.compchem.ucsf.edu
> >Subject: P4_GLOBMEMSIZE problem
> >
> >
> >When I try to run a sander job on 8 SMP nodes running Linux Redhat 7.2
> >(kernel 2.4.18) using mpich 1.2.4, I get the following error messages:
> >
> >*if setenv P4_GLOBMEMSIZE 16000000(or higher), then I get the error
> >message:
> >p2_25612: p4_error: interrupt SIGSEGV: 11
> >p4_22913: p4_error: interrupt SIGSEGV: 11
> >Broken pipe
> >Broken pipe
> >bm_list_26381: (8.040565) wakeup_slave: unable to interrupt slave 0 pid 26380
> >
> >*if setenv P4_GLOBMEMSIZE 14000000, then:
> >p2_25887: (6.780981) xx_shmalloc: returning NULL; requested 13914960 bytes
> >p2_25887: (6.781052) p4_shmalloc returning NULL; request = 13914960 bytes
> >You can increase the amount of memory by setting the environment variable
> >P4_GLOBMEMSIZE (in bytes); the current size is 14000000
> >p2_25887: p4_error: alloc_p4_msg failed: 0
> >Broken pipe
> >bm_list_14913: (7.010840) wakeup_slave: unable to interrupt slave 0 pid 14912
> >
> >Every node (dual Xeon 1.7 GHz) has 1 GB of memory.
> ># cat /proc/sys/kernel/shmmax ->536870912
> >
> >The file size.h has been modified as follows:
> >--------------------------------------
> > parameter (MAXREA=3800000)
> > parameter (MAXINT=2750000)
> > parameter (MAXHOL=1000000)
> > parameter (MAXPR=5000000)
> > parameter (MAXDUP=8000)
> >c
> >c --- allocate a "stack" space for temporary real variables:
> >c (size depends on the problem: the maximum value used is reported
> >c at the end of a calculation)
> >c
> > integer MAX_RSTACK,MAX_ISTACK,MAX_STACK_PTRS,MAX_HEAP_PTRS
> > parameter (MAX_RSTACK=1600000)
> > parameter (MAX_ISTACK=100000)
> > parameter (MAX_STACK_PTRS=100)
> > parameter (MAX_HEAP_PTRS=100)
> >c
> >----------------------------------------
> >
> >Any idea?
> >Thanks a lot
> >
> > JC
> >
> >--------
> >237 Nieuwland Science Hall
> >Notre Dame, IN 46556