AMBER Archive (2009)

Subject: RE: [AMBER] PMEMD 9 on MVAPICH / Infiniband problem

From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Fri Mar 27 2009 - 20:04:04 CDT


Hi Nick,

There really isn't enough information here to tell what is going on. Do
you get any kind of error message? Do you see an output file? Do the log
files produced by the queuing system tell you anything? Normally stderr
will have been redirected somewhere, and you will need to find that file
to see what was reported. A number of problems could be occurring,
including file permission or path problems if the nodes don't all share
the same filesystem, problems with shared libraries due to environment
variables not being exported correctly, stack limit issues causing
segfaults, insufficient memory, etc. Clues as to which of these it is
will be in the log file.
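
As a rough sketch of where to start looking, the top of an SGE job script
could pin stdout and stderr to known files and print a few facts about
the environment before pmemd is launched. The file names pmemd_job.out
and pmemd_job.err and the function name report_environment are made up
for illustration. The report only covers the node the script starts on,
so it would not reveal a filesystem or library problem on the other
nodes.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
#$ -o pmemd_job.out
#$ -e pmemd_job.err
# The "#$" lines above are SGE directives: run under bash, start in the
# submission directory, export the submitting shell's environment, and
# send stdout/stderr to the named files (hypothetical names).

# Print basic facts about the node the job script starts on.
report_environment() {
    echo "host: $(uname -n)"
    echo "workdir: $(pwd)"
    echo "stack limit: $(ulimit -s)"
    echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH:-<unset>}"
    if [ -w . ]; then
        echo "workdir writable: yes"
    else
        echo "workdir writable: no"
    fi
}

report_environment
```

If the reported stack limit turns out to be small, raising it before the
mpirun_rsh line is a commonly tried step, though whether that is the
cause here is only a guess until the stderr file has been read.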

Note, you say you can launch single pmemd jobs but don't explain what you
mean by this. The parallel version of pmemd can only run on 2 CPUs or
more. Did you compile a serial version as well? Is this what you mean by
single pmemd jobs?
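
Since the parallel build needs at least two processes, a job script
could guard against a one-slot allocation before calling mpirun_rsh. The
following is only a sketch; the function name check_slots is
hypothetical, and it assumes the slot count arrives as SGE's $NSLOTS.

```shell
#!/bin/bash

# Return 0 if the slot count is usable by parallel pmemd, 1 otherwise.
# Non-numeric or missing input is treated as unusable.
check_slots() {
    local n="${1:-0}"
    if [ "$n" -ge 2 ] 2>/dev/null; then
        echo "ok: $n slots"
        return 0
    fi
    echo "parallel pmemd needs at least 2 processes, got: $n" >&2
    return 1
}
```

In a job script this would be called as check_slots "$NSLOTS" || exit 1
ahead of the mpirun_rsh line, so that a bad allocation fails early with a
message in the stderr file.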

All the best
Ross

> -----Original Message-----
> From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org] On
> Behalf Of Nick Holway
> Sent: Friday, March 27, 2009 8:56 AM
> To: amber_at_ambermd.org
> Subject: [AMBER] PMEMD 9 on MVAPICH / Infiniband problem
>
> Dear all.
>
> We've compiled PMEMD 9 using ifort 10, MVAPICH2 1.2 and OFED 1.4 on
> 64-bit Rocks 5.1 (i.e. CentOS 5.2 and SGE 6.1u5). I'm able to launch
> single pmemd jobs via qsub using mpirun_rsh and they run well. The
> problem we see is that when two jobs are launched at once, some of
> the jobs disappear from qstat in SGE yet continue to run
> indefinitely.
>
> I'm calling PMEMD with this line - $MPIHOME/bin/mpirun_rsh -np $NSLOTS
> -hostfile $TMPDIR/machines $AMBERHOME/pmemd -O -i xxxx.inp -c
> xxxx_min.rest -o xxxx.out -p xxxx.top -r xxxx_eqt.rest -x xxxx.trj
>
> Does anyone know what I've got to do to make the PMEMD jobs run properly?
>
> Thanks for any help.
>
> Nick
>
> _______________________________________________
> AMBER mailing list
> AMBER_at_ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
