AMBER Archive (2009)

Subject: [AMBER] error in sander---Amber10

From: archana sonawani (ask.archana_at_gmail.com)
Date: Tue Jan 13 2009 - 03:27:43 CST


Hi,

I have installed Amber10 on an HP ProLiant ML350 server with two Xeon
processors, running RHEL4. I used the g95 Fortran compiler.

The following steps were performed.

1] Downloaded and extracted Amber10 and AmberTools in /usr/local.

2] Downloaded and extracted g95 in root, then set the path:
    wget -O - http://ftp.g95.org/g95-x86-linux.tgz | tar xvfz -
    ln -s $PWD/g95-install/bin/i686-pc-linux-gnu-g95 /usr/bin/g95

3] Compiling and running with g95 did not give any errors.

4] The AMBERHOME environment variable was set using the bash Unix shell:
    AMBERHOME=" /usr/local/amber10"
    PATH=$PATH:$AMBERHOME/exe
    export AMBERHOME PATH
    echo $AMBERHOME
    echo $AMBERHOME/exe

5] AmberTools was installed and the path was set using:
    path="usr/local/amber10/exe"

6] Amber10 was installed:
    cd $AMBERHOME/src
    ./configure_amber -static g95
    make serial
    cd $AMBERHOME/test
    make test
    cd $AMBERHOME/src
    make clean
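
As written in steps 4 and 5, the quoted AMBERHOME value begins with a space and the path value has no leading slash. Whether these are slips in the e-mail or were actually typed is not known. A minimal sketch of a sanity check for such values follows; the helper name check_dir_var is hypothetical and not part of Amber.

```shell
#!/bin/bash
# Hypothetical helper (not part of Amber): report common problems with a
# variable that is supposed to hold an absolute directory path.
check_dir_var() {
    local name=$1 value=$2
    # A space at either end usually means a quoting mistake.
    case $value in
        " "*|*" ") echo "$name: value has leading or trailing spaces"; return 1 ;;
    esac
    # A relative path depends on the current directory.
    case $value in
        /*) ;;
        *) echo "$name: value is not an absolute path"; return 1 ;;
    esac
    if [ ! -d "$value" ]; then
        echo "$name: directory does not exist"; return 1
    fi
    echo "$name: ok"
}

# The two values exactly as they appear in the steps above.
check_dir_var AMBERHOME " /usr/local/amber10" || true
check_dir_var path "usr/local/amber10/exe" || true
```

Running the same check on the live value, for example check_dir_var AMBERHOME "$AMBERHOME", shows whether the shell actually holds a usable directory.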

xleap now works fine. I want to run parallel sander jobs. LAM 7.1.3 is
included with Amber10, and the manual says to install an MPI library for
parallel runs. I have installed openmpi-1.2.8.tar.bz2.
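
LAM and Open MPI each install their own mpirun, and the shell runs whichever one comes first in PATH. A small sketch for listing every copy that is visible follows; the function name find_all_on_path is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every executable called NAME found in the
# colon-separated directory list PATH_LIST, in search order.
find_all_on_path() {
    local name=$1 path_list=$2 dir
    local IFS=:
    for dir in $path_list; do
        [ -n "$dir" ] || continue
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
}

# The first line printed, if any, is the mpirun the shell would run.
find_all_on_path mpirun "$PATH"
```

If more than one line is printed, two MPI installations are competing for the same command name.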

lamboot gives the following message:
LAM 7.1.3/MPI 2 C++/ROMIO - Indiana University

lamboot -d gives the following message:

n-1<4980> ssi:boot:open: opening
n-1<4980> ssi:boot:open: opening boot module globus
n-1<4980> ssi:boot:open: opened boot module globus
n-1<4980> ssi:boot:open: opening boot module rsh
n-1<4980> ssi:boot:open: opened boot module rsh
n-1<4980> ssi:boot:open: opening boot module slurm
n-1<4980> ssi:boot:open: opened boot module slurm
n-1<4980> ssi:boot:select: initializing boot module rsh
n-1<4980> ssi:boot:rsh: module initializing
n-1<4980> ssi:boot:rsh:agent: rsh
n-1<4980> ssi:boot:rsh:username: <same>
n-1<4980> ssi:boot:rsh:verbose: 1000
n-1<4980> ssi:boot:rsh:algorithm: linear
n-1<4980> ssi:boot:rsh:no_n: 0
n-1<4980> ssi:boot:rsh:no_profile: 0
n-1<4980> ssi:boot:rsh:fast: 0
n-1<4980> ssi:boot:rsh:ignore_stderr: 0
n-1<4980> ssi:boot:rsh:priority: 10
n-1<4980> ssi:boot:select: boot module available: rsh, priority: 10
n-1<4980> ssi:boot:select: initializing boot module slurm
n-1<4980> ssi:boot:slurm: not running under SLURM
n-1<4980> ssi:boot:select: boot module not available: slurm
n-1<4980> ssi:boot:select: initializing boot module globus
n-1<4980> ssi:boot:globus: globus-job-run not found, globus boot will not run
n-1<4980> ssi:boot:select: boot module not available: globus
n-1<4980> ssi:boot:select: finalizing boot module slurm
n-1<4980> ssi:boot:slurm: finalizing
n-1<4980> ssi:boot:select: closing boot module slurm
n-1<4980> ssi:boot:select: finalizing boot module globus
n-1<4980> ssi:boot:globus: finalizing
n-1<4980> ssi:boot:select: closing boot module globus
n-1<4980> ssi:boot:select: selected boot module rsh
LAM 7.1.3/MPI 2 C++/ROMIO - Indiana University
n-1<4980> ssi:boot:base: looking for boot schema in following directories:
n-1<4980> ssi:boot:base: <current directory>
n-1<4980> ssi:boot:base: $TROLLIUSHOME/etc
n-1<4980> ssi:boot:base: $LAMHOME/etc
n-1<4980> ssi:boot:base: /etc/lam
n-1<4980> ssi:boot:base: looking for boot schema file:
n-1<4980> ssi:boot:base: lam-bhost.def
n-1<4980> ssi:boot:base: found boot schema: /etc/lam/lam-bhost.def
n-1<4980> ssi:boot:rsh: found the following hosts:
n-1<4980> ssi:boot:rsh: n0 localhost (cpu=1)
n-1<4980> ssi:boot:rsh: resolved hosts:
n-1<4980> ssi:boot:rsh: n0 localhost --> 127.0.0.1 (origin)
n-1<4980> ssi:boot:rsh: starting RTE procs
n-1<4980> ssi:boot:base:linear: starting
n-1<4980> ssi:boot:base:server: opening server TCP socket
n-1<4980> ssi:boot:base:server: opened port 33111
n-1<4980> ssi:boot:base:linear: booting n0 (localhost)
n-1<4980> ssi:boot:rsh: starting lamd on (localhost)
n-1<4980> ssi:boot:rsh: starting on n0 (localhost): hboot -t -c lam-conf.lamd -d -I -H 127.0.0.1 -P 33111 -n 0 -o 0
n-1<4980> ssi:boot:rsh: launching locally
hboot: performing tkill
hboot: tkill -d
tkill: setting prefix to (null)
tkill: setting suffix to (null)
tkill: got killname back: /tmp/lam-ramshankar_at_59.163.34.82/lam-killfile
tkill: f_kill = "/tmp/lam-ramshankar_at_59.163.34.82/lam-killfile"
tkill: killing LAM...
tkill: killing PID (SIGHUP) 4978 ...
tkill: killed
tkill: removing socket file ...
tkill: socket file: /tmp/lam-ramshankar_at_59.163.34.82/lam-kernel-socketd
tkill: removing IO daemon socket file ...
tkill: IO daemon socket file: /tmp/lam-ramshankar_at_59.163.34.82/lam-io-socket
tkill: all finished
hboot: booting...
hboot: fork /usr/bin/lamd
hboot: attempting to execute
n-1<4983> ssi:boot:open: opening
n-1<4983> ssi:boot:open: opening boot module globus
n-1<4983> ssi:boot:open: opened boot module globus
n-1<4983> ssi:boot:open: opening boot module rsh
n-1<4983> ssi:boot:open: opened boot module rsh
n-1<4983> ssi:boot:open: opening boot module slurm
n-1<4983> ssi:boot:open: opened boot module slurm
n-1<4983> ssi:boot:select: initializing boot module rsh
n-1<4983> ssi:boot:rsh: module initializing
n-1<4983> ssi:boot:rsh:agent: rsh
n-1<4983> ssi:boot:rsh:username: <same>
n-1<4983> ssi:boot:rsh:verbose: 1000
n-1<4983> ssi:boot:rsh:algorithm: linear
n-1<4983> ssi:boot:rsh:no_n: 0
n-1<4983> ssi:boot:rsh:no_profile: 0
n-1<4983> ssi:boot:rsh:fast: 0
n-1<4983> ssi:boot:rsh:ignore_stderr: 0
n-1<4983> ssi:boot:rsh:priority: 10
n-1<4983> ssi:boot:select: boot module available: rsh, priority: 10
n-1<4983> ssi:boot:select: initializing boot module slurm
n-1<4983> ssi:boot:slurm: not running under SLURM
n-1<4983> ssi:boot:select: boot module not available: slurm
n-1<4983> ssi:boot:select: initializing boot module globus
n-1<4983> ssi:boot:globus: globus-job-run not found, globus boot will not run
n-1<4983> ssi:boot:select: boot module not available: globus
n-1<4983> ssi:boot:select: finalizing boot module slurm
n-1<4983> ssi:boot:slurm: finalizing
n-1<4983> ssi:boot:select: closing boot module slurm
n-1<4983> ssi:boot:select: finalizing boot module globus
n-1<4983> ssi:boot:globus: finalizing
n-1<4983> ssi:boot:select: closing boot module globus
n-1<4983> ssi:boot:select: selected boot module rsh
n-1<4983> ssi:boot:send_lamd: getting node ID from command line
n-1<4983> ssi:boot:send_lamd: getting agent haddr from command line
n-1<4983> ssi:boot:send_lamd: getting agent port from command line
n-1<4983> ssi:boot:send_lamd: getting node ID from command line
n-1<4983> ssi:boot:send_lamd: connecting to 127.0.0.1:33111, node id 0
n-1<4983> ssi:boot:send_lamd: sending dli_port 32908
[1] 4983 lamd -H 127.0.0.1 -P 33111 -n 0 -o 0 -d
n-1<4980> ssi:boot:rsh: successfully launched on n0 (localhost)
n-1<4980> ssi:boot:base:server: expecting connection from finite list
n-1<4980> ssi:boot:base:server: got connection from 127.0.0.1
n-1<4980> ssi:boot:base:server: this connection is expected (n0)
n-1<4980> ssi:boot:base:server: remote lamd is at 127.0.0.1:32908
n-1<4980> ssi:boot:base:server: closing server socket
n-1<4980> ssi:boot:base:server: connecting to lamd at 127.0.0.1:33112
n-1<4980> ssi:boot:base:server: connected
n-1<4980> ssi:boot:base:server: sending number of links (1)
n-1<4980> ssi:boot:base:server: sending info: n0 (localhost)
n-1<4980> ssi:boot:base:server: finished sending
n-1<4980> ssi:boot:base:server: disconnected from 127.0.0.1:33112
n-1<4980> ssi:boot:base:linear: finished
n-1<4980> ssi:boot:rsh: all RTE procs started
n-1<4980> ssi:boot:rsh: finalizing
n-1<4980> ssi:boot: Closing
[ramshankar_at_59 /]$ n-1<4983> ssi:boot:rsh: finalizing
n-1<4983> ssi:boot: Closing

I am trying to run the following command:

mpirun -np 2 $AMBERHOME/exe/sander -O -i min.in -o min.out -p 6pti.top \
    -c 6pti.crd -r 6pti.rst -x 6pti.traj -ref 6pti.crd

and I get this error:

Unit 6 Error on OPEN: min.out
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
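
The output above contains two separate complaints: sander could not open min.out, and a process quit before calling MPI_INIT. A minimal sketch of two quick checks follows. Both helper names are hypothetical, the fallback install path is taken from step 1, and the string search is only a heuristic for telling a serial build from an MPI build.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the file contains the text "mpi_init"
# in any letter case. A plain string search, so a heuristic only.
mentions_mpi_init() {
    LC_ALL=C grep -i -q 'mpi_init' "$1"
}

# Hypothetical helper: succeed if the directory exists and is writable.
can_write_in() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Assumption: sander lives under $AMBERHOME/exe as in the steps above.
sander="${AMBERHOME:-/usr/local/amber10}/exe/sander"
if [ -f "$sander" ]; then
    if mentions_mpi_init "$sander"; then
        echo "sander: MPI_Init string found"
    else
        echo "sander: no MPI_Init string found (possibly a serial build)"
    fi
else
    echo "sander: not found at $sander"
fi

# min.out is created in the directory the job is started from.
if can_write_in "$PWD"; then
    echo "current directory is writable"
else
    echo "current directory is not writable"
fi
```

The second check matters because the output file is given as a relative name, so it is opened in whatever directory mpirun was launched from.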

So my question is: how do I use LAM (without parallel processes) and
mpirun (with parallel processes)?
How should the environment variables be set, and which variables need to
be set?

Please help me out.

----Archana
_______________________________________________
AMBER mailing list
AMBER_at_ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber