AMBER Archive (2009)

Subject: [AMBER] parallel amber on cluster

From: Siavoush Dastmalchi (Dastmalchi.s_at_tbzmed.ac.ir)
Date: Mon May 18 2009 - 01:04:58 CDT


Dear List,

 

I am trying to run parallel AMBER on a cluster, but it does not work across different nodes. It runs in parallel on the master node, i.e., on 4 processors, but fails when I use the following command:

 

mpirun -np 8 /opt/amber9/exe/sander -O -i min.in -o min.out -p egf-egfr_solvated.prmtop -c egf-egfr_solvated.inpcrd -r min.rst -ref egf-egfr_solvated.inpcrd

 

This is meant to run sander on a total of 8 processors across two nodes. Here is part of the error message that I get:

 

  Unit 5 Error on OPEN: min.in
  Unit 5 Error on OPEN: min.in
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
  Unit 5 Error on OPEN: min.in
forrtl: error (78): process killed (SIGTERM)
Image              PC        Routine            Line        Source
sander             0849AA33  Unknown            Unknown     Unknown
sander             0849A053  Unknown            Unknown     Unknown
sander             0845B022  Unknown            Unknown     Unknown
sander             08417BE6  Unknown            Unknown     Unknown
sander             0841A8CE  Unknown            Unknown     Unknown
Unknown            B7F2F420  Unknown            Unknown     Unknown
sander             080D4D43  Unknown            Unknown     Unknown
sander             080C1D92  Unknown            Unknown     Unknown
sander             080C0EF1  Unknown            Unknown     Unknown
sander             0804A161  Unknown            Unknown     Unknown
libc.so.6          0053D7E4  Unknown            Unknown     Unknown
sander             0804A0A1  Unknown            Unknown     Unknown

  Unit 5 Error on OPEN: min.in
forrtl: error (78): process killed (SIGTERM)
Image              PC        Routine            Line        Source
…..
…..

If I run the same command using just 4 processors on my master node, it works perfectly. I don’t understand what I am doing wrong.
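
The repeated "Unit 5 Error on OPEN: min.in" lines indicate that some sander processes could not open the input file. That would be consistent with, though not proof of, the working directory not being visible on the second node. A minimal sketch of a check that may help narrow this down is given here. The function name check_inputs is hypothetical and not part of AMBER or LAM/MPI, and the file names are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of AMBER or LAM/MPI): report which of the
# given input files are readable from the current working directory.
# Prints one line per file and returns 1 if any file is missing or unreadable.
check_inputs() {
    status=0
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "ok: $f"
        else
            echo "MISSING: $f"
            status=1
        fi
    done
    return $status
}

# Example call, run from the directory where mpirun is started:
#   check_inputs min.in egf-egfr_solvated.prmtop egf-egfr_solvated.inpcrd
```

Running the same check from the same directory path on the second node (for example after logging in there, or through the lamexec program mentioned in the error text) should show whether min.in is reachable from that node; the exact remote invocation is left out because it depends on the local setup.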

I would appreciate it if you could kindly help me with this.

 

Cheers, Siavoush

_______________________________________________
AMBER mailing list
AMBER_at_ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber