AMBER Archive (2004)

Subject: AMBER: Amber 7 on SGI: Sander MPI problem

From: Rohn Wood (llystrata_at_mac.com)
Date: Tue Apr 13 2004 - 17:34:12 CDT


Compiling Amber 7 with Machine.sgi_mpi produces a sander executable that
segfaults under MPI (see the errors below under PROBLEM). I have seen a
few references to this problem on the list, but no clear resolution. The
only proposed fix that included a code modification was at:
http://structbio.vanderbilt.edu/archives/amber-archive/2003/2629.phtml
However, that fix did not work for me.

Any insights would be appreciated.

Rohn Wood
University of Montana-Missoula
----------------

ARCHITECTURE

Origin 2400 32 CPU R12000
IRIX 6.5.23m

SOFTWARE

I mpi 04/09/2004 MPI 4.3 (MPT 1.8)
I mpi.hdr 04/09/2004 MPI 4.3 Headers
I mpi.hdr.lib 04/09/2004 MPI 4.3 Library Headers

I mpt 04/09/2004 MPT 1.8

I ftn90_dev 02/27/2003 Fortran 90 Headers and Libraries, 7.3
I ftn90_dev.sw 02/27/2003 Fortran 90 Software
I ftn90_dev.sw.ftn90 02/27/2003 Fortran 90 Compiler
I ftn90_fe 09/08/2003 Fortran 90 Front-end, 7.3

I c++_dev 02/27/2003 C++ Headers and Libraries, 7.3
I c++_dev.sw 02/27/2003 C++ Software
I c++_dev.sw.c++ 02/27/2003 C++ Compiler
I c++_dev.sw.lib 02/27/2003 C++ Libraries
I c++_fe 09/08/2003 C++ Front-end, 7.3

COMPILE
Compiled Amber 7 with Machine.sgi_mpi machine file from the SGI site:
MACHINE -> Machines/Machine.sgi_mpi
( http://www.sgi.com/industries/sciences/chembio/resources/amber/Machine.sgi_mpi )

PROBLEM

I set the environment for MPI: setenv DO_PARALLEL 'mpirun -np 2'
(I have varied the -np value from 1 to 32 with the same results.)
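For context, the Amber 7 test scripts simply prepend $DO_PARALLEL to the
sander command line, so the same Run.* scripts work serially (empty
variable) and in parallel. A minimal sketch of that convention -- the
input/output file names here are illustrative placeholders, not taken
from the actual Run.cytosine script:

```shell
# DO_PARALLEL is prepended to the sander invocation by each Run.* script;
# leaving it empty yields a plain serial run of the same test.
DO_PARALLEL='mpirun -np 2'
SANDER=../../exe/sander
CMD="$DO_PARALLEL $SANDER -O -i mdin -o mdout"
echo "$CMD"
```

This is why varying -np only requires changing DO_PARALLEL, with no edits
to the test scripts themselves.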

The gibbs test suite runs fine under MPI, as do the other suites that do
not invoke sander. Any suite that invokes sander fails under MPI, while
without MPI, sander and all of its tests run correctly -- so I have
narrowed the failure down to sander itself. (Note: I wrote a simple MPI
program to confirm that MPI works correctly on this system; it runs
fine.)

OUTPUT from make test.sander

mccfsrv01 149# make test.sander
         cd dmp; ./Run.dmp
This test not set up for parallel
  cannot run in parallel with #residues < #pes
         cd adenine; ./Run.adenine
This test not set up for parallel
  cannot run in parallel with #residues < #pes
==============================================================
         cd cytosine; ./Run.cytosine
MPI: Program ../../exe/sander, Rank 0, Process 142987 received signal
SIGSEGV(11)

MPI: --------stack traceback-------
PC: 0x5ddb100 MPI_SGI_stacktraceback in /usr/lib32/libmpi.so
PC: 0x5ddb544 first_arriver_handler in /usr/lib32/libmpi.so
PC: 0x5ddb7d8 slave_sig_handler in /usr/lib32/libmpi.so
PC: 0xfaee79c _sigtramp in /usr/lib32/libc.so.1
PC: 0xa6ca7c0 flush_ in /usr/lib32/libfortran.so
PC: 0x10164d6c amflsh in ../../exe/sander
PC: 0x1000c150 sander in ../../exe/sander
PC: 0xace9d74 main in /usr/lib32/libftn.so

MPI: dbx version 7.3.3 (78517_Dec16 MR) Dec 16 2001 07:45:22
MPI: Process 142987 (sander) stopped at [__waitsys:24 +0x8,0xfa53338]
MPI: Source (of
/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s) not
available for Process 142987
MPI: > 0 __waitsys(0x0, 0x22e8d, 0x7fff1960, 0x3, 0x20, 0x1, 0x34370,
0xafc406c)
["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/proc/waitsys.s":24,
0xfa53338]
MPI: 1 _system(0x7fff1a30, 0x22e8d, 0x7fff1960, 0x3, 0x20, 0x1,
0x34370, 0xafc406c)
["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/stdio/system.c":116,
0xfa5f868]
MPI: 2 MPI_SGI_stacktraceback(0x0, 0x22e8d, 0x7fff1960, 0x3, 0x20,
0x1, 0x34370, 0xafc406c)
["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":242,
0x5ddb268]
MPI: 3 first_arriver_handler(0xb, 0x71756974, 0x7fff1960, 0x3, 0x20,
0x1, 0x34370, 0xafc406c)
["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":445,
0x5ddb544]
MPI: 4 slave_sig_handler(0xb, 0x22e8d, 0x7fff1960, 0x3, 0x20, 0x1,
0x34370, 0xafc406c)
["/xlv4/mpt/1.8/mpi/work/4.3/lib/libmpi/libmpi_n32_M4/adi/sig.c":542,
0x5ddb7e0]
MPI: 5 _sigtramp(0x0, 0x22e8d, 0x0, 0x3, 0x0, 0x0, 0x34370,
0xafc406c)
["/xlv42/6.5.23m/work/irix/lib/libc/libc_n32_M4/signal/sigtramp.s":71,
0xfaee79c]
MPI: 6 flush_(0x0, 0x1000c138, 0x1fffffff, 0x1, 0x0, 0x1,
0x7fff26c0, 0x0)
["/j10/mtibuild/v741m/workarea/v7.4.1m/libf/fio/f77wrappers.c":188,
0xa6ca7c8]
MPI: 7 amflsh(0x0, 0x7fff26c0, 0x1fffffff, 0x1, 0x0, 0x1,
0x7fff26c0, 0x0) ["/opt/amber7mpi/src/Machines/standard/_sys_.f":63,
0x10164d6c]
MPI: 8 sander(0x0, 0x0, 0x0, 0x0, 0x0, 0x6, 0x0, 0x0)
["/opt/amber7mpi/src/sander/_sander_.f":1431, 0x1000c150]
MPI: 9 main(0x0, 0x7fff26c0, 0x1fffffff, 0x1, 0x0,
0x1, 0x7fff26c0, 0x0)
["/j10/mtibuild/v741m/workarea/v7.4.1m/libF77/main.c":101, 0xace9d74]
MPI: 10 __start()
["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M4/csu/crt1text.s":
177, 0x10008988]

MPI: -----stack traceback ends-----
MPI: Program ../../exe/sander, Rank 0, Process 142987: Dumping core on
signal SIGSEGV(11) into directory /opt/amber7mpi/test/cytosine
MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: aborting job
MPI: Received signal 11

   Program error
*** Error code 1 (bu21)

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu