AMBER Archive (2003)

Subject: Re: AMBER: Problem with MPI_Finalize

From: Thomas.Fox_at_bc.boehringer-ingelheim.com
Date: Tue Nov 18 2003 - 09:34:36 CST


Hi Bob -

It seems your guess is right. I changed the flush call in sys.f to
flush(lun, istat), and now things seem to work. The good thing is that
executables linked with this version of amflsh also work on my other
SGIs (all 6.5, though), so I can safely apply this modification in my
setting.

Thanks a lot,
Th.

Dr. Thomas Fox
Dept. Lead Discovery - Computer Aided Molecular Design
K91-00-10
Boehringer Ingelheim Pharma GmbH & Co KG
88397 Biberach, Germany
thomas.fox_at_bc.boehringer-ingelheim.com

-----Original Message-----
From: Robert Duke [mailto:rduke_at_email.unc.edu]
Sent: Tuesday, November 18, 2003 15:55
To: amber_at_scripps.edu
Subject: Re: AMBER: Problem with MPI_Finalize

Thomas -
This is sort of an "off the top of my head" guess, but it is possible you
are hitting the flush() problem with the new SGI libraries. What SGI
apparently did is change flush() to have 2 arguments instead of 1. Now
there is a second istat argument, and if you look at sys.f under
amber7/src/Machines/standard, you will see that amflsh uses a flush call
with 1 arg. Introduce a second integer arg, istat, that you ignore, and
things will probably be okay. What is happening is that the stack is
getting trashed on return from the flush call, and under some circumstances,
I think this kills the master process (which does the mdout i/o). Generally,
I think it just happens as a run is printing final data, and because the
master croaks, the MPI_Finalize calls get messed up. Anyway, it is worth a
try. I actually took the flush calls out of pmemd over this issue (much
annoyed), and instead do timed close/opens. Changing library interfaces is
bad. I
don't know if there is a sander7 bugfix for this one or not, but the problem
with doing a fix is you have to know the version of s/w being used (in other
words, doing the fix on machines with old SGI software will break them).
Regards - Bob
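
The "timed close/open" idea mentioned above can be sketched roughly as follows. This is only an illustration in Python, not the pmemd Fortran code: the class name TimedReopenLog and the parameters interval and clock are hypothetical, chosen for this example. The point is that closing a file pushes buffered output to the operating system, so a periodic close and reopen in append mode gives the effect of a flush without calling any flush routine.

```python
import time


class TimedReopenLog:
    """Append-only log that is closed and reopened every `interval` seconds.

    Hypothetical helper for illustration; it is not taken from pmemd.
    """

    def __init__(self, path, interval=60.0, clock=time.monotonic):
        self.path = path
        self.interval = interval
        self.clock = clock  # injectable so the timing can be tested
        self._fh = open(path, "a")
        self._last_reopen = clock()

    def write(self, text):
        self._fh.write(text)
        now = self.clock()
        if now - self._last_reopen >= self.interval:
            # Closing writes out buffered data; reopening in append mode
            # lets later writes continue at the end of the file.
            self._fh.close()
            self._fh = open(self.path, "a")
            self._last_reopen = now

    def close(self):
        self._fh.close()
```

With this arrangement the output file is at most about one interval behind the run (provided writes keep arriving), and no library flush interface is involved.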

----- Original Message -----
From: <Thomas.Fox_at_bc.boehringer-ingelheim.com>
To: <amber_at_scripps.edu>
Sent: Tuesday, November 18, 2003 9:10 AM
Subject: AMBER: Problem with MPI_Finalize

> Hi -
>
> looking through the archives, I didn't find anything helpful, so I'm
> reporting my own observations:
>
> I have compiled sander (amber7) on an SGI O3000 with IRIX64 6.5, and
> MPI 3.2.0.7, with MIPS Pro 7.3.1.2m. Running a MD simulation of a
> protein on this machine, everything goes fine and my MD
> calculation runs through smoothly...however, we just got a new machine
> (O3000 with IRIX 6.5, and MPI 4.3 MPT 1.8,
> but no compiler on it) and running my sander executable on it, I get
> basically identical results, my simulation runs
> through to the end, but now I get the following error message
>
> MPI: Program /home/foxt/amber7/exe_mpi/sander, Rank 0, Process 6182
> received signal SIGSEGV(11)
>
>
> MPI: --------stack traceback-------
>
>
> sh: dbx: not found
>
> MPI: -----stack traceback ends-----
> MPI: Program /home/foxt/amber7/exe_mpi/sander, Rank 0, Process 6182:
> Dumping core on signal SIGSEGV(11) into directory
> /home/foxt/PROJECTS/LIE_FXA/LIE_RUNS/RUNS_RST
> MPI: MPI_COMM_WORLD rank 0 has terminated without calling
> MPI_Finalize()
> MPI: aborting job
> MPI: Received signal 11
>
> The output stops before the final timing information ("5. TIMINGS") -
> but this could be a buffering issue... minimizations are no problem,
> just MD calculations.
>
> To be honest, this behavior is more an annoyance, as I don't get the
> timing information and get a lot of garbage in my log files (and, yes,
> lots of core dumps that I have to remove), but still...I have looked
> through the code but couldn't find anything obvious, but this is
> probably because I'm not familiar enough with MPI...
>
> Any idea/suggestion ?
>
> Th.
>
> Dr. Thomas Fox
> Dept. Lead Discovery - Computer Aided Molecular Design K91-00-10
> Boehringer Ingelheim Pharma GmbH & Co KG
> 88397 Biberach, Germany
> thomas.fox_at_bc.boehringer-ingelheim.com
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
>

