AMBER Archive (2005)

Subject: Re: AMBER: parallel AMBER/pmemd installation problem on Opteron

From: Robert Duke (rduke_at_email.unc.edu)
Date: Wed Feb 23 2005 - 11:27:13 CST


Lars-
If anything, I would say pmemd is more stable, or at least as stable, on
platforms it has been tested on. However, the pgi compiler is a bit of an
unknown quantity; apparently the pgi c compiler is known to be problematic.
When you are seeing hangs, that is most likely associated with the network
layer (mpi) somehow, and the only reason that pmemd may give more trouble in
this area than sander is that it uses nonblocking primitives a lot more, and
drives the net connection harder. I would be interested to hear exactly
what you are observing, and on what exact setup (hw + sw). It is
possible that my work on the cray machines will shed some light on the
problems, as it is a similar setup (opterons, mpich, pgi compilers). One
other thought. Early last year, I attempted to run on an opteron
workstation and had serious problems with heating; this would cause hangs of
the entire system (i.e., the machine locks up), and the problem was worse
with pmemd because it would drive the opteron fp unit about 50% harder. Any
chance your opterons have cooling problems? (On a dual p4 with
thermoregulated fans, I can hear the fans rev up as you go into a pmemd
run - sounds like a jet taxiing out.)
Regards - Bob

----- Original Message -----
From: "Lars Packschies" <packschies_at_rrz.uni-koeln.de>
To: <amber_at_scripps.edu>
Sent: Wednesday, February 23, 2005 11:22 AM
Subject: Re: AMBER: parallel AMBER/pmemd installation problem on Opteron

>
>
> --On Wednesday, February 23, 2005 08:17:54 -0500 Robert Duke
> <rduke_at_email.unc.edu> wrote:
>
>> Jyh-Shyong -
>> I just hit and fixed this one while porting pmemd to a cray with opteron
>> cpu's and a pgi compiler. Seems the pgi compiler is extra picky, or
>> interprets variable scope nesting restrictions a bit differently. A diff
>> file of the fix is:
>>
>
> [...]
>
> Oh, wow, thank you, I just stumbled over this problem yesterday, only a
> few minutes before Jyh-Shyong posted his report. I can compile pmemd now
> (opteron, mpich (Infiniband), pgf90).
>
> I tested Amber8 with all available patches on our opteron cluster (running
> Rocks) and everything seems to work fine so far. Pmemd, by contrast, just
> seems to hang quite often when tested with the factor ix benchmark.
>
> Interestingly, a pmemd run of factor ix with 4 slots (2 nodes) always
> hangs at the same point (if it generates output at all, sometimes it just
> opens the filehandle without writing to it)
>
> ======
> NSTEP = 400 TIME(PS) = 2543.275 TEMP(K) = 298.80 PRESS = 0.0
> Etot = -234217.5764 EKtot = 54870.9021 EPtot = -289088.4784
> BOND = 1149
> ======
>
> Several runs with 64 slots stopped earlier, while some of them ran just fine.
> The behavior seems rather erratic.
>
> Is pmemd known to be less stable than sander?
>
> Thanks in advance,
>
> Lars
>
> --
> Dr. Lars Packschies, Computing Center, University of Cologne
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
