AMBER Archive (2005)

Subject: AMBER: New pmemd build utility; pathscale vs. pgi vs. ifort em64 opteron benchmarks

From: Robert Duke (rduke_at_email.unc.edu)
Date: Mon Mar 07 2005 - 10:47:02 CST


Folks -
I have put together a new configuration utility for pmemd that will
hopefully be available early this week on the amber web site. The tar file
just needs to be untar'd in the amber8/src/pmemd directory, which will
place a new_configure_README, a new_configure script, and a new_config_data
directory in the pmemd directory (a short command sketch follows the list
below). Nothing from your previous installation is overwritten; you can use
the old setup by doing a "./configure" or the new one by doing a
"./new_configure". This is a bit of a kludge, but the new configuration data
is incompatible with the old, and keeping both maintains maximal
functionality. What you get with new_configure is up-to-date support for
linux systems, compilers, and mpi implementations; a total of 35 hw/sw
configurations on linux are now supported. Please read the readme for more
details. Regarding what is supported:

Linux - 32-bit P3 or P4; 64-bit Opteron

Compilers:
Pathscale 2.0
Intel ifort 8.1.024 (use EM64 version for opteron)
Portland Group 5.2-4 (earlier versions have a bad bug)

MPI Interconnects:
MPICH 1.2.6 (backward compatibility to at least 1.2.5.2)
LAM MPI 7.1.1 (backward compatibility to at least 6.5.9)
MPICH2 1.0 (this stuff is faster than mpich!)
Myrinet MPICH-GM 1.2.6..13 (GM 2.0.17) - earlier versions probably work fine
but this is what I tested.
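
For concreteness, installation should look something like the following
(the tar file name and the new_configure arguments shown here are
placeholders only - see new_configure_README for the actual keywords your
configuration needs):

  cd $AMBERHOME/src/pmemd           # i.e., amber8/src/pmemd
  tar xvf pmemd_new_configure.tar   # hypothetical tar file name; nothing is overwritten
  ./new_configure <platform> <compiler> <mpi>   # arguments illustrative only; see the readme
  make                              # then build pmemd as usual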

There is a new mechanism for both pathscale and ifort that uses
LD_LIBRARY_PATH to set rpath, which should make it reasonably likely that
your processes will all find the compiler shared libraries. I was prompted
to go this route by changes in recent releases of ifort that made things
even worse than they already were. Read the readme for details. Thanks to
Ross Walker, Scott Brozell and Dave Case for providing a little impetus in
this area, as well as for suggesting that ifort em64 would work on opterons
(until Intel came out with the em64 version, I had not even worried about this).
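
For what it is worth, the general idea is simple (this is a minimal sketch
of the concept, not the exact commands the configure script generates, and
the library path shown is just an example):

  # point LD_LIBRARY_PATH at your compiler's shared libraries, e.g. for ifort em64:
  export LD_LIBRARY_PATH=/opt/intel_fce_80/lib
  # the link step can then bake that path into the executable as an rpath,
  # roughly equivalent to passing -Wl,-rpath,$LD_LIBRARY_PATH to the compiler driver,
  # so mpi processes on remote nodes find libraries like libifcore without
  # any per-node environment setup

Check the files new_configure actually generates to see exactly what is
emitted for your configuration.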

THIS SHOULD END A LOT OF THE HEADACHES FOR CONFIGURING PMEMD ON LINUX
SYSTEMS!

(thanks for your patience; I intend to keep updating the web site tar ball
as appropriate to support new configurations; it just happens that we
released amber 8 at about the worst time imaginable in regard to stability
of linux hardware and software).

PMEMD on Opterons, the continuing story - some testing and benchmarking of
PGI 5.2-4 and Intel ifort em64 8.1.024.

Okay, I basically completed my testing and benchmarking on the opterons late
last week, adding effort in the area of the pgf90 and ifort compilers.
These compilers were tested to the extent possible with the existing
interconnects at U of Utah. All the pathscale work was done earlier in the
week, and I was able to test virtually everything there (the pathscale
compiler passes all 21 regression tests for uniprocessor and mpich-gm
builds, and it also produces the fastest code of all the compilers, though
the margin is not large - roughly 5%).

Testing:

For Portland Group PGI pgf90, v 5.2-4:

1) pgi-built pmemd passes my 21-test suite for single-processor runs.
2) pgi-built pmemd passes my 21-test suite for 4-processor runs using
mpich 1.2.6.
3) I could not test with mpich2 because that software has not been built for
pgi at U of Utah, and I did not want to take the time to do it myself.
4) I could build pmemd with pgi and mpich-gm, and it ran one test okay at 2
processors (this is with an underlying mpich-gm library built with pgi, and
a gm library under that built with heaven only knows what - it may have been
supplied prebuilt by Myrinet, but I am not sure; if they would provide source
for the gm lib, that would be better). At more than 2 processors, this beast
seg faults very consistently. This may be a problem with how the folks at
Utah built mpich-gm with pgi, or it could be that they also need to build gm
with pgi; I am suspicious of an incompatibility between these two layers,
but really have no data to speak of, other than that the problem occurs when
the myrinet hardware gets involved (2 processors use shared memory for
communications on one node). The bottom line is that I cannot "certify" a
pgi + mpich-gm build of pmemd as okay to use, but I expect this is a fixable
problem.
5) The cpu_time() library function has very low resolution (multiple
seconds), which is kind of ugly.

For Intel ifort em64, v 8.1.024

1) ifort-built pmemd passes my 21-test suite for single-processor runs (this
is em64 on the opteron; ifort is probably the main compiler used for pmemd
on intel chips, and the new version (8.1.024) passes 1 and 4 processor tests
there).
2) mpi libraries for mpich, mpich2, and mpich-gm built against ifort em64
were not available at U of Utah, so I did not do multiprocessor testing. I
expect things would work fine.
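
For anyone who wants to run their own parallel checks, the standard amber
test machinery can be used in place of my own 21-test suite; something along
these lines (the exact target names live in the amber8/test Makefile, so
treat this as a sketch):

  # run the pmemd tests under 4 mpi processes
  export DO_PARALLEL="mpirun -np 4"   # amber's convention for parallel test runs
  cd $AMBERHOME/test
  make test.pmemd                     # check the Makefile for the exact pmemd target name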

Benchmarking:

I did not run extensive benchmarking, just some spot-checking. However,
pathscale clearly produces the fastest f90 code of the three compilers
considered here (pathscale, pgi, and ifort em64).

For bigcp (90,906 atoms, constant pressure) on the 1.4 GHz opterons at Utah
I get:

1) Running on 1 processor:

Pathscale pathf90 v2.0 = 60.2 psec/day
PGI pgf90 v 5.2-4 = 57.1 psec/day
ifort em64 v 8.1.024 = 55.9 psec/day

2) Running on 8 processors, mpich 1.2.6:

Pathscale pathf90 v2.0 = 324 psec/day
PGI pgf90 v 5.2-4 = 312 psec/day
ifort em64 v 8.1.024 = not done (mpich libraries for ifort not
available)
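
For reference, psec/day is just simulated picoseconds per day of wallclock,
so converting a timed run is simple arithmetic (the numbers below are
illustrative, not taken from these benchmark logs); the figures above also
imply roughly a 5.4x speedup for pathscale going from 1 to 8 processors over
mpich (324 / 60.2).

  psec/day = nsteps * dt_in_ps / wallclock_seconds * 86400
  e.g. 1000 steps at dt = 0.0015 ps in roughly 2150 s of wallclock:
       1000 * 0.0015 / 2150 * 86400  =  ~60 psec/day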

I hope to have the more extensive pathscale benchmarks I sent to the
reflector posted on the amber web site for reference; they show the
excellent performance and scaling of pathscale-built pmemd on an
opteron-mpich-gm cluster, and also show that mpich2 outperforms mpich if you
have gigabit ethernet.

CAVEATS:

Okay, one thing to keep in mind with the faster interconnects like Myrinet
is that they can be touchy, and there can be problems with all the
asynchrony inherent in a fast interconnect. There have been problems with
mpich-gm in the past that required a software patch, and there are clearly
current problems with a pgi-built mpich-gm stack. SO, if you use these
interconnects, someone who is competent in systems setup and administration
needs to do the installation AND clearly test the basic functionality, and
then you need to test pmemd (see the sketch below). PMEMD basically works
very well on a lot of high end hardware, but if there are any problems with
threading or asynchrony in the interconnect itself, pmemd may well expose
them. I am aware of problems at the moment with an Infiniband installation
using pgi-built pmemd, and am hoping that we will be able to report that
using pathscale instead fixes the problem (and/or get a fix for pgi).
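
As a concrete example of the kind of basic functionality check I mean before
ever pointing pmemd at the interconnect: the cpi example program ships with
the mpich family, and just getting it to run cleanly across nodes catches a
lot of broken installations (paths vary by installation, so treat this as a
sketch):

  # sanity-check the mpi / interconnect layer by itself first
  cd <mpich install dir>/examples     # examples/basic in some mpich versions
  make cpi
  mpirun -np 8 ./cpi                  # each rank should report in and rank 0 prints pi;
                                      # hangs or segfaults here mean the mpi layer itself is broken
  # only after this runs cleanly is it worth chasing pmemd-level failures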

Many thanks to Tom Cheatham and the folks at U of Utah for access to their
delicatearches opteron cluster; this work would not otherwise have been
possible.

Regards - Bob Duke

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu