AMBER Archive (2003)
Subject: Re: AMBER: tru64 alpha
From: Robert Duke (rduke_at_email.unc.edu)
Date: Thu Oct 16 2003 - 21:50:08 CDT
Mu -
A few more thoughts. First of all, all this hardware info is interesting,
but unless the hardware is malfunctioning or completely overloaded, it
should be scaling much better than 25% at 8 processors. You should expect
at least 80% on anything except broken or completely overwhelmed hardware,
and Quadrics is supposed to be some of the better stuff out there (it
performs rather nicely on lemieux, one of the biggest supercomputers in the
world). PMEMD 3.03 scales at 94% on the latest IBM Regattas, so it is
capable of scaling rather nicely in this range on decent hardware. So check
with your system support people about
the state of the hardware interconnect and what other people get out of it.
Point out that a somewhat similar setup at PSC (lemieux) scales at 83%, so
something is wrong. Other possibilities to look at closely: first, the
build. I presume that to use Quadrics you used -lmpi -lelan in the link step.
Any chance you are getting mpich running on top of the wrong
software/hardware - like mpich over tcp/ip? This seems far-fetched, but
could happen. Finally, does the mdout file report that you are running on 8
nodes? In alphaserver-speak, a node is actually a cluster of 4 processors
sharing some hardware, whereas a "node" in amber-speak is a cpu. The
different usages of the term node often create confusion. At any rate, your
prun, if that is what you used, should be something like a "prun -N 2 -n 8",
which specifies that you want to use two 4-cpu nodes, and you want to use all
the cpus associated with each node (sorry if this is elementary, but it is
another source of possible confusion; I go nuts keeping all this junk
straight as I go from system to system). This and the 2 previous pieces of
mail sum up everything I can think of, but you definitely should be scaling
at better than 25% on 8 nodes.
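
To make the scaling numbers and the node/cpu bookkeeping concrete, here is a
rough sketch in Python. The timings are made up for illustration, the
function names are hypothetical, and the default of 4 cpus per alphaserver
node is an assumption taken from the description above; adjust it for your
machine.

```python
import math


def parallel_efficiency(t_one_cpu, t_n_cpus, n_cpus):
    """Scaling efficiency as a fraction: (speedup on n cpus) / n."""
    speedup = t_one_cpu / t_n_cpus
    return speedup / n_cpus


def prun_command(n_cpus, cpus_per_node=4):
    """Build a 'prun -N <nodes> -n <cpus>' prefix for fully used nodes.

    cpus_per_node=4 is an assumption (4-cpu alphaserver nodes).
    """
    n_nodes = math.ceil(n_cpus / cpus_per_node)
    return f"prun -N {n_nodes} -n {n_cpus}"


if __name__ == "__main__":
    t1 = 800.0  # hypothetical wall-clock seconds on 1 cpu
    # 8 cpus that only halve the run time give 25% scaling.
    print(parallel_efficiency(t1, 400.0, 8))
    # 80% scaling on 8 cpus would need the run to finish in 125 s.
    print(parallel_efficiency(t1, 125.0, 8))
    # 8 amber "nodes" (cpus) map onto 2 alphaserver nodes.
    print(prun_command(8))
```

In other words, 25% on 8 cpus means the job is only running about twice as
fast as on a single cpu.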
Regards - Bob
----- Original Message -----
From: "Mu Yuguang (Dr)" <YGMu_at_ntu.edu.sg>
To: <amber_at_scripps.edu>
Sent: Thursday, October 16, 2003 9:52 PM
Subject: RE: AMBER: tru64 alpha
> Dear Bob,
>
> The configuration of SC45 is
>
> interconnect building block
> 16-port or 128-port switch chassis at the heart of the AlphaServer SC
> Interconnect from Quadrics Supercomputers World, delivering up to 500
> MB/s per server, with 32 GB/s of cross-section bandwidth and MPI
> application latencies under 5 microseconds
> Fully integrated console management includes initial DECserver 732
> terminal servers
> Fully integrated management network including an initial 10/100 Ethernet
> switch
> Complete hardware and software documentation and media
>
> I checked the web; there is a CompaqMPI that seems to perform better than
> common mpich.
>
> Do you have any ideas?
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
>