AMBER Archive (2003)

Subject: AMBER: PMEMD Performance on Beowulf systems

From: Stephen.Titmuss_at_csiro.au
Date: Thu Dec 18 2003 - 21:19:59 CST


Hello All,

We have been testing PMEMD 3.1 on a 32-CPU cluster (16 dual-Athlon nodes)
with a gigabit switch. The performance we have been seeing, in terms of
scaling to larger numbers of CPUs, is a bit disappointing compared to
the figures released for PMEMD. For example, comparing ps/day rates for
the JAC benchmark (with the specified cutoff changes, etc.) on our
cluster (left column) against those presented for a 2.4 GHz Xeon
cluster, also with a gigabit switch (right column), gives:

          Athlon    Xeon
 1 cpu:      108       -
 2 cpu:      172     234
 4 cpu:      239     408
 8 cpu:      360     771
16 cpu:      419    1005
32 cpu:      417       -

In general, in terms of wall-clock time, we only see a parallel speedup
(relative to 1 CPU) of about 3.3 at 8 CPUs and struggle to get much past
3.9 at higher CPU counts. The parallel scaling presented for other
cluster machines appears to be much better. Has anyone else achieved
good parallel speedup on Beowulf systems?
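
For reference, the speedup figures quoted above are just the ps/day
rates divided by the 1-CPU rate. A minimal Python sketch of that
arithmetic, using only the Athlon column from the table (nothing here
is PMEMD-specific):

    # Parallel speedup and efficiency from the Athlon ps/day rates above.
    athlon_ps_per_day = {1: 108, 2: 172, 4: 239, 8: 360, 16: 419, 32: 417}

    base = athlon_ps_per_day[1]           # single-CPU throughput
    for ncpu, rate in sorted(athlon_ps_per_day.items()):
        speedup = rate / base             # e.g. 360/108 ~ 3.3 at 8 CPUs
        efficiency = speedup / ncpu       # fraction of ideal linear scaling
        print(f"{ncpu:2d} cpu: {rate:4d} ps/day  "
              f"speedup {speedup:.2f}  efficiency {efficiency:.2f}")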

Also, we are using the Portland Group f90 compiler and LAM/MPI in our
setup - has anyone experienced problems with this compiler or MPI
library with PMEMD?

Thanks in advance,

Stephen Titmuss
 
CSIRO Health Sciences and Nutrition
343 Royal Parade
Parkville, Vic. 3052
AUSTRALIA
 
Tel: +61 3 9662 7289
Fax: +61 3 9662 7347
Email: stephen.titmuss_at_csiro.au
www.csiro.au www.hsn.csiro.au

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu