AMBER Archive (2005)

Subject: Re: AMBER: Interpreting Amber8 Benchmark Results

From: Robert Duke (rduke_at_email.unc.edu)
Date: Wed Aug 31 2005 - 15:34:42 CDT


Ross -
PMEMD DOES report wall clock times, labeled "setup wallclock" and "nonsetup
wallclock", at the very end of the output. The other times are indeed cpu
times, as the headers say. In pmemd 9 I have made it even more explicit that
the times prior to the wallclock times are all cpu times; the pmemd 8 format
is basically a holdover from sander 6, and I retained it because I thought
the useful-info-to-clutter ratio was higher. So it really
should not be hard or confusing for folks to recognize the wallclock times
in pmemd; they are clearly labeled, and there is no intent to mislead anyone
about anything.
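
For instance, a quick Python sketch for grabbing those wallclock lines from a
pmemd output file; the exact formatting of the lines is an assumption here, so
adjust the pattern to whatever your output actually shows:

    import re

    # Pull the "setup wallclock" and "nonsetup wallclock" figures out of a
    # pmemd output file.  Assumes each label is followed by a number of
    # seconds on the same line.
    def pmemd_wallclock(mdout_path):
        times = {}
        pattern = re.compile(r"(setup wallclock|nonsetup wallclock)\D*([\d.]+)",
                             re.IGNORECASE)
        with open(mdout_path) as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    times[m.group(1).lower()] = float(m.group(2))
        return times

    # e.g. pmemd_wallclock("mdout")
    #   -> {'setup wallclock': ..., 'nonsetup wallclock': ...}
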
Regards - Bob
----- Original Message -----
From: "Ross Walker" <ross_at_rosswalker.co.uk>
To: <amber_at_scripps.edu>
Cc: <nkelshik_at_cisco.com>
Sent: Wednesday, August 31, 2005 4:09 PM
Subject: RE: AMBER: Interpreting Amber8 Benchmark Results

> Dear Nikhil,
>
>> I am trying to run Amber8 on EM64T with Intel Compilers. I have never
>> done this before and needed help to understand how to interpret the
>> benchmark data. The primary purpose of the benchmarks is to understand
>> how using different interconnects such as GbE, InfiniBand, 10 GigE may
>> affect the application performance.
>>
>> - What number should I be looking at from the benchmark summary report
>> (attached)?
>
> Ultimately you should be looking at the total time reported at the end of
> the output file. This is actually Wall Clock Time (NOT CPU TIME) so it will
> give you a real understanding of 'time to solution' on a specific machine
> setup. You can convert this to picoseconds of molecular dynamics per day
> (with dt in ps and total_time in seconds) by:
>
> ps per day = nstlim * dt * 86400 / total_time
>
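> For example, a quick Python sketch of that conversion; the function name and
> the numbers are just illustrative placeholders:
>
>     # ps of MD produced per day of wall-clock time
>     def ps_per_day(total_time_s, nstlim, dt_ps):
>         sim_ps = nstlim * dt_ps                 # simulated length in ps
>         return sim_ps * 86400.0 / total_time_s  # scale to one day (86400 s)
>
>     # e.g. 500 steps of 2 fs (0.002 ps) that took 120 s of wall-clock time
>     print(ps_per_day(120.0, 500, 0.002))        # -> 720.0 ps/day
>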
> If you are interested in parallel performance then things get a little more
> complicated. You can still run on, say, 1, 2, 4 cpus etc. and the total time
> reported will still be the total wallclock time.
>
> Also reported will be certain communication times such as FFT communication,
> CRD distribute time, FRC Collect time etc. You will find that these grow as
> the number of cpus grows. Ultimately these times dominate at large cpu
> counts, leading to simulations actually taking longer on a larger number of
> cpus.
>
> The easiest way to compare simulations with different interconnects is, if
> you can, to connect up the exact same system (cpu, memory, disks etc.) with
> each of the different interconnects and then run the major benchmarks. For
> each interconnect you can then plot wall clock time as a function of the
> number of cpus. You will then see how each interconnect performs and where
> its limit lies. E.g. GigE typically chokes beyond 16 cpus while InfiniBand
> will perform much better.
>
> Once you have this, it is an easy step to produce price/performance ratios
> etc.
>
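> For illustration, a small Python sketch of that comparison; every cpu count
> and wall-clock time below is a hypothetical placeholder, not a measurement:
>
>     # Speedup and parallel efficiency per interconnect from wall-clock times.
>     timings = {                    # seconds of wall-clock time keyed by cpu count
>         "GigE":       {1: 3600, 2: 1900, 4: 1050, 8: 650, 16: 520, 32: 560},
>         "InfiniBand": {1: 3600, 2: 1850, 4:  960, 8: 510, 16: 290, 32: 180},
>     }
>
>     for interconnect, runs in timings.items():
>         base = runs[1]             # single-cpu run as the reference point
>         print(interconnect)
>         for ncpu in sorted(runs):
>             speedup = base / runs[ncpu]
>             efficiency = speedup / ncpu
>             print(f"  {ncpu:3d} cpus: {runs[ncpu]:6d} s  "
>                   f"speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
>
> Dividing a cluster's price by the ps/day it achieves on the benchmark of
> interest then gives a simple price/performance figure.
>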
>> - What tests are more relevant to a customer who may be looking to
>> purchase a cluster to run Amber as a primary application?
>
> This really depends on the type of simulations they will be doing. Typically,
> the bigger the system, the better it scales in parallel. If they are largely
> planning on running simulations using periodic boundaries (PME) then you
> should look closely at the DHFR benchmark, which has 22930 atoms, and also
> the much larger factor_ix benchmark, which has 90906 atoms. Note that for
> these to run as efficiently as possible in parallel you should use the PMEMD
> module of Amber, which is designed specifically for running PME calculations
> efficiently in parallel. See the amber website for details on compiling pmemd
> on different systems. (BEWARE: while sander v8's total time is wallclock
> time, PMEMD's total time is NOT. This can be misleading since on big systems
> the efficiency with which the cpus are used can drop rapidly, and so the
> reported cpu time can often be much less than the REAL wall clock time. Thus,
> to understand the true throughput of the cluster you should calculate the
> wall clock time used by subtracting the reported start time from the reported
> end time. This is also true of any program you benchmark: the real measure of
> speed is the wall clock time to get the job done; the cpu time is largely
> irrelevant to the customer...)
>
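> One way to get the true wall-clock figure, independent of whatever a code
> reports, is simply to time the run yourself. A minimal Python sketch; the
> mpirun command line is only a placeholder for however you launch the job:
>
>     import subprocess, time
>
>     # Time the whole job externally; this is real elapsed wall-clock time,
>     # regardless of what the program itself reports.
>     cmd = ["mpirun", "-np", "8", "pmemd", "-O", "-i", "mdin", "-o", "mdout"]
>     start = time.time()
>     subprocess.run(cmd, check=True)
>     elapsed = time.time() - start
>     print(f"wall clock: {elapsed:.1f} s")
>
> That elapsed figure is what you would feed into the ps-per-day conversion
> above.
>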
> If the end user is looking to do mainly implicit solvent simulations then you
> should look at the gb_cox2 and gb_mb benchmarks. These should be run using
> sander, since pmemd does not support implicit solvent simulations. The
> scaling here to large numbers of cpus may not be as good as for the explicit
> solvent (PME) simulations.
>
> There are many other things that should be considered as well. A large
> number of GigE and 10GigE clusters are built with 32-port switches that are
> then chained together. Note that if the chaining is NOT non-blocking (i.e.
> has less bandwidth than the sum of all the ports) then you will see problems
> with scaling above 32 cpus, and also if queuing software is used that does
> not force locality of MPI processes to a single switch... This needs to be
> considered carefully, since under ideal conditions a GigE cluster may seem
> to work well, but once half of it gets loaded up the scaling of jobs on the
> other half may suffer due to blocking at the switches.
>
> Suffice it to say, the speed and bandwidth of the backplane are the most
> important things when running amber simulations in parallel. Disk speed is
> generally a secondary issue.
>
> Also note that faster cpus will generally give worse scaling for a given
> backplane, since the communication throughput remains fixed but the number
> of messages that need to be sent per second goes up as the cpus get the job
> done faster.
>
> I hope this information helps.
>
> If you want more in-depth advice please email me directly and I can arrange
> a time to talk with you.
>
> All the best
> Ross
>
> /\
> \/
> |\oss Walker
>
> | Department of Molecular Biology TPC15 |
> | The Scripps Research Institute |
> | Tel: +1 858 784 8889 | EMail:- ross_at_rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu