AMBER Archive (2007)

Subject: Re: AMBER: Sander slower on 16 processors than 8

From: Martin Stennett (martin.stennett_at_postgrad.manchester.ac.uk)
Date: Thu Feb 22 2007 - 15:58:57 CST


I'm sorry, I stand somewhat corrected: my system is probably too small to
show the performance increase from multiple processors. I'm afraid I was
just told to use Sander without being told what systems it was designed
for.
I might point out that for small systems it does slow dramatically (measured
using an actual stopwatch), but I accept that it wasn't a fair test.
Apologies for any slight.
Martin
----- Original Message -----
From: "Adrian Roitberg" <roitberg_at_qtp.ufl.edu>
To: <amber_at_scripps.edu>
Sent: Thursday, February 22, 2007 9:25 PM
Subject: Re: AMBER: Sander slower on 16 processors than 8

> Martin and Steve,
> I am not the one who made sander fast, but I must make sure that you and
> everyone else understand the data you are seeing.
>
> First, as the benchmarks in the amber web page show, sander scales nicely
> to some 8 to 16 processors, with PMEMD being about the best out there in
> terms of scaling.
>
> So, before saying that sander slows dramatically with two processors,
> please note that the benchmarks indicate otherwise. I would test your own
> systems much more carefully before making such comments.
>
> Now: how did you measure your timings?
>
> time or timex is the WRONG measure, and I believe this is what you did.
> The timings you mention are probably CUMULATIVE CPU times and not
> wallclock times.
>
> Please look at the output file itself to see how long it really took.
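
The difference between the two clocks mentioned above can be sketched in a
few lines of Python. This is only an illustration, not anything taken from
sander: the helper names (measure, busy, waiting) and the workloads are
hypothetical. It times one CPU-bound call and one call that merely waits,
the latter standing in for a process blocked on communication.

```python
import time


def measure(func, *args):
    """Return (wallclock_seconds, cpu_seconds) spent in func(*args)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def busy(n):
    """Hypothetical CPU-bound loop: the process is computing the whole time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def waiting(seconds):
    """Idle wait, standing in for a process blocked on communication."""
    time.sleep(seconds)


if __name__ == "__main__":
    for name, func, arg in (("busy", busy, 2_000_000), ("waiting", waiting, 0.5)):
        wall, cpu = measure(func, arg)
        print(f"{name:8s} wallclock {wall:.2f} s   CPU {cpu:.2f} s")
```

The waiting call uses almost no CPU time although wallclock time passes, so
the two numbers answer different questions. For a parallel job, CPU time
summed over all processes can likewise differ a great deal from the elapsed
time of the run, which is why the run's own reported elapsed time is the
figure to compare.
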
>
> Just my 2 cents worth.
>
>
> Martin Stennett wrote:
>> In my experience Sander slows dramatically with even two processors. The
>> message passing interface used means that it frequently drives itself
>> into bottlenecks, with one or more processors waiting for very long
>> periods for others to finish.
>> It also passes an extraordinary amount of data between threads, though
>> with your setup this shouldn't be as much of a factor as it was on my
>> test system.
>> To me it seems that AMBER is great from the point of view of a chemist,
>> and very accessible should one want to change it. From a computational
>> point of view, however, it needs a bit of optimisation and tweaking
>> before it should be considered a serious solution.
>> Martin
>> ----- Original Message -----
>> From: Sontum, Steve
>> To: amber_at_scripps.edu
>> Sent: Thursday, February 22, 2007 8:32 PM
>> Subject: AMBER: Sander slower on 16 processors than 8
>>
>>
>> I have been trying to get decent scaling for Amber calculations on our
>> cluster and keep running into bottlenecks. Any suggestions would be
>> appreciated. The following are benchmarks for factor_ix and jac on
>> 1-16 processors using Amber8 compiled with PGI 6.0, except for the LAM
>> runs, which used PGI 6.2.
>>
>> BENCHMARKS
>>
>> MPI (version)    benchmark    1     2     4     8     16   (processors)
>> mpich1 (1.2.7)   factor_ix    928   518   318   240   442
>> mpich2 (1.0.5)   factor_ix    938   506   262   *
>> mpich1 (1.2.7)   jac          560   302   161   121   193
>> mpich2 (1.0.5)   jac          554   294   151   111   181
>> lam (7.1.2)      jac          516   264   142   118   259
>>
>> * timed out after 3 hours
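
The BENCHMARKS figures above can be turned into speedup and parallel
efficiency relative to the 1-processor run. The following Python sketch is
only an illustration: the numbers are copied from the post (which does not
state their units), while the names TIMINGS and scaling are hypothetical.

```python
# Timings copied from the BENCHMARKS figures above; the post does not state units.
TIMINGS = {
    ("mpich1 1.2.7", "factor_ix"): {1: 928, 2: 518, 4: 318, 8: 240, 16: 442},
    ("mpich2 1.0.5", "factor_ix"): {1: 938, 2: 506, 4: 262},  # 8-processor run timed out
    ("mpich1 1.2.7", "jac"): {1: 560, 2: 302, 4: 161, 8: 121, 16: 193},
    ("mpich2 1.0.5", "jac"): {1: 554, 2: 294, 4: 151, 8: 111, 16: 181},
    ("lam 7.1.2", "jac"): {1: 516, 2: 264, 4: 142, 8: 118, 16: 259},
}


def scaling(times):
    """Map processor count -> (speedup, efficiency) relative to the 1-processor run."""
    t1 = times[1]
    # speedup = T(1) / T(n); efficiency = speedup / n
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(times.items())}


if __name__ == "__main__":
    for (mpi, bench), times in TIMINGS.items():
        print(f"{mpi} {bench}")
        for n, (speedup, eff) in scaling(times).items():
            print(f"  {n:2d} procs: speedup {speedup:5.2f}, efficiency {eff:6.1%}")
```

For the mpich1 jac runs, for example, this gives a speedup of about 4.6 on 8
processors (roughly 58% efficiency), falling to about 2.9 on 16.
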
>>
>> QUESTIONS
>>
>> First off, is it unusual for the calculation to get slower with an
>> increased number of processes?
>>
>> Does anyone have benchmarks for a similar cluster, so I can tell if
>> there is a problem with the configuration of our cluster? I would like
>> to be able to run on more than one or two nodes.
>>
>> SYSTEM CONFIGURATION
>>
>> The 10 compute nodes use 2.0 GHz dual-core Opteron 270 chips with 4 GB
>> of memory and 1 MB of cache, Tyan 2881 motherboards, an HP ProCurve 2848
>> switch, and a single 1 Gb/sec Ethernet connection to each motherboard.
>> The master node is configured similarly but also has 2 TB of RAID
>> storage that is automounted by the compute nodes. We are running SuSE
>> 2.6.5-7-276-smp for the operating system. Amber8 and mpich were compiled
>> with PGI 6.0.
>>
>> I have used ganglia to look at the nodes when a 16-process job is
>> running. The nodes are fully consumed by system CPU time. User CPU
>> time is only 5%, and this node is only pushing 1.4 kBytes/sec out over
>> the network.
>>
>>
>>
>> Steve
>>
>> ------------------------------
>>
>> Stephen F. Sontum
>> Professor of Chemistry and Biochemistry
>> email: sontum_at_middlebury.edu
>> phone: 802-443-5445
>
>
> --
> Dr. Adrian E. Roitberg
> Associate Professor
> Quantum Theory Project and Department of Chemistry
>
> University of Florida PHONE 352 392-6972
> P.O. Box 118435 FAX 352 392-8722
> Gainesville, FL 32611-8435 Email adrian_at_qtp.ufl.edu
> ============================================================================
>
> "To announce that there must be no criticism of the president,
> or that we are to stand by the president right or wrong,
> is not only unpatriotic and servile, but is morally treasonable
> to the American public."
> -- Theodore Roosevelt
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>
