AMBER Archive (2006)

Subject: RE: AMBER: cluster architecture for the best amber performance

From: Ross Walker
Date: Wed May 03 2006 - 10:42:19 CDT

Hi Kateryna,

> Our institute is planning to buy a cluster and we are really
> interested in optimal architecture for the best amber performance.
> We are hesitating between:
> 1. dual socket for dual-core Athlon 64 X2(4200+) or
> 2. dual socket for dual-core Opteron (model 165 or 275) or
> 3. dual socket for dual-core Pentium D (920, 2.8GHz) or
> 4. dual socket for single-core Athlon 64 (3200+)
> Where the amber benchmarks for these architectures can be found?

This is ultimately one big can of worms, especially since the performance
can vary with the compilers you use, the interconnect between nodes in the
cluster, etc. There really are far too many variables to make proper
comparisons. My recommendation, whether you are building your own cluster
or ordering a pre-built one, would be to get the company concerned to lend
you one of each of the various options to try.

A few things to bear in mind though. Will this cluster only be used for
Amber calculations? If so, do you plan on running lots of small (say 4 cpu)
jobs or one or two large (>32 cpu) jobs? The type of cluster you want to
build is very dependent on the type of simulations you want to run.
Also, how much money do you have for the backplane? If it is going to be
gigabit ethernet you can pretty much forget going to more than 16 cpus in a
single run.

Also you should think about balancing the backplane against the number of
cpus per box. E.g. Option 1 will give you 4 cpus per box. Thus if you have a
1 Gbit ethernet backplane, the connection will be only 250 Mbit/s per cpu.
Thus 4 processor and probably 8 processor jobs will run quite well but much
beyond that won't.
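
The arithmetic above can be sketched in a few lines of Python. This is only
a rough illustration: it assumes the node's single link is shared evenly
among its cpus and ignores protocol overhead, and the function name is made
up for this example.

```python
def per_cpu_bandwidth_mbit(link_mbit: float, cpus_per_node: int) -> float:
    """Even share of one node's network link per cpu, in Mbit/s.

    Assumption: all cpus in the box share a single link equally.
    """
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    return link_mbit / cpus_per_node


GIGABIT = 1000.0  # nominal gigabit ethernet rate in Mbit/s

# Dual socket dual-core (options 1-3) versus dual socket single-core (option 4).
for label, cpus in [("dual socket, dual-core", 4),
                    ("dual socket, single-core", 2)]:
    share = per_cpu_bandwidth_mbit(GIGABIT, cpus)
    print(f"{label}: {cpus} cpus/box -> {share:.0f} Mbit/s per cpu")
```

Under the same assumption, the single-core option leaves each cpu twice the
share of a gigabit link, which is the sort of balance worth weighing against
the cost per box.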

Maybe others can chip in here with some specific numbers but my opinion
would be to go to the manufacturer and request some machines to test.

> What will be your recommendations for the HDD:
> 1. IDE
> 2. SCSI
This one is easy. For AMBER calculations go with IDE every time, since it is
a fraction of the price and MD simulations (unless you are doing some crazy
stuff like writing the trajectory on every step) will be totally cpu and
interconnect bound. Note you also want to make sure that you have a separate
backplane for NFS traffic - i.e. having the NFS traffic go over the same
interconnect that you use for MPI will be a disaster. My advice would be to
put a small local IDE disk in each node, as it makes configuration and
maintenance easier, and then have one node that provides dedicated NFS
services, with 4 or 5 SATA disks in RAID5.

I'm sorry I can't give you any hard numbers on the cpus but I don't have
access to all the different architectures. All I will say is that clock speed
is definitely not everything and not all cpus perform the same for all types
of runs. E.g.:

AMBER 9 factor IX Benchmark (SINGLE CPU RUN), PME Periodic
  CPU                  ps/day
  Pentium 4  2.8GHz     97.16
  Pentium-D  3.2GHz    111.96
  Power-4    1.7GHz    110.17
  Itanium-2  1.5GHz    176.11

AMBER 9 GB_MB Benchmark (SINGLE CPU RUN), Implicit solvent GB
  CPU                  ps/day
  Pentium 4  2.8GHz    239.89
  Pentium-D  3.2GHz    266.03
  Power-4    1.7GHz    249.93
  Itanium-2  1.5GHz    191.51

Note the difference in the Itanium-2 for the implicit solvent simulation: it
is the fastest of the four on PME but the slowest on GB.
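
To make the "clock speed is not everything" point concrete, here is a small
Python sketch that normalises the figures above by clock speed and compares
GB against PME throughput for each cpu. The ps/day numbers are copied from
the two benchmarks in this message; the variable and function names are just
for illustration.

```python
# (clock in GHz, PME ps/day, GB ps/day), copied from the benchmarks above.
benchmarks = {
    "Pentium 4 2.8GHz": (2.8, 97.16, 239.89),
    "Pentium-D 3.2GHz": (3.2, 111.96, 266.03),
    "Power-4 1.7GHz": (1.7, 110.17, 249.93),
    "Itanium-2 1.5GHz": (1.5, 176.11, 191.51),
}


def per_ghz(ps_per_day: float, ghz: float) -> float:
    """Throughput normalised by clock speed (ps/day per GHz)."""
    return ps_per_day / ghz


def gb_to_pme_ratio(cpu: str) -> float:
    """How much faster the GB benchmark runs than the PME one on a cpu."""
    _, pme, gb = benchmarks[cpu]
    return gb / pme


for cpu, (ghz, pme, gb) in benchmarks.items():
    print(f"{cpu:18s} PME/GHz {per_ghz(pme, ghz):6.1f}  "
          f"GB/GHz {per_ghz(gb, ghz):6.1f}  "
          f"GB/PME {gb_to_pme_ratio(cpu):4.2f}")

# Rank the cpus separately on each benchmark.
fastest_pme = max(benchmarks, key=lambda c: benchmarks[c][1])
slowest_gb = min(benchmarks, key=lambda c: benchmarks[c][2])
print("fastest on PME:", fastest_pme)
print("slowest on GB: ", slowest_gb)
```

The per-GHz columns vary widely between architectures, and the GB/PME ratio
differs from cpu to cpu, so a ranking taken from one type of run need not
hold for another.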

Maybe others who have access to the architectures you are considering can
chip in here.

All the best

|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- |
| | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.
