AMBER Archive (2005)
Subject: AMBER: Amber Performance in Parallel on Itanium
From: Robert J. Woods (rwoods_at_ccrc.uga.edu)
Hi Folks,
The Itanium cluster is running RHEL3 Update 4 with Scali for management. The MPI traffic goes out over Myrinet, and we use a 10/100 Mb LAN for management and NFS. We are using the Intel compilers to build Amber, but we are not using the Intel math libraries, or any others for that matter.
The shared Amber8 directory is NFS-mounted, as is the user's working directory. We are seeing relatively poor scaling (3.2-fold speed-up on 8 CPUs). For comparison, on an essentially equivalent setup on our Xeon cluster we see reasonable scaling (6.0-fold on 8 CPUs).
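To put those two numbers side by side: if you assume Amdahl's law applies (a simplification, since our problem looks like I/O contention rather than serial code), you can back out the effective serial/overhead fraction each speedup implies. A minimal Python sketch, using only the speedups and CPU count quoted above:

    def parallel_fraction(speedup, n_cpus):
        # Invert Amdahl's law, S = 1 / ((1 - p) + p / n), for p:
        #   p = n * (S - 1) / (S * (n - 1))
        return n_cpus * (speedup - 1.0) / (speedup * (n_cpus - 1.0))

    for label, speedup in (("Itanium", 3.2), ("Xeon", 6.0)):
        p = parallel_fraction(speedup, 8)
        print("%s: %.1fx on 8 CPUs -> ~%.0f%% parallel, ~%.0f%% serial/overhead"
              % (label, speedup, 100.0 * p, 100.0 * (1.0 - p)))

This gives roughly 21% serial/overhead on the Itanium cluster versus roughly 5% on the Xeons, which is why the scaling difference looks so large at 8 CPUs.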
On the Itanium cluster, what we do see is that when we start an n-way parallel job, n-1 of the processors are pegged at ~100% utilization, while the remaining processor starts very high, then falls to about 50% and stays there. We have run Ethereal on the head node to watch packets. As the code starts up we see, as expected, lots of NFS queries from all of the nodes. Then, as that one processor falls to ~50% use, we see heavy NFS traffic between the head node and the node hosting the low-performing processor. Once that CPU drops to 50%, you can look at the 100 Mb switch and see enormous amounts of traffic between the head node and that node.
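For anyone who wants to watch for the same pattern, here is a minimal per-CPU polling sketch in Python. It is illustrative only: it assumes the third-party psutil package and an arbitrary 30-point lag threshold, neither of which is the tooling we actually used (we simply watched the utilization numbers and ran Ethereal on the head node).

    import psutil

    SAMPLES = 30        # poll for roughly 30 seconds
    LAG_THRESHOLD = 30  # flag CPUs this many points below the busiest one

    for _ in range(SAMPLES):
        # Per-CPU utilization over a one-second window.
        loads = psutil.cpu_percent(interval=1.0, percpu=True)
        peak = max(loads)
        laggards = [i for i, load in enumerate(loads)
                    if peak - load > LAG_THRESHOLD]
        print(" ".join("cpu%d:%5.1f%%" % (i, load)
                       for i, load in enumerate(loads)))
        if laggards:
            print("  lagging CPUs:", laggards)

On our cluster this would show n-1 CPUs near 100% and one CPU flagged as lagging around 50%, matching what the switch traffic suggests.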
This behavior is not present on the Xeon system, where all CPUs appear to run at about 100%.
Could this problem simply be due to our use of NFS as a way to share the required files?
Rob Woods
--