AMBER Archive (2009)

Subject: Re: [AMBER] pmemd running very slow in amber10

From: Robert Duke (rduke_at_email.unc.edu)
Date: Thu May 14 2009 - 08:38:29 CDT


Hi Vijay,
Well, I am feeling sort of slow this morning too... So Ross responded to
this one before me (means Ross was in the office by 6:00, yipes...), and all
his points are good ones. What I notice is you say "very slow" but then
basically remove all the timing data from the output, so we have no idea how
fast sander ran on various node counts, or how fast pmemd ran on various
node counts. This could be absolutely anything from unrealistic
expectations to a machine hardware problem at the disk or at the
interconnect or elsewhere, a compiler version problem, an mkl problem (note
the OMP issues these days, especially on smp's), an mpi build/config
problem, other junk running on the altix4700 (any of these shared memory
machines behave badly if you have a few guys in the background running, let's
say, something like gaussian), what exact config of a 4700, etc. etc. etc.
Without data I cannot even begin to guess. The ntt=3 certainly will not be
helping at 64 cpu's, but it won't be the whole story either. So you have to
do this sort of thing right. You take a standard benchmark. My favorite is
factor ix. You run it with sander at 1, 2, 4, 8, 16, 32, 64, ... cpu's.
You run it with pmemd at 1, 2, 4, 8, 16, 32, 64, ... cpu's (so you have to
build the uniprocessor version of pmemd too; I don't do this on all
platforms, but it is the FIRST thing I would do if it looked like cpu
performance was below par, to look for problems at the single cpu level).
You do it 2 or 3 times so you have some idea of variability. You do it when
the machine is not being pounded into a slobbering goo by your buddies in
the lab running their machine-killing software (when I see problems with a
benchmark, I go looking for these guys, ESPECIALLY on an smp as opposed to a
cluster). You save the bottom of all the mdout's so we have complete timing
info. For pmemd, you save all the logfiles, because you can look at those
(well, I can), and spot a wide variety of problems with the hardware or
software. THEN you can say if pmemd is running slower than expected..., and
we can maybe help you. As far as I am concerned, by the way, all this stuff
is slow, and I am agog at how long you guys will wait for results.
Regards - Bob Duke
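
The benchmarking procedure described above (same benchmark, repeated runs, a range of cpu counts) lends itself to a small summary script. What follows is only a minimal sketch in Python: the function name `scaling_table` and the timing numbers are hypothetical placeholders, not measurements from any machine, and the wall-clock times are assumed to have been copied by hand from the bottom of each mdout.

```python
from statistics import mean, stdev


def scaling_table(timings):
    """Summarize repeated benchmark runs.

    timings maps a cpu count to a list of wall-clock times in seconds,
    one entry per repeated run at that cpu count.
    Returns rows of (cpus, mean_time, spread, speedup, efficiency),
    with speedup and efficiency measured against the smallest cpu count.
    """
    base_cpus = min(timings)
    base_time = mean(timings[base_cpus])
    rows = []
    for cpus in sorted(timings):
        runs = timings[cpus]
        avg = mean(runs)
        # Run-to-run variability; undefined for a single run, so report 0.
        spread = stdev(runs) if len(runs) > 1 else 0.0
        speedup = base_time / avg
        # Ideal scaling gives an efficiency of 1.0.
        efficiency = speedup * base_cpus / cpus
        rows.append((cpus, avg, spread, speedup, efficiency))
    return rows


# Made-up placeholder timings (seconds), for illustration only.
example = {
    1: [1000.0, 1010.0, 990.0],
    2: [520.0, 515.0, 525.0],
    4: [280.0, 275.0, 285.0],
}

for cpus, avg, spread, speedup, eff in scaling_table(example):
    print(f"{cpus:4d} cpus  {avg:9.1f} s  +/- {spread:6.1f}  "
          f"speedup {speedup:6.2f}  efficiency {eff:5.2f}")
```

Running the same table once for sander and once for pmemd puts the two programs side by side at each cpu count, and the spread column shows whether a difference is larger than the run-to-run noise.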
----- Original Message -----
From: "Vijay Manickam Achari" <vjrajamany_at_yahoo.com>
To: "Amber mailing List" <AMBER_at_ambermd.org>; "Amber Scrops"
<amber_at_scripps.edu>
Sent: Thursday, May 14, 2009 1:41 AM
Subject: [AMBER] pmemd running very slow in amber10

Dear amber users,

I have installed amber10 and pmemd on an altix4700.
The installation went through without errors, following guidance from the
amber mailing list.

If I run sander.MPI in amber10, it runs fine, but if I run pmemd it runs very
slowly. I run my script directly, not through PBS.

Here are the contents of my *.in and *.out files.

*.in file
=========
Dynamic Simulation with Constant Pressure
 &cntrl
   imin=0,
   irest=1, ntx=7,
   iwrap = 1, ntxo=1,
   scnb=1.0, scee=1.0,
   ntt=3, gamma_ln = 1.0,
   tempi = 300.0, temp0=300.0, tautp=0.2,
   ntb = 2, ntp=1, taup=0.2,
   ntf=2,ntc=2,
   nstlim=100000, dt=0.001,
   ntwe=100, ntwx=100, ntpr=100, ntwr=-50000,
   ntr=0,
 /
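
As a quick sanity check on what this input asks for, the run length and output counts follow directly from the namelist values. The sketch below is plain arithmetic in Python; the function name `run_summary` is hypothetical, and only values that appear in the &cntrl block above are used.

```python
def run_summary(nstlim, dt, ntpr, ntwx, ntwe):
    """Derive run length and output counts from &cntrl settings.

    nstlim: number of MD steps; dt: time step in ps;
    ntpr, ntwx, ntwe: energy print, trajectory write and
    energy-file write intervals, all in steps.
    """
    return {
        "simulated_ps": nstlim * dt,
        "ps_between_prints": ntpr * dt,
        "energy_prints": nstlim // ntpr,
        "trajectory_frames": nstlim // ntwx,
        "mden_records": nstlim // ntwe,
    }


# Values copied from the input file above.
summary = run_summary(nstlim=100000, dt=0.001, ntpr=100, ntwx=100, ntwe=100)
for key, value in summary.items():
    print(f"{key:>20s}: {value}")
```

With these settings the run covers 100 ps and writes output every 0.1 ps, which agrees with the TIME(PS) stamps in the output, where step 100 lands 0.1 ps after the starting time.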

*.out
============

          -------------------------------------------------------
          Amber 10 SANDER 2008
          -------------------------------------------------------

| PMEMD implementation of SANDER, Release 10

| Run on 05/14/2009 at 13:32:03

  [-O]verwriting output

File Assignments:
| MDIN: MD-betaMalto-lowLyo.in
| MDOUT: betaMalto-lowLyo-MD00-run0000.out
| INPCRD: e2malto-lowLyo-equi21.rst_100000
| PARM: maltose-lowLyo.top
| RESTRT: betaMalto-lowLyo-MD01-run0100.rst
| REFC: refc
| MDVEL: mdvel
| MDEN: mden
| MDCRD: betaMalto-lowLyo-MD00-run0000.traj
| MDINFO: mdinfo
|LOGFILE: logfile

 Here is the input file:

Dynamic Simulation with Constant Pressure
 &cntrl
   imin=0,
   irest=1, ntx=7,
   iwrap = 1, ntxo=1,
   scnb=1.0, scee=1.0,
   ntt=3, gamma_ln = 1.0,
   tempi = 300.0, temp0=300.0, tautp=0.2,
   ntb = 2, ntp=1, taup=0.2,
   ntf=2,ntc=2,
   nstlim=100000, dt=0.001,
   ntwe=100, ntwx=100, ntpr=100, ntwr=-50000,
   ntr=0,
 /

| Conditional Compilation Defines Used:
| MPI
| SLOW_INDIRECTVEC
| PUBFFT
| MKL

| Largest sphere to fit in unit cell has radius = 24.102

| New format PARM file being parsed.
| Version = 1.000 Date = 06/23/08 Time = 17:06:22
| Duplicated 0 dihedrals

| Duplicated 0 dihedrals

--------------------------------------------------------------------------------
   1. RESOURCE USE:
--------------------------------------------------------------------------------

 getting new box info from bottom of inpcrd

 NATOM = 23736 NTYPES = 9 NBONH = 14776 MBONA = 9216
 NTHETH = 27648 MTHETA = 12032 NPHIH = 45312 MPHIA = 21248
 NHPARM = 0 NPARM = 0 NNB = 123552 NRES = 1256
 NBONA = 9216 NTHETA = 12032 NPHIA = 21248 NUMBND = 9
 NUMANG = 14 NPTRA = 20 NATYP = 9 NPHB = 1
 IFBOX = 1 NMXRS = 81 IFCAP = 0 NEXTRA = 0
 NCOPY = 0

| Coordinate Index Table dimensions: 18 9 10
| Direct force subcell size = 5.1696 5.3561 5.0569

     BOX TYPE: RECTILINEAR

--------------------------------------------------------------------------------
   2. CONTROL DATA FOR THE RUN
--------------------------------------------------------------------------------

General flags:
     imin = 0, nmropt = 0

Nature and format of input:
     ntx = 7, irest = 1, ntrx = 1

Nature and format of output:
     ntxo = 1, ntpr = 100, ntrx = 1, ntwr = -50000
     iwrap = 1, ntwx = 100, ntwv = 0, ntwe = 100
     ioutfm = 0, ntwprt = 0, idecomp = 0, rbornstat= 0

Potential function:
     ntf = 2, ntb = 2, igb = 0, nsnb = 25
     ipol = 0, gbsa = 0, iesp = 0
     dielc = 1.00000, cut = 8.00000, intdiel = 1.00000
     scnb = 1.00000, scee = 1.00000

Frozen or restrained atoms:
     ibelly = 0, ntr = 0

Molecular dynamics:
     nstlim = 100000, nscm = 1000, nrespa = 1
     t = 0.00000, dt = 0.00100, vlimit = 20.00000

Langevin dynamics temperature regulation:
     ig = 71277
     temp0 = 300.00000, tempi = 300.00000, gamma_ln= 1.00000

Pressure regulation:
     ntp = 1
     pres0 = 1.00000, comp = 44.60000, taup = 0.20000

SHAKE:
     ntc = 2, jfastw = 0
     tol = 0.00001

| Intermolecular bonds treatment:
| no_intermolecular_bonds = 1

| Energy averages sample interval:
| ene_avg_sampling = 100

Ewald parameters:
     verbose = 0, ew_type = 0, nbflag = 1, use_pme = 1
     vdwmeth = 1, eedmeth = 1, netfrc = 1
     Box X = 93.053 Box Y = 48.205 Box Z = 50.569
     Alpha = 90.000 Beta = 90.000 Gamma = 90.000
     NFFT1 = 96 NFFT2 = 50 NFFT3 = 54
     Cutoff= 8.000 Tol =0.100E-04
     Ewald Coefficient = 0.34864
     Interpolation order = 4
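
The Ewald coefficient printed above can be related to the cutoff and tolerance on the two preceding lines. The Python sketch below assumes the coefficient beta is chosen so that erfc(beta * cutoff) / cutoff equals the tolerance; that criterion is an assumption about how the value is derived, not something stated in this output, and the function name `ewald_coefficient` is hypothetical.

```python
import math


def ewald_coefficient(cutoff, tol, lo=0.0, hi=10.0, iters=100):
    """Solve erfc(beta * cutoff) / cutoff = tol for beta by bisection.

    Assumes the direct-sum tolerance criterion described in the lead-in.
    The left-hand side decreases monotonically in beta, so bisection on
    the bracket [lo, hi] converges to the single root.
    """
    def residual(beta):
        return math.erfc(beta * cutoff) / cutoff - tol

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid   # direct-space tail still too large: increase beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Cutoff and Tol as printed in the Ewald parameters block above.
beta = ewald_coefficient(cutoff=8.0, tol=1.0e-5)
print(f"Ewald coefficient ~ {beta:.5f}")
```

Under that assumption the result should come out close to the 0.34864 reported in the output.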

| PMEMD ewald parallel performance parameters:
| block_fft = 1
| fft_blk_y_divisor = 4
| excl_recip = 0
| excl_master = 0
| atm_redist_freq = 320

--------------------------------------------------------------------------------
   3. ATOMIC COORDINATES AND VELOCITIES
--------------------------------------------------------------------------------

 begin time read from input coords = 4012.500 ps

 Number of triangulated 3-point waters found: 1000

     Sum of charges from parm topology file = 0.00000000
     Forcing neutrality...

| Dynamic Memory, Types Used:
| Reals 908174
| Integers 1893246

| Nonbonded Pairs Initial Allocation: 168376

| Running AMBER/MPI version on 64 nodes

--------------------------------------------------------------------------------
   4. RESULTS
--------------------------------------------------------------------------------

 ---------------------------------------------------
 APPROXIMATING switch and d/dx switch using CUBIC SPLINE INTERPOLATION
 using 5000.0 points per unit in tabled values
 TESTING RELATIVE ERROR over r ranging from 0.0 to cutoff
| CHECK switch(x): max rel err = 0.3338E-14 at 2.509280
| CHECK d/dx switch(x): max rel err = 0.8261E-11 at 2.768360
 ---------------------------------------------------

 NSTEP = 100 TIME(PS) = 4012.600 TEMP(K) = 301.34 PRESS = -45.6
 Etot = 72323.5211 EKtot = 16896.3909 EPtot = 55427.1302
 BOND = 4331.7919 ANGLE = 12660.8451 DIHED = 3557.8100
 1-4 NB = 5543.6944 1-4 EEL = 112668.1311 VDWAALS = -10250.7744
 EELEC = -73084.3679 EHBOND = 0.0000 RESTRAINT = 0.0000
 EKCMT = 1136.7441 VIRIAL = 1359.7065 VOLUME = 226657.6514
                                                     Density = 1.0897
 Ewald error estimate: 0.1325E-03
 ------------------------------------------------------------------------------

 NSTEP = 200 TIME(PS) = 4012.700 TEMP(K) = 300.36 PRESS = 51.9
 Etot = 72308.4304 EKtot = 16841.3962 EPtot = 55467.0342
 BOND = 4223.5093 ANGLE = 12742.0334 DIHED = 3598.7024
 1-4 NB = 5569.0962 1-4 EEL = 112679.4301 VDWAALS = -10228.2235
 EELEC = -73117.5137 EHBOND = 0.0000 RESTRAINT = 0.0000
 EKCMT = 1124.6180 VIRIAL = 870.8898 VOLUME = 226443.8438
                                                     Density = 1.0907
 Ewald error estimate: 0.8322E-04
 ------------------------------------------------------------------------------

I don't know what the error is.
Can anyone help?
Thank you in advance.

Vijay Manickam Achari
(PhD Student c/o Prof Rauzah Hashim)
Chemistry Department,
University of Malaya,
Malaysia
vjramana_at_gmail.com
_______________________________________________
AMBER mailing list
AMBER_at_ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
