AMBER Archive (2004)

Subject: RE: AMBER: execution time

From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Fri Jul 09 2004 - 13:52:08 CDT


Dear Fabien,

> I am running MD on a protein with AMBER7 on 16 processors SGI
> Origin 3800.

Are you the only person running on this machine at the time, or is it a
shared resource? Do you run sander over all 16 CPUs or fewer?
 
> For mdtrj, only a small part in the middle was different (I
> suppose my problem of corrupted

Seems reasonable if everything else is identical. However, I would see if
there is some diagnostic software to check that your hardware is not failing,
since you shouldn't get file corruption.

> the mdout files concern the timing.
> There is a great difference in the time of calculation needed
> (35000 s
> for one and 41000s for the
> second), despite this is the same calculation (the identical results
> show it has been the same).
> Is it normal ?

This is normal for big machines where several things are going on at once.
If other people are logged into the machine and using it then your job will
take longer. Also you don't mention which job took longer. The fact that one
gave you a corrupted mdcrd file suggests that it was not running correctly
and so the timings are probably not valid. If it took longer it may have
been waiting while the machine continually tried to write to the file which
might have been in a bad sector on the disk. Or alternatively it may have
taken less time since the machine was returning that it had written the file
when it hadn't. Also, the amount of free memory, which depends on what other
processes are running, can greatly affect performance, as can contention for
disks, CPUs and the backplane. If you are the only person on the machine and
it doesn't run any server processes (NFS, web server etc.), then you can
expect the times to be reasonably similar for the same run. If it is a shared
machine then the times will probably fluctuate wildly...
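
To put the two timings you quote in perspective, here is a quick calculation.
It is only a sketch; the function name relative_slowdown is made up for
illustration and is not part of AMBER.

```python
def relative_slowdown(fast_s, slow_s):
    """Return how much longer the slower run took, as a fraction of the faster run."""
    if fast_s <= 0.0:
        raise ValueError("fast_s must be positive")
    return (slow_s - fast_s) / fast_s


# The two wall-clock times quoted in the original question, in seconds.
slowdown = relative_slowdown(35000.0, 41000.0)
print(f"{slowdown:.1%}")  # prints 17.1%
```

So the slower job took roughly 17% longer than the faster one.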

> And what is even stranger for me is that some times do not appear in
> both reports : there are
> one time Other after RunMDTime (left column) and one time Other after
> FRC Collect Time (right
> column).

I wouldn't worry about this. If you look in $AMBERHOME/src/sander/new_time.f
you will find around line 374 the following:

c minimal percent of parent to be printed
c
    minp = 0.005d0

So, anything that makes up less than 0.5% of its parent's time is not
printed. Hence in your second job the two missing "Other" entries probably
contributed less than 0.5% of the time to their sections and so their
printing was skipped.
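
The effect of that threshold can be sketched in a few lines of Python. This
is not the code from new_time.f, just an illustration of the idea: the
function name printed_entries and all the timings are invented, and whether
sander treats an entry at exactly 0.5% as printed is an assumption here.

```python
MINP = 0.005  # same value as minp in the excerpt above: 0.5% of the parent


def printed_entries(parent_time, children, minp=MINP):
    """Return the (name, time) pairs whose share of parent_time is at least minp."""
    if parent_time <= 0.0:
        return []
    return [(name, t) for name, t in children if t / parent_time >= minp]


# Hypothetical timings in seconds, invented purely for illustration.
parent = 1000.0
children = [("Nonbond", 900.0), ("Bond/Angle/Dihedral", 96.0), ("Other", 4.0)]

shown = printed_entries(parent, children)
for name, t in shown:
    # "Other" is 0.4% of the parent here, so it never reaches this loop.
    print(f"{name:<22s}{t:10.2f} ({t / parent:6.2%})")
```

With these made-up numbers "Other" accounts for 0.4% of the parent and is
dropped from the report, while the same entry at, say, 6 seconds (0.6%) would
be printed. That is the kind of run-to-run difference you are seeing.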

All the best
Ross

/\
\/
|\oss Walker

| Department of Molecular Biology TPC15 |
| The Scripps Research Institute |
| Tel:- +1 858 784 8889 | EMail:- ross_at_rosswalker.co.uk |
| http://www.rosswalker.co.uk/ | PGP Key available on request |

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu