AMBER Archive (2004)
Subject: RE: AMBER: Minimization error
From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Wed Aug 25 2004 - 12:23:13 CDT
Dear Armin
> The tests that Dr. Walker asked about, they passed on Octane with just a
> few warnings, but they turned out to be rounding errors. On the cluster I
> am not certain, as the system administrator had installed Amber7 there.
If the tests pass on the Octane then you should be able to trust the results
you get from it. Just a note though - did you run the tests in parallel for
the same number of processors you have been using for the minimisation? I.e.
did you setenv DO_PARALLEL "mpirun -np 2", for example, before running the
tests? Often things run perfectly on a single processor, but when you move
to multiple processors you are more likely to hit problems caused by
compiler / MPI optimisations etc.
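For anyone scripting this, the snippet below is a minimal sketch of setting DO_PARALLEL per processor count before launching the tests. The helper names (parallel_test_env, run_tests) are made up for illustration, the test command is left for the caller to supply, and treating an unset DO_PARALLEL as a serial run is an assumption rather than something stated above.

```python
import os
import subprocess


def parallel_test_env(n_procs, base_env=None):
    """Return a copy of the environment with DO_PARALLEL set for n_procs.

    Assumption: a serial run is requested by leaving DO_PARALLEL unset.
    """
    env = dict(os.environ if base_env is None else base_env)
    if n_procs > 1:
        # Same value as: setenv DO_PARALLEL "mpirun -np 2"
        env["DO_PARALLEL"] = f"mpirun -np {n_procs}"
    else:
        env.pop("DO_PARALLEL", None)
    return env


def run_tests(test_cmd, n_procs, cwd=None):
    """Run a caller-supplied test command (hypothetical helper).

    test_cmd is a list such as the command you normally use to start the
    test suite; it is not filled in here on purpose.
    """
    env = parallel_test_env(n_procs)
    return subprocess.run(test_cmd, cwd=cwd, env=env, check=False)
```

Calling run_tests once per processor count you actually use for production runs keeps the test conditions and the run conditions the same.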
Minimisation is a very difficult thing to get reproducible on different
numbers of processors due to rounding errors. However, your energies are
wildly different and so I am quite suspicious of the results. One quick
check - ensure you haven't got SHAKE turned on during the minimisation.
Try taking one of the structures you got and run MD with that structure on
1, 2, 4 CPUs etc. on your Octane and your cluster. Run it only for about 100
steps, and set ntpr=1 so it prints the info on every step. MD on different
machines and numbers of CPUs will diverge over time, again due to rounding
errors. However, over 100 steps you should get trajectories that are
almost identical to each other (if you used the same starting structure). If
the first 5 steps or so are NOT identical (to the last couple of decimal
places) then something is definitely wrong. Post your output files at this
point and we can take a look and see if something is not getting broadcast
correctly in the parallel run.
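The comparison of the first few steps can be roughly sketched as below. This assumes each printed step in the output has a line containing "NSTEP = <n>" followed by a line containing "EPtot = <value>"; check that against your own output files before relying on it. The function names, the tolerance of 0.01 and the two sample texts are invented purely for illustration and are not real results.

```python
import re

# Assumed output layout: "NSTEP = <int>" and, on a later line, "EPtot = <float>".
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
EPTOT_RE = re.compile(r"EPtot\s*=\s*(-?\d+\.\d+)")


def read_eptot(text):
    """Return {step: EPtot}, keeping the first value seen for each step."""
    energies = {}
    step = None
    for line in text.splitlines():
        m = NSTEP_RE.search(line)
        if m:
            step = int(m.group(1))
        m = EPTOT_RE.search(line)
        if m and step is not None and step not in energies:
            energies[step] = float(m.group(1))
    return energies


def first_divergence(a, b, n_steps=5, tol=1e-2):
    """Return the first of the earliest n_steps common steps where the
    energies differ by more than tol, or None if they all agree."""
    for step in sorted(set(a) & set(b))[:n_steps]:
        if abs(a[step] - b[step]) > tol:
            return step
    return None


# Invented sample text, only to show the call pattern.
SAMPLE_1CPU = """
 NSTEP =        1   TIME(PS) =       0.001  TEMP(K) =   300.00
 Etot   =      -100.0000  EKtot   =        20.0000  EPtot      =      -120.0000
 NSTEP =        2   TIME(PS) =       0.002  TEMP(K) =   300.10
 Etot   =       -99.5000  EKtot   =        20.0000  EPtot      =      -119.5000
"""
SAMPLE_2CPU = SAMPLE_1CPU.replace("-119.5000", "-118.0000")

if __name__ == "__main__":
    step = first_divergence(read_eptot(SAMPLE_1CPU), read_eptot(SAMPLE_2CPU))
    print("first differing step:", step)
```

In practice you would read the two output files from disk and pass their contents to read_eptot. A difference that shows up in the very first steps points at a setup or broadcast problem rather than ordinary rounding drift.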
> I have appended the final results output from each of the runs as well as
> attached an excel spreadsheet with these results a bit more organized. My
> question is, which of these states is acceptable? Are all of these wrong?
Since your differences are so huge I suspect that yes, something is indeed
wrong. But it will take a bit to pin it down. The energy values on the first
step would be much more helpful for diagnosis than those on the last step.
Are you sure the Octane tests passed for both 1 and 2 CPUs?
All the best
Ross
/\
\/
|\oss Walker
| Department of Molecular Biology TPC15 |
| The Scripps Research Institute |
| Tel:- +1 858 784 8889 | EMail:- ross_at_rosswalker.co.uk |
| http://www.rosswalker.co.uk/ | PGP Key available on request |
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu