AMBER Archive (2003)Subject: Re: AMBER: machine dependent?
From: Michael Crowley (crowley_at_scripps.edu) 
Date: Tue Aug 12 2003 - 09:24:58 CDT
 
 
 
 
Dear Shuang,
 
Although you may be finding a hidden problem in Amber 7, I agree
 
with Dr. Case that one should not see SHAKE failures in a
 
production run, that is, a run that is meant to be a valid
 
simulation and not an equilibration. As soon as you see
 
a SHAKE failure, all confidence  in the validity of the
 
simulation is probably gone.
 
 I would like to explain a SHAKE failure and ask for patience
 
if it is obvious, in the hope that it will help one understand the
 
subsequent explanation of why different machines would
 
give different answers.
 
SHAKE failures a mystery to a large number of researchers and
 
I hope the following will help with understanding what is
 
going on how one should respond to the event.
 
 ======WARNING====== Very long verbose message ......
 
 SHAKE is a method to solve a set of nonlinear simultaneous
 
equations that describe the desired bond lengths between a group of
 
atoms, for instance the three bonds in TIP3 water (OH,OH,HH).
 
It is a little more complicated than that but not much.
 
When motion is propagated in the integration step of the dynamics,
 
the atoms move away from their desired bond distances. The new
 
coordinates are plugged into the SHAKE equations, and these
 
equations are solved for a new set of coordinates that do
 
have the desired bond lengths (and also follow some other constraints).
 
The solution is found using the assumption that the system has not
 
strayed too far from the desired distances in one step, and
 
the equations can be solved in the linear region of the equations.
 
The general method involves  many atoms and is solved by an
 
iterative method which only works within  the valid linear region.
 
Outside that region, there is either no solution of multiple solutions.
 
Thus if your atoms have moved so far that they are outside
 
the region where the SHAKE algorithm is valid, then SHAKE will
 
"fail" when it does not converge to a solution.
 
 SHAKE failures usually occur when one starts a simulation with
 
a very strained structure, one that has atoms too close or bonds
 
of angles stretched or compressed in a way that produces very large
 
(abnormally large) forces. The propagation of the classical motion will
 
send atoms out of the region where SHAKE is valid, and one will
 
see SHAKE failure. Sometimes the simulation can recover if the
 
resultant kinetic energy is removed (vlimit) and the bond
 
forces pull the system back to the region where SHAKE can find
 
a solution. Most of the time, this condition is avoided by
 
minimizing the system before starting any dynamics.
 
 Most people see SHAKE failure as a BAD thing, and use it as a signal
 
that there is something wrong with their structure or system.
 
Certainly, if a failure has occurred in the middle of a long
 
simulation, the system will definately have taken a step
 
away from the trajectory that would have been followed if
 
the constraints were enforced. One probably cannot trust the
 
data after that event. In other words, one is not really
 
simulating the system that one intends (at least one of the
 
rigid bonds is not rigid for that step).
 
 The messaage from me about SHAKE failure is to concentrate on your
 
system to see what is causing the SHAKE failures since your
 
trajectory is probably meaningless with them.
 
 If you are interested in a system with behavior that exceeds the
 
SHAKE limitations, then you must turn SHAKE off and use a much smaller
 
time step.
 
 Finally, I suggest that the reason that you might see different
 
behavior on different machines is that the propagation of numerical
 
error in an interative process will differ depending on how
 
the instructions are executed on different machines. With optimized
 
code, the CPUs on different machines are doing things in
 
different orders, combining operations in different ways that
 
are algebreicaly correct, but when done many times over will
 
accumulate roundoff error in different ways. There are ways to
 
track down what may be happening differently on different machines,
 
like turning off optimization to see if that is the source of
 
the divergence, but I think that your time is better spent
 
finding the source of you SHAKE failures.
 
 Apologies for the long message, I hope it helps, and that it
 
is not misleading. Comments +- criticisms welcome.
 
Sincerely,
 
Mike
 
 -----------------------------------------------------------------
 
Physical mail:   Dr. Michael F. Crowley
 
                 Department of Molecular Biology, TPC6
 
                 The Scripps Research Institute
 
                 10550 North Torrey Pines Road
 
                 La Jolla, California 92037
 
 Electronic mail: crowley_at_scripps.edu
 
Telephone:         858/784-9290
 
Fax:               858/784-8688
 
-----------------------------------------------------------------
 
 -----------------------------------------------------------------------
 
The AMBER Mail Reflector
 
To post, send mail to amber_at_scripps.edu
 
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
 
 
  
 |