AMBER Archive (2005)

Subject: Re: AMBER: Floating Exception in sander

From: Robert Duke (rduke_at_email.unc.edu)
Date: Wed Jan 05 2005 - 07:55:22 CST


Anshul -
A floating exception can be a floating-point overflow, underflow, divide by
zero, a floating-point operation on a chunk of data that does not represent
a valid floating-point number, etc.
If you had exactly coincident atoms, it could happen, but you would also
have a rather large energy. The "check" command in tleap/xleap should
prevent this in your input data (it just lets you know the problem exists,
and then you can edit the model to fix it, I believe). I would be
interested in more particulars in this case (i.e., exactly what happened,
what input, what output, how many processors, what exact CPU/OS
version/compiler; perhaps also build sander with debugging on). As it turns
out, in pmemd development about the only place I have seen floating-point
exceptions is on Digital Unix machines (i.e., Compaq/DEC/HP Unix,
AlphaServers at PSC; long ago I also saw this on IA32 machines using the PGI
compiler, which is one reason I abandoned that compiler). In the past this
has happened in a "2 rail Quadrics" configuration on the terascale
AlphaServer at PSC with 128 procs, while the default one-rail configuration
works fine. I never got a fix from HP on this one, and eventually just
suggested that everyone not run on two rails. The reason you get FPEs with
two rails is probably data corruption
on the interconnect. The benefit of using two rails is that it can improve
throughput at high processor count by maybe 10% because the interconnect is
the bottleneck. I am pursuing this two-rail problem on this machine again
for the next release of pmemd because I have done some work that can
increase throughput by as much as 50% at high processor count, and the full
benefit is not realized on the terascale computer at PSC unless you use two
rails. So anyway, I have seen screwball FPE problems before, but they are
typically associated with system hardware/software problems, and I am very
interested in any reports of problems of this type. In general users should
not be seeing these things.
Regards - Bob Duke
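
The coincident-atom case mentioned above can be illustrated with a minimal
Python sketch. It is not how the tleap/xleap "check" command or sander works
internally; the function names (close_contacts, coulomb_term), the 0.5
angstrom cutoff, and the toy coordinates are all assumptions made for
illustration. It scans a coordinate list for atom pairs that sit nearly on
top of each other and shows that a 1/r pair term divides by zero when r is
exactly 0.

```python
import math
from itertools import combinations


def close_contacts(coords, cutoff=0.5):
    """Return (i, j, distance) for every atom pair closer than cutoff.

    coords is a list of (x, y, z) tuples in angstroms. The 0.5 angstrom
    default is an illustrative assumption, not a value taken from AMBER.
    """
    hits = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        d = math.dist(a, b)
        if d < cutoff:
            hits.append((i, j, d))
    return hits


def coulomb_term(qi, qj, r):
    """Unscaled Coulomb-like pair term q_i * q_j / r (hypothetical helper)."""
    return qi * qj / r


# Toy input: atoms 0 and 2 are exactly coincident.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 0.0, 0.0)]

for i, j, d in close_contacts(coords):
    print(f"atoms {i} and {j} are {d:.3f} angstroms apart")
    try:
        coulomb_term(1.0, -1.0, d)
    except ZeroDivisionError:
        print("  1/r pair term fails: division by zero")
```

Python reports the division as a ZeroDivisionError; in compiled code with
floating-point trapping enabled, the same operation would typically surface
as a floating exception instead.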

----- Original Message -----
From: <anshul_at_imtech.res.in>
To: <amber_at_scripps.edu>
Sent: Wednesday, January 05, 2005 6:26 PM
Subject: AMBER: Floating Exception in sander

> Dear Amber users,
> While doing minimization of a protein molecule I got the following error:
>
> Floating Exception
> process.bat: Abort - core dumped
>
> However, there is no error in the .out file. This error appears on the
> screen, and the process stops and makes a core file.
>
> Can anyone tell me what this means and how to handle it? I am using
> amber-7 installed on a Digital Unix machine.
>
> Thanks in advance for any kind of help and suggestions.
>
> With best regards,
> Anshul
>
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
>