AMBER Archive (2008)

Subject: Re: AMBER: Amber9 fails test.parallel

From: Mark Williamson (mark.williamson_at_imperial.ac.uk)
Date: Wed Jan 30 2008 - 14:04:31 CST


Marc Cozzi wrote:
> rank 0 in job 12 ndrl.secure.net_46053 caused collective abort of all
> ranks
> exit status of rank 0: return code 174

Does it fail on the same test every time? What is the topology of your
computer setup (e.g. a single multi-core machine, or several nodes)?

Off the top of my head, the SIGSEGV signal *could* indicate a memory
hardware problem. If the failures occur at random points, I'd suggest
burning an image of memtest86+ ( http://www.memtest.org/ ) to a CD and
booting the affected machine from it. This runs a series of memory
tests and should indicate whether there is a memory problem.
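
To tell "same test every time" apart from "random points", one option
is to repeat the run a few times and compare where each run stops. The
following is only a minimal Python sketch under stated assumptions: the
helper names (run_repeatedly, is_consistent) and the stand-in command
are hypothetical and not part of AMBER; substitute whatever command you
actually use to launch the parallel tests.

```python
import subprocess
import sys


def run_repeatedly(cmd, runs=5):
    """Run cmd `runs` times; return a list of (exit_code, last_output_line)."""
    outcomes = []
    for _ in range(runs):
        # check=False: a non-zero exit is the thing being observed, not an error
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        lines = (result.stdout + result.stderr).strip().splitlines()
        last_line = lines[-1] if lines else ""
        outcomes.append((result.returncode, last_line))
    return outcomes


def is_consistent(outcomes):
    """True if every run ended with the same exit code and last output line."""
    return len(set(outcomes)) <= 1


if __name__ == "__main__":
    # Hypothetical stand-in command; replace with the real test invocation.
    cmd = [sys.executable, "-c", "print('placeholder test run')"]
    outcomes = run_repeatedly(cmd, runs=3)
    for code, last_line in outcomes:
        print(f"exit code {code}: {last_line}")
    print("consistent across runs:", is_consistent(outcomes))
```

If every run stops at the same place with the same exit code, that
points more towards a software or configuration issue; outcomes that
vary from run to run would fit the hardware hypothesis better, though
neither result is conclusive on its own.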

regards,

Mark
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu