AMBER Archive (2006)
Subject: Re: AMBER: job crashes
From: JunJun Liu (ljjlp03_at_gmail.com)
Date: Tue Sep 26 2006 - 22:39:52 CDT
Hi Xiaowei,
Search for "semop lock failed" and you will find that it is related to MPI
shared memory. Try running "cleanipcs" on each compute node to clear the
leftover IPC resources.
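
If "cleanipcs" is not available on a node, the same cleanup can be checked by hand. The following is only a minimal sketch, assuming the usual Linux "ipcs -s" column layout (key, semid, owner, perms, nsems) and an "ipcrm" that accepts "-s <semid>"; the function names are hypothetical. It only prints the removal commands so that you can review them before running anything.

```python
import getpass
import subprocess


def semaphore_ids(ipcs_output, owner):
    """Return the semaphore array ids in `ipcs -s` output that belong to owner."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Assumed data row layout: key semid owner perms nsems
        if len(fields) >= 5 and fields[0].startswith("0x") and fields[2] == owner:
            ids.append(int(fields[1]))
    return ids


def removal_commands(ids):
    """Build one `ipcrm -s <id>` command per semaphore array."""
    return [["ipcrm", "-s", str(i)] for i in ids]


def dry_run():
    """Print, without executing, the commands that would remove your semaphores."""
    result = subprocess.run(["ipcs", "-s"], capture_output=True, text=True, check=True)
    for cmd in removal_commands(semaphore_ids(result.stdout, getpass.getuser())):
        print(" ".join(cmd))
```

Call dry_run() on each compute node while none of your MPI jobs are running there, since removing semaphores that belong to a live job would crash it.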
Good luck!
Liu
On Tue, 26 Sep 2006 23:09:04 -0400, Xiaowei (David) Li <xl3a_at_virginia.edu>
wrote:
> Dear all:
> I have had frequent MD job crashes (almost every job I submitted) during
> recent simulation work. The crashes always happen near the end of a
> simulation (for example, around 950 ps in a 1 ns simulation). All of the
> error messages contain "semop lock failed", as follows.
> Job is running on node(s):
> ------------------------
> compute-2-5 compute-2-6 compute-2-7 compute-2-9
> ------------------------
> p4_error: latest msg from perror: Invalid argument
> p0_9469: p4_error: OOPS: semop lock failed: -1
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> forrtl: error (69): process interrupted (SIGINT)
> p3_28418: (207846.623829) net_send: could not write to fd=5, errno = 32
> forrtl: error (69): process interrupted (SIGINT)
> p0_9469: (207848.918605) net_send: could not write to fd=4, errno = 32:
>
> I was running the parallel simulation with MPI on a Linux cluster with
> AMD Opteron 244 processors. The input file is:
> &cntrl
> imin = 0,
> irest = 1,
> ntx = 5,
> ntb = 2,
> pres0 = 1.0,
> ntp = 1,
> tautp = 5,
> taup =5,
> cut = 10,
> ntr = 0,
> ntc = 2,
> ntf = 2,
> tempi = 300.0,
> temp0 = 300.0,
> ntt = 1,
> nstlim =500000,
> dt = 0.002,
> ntpr = 100,
> ntwx = 100,
> ntwr = 1000,
> nscm=1,
> &end
> Any help or suggestion will be deeply appreciated. Thanks.
>
> Best,
> Xiaowei Li
> University of Virginia
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu
--
JunJun Liu
College of Chemistry
Central China Normal University
WuHan 430079
P.R. China