| AMBER Archive (2009)
Subject: Re: [AMBER] binding free energy
From: Don.Bashford_at_stjude.org
Date: Sun Mar 01 2009 - 11:18:39 CST
 
 
 
 
I've seen this kind of problem happen either because a disk filled up
or because of a network outage on a cluster that uses NFS to let the
nodes write back to a main user disk.  It's unfortunate that if the
output failure happens while restrt is being written, your restrt
file ends up unusable.  And since output in general is failing at that
point, you get little or nothing in the way of error messages for clues.
 
I'm afraid the most rigorous choice is just to restart from your last
good restrt (prod2.rst?) and give up on the partial prod3 run.
Alternatively, you could try extracting the last good coordinate set
from the prod3 run's mdcrd file and turning that into a restrt file,
but you'll lose some precision in the coordinates, and you'll lose the
velocity information, so you would have to start from velocities drawn
from a Boltzmann distribution.
 
 It would be nice if sander and pmemd would try to be a little more
failsafe when writing the restrt file.  For example, one could move
 the previous restrt file to a temporary location in the same directory
 (filesystem), write the new restrt, and then, if successful, delete
 the old.  Of course, this would cause temporary spikes in disk usage
 that might make failure on a near-full filesystem come sooner rather
 than later.  But I think it would be worth it to have failures from
 which one can recover more easily.
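
The scheme described above can be sketched roughly as follows.  This
is only an illustration in Python, not how sander or pmemd write
restrt today; the function name and the ".bak" suffix are
hypothetical.

```python
import os

def write_restart_safely(path, text):
    """Keep the previous restart file until the new one is fully written."""
    backup = path + ".bak"          # hypothetical name, same directory
    had_old = os.path.exists(path)
    if had_old:
        # A rename within one directory stays on the same filesystem.
        os.replace(path, backup)
    try:
        with open(path, "w") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())   # push the data to disk before trusting it
    except Exception:
        # The write failed: put the last good restart back, then re-raise.
        if had_old:
            os.replace(backup, path)
        raise
    if had_old:
        os.remove(backup)           # new file is complete, drop the old one
```

While the new file is being written, both copies exist, which is the
temporary spike in disk usage mentioned above.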
 
 -Don
St. Jude Children's Res. Hosp.
 
 Email Disclaimer:  www.stjude.org/emaildisclaimer
 _______________________________________________
AMBER mailing list
 AMBER_at_ambermd.org
 http://lists.ambermd.org/mailman/listinfo/amber
 
 
 
 |