AMBER Archive (2009)

Subject: Re: [AMBER] TI production runs stop after a certain Step number

From: steinbrt_at_rci.rutgers.edu
Date: Wed Apr 08 2009 - 09:35:11 CDT


Hi Hannes,

> I tried the suggestions you made and found that it depends on the number
> of CPUs I use.
> When I do the simulation without the TI flags, no problem occurs.
> Also, the TI transformation of the ligand works well (200k steps).
> If I reduce the number of CPUs to 2, all of the 200k steps are performed.
> With 8 CPUs it stops after step 121900. The 4 CPU try is still running.

Well, we are entering the murky realm of MPI performance and problems
here, in which I am no expert. TI should run on any number of 2xN CPUs;
the TI process communication is independent of Amber's normal
parallelisation scheme. Still, it certainly sounds as if it couldn't hurt to
try a different MPI implementation (are you using the LAM that came with
Amber 10?).

Also, check every one of your output files; sometimes error messages
appear only in one of the two TI processes or in the queueing output.
Assuming your cluster has 4 CPUs per node, if the 4 CPU run goes through
fine, maybe the problem only occurs when you run on more than one node and
is network-related?
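
If going through all the output files by hand is tedious, something like the
following Python sketch could do a first pass. It is not part of Amber; the
function name scan_outputs and the keyword list are my own assumptions, and
the keywords are certainly not a complete list of what sander, MPI or a
queueing system may print. Treat any hit only as a pointer to where to read
more closely.

```python
import re
from pathlib import Path

# Assumed keywords; extend them to match whatever your MPI library,
# queueing system and Amber build actually print on failure.
PATTERN = re.compile(r"error|fatal|abort|segmentation|killed", re.IGNORECASE)


def scan_outputs(paths):
    """Return {path: [(line_number, line), ...]} for lines matching PATTERN.

    Files without any matching line are left out of the result.
    """
    hits = {}
    for path in map(Path, paths):
        matches = []
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if PATTERN.search(line):
                    matches.append((number, line.rstrip()))
        if matches:
            hits[str(path)] = matches
    return hits
```

You would pass it the output files of both TI processes plus the stdout and
stderr files written by the queueing system (whatever they are called on
your cluster) and print the returned line numbers and lines.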

Kind Regards,

Dr. Thomas Steinbrecher
BioMaps Institute
Rutgers University
610 Taylor Rd.
Piscataway, NJ 08854

_______________________________________________
AMBER mailing list
AMBER_at_ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber