AMBER Archive (2009)

Subject: RE: [AMBER] Problems with running two separate jobs in the same time

From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Tue Apr 07 2009 - 10:01:14 CDT


Hi Antonija,

It would be helpful to understand exactly how you are running these jobs, what your computer environment looks like etc.

A few things to check.

1) Are you running these jobs in separate directories? Running them in the same directory often causes problems, since the two runs will compete to overwrite certain files such as mdinfo and profile_mpi. Depending on your filesystem, and whether it is local or network mounted, this can cause anything from minor or major performance degradation to a complete lock-up.

2) Do you have enough disk space free? Are you certain the two jobs running together are not filling the disk?

3) Do you have enough memory? The two jobs have to share it, so running them together may exhaust it. When a process is killed because the machine ran out of memory, you often see no error message other than perhaps the word 'Killed'. Note that if you are using MVAPICH, it has numerous memory leaks, especially in v0.9.9. This means that memory usage (for MPI buffers, I believe) can increase linearly over time. With just one job running you may have enough memory plus swap that you don't see any problems within the length of the run. With two jobs running you may be pushing things over the edge.

4) You don't mention which version of AMBER you are using. You could try running the calculation with PMEMD from AMBER 10 in place of sander. In parallel, PMEMD uses less memory than sander, so you may not encounter the same problems.
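
As a rough illustration of checks 1 to 3, here is a minimal Python sketch. It is not part of AMBER: the helper names (free_disk_gb, available_memory_gb, launch_in_own_dir, describe_exit), the job.log file name, and the directory layout are all made up for this example, and the memory check assumes a Linux system that exposes MemAvailable in /proc/meminfo. It reports free disk space and available memory, starts each job with its own working directory so files such as mdinfo cannot collide, and interprets the exit status afterwards.

```python
import shutil
import signal
import subprocess
from pathlib import Path


def free_disk_gb(path="."):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def available_memory_gb(meminfo="/proc/meminfo"):
    """MemAvailable in GiB (Linux only); None if it cannot be read."""
    try:
        with open(meminfo) as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    return int(line.split()[1]) / 1024**2  # value is in kB
    except OSError:
        pass
    return None


def launch_in_own_dir(job_dir, command):
    """Start `command` with `job_dir` as its working directory.

    Output goes to a hypothetical job.log inside that directory, so two
    jobs never write to the same files.
    """
    job_dir = Path(job_dir)
    job_dir.mkdir(parents=True, exist_ok=True)
    with open(job_dir / "job.log", "w") as log:
        # The child keeps its own copy of the log handle after we close ours.
        return subprocess.Popen(
            command, cwd=str(job_dir), stdout=log, stderr=subprocess.STDOUT
        )


def describe_exit(returncode):
    """Turn a Popen return code into a short human-readable message."""
    if returncode == 0:
        return "finished normally"
    if returncode < 0:  # on POSIX, a negative code means "killed by signal"
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        hint = ""
        if signum == signal.SIGKILL:
            hint = " (possibly the kernel out-of-memory killer)"
        return f"terminated by {name}{hint}"
    return f"exited with error code {returncode}"
```

A possible use, with placeholder input file names: call launch_in_own_dir("run1", ["sander", "-O", "-i", "md.in", "-o", "md.out", "-p", "prmtop", "-c", "inpcrd", "-r", "restrt"]) and the same for "run2", after copying the input files into each directory, then print describe_exit(p.wait()) for each process. If the job is started through a shell script or an MPI launcher, the exit code you see may be reported differently, so treat the message as a hint only.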

Good luck,
Ross

-----Original Message-----
From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org] On Behalf Of Antonija Tomic
Sent: Tuesday, April 07, 2009 1:09 AM
To: amber_at_ambermd.org
Subject: [AMBER] Problems with running two separate jobs in the same time

Hi

for example, if I try to run two separate molecular dynamics or
steered molecular dynamics jobs (nstlim=100000) on my computer at the
same time, the processes stop after some time without any warnings or
error messages, but when I run those same jobs one after another, all
processes finish successfully. I don't know what is going on because I
don't get any warning messages. On the other hand, I can run a few
separate optimization processes simultaneously without any trouble. I
would be very grateful if someone could help me. I don't think that
the problem is in my computer because I have a new computer.

Antonija

_______________________________________________
AMBER mailing list
AMBER_at_ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
