AMBER Archive (2008)
Subject: AMBER: mpd error in submitting parallel job
From: Lili Peng (lpeng_at_ucsd.edu)
Date: Thu Feb 14 2008 - 03:11:25 CST
Hi everyone,
I've come across another mpd-related error (probably related to the one
in my previous inquiry) while submitting a parallel job. My input
script is:
#!/bin/csh
#$ -cwd
#
#$ -m lpeng
#$ -l h_rt=00:10:00
#$ -e error_file
#$ -o output_file
#$ -pe mpi 16
#
#
cd /nas/lpeng/test
/nas/lpeng/opt/bin/mpirun -np 16 \
/nas/lpeng/src/amber9/exe/sander -O -i pgga10.in -p pgga10.top -c pgga10.crd \
-o pgga10.out -x pgga10.mdcrd -r pgga10.rst
According to 'qstat', the job ran to completion. However, the output
files (pgga10.mdcrd, pgga10.out, and pgga10.rst) did not show up in my
/nas/lpeng/test directory.
The error_file is empty (0 KB), but the output_file shows:
/opt/gridengine/default/spool/compute-0-0/active_jobs/239954.1/pe_hostfile
compute-1-20
compute-1-20
compute-3-0
compute-3-0
compute-2-25
compute-2-25
compute-0-0
compute-2-2
compute-2-2
compute-0-17
compute-0-17
compute-0-19
compute-0-19
compute-0-20
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
mpiexec_compute-1-20.local: cannot connect to local mpd
(/tmp/mpd2.console_lpeng); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
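The host list in the output_file can be tallied to see how many distinct nodes the job spans, since each of those nodes would need a reachable mpd. What follows is a minimal sketch in POSIX sh that uses the listing above as literal input. The function name `tally_hosts` is hypothetical and is not part of Grid Engine or MPICH2.

```sh
#!/bin/sh
# Tally how many slots each host contributes, given one hostname per line.
# tally_hosts is a hypothetical helper name used only for this sketch.
tally_hosts() {
    # uniq -c prints "count host"; awk swaps that to "host count".
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Host lines copied from the output_file shown above.
hosts='compute-1-20
compute-1-20
compute-3-0
compute-3-0
compute-2-25
compute-2-25
compute-0-0
compute-2-2
compute-2-2
compute-0-17
compute-0-17
compute-0-19
compute-0-19
compute-0-20'

# Print one "host slots" pair per distinct host.
printf '%s\n' "$hosts" | tally_hosts
```

For the listing above this reports eight distinct hosts. The first one listed, compute-1-20, is the host named in the mpiexec message.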
Then I go back and run "mpd &", but I receive:
An mpd is already running with console at /tmp/mpd2.console_lpeng at
granite.ucsd.edu. Start mpd with the -n option for a second mpd on same
host.
This is where I am stuck. I Googled the problem, but nothing relevant came
up. Does anyone have any leads on this issue? Your input would be
appreciated. Please note that the directory containing mpd is already in my
PATH (set with "export PATH").
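One possible direction, offered as a sketch under assumptions and not as a confirmed fix: the two messages name different hosts (the mpiexec error comes from compute-1-20.local, while the running mpd is on granite.ucsd.edu), so the job may need its own mpd ring on the compute nodes. The sketch assumes that MPICH2's `mpdboot`, `mpiexec`, and `mpdallexit` are on PATH on the compute nodes, that Grid Engine sets `$PE_HOSTFILE` and `$NSLOTS` for the job, and that each line of the PE hostfile begins with a hostname. `make_mpd_hosts` and `run_in_ring` are hypothetical helper names.

```sh
#!/bin/sh
# Sketch only: boot an mpd ring on the job's nodes, run, then shut it down.

# make_mpd_hosts (hypothetical): write the unique hostnames found in the
# first column of a Grid Engine PE hostfile ($1) to an output file ($2).
make_mpd_hosts() {
    awk '{ print $1 }' "$1" | sort -u > "$2"
}

# run_in_ring (hypothetical): start one mpd per listed host, run the given
# command with $NSLOTS processes, then tear the ring down again.
run_in_ring() {
    hostfile=$1
    shift
    nhosts=$(wc -l < "$hostfile" | tr -d ' ')
    mpdboot -n "$nhosts" -f "$hostfile" || return 1
    mpiexec -n "$NSLOTS" "$@"
    status=$?
    mpdallexit
    return "$status"
}

# Intended use inside the job script (not executed here):
#   make_mpd_hosts "$PE_HOSTFILE" mpd.hosts
#   run_in_ring mpd.hosts /nas/lpeng/src/amber9/exe/sander -O \
#       -i pgga10.in -p pgga10.top -c pgga10.crd \
#       -o pgga10.out -x pgga10.mdcrd -r pgga10.rst
```

The helpers are only defined here, not called, so the block can be sourced from a job script. mpd normally also expects a private ~/.mpd.conf file containing a secret word on every node; whether that is already in place on this cluster is not known from the messages above.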
Sincerely,
Lili