AMBER Archive (2008)

Subject: RE: AMBER: Problems compiling Amber with MKL

From: Ross Walker (ross_at_rosswalker.co.uk)
Date: Mon Apr 14 2008 - 16:32:49 CDT


Hi Sasha,
 
export AMBERHOME=/data/amber9
export MPI_HOME=/data/openmpi

source /opt/intel/cce/10.1.012/bin/iccvars.sh
source /opt/intel/fce/10.1.012/bin/ifortvars.sh

PATH=/opt/intel/cce/10.1.012/bin:$PATH; export PATH
PATH=/opt/intel/fce/10.1.012/bin:$PATH; export PATH

Check that $MPI_HOME/bin is picked up at the beginning of your PATH as well, so that 'which mpif90' and 'which mpirun' return the correct versions.
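A rough sketch of that check, assuming a POSIX shell. check_tool is a hypothetical helper name, and /data/openmpi is only the MPI_HOME value quoted above, used as a fallback when the variable is not set:

```shell
#!/bin/sh
# Sketch: confirm that the MPI wrappers found on PATH live under $MPI_HOME/bin.
# check_tool is a hypothetical helper; the default MPI_HOME is the one quoted above.
MPI_HOME="${MPI_HOME:-/data/openmpi}"

check_tool() {
    # $1 = tool name, $2 = directory the tool is expected to live in
    found=$(command -v "$1" 2>/dev/null) || {
        echo "$1: not found on PATH"
        return 1
    }
    case "$found" in
        "$2"/*) echo "$1: ok ($found)" ;;
        *)      echo "$1: unexpected location $found (wanted $2)"
                return 1 ;;
    esac
}

status=0
for tool in mpif90 mpirun; do
    check_tool "$tool" "$MPI_HOME/bin" || status=1
done
echo "status=$status"
```

A final "status=1" means at least one of the two tools is missing or is being picked up from somewhere other than $MPI_HOME/bin.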
 
Also try running:
 
mpif90 -show
 
to make sure it returns the correct compiler etc. E.g. here is mine for ifort with mpich2:
 
[14:21][caffeine:0.04][rcw:~]$ mpif90 -show
ifort -g -I/usr/local/mpi/mpich2-1.0.3_ifort9.1.039/include -I/usr/local/mpi/mpich2-1.0.3_ifort9.1.039/include -L/usr/local/mpi/mpich2-1.0.3_ifort9.1.039/lib -lmpichf90 -lmpichf90 -lmpich -lpthread -lrt
 
 
Serial version compiles ok with or without the -static flag, but make test.serial fails:
 
So the serial version links against the MKL libraries okay then? It is just the parallel version below that doesn't?

cd qmmm2/2pk4; ./Run.2pk4_stan
This test not set up for parallel, skipping
 
This is really weird. If you really did run 'make test.serial' but it reports "This test not set up for parallel, skipping", then something is wrong here. Make sure the DO_PARALLEL and TESTsander environment variables are NOT set, then try things again. My suspicion is that you have DO_PARALLEL set, so it is running the serial version of sander through MPI, i.e. running multiple copies of the same code, hence the errors opening restrt files etc.
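One way to check and clear those two variables in the current shell before rerunning the serial tests. This is only a sketch; report_parallel_env and clear_parallel_env are hypothetical helper names:

```shell
#!/bin/sh
# Sketch: show, then clear, the variables that switch the test scripts to parallel mode.
# Both function names are hypothetical helpers, not part of Amber.
report_parallel_env() {
    for var in DO_PARALLEL TESTsander; do
        # Indirect lookup of the variable named in $var (empty if unset).
        eval "val=\${$var-}"
        if [ -n "$val" ]; then
            echo "$var is set to: $val"
        else
            echo "$var is unset or empty"
        fi
    done
}

clear_parallel_env() {
    unset DO_PARALLEL TESTsander
}

report_parallel_env
clear_parallel_env
```

The unset only affects the shell it runs in, so run it (or source the file) in the same shell from which you then run 'make test.serial'. If the variables come back in a new login, they are probably being set in a startup file.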
 
Sequence of actions to compile parallel Amber (after patching the source):

[sasha_at_abicluster src]$ ./configure -opteron -openmpi ifort_x86_64
 
Leave out the -opteron - I don't think it does anything with ifort anyway.
 
After this, I edit config.h to replace ifort with mpif90 in FC and LOAD flags. It doesn't compile without it, and it might be useful to have a note about it in the installation instructions.
 
Don't do this. It should be using mpif90, otherwise you will be missing all sorts of library files that are needed. You shouldn't need to edit the config.h file at all. What is the problem when you do 'make parallel' with mpif90 in the config.h file? I assume mpif90 exists in your path and picks up the correct compiler?
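To see what configure actually wrote, without editing anything, something like the following may help. It is a sketch that assumes config.h sits in $AMBERHOME/src and that the relevant lines start with "FC=" and "LOAD="; show_fc_load is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: print the compiler and loader lines from a generated config.h.
# Assumes they look like "FC= ..." and "LOAD= ..." at the start of a line;
# show_fc_load is a hypothetical helper name.
show_fc_load() {
    # $1 = path to config.h
    grep -E '^(FC|LOAD)[[:space:]]*=' "$1"
}

# Assumed location: the src directory where ./configure was run.
config="${AMBERHOME:-/data/amber9}/src/config.h"
if [ -r "$config" ]; then
    show_fc_load "$config"
else
    echo "no config.h at $config; run ./configure first"
fi
```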
 
make parallel creates the executables, but make test.parallel fails with this error:
 
You are linking dynamically here I assume?
 
cd cytosine; ./Run.cytosine
/data/amber9/exe/sander.MPI: error while loading shared libraries: libmkl_lapack.so: cannot open
 
This implies that the environment is somehow different on different nodes. Typically this happens in parallel when you set some environment variables on one node but the other node (which is also running part of the MPI code) doesn't inherit them, so it doesn't know where to look for the MKL libraries. The simplest solution is usually to compile statically; then you don't need to worry about it.
 
Otherwise you will need to tweak things like the default .profile or .bashrc so that something like 'mpirun -np 4 env' returns the same thing from all nodes. Normally static linking (if you can do it) avoids this hassle though.
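Reading the full 'env' output from several ranks by eye is tedious, so a small filter can help. This is a sketch; count_values is a hypothetical helper, and LD_LIBRARY_PATH is picked here because it is the variable the dynamic loader consults when looking for shared libraries such as libmkl_lapack.so:

```shell
#!/bin/sh
# Sketch: summarise how many MPI ranks report each value of one variable.
# count_values is a hypothetical helper; it reads `env` output on stdin.
count_values() {
    # $1 = variable name, e.g. LD_LIBRARY_PATH or PATH
    # Prints each distinct "NAME=value" line prefixed by how often it occurred.
    grep "^$1=" | sort | uniq -c
}
```

Usage would be along the lines of 'mpirun -np 4 env | count_values LD_LIBRARY_PATH'. A single output line with a count of 4 means all four ranks agree. Several lines mean the value differs between ranks, and a total below 4 means the variable is not set at all on some of them.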
 

The strange thing is that libmkl_lapack.so is located in the directory that was happily noticed by the ./configure script. Same error is thrown when sander.MPI is attempted to run with one of the test cases from the Amber tutorial (which is kind of expected after the test error).
 
It is very possible that the mpirun command (even if you run everything on the same physical node you are compiling on) is invoking a new shell and not picking up the correct paths. Try editing /etc/bashrc on all nodes so that it sources the compiler and MKL environment setup scripts on login.
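The lines added to the startup file could look roughly like this. It is a sketch only: source_if_readable is a hypothetical helper, the two compiler script paths are the ones quoted at the top of this message, and the MKL setup script is left as a placeholder because its path is not given in the thread:

```shell
# Sketch of lines for a shell startup file such as /etc/bashrc.
# source_if_readable is a hypothetical helper name.
source_if_readable() {
    # Source $1 only if it exists, so a node without the compilers still logs in cleanly.
    if [ -r "$1" ]; then
        . "$1"
    fi
}

source_if_readable /opt/intel/cce/10.1.012/bin/iccvars.sh
source_if_readable /opt/intel/fce/10.1.012/bin/ifortvars.sh
# source_if_readable <path to your MKL environment script>
```

Whether /etc/bashrc is actually read by the shells that mpirun starts depends on how they are launched, so it is worth confirming afterwards with 'mpirun -np 4 env'.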

Compilation with -static flag fails invariably with the following message:
ld: cannot find -lmpi_f90
make[1]: *** [sander.MPI] Error 1
make[1]: Leaving directory `/data/amber9/src/sander'
make: *** [parallel] Error 2
 
I would hope this goes away when using mpif90, although maybe not if no static library is available for openmpi. There should be a way to build a statically linkable openmpi (I do it all the time with mpich2 without problems), so you could try that. I would first make sure, though, that the environment gets inherited correctly on all nodes under an mpirun.
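With -static, the linker needs an archive named libmpi_f90.a to satisfy -lmpi_f90, so a quick look in the MPI library directory shows whether a static build is even possible with the current installation. This is a sketch; has_static_lib is a hypothetical helper, and $MPI_HOME/lib as the library directory is an assumption:

```shell
#!/bin/sh
# Sketch: check whether static (.a) archives exist for the MPI libraries.
# has_static_lib is a hypothetical helper; $MPI_HOME/lib is an assumed location.
MPI_HOME="${MPI_HOME:-/data/openmpi}"

has_static_lib() {
    # $1 = library directory, $2 = library name as given to -l (e.g. mpi_f90)
    [ -f "$1/lib$2.a" ]
}

for lib in mpi_f90 mpi; do
    if has_static_lib "$MPI_HOME/lib" "$lib"; then
        echo "lib$lib.a: present"
    else
        echo "lib$lib.a: missing (only shared libraries installed?)"
    fi
done
```

If the archives are missing, rebuilding Open MPI with its configure script's --enable-static option is the usual route, though whether that alone is enough for a fully static sander.MPI on this system is untested here.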

All the best
Ross

/\
\/
|\oss Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross_at_rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not be read every day, and should not be used for urgent or sensitive issues.

 
