AMBER Archive (2008)
Subject: Re: AMBER: Compile AMBER 9 on TACC Ranger super computer
From: David LeBard (david.lebard_at_asu.edu)
Date: Thu Aug 14 2008 - 14:43:52 CDT
Dear Ross,
Thank you so very much for the detailed instructions!
I have also been bugging the TACC consulting folks about my problems
installing amber9 (and actually had a ticket in their system since
June 9th!!!) with absolutely no luck or help at all. Their attempt at
installing amber9 in the form of a loadable module (i.e., module load
amber9) has never worked for me. Even after showing them output from
all of my failed attempts, they still could not install amber9
correctly. I tried to compile it myself, but because of other priorities
I had little time to fish through the errors (which were likely due to
the -tp p7/-m32 problems). The latest "advice" from their consulting
group was to add the following to my batch script:
== module loading advice from TACC ==
module purge
module load TACC
module unload pgi mvapich2
module load intel
module load mvapich/1.0
module load amber9
==================================
This does not work (at least for my user) with sander.MPI or pmemd.
However, I followed your extremely simple instructions, and now both
sander.MPI and pmemd work as expected. The only problems with your
installation instructions, both admittedly trivial, were the
misspelled AMBRHOME variable and the use of setenv in a bash script
("setenv AMBRHOME ~/amber9" should be changed to "export
AMBERHOME=~/amber9", and "setenv DO_PARALLEL ibrun" should be changed
to "export DO_PARALLEL=ibrun").
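
For reference, here is a minimal sketch of the quoted sander test job
script with both corrections applied. The directives and commands are
copied from the script quoted below; the file name job_sander.sh is only
a placeholder of mine. The snippet writes the script to a file and
syntax-checks it with bash -n instead of submitting it.

```shell
#!/bin/bash
# Write the corrected job script to a file (the file name is a placeholder).
job=job_sander.sh
cat > "$job" <<'EOF'
#!/bin/bash
#$ -V                      # Inherit the submission environment
#$ -cwd                    # Start job in submission directory
#$ -N testPMEMD            # Job name (as in the quoted script)
#$ -j y                    # Combine stderr and stdout into stdout
#$ -o $JOB_NAME.o$JOB_ID   # Name of the output file
#$ -pe 2way 32             # 2 cores/node, 32/16 = 2 nodes total (4 cpu)
#$ -q development          # Queue name
#$ -l h_rt=00:45:00        # Run time (hh:mm:ss), 45 minutes
set -x                     # Echo commands as they run
export AMBERHOME=~/amber9  # was: setenv AMBRHOME ~/amber9
cd "$AMBERHOME/test"
export DO_PARALLEL=ibrun   # was: setenv DO_PARALLEL ibrun
make test.parallel
EOF

# Syntax check only; the actual submission happens with qsub on the cluster.
bash -n "$job" && echo "syntax OK: $job"
```

For the pmemd test, the last line of the job script becomes
"make test.pmemd".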
Anyway, I think these instructions should be passed along to the TACC
folks so they can correct this module and make a working Ranger-wide
installation.
Many, many, many thanks-
David LeBard
2008/8/14 Ross Walker <ross_at_rosswalker.co.uk>:
>> Well, it is a little bit complex to compile Amber9 on some special
>> platforms.
>
> I'd like to point out that this is not really true here. Sure, on something
> exotic like an IBM Blue Gene or Cray X1E where you are cross-compiling it can
> be tricky, but Ranger is nothing more than a standard InfiniBand cluster,
> albeit a large one.
>
> Most problems that occur are generally a function of a poorly set up machine
> environment, confusion over compiler versions and MPI libraries, or a user
> trying to tweak things too much and getting confused. The nice thing
> about the NSF supercomputers is that they are generally set up very well, and
> Ranger is no exception. There are multiple compilers and MPI
> libraries available, but all the environment options are well documented. If
> you follow their guidelines for selecting MPI libraries, compilers, etc., then
> you get an environment that is correctly set up: running 'which pgf90' will
> yield the Portland Group compiler version you want, along with the correct MPI
> build for that compiler version. Additionally, all of the development
> libraries etc. are installed. So in fact something like Ranger is
> generally much easier to compile and run on than some in-house cluster. The
> problems occur when people start messing with their environment options
> directly rather than reading the documentation for the machine on how to
> select different compilers etc.
>
>> >" /usr/bin/ld: skipping incompatible
>> > ...
>> >/usr/bin/ld: cannot find -lmpichf90
>
>> In your first case, the error comes from the "libmpichf90" file. Are you
>> sure this file is located in this directory:
>> /share/home/00654/tg458141/local/mpich2-1.0.7/lib/ ?
>
> No, the error the user is seeing here is the 'skipping incompatible' one.
> libmpichf90 does indeed exist; the issue is that it was compiled as a 64
> bit binary while the amber pgf90 configure script forces 32 bit
> compilation...
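
One way to confirm a 32/64 bit mismatch like this is to look at the ELF
header of the library or object file. The following is only a sketch,
assuming a Linux system with od available; the function name elf_class
and the demo files are made up here. It reads the first five bytes of
the file: the ELF magic plus the EI_CLASS byte (1 = 32 bit, 2 = 64 bit).
It works on .so and .o files; for a static .a archive you would first
extract a member with "ar x".

```shell
#!/bin/bash
# Print 32, 64 or "unknown" for a given file, based on its ELF header.
elf_class() {
    # First five header bytes as hex: 7f 45 4c 46 ("\177ELF") plus EI_CLASS.
    local hdr
    hdr=$(od -An -tx1 -N5 "$1" | tr -d ' \n')
    case "$hdr" in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *)          echo unknown ;;
    esac
}

# Demo on two fabricated headers (these are not real libraries).
tmp=$(mktemp -d)
printf '\177ELF\001' > "$tmp/obj32"
printf '\177ELF\002' > "$tmp/obj64"
echo "obj32: $(elf_class "$tmp/obj32")"
echo "obj64: $(elf_class "$tmp/obj64")"
```

On a real system you would call it as, for example,
elf_class /path/to/libsomething.so (path hypothetical).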
>
>> In your second case, it is clear that the C compiler cannot be found. I
>> suggest you use icc (the Intel C compiler) and ifort (the Intel Fortran
>> compiler).
>
> No, one should use the default c compiler on ranger, which is gcc - it works
> fine. The issue here is that the default environment has been changed in
> some way resulting in the autoconfigure of netcdf (not amber) for some
> reason finding that the c compiler is not working. One option is to skip the
> -bintraj flag to configure so that netcdf is not built, the other is to
> restore the default environment variables so the c compiler works again.
>
>> By the way, whether to use "-tp p7" or "-tp athlon" is decided by the CPU
>> of your supercomputer.
>> Please use the "man" command to check the options of pgf90 or ifort to see
>> which value should follow "-tp".
>
> This has been discussed numerous times on the mailing list before. At the
> time of Amber 9's release there were bugs in the pgf90 64 bit compilation
> routines that led to incorrect results being produced and thus failures of
> the test cases. This is very serious: a segfault is one thing, but
> silently generating incorrect data is far worse. This is why we have the
> comprehensive test suite to detect such problems. When this problem
> with pgf90 was found it was very close to release, and we couldn't find a
> workaround for the compiler issues other than to turn off 64 bit compilation
> altogether. This is why -tp p7 was deliberately added for an 'opteron'
> build: it forces 32 bit compilation.
>
> The issue is that to build in parallel you have to link against all 32 bit
> mpi libraries which generally means that the administrator of the machine
> must have built both sets for you, the 64 bit default ones and the 32 bit
> ones.
>
> Unfortunately on Ranger only the 64 bit versions are available hence the
> linking problems with pgf90. A simple solution is to remove all the flags
> that force 32 bit compilation. The latest version of pgf90 on ranger
>
> login3% pgf90 -V
>
> pgf90 7.1-2 64-bit target on x86-64 Linux -tp gh-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2007, STMicroelectronics, Inc. All Rights Reserved.
>
> has hopefully fixed the 64 bit compiler problems.
>
> Indeed, I have just tried this out on Ranger without encountering any
> problems. Here is what I did - fairly simple really:
>
> tar xvzf amber9.tgz
> setenv AMBERHOME ~/amber9
> cd amber9
> wget http://www.ambermd.org/bugfixes/9.0/bugfix.all
> patch -p0 -N -r patch-rejects <bugfix.all
>
> ..> login4% which mpif90
> ..> /opt/apps/pgi7_1/mvapich2/1.0/bin/mpif90
> ..> login4% which pgf90
> ..> /opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf90
>
> setenv MPI_HOME /opt/apps/pgi7_1/mvapich2/1.0/
>
> ./configure -mpich2 -opteron pgf90
>
> edit config.h and remove all instances of -tp p7 and -m32 (otherwise you
> can't link against the 64 bit mvapich2)
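
The config.h edit can also be done non-interactively. This is a minimal
sketch with sed, assuming the flags appear literally as "-tp p7" and
"-m32". The sample file contents are invented for illustration and the
real config.h will look different, so keep the backup and diff the
result before running make.

```shell
#!/bin/bash
# Sample stand-in for config.h; these lines are made up for the demo.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
FFLAGS= -tp p7 -O1
LOADLIB= -m32 -lm
EOF

# Keep a backup, then strip both 32 bit flags (with any leading spaces).
cp "$cfg" "$cfg.orig"
sed -e 's/ *-tp p7//g' -e 's/ *-m32//g' "$cfg.orig" > "$cfg"

# Show what changed; diff exits non-zero when the files differ.
diff "$cfg.orig" "$cfg" || true
```

Writing to a new file rather than using sed -i avoids the differences
between GNU and BSD sed.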
>
> make parallel
>
> Next we test the parallel implementation...
>
> ---- qsub script ----
> #!/bin/bash
> #$ -V # Inherit the submission environment
> #$ -cwd # Start job in submission directory
> #$ -N testPMEMD # Job Name
> #$ -j y # combine stderr & stdout into stdout
> #$ -o $JOB_NAME.o$JOB_ID # Name of the output file (eg. myMPI.oJobID)
> #$ -pe 2way 32 # Requests 2 cores/node, 32/16 = 2 nodes total (4 cpu)
> #$ -q development # Queue name
> #$ -l h_rt=00:45:00 # Run time (hh:mm:ss) - 0.75 hours
> set -x #{echo cmds, use "set echo" in csh}
> setenv AMBRHOME ~/amber9
> cd $AMBERHOME/test
> setenv DO_PARALLEL ibrun
> make test.parallel
>
> -------------------------------------------
>
> Note there are several failures here - mainly in hybrid REMD, LES and PIMD
> so stay away from these types of runs with this installation on RANGER!
>
> Next we build PMEMD - this just needs a minor modification to work with
> mvapich2 since it only has options for mvapich (otherwise it basically tries
> to link to a non-existent mtl-common).
>
> cd $AMBERHOME/src/pmemd/
> ./configure linux64_opteron pgf90 mvapich pubfft nobintraj
> ..> Please enter name of directory where Infiniband libraries are installed:
> ..> /opt/apps/pgi7_1/mvapich2/1.0/lib/
>
> edit config.h and change MPI_LIBS line to MPI_LIBS =
>
> change all pgf90's to mpif90
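
A sketch of these two pmemd config.h edits with sed. The sample lines
are invented; only the MPI_LIBS name and the pgf90 to mpif90
substitution come from the instructions above. Note that the global
substitution would also rewrite any path that contains "pgf90", so
check the result by eye.

```shell
#!/bin/bash
# Sample stand-in for pmemd's config.h; these lines are made up for the demo.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
F90 = pgf90
MPI_LIBS = -L/example/lib -lexample
LOAD = pgf90
EOF

# Keep a backup, empty the MPI_LIBS line, and switch the compiler to mpif90.
cp "$cfg" "$cfg.orig"
sed -e 's/^MPI_LIBS *=.*/MPI_LIBS =/' -e 's/pgf90/mpif90/g' "$cfg.orig" > "$cfg"

cat "$cfg"
```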
>
> make install
>
> Next we test the PMEMD implementation...
>
> ---- qsub script ----
> #!/bin/bash
> #$ -V # Inherit the submission environment
> #$ -cwd # Start job in submission directory
> #$ -N testPMEMD # Job Name
> #$ -j y # combine stderr & stdout into stdout
> #$ -o $JOB_NAME.o$JOB_ID # Name of the output file (eg. myMPI.oJobID)
> #$ -pe 2way 32 # Requests 2 cores/node, 32/16 = 2 nodes total (4 cpu)
> #$ -q development # Queue name
> #$ -l h_rt=00:45:00 # Run time (hh:mm:ss) - 0.75 hours
> set -x #{echo cmds, use "set echo" in csh}
> setenv AMBRHOME ~/amber9
> cd $AMBERHOME/test
> setenv DO_PARALLEL ibrun
> make test.pmemd
>
> -------------------------------------------
>
> These tests all pass without problems.
>
> All done...
>
> Again, this may not give optimum performance. You can tweak to your heart's
> content to get better performance, but this should at least be reasonable
> and will give a working (and tested, assuming you check the results from the
> test runs) version of AMBER 9.
>
> Have fun....
> Ross
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross_at_rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>
>
>
>
> -----------------------------------------------------------------------
> The AMBER Mail Reflector
> To post, send mail to amber_at_scripps.edu
> To unsubscribe, send "unsubscribe amber" (in the *body* of the email)
> to majordomo_at_scripps.edu
>