AMBER Archive (2008)

Subject: RE: AMBER: Compile AMBER 9 on TACC Ranger super computer

From: jialei (leijianyu_at_hotmail.com)
Date: Thu Aug 14 2008 - 19:15:39 CDT


Dear Ross,

I just successfully compiled the parallel version and pmemd of AMBER 9 on TACC's Ranger with your detailed instructions. Thanks so much for your help!

I then ran both the AMBER parallel and the pmemd test suites; both completed with only minor errors. For the pmemd test I used the following qsub script:

> ---- qsub script ----
> #!/bin/bash
> #$ -V # Inherit the submission environment
> #$ -cwd # Start job in submission directory
> #$ -N testPMEMD # Job Name
> #$ -j y # combine stderr & stdout into stdout
> #$ -o $JOB_NAME.o$JOB_ID # Name of the output file (eg. myMPI.oJobID)
> #$ -pe 2way 32 # Requests 2 cores/node, 32/16 = 2 nodes total (4 cpu)
> #$ -q development # Queue name
> #$ -l h_rt=00:45:00 # Run time (hh:mm:ss) - 0.5 hours
> set -x #{echo cmds, use "set echo" in csh}
> setenv AMBRHOME ~/amber9
> cd $AMBERHOME/test
> setenv DO_PARALLEL ibrun
> make test.pmemd
>
> -------------------------------------------

At first I got the following errors:

"export TESTsander='../../exe/pmemd'; cd 4096wat; ./Run.pure_wat
 MPI version of PMEMD must be used with 2 or more processors!
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
./Run.pure_wat: Program error
make: *** [test.pmemd] Error 1
TACC: Cleaning up after job: 180035
TACC: Done."

It seems the MPI environment was never applied: the setenv lines are C-shell syntax and fail under #!/bin/bash, so DO_PARALLEL was never set and pmemd started as a single process. I switched the script to the C shell to fix it.
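For anyone hitting the same problem, here is a minimal C-shell version of the same script (same queue, slot count and runtime; note that AMBERHOME was also misspelled as AMBRHOME above):

---- corrected qsub script (csh) ----
#!/bin/csh
#$ -V                       # Inherit the submission environment
#$ -cwd                     # Start job in submission directory
#$ -N testPMEMD             # Job Name
#$ -j y                     # combine stderr & stdout into stdout
#$ -o $JOB_NAME.o$JOB_ID    # Name of the output file
#$ -pe 2way 32              # 2 tasks/node, 32/16 = 2 nodes (4 MPI tasks)
#$ -q development           # Queue name
#$ -l h_rt=00:45:00         # Run time (hh:mm:ss)
set echo                    # echo commands (csh equivalent of "set -x")
setenv AMBERHOME ~/amber9   # spelling corrected from AMBRHOME
cd $AMBERHOME/test
setenv DO_PARALLEL ibrun    # launcher the AMBER test scripts prepend to each run
make test.pmemd
-------------------------------------------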

Thanks everyone for all your suggestions,

Lei Jia
New York University

----------------------------------------
> From: ross_at_rosswalker.co.uk
> To: amber_at_scripps.edu
> Subject: RE: AMBER: Compile AMBER 9 on TACC Ranger super computer
> Date: Thu, 14 Aug 2008 09:00:02 -0700
>
>> Well, it is a little bit complex to compile Amber9 on some special
>> platforms.
>
> I'd like to point out that this is not really true here. Sure, on
> something exotic like an IBM Blue Gene or a Cray X1E, where you are
> cross-compiling, it can be tricky, but Ranger is nothing more than a
> standard InfiniBand cluster, albeit a large one.
>
> Most problems that occur are generally a function of a poorly set up
> machine environment, confusion over compiler versions and MPI libraries,
> or a user tweaking things too much and getting confused. The nice thing
> about the NSF supercomputers is that they are generally set up very well,
> and Ranger is no exception here. There are multiple compilers and MPI
> libraries available, but all the environment options are well documented.
> If you follow their guidelines for selecting MPI libraries, compilers etc.
> then you get an environment that is correctly set up: 'which pgf90' will
> yield the Portland Group compiler version you want, along with the correct
> MPI build for that compiler version. Additionally, all of the development
> libraries etc. are installed. So something like Ranger is generally much
> easier to compile and run on than some in-house cluster. The problems
> occur when people start messing with their environment variables directly
> rather than reading the machine's documentation on how to select different
> compilers etc.
>
>>> /usr/bin/ld: skipping incompatible
>>> ...
>>> /usr/bin/ld: cannot find -lmpichf90
>
>> In your case one, the error comes from the "libmpichf90" file. Are you
>> sure this file is located in this directory:
>> /share/home/00654/tg458141/local/mpich2-1.0.7/lib/ ?
>
> No, the error the user is seeing here is the 'skipping incompatible'
> message. libmpichf90 does indeed exist; the issue is that it was compiled
> as a 64-bit binary while the Amber pgf90 configure script forces 32-bit
> compilation...
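>
> A quick way to confirm a 32/64-bit mismatch like this (just a sketch;
> point it at whichever object files and MPI library you are actually
> linking) is to ask file/objdump what they were built for:
>
> # objects built with -tp p7 / -m32 will report "ELF 32-bit"
> file $AMBERHOME/src/sander/*.o
> # the MPI libraries are static archives, so check the architecture of
> # their members (expect x86-64 on Ranger)
> objdump -f /opt/apps/pgi7_1/mvapich2/1.0/lib/libmpichf90.a | grep architecture | head -1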
>
>> In your case two, it is clear that the C compiler cannot be found. I
>> suggest you use icc (the Intel C compiler) and ifort (the Intel Fortran
>> compiler).
>
> No, one should use the default C compiler on Ranger, which is gcc - it
> works fine. The issue here is that the default environment has been
> changed in some way, with the result that the autoconfigure step of netCDF
> (not Amber) for some reason finds that the C compiler is not working. One
> option is to skip the -bintraj flag to configure so that netCDF is not
> built; the other is to restore the default environment variables so the C
> compiler works again.
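>
> A quick sanity check before re-running configure (just a sketch - the
> file names here are arbitrary) is to make sure the default gcc can still
> compile and link a trivial program:
>
> echo 'int main(void){return 0;}' > try.c
> gcc try.c -o try && ./try && echo "gcc looks fine"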
>
>> By the way, "-tp p7" or "-tp athlon" is decided by the CPU of your
>> supercomputer. Please use the "man" command to check the options of
>> pgf90 or ifort to see what should follow "-tp".
>
> This has been discussed numerous times on the mailing list before. At the
> time of Amber 9's release there were bugs in the pgf90 64-bit compilation
> routines that led to incorrect results being produced and thus failures of
> the test cases. This is very serious: a segfault is one thing, but
> silently generating incorrect data is very bad. This is why we have the
> comprehensive test suite to detect such problems. By the time this problem
> with pgf90 was found we were very close to release, and we couldn't find a
> workaround for the compiler issues other than to turn off 64-bit
> compilation altogether. That is why -tp p7 was deliberately added for an
> 'opteron' build: it forces 32-bit compilation.
>
> The issue is that to build in parallel you then have to link against
> 32-bit MPI libraries, which generally means that the administrator of the
> machine must have built both sets for you: the default 64-bit ones and the
> 32-bit ones.
>
> Unfortunately, on Ranger only the 64-bit versions are available, hence the
> linking problems with pgf90. A simple solution is to remove all the flags
> that force 32-bit compilation. The latest version of pgf90 on Ranger
>
> login3% pgf90 -V
>
> pgf90 7.1-2 64-bit target on x86-64 Linux -tp gh-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2007, STMicroelectronics, Inc. All Rights Reserved.
>
> has hopefully fixed the 64-bit compiler problems.
>
> Indeed, I have just tried this out on Ranger without encountering any
> problems. Here is what I did - fairly simple really:
>
> tar xvzf amber9.tgz
> setenv AMBERHOME ~/amber9
> cd amber9
> wget http://www.ambermd.org/bugfixes/9.0/bugfix.all
> patch -p0 -N -r patch-rejects < bugfix.all
> ..> login4% which mpif90
> ..> /opt/apps/pgi7_1/mvapich2/1.0/bin/mpif90
> ..> login4% which pgf90
> ..> /opt/apps/pgi/7.1/linux86-64/7.1-2/bin/pgf90
>
> setenv MPI_HOME /opt/apps/pgi7_1/mvapich2/1.0/
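>
> (If your login shell is bash rather than csh, use export instead of
> setenv, e.g. 'export AMBERHOME=~/amber9' and
> 'export MPI_HOME=/opt/apps/pgi7_1/mvapich2/1.0/'.)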
>
> ./configure -mpich2 -opteron pgf90
>
> edit config.h and remove all instances of -tp p7 and -m32 (otherwise you
> can't link against the 64 bit mvapich2)
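>
> If you would rather script that edit than do it by hand, something along
> these lines (assuming GNU sed) should work:
>
> sed -i -e 's/-tp p7//g' -e 's/-m32//g' config.h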
>
> make parallel
>
> Next we test the parallel implementation...
>
> ---- qsub script ----
> #!/bin/bash
> #$ -V # Inherit the submission environment
> #$ -cwd # Start job in submission directory
> #$ -N testPMEMD # Job Name
> #$ -j y # combine stderr & stdout into stdout
> #$ -o $JOB_NAME.o$JOB_ID # Name of the output file (eg. myMPI.oJobID)
> #$ -pe 2way 32 # Requests 2 cores/node, 32/16 = 2 nodes total (4 cpu)
> #$ -q development # Queue name
> #$ -l h_rt=00:45:00 # Run time (hh:mm:ss) - 0.5 hours
> set -x #{echo cmds, use "set echo" in csh}
> setenv AMBRHOME ~/amber9
> cd $AMBERHOME/test
> setenv DO_PARALLEL ibrun
> make test.parallel
>
> -------------------------------------------
>
> Note there are several failures here - mainly in hybrid REMD, LES and PIMD
> so stay away from these types of runs with this installation on RANGER!
>
> Next we build PMEMD - this just needs a minor modification to work with
> mvapich2, since the configure script only has options for mvapich (and
> otherwise tries to link against a non-existent mtl-common library).
>
> cd $AMBERHOME/src/pmemd/
> ./configure linux64_opteron pgf90 mvapich pubfft nobintraj
> ..> Please enter name of directory where Infiniband libraries are installed:
> ..> /opt/apps/pgi7_1/mvapich2/1.0/lib/
>
> edit config.h and change MPI_LIBS line to MPI_LIBS =
>
> change all pgf90's to mpif90
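>
> If you want to script those two edits as well, something like this (GNU
> sed again; double-check config.h afterwards) should do it:
>
> sed -i -e 's/^MPI_LIBS.*/MPI_LIBS =/' -e 's/pgf90/mpif90/g' config.h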
>
> make install
>
> Next we test the PMEMD implementation...
>
> ---- qsub script ----
> #!/bin/bash
> #$ -V # Inherit the submission environment
> #$ -cwd # Start job in submission directory
> #$ -N testPMEMD # Job Name
> #$ -j y # combine stderr & stdout into stdout
> #$ -o $JOB_NAME.o$JOB_ID # Name of the output file (eg. myMPI.oJobID)
> #$ -pe 2way 32 # Requests 2 cores/node, 32/16 = 2 nodes total (4 cpu)
> #$ -q development # Queue name
> #$ -l h_rt=00:45:00 # Run time (hh:mm:ss) - 0.5 hours
> set -x #{echo cmds, use "set echo" in csh}
> setenv AMBRHOME ~/amber9
> cd $AMBERHOME/test
> setenv DO_PARALLEL ibrun
> make test.pmemd
>
> -------------------------------------------
>
> These tests all pass without problems.
>
> All done...
>
> Again, this may not give optimum performance - you can tweak to your
> heart's content to get more out of it - but this should at least be
> reasonable and will give a working (and tested, assuming you check the
> results from the test runs) version of AMBER 9.
>
> Have fun....
> Ross
>
> /\
> \/
> |\oss Walker
>
> | Assistant Research Professor |
> | San Diego Supercomputer Center |
> | Tel: +1 858 822 0854 | EMail:- ross_at_rosswalker.co.uk |
> | http://www.rosswalker.co.uk | PGP Key available on request |
>
> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
> be read every day, and should not be used for urgent or sensitive issues.
>

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" (in the *body* of the email)
      to majordomo_at_scripps.edu