AMBER Archive (2009)

Subject: Re: [AMBER] Error in PMEMD run

From: Robert Duke (rduke_at_email.unc.edu)
Date: Wed May 06 2009 - 15:40:29 CDT


Hi Marek,
It may be something subtle in the setup of the different builds, and I will
have to go back over a bunch of stuff to understand exactly what is going
on. PMEMD uses what is called the "rpath" mechanism to add library
directories to search paths; could be this fails differently than whatever
sander is doing (could be just use of LD_LIBRARY_PATH). I am actually very
puzzled by all this because the pmemd mechanism has been working very well
for probably about 3-4 years now, but with configuring unix systems, there
are a number of subtle things that can happen. Speaking of LD_LIBRARY_PATH,
do check and see if the ifort lib directory is listed as part of it...
Regards - Bob
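
The LD_LIBRARY_PATH check suggested above can be sketched as follows. This is only a minimal sketch, assuming a POSIX shell; the helper name path_has_dir is made up for illustration, and the default ifort lib directory is an assumption taken from the "env" listing quoted further down in this thread.

```shell
#!/bin/sh
# Return 0 if directory $2 is an entry of the colon-separated list $1.
# Wrapping both sides in ':' makes the match exact per entry, so
# "/opt/lib" does not match "/opt/lib64".
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed location of the ifort runtime libraries (override via IFORT_LIB).
IFORT_LIB=${IFORT_LIB:-/opt/intel/fce/10.1.012/lib}

if path_has_dir "${LD_LIBRARY_PATH:-}" "$IFORT_LIB"; then
    echo "ifort lib dir is listed in LD_LIBRARY_PATH"
else
    echo "ifort lib dir is NOT listed in LD_LIBRARY_PATH"
fi
```

To see what the rpath mechanism recorded in the binary itself, `readelf -d /opt/amber/exe/pmemd` lists any RPATH or RUNPATH entries, where readelf is installed.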
----- Original Message -----
From: "Marek Malý" <maly_at_sci.ujep.cz>
To: "AMBER Mailing List" <amber_at_ambermd.org>
Sent: Wednesday, May 06, 2009 4:26 PM
Subject: Re: [AMBER] Error in PMEMD run

Hi Bob,

thanks again. What I do not understand in light of your last comment is
why the parallel version of Amber, which I compiled on the same node as
pmemd, can be used on any node of our cluster without any problems. It
works perfectly: I now have 3 calculations running on 32 CPUs per job
(each job on four 8-CPU nodes), and they have been running since Monday
without problems (minimisation and MD, explicit solvent).

So where could such a big difference come from? Is it that Amber uses
different shared libraries than pmemd?

   Best

     Marek
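
One way to approach the question of whether the two programs depend on different shared libraries is to compare what `ldd` reports for each binary. The following is a rough sketch, assuming a Linux system with ldd, awk, sort and comm available; the function names are made up for illustration, and the sander binary name in the example call is an assumption (only /opt/amber/exe/pmemd appears in this thread).

```shell
#!/bin/sh
# Reduce `ldd` output (read from stdin) to a sorted, unique list of
# library names, i.e. the first field of every non-empty line.
lib_names() {
    awk 'NF { print $1 }' | sort -u
}

# Print the libraries that only one of the two given binaries depends on.
compare_libs() {
    a=$(mktemp) && b=$(mktemp) || return 1
    ldd "$1" | lib_names > "$a"
    ldd "$2" | lib_names > "$b"
    echo "only in $1:"; comm -23 "$a" "$b"
    echo "only in $2:"; comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Example call (paths are assumptions about the local installation):
# compare_libs /opt/amber/exe/sander.MPI /opt/amber/exe/pmemd
```

If libsvml.so shows up only for pmemd, that would be consistent with sander running fine on nodes where pmemd fails.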

On Wed, 06 May 2009 22:08:56 +0200, Robert Duke <rduke_at_email.unc.edu>
wrote:

> Hi Marek,
> I think you are probably going to need to get somebody involved on your
> end that understands the intricacies of runtime loading of shared
> libraries; it is probably best to go this route rather than hacking the
> build, which is known to work if you don't mess with it (maybe not on
> your setup at the moment, but that is because something is not being set
> up correctly). The key here is being able to handle getting at the ifort
> shared libraries from all nodes in the cluster. Sorry again this has
> been so hard.
> Regards - Bob
> ----- Original Message -----
> From: "Marek Malý" <maly_at_sci.ujep.cz>
> To: "AMBER Mailing List" <amber_at_ambermd.org>
> Sent: Wednesday, May 06, 2009 4:02 PM
> Subject: Re: [AMBER] Error in PMEMD run
>
>
> Hi Ross,
>
> I just tested pmemd on the same (calculation) node where I compiled
> it, still with the same error.
>
> I also found in my personal notes that compilation with the -static
> flag did not go through:
>
> ./configure_amber -intelmpi ifort -static
>
> I can try again and send you the errors that appeared during
> compilation ...
>
> But anyway thank you for your suggestions.
>
> Best,
>
> Marek
>
>
>
>
>
> On Wed, 06 May 2009 21:41:10 +0200, Ross Walker <ross_at_rosswalker.co.uk>
> wrote:
>
>> Hi Marek,
>>
>> Here is my take on what is going on here. It may be right, it may not
>> be, but this is what I guess it is.
>>
>> 1) When you compile PMEMD with MKL it always links in the static
>> libraries. Thus it doesn't matter what the environment is at run time,
>> just at build time.
>>
>> 2) When it links svml it links the shared version of svml. This is part
>> of the ifort compiler suite.
>>
>> Thus you have statically linked MKL and dynamically linked svml.
>>
>> My guess then would be that when you run the code your environment is
>> different in some way: either the shell is different, the paths are
>> different, or it is a different node with different versions of the
>> Intel compiler installed. Either way this is messing up the dynamic
>> link to svml.
>>
>> To fix this you either need to find out what is wrong with your
>> environment (i.e. what is different between when you build and when you
>> run) or build a statically linked version of pmemd.
>>
>> All the best
>> Ross
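
One way to look for the kind of environment difference described above is to list which directories on LD_LIBRARY_PATH actually contain libsvml.so, in search order. This is a minimal sketch, assuming a POSIX shell; the function name dirs_providing is made up for illustration, and it only inspects LD_LIBRARY_PATH, not rpath entries in the binary or the system loader cache.

```shell
#!/bin/sh
# Print every directory of the colon-separated list $1 that contains a
# file named $2, in list order. Runs in a subshell so the IFS change
# does not leak out. Empty entries are skipped, and directory names are
# assumed to contain no wildcard characters.
dirs_providing() (
    IFS=:
    for d in $1; do
        if [ -n "$d" ] && [ -e "$d/$2" ]; then
            printf '%s\n' "$d"
        fi
    done
)

# Which LD_LIBRARY_PATH entries provide the svml shared library?
dirs_providing "${LD_LIBRARY_PATH:-}" libsvml.so
```

Running the same lines on the build node and on a compute node and comparing the output may show whether a different copy of the library (for example one from another compiler version) would be found first.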
>>
>>> -----Original Message-----
>>> From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org] On
>>> Behalf Of Robert Duke
>>> Sent: Wednesday, May 06, 2009 12:30 PM
>>> To: AMBER Mailing List
>>> Subject: Re: [AMBER] Error in PMEMD run
>>>
>>> Hi Marek,
>>> As an additional note, I did look at your config.h, and it looks to me
>>> like the ifort setup should be fine, so I am pretty puzzled as to what
>>> is going on. If it still doesn't work, please send some additional info
>>> about your hardware setup; I would note that I use ifort 10.1.021 all
>>> the time without problems, but don't know if there is something odd
>>> about 012 or not.
>>> Best Regards - Bob
>>>
>>> ----- Original Message -----
>>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>>> To: "AMBER Mailing List" <amber_at_ambermd.org>
>>> Sent: Wednesday, May 06, 2009 2:42 PM
>>> Subject: Re: [AMBER] Error in PMEMD run
>>>
>>>
>>> Sorry, I have now realised that you were probably talking about
>>> "config.h", not about the configure file, so please find this pmemd
>>> config file attached. "-lsvml" is present there.
>>>
>>> So if it is necessary to modify this file, please tell me how, or please
>>> edit it and send it back.
>>>
>>> Thanks a lot !
>>>
>>> Best,
>>>
>>> Marek
>>>
>>>
>>> On Wed, 06 May 2009 20:30:43 +0200, Marek Malý <maly_at_sci.ujep.cz>
>>> wrote:
>>>
>>> > Dear Bob,
>>> >
>>> > I am definitely getting lost :))
>>> >
>>> > OK, first of all, neither the original nor your configure file for
>>> > pmemd contains the "-lsvml" parameter. This string simply does not
>>> > exist in the file; please see the attached file "configure" (this is
>>> > the last version which you sent me). In the configuration file for
>>> > Amber (please see the attached file "configure_amber") there is one
>>> > occurrence of this parameter, in the "IA32 Intel compilers" part.
>>> >
>>> > Here is the whole path to our ifort compiler:
>>> >
>>> > /opt/intel/fce/10.1.012/bin/ifort
>>> >
>>> > All the other paths are listed in my previous email (please see
>>> > below); the list there is the output of the "env" command.
>>> >
>>> > My config line for pmemd is this:
>>> >
>>> > ./configure linux_em64t ifort intelmpi pubfft bintraj
>>> >
>>> > If I can provide you with more useful information, please just let me
>>> > know.
>>> >
>>> > For the moment, thank you very much for your time and effort!
>>> >
>>> > Best,
>>> >
>>> > Marek
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Wed, 06 May 2009 19:30:43 +0200, Robert Duke <rduke_at_email.unc.edu>
>>> > wrote:
>>> >
>>> >> Hi Marek,
>>> >> Well, I have been plowing around in the Intel MKL libraries, and the
>>> >> unresolved symbol you list is not defined in either MKL 8 or 10, so
>>> >> that is why trying to fix the MKL does not work. It is instead
>>> >> defined in libsvml.so (for shared libs) and libsvml.a (for static
>>> >> libs). Normally you get the shared lib linked in by including -lsvml
>>> >> in the link line, which should be happening in your config file (if
>>> >> you look at the config data files, this happens for everything except
>>> >> linux_p3_athlon.ifort, which is probably now broken, but also
>>> >> probably now completely unused (hence folks are not complaining - any
>>> >> chance you were using this one?)). So this is NOT an MKL problem, but
>>> >> a problem getting to an svml function, perhaps called by some other
>>> >> function. Okay, so first question - are you setting up the ifort
>>> >> environment in the manner specified by the compiler (you source
>>> >> something like /opt/intel/fc/10.whatever/bin/ifortvars.csh or
>>> >> ifortvars.sh depending on which shell you use)? You need to do an
>>> >> equivalent thing for MKL, by the way. Then if you did not specify
>>> >> linux_p3_athlon, what exactly did you use when you ran configure? We
>>> >> are finally narrowing it down... Sorry I did not pick up on this
>>> >> right away - so many math function linkage problems stem from the
>>> >> chaos surrounding the interface to MKL.
>>> >> Best Regards - Bob
>>> >>
>>> >> ----- Original Message -----
>>> >> From: "Marek Malý" <maly_at_sci.ujep.cz>
>>> >> To: "AMBER Mailing List" <amber_at_ambermd.org>
>>> >> Sent: Wednesday, May 06, 2009 11:58 AM
>>> >> Subject: Re: [AMBER] Error in PMEMD run
>>> >>
>>> >>
>>> >> Dear Bob,
>>> >>
>>> >> unfortunately your "configure patch" didn't help me.
>>> >>
>>> >> I first tried only configuring pmemd with your new configure file and
>>> >> running the simulation (still the same error); then I also recompiled
>>> >> pmemd after configuring with the new configure file, but the same
>>> >> error appears again (undefined symbol: __svml_cos2).
>>> >>
>>> >> Regarding your question about the version of our ifort compiler, our
>>> >> current version is: "Intel(R) 64, Version 10.1 Build 20080112
>>> >> Package ID: l_fc_p_10.1.012"
>>> >>
>>> >> If you have no other idea, the best solution for the moment will
>>> >> probably be to use pmemd without MKL. If pmemd uses MKL just for the
>>> >> implicit solvent calculations, that will be acceptable for me, since,
>>> >> as I wrote earlier, I am now dealing only with explicit solvent
>>> >> calculations.
>>> >>
>>> >> So please tell me which lines I should delete from the configure file
>>> >> to prevent pmemd from being linked with MKL, and which configure file
>>> >> (the original or yours) I have to use now. I assume that in this
>>> >> situation it doesn't matter.
>>> >>
>>> >> Thank you very much in advance !
>>> >>
>>> >> Best,
>>> >>
>>> >> Marek
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Tue, 05 May 2009 06:08:37 +0200, Robert Duke
>>> >> <rduke_at_email.unc.edu> wrote:
>>> >>
>>> >>> Okay, attempt at a late-night fix. Attached is a tar ball for pmemd
>>> >>> configuration, basically with two files. If you untar this, it will
>>> >>> expand into a config_stuff dir. This then contains a new "configure"
>>> >>> and a new config_data/interconnect.intelmpi (which you maybe can use
>>> >>> if you really have intel mpi). So copy the two files into your
>>> >>> existing pmemd tree (saving old files first, just in case), and
>>> >>> rerun ./configure in the pmemd directory, and hopefully all will be
>>> >>> well.
>>> >>> Regards - Bob
>>> >>> ----- Original Message -----
>>> >>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>>> >>> To: "AMBER Mailing List" <amber_at_ambermd.org>
>>> >>> Sent: Monday, May 04, 2009 10:19 PM
>>> >>> Subject: Re: [AMBER] Error in PMEMD run
>>> >>>
>>> >>>
>>> >>> Dear Bob,
>>> >>>
>>> >>> actually we have MKL version 10.0.011 installed, as is clear from the
>>> >>> "env" listing below. For now I would like to use PMEMD just for
>>> >>> explicit solvent simulations, but of course I would be happy to have
>>> >>> the possibility of using PMEMD for implicit solvent calculations as
>>> >>> well. So I will appreciate any idea which can help to fix this
>>> >>> problem.
>>> >>>
>>> >>> Thanks in advance !
>>> >>>
>>> >>> Best,
>>> >>>
>>> >>> Marek
>>> >>>
>>> >>> MANPATH=/opt/intel/mkl/10.0.011/man:/opt/intel/cce/9.1.043/man:/opt/intel/fce/10.1.012/man:/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/man
>>> >>> INTEL_LICENSE_FILE=/opt/intel/fce/10.1.012/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses:/Users/Shared/Library/Application Support/Intel/Licenses:/opt/intel/cce/9.1.043/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses
>>> >>> TERM=xterm
>>> >>> SHELL=/bin/bash
>>> >>> SSH_CLIENT=192.168.0.15 37849 22
>>> >>> LIBRARY_PATH=/opt/intel/mkl/10.0.011/lib/em64t
>>> >>> SGE_CELL=default
>>> >>> FPATH=/opt/intel/mkl/10.0.011/include
>>> >>> SSH_TTY=/dev/pts/3
>>> >>> USER=mmaly
>>> >>> LD_LIBRARY_PATH=/opt/intel/impi/3.1/lib64:/opt/intel/mkl/10.0.011/lib/em64t:/opt/intel/cce/9.1.043/lib:/opt/intel/fce/10.1.012/lib::/opt/intel/impi/3.1/lib64
>>> >>> LS_COLORS=no=00:fi=00:di=01
>>> >>> CPATH=/opt/intel/mkl/10.0.011/include
>>> >>> PAGER=/usr/bin/less
>>> >>> CONFIG_PROTECT_MASK=/etc/fonts/fonts.conf /etc/terminfo
>>> >>> MAIL=/var/mail/mmaly
>>> >>> PATH=/opt/intel/impi/3.1/bin64:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc:/opt/amber/exe:/opt/sge/bin/lx24-amd64
>>> >>> PWD=/home/mmaly
>>> >>> SGE_EXECD_PORT=537
>>> >>> EDITOR=/bin/nano
>>> >>> SGE_QMASTER_PORT=536
>>> >>> SGE_ROOT=/opt/sge
>>> >>> MKL_HOME=/opt/intel/mkl/10.0.011
>>> >>> INTEL_PATHS=/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc
>>> >>> SHLVL=1
>>> >>> HOME=/home/mmaly
>>> >>> DYLD_LIBRARY_PATH=/opt/intel/fce/10.1.012/lib
>>> >>> PYTHONPATH=/usr/lib64/portage/pym
>>> >>> LESS=-R -M --shift 5
>>> >>> LOGNAME=mmaly
>>> >>> GCC_SPECS=
>>> >>> CVS_RSH=ssh
>>> >>> SSH_CONNECTION=192.168.0.15 37849 192.168.0.13 22
>>> >>> MPI_HOME=/opt/intel/impi/3.1
>>> >>> LESSOPEN=|lesspipe.sh %s
>>> >>> INFOPATH=/usr/share/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/info:/usr/share/info/emacs-22
>>> >>> INCLUDE=/opt/intel/mkl/10.0.011/include
>>> >>> AMBERHOME=/opt/amber
>>> >>> _=/usr/bin/env
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tue, 05 May 2009 03:35:54 +0200, Robert Duke
>>> >>> <rduke_at_email.unc.edu> wrote:
>>> >>>
>>> >>>> This looks to me like an MKL linkage problem. If you don't need
>>> >>>> generalized Born, you can make this go away by simply not choosing
>>> >>>> to use MKL when you run pmemd configure. Otherwise, we do have more
>>> >>>> recent directions that work with the latest versions of MKL. If
>>> >>>> you want to use this, let me know your version of MKL and I will
>>> >>>> dig up the appropriate new version of pmemd configure that should
>>> >>>> work (I think I have posted fixed versions to the list before; we
>>> >>>> should probably release a patch, but in the meantime I can dig out
>>> >>>> the last posting if you want GB support with MKL).
>>> >>>> Best Regards - Bob Duke
>>> >>>> ----- Original Message -----
>>> >>>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>>> >>>> To: <amber_at_ambermd.org>
>>> >>>> Sent: Monday, May 04, 2009 9:23 PM
>>> >>>> Subject: [AMBER] Error in PMEMD run
>>> >>>>
>>> >>>>
>>> >>>> Dear amber users,
>>> >>>>
>>> >>>> I installed Amber10 on our cluster some time ago. Now I have
>>> >>>> started some calculations and I have a problem with PMEMD.
>>> >>>>
>>> >>>> When I try to switch (after the minimisation, heating and density
>>> >>>> equilibration phases) from SANDER to PMEMD, my calculation fails,
>>> >>>> starting with this error line:
>>> >>>>
>>> >>>> "symbol lookup error: /opt/amber/exe/pmemd: undefined symbol:
>>> >>>> __svml_cos2"
>>> >>>>
>>> >>>>
>>> >>>> Without switching to PMEMD everything is OK, i.e. SANDER works
>>> >>>> perfectly, but since I am working on big systems (hundreds of
>>> >>>> thousands of atoms), typically as 32-64 CPU jobs, I would like to
>>> >>>> use PMEMD for my equil/production runs.
>>> >>>>
>>> >>>> I would be grateful for any useful info.
>>> >>>>
>>> >>>> With the best wishes
>>> >>>>
>>> >>>> Marek
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>> --
>>> This message was created with Opera's revolutionary e-mail client:
>>> http://www.opera.com/mail/
>>>
>>>
>>> -----------------------------------------------------------------------
>>> ---------
>>>
>>>
>>> > _______________________________________________
>>> > AMBER mailing list
>>> > AMBER_at_ambermd.org
>>> > http://lists.ambermd.org/mailman/listinfo/amber
>>> >
>>>
>>>
>>>
>>
>>
>

-- 
This message was created with Opera's revolutionary e-mail client:
http://www.opera.com/mail/

_______________________________________________ AMBER mailing list AMBER_at_ambermd.org http://lists.ambermd.org/mailman/listinfo/amber
