AMBER Archive (2009)
Subject: Re: [AMBER] Error in PMEMD run
From: Marek Malý (maly_at_sci.ujep.cz)
Date: Wed May 06 2009 - 16:27:17 CDT
Hi Ross,
that seems like a promising idea. I will try it tomorrow and let you
know.
Thanks a lot for now!
Best,
Marek
On Wed, 06 May 2009 22:36:36 +0200, Ross Walker <ross_at_rosswalker.co.uk>
wrote:
> Hi Marek,
>
> Sander's veclib library contains vdcos, vdtanh, etc., while pmemd's does
> not. Hence the vectorized cosine calls in sander go through the MKL
> library in your case. In pmemd, calls to cosine are vectorized through
> the ifort compiler's vector svml library, and this is where the problem
> lies. Either this library is corrupted in some way, or an incorrect
> version (different from the one used at compile time) is being picked up
> at runtime, etc. If you can't work out what is wrong with your ifort
> setup, there are some possible hacks that could work.
>
> 1) Try linking without the svml library: just remove it from the
> config.h file, run make clean, and then build. It then won't try to
> vectorize the cos calls, and it should work.
>
> 2) Modify pmemd to include vdcos in veclib.f and replace all calls to
> cos() with calls to vdcos(); it will then use the MKL vector library.
>
> All the best
> Ross
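Option 1 above amounts to deleting one linker flag from the generated config.h before rebuilding. A minimal sketch of that edit, assuming the flag appears as the whitespace-delimited token "-lsvml" (as mentioned elsewhere in this thread); the function name drop_link_flag and the sample fragment are hypothetical and not taken from a real pmemd config.h:

```python
import re

def drop_link_flag(config_text, flag="-lsvml"):
    """Return config_text with every standalone occurrence of flag removed."""
    # Match the flag only as a whole whitespace-delimited token, so that a
    # longer token such as "-lsvml_extra" is left untouched.
    pattern = re.compile(r"(?<!\S)" + re.escape(flag) + r"(?!\S)[ \t]*")
    cleaned = (pattern.sub("", line).rstrip() for line in config_text.splitlines())
    return "\n".join(cleaned)

# Hypothetical fragment: the variable names and the other flags are made up
# for illustration only.
sample = "LOADLIB = -limf -lsvml -lpthread\nF90 = ifort"
print(drop_link_flag(sample))
```

Keep a backup of the original file, and run make clean before rebuilding, as suggested above.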
>
>> -----Original Message-----
>> From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org] On
>> Behalf Of Marek Malý
>> Sent: Wednesday, May 06, 2009 1:27 PM
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] Error in PMEMD run
>>
>> Hi Bob,
>>
>> thanks again. What I do not understand, in light of your last comment,
>> is why the parallel version of Amber, which I compiled on the same node
>> as pmemd, can be used on any node of our cluster without any problems.
>> It works perfectly: I now have 3 calculations running on 32 CPUs per job
>> (each job on four 8-CPU nodes), and they have been running since Monday
>> without problems (minimisation and MD, explicit solvent).
>>
>> So where could such a big difference come from? Is it that Amber uses
>> different shared libraries than pmemd?
>>
>> Best
>>
>> Marek
>>
>>
>> On Wed, 06 May 2009 22:08:56 +0200, Robert Duke <rduke_at_email.unc.edu>
>> wrote:
>>
>> > Hi Marek,
>> > I think you are probably going to need to get somebody involved on
>> > your end that understands the intricacies of runtime loading of shared
>> > libraries; it is probably best to go this route rather than hacking
>> > the build, which is known to work if you don't mess with it (maybe not
>> > on your setup at the moment, but that is because something is not
>> > being set up correctly). The key here is being able to handle getting
>> > at the ifort shared libraries from all nodes in the cluster. Sorry
>> > again this has been so hard.
>> > Regards - Bob
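One way to check whether every node reaches the same shared libraries is to capture the output of "ldd /opt/amber/exe/pmemd" on each node and compare how each library resolves. A minimal sketch, assuming the usual "name => path (address)" layout of ldd output; the helper names parse_ldd and differing_libs and both sample outputs are hypothetical and not taken from this cluster:

```python
def parse_ldd(ldd_output):
    """Map each library name in ldd output to its resolved path (None if not found)."""
    resolved = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader line, which has no "=>"
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            resolved[name.strip()] = None
            continue
        # Drop the trailing load address, e.g. "(0x00002b...)".
        path = target.split("(")[0].strip()
        if path:
            resolved[name.strip()] = path
    return resolved

def differing_libs(ldd_a, ldd_b):
    """Return the sorted library names that resolve differently in the two outputs."""
    a, b = parse_ldd(ldd_a), parse_ldd(ldd_b)
    return sorted(n for n in set(a) | set(b) if a.get(n) != b.get(n))

# Made-up outputs for two nodes, for illustration only.
node_a = "\tlibsvml.so => /path/on/node_a/libsvml.so (0x00002b0000000000)\n\tlibc.so.6 => /lib64/libc.so.6 (0x00002b0000200000)\n"
node_b = "\tlibsvml.so => not found\n\tlibc.so.6 => /lib64/libc.so.6 (0x00002b0000200000)\n"
print(differing_libs(node_a, node_b))
```

Any library name reported here is a candidate for the kind of runtime loading mismatch described above.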
>> > ----- Original Message -----
>> > From: "Marek Malý" <maly_at_sci.ujep.cz>
>> > To: "AMBER Mailing List" <amber_at_ambermd.org>
>> > Sent: Wednesday, May 06, 2009 4:02 PM
>> > Subject: Re: [AMBER] Error in PMEMD run
>> >
>> >
>> > Hi Ross,
>> >
>> > I just tested pmemd on the same (compute) node where I compiled it,
>> > and I still get the same error.
>> >
>> > I also found in my personal notes that compilation with the -static
>> > flag did not succeed:
>> >
>> > ./configure_amber -intelmpi ifort -static
>> >
>> > I can try again and send you the errors that appeared during
>> > compilation ...
>> >
>> > But anyway thank you for your suggestions.
>> >
>> > Best,
>> >
>> > Marek
>> >
>> >
>> >
>> >
>> >
>> > On Wed, 06 May 2009 21:41:10 +0200, Ross Walker <ross_at_rosswalker.co.uk>
>> > wrote:
>> >
>> >> Hi Marek,
>> >>
>> >> Here is my take on what is going on here. It may or may not be right,
>> >> but this is my guess.
>> >>
>> >> 1) When you compile PMEMD with MKL, it always links in the static
>> >> libraries. Thus it doesn't matter what the environment is at run time,
>> >> only at build time.
>> >>
>> >> 2) When it links svml, it links the shared version of svml. This is
>> >> part of the ifort compiler suite.
>> >>
>> >> Thus you have statically linked MKL and dynamically linked svml.
>> >>
>> >> My guess then would be that when you run the code, your environment is
>> >> different in some way: either the shell is different, the paths are
>> >> different, or it is a different node with different versions of the
>> >> Intel compiler installed. Either way, this is messing up the dynamic
>> >> link to svml.
>> >>
>> >> To fix this, you either need to find out what is wrong with your
>> >> environment (i.e. what is different between when you build and when
>> >> you run) or build a statically linked version of pmemd.
>> >>
>> >> All the best
>> >> Ross
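Because svml is linked dynamically, the copy that gets loaded depends on the library search path at run time, and the dynamic loader tries the LD_LIBRARY_PATH entries in order. The environment listing quoted further down in this thread has /opt/intel/cce/9.1.043/lib ahead of /opt/intel/fce/10.1.012/lib; whether that ordering matters here is not established in the thread. A minimal sketch for listing, in search order, which directories hold a given library (the function name find_library is hypothetical, and this only inspects LD_LIBRARY_PATH, not the loader cache or any rpath):

```python
import os

def find_library(name, search_path):
    """Return the directories in search_path, in order, that contain name."""
    if not search_path:
        return []  # variable unset or empty: nothing to search here
    hits = []
    for entry in search_path.split(":"):
        # An empty entry (as in "a::b") stands for the current directory.
        directory = entry or "."
        if os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits

# The first hit is the copy the loader would try first from this variable.
print(find_library("libsvml.so", os.environ.get("LD_LIBRARY_PATH", "")))
```

Running this in the build shell and again inside a batch job on a compute node gives a quick way to compare the two environments.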
>> >>
>> >>> -----Original Message-----
>> >>> From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org]
>> >>> On Behalf Of Robert Duke
>> >>> Sent: Wednesday, May 06, 2009 12:30 PM
>> >>> To: AMBER Mailing List
>> >>> Subject: Re: [AMBER] Error in PMEMD run
>> >>>
>> >>> Hi Marek,
>> >>> As an additional note, I did look at your config.h, and it looks to
>> >>> me like the ifort setup should be fine, so I am pretty puzzled as to
>> >>> what is going on. If it still doesn't work, please send some
>> >>> additional info about your hardware setup; I would note that I use
>> >>> ifort 10.1.021 all the time without problems, but don't know if there
>> >>> is something odd about 012 or not.
>> >>> Best Regards - Bob
>> >>>
>> >>> ----- Original Message -----
>> >>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>> >>> To: "AMBER Mailing List" <amber_at_ambermd.org>
>> >>> Sent: Wednesday, May 06, 2009 2:42 PM
>> >>> Subject: Re: [AMBER] Error in PMEMD run
>> >>>
>> >>>
>> >>> Sorry, I now realise that you were probably talking about "config.h",
>> >>> not about the configure file, so please find this pmemd config file
>> >>> attached; "-lsvml" is present in it.
>> >>>
>> >>> So if it is necessary to modify this file, please tell me how, or
>> >>> please edit it and send it back.
>> >>>
>> >>> Thanks a lot !
>> >>>
>> >>> Best,
>> >>>
>> >>> Marek
>> >>>
>> >>>
>> >>> On Wed, 06 May 2009 20:30:43 +0200, Marek Malý <maly_at_sci.ujep.cz>
>> >>> wrote:
>> >>>
>> >>> > Dear Bob,
>> >>> >
>> >>> > I am definitely getting lost :))
>> >>> >
>> >>> > OK, first of all, neither the original nor your configure file for
>> >>> > pmemd contains the "-lsvml" parameter. This string simply doesn't
>> >>> > exist in the file; please see the attached file "configure" (this is
>> >>> > the last version you sent me). In the configuration file for Amber
>> >>> > (please see the attached file "configure_amber") there is one
>> >>> > occurrence of this parameter, in the part "IA32 Intel compilers".
>> >>> >
>> >>> > Here is the whole path to our ifort compiler:
>> >>> >
>> >>> > /opt/intel/fce/10.1.012/bin/ifort
>> >>> >
>> >>> > All the other paths are listed in my previous email (please see
>> >>> > below), in the output of the "env" command.
>> >>> >
>> >>> > My configure line for pmemd is this:
>> >>> >
>> >>> > ./configure linux_em64t ifort intelmpi pubfft bintraj
>> >>> >
>> >>> > If I can provide any more useful information, please just let me know.
>> >>> >
>> >>> > Thank you very much for your time and effort!
>> >>> >
>> >>> > Best,
>> >>> >
>> >>> > Marek
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Wed, 06 May 2009 19:30:43 +0200, Robert Duke <rduke_at_email.unc.edu>
>> >>> > wrote:
>> >>> >
>> >>> >> Hi Marek,
>> >>> >> Well, I have been plowing around in the Intel MKL libraries, and
>> >>> >> the unresolved symbol you list is not defined in either MKL 8 or
>> >>> >> 10, so that is why trying to fix the MKL does not work. It is
>> >>> >> instead defined in libsvml.so (for shared libs) and libsvml.a (for
>> >>> >> static libs). Normally you get the shared lib linked in by
>> >>> >> including -lsvml in the link line, which should be happening in
>> >>> >> your config file (if you look at the config data files, this
>> >>> >> happens for everything except linux_p3_athlon.ifort, which is
>> >>> >> probably now broken, but also probably now completely unused (hence
>> >>> >> folks are not complaining; any chance you were using this one?)).
>> >>> >> So this is NOT an MKL problem, but a problem getting to an svml
>> >>> >> function, perhaps called by some other function. Okay, so first
>> >>> >> question: are you setting up the ifort environment in the manner
>> >>> >> specified by the compiler (you source something like
>> >>> >> /opt/intel/fc/10.whatever/bin/ifortvars.csh or ifortvars.sh,
>> >>> >> depending on which shell you use)? You need to do an equivalent
>> >>> >> thing for MKL, by the way. Then, if you did not specify
>> >>> >> linux_p3_athlon, what exactly did you use when you ran configure?
>> >>> >> We are finally narrowing it down... Sorry I did not pick up on this
>> >>> >> right away; so many math function linkage problems stem from the
>> >>> >> chaos surrounding the interface to MKL.
>> >>> >> Best Regards - Bob
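Whether a particular libsvml.so actually defines the missing symbol can be checked by listing its dynamic symbols with GNU nm (for example "nm -D --defined-only" on the library file) and searching the output. A minimal sketch of that search, assuming plain "address type name" lines without symbol-version suffixes; the function name defines_symbol and the sample nm output are hypothetical:

```python
def defines_symbol(nm_output, symbol):
    """Return True if symbol appears as a defined (non-'U') entry in nm output."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined entries look like "<address> <type> <name>"; undefined
        # entries carry no address and look like "U <name>".
        if len(fields) == 3 and fields[2] == symbol and fields[1] != "U":
            return True
    return False

# Made-up nm output, for illustration only.
sample_nm = "0000000000012340 T __svml_cos2\n                 U cos\n"
print(defines_symbol(sample_nm, "__svml_cos2"))
```

Applying this to each libsvml.so found on the library search path would show which copies, if any, lack __svml_cos2.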
>> >>> >>
>> >>> >> ----- Original Message -----
>> >>> >> From: "Marek Malý" <maly_at_sci.ujep.cz>
>> >>> >> To: "AMBER Mailing List" <amber_at_ambermd.org>
>> >>> >> Sent: Wednesday, May 06, 2009 11:58 AM
>> >>> >> Subject: Re: [AMBER] Error in PMEMD run
>> >>> >>
>> >>> >>
>> >>> >> Dear Bob,
>> >>> >>
>> >>> >> unfortunately your "configure patch" didn't help.
>> >>> >>
>> >>> >> I first just configured pmemd with your new configure file and ran
>> >>> >> the simulation (still the same error); then I also recompiled pmemd
>> >>> >> after configuring with the new configure file, but the same error
>> >>> >> appears again (undefined symbol: __svml_cos2).
>> >>> >>
>> >>> >> Regarding your question about the version of our ifort compiler,
>> >>> >> our current version is: "Intel(R) 64, Version 10.1 Build 20080112
>> >>> >> Package ID: l_fc_p_10.1.012"
>> >>> >>
>> >>> >> If you have no other idea, the best solution for the moment will
>> >>> >> probably be to use pmemd without MKL. If pmemd uses MKL just for
>> >>> >> the implicit solvent calculations, that is acceptable for me now
>> >>> >> since, as I wrote earlier, I am currently dealing only with
>> >>> >> explicit solvent calculations.
>> >>> >>
>> >>> >> So please tell me which lines I should delete from the configure
>> >>> >> file to prevent linking pmemd with MKL, and which configure file
>> >>> >> (the original or yours) I have to use now. I assume that in this
>> >>> >> situation it doesn't matter.
>> >>> >>
>> >>> >> Thank you very much in advance !
>> >>> >>
>> >>> >> Best,
>> >>> >>
>> >>> >> Marek
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Tue, 05 May 2009 06:08:37 +0200, Robert Duke <rduke_at_email.unc.edu>
>> >>> >> wrote:
>> >>> >>
>> >>> >>> Okay, attempt at a late-night fix. Attached is a tarball for pmemd
>> >>> >>> configuration, basically with two files. If you untar this, it
>> >>> >>> will expand into a config_stuff dir. This then contains a new
>> >>> >>> "configure" and a new config_data/interconnect.intelmpi (which you
>> >>> >>> maybe can use if you really have intel mpi). So copy the two files
>> >>> >>> into your existing pmemd tree (saving old files first, just in
>> >>> >>> case), and rerun ./configure in the pmemd directory, and hopefully
>> >>> >>> all will be well.
>> >>> >>> Regards - Bob
>> >>> >>> ----- Original Message -----
>> >>> >>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>> >>> >>> To: "AMBER Mailing List" <amber_at_ambermd.org>
>> >>> >>> Sent: Monday, May 04, 2009 10:19 PM
>> >>> >>> Subject: Re: [AMBER] Error in PMEMD run
>> >>> >>>
>> >>> >>>
>> >>> >>> Dear Bob,
>> >>> >>>
>> >>> >>> actually we have MKL version 10.0.011 installed, as is clear from
>> >>> >>> the "env" list below. For now I would like to use PMEMD just for
>> >>> >>> the explicit solvent simulations, but of course I would be happy
>> >>> >>> to be able to use PMEMD for the implicit solvent calculations as
>> >>> >>> well. So I will appreciate any idea which can help to fix this
>> >>> >>> problem.
>> >>> >>>
>> >>> >>> Thanks in advance !
>> >>> >>>
>> >>> >>> Best,
>> >>> >>>
>> >>> >>> Marek
>> >>> >>>
>> >>> >>>
>> >>> >>> MANPATH=/opt/intel/mkl/10.0.011/man:/opt/intel/cce/9.1.043/man:/opt/intel/fce/10.1.012/man:/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/man
>> >>> >>> INTEL_LICENSE_FILE=/opt/intel/fce/10.1.012/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses:/Users/Shared/Library/Application Support/Intel/Licenses:/opt/intel/cce/9.1.043/licenses:/opt/intel/licenses:/home/mmaly/intel/licenses
>> >>> >>> TERM=xterm
>> >>> >>> SHELL=/bin/bash
>> >>> >>> SSH_CLIENT=192.168.0.15 37849 22
>> >>> >>> LIBRARY_PATH=/opt/intel/mkl/10.0.011/lib/em64t
>> >>> >>> SGE_CELL=default
>> >>> >>> FPATH=/opt/intel/mkl/10.0.011/include
>> >>> >>> SSH_TTY=/dev/pts/3
>> >>> >>> USER=mmaly
>> >>> >>> LD_LIBRARY_PATH=/opt/intel/impi/3.1/lib64:/opt/intel/mkl/10.0.011/lib/em64t:/opt/intel/cce/9.1.043/lib:/opt/intel/fce/10.1.012/lib::/opt/intel/impi/3.1/lib64
>> >>> >>> LS_COLORS=no=00:fi=00:di=01
>> >>> >>> CPATH=/opt/intel/mkl/10.0.011/include
>> >>> >>> PAGER=/usr/bin/less
>> >>> >>> CONFIG_PROTECT_MASK=/etc/fonts/fonts.conf /etc/terminfo
>> >>> >>> MAIL=/var/mail/mmaly
>> >>> >>> PATH=/opt/intel/impi/3.1/bin64:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc:/opt/amber/exe:/opt/sge/bin/lx24-amd64
>> >>> >>> PWD=/home/mmaly
>> >>> >>> SGE_EXECD_PORT=537
>> >>> >>> EDITOR=/bin/nano
>> >>> >>> SGE_QMASTER_PORT=536
>> >>> >>> SGE_ROOT=/opt/sge
>> >>> >>> MKL_HOME=/opt/intel/mkl/10.0.011
>> >>> >>> INTEL_PATHS=/opt/intel/cce/9.1.043/bin:/opt/intel/fce/10.1.012/bin:/opt/intel/impi/3.1/bin:/opt/intel/idbe/9.1.043/bin:/opt/intel/idbe/10.1.012/bin:/opt/intel/etc
>> >>> >>> SHLVL=1
>> >>> >>> HOME=/home/mmaly
>> >>> >>> DYLD_LIBRARY_PATH=/opt/intel/fce/10.1.012/lib
>> >>> >>> PYTHONPATH=/usr/lib64/portage/pym
>> >>> >>> LESS=-R -M --shift 5
>> >>> >>> LOGNAME=mmaly
>> >>> >>> GCC_SPECS=
>> >>> >>> CVS_RSH=ssh
>> >>> >>> SSH_CONNECTION=192.168.0.15 37849 192.168.0.13 22
>> >>> >>> MPI_HOME=/opt/intel/impi/3.1
>> >>> >>> LESSOPEN=|lesspipe.sh %s
>> >>> >>> INFOPATH=/usr/share/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.1.1/info:/usr/share/info/emacs-22
>> >>> >>> INCLUDE=/opt/intel/mkl/10.0.011/include
>> >>> >>> AMBERHOME=/opt/amber
>> >>> >>> _=/usr/bin/env
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> On Tue, 05 May 2009 03:35:54 +0200, Robert Duke <rduke_at_email.unc.edu>
>> >>> >>> wrote:
>> >>> >>>
>> >>> >>>> This looks to me like an MKL linkage problem. If you don't need
>> >>> >>>> generalized Born, you can make this go away by simply not
>> >>> >>>> choosing to use MKL when you run pmemd configure. Otherwise, we
>> >>> >>>> do have more recent directions that work with the latest versions
>> >>> >>>> of MKL. If you want to use this, let me know your version of MKL
>> >>> >>>> and I will dig up the appropriate new version of pmemd configure
>> >>> >>>> that should work (I think I have posted fixed versions to the
>> >>> >>>> list before; we should probably release a patch, but in the
>> >>> >>>> meantime I can dig out the last posting if you want GB support
>> >>> >>>> with MKL).
>> >>> >>>> Best Regards - Bob Duke
>> >>> >>>> ----- Original Message -----
>> >>> >>>> From: "Marek Malý" <maly_at_sci.ujep.cz>
>> >>> >>>> To: <amber_at_ambermd.org>
>> >>> >>>> Sent: Monday, May 04, 2009 9:23 PM
>> >>> >>>> Subject: [AMBER] Error in PMEMD run
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> Dear amber users,
>> >>> >>>>
>> >>> >>>> I installed Amber10 on our cluster some time ago. I have now
>> >>> >>>> started some calculations and have a problem with PMEMD.
>> >>> >>>>
>> >>> >>>> When I try to switch (after the minimisation, heating and density
>> >>> >>>> equilibration phases) from SANDER to PMEMD, my calculation fails,
>> >>> >>>> starting with this error line:
>> >>> >>>>
>> >>> >>>> "symbol lookup error: /opt/amber/exe/pmemd: undefined symbol: __svml_cos2"
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> Without switching to PMEMD everything is OK, i.e. SANDER works
>> >>> >>>> perfectly, but since I am working on big systems (hundreds of
>> >>> >>>> thousands of atoms), typically as 32-64 CPU jobs, I would like to
>> >>> >>>> use PMEMD for my equilibration/production runs.
>> >>> >>>>
>> >>> >>>> I would be grateful for any useful info.
>> >>> >>>>
>> >>> >>>> With the best wishes
>> >>> >>>>
>> >>> >>>> Marek
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>
>> >>> >>
>> >>> >
>> >>>
>> >>> --
>> >>> This message was created with Opera's revolutionary e-mail client:
>> >>> http://www.opera.com/mail/
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> > _______________________________________________
>> >>> > AMBER mailing list
>> >>> > AMBER_at_ambermd.org
>> >>> > http://lists.ambermd.org/mailman/listinfo/amber
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> __________ Information from NOD32 4051 (20090504) __________
>> >>
>> >> This message was checked by the NOD32 antivirus system.
>> >> http://www.nod32.cz
>> >>
>> >>
>> >
>>
>
>
>
>