AMBER Archive (2008)

Subject: AMBER: A new program for doing RESP

From: David Cerutti (dcerutti_at_mccammon.ucsd.edu)
Date: Thu Feb 07 2008 - 00:56:04 CST


Hello AMBER users,

    I jinned up my own resp utility last week, mostly as a way of having
more control over what's going on in the fitting procedure but also to set
the foundation, if I choose to go further, for doing RESP with truly
massive data sets and also very complicated molecules or even things with
additional molecular mechanics features that weren't in the original QM
calculation.

Here's how it works:

>> ./resp
resp >> A program for doing my own resp fits. Useful for making
resp >> my own restraints and other things that may help me create
resp >> charge sets that match both quantum calculations and
resp >> experimental observations.

Options:
   -i : the input file
     QMPHI : quantum-mechanical electrostatic potential fitting data
     REST : user-specified restraint
       ATOM : the integer number of the atom [ 1, 2, ..., N ]
                  followed by its weight
       TARGET : the target value for the sum of all
                  atoms * weights * charges in this restraint
     KEEP : general restraint weight to apply to each atom
               individually to keep its charge low (default 0.0)
     TOTALQ: total charge (e) on the system, implemented
             as a stiff constraint (default 0.0)
   -n : the number of fittings to perform (default 10)
   -t : size of the training set (as a fraction of all fitting data,
           default 0.5)
   -g : random seed for traning set selection (default 8992)

This says, "Do ten fits using inputs listed in gssdata.inp, reserving 50%
of the conformers for independent validation in each fit; use random seed
42941."
>> resp -i gssdata.inp -n 10 -t 0.5 -g 42941

And here's the input file:
>>>>
% First, the files
QMPHI mpd1.grd
QMPHI mpd2.grd
(...)
QMPHI mpd126.grd

% Now, restraints
REST ATOM 5 1000.0 ATOM 6 -1000.0 TARGET 0.0
REST ATOM 6 1000.0 ATOM 7 -1000.0 TARGET 0.0
REST ATOM 7 1000.0 ATOM 9 -1000.0 TARGET 0.0
REST ATOM 9 1000.0 ATOM 10 -1000.0 TARGET 0.0
REST ATOM 10 1000.0 ATOM 11 -1000.0 TARGET 0.0
REST ATOM 20 1000.0 ATOM 21 -1000.0 TARGET 0.0
REST ATOM 21 1000.0 ATOM 22 -1000.0 TARGET 0.0
REST ATOM 4 1000.0 ATOM 8 -1000.0 TARGET 0.0
REST ATOM 13 1000.0 ATOM 14 -1000.0 TARGET 0.0
KEEP 1.0
>>>>

The QMPHI records are the individual Gaussian'03 ESP outputs (it only
reads this format at the moment), the REST records add constraint
equations. This way I can set atom charges to be equal (mostly, it's
hydrogen and methyl carbons I'm concerned about above), or even coax the
program to insert a particular dipole into my system. The "KEEP"
parameter is optional; it adds constraint equations for every charge in
the system, loosely constraining them all to zero to keep charges as small
as possible while still giving an accurate fit. That's it. Just your
Gaussian outputs and the restraints you want.

    The program takes about ten seconds to read 126 conformers and do ten
fits of a 22-atom molecule; it does multiple Linear-Least Squares fits,
leaving a portion of the conformations for validation, and reports
statistics including the variance of each fitted charge:

resp >> Results for 10 fittings:
resp >>
resp >> Average correlation: 0.9822 +/- 0.0005
resp >> Average error (kcal/mol-e): 0.0053 +/- 0.0001

Atom # Average Q Std. Dev. Best Q
------ --------- --------- ---------
      1 -0.56735 0.00571 -0.56789
      2 0.35932 0.00249 0.35718
      3 0.25850 0.01788 0.26640
      4 -0.16424 0.01166 -0.16971
      5 0.04779 0.00240 0.04889
      6 0.04779 0.00240 0.04889
      7 0.04779 0.00240 0.04889
      8 -0.16424 0.01166 -0.16971
      9 0.04779 0.00240 0.04889
     10 0.04778 0.00240 0.04889
     11 0.04778 0.00240 0.04888
     12 -0.10182 0.01237 -0.09663
     13 0.04168 0.00323 0.03705
     14 0.04168 0.00323 0.03705
     15 0.21137 0.01434 0.21949
     16 -0.58102 0.00408 -0.58308
     17 0.36551 0.00283 0.36621
     18 0.02787 0.00489 0.02547
     19 -0.12944 0.01068 -0.12985
     20 0.03849 0.00231 0.03823
     21 0.03849 0.00231 0.03823
     22 0.03849 0.00231 0.03823

First, note that the average error is only 0.0053. If I were to use KEEP
0.0, the error only decreases to 0.0043 but the partial charges would be
about 0.1e bigger on average. In other words, adding a loose constraint
makes very little difference to the overall fit but really helps put
charge only where it's needed. Second, note that the charges on some of
these atoms are equal to the fifth decimal place--that's the stiff
restraints (see above) at work. If I had no restraints, this data set is
in fact large enough to get all those hydrogen charges equal to within 10%
or less. Third, note that although I didn't explicitly specify any
constraints involving more than two atoms, that can be done with the same
syntax (the total system charge constraint involves all atoms).

    Now, the first reason for this code is to permit me great control over
the fitting. In the above example, atom pairs (1 and 2) and (16 and 17)
constitute hydroxyl groups. Let's say I was interested in getting some
experimental properties of this fluid (it's methyl-(2-4)-pentanediol)
right, and my hunch was that the hydrogen bonds weren't strong enough. I
could add this to the input file to make the hydrogen bonds stronger:

REST ATOM 1 10.0 ATOM 2 10.0 TARGET -3.0
REST ATOM 16 10.0 ATOM 17 10.0 TARGET -3.0

This will moderately force the sum of charges on the hydroxyl group
towards -0.3e (whereas without this constraint the optimal sum of those
charges comes to about -0.21e). The other charges would automatically be
fitted to maintain optimal agreement with the QM data under the new
limits. I could also try different stiffnesses of the restraint to
determine how much accuracy I have to sacrifice in order to ramp up those
dipoles.

    Another thing I'm interested in with this molecule is its compatibility
with a particular water model. With these controls, particularly the KEEP
parameter, I can modulate the hydrophilicity of the fitted charges to
otpimize agreement with experimental densities for binary mixtures (the
above input seems to do pretty well--better to have KEEP 1.0 than KEEP
0.0).

    So, take my word for it, this code is easy to modify. Users could
easily add routines to iteratively run the fitting to determine optimal
positions for charged extra points or even polarization constants. (The
program could be included in scripts as-is to accomplish the same
purposes, though it would be more cumbersome.)

    If anyone is interested in obtaining a copy of the software for their
own use, please contact me directly. If anyone has comments, I suppose
the listserv is an appropriate place.

Dave
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber_at_scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo_at_scripps.edu