Instructions for Running Structure Calculations
Written by:
Patty Fagan Jones and Melanie Nelson,
1999
This document contains some basic instructions for running distance filtering
during NMR structure calculations using DIANA and AMBER.
Once you can define a reasonable family of structures, you can
use distance filtering to help you find more restraints. This is the
process by which you let the sum of the rest of the data
(information that is contained in your preliminary family of
structures) help you determine which of the possible assignments
(based on chemical shifts) for a given NOE is valid. Of course, you
will need to be conservative about making assignments, particularly
early in the structure calculations when you have a poorer family of
structures. There are several ways to do distance filtering. What is
required is:
- A way to generate the possible assignments for each NOE.
(Methods for this include using GENXPK and using FELIX's built in
assignment procedures)
- A method for generating the distances between protons in your
family of structures. (Methods for this include using either the
distance or noevio programs in Garry Gippert's
GAP package)
- A method for connecting these two pieces of information, and
using the distances to rule out potential assignments. (Methods for
this include two scripts written by Randy Ketchem: disambig,
which is a more manual method and is particularly useful at the
early stages of distance filtering, when you need to be most
conservative, and doambig, which is a wrapper script
that calls several different filters, including one or more distance
filters. The more manual disambig script is also useful
when working in regions where simply decreasing the number of
possible assignments to 2-3 may allow assignments based on chemical
shifts to be made. I found this to be the case in my D2O
NOESY, where I often could distinguish between two aromatic
protons that were close in chemical shift, but not between the
variety of aliphatic protons to which they could be making an NOE.
Both disambig and the scripts called by
doambig will probably need to be modified to match the
atom nomenclature you are using. Check the
"Perl scripts for structure calculations" page
for more information about the scripts, what
they do, and how to modify them.)
- A consistent set of rules by which you are filtering the
possible assignments. These rules will become increasingly stringent as
the calculations progress, allowing more NOEs to be assigned.
Potential starting points are:
- For disambig: a cutoff distance of 9 or 10 angstroms. A
peak is assigned to a given possibility if the average distance
from the family minus the
RMSD of distances is within the realm of detection (5-6 angstroms,
depending on your data), the average distance plus the RMSD is
less than the cutoff, and no other possibility has an
average distance (minus the RMSD or twice the RMSD, depending on how
stringent you want to be) of less than the cutoff.
- For doambig: the distance filtering is done by a program
called ambidis, which is called from dofilter. It
removes possible assignments for which the average distance
is greater than the cutoff and fewer than a defined percentage
of the structures in the family have distances less than the
cutoff. Reasonable starting points for the cutoff and the
percentage are 9-10 angstroms and 20%.
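The two starting rules above can be written out as a minimal Python sketch. This is not the code in disambig or ambidis. The function names, the default thresholds (taken from the ranges suggested above), and the use of the population standard deviation as the "RMSD of distances" are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def disambig_pick(candidates, detect=6.0, cutoff=10.0, n_rmsd=1):
    """Return the one assignment that passes the manual rule, or None.

    candidates maps a (hypothetical) assignment label to the list of
    distances in angstroms for that proton pair, one per family member.
    """
    # Assumption: "RMSD of distances" is the spread about the family average.
    stats = {name: (mean(d), pstdev(d)) for name, d in candidates.items()}
    for name, (avg, rmsd) in stats.items():
        detectable = avg - rmsd <= detect  # within the realm of detection
        inside = avg + rmsd < cutoff       # average plus RMSD under the cutoff
        # Any other possibility that could still fall under the cutoff
        rivals = [other for other, (o_avg, o_rmsd) in stats.items()
                  if other != name and o_avg - n_rmsd * o_rmsd < cutoff]
        if detectable and inside and not rivals:
            return name
    return None


def ambidis_keep(distances, cutoff=10.0, min_fraction=0.20):
    """Return False only when the possibility would be removed.

    Removal needs BOTH conditions: the average distance is greater than the
    cutoff AND fewer than min_fraction of the structures are under the cutoff.
    """
    avg = mean(distances)
    fraction_close = sum(d < cutoff for d in distances) / len(distances)
    return not (avg > cutoff and fraction_close < min_fraction)


# Hypothetical distances for two possible assignments of one peak
peak = {"possibility A": [4.0, 4.5, 5.0], "possibility B": [14.0, 15.0, 16.0]}
print(disambig_pick(peak))
print(ambidis_keep([12.0] * 9 + [8.0]))
```

Setting n_rmsd=2 gives the more stringent variant of the manual rule, in which a rival possibility is counted if its average minus twice the RMSD is under the cutoff.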
Distance filtering using GENXPK followed by doambig/disambig:
This method utilizes GENXPK to do a first step of chemical shift filtering, followed
by doambig/disambig to do the second step of distance filtering. Note that GENXPK
only analyzes the peaks which are unassigned in Felix, and ignores all fully assigned
peaks in Felix.
If you wish to filter against all of your peaks in a particular NOESY spectrum, you
can "unassign all peaks" in the Assign module of Felix, DON'T save it to your dba,
and export the peaks to a text file (genxpk.txt -- see general instructions below).
In this way the disambig/doambig output will be comprehensive for all of your picked
peaks.
Start with GENXPK. Run GENXPK in your /home/yourname/felix97/text/noesyname/
directory. Select one NOESY spectrum to filter against.
- Make a .assignments file from Felix97:
- In Felix97, copy the spectrum-specific chemical shifts for the spectrum
you wish to filter against to the generic chemical shifts (in the Assign module).
- Export the spinsystems table to a text file called spinsys.txt
- Run pat_to_4col. This script converts
the file format to 4 column GENXPK format.
SYNTAX: pat_to_4col spinsys.txt [expt # of this NOESY in the Felix97
project experiment list] > spinsys.4col
- Check the spinsys.4col file. Delete all lines which contain zero
information.
- First time only: Edit
fix_4col to include any non-IUPAC atom or residue
names you may have used in your Felix assignments.
- Run fix_4col. This script converts your
Felix97 nomenclature to GENXPK nomenclature. This script has not yet been tested with
a stereospecific assignments file.
SYNTAX: fix_4col spinsys.4col > .assignments
- Make a refparm.noesy file manually or in Felix97
using getref.mac, a Felix macro for extracting reference
parameters for 3D matrices.
- Make the other two input files you will need for GENXPK: genxpk.vol and
genxpk.txt.
- In FELIX, measure all volumes for the NOESY you have selected.
- Export the volumes to a text file called /felix/text/noesyname/noesyname.vol
- Export the peaks to a text file called /felix/text/noesyname/noesyname.xpk
- Manually delete the first line of each of these two text files.
- Rename these two files: noesyname.vol -> genxpk.vol, and
noesyname.xpk -> genxpk.txt
- Run GENXPK. You need to install it on your machine.
- SYNTAX: Type the following commands.
- genxpk refparm.cnoe (or whatever your refparm file is called)
(This command starts up GENXPK.)
- vol 1
- rxpk (You'll get lots of output to the screen. Don't worry.)
- sap (This command shows you the current settings for assignment
parameters. Change them if they are incorrect. See the GENXPK manual
for commands.)
- sap write
- asg (You'll get lots of output to the screen. Don't worry.)
- quit
- OUTPUT file: ASG_RESULTS
Repeat this procedure if you wish to do distance filtering against a different NOESY spectrum.
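The manual file preparation in the steps above (delete the first line of the exported .vol and .xpk files, then rename them) is easy to get wrong when repeated for several spectra. A minimal Python sketch of that one step follows. The function name is hypothetical, and it assumes the exported files are plain text whose first line is the only line to drop.

```python
from pathlib import Path


def prepare_genxpk_input(source, target):
    """Write source to target without its first line.

    Returns the number of lines written. The source file is left untouched,
    so the original Felix export is still available if something goes wrong.
    """
    lines = Path(source).read_text().splitlines(keepends=True)
    Path(target).write_text("".join(lines[1:]))
    return max(len(lines) - 1, 0)
```

For a spectrum exported as noesyname.vol and noesyname.xpk, this would be called as prepare_genxpk_input("noesyname.vol", "genxpk.vol") and prepare_genxpk_input("noesyname.xpk", "genxpk.txt").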
Now run disambig or doambig.
- disambig concatenates the ambiguity output
from either GENXPK or Felix with the distance information for a family, as produced by
distance (part of Garry Gippert's
GAP package).
The atom names in the two files
must match! Therefore, another version of this script is
available, for cases where there is GENXPK format output with Felix-type atom
nomenclature. This modified version also has the newer nomenclature for
pseudoatoms. For instance, "QPA" is replaced by "QA", etc. All modifications are
restricted to the ConvertGenxpkName subroutine. Users with Felix format ambiguity
files may need to modify the ConvertFelixName subroutine. Another modified version of the
script is required for GENXPK output from 3D data sets, due to the difference in column
numbers in the ASG_RESULTS file. This script is called
disambig3d, and was made using the modified version of
disambig (disambig.mn). It is run exactly as disambig is.
To use a disambig script:
- Generate a list of unassigned NOEs, with the possible assignments, either
using GENXPK or Felix. The comments in the script give examples of the correct format
for these input files.
- Generate a list of inter-atom distances for your current family of
structures, using the distance program in Garry Gippert's GAP package.
It is important that all the pseudoatoms are included in this distance file.
Here is how I got that to work: 1) I used a modified
pseudoatom map file, in which I had removed the pseudoatoms I did not want
included. 2) I ran distance with the following command line:
distance -fam family.fam -sub "^[HCMNQ]" -pms "^[HMNQ]" -pam pseudomap.mn -list 1 -cut 25 > output
Refer to the help and documentation available for the distance program for more
information about the command line options.
- Run the script. The comments of the script give the command line:
disambig -d [distance file] -f [felix|genxpk] -p [peak file]
written by: Randal R. Ketchem; modified by
Melanie Nelson
- aro_filt, which filters disambig output and only prints
lines in which the potential assignment in D1 is an aromatic proton. It is for use with
D2O NOESY data. It is run as follows:
aro_filt inputfile [>! output file]
Where inputfile is the output from disambig.
written by: Melanie Nelson
- doambig performs a function similar to that of disambig. The output
can be sent to ambigrab to simplify the process of manually finding new assignments. doambig screens
the unassigned peaks from one NOESY spectrum, based on the spectrum-specific assignments from that
NOESY spectrum. doambig is a wrapper script which runs and requires the following scripts:
- ambig2ncol, which exists in three versions: a 2D version, a 3D version, and a
3D version with chemical shift filtering.
Chemical shift filtering removes peaks from the file which arise within a
user-defined chemical shift range, and is needed in the case where there are
many unassigned resonances in this range, preventing unambiguous assignment of
such peaks.
Note: you need to edit the doambig script to correctly denote which of these
three versions of ambig2ncol you wish to run. To test this script before running
doambig, use the following syntax:
ambig2ncol(.3d) -f [felix|genxpk] -p [peak file] -s [stereo file] > outputfile
- noevio
- all of the scripts and files in ~username/bin/ambi/ (check with anyone who has
done structure calculations)
- dofilter: This script can be commented out of the
doambig script when doing manual distance filtering (early in your structure
calculations). dofilter performs automated distance filtering. It uses ambipick and
ambidis, which contain the filters that you will need to edit to do automated
distance filtering.
INPUT files:
- noesyname_asgresults (called the "peak file", this is the GENXPK output file
ASG_RESULTS renamed to noesyname_asgresults)
- noesyname.ssa (stereospecific assignments file, can be empty)
- family.all or family.fam (lists .pdb files)
OUTPUT:
- 7col.noesyname_asgresults
- 7col.noesyname_asgresults.noevio (this file can be fed into ambigrab)
- 7col.noesyname_asgresults.noevio.filter
Edit doambig before running:
- to point to the correct noesyname_asgresults file
- to point to either family.all or family.fam file
- to use the correct version of ambig2ncol (see the three versions above)
Run doambig in your amber/rst/doambig/ directory.
SYNTAX: doambig YYMMDD [monomer|dimer]
- ambigrab sorts the output from doambig and places asterisks next
to the most probable assignments.
(OPTIONAL: Before running ambigrab, you can run
ambigroup. ambigroup outputs a list in which each
line is
[ambiguity, or # of assignment possibilities]:
[number of peaks which display this ambiguity].
SYNTAX: ambigroup 7col.noesyname_asgresults.noevio > outputfile)
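The summary that ambigroup is described as producing can be sketched in a few lines of Python. This is an illustration of the counting only, not the ambigroup script. It assumes the possible assignments have already been parsed into a mapping from a peak identifier to its list of possibilities, and the function names and the exact output spacing are hypothetical.

```python
from collections import Counter


def group_by_ambiguity(candidates_per_peak):
    """Map ambiguity (number of possibilities) to the number of peaks showing it."""
    counts = Counter(len(c) for c in candidates_per_peak.values())
    return dict(sorted(counts.items()))


def format_groups(groups):
    """One line per group: 'ambiguity: number of peaks'."""
    return [f"{ambiguity}: {n_peaks}" for ambiguity, n_peaks in groups.items()]


# Hypothetical peaks with their possible assignments
peaks = {
    "peak 1": ["a", "b"],
    "peak 2": ["a", "b", "c", "d"],
    "peak 3": ["c", "d"],
}
for line in format_groups(group_by_ambiguity(peaks)):
    print(line)
```

A table like this is useful for deciding how many peaks to look at by hand: peaks with only two or three possibilities are the natural place to start.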
To run ambigrab:
INPUT files:
- 7col.noesyname_asgresults.noevio
OUTPUT:
- filename designated in execute command
Run ambigrab in your amber/rst/doambig/YYMMDD/ directory, where the doambig output is.
SYNTAX: ambigrab -b ["bins"] -n [number of peaks] -p [peak file] > outputfile
(For example: ambigrab -b "3 6 10" -n 8 -p 7col.cnoe_asgresults.noevio > 7col.cnoe_asgresults.noevio.8
In this example, ambigrab places *** next to assignments with an average distance less than
3 angstroms, ** next to those which are 3 to 6 angstroms, and * next to those which are
6 to 10 angstroms.)
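The binning in this example can be illustrated with a short Python sketch. It is not taken from ambigrab. The function name is hypothetical, and the handling of a distance that falls exactly on a bin edge (here it goes to the next, looser bin) is an assumption, since the description above does not say.

```python
def star_label(avg_distance, bins=(3.0, 6.0, 10.0)):
    """Return the asterisks for an average distance, given ascending bin edges.

    The tightest bin gets the most asterisks; a distance at or beyond the
    last edge gets none.
    """
    for i, upper in enumerate(bins):
        if avg_distance < upper:
            return "*" * (len(bins) - i)
    return ""


for d in (2.5, 4.0, 8.0, 12.0):
    print(d, star_label(d))
```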
Note: To print out ambigrab output in a useful hardcopy format, use a2ps first:
SYNTAX: a2ps -q -nL -1 -F4.5 -H 7col.cnoe_asgresults.noevio.8 > 7col.cnoe_asgresults.noevio.8.ps
Then print the resulting postscript file using lp or lpr as you normally would.
Distance filtering using xpkasgn.sgi
xpkasgn.sgi is a compiled program (written in FORTRAN) which originated in the Wright lab. It
performs chemical shift filtering followed by distance filtering on Felix output. The main
difference between this method and the GENXPK/doambig or disambig method is that all peaks in a
spectrum are analyzed, both those which were already assigned in Felix and those which are
unassigned.
Note that the .xpk and .assgn files must have nomenclature which matches the .pdb output from AMBER.
You can use a script like subs to make the nomenclature changes, but you will
need to edit subs to match your current nomenclature!
- INPUT files:
- crcn.xpk
- crcn.ref (contains chemical shift referencing info)
- crcn.pat (contains tolerances for chemical shift filter)
- crcn.assgn (contains assignments in 4 column format)
- crcn.pdb (contains a list of .pdb files for distance filter)
- crcn1.pdb, crcn2.pdb, crcn3.pdb, . . . . . crcnx.pdb (AMBER output family)
- x.com (OPTIONAL)
- OUTPUT files:
- assign.lis
- close.dis
- xpk.new
- SYNTAX: Type x.com (automated) or xpkasgn.sgi (for interactive input)
Ambiguous assignments
You can use the ambigrab output to identify ambiguous restraints. The simplest method for handling
these restraints is to manually enter them into the fix_list file which is read by diana_filt while
makerst.new is run to create DIANA input restraints.
Look through the ambigrab output to find two or more possible NOEs which contribute to one NOE
cross peak. For each proton pair, check that the average distance minus the rms is less than
7 angstroms. If two or more proton pairs pass both the chemical shift filter and this distance
filter, assign them manually to the largest bin (5 or 6 angstroms).
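The check described above can be sketched in Python as follows. This only illustrates the decision. It does not write fix_list entries, whose format is not described here, and the function name, the input mapping and the default upper bound of 6 angstroms are assumptions.

```python
def ambiguous_restraint(pairs, limit=7.0, upper_bound=6.0):
    """Decide whether a cross peak should be entered as an ambiguous restraint.

    pairs maps a (hypothetical) proton-pair label to (average distance, rms)
    in angstroms, for pairs that already passed the chemical shift filter.
    Returns (passing pairs, upper bound) when two or more pairs pass the
    distance check, otherwise None.
    """
    passing = sorted(name for name, (avg, rms) in pairs.items()
                     if avg - rms < limit)
    if len(passing) >= 2:
        return passing, upper_bound
    return None


# Hypothetical values read off the ambigrab output for one cross peak
peak = {"pair A": (6.5, 0.8), "pair B": (7.4, 1.1), "pair C": (11.0, 0.9)}
print(ambiguous_restraint(peak))
```

When only one pair passes, the function returns None, because the peak is then a candidate for an ordinary unambiguous assignment rather than an ambiguous restraint.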
last updated 8/21/99 by Patty Fagan Jones (fagan@scripps.edu)