Perl Scripts for Use in NMR Structure Calculations
Written by: Melanie Nelson and Patty Fagan Jones, spring 1999
This document lists Perl scripts that are helpful when calculating
NMR solution structures. The list may not cover every Perl script
that members of the lab use during structure calculations. Check
Lena's Perl Scripts and Mike's Perl Scripts for more.
One method for running structure calculations involves using GENXPK,
a program written by Garry Gippert. GENXPK can be used to assist in
the initial assignment process, particularly if you are working on a
mutant of a protein for which there are wildtype assignments. GENXPK
can also be used to help assign NOEs. See the GENXPK manual
(distributed with the software) or our wisdom page on
using GENXPK to assign crosspeaks for more information.
- make_pat, which converts the GENXPK four-column .assignments file to
the Felix97 spin systems format. This allows you to import assignments
made previously in GENXPK into Felix, for use in a new Felix project.
written by: Melanie Nelson
- pat_to_4col, which converts the Felix97 spin systems file to a GENXPK
four-column .assignments file. This allows you to export
experiment-specific frequency assignments from a Felix project to a
.assignments file for use with GENXPK (for instance, to assign NOEs).
written by: Melanie Nelson
- fix_4col, which converts Felix97 nomenclature to GENXPK nomenclature.
This script has not yet been tested with a stereospecific assignments
file. You need to edit the script to make sure that all of your residue
and atom names are converted.
SYNTAX: fix_4col basename.4col > .assignments
written by: Patty Fagan Jones
- getref.mac, a Felix macro for extracting reference parameters for 3D
matrices. It generates a refparm.noesy file, which is used as input for
GENXPK.
- disambig, which combines the ambiguity output from either GENXPK or
Felix with the distance information for a family of structures, as
produced by the distance program (part of Garry Gippert's GAP package).
The atom names in the two files must match! For this reason, a modified
version of the script (disambig.mn) is available for cases where the
output is in GENXPK format but uses Felix-type atom nomenclature. This
modified version also uses the newer nomenclature for pseudoatoms. For
instance, "QPA" is replaced by "QA", etc. All modifications are
restricted to the ConvertGenxpkName subroutine. Users with Felix format
ambiguity files may need to modify the ConvertFelixName subroutine.
A further modified version is required for GENXPK output from 3D data
sets, because the column numbers in the ASG_RESULTS file differ. This
script is called disambig3d, and was made from the modified version of
disambig (disambig.mn). It is run exactly as disambig is.
To use a disambig script:
- Generate a list of unassigned NOEs, with the possible assignments, either
using GENXPK or Felix. The comments in the script give examples of the correct format
for these input files.
- Generate a list of inter-atom distances for your current family of
structures, using the distance program in Garry Gippert's GAP package.
It is important that all the pseudoatoms are included in this distance file.
Here is how I got that to work: 1) I used a modified
pseudoatom map file, in which I had removed the pseudoatoms I did not want
included. 2) I ran distance with the following command line:
distance -fam family.fam -sub "^[HCMNQ]" -pms "^[HMNQ]" -pam pseudomap.mn -list 1 -cut 25 > output
Refer to the help and documentation available for the distance program for more
information about the command line options.
- Run the script. The comments of the script give the command line:
disambig -d [distance file] -f [felix|genxpk] -p [peak file]
written by: Randal R. Ketchem; modified by
Melanie Nelson
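The name-matching requirement above can be illustrated with a short sketch. It is written in Python rather than Perl and is not the code of the ConvertGenxpkName subroutine. Only the "QPA" to "QA" rename comes from the description above; the second table entry and the function names are hypothetical.

```python
# Rename table for pseudoatoms. Only "QPA" -> "QA" is taken from the text;
# "QPB" -> "QB" is an assumed entry added by analogy for illustration.
PSEUDOATOM_RENAMES = {
    "QPA": "QA",
    "QPB": "QB",
}


def convert_atom_name(name, renames=PSEUDOATOM_RENAMES):
    """Return the newer pseudoatom name, or the name unchanged if not listed."""
    return renames.get(name, name)


def unmatched_names(ambiguity_names, distance_names, renames=PSEUDOATOM_RENAMES):
    """List converted ambiguity-file atom names that are absent from the distance file."""
    known = set(distance_names)
    missing = []
    for name in ambiguity_names:
        converted = convert_atom_name(name, renames)
        if converted not in known and converted not in missing:
            missing.append(converted)
    return missing
```

A check like unmatched_names, run before disambig, would show which atom names still need an entry in the conversion subroutine.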
- aro_filt, which filters disambig output and prints only the lines in
which the potential assignment in D1 is an aromatic proton. It is for
use with D2O NOESY data. It is run as follows:
aro_filt inputfile [>! output file]
where inputfile is the output from disambig.
written by: Melanie Nelson
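The kind of filtering aro_filt performs can be sketched as follows. This is a Python illustration, not the Perl script. The disambig output format is not described here, so the sketch assumes whitespace-separated lines and takes the column positions of the D1 residue type and atom name as parameters. The table of aromatic protons is a partial, illustrative one and must be extended to match your own nomenclature.

```python
# Illustrative, incomplete table of aromatic ring protons (assumption):
# extend it with the residues and atom names used in your own files.
AROMATIC_PROTONS = {
    "PHE": {"HD1", "HD2", "HE1", "HE2", "HZ"},
    "TYR": {"HD1", "HD2", "HE1", "HE2"},
}


def is_aromatic(residue, atom, table=AROMATIC_PROTONS):
    """True if the atom is listed as an aromatic proton of the residue type."""
    return atom.upper() in table.get(residue.upper(), set())


def aro_filter(lines, res_col, atom_col, table=AROMATIC_PROTONS):
    """Keep only lines whose D1 assignment (at the given columns) is aromatic."""
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) <= max(res_col, atom_col):
            continue  # skip lines too short to hold both columns
        if is_aromatic(fields[res_col], fields[atom_col], table):
            kept.append(line)
    return kept
```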
- doambig performs a function similar to that of disambig. Its output
can be sent to ambigrab to simplify the process of manually finding new
assignments. doambig screens the unassigned peaks from one NOESY
spectrum against the spectrum-specific assignments for that same
spectrum. doambig is a wrapper script that runs (and therefore requires)
the following scripts:
- ambig2ncol, which is available in three versions: a 2D version, a 3D
version, and a 3D version with chemical shift filtering.
Chemical shift filtering removes peaks from the file which arise within a
user-defined chemical shift range, and is needed in the case where there are
many unassigned resonances in this range, preventing unambiguous assignment of
such peaks.
Note: you need to edit the doambig script to select which of these
three versions of ambig2ncol you wish to run. To test ambig2ncol before
running doambig, use the following syntax:
ambig2ncol(.3d) -f [felix|genxpk] -p [peak file] -s [stereo file] > outputfile
- noevio
- all of the scripts and files in ~username/bin/ambi/ (check with anyone who has
done structure calculations)
- dofilter: This script can be commented out of the
doambig script when doing manual distance filtering (early in your structure
calculations). dofilter performs automated distance filtering. It uses ambipick and
ambidis, which contain the filters that you will need to edit to do automated
distance filtering.
INPUT files:
- noesyname_asgresults (called the "peak file", this is the GENXPK output file
ASG_RESULTS renamed to noesyname_asgresults)
- noesyname.ssa (stereospecific assignments file, can be empty)
- family.all or family.fam (lists .pdb files)
OUTPUT:
- 7col.noesyname_asgresults
- 7col.noesyname_asgresults.noevio (this file can be fed into ambigrab)
- 7col.noesyname_asgresults.noevio.filter
Edit doambig before running:
- to point to the correct noesyname_asgresults file
- to point to either family.all or family.fam file
- to use the correct version of ambig2ncol (see the three versions above)
Run doambig in your amber/rst/doambig/ directory.
SYNTAX: doambig YYMMDD [monomer|dimer]
- ambigrab sorts the output from doambig and places asterisks next
to the most probable assignments.
INPUT files:
- 7col.noesyname_asgresults.noevio
OUTPUT:
- the output file named on the command line
Run ambigrab in your amber/rst/doambig/YYMMDD/ directory, where the doambig output is.
SYNTAX: ambigrab -b ["bins"] -n [number of peaks] -p [peak file] > outputfile
For example:
ambigrab -b "3 6 10" -n 8 -p 7col.cnoe_asgresults.noevio > 7col.cnoe_asgresults.noevio.8
In this example, ambigrab places *** next to assignments with an average
distance of less than 3 angstroms, ** next to those between 3 and 6
angstroms, and * next to those between 6 and 10 angstroms.
- OPTIONAL: Before running ambigrab, you can run ambigroup. ambigroup
outputs a list in which each line has the form
[ambiguity, or # of assignment possibilities]: [number of peaks which display this ambiguity]
SYNTAX: ambigroup 7col.noesyname_asgresults.noevio > outputfile
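The binning used by ambigrab and the tally produced by ambigroup can be sketched in a few lines. This is a Python illustration, not the Perl scripts. The bin edges "3 6 10" come from the example above. How an exact bin edge (for example, a distance of exactly 3 angstroms) is treated is an assumption, as are the function names and the input data structure.

```python
from bisect import bisect_right
from collections import Counter


def star_label(avg_distance, bins=(3.0, 6.0, 10.0)):
    """Return '***' below the first edge, '**' up to the second, '*' up to the third.

    Distances at or beyond the last edge get no stars. A distance equal to an
    edge is placed in the higher bin (an assumption of this sketch).
    """
    edges_passed = bisect_right(bins, avg_distance)
    return "*" * (len(bins) - edges_passed)


def group_by_ambiguity(candidates_per_peak):
    """Tally peaks by ambiguity (number of candidate assignments per peak).

    candidates_per_peak maps a peak identifier to its list of candidate
    assignments (hypothetical structure). Returns 'ambiguity: count' lines.
    """
    counts = Counter(len(c) for c in candidates_per_peak.values())
    return [f"{ambiguity}: {n}" for ambiguity, n in sorted(counts.items())]
```

For instance, star_label(2.5) gives three asterisks and star_label(7.0) gives one, matching the example bins above.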
- getratio, which reads a peaks file and a volume file (.xpk and .vol)
generated from an HNHA spectrum, and calculates the ratio of the
crosspeak (HN:HA) volume or intensity to the diagonal (HN:HN) volume
or intensity. To use this script, you must first have picked and
optimized the boxes for all of the peaks in the spectrum. I think it
should be possible to get GENXPK to create the peak boxes, but I could
not make it work. To use this script on intensities instead of volumes,
you must store the intensities in the volume entity in Felix. There is
an MSI-supplied macro for this: relvolume.mac (in
/usr/msi/970/felix/menus/mac). However, I could not get this to work
on my 3D data set.
All input files should be in Felix97 format.
written by: Melanie Nelson
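The calculation getratio performs is a per-residue division, sketched here in Python rather than Perl. The sketch does not read Felix97 .xpk or .vol files; it assumes the volumes have already been collected into a dictionary keyed by (residue number, "cross" or "diag"), which is a hypothetical structure chosen for illustration.

```python
def hnha_ratios(volumes):
    """Return {residue: crosspeak volume / diagonal volume}.

    volumes maps (residue, kind) to a volume or intensity, where kind is
    "cross" for the HN:HA peak and "diag" for the HN:HN peak. Residues that
    lack either peak, or whose diagonal volume is zero, are skipped. The
    sign of each value is kept as-is.
    """
    ratios = {}
    residues = {residue for residue, _kind in volumes}
    for residue in sorted(residues):
        cross = volumes.get((residue, "cross"))
        diag = volumes.get((residue, "diag"))
        if cross is None or diag is None or diag == 0:
            continue
        ratios[residue] = cross / diag
    return ratios
```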
- maketable, a script which reads a file with chemical shift index
information exported from Felix as text, together with the output file
from getratio, and creates a plain text, tab-separated file with the
CSI and J-coupling data for each residue. This file can be imported
into a word processor or spreadsheet as a basis for tabulating the
information that defines the secondary structural elements in your
structure. This script is a bit rough: it will only work for HA:HN
coupling constants calculated with getratio and for one column of CSI
data.
written by: Melanie Nelson
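The merge that maketable performs can be sketched as below, in Python rather than Perl. The sketch assumes the CSI values and the getratio values have already been parsed into two dictionaries keyed by residue number; the column headings and the function name are hypothetical.

```python
import csv
import io


def make_table(csi, coupling):
    """Return tab-separated text with one row per residue.

    csi and coupling map residue numbers to values. A residue missing from
    one of the inputs gets an empty cell in that column.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["residue", "CSI", "HNHA"])
    for residue in sorted(set(csi) | set(coupling)):
        writer.writerow([residue, csi.get(residue, ""), coupling.get(residue, "")])
    return buffer.getvalue()
```

The returned text can be written to a file and opened directly in a spreadsheet.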
- ambichemshift removes lines from an ambigrab group (or noevio) that
contain any H betas or H gammas within the range of 1.5-2.5 ppm (a range
which you can modify). It is intended for the case where there are many
unassigned resonances in this range, preventing unambiguous NOE
assignments. It is not clear how this script is meant to be run, so the
filter was instead added to ambig2ncol.
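The chemical shift filter described above can be sketched as follows. This is a Python illustration, not the Perl code in ambichemshift or ambig2ncol. It assumes each line has already been parsed into its text plus a list of (atom name, chemical shift) pairs, and that H beta and H gamma atom names begin with "HB" and "HG"; both are assumptions to adjust for your own files.

```python
def in_crowded_range(atom, shift, prefixes=("HB", "HG"), low=1.5, high=2.5):
    """True if the atom is an H beta or H gamma with a shift inside [low, high] ppm."""
    return atom.upper().startswith(prefixes) and low <= shift <= high


def chemshift_filter(records, **options):
    """Drop records that contain any H beta or H gamma inside the crowded range.

    records is a list of (line_text, [(atom, shift), ...]) pairs (hypothetical
    structure). Keyword options are passed on to in_crowded_range.
    """
    return [
        text
        for text, atoms in records
        if not any(in_crowded_range(atom, shift, **options) for atom, shift in atoms)
    ]
```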