Instructions for Running Structure Calculations

Written by: Melanie Nelson and Patty Fagan Jones

This document contains some basic instructions for running NMR structure calculations using DIANA and AMBER. It was written based on advice from Lena Maler.


Generating the First Restraint Lists

You will need approximately 200 distance constraints to make a first attempt at a structure calculation. There are two important types of NOEs to identify at this stage: medium range NOEs (which will help define the secondary structure) and long range NOEs (which will "fold up" the protein). However, in the process of calibrating the bin sizes for the distance constraints and identifying the medium range and long range NOEs, you will also identify many sequential NOEs, and these should also be included in the first restraint list.
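The three classes of NOE mentioned above can be told apart by the sequence separation of the two residues involved. Here is a minimal Python sketch of that bookkeeping. The cutoffs are an assumption based on the commonly used convention (sequential = 1 residue apart, medium range = 2 to 4 apart, long range = 5 or more apart); the original procedure does not state them. The function names and the example residue pairs are hypothetical.

```python
from collections import Counter


def classify_noe(res_i, res_j):
    """Classify an NOE by the sequence separation of its two residues.

    The cutoffs are an assumed convention, not taken from this document:
    0 = intraresidue, 1 = sequential, 2-4 = medium range, 5+ = long range.
    """
    separation = abs(res_i - res_j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    if separation <= 4:
        return "medium"
    return "long"


def summarize(pairs):
    """Count how many NOEs of each class are in a list of residue pairs."""
    return Counter(classify_noe(i, j) for i, j in pairs)


# Made-up residue pairs, for illustration only.
example_pairs = [(5, 5), (5, 6), (5, 8), (5, 40), (12, 16), (20, 21)]
counts = summarize(example_pairs)
for noe_class in ("intraresidue", "sequential", "medium", "long"):
    print(noe_class, counts[noe_class])
```

A tally like this makes it easy to see whether the first restraint list has enough medium range and long range NOEs, rather than being dominated by sequential ones.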

Before you can use any NOEs from a particular NOESY in your restraint list, you must calibrate the bin sizes. I began by defining three bins for my distance restraints, with upper bounds of 3.5, 4.5, and 6.0 angstroms. Later in the structure calculation process, it might be desirable to define four bins and to fine-tune the bounds. Here is one procedure for calibrating the bin sizes:

I used the intra-residue distances between HD and HE in phenylalanines to calibrate the tight bin for the D2O NOESY. The measured distance between HD and HE is 2.5 angstroms. It was necessary to divide the measured volume by two for the calibration, since there are two HDs and two HEs on a phenylalanine residue, and both would be contributing to the NOE. It is also necessary to be conservative in assigning bins from these distances, since spin diffusion will certainly contribute to the volumes of these intra-ring NOEs.
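The arithmetic behind this calibration can be sketched in a few lines of Python. This is only an illustration, and it assumes the usual isolated spin pair approximation, in which the NOE volume falls off as the inverse sixth power of the distance (V = k / r^6). That relation is not stated in the original procedure. The function names and the peak volumes are hypothetical; only the 2.5 angstrom reference distance, the division by two, and the three upper bounds come from the text above.

```python
def calibration_constant(ref_volume, ref_distance, n_equivalent=1):
    """Return k in V = k / r**6 from a reference peak of known distance.

    n_equivalent divides the measured volume when several equivalent
    proton pairs contribute to one peak (2 for the Phe HD/HE peak).
    """
    return (ref_volume / n_equivalent) * ref_distance ** 6


def estimate_distance(volume, k):
    """Estimate a distance (angstroms) from a peak volume, assuming V = k / r**6."""
    return (k / volume) ** (1.0 / 6.0)


def assign_bin(distance, upper_bounds=(3.5, 4.5, 6.0)):
    """Return the tightest upper bound that still contains the distance.

    Peaks whose estimated distance exceeds every bound are placed in the
    loosest bin; that choice is an assumption made for this sketch.
    """
    for bound in upper_bounds:
        if distance <= bound:
            return bound
    return upper_bounds[-1]


# Made-up volume for the Phe HD/HE reference peak, divided by two as described.
k = calibration_constant(ref_volume=2000.0, ref_distance=2.5, n_equivalent=2)

# Made-up volumes for three other peaks.
for volume in (1000.0, 50.0, 5.0):
    distance = estimate_distance(volume, k)
    print(volume, round(distance, 2), assign_bin(distance))
```

Because spin diffusion inflates the reference volume, distances estimated this way are only a rough guide. When a peak falls close to a bin boundary, the conservative choice is to put it in the looser bin.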

Once you have the bins calibrated, you are ready to begin building up a restraint list. If you have a good knowledge of the secondary structure of your protein, either from chemical shift indexing, experiments to measure torsion angles (such as the HNHA experiment), or other considerations, a reasonable next step is to systematically search for the NOEs that define the secondary structure, such as the i to i+3 (HA to HN) NOEs for the helices. See chapter 7 of NMR of Proteins and Nucleic Acids, by Wüthrich, for a useful table of short inter-proton distances in standard secondary structures.


Setting Up the Structure Calculations Directories

It doesn't really matter how you set up your structure calculation directories. However, the instructions in this document will assume the following setup:

Make the following directories under one main directory (called struct_calc here):

You will also need a directory with all of the MD scripts, etc. The various scripts we currently use will assume that this directory is called ~username/bin/md. You will have to edit the scripts to make sure they have the correct location of this directory. Here is a list of the contents of the MD directory.


Running DIANA (Distance Geometry)

You will need the following files in order to run DIANA:

Procedure for creating a .upl DIANA input file from Felix97:

You will also need to have the diana executable and libraries accessible. I installed them in my home directory. If you are copying the diana directory from someone else (and not reinstalling it from scratch), you will need to rebuild diana so that it uses the libraries in the new location. If you do not do this, diana will continue to use the libraries in the old directory (the place from which you copied the directory). As long as the old directory stays in place, this will not cause problems. However, it is best to rebuild the program. To rebuild diana in the new directory:

Run diana either via the batch queue on the SGI clusters or on your local machine. If you are at Scripps, see the Research Computing page on how to use NQE for batch queue instructions. The command is simply the command file name (i.e. basename.com).

The most useful output file from DIANA is the .ovw file. This is an overview of the restraint violations in the structures. There is one for each set of structures DIANA generated. Look at the last one (basename_1d.ovw).


Converting from DIANA to AMBER

After you have finished with distance geometry, you are ready to do simulated annealing with AMBER. First, you must convert both the coordinate files and the restraint files to AMBER format.

To convert the coordinate files:

To convert the restraint files:


Running AMBER

To run amber, make a subdirectory named by date in your struct_calc/amber directory. Put the following files in this directory:

If you are at Scripps, you have two choices:


Analyzing the AMBER Output

Once all of your anneal jobs have finished, you are ready to analyze the output and see how your structures look. There are two common types of problems that will prevent a particular anneal job from finishing correctly:

  1. The job crashes during minimization. In this case, the job will quit very early in the run (after only a few seconds or minutes). You will have no output, because the rMD files will not have been created, and the minimization files will have been deleted. If you would like to look at the minimization output files, edit the mkanneal.sgi script so that they are not deleted. If the job has crashed in minimization, you will find the following error in these output files:
    RESTARTED DUE TO LINMIN FAILURE
    I saw more structures crashing in minimization in two instances:
    See the LINMIN failures portion of the AMBER FAQ for more information.
  2. The job "blows up" during the restrained molecular dynamics (rMD). In this case, you will have all of your output files, but there will be no meaningful information in the output xyz file (here is an example of the xyz file resulting from this problem), and there will be some odd lines in the output file (here is an example output file resulting from this problem). This problem is most common early in the calculations, when you don't have many restraints.
As long as you didn't lose too many structures to these two types of problems, you can continue with the analysis. I never lost more than 10 structures out of 50.
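
If you have kept the minimization output files (they are deleted unless you edit mkanneal.sgi, as noted above), a short script can tell you which jobs crashed in minimization. Here is a minimal Python sketch that searches a directory for the LINMIN error message quoted above. The function name and the "*.out" file pattern are hypothetical; substitute whatever names your scripts actually give the minimization output files.

```python
from pathlib import Path

# The error message quoted in the text above.
LINMIN_MESSAGE = "RESTARTED DUE TO LINMIN FAILURE"


def find_linmin_failures(directory, pattern="*.out"):
    """Return the names of files in directory that contain the LINMIN error.

    The default pattern "*.out" is a placeholder, not the real naming
    scheme used by the anneal scripts.
    """
    failed = []
    for path in sorted(Path(directory).glob(pattern)):
        # errors="replace" keeps odd bytes in a log from stopping the scan.
        text = path.read_text(errors="replace")
        if LINMIN_MESSAGE in text:
            failed.append(path.name)
    return failed
```

For example, calling find_linmin_failures(".") in the directory holding the output files returns a list of the files that report the failure, which you can compare against the total number of anneal jobs you submitted.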

Randy Ketchem wrote a handy script for analyzing your structures. It is called doptrail, for "do paper trail". It can be run for either all of the structures in your ensemble, or for only a family of the best structures. Here are the steps for running it:


Distance Filtering



Stereospecific Assignments



How to choose a family of structures


Criteria for a final family of structures

Here are some rules of thumb for checking whether you are finished with your structure calculation:
Be sure to run twice as many structures through the calculation as you expect to have in your final family.
Visually check your family in InsightII, and throw away structures if you can justify it upon the basis of restraints and/or chemical reasonableness.

In the doptrail output which displays all of the analysis of your AMBER structures, check:


last updated 12/31/02 by Kevin Weiss