Robinson Research Building
School of Medicine
Nashville, Tennessee 37232-0146
Egli Laboratory
Biochemistry and X-ray Crystallography
Tel: 615.343.8070
We are pursuing crystallographic studies in combination with in vitro primer extension assays and analyses in vivo of the interactions between Y-class trans-lesion DNA polymerases and DNA adducts (including carcinogens and oxidation products). Much of the work so far has involved the Dpo4 DNA polymerase from Sulfolobus solfataricus, which we have analyzed together with the 1,N2-etheno-G, 8-oxoG, O6-methyl-G, O6-benzyl-G and several other adducts as well as with the hydrophobic 2,4-difluorotoluyl nucleoside analog. We have expanded our investigations to the human trans-lesion DNA polymerases hPol-eta, hPol-kappa and hPol-iota, for which three-dimensional structural information in complex with adducted DNA is currently lacking. Collaborations with the labs of F. Peter Guengerich, Carmelo Rizzo and Michael P. Stone at Vanderbilt University and R. Steven Lloyd at Oregon Health & Science University.
Cover of J. Biol. Chem. 2005, Vol. 280, August 19 (issue 33): Crystal structures of the archaeal Sulfolobus solfataricus DNA polymerase Dpo4 with the DNA adduct 1,N2-ethenoguanine:
J. Biol. Chem. paper of the week (Jul 2007; Vol. 282, 19831-19843), highlighted in ASBMB Today - "Replicating damaged DNA" (August 2007).
Supported by NIH grants P01 CA160032, P30 ES00267, R01 ES005509 and R01 ES010375
Nucleic Acid Structure:
Conformation, Stability and Activity of DNA, RNA and Nucleic Acid
We are investigating chemically modified nucleic acids as model systems for native DNA and RNA. Nucleic acid analogs can serve as chemical probes in diagnostics or the analysis of protein-nucleic acid interactions and in high-throughput genomics and drug target validation; as potential antigene-, antisense-, or RNAi-based drugs; and as tools for structure determination (e.g., crystallographic phasing), just to name a few. Biophysical and structural investigations of chemically modified DNAs and RNAs, particularly of nucleic acid analogs with more significant alterations to the well-known base-sugar-phosphate framework (e.g., peptide or hexopyranose nucleic acids), can also provide insights into the properties of the natural nucleic acids that are beyond the reach of studies focusing on DNA or RNA alone. We have determined dozens of crystal structures of chemically modified DNAs and RNAs that are of interest in discovery and development of oligonucleotide-based therapeutics. We are also interested in functional and recognition aspects of enzymes that degrade RNA strands paired to native and chemically modified oligonucleotides. Structural investigations of analogs in the context of an etiology of nucleic acid structure constitute a further focal point.
For highlights in Nature and in ACS Chem. Biol. regarding our work on homo-DNA, please see:
For more on DNA's sweet secret please go to the following link at Vanderbilt's Exploration science e-mag Page:
Supported by NIH grants R01 GM055237, R01 GM071461, R44 GM086937, DARPA contract N66001-14-2-4054 and Volkswagen Stiftung (Project "Molecular Life").
Cryo Neutron Crystallography of DNA and RNA
Water in living systems is omnipresent and serves important functions beyond its role as a simple diluent that include transport, stabilization, reactivity and partitioning. Water is also critically involved in governing folding, geometry, stability, dynamics, function and interactions of biomolecules such as proteins and nucleic acids. Thus, water is of fundamental importance in protein folding owing to its role in defining hydrophobic attractions. Water molecules present at protein active sites or ligand binding interfaces can also affect activity and binding affinity. Water plays an even more important role in the stabilization of the 3D-structure of nucleic acids relative to proteins because of the presence of negatively charged phosphate groups in the backbone of the former. Phosphate-phosphate electrostatic repulsion is diminished by the high dielectric constant of water, and the degree of hydration of nucleic acids can control their conformation (e.g. DNA B- to A-form transition). The main stabilizing force in DNA is not base pair H-bonding but base stacking. The hydrophobic cohesion of stacked base pairs requires abundant water and indirectly renders the DNA interior dry so that H-bonds can exert full recognition power. X-ray crystallography has shaped the way we think about water molecules surrounding macromolecules and at protein-ligand and protein-nucleic acid interfaces. However, X-rays are blind to hydrogen atoms in crystals of macromolecules. We hypothesize that the lack of insight into the orientations of water molecules and hydroxyl groups (e.g. RNA 2'-OH) has led to an incomplete understanding of many macromolecular systems. Our long-term goal is to gain a complete picture of the water structure in such systems using joint cryo neutron and X-ray crystallography. Using this approach we have initiated a series of structure determinations of well known DNA and RNA folding motifs. For a recent example, please see this highlight article:
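As a rough illustration of the dielectric screening of phosphate-phosphate repulsion described in the paragraph above, the following sketch compares the Coulomb energy of two like unit charges in vacuum and in a uniform medium with the permittivity of bulk water. The 7 Å spacing (approximately the distance between adjacent phosphates in B-form DNA) and the relative permittivity of 78.4 are assumed round values chosen for illustration, and a uniform bulk dielectric is a crude simplification of the real hydration shell; the function and constant names are hypothetical.

```python
# Coulomb constant e^2 / (4*pi*eps0) expressed in kJ * Angstrom / mol for unit charges.
COULOMB_KJ_ANGSTROM_PER_MOL = 1389.35


def coulomb_energy(q1, q2, distance_angstrom, relative_permittivity=1.0):
    """Coulomb energy (kJ/mol) of two point charges in a uniform dielectric."""
    return (COULOMB_KJ_ANGSTROM_PER_MOL * q1 * q2
            / (relative_permittivity * distance_angstrom))


EPS_WATER = 78.4         # assumed bulk value for water near room temperature
PHOSPHATE_SPACING = 7.0  # assumed spacing of adjacent phosphates, in Angstrom

e_vacuum = coulomb_energy(-1, -1, PHOSPHATE_SPACING)
e_water = coulomb_energy(-1, -1, PHOSPHATE_SPACING, EPS_WATER)

print(f"vacuum: {e_vacuum:.1f} kJ/mol, water: {e_water:.2f} kJ/mol")
```

Under these assumptions the repulsion drops from roughly 200 kJ/mol to a few kJ/mol, that is, by the factor of the relative permittivity.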
Structure and Function of P450 Enzymes in Steroid Hormone Biosynthesis
The cytochrome P450 superfamily is distributed throughout all biological kingdoms. The two general classes of P450 reactions are xenobiotic metabolism (e.g. drugs, carcinogens) and endogenous substrate biosynthesis (e.g. hormones, eicosanoids). High-resolution structures of members of ca. 50 subfamilies among the >17,000 P450 superfamily members are known and all exhibit a common structural fold. However, the precise roles of most individual structural elements are not clearly established. We are addressing this void by investigating two P450s involved in steroid hormone biosynthesis: mammalian P450 21A2, which is known to have >100 amino acid mutations that influence its steroid 21-hydroxylation activity, and human P450 17A1 as well as zebrafish P450s 17A1 and 17A2, the latter two being distinct enzymes with dramatic variations in their steroid 17α-hydroxylation/17,20-lyase activities. By analyzing the structures and kinetics of these enzymes, we expect to be able to establish a more precise view of P450 structure/function than is currently available. We anticipate that these studies will clarify in detail how these two important P450s can function in steroid hormone biosynthesis, and the results can then be applied to understanding the structure/function of other P450s. Mutations in both P450 21A2 and 17A1 are the major causes of a genetic group of diseases known as congenital adrenal hyperplasia, and P450 17A1 is an important drug target for treatment of prostate cancer. The results from these studies will allow a better understanding of both diseases. Collaboration with F. P. Guengerich and M. Waterman.
Function of Circadian Clock Proteins
Circadian clocks are self-sustained biochemical oscillators. Their properties include temperature compensation, a time constant of approximately 24 h, and high precision. Recent research has shown that the KaiABC circadian clock from the cyanobacterium S. elongatus can be reconstituted in vitro from the three proteins KaiC, KaiA and KaiB in the presence of ATP. This renders the KaiABC molecular timer a unique target of biochemical and biophysical studies. We are characterizing this clock system using X-ray crystallography in combination with electron microscopy and mutagenesis studies. We have recently determined the crystal structure of the KaiC auto-kinase and auto-phosphatase and have presented three-dimensional models of binary KaiA-KaiC and KaiB-KaiC complexes that shed light on the roles of the KaiA and KaiB proteins in controlling the KaiC phosphorylation status. Collaborations with the labs of Carl H. Johnson and Phoebe L. Stewart at Vanderbilt University.
A flurry of crystallographic and NMR studies has recently led to the structural characterization of the cyanobacterial KaiA, KaiB and KaiC proteins (reviewed in c.v. ref. 92 and ref. 119 and highlighted by A. Yarnell in C&EN).
See also the discussion of circadian clock protein structures in the PDB 'molecule of the month' series (Jan, 2008):
For more information on circadian clock advances related to the KaiABC family of proteins please go to the following link at Vanderbilt's Exploration science e-mag Page:
Supported by NIH grants R01 GM073845 and GM081646
Microfluidic Integrated Transduction RealNose and Olfactory Receptors
Animal noses have evolved to rapidly detect small airborne and soluble molecules at minute concentrations. The range of odorants detected is chemically diverse and seemingly infinite. The sensitivity of the animal nose is exemplified in the canine, with over 1000 functional olfactory receptor (OR) genes that allow detection of many compounds at the level of parts per trillion (ppt). Despite decades of efforts, artificial noses are still embarrassingly inferior to their natural counterparts. The key to creating sensitive real-time olfactory sensors is to study and utilize the cornerstone of mammalian scent detection: the olfactory receptor. Our laboratory is part of a research team centered at MIT that studies ORs and aims to develop OR-based (RealNose) sensors. The currently available structures of ORs are rough estimations based on computational modeling and comparisons to bovine rhodopsin. No detailed molecular structure for any OR has yet been determined by X-ray diffraction. An accurate 3-dimensional OR model will likely provide insights into the mechanism by which organic compounds (odors) activate the olfactory system. Therefore, in addition to the development of the RealNose sensor, a central focus of the project is the expression and crystallization of OR membrane proteins suitable for X-ray crystal structure determination.
Supported by DARPA contract HR0011-09-C-0012
Protein-Nucleic Acid Interactions
We have embarked on an investigation of the structural basis that underlies the recognition of RNA-DNA hybrids by E. coli RNase HI. The enzyme binds to the hybrid duplex and degrades the RNA strand. Although the X-ray crystal structure of RNase HI from E. coli was reported a decade ago, there is presently no crystal structure of the enzyme-substrate complex. Although we (c.v. refs. 18 and 22) and others have studied the structures of chimeric RNA-DNA molecules by X-ray crystallography, neither the structures of substrates and the enzyme alone nor those of modeled complexes have yielded a satisfactory explanation for the substrate specificity of the enzyme (c.v. ref. 63). RNase H is a key player in replication and transcription (reverse transcriptases also feature an RNase H domain) and is therefore of fundamental biological importance. This project is a logical extension of our previous work on nucleic acid structure and antisense oligonucleotide design. RNase H is believed to play an important role in the suppression of a particular message by an antisense oligonucleotide (another possible mode of action of an antisense oligonucleotide is via a steric block mechanism). Obviously, the availability of the three-dimensional structure of a complex between RNase H and its substrate would be very useful for the design of nucleic acid modifications that allow recruitment of the enzyme to the site of hybridization and subsequent RNA cleavage. In the absence of such a structure, a correlation of the conformations in a duplex environment (DNA, RNA-DNA hybrid) of nucleotide analogs with the susceptibilities to cleavage by RNase H of their hybrids with RNA should yield valuable insights regarding the features of the hybrid duplex that underlie enzyme substrate-recognition and processivity. 
Nucleotide analogs that are being analyzed in this manner in our laboratory include arabino nucleic acid (ANA), 2'-F-ANA (see the structure of the duplex between 2'-F-ANA and RNA depicted above) and a host of 2'-O-modified ribonucleotide analogs.
Structure Assisted Approach to the Discovery of New Therapies for Neurodegenerative Diseases
Death-associated protein kinase (DAPK) is the first described member of a novel family of pro-apoptotic and tumor-suppressive serine/threonine kinases. In collaboration with the laboratory of D. Martin Watterson at Northwestern University, we determined the crystal structure of the catalytic domain of DAPK (c.v. ref. 74; a schematic of the domain structure of DAPK is depicted above). The structures studied include the apo form, the complex with AMPPnP as well as a ternary complex consisting of kinase, AMPPnP and either Mg2+ or Mn2+. A comparison between these structures of DAPK and nucleoside triphosphate complexes of several other kinases revealed several unique features of the DAPK catalytic domain. For example, a highly ordered basic loop in the N-terminal domain may be of importance in enzyme regulation.
In parallel with the structural work, DAPK's preferences for phosphorylation site sequences were determined using a positional scanning peptide substrate library (c.v. ref. 75). An enzyme assay for DAPK was developed and then used to measure activity in adult brain as well as to monitor protein purification based on the chemical and physical properties of the DAPK cDNA open reading frame. The results of the two studies allowed insight into DAPK's substrate preferences and regulation and provide a foundation for proteomic investigations and inhibitor discovery.
Current inhibitor discovery efforts using a structure-assisted approach have led to the identification of a small molecule lead with high affinity and specificity for DAPK that attenuates hypoxia-ischemia induced brain damage in vivo (c.v. ref. 89).
For more information on DAPK please follow this link to Vanderbilt's Exploration science e-mag page:
A first phase of the Human Genome Project has recently been completed and has produced a working map of the entire human genome. Knowledge of its DNA sequence, which is estimated to code for (surprisingly) only ca. 30,000 proteins, is necessary but not sufficient for a complete understanding of human biology, or that of other living systems. The next logical step is to determine the biochemical functions and structures of these proteins. A long-range goal is to determine the structures of all 30,000 proteins. This is a formidable task and is unattainable in the near term.
My interest in the type of research now commonly termed structural genomics lies not in the development of high-throughput crystallization or structure determination. This is best left to companies and government laboratories. However, one of the main goals of the centers currently supported by the National Institutes of Health is the production of as many new structures as possible within the next five years. Thus, the projects will leave little time and money for further work aimed at linking emerging structures to biological function. This task is a suitable one for academic research. We are using state-of-the-art sequence analysis and X-ray crystallographic methods combined with functional assays to determine function from structure for a handful of highly conserved genes of presently unknown function. These include the YrdC protein from E. coli and the Maf protein from B. subtilis for which crystal structures have recently been determined in my laboratory (c.v. refs. 69 and 64, resp.). The above illustration depicts the crystal structure of Maf along with a sequence alignment for the maf-family of genes.
Structure-Based Design of Antisense and Ribozyme Therapeutics / Nucleic Acid Etiology
Chemically modified oligonucleotides are currently being investigated as antisense and antigene reagents with potential therapeutic applications. Interference with biological information transfer can occur at a variety of stages. Thus, targeting either mRNA synthesis (transcription - antigene approach) or protein synthesis (translation - antisense approach) may allow a modulation of gene expression. The great potential of the antisense strategy lies in the high specificity of hybridization between antisense strand and messenger RNA via formation of Watson-Crick base pairs, offering the opportunity of rationally designing nucleic acid drugs. In 1998, the US Food and Drug Administration (FDA) approved the first antisense drug, Vitravene™ (Fomivirsen), a DNA phosphorothioate oligonucleotide against cytomegalovirus-induced retinitis in AIDS patients.
We are pursuing a structure-based approach to define the principles that underlie the thermal stability (c.v. refs. 76, 57, 50, 45) and nuclease resistance of oligonucleotides and how chemical modifications affect these properties. The above illustration shows the detailed interactions of an oligodeoxynucleotide containing 2'-O-(3-aminopropyl)-modified nucleotides at the active site of the 3'-5'-exonuclease from DNA Pol I Klenow fragment based on a crystal structure of the complex (c.v. ref. 61). My laboratory has published more structures of oligonucleotide analogs than any other research group worldwide and we will continue these efforts, focusing on modifications that are of potential therapeutic interest as well as on those studied in the context of nucleic acid etiology. Efforts regarding the latter are currently concentrated on the crystal structure determination of a 2',3'-dideoxyglucopyranose nucleic acid duplex (homo-DNA) and on the origins of the established cross-pairing between TNA (tetrose nucleic acid) and both DNA and RNA (c.v. ref. 82).
In many viruses, including tumor- and retro-viruses, the programmed -1 ribosomal frameshifting of polycistronic mRNA regulates the relative level of structural and enzymatic proteins important for efficient viral assembly. The -1 shift in reading frames causes stop codon readthrough, and results in production of a single fusion protein. For example, in the Rous sarcoma retrovirus, the pol gene that encodes integrase, protease and reverse transcriptase is expressed with the upstream gag gene (encoding virus core proteins) through a gag-pol fusion protein. The mature products are later obtained by processing the poly-protein precursor. The -1 frameshifting is not only found in retroviruses but also in coronaviruses, yeast and plant viruses as well as bacterial systems. Frameshifting levels can range from 1 to over 30% in different systems to produce gene products in a functionally appropriate ratio. However, the mechanism of ribosomal frameshifting is not understood. It is postulated that a complex mRNA structure 6-8 nucleotides downstream from the "slippery sequence", in many cases a pseudoknot, leads to ribosomal pausing and the simultaneous slippage of both aminoacyl and peptidyl tRNAs toward the 5'-direction by one base.
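The effect of a -1 shift on the decoded protein can be sketched with a toy example. The sequence below is invented for illustration (it is not a viral sequence); it embeds the heptamer G GGA AAC as a stand-in slippery site followed by an in-frame UAA stop codon. The function name and the `slip_after` parameter are hypothetical, and the model ignores the downstream pseudoknot, pausing and frameshift efficiency: it only shows how stepping back one nucleotide bypasses the stop codon and yields a longer fusion product.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order ('*' marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, slip_after=None):
    """Translate mrna from position 0 until a stop codon or the end of the sequence.

    If slip_after is given, the ribosome steps back by one nucleotide
    (-1 frameshift) after decoding the codon that starts at that index.
    """
    protein = []
    pos = 0
    while pos + 3 <= len(mrna):
        aa = CODON_TABLE[mrna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
        # Normal step is 3 nt; after the slip the next codon starts 1 nt earlier.
        pos += 2 if pos == slip_after else 3
    return "".join(protein)


# Hypothetical toy message: AUG GCG GGA AAC UAA GCU GGU UGA (0 frame).
TOY_MRNA = "AUGGCGGGAAACUAAGCUGGUUGA"

in_frame = translate(TOY_MRNA)               # terminates at the UAA stop codon
shifted = translate(TOY_MRNA, slip_after=9)  # slips back after decoding AAC

print("0 frame :", in_frame)
print("-1 shift:", shifted)
```

In the 0 frame the toy message ends at the stop codon directly after the slippery site, whereas the shifted ribosome reads on in the -1 frame and produces a longer product that shares the same N-terminal residues.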
We participated in the determination of the 1.6 Å resolution crystal structure of a 28-nucleotide pseudoknot from Beet Western Yellow Virus (BWYV, see illustration above; c.v. ref. 53). In the meantime, we have refined this structure to 1.25 Å and we have determined the structure of a second crystal form to 2.85 Å resolution (c.v. ref. 78). The next phase of this project involves the correlation of the mutation data collected in the laboratory of Dr. Alexander Rich at MIT with the structures of mutated RNA pseudoknots. This will entail the crystallization and structure determination of pseudoknots with sequence alterations that cause drastic changes in the frameshifting activity. In the more distant future it may be feasible to study the interactions of a viral message that contains a pseudoknot with the ribosomal proteins at the entry site in crystals of the E. coli ribosome.
The interactions between double helical DNA and cations, specifically mono- and divalent metal ions, have recently received increased attention. Molecular dynamics simulations, solution NMR and X-ray crystallography have all shed light on the coordination of ions in the major and minor grooves of DNA. Metal ion interactions may play key roles in the control of DNA conformation and topology, but despite progress in locating the ions and determining their precise binding modes, it remains difficult to figure out just how important ions really are (c.v. ref. 79). Most of the crystallographic investigations of DNA-ion coordination, in particular those concerning potential binding of alkali and alkaline earth metal ions to so-called A-tracts, focused on the Dickerson-Drew dodecamer. We would like to extend our previous investigations (c.v. refs. 52, 56, 58) to other sequences containing longer A-tracts. We have demonstrated the usefulness of the single-wavelength anomalous dispersion (SAD) technique for locating alkali metal ions in DNA crystals (see the above illustration, depicting ion coordination to an A-form DNA duplex) and for determining the structures of the latter (c.v. ref. 70).
For proteins and enzymes, selenium has proven to be an effective anomalous scattering center that can be readily introduced into recombinant proteins in the form of selenomethionine. Selenomethionyl proteins account for about 65% of all new protein crystal structures phased by MAD. Selenium in place of sulfur leads to only minimal changes in geometry and hydrophobicity and crystals of Se-labeled proteins exhibit a high degree of isomorphism with their wild type counterparts. By comparison, the impact of MAD for solving new protein structures is currently not matched by a similar success in the determination of nucleic acid crystal structures. Bromine can be selectively introduced into oligonucleotides in the form of 5-bromouracil. However, applications of MAD on bromo-derivatives are often not successful in practice, probably because of base-stacking disruption and other structural perturbations caused by bromo-derivatization. Crystallizability is another issue with bromine derivatives: not all halogen derivatives can be crystallized under native conditions, and the derivative crystals do not always diffract as well as the native ones. In addition, a possible problem associated with halogen derivatives is that these halogenated nucleotides are light sensitive; prolonged exposure to X-ray or UV sources may cause decomposition.
We have recently initiated a program to investigate chemical synthetic routes for covalently incorporating selenium into DNA. Several oxygen centers can potentially be replaced by selenium (i.e. the 2-oxygen in pyrimidines, the ribose 2'-, 3'-, 4'- and 5'-oxygens and the non-bridging phosphate oxygen). We have demonstrated that incorporation of 2'-selenomethyl-U into DNA allows structure determination via MAD (c.v. refs. 77, 80). This approach is also suitable for RNA structure determination. Moreover, we have obtained initial experimental evidence that replacement of one of the non-bridging phosphate oxygens by selenium and separation of the resulting diastereoisomeric phosphoroselenoates may furnish a universal method for phasing X-ray diffraction data of native and chemically modified nucleic acids as well as protein-nucleic acid complexes (by labeling the nucleic acid instead of the protein) (c.v. ref. 83; see the figure above, depicting a MAD-based electron density map based on the PSe phasing approach).
DNA. We were the first to report an atomic-resolution crystal structure for a B-form DNA duplex, the Dickerson-Drew dodecamer (DDD) at 0.95 Å (c.v. refs. 52, 62). The resolutions of recent A- (0.83 Å; c.v. ref. 62) and Z-DNA structures (0.6 Å; c.v. ref. 70) reported by my laboratory also constitute the highest achieved to date for these duplex forms. In addition, we provided the initial unequivocal evidence for alkali metal ion coordination in the minor groove of the DDD A-tract (c.v. ref. 56). High-resolution structures of the DDD crystallized with divalent metal ions also yielded unique insights on how Mg2+ and Ca2+ coordination affect DNA topology, bending and packing (c.v. ref. 58).
RNA. In collaboration with the research group of Dr. Alexander Rich at MIT, we published the first crystal structure of an RNA frameshifting pseudoknot (pk; c.v. ref. 53). We have now extended the resolution of the trigonal pk crystal form to 1.25 Å, the highest resolution for a medium-size RNA with intricate tertiary structure, and have analyzed the coordination of alkali and alkaline earth metal ions to the pk (c.v. ref. 79).
Analogs. We published the first crystal structures of fully chemically modified DNA and RNA duplexes (c.v. refs. 45 and 57, resp.). In addition, a structural assay was developed for analyzing the specific interactions at the enzyme active site that allow nucleic acid analogs to evade nuclease degradation (c.v. ref. 61). Enhanced stability of pairing and resistance to nucleases are considered key features of nucleic acid analogs in antisense and antigene applications. Beyond their use as potential drugs or molecular probes, chemically modified nucleic acids have provided a better understanding of electron transfer in DNA (c.v. ref. 60) and are a prerequisite for research concerning the etiology of nucleic acid structure (c.v. ref. 82).
Enzymes. Recently, we completed structural analyses of the catalytic domain of human death-associated protein kinase (DAPK), an enzyme associated with apoptosis and tumor suppression (c.v. ref. 75). The structure determination is an important first step toward an improved understanding of DAPK’s function and the discovery of small molecule inhibitors for potential therapeutic applications. The publication of the structure along with an accompanying paper reporting on functional aspects of the enzyme (c.v. ref. 74) has generated considerable interest in the pharmaceutical industry and has been covered extensively in the news (see enclosed materials). In collaboration with the research group of Dr. D. Martin Watterson at NWU, we have discovered competitive and non-competitive inhibitors for the kinase domain of DAPK that can now be co-crystallized with DAPK for further analysis.
Genomics. In 1997/98, we participated in a ‘structural genomics’ pilot project at NWU. Several gene products from B. subtilis and E. coli with homologs in all completely sequenced genomes but without apparent sequence similarity to known genes were over-expressed and crystallized. Using the Se-Met approach in combination with the multiwavelength anomalous dispersion (MAD) technique, several structures were determined at high resolution. We were among the first laboratories to report the results of structural genomics projects. In the case of the Maf and YrdC proteins, the structures exhibited new folds and provided hints as to possible functions of these genes (c.v. refs. 64 and 69, resp.). In 1999, this program was combined with the then emerging Midwest Center for Structural Genomics (MCSG). In 2000 this center received NIH funding as one of seven ‘Centers for Structural Genomics’ in the US.