Research Overview
Proteins and structural RNA have evolved under three main competing constraints: foldability, stability, and function. Our ultimate aim is to understand how these constraints on sequence and structure shape protein and RNA folding, function, and evolution. To this end, we combine insights from experimental measurements with multiscale molecular dynamics (MD) simulations on high-performance computing systems and with statistical genomic analyses. This requires multidisciplinary expertise in statistics and computer science on the one hand and in biology, chemistry, and physics on the other.
Protein Folding and Coarse-Grained Simulations
Anfinsen's hypothesis that a protein's native state occupies the global free-energy minimum has spurred many efforts to better understand the details of protein folding [1]. Straightforward all-atom MD simulations using explicit solvent representations and force fields such as AMBER or CHARMM face several challenges, including limited force field accuracy (“lack of transferability”) and high computational costs due to system size, simulation time (MD uses femtosecond timesteps, whereas proteins fold on millisecond-to-second timescales), and environmental complexity. The alternative approach of energy landscape theory describes the energy landscape of a protein as funnel-shaped and smoothed by evolution according to the principle of minimal frustration [2]. Native structure-based models build on this theory and are computationally far more efficient than all-atom MD simulations. In folding simulations, they represent the ideal case of a perfectly funneled energy landscape dominated by native interactions, and they agree well with experiments [3].
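As a rough illustration of what "dominated by native interactions" means, the following minimal sketch builds a contact list from a reference structure and evaluates a 12-10 Lennard-Jones-type contact term whose minimum lies at the native distance. It is not the specific model used in [2] or [3]. The bead coordinates, the 8 Å cutoff, the minimum sequence separation, and the function names are all assumptions chosen for illustration only.

```python
import math
import random

def native_contacts(coords, cutoff=8.0, min_seq_sep=4):
    """List (i, j, r0) for bead pairs closer than `cutoff` in the reference structure.

    `cutoff` and `min_seq_sep` are illustrative assumptions, not published parameters.
    """
    contacts = []
    for i in range(len(coords)):
        for j in range(i + min_seq_sep, len(coords)):
            r0 = math.dist(coords[i], coords[j])
            if r0 < cutoff:
                contacts.append((i, j, r0))
    return contacts

def contact_energy(coords, contacts, epsilon=1.0):
    """Sum of 12-10 contact terms; each term reaches its minimum -epsilon at r = r0."""
    energy = 0.0
    for i, j, r0 in contacts:
        ratio = r0 / math.dist(coords[i], coords[j])
        energy += epsilon * (5.0 * ratio**12 - 6.0 * ratio**10)
    return energy

# Hypothetical "native" structure: an idealized helix-like trace of 12 beads
# (radius 2.3, 100 degrees per bead, rise 1.5), not taken from a real protein.
native = [
    (2.3 * math.cos(math.radians(100 * k)),
     2.3 * math.sin(math.radians(100 * k)),
     1.5 * k)
    for k in range(12)
]
contacts = native_contacts(native)
e_native = contact_energy(native, contacts)

# Randomly displace every bead; the contact list still refers to the native state.
rng = random.Random(0)
perturbed = [tuple(x + rng.gauss(0.0, 0.5) for x in bead) for bead in native]
e_perturbed = contact_energy(perturbed, contacts)

print(f"contacts: {len(contacts)}  E(native) = {e_native:.3f}  E(perturbed) = {e_perturbed:.3f}")
```

In this toy setup the native structure has the lowest possible contact energy, minus epsilon times the number of contacts, and any displacement raises it. This is the sense in which such a landscape is perfectly funneled. A full structure-based model would add bonded terms and excluded volume, which are omitted here.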
Integrating Experimental and Genomic Information into Molecular Simulations
Experimentally, complete structural characterization of biomolecular systems is a challenging task. In particular, short-lived and transiently populated intermediate conformations adopted during the functional cycle, membrane-bound or transmembrane domains, and protein complexes are often not directly accessible [4]. An intriguing complementary approach is to include existing experimental information, e.g. from FRET, cryo-EM, neutron scattering, NMR, or SAXS, in molecular simulations, or to take advantage of the growing wealth of genomic information through statistical analysis of co-evolving amino acids. One specific example is two-component signal transduction (TCS), which conveys information in the cells of bacteria, fungi, and plants. A membrane-bound sensor kinase detects an environmental stimulus and transfers a phosphoryl group to a transcription factor/response regulator, which then mediates a cellular response [5]. The structures are difficult to determine, as evidenced by the scarcity of structural representations of this heavily studied system. Because TCS is ubiquitous and highly amplified in bacterial genomes, a bioinformatics analysis [6] of a large set of sequence homologues reveals the interacting surface residues of the co-evolving TCS proteins well enough to predict complex structures by molecular dynamics simulations [7]. We also predicted a possible active conformation using refined protocols and confirmed it in mutagenesis experiments [8].
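To make the idea of detecting co-evolving amino acids concrete, the sketch below scores pairs of alignment columns by plain mutual information. This is a deliberately simpler stand-in and not the analysis of [6]. The toy alignment, the column roles, and the function name are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns of equal length."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi

# Hypothetical toy alignment: each row stands for one paired homologue,
# e.g. two kinase positions followed by two regulator positions.
alignment = [
    "AKLD",
    "AKLD",
    "ARLE",
    "ARIE",
    "GKID",
    "GKLD",
    "GRIE",
    "GRLE",
]
columns = list(zip(*alignment))

# Columns 1 and 3 always change together (K with D, R with E).
mi_coupled = mutual_information(columns[1], columns[3])
# Columns 0 and 2 vary almost independently of each other.
mi_uncoupled = mutual_information(columns[0], columns[2])

print(f"MI(coupled pair)   = {mi_coupled:.3f}")
print(f"MI(uncoupled pair) = {mi_uncoupled:.3f}")
```

Column pairs with high scores are candidates for residues in contact across the interface. Real analyses need many sequences and must correct for sampling bias, phylogeny, and indirect correlations, none of which this sketch attempts.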
Multiscale Simulation Techniques
Simulations of biological macromolecules often struggle to reach sufficiently long timescales. One way of addressing this challenge is to use coarse-grained models at different levels of detail [9]. Such methods have been applied successfully in condensed-phase and biomolecular modeling research. They reduce computational demands so that longer timescales can be sampled and large-scale conformational transitions, such as those in protein folding, can be observed [10]. In addition, simulations can easily be complemented by restraints derived from experimental data [11]. Such coarse-grained methods, however, treat some physically relevant interactions only implicitly or neglect their contributions altogether. Solvent and long-range interactions (e.g. electrostatics) in particular introduce another layer of complexity into biomolecular simulations and are crucial for protein-protein or protein/RNA/metabolite interactions during molecular docking. Drawing on our expertise in force field development, we extend these methods into multiscale approaches that combine computationally efficient sampling with an atomistic and physicochemically accurate description of the system.
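One common way to add an experimental restraint to a coarse-grained model is a harmonic penalty on the distance between two beads. The minimal sketch below assumes a single target distance, such as one that might be estimated from a FRET measurement, and returns the restraint energy together with the forces on both beads. The functional form, the force constant, the bead positions, and the function name are illustrative assumptions, not the protocol referred to in [11].

```python
import math

def harmonic_restraint(coords, i, j, target, k=1.0):
    """Energy and forces of the restraint 0.5 * k * (r - target)**2 between beads i and j.

    Returns (energy, force_on_i, force_on_j); the forces are equal and opposite.
    """
    r = math.dist(coords[i], coords[j])
    energy = 0.5 * k * (r - target) ** 2
    # Force on bead i is -dE/dr times the unit vector pointing from j to i.
    scale = -k * (r - target) / r
    force_i = tuple(scale * (a - b) for a, b in zip(coords[i], coords[j]))
    force_j = tuple(-component for component in force_i)
    return energy, force_i, force_j

# Hypothetical three-bead chain along the x axis with 3.8 spacing.
beads = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

# Assumed target distance of 6.0 between the first and last bead.
energy, f_first, f_last = harmonic_restraint(beads, 0, 2, target=6.0, k=2.0)

print(f"restraint energy = {energy:.2f}")
print("force on first bead:", f_first)
print("force on last bead: ", f_last)
```

Because the two beads are farther apart than the target, the forces pull them toward each other. In a simulation these forces would simply be added to those of the underlying coarse-grained force field at every step.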
Dynamics of Structured RNA Folding and Function
The structural assembly of non-coding structured RNA poses interesting questions. One particular example is the riboswitch, an element found in the 5'-untranslated regions of mRNA. Riboswitches bind specific small metabolites and modify gene expression in a self-regulated process. Although they were discovered only recently [12], they have been found to occur frequently in nature, for example in bacteria and fungi, and are important drug targets for antibiotics [13]. We explore the co-transcriptional folding of riboswitches in coarse-grained MD simulations. Crucial questions include: Is there a common hierarchy of folding for riboswitches, and how does binding to a partner affect it? What is the free-energy landscape of this process, what are the barriers, and what conformational transitions occur? After transcription, riboswitches have a limited time frame for their genetic self-regulation. Is this genetic control kinetically or thermodynamically driven? Moreover, since riboswitches are antibiotic target sites: How does the interaction between riboswitch and metabolite differ from the interaction with antibiotics?
Bibliography
[1] Schug A, Herges T, and Wenzel W (2003), Phys Rev Lett 91(15):158102; Schug A, Verma A, Herges T, Lee KH, and Wenzel W (2005), ChemPhysChem 6(12):2640–2646
[2] Schug A, Whitford PC, Levy Y, and Onuchic JN (2007), Proc Natl Acad Sci USA 104:17674–17679; Schug A and Onuchic JN (2010), Curr Opin Pharm 10:709–714
[3] Gambin Y, Schug A, Lemke EA, Lavinder JJ, et al. (2009), Proc Natl Acad Sci USA 106:10153–10158
[4] They are typically too large for NMR experiments or too difficult to stabilize sufficiently in a crystal lattice for X-ray experiments.
[5] Skerker JM, Perchuck BS, Siryaporn A, Lubin EA et al. (2008), Cell 133, 1043-1054; Gao R and Stock AM (2009), Annu Rev Microbiol 63, 133-154; Casino P, Rubio V & Marina A (2009), Cell; Schaller GE, Shiu SH and Armitage JP (2011), Curr Biol 21:R320-330.
[6] Weigt M, White RA, Szurmant H, Hoch J et al., (2009), Proc Natl Acad Sci USA 106, 67-72
[7] Schug A, Weigt M, Hoch JA, Onuchic JN, et al. (2010), Meth Enzymol 471:43–58; Schug A, Weigt M, Onuchic JN, Hwa T, and Szurmant H (2009), Proc Natl Acad Sci USA 106:22124–22129
[8] Dago AE, Schug A, Procaccini A, Hoch JA, et al. (2012), Proc Natl Acad Sci USA 109:1733–1742
[9] Schug A, Hyeon C & Onuchic JN (2008) Coarse-Graining of Condensed Phase and Biomolecular Systems, ed. Voth, G (CRC Press), 123-140
[10] Whitford PC, Noel JK, Gosavi S, Schug A et al. (2009) Proteins:Struct Funct Bioinf 75, 430-441
[11] We simulate conformational transitions by combining existing structural information with smFRET and CryoEM data (collaboration with C. Seidel, HHU Düsseldorf, Germany).
[12] Nahvi A, Sudarsan N, Ebert M, Zhou X et al. (2002), Chem & Biol 9, 1043-1049; Mironov AS, Gusarov I, Rafikov R, Lopez IE et al. (2002), Cell 111, 747-756; Winkler W, Cohen-Chalamish S, and Breaker RR (2002), Proc Natl Acad Sci USA 99, 15908-15913
[13] Blount KF & Breaker RR (2006), Nat Biotechnol 24, 1558-1564; Cheah M, Wachter A, Sudarsan N & Breaker RR (2007), Nature 447, 497-500