(A) FOLDING AND FORMATION OF STRUCTURE

1. Protein Folding, Misfolding, and Aggregation

2. Protein design and structure prediction

3. RNA Folding

4. Folding of biomimetic heteropolymers

5. Lipid membranes, lipid-protein systems, lipid vesicle fusion

 

(B) FUNCTION

1. Ligand binding thermodynamics and drug design

2. Protein-protein interactions

 

 


 

(A) FOLDING AND FORMATION OF STRUCTURE

1. Protein Folding, Misfolding, and Aggregation:

How proteins self-assemble into their native state (the state responsible for biological function) has been a much-studied problem for over a decade. Progress has been made in understanding how simple models of proteins fold, as well as in designing protein sequences de novo. However, these models ignore much of the detail of real proteins, and that detail is likely crucial for understanding how real proteins fold. Thus, the current challenge lies in understanding how particular chemical details of proteins (such as hydrogen bonding and hydrophobic interactions) lead to particular protein folding mechanisms.

We have developed techniques that allow us to make fundamental advances in simulations of protein folding by speeding up atomistic simulations 100 to 1,000 times. This speedup lets us simulate tens of microseconds, and thus the folding of the fastest-folding proteins, in all-atom detail. However, these methods are extremely computationally demanding and require thousands to tens of thousands of computers. To solve this problem, we have released our software as a screen saver and have gathered over 10,000 collaborators who run it. This project, called Folding@home, has already led to strong initial results (the folding of proteins in atomistic detail on the microsecond timescale), and we are now applying the technique to other systems, including the folding of RNA and non-biological polymers and the aggregation of proteins associated with diseases such as Alzheimer's and Mad Cow disease (see below).
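
To illustrate why a large number of loosely coupled computers can help at all, here is a minimal sketch. It rests on an assumption that is not stated above and is not presented as the Folding@home algorithm: that folding behaves as a single-exponential (two-state) process with mean folding time tau. Under that assumption, the first folding event among M independent trajectories arrives after a mean time of tau/M. The function names and the values of tau and M are hypothetical and chosen only for illustration.

```python
import random


def expected_first_folding_time(tau, n_trajectories):
    """Mean of the minimum of n i.i.d. exponential waiting times with mean tau.

    Assumes single-exponential (two-state) folding kinetics.
    """
    return tau / n_trajectories


def simulate_first_folding_time(tau, n_trajectories, n_repeats=2000, seed=0):
    """Estimate the mean time to the first folding event by direct sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_repeats):
        # Each trajectory folds after an exponentially distributed time;
        # the ensemble sees its first folding event at the minimum of these.
        total += min(rng.expovariate(1.0 / tau) for _ in range(n_trajectories))
    return total / n_repeats


if __name__ == "__main__":
    tau_us = 10.0  # hypothetical mean folding time, in microseconds
    for m in (1, 10, 100):
        analytic = expected_first_folding_time(tau_us, m)
        sampled = simulate_first_folding_time(tau_us, m)
        print(f"M={m:4d}  analytic={analytic:.4f} us  sampled={sampled:.4f} us")
```

The sketch only shows the scaling of the waiting time for a first event; it says nothing about how individual trajectories are analyzed or combined in practice.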

 

2. Protein design and structure prediction:

We have also started another distributed computing project that uses protein design to generate new "virtual genomes." Our project, Genome@home, studies real genomes and proteins directly by designing new sequences for existing 3-D protein structures, which come from real genomes. The protein structure files that are sent out as work contain the Cartesian atomic coordinates of a protein. These data were obtained experimentally through X-ray crystallography or NMR. Note that this was not done by us; thousands of scientists have spent decades compiling these data, which are generously made freely available to the public. By designing new sequences that could form these specific protein structures, we're setting the stage to attack a number of significant contemporary issues in structural biology, genetics, and medicine. For example, the Genome@home data will be used to:

  • Try to unravel a fundamental issue in the "protein folding problem" (which itself lies at the heart of a huge amount of modern biomedical research): the fact that thousands of different sequences can all form the same three-dimensional structure.
  • Predict the functions of newly discovered genes and protein structures. Modern approaches to structural biology, known as "proteomics" or "structural genomics", often solve protein structures without knowing what the proteins do. Because techniques for function prediction tend to work best with large amounts of sequence data, a virtual library of sequences for a new protein structure will be an invaluable resource.
  • Potentially design and make new versions of existing proteins for use in medical therapy.

 

3. RNA Folding:

While protein folding has garnered much attention over the last decade, RNA folding has received much less. From a theoretical point of view, one reason for this is the large molecular weight of RNA chains and the role of electrostatics and counterions in RNA folding. However, with techniques recently developed for protein folding, we have started to tackle the RNA problem.

We are currently collaborating with several experimental groups at Stanford (Herschlag, Doniach, and Chu) to combine and compare our simulation results with experiment. This allows us to validate our simulations, and it allows the experimental data to be refined to yield more information about the structure and nature of folding.

 

4. Folding of biomimetic heteropolymers:

Can we apply the understanding gleaned from our study of proteins and RNA to design protein-like heteropolymers, that is, heteropolymers that fold into particular structures? If so, how do these polymers fold compared with proteins? Finally, can we take advantage of new polymer architectures, such as branched chains, to design synthetic polymers with novel folding and material properties?

 

5. Lipid membranes, lipid-protein systems, lipid vesicle fusion:

Lipid membranes also play a fundamental role in biochemistry, serving as the structural units that encapsulate cells, organelles, viruses, etc. In particular, lipid membranes must fuse in order for such systems to combine (endocytosis) or detach (exocytosis). This physical process is also a first-order phase transition, but in biological systems it is heavily mediated by proteins. We are currently studying how lipid vesicles fuse with and without the effect of biological machinery (fusion peptides and proteins; see below).

 

 

(B) FUNCTION

1. Ligand binding and drug design:

One of the biggest challenges in computational drug design is the accurate calculation of the free energy of binding of small ligands. Currently, typical errors in these calculations are too large for them to distinguish between strong binders (which would potentially make good drugs) and non-specific binders (which wouldn't). We are using distributed computing methods to greatly increase the accuracy of such calculations.
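
To make concrete why accuracy matters here, the following sketch uses the standard thermodynamic relation between the binding free energy and the dissociation constant, dG = RT ln(Kd / c0), with a 1 M standard concentration c0. An error of d in the computed free energy multiplies the implied Kd by exp(d / RT). The temperature, the example Kd, and the error sizes below are illustrative assumptions, not values taken from our calculations, and the function names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd, with a 1 M standard state."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def kd_fold_error(free_energy_error_kcal, temperature_k=298.15):
    """Factor by which Kd is mis-estimated for a given error in the free energy."""
    return math.exp(abs(free_energy_error_kcal) / (R_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    # A hypothetical nanomolar binder.
    print(f"dG for Kd = 1 nM: {binding_free_energy(1e-9):.2f} kcal/mol")
    # Illustrative error sizes in kcal/mol.
    for err in (0.5, 1.0, 2.0):
        print(f"error of {err:.1f} kcal/mol -> Kd off by a factor of {kd_fold_error(err):.1f}")
```

Because the dependence is exponential, an error of roughly 1.4 kcal/mol at room temperature already corresponds to a tenfold error in Kd, which is why modest free energy errors can blur the line between strong and weak binders.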

 

2. Protein-protein interactions:

Related to our work on protein folding and protein design, we are also interested in protein-protein interactions. While this is a new area for our lab, we are leveraging our unique methods and capabilities in protein folding thermodynamics, kinetics, and design.