
SUMMARIES OF SELECTED SCIENTIFIC JOURNAL ARTICLES


(The following is a simplified summary of the article by Bojan Zagrovic, Christopher D. Snow, Siraj Khaliq, Michael R. Shirts and Vijay S. Pande titled NATIVE-LIKE MEAN STRUCTURE IN THE UNFOLDED ENSEMBLE OF SMALL PROTEINS, published in the Journal of Molecular Biology, 2002.)

Historically, unfolded proteins were seen as a random distribution over a large number of structural possibilities, much as a piece of string can be folded in many different ways.

Each amino acid contributes two freely rotating bonds to the backbone of the polypeptide chain, and thus even a small protein (100 amino acid residues) has a very large number of configurations it can adopt when unfolded (10^100, a number much larger than the number of grains of sand on a typical beach). It seemed almost paradoxical that, given such a large number of available states, proteins still manage to fold in a biologically relevant time to carry out their biological function. But are we really certain that unfolded states are astronomically complex? Our results suggest there may be a surprising simplicity to this seemingly heterogeneous mess.
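The size of the 10^100 estimate, and why it looks paradoxical, can be checked with a rough calculation. The sketch below is only an illustration: the figure of 10 conformations per residue and the sampling rate of 10^13 conformations per second are assumptions chosen for round numbers, not values taken from the article.

```python
import math

residues = 100            # chain length quoted in the text
states_per_residue = 10   # ASSUMPTION: illustrative value, not from the article

# Work in log10 so the very large numbers stay readable.
log10_configurations = residues * math.log10(states_per_residue)

sampling_rate_per_second = 1e13        # ASSUMPTION: one conformation every 0.1 ps
seconds_per_year = 365.25 * 24 * 3600

# Time needed to visit every configuration once, expressed as log10(years).
log10_years = (log10_configurations
               - math.log10(sampling_rate_per_second)
               - math.log10(seconds_per_year))

print(f"configurations: about 10^{log10_configurations:.0f}")
print(f"exhaustive search: about 10^{log10_years:.1f} years")
```

Under these assumptions an exhaustive search would take vastly longer than the fraction of a second in which real proteins fold, which is the apparent paradox.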


Selected portion of simulation of HIV INTEGRASE unfolding by Pande Group (simulation is time-reversed for illustration purposes)

Because the unfolded state of proteins was believed to be very complex, and because biological function is dominated by the native state, the unfolded state received significantly less attention. Recent studies of denatured proteins suggest that the denatured state may not be as diverse as previously thought. The unfolded state refers to the heterogeneous state of proteins before they fold into the native state, and it is different from the denatured state. For example, a cooked egg represents irreversibly denatured proteins. However, most denatured proteins can be renatured when the chemical agent that denatured them is diluted or removed altogether.

Computer simulation of the unfolded state seemed formidable because of the intense computational power needed. However, using more than 10,000 computer processors through Folding@home, we have run thousands of fully independent simulations of three small proteins, each simulation tens of nanoseconds (billionths of a second) long. One advantage of running such a large number of independent, relatively short folding simulations is that we can expect a small number of folding events to take place, which we can then study. Even though initial random motions play a role, under the right conditions proteins ultimately have little choice but to fold into their native state, within a fraction of a second, to carry out their biological function. Running a large number of parallel simulations is our way of handling the initial random motions: a small number of proteins fold within tens of nanoseconds, even though it takes many microseconds (millionths of a second) for a complete sample of proteins to fold.
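To see why many short simulations can still capture a few folding events, here is a minimal sketch. It assumes that folding follows simple single-exponential kinetics, which is a modeling assumption and not something stated above. The function name and the example values (20 ns per simulation, a 5 microsecond folding time) are hypothetical, picked only to match "tens of nanoseconds" and "many microseconds".

```python
import math

def expected_folding_events(n_simulations, sim_length_ns, folding_time_ns):
    """Expected number of trajectories that fold before the simulation ends.

    ASSUMPTION: single-exponential kinetics, so the chance that one
    trajectory folds within time t is 1 - exp(-t / folding_time).
    """
    p_fold = 1.0 - math.exp(-sim_length_ns / folding_time_ns)
    return n_simulations * p_fold

# Illustrative values only (not taken from the article).
events = expected_folding_events(n_simulations=10_000,
                                 sim_length_ns=20.0,
                                 folding_time_ns=5_000.0)
single = expected_folding_events(1, 20.0, 5_000.0)

print(f"10,000 simulations: about {events:.0f} folding events expected")
print(f"1 simulation: about {single:.3f} folding events expected")
```

With these numbers a single short trajectory almost never folds, while the whole set is expected to contain a few dozen folding events.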

Having more than 10,000 independent simulations also gives us a detailed picture of the unfolded protein very early in folding (tens of nanoseconds after the initiation of folding). The illustration below shows some of the stages the proteins go through during simulation. Initially the proteins are extended like a piece of spaghetti. However, they quickly collapse to a compact unfolded state before reaching the final folded state.

WHAT HAVE WE LEARNED ABOUT THE UNFOLDED STATE OF THREE SMALL PROTEINS?

Folding simulations for three proteins (villin, TrpZip and BBA5) were started from extended conformations. TrpZip and BBA5 collapse from extended into compact conformations in about 10 ns, and villin does so in about 20 ns.


TrpZip

BBA5

Villin


Individual members of the unfolded protein ensembles are very diverse; however, we found that if we look at the average structure, there are some surprising similarities to the folded structure. First, let us discuss how we determine the average structure. We measure the distances between one selected carbon atom (the alpha carbon) of each amino acid in the protein. In the illustration below, a protein with 7 amino acids, the gray circles represent alpha carbons and the blue lines represent distances in angstroms (1 angstrom = 1 Å = 10^-10 meters; one meter is roughly one yard).

Using the above illustration, we would organize the data in a 7 × 7 table (matrix). Only the distances from alpha carbon 1 are filled in here.

     1      2      3      4      5      6      7
1    0      1.2 Å  5.1 Å  7.6 Å  6.1 Å  5.4 Å  4.2 Å
2    1.2 Å  0
3    5.1 Å         0
4    7.6 Å                0
5    6.1 Å                       0
6    5.4 Å                              0
7    4.2 Å                                     0
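A table like this can be computed directly from the positions of the alpha carbons. The following is a minimal sketch using NumPy; the function name `distance_matrix` and the seven coordinates are made up for illustration and do not reproduce the values in the table above.

```python
import numpy as np

def distance_matrix(coords):
    """Return the N x N table of distances between N points in 3D."""
    coords = np.asarray(coords, dtype=float)
    # diff[i, j] is the vector from point j to point i.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Made-up alpha-carbon positions (in angstroms) for a 7-residue chain.
alpha_carbons = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [6.0, 0.0, 0.0],
    [6.0, 4.0, 3.0],
    [2.0, 5.0, 4.0],
    [0.0, 3.0, 6.0],
    [1.0, 0.0, 4.0],
])

table = distance_matrix(alpha_carbons)
print(np.round(table, 1))
```

The result is symmetric with zeros on the diagonal, because the distance from atom i to atom j equals the distance from j to i and every atom is at distance zero from itself.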

We would organize a similar table (matrix) for the native protein in the folded state. Then we would compare each protein structure in the unfolded state with the native folded one, using the mathematical formula below. For every pair of alpha carbons i and j, we take the difference between the corresponding entries in the unfolded and folded tables and square it. We add up these squared differences over all pairs, multiply by 2, take the square root, and divide by the number of atoms in the protein. The result is called the distance root-mean-square deviation, or dRMS.
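Read literally, the recipe in the paragraph above corresponds to the expression below, with the squared differences summed over all pairs of alpha carbons. This is a reconstruction from the wording on this page, not a formula quoted from the article. The symbols are assumptions introduced here: N is the number of alpha carbons, and d^U and d^F are the distance tables of the unfolded and folded structures.

```latex
\mathrm{dRMS} \;=\; \frac{1}{N}\sqrt{\,2\sum_{i<j}\left(d^{U}_{ij}-d^{F}_{ij}\right)^{2}}
\;=\; \sqrt{\frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\left(d^{U}_{ij}-d^{F}_{ij}\right)^{2}}
```

The two forms are equal because each table is symmetric with zeros on the diagonal, so summing over every entry counts each pair twice. In this form dRMS is the root-mean-square difference over all N × N entries of the two tables.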

In the case of villin, the table (matrix) would really be a 36 × 36 table, since villin has 36 amino acids, not just 7 as in our illustration above. If we graph the number of structures against dRMS, we get a general distribution curve (red bars), which indicates that the unfolded state is a very diverse group.
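The counting behind such a graph can be sketched as follows. The `drms` function follows the verbal recipe given earlier (square the differences, sum over pairs, multiply by 2, take the square root, divide by the number of atoms), which is an interpretation of this page rather than code from the article. The function names, the 1 Å bin width, and the tiny 3-residue tables are all made up for illustration.

```python
import numpy as np

def drms(dist_a, dist_b):
    """dRMS between two distance tables: sqrt(2 * sum over i<j of diff^2) / N."""
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    n = dist_a.shape[0]
    i, j = np.triu_indices(n, k=1)          # every pair with i < j, once
    diff = dist_a[i, j] - dist_b[i, j]
    return np.sqrt(2.0 * np.sum(diff ** 2)) / n

def drms_histogram(ensemble, native, bin_width=1.0):
    """Count how many structures fall into each dRMS bin."""
    values = np.array([drms(m, native) for m in ensemble])
    n_bins = int(np.ceil(values.max() / bin_width)) + 1
    edges = np.arange(n_bins + 1) * bin_width
    counts, edges = np.histogram(values, bins=edges)
    return values, counts, edges

# Toy data only: a 3-residue "native" table and three made-up "unfolded" tables.
native = np.array([[0.0, 3.0, 4.0],
                   [3.0, 0.0, 5.0],
                   [4.0, 5.0, 0.0]])
off_diagonal = 1.0 - np.eye(3)
ensemble = [native + shift * off_diagonal for shift in (0.5, 1.5, 3.0)]

values, counts, edges = drms_histogram(ensemble, native)
print(np.round(values, 2), counts)
```

Each count is the height of one bar: the number of structures whose dRMS from the native table falls in that bin.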

However, the red bars in the above graph come from comparing each unfolded protein individually with the folded protein. When protein structures are determined experimentally (with X-ray diffraction or nuclear magnetic resonance), what the experimenter really determines is the average structure of a large number of proteins. We wanted to take a similar approach in analyzing the data from our simulations, so we averaged these distances over thousands of protein samples in the simulation. In the example below, the average distance (indicated by the red line between alpha carbons) will be: (7 Å + 6 Å + 8 Å) / 3 = 7 Å.
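The averaging step can be sketched in a few lines. The function name `mean_distance_matrix` and the two-atom tables are hypothetical; they simply reproduce the (7 Å + 6 Å + 8 Å) / 3 example for a single pair of alpha carbons.

```python
import numpy as np

def mean_distance_matrix(matrices):
    """Average a list of N x N distance tables entry by entry."""
    stack = np.stack([np.asarray(m, dtype=float) for m in matrices])
    return stack.mean(axis=0)

def pair_table(distance):
    """2 x 2 distance table for two alpha carbons separated by `distance`."""
    return np.array([[0.0, distance], [distance, 0.0]])

# The same pair of alpha carbons measured in three different structures.
samples = [pair_table(7.0), pair_table(6.0), pair_table(8.0)]
mean_table = mean_distance_matrix(samples)

print(mean_table[0, 1])
```

For a real protein the same call would average thousands of full tables at once, one entry per pair of alpha carbons.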

From all the averages we would get a single table: 36 × 36 for villin (villin has 36 amino acids, and therefore 36 alpha carbons). We compare the average structure of the unfolded state with the folded state using the dRMS formula. The result of this calculation is shown as an arrow in the above graph. The surprising result is that the average structure of the unfolded state is quite close to the folded state (dRMS = 2.4 Å). In comparison, if we compare individual unfolded proteins with the folded one, the dRMS values fluctuate widely.

Our findings lead us to propose what we call the "mean-structure hypothesis": the geometry of the collapsed unfolded state of small peptides and proteins corresponds, in an average sense, to the geometry of the native folded state.

  • We suggest that, in an average sense, the structure of the protein essentially does not change during the folding process. The average structure stays in place while folding reduces the large structural variability described at the beginning. Individual proteins may take a variety of paths to folding, but averaged over all proteins in the sample, things do not change that much structurally.
  • Experimentally, structural analysis is also done on an average over a large number of proteins. This approach will allow us to better validate our results experimentally.
  • If the mean-structure hypothesis is correct, it could help us refine simulations. Using the distance constraints, one could find the individual member of the unfolded ensemble that is closest to the average structure of that same ensemble. If the hypothesis is correct, this structure should be closer to the native structure than most other individual unfolded structures.
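The search described in the last point could be sketched as follows. This is only an illustration of the idea, not the authors' procedure: the function name `closest_to_mean` is hypothetical, the score is the dRMS-style measure described earlier on this page, and the three two-atom tables are made up.

```python
import numpy as np

def closest_to_mean(ensemble):
    """Return (index of the member nearest the average table, the average table)."""
    stack = np.stack([np.asarray(m, dtype=float) for m in ensemble])
    mean = stack.mean(axis=0)
    n = stack.shape[1]
    # Root of the summed squared differences over all entries, divided by n.
    # For symmetric tables with a zero diagonal this equals
    # sqrt(2 * sum over i<j of diff^2) / n.
    scores = np.sqrt(((stack - mean) ** 2).sum(axis=(1, 2))) / n
    return int(np.argmin(scores)), mean

def pair_table(distance):
    """2 x 2 distance table for two alpha carbons separated by `distance`."""
    return np.array([[0.0, distance], [distance, 0.0]])

# Made-up ensemble of three structures (distances in angstroms).
ensemble = [pair_table(6.0), pair_table(7.5), pair_table(10.0)]
best_index, mean_table = closest_to_mean(ensemble)

print(best_index, round(mean_table[0, 1], 2))
```

To test the hypothesis, the selected member would then be compared with the native table and its dRMS ranked against the dRMS values of the other members.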


 

 

 
(c) 2000-2002 Vijay Pande and Stanford University