Proteins start out in an unfolded, relaxed state, like an
extended spaghetti noodle. Driven by several forces acting inside
the cell, they fold very rapidly into a three-dimensional
structure that lets them serve their biological function, such as
carrying oxygen from the lungs or forming the structure of hair.
The structures of the initial unfolded state and the final folded
state can be determined through various well-established methods.
(If you are interested in an explanation of one such method,
follow this link: x-ray diffraction.) We would like to understand
the folding process itself, which will have many benefits for all
of us, as explained elsewhere on this site.
Pande group Computer Simulation
The fastest proteins fold in about 10 microseconds (1 microsecond
is one millionth of a second). Even the fastest computers can
simulate only a few nanoseconds of this process (a nanosecond is
a billionth of a second). The reason is that a protein can pass
through a very large number of states before it settles into
the final folded state. In a typical coin toss there are
only two possibilities: heads or tails. In a roll of a die there
are six possibilities. A protein has vastly more
possibilities, because it contains thousands of individual atoms, and
even in a simplified simulation at least 5 different
types of forces act on each atom. Even worse, proteins can
fall into "traps" that may take a long time to
escape. A single computer is like a single-lane road: no matter how
well the road is made, in heavy traffic the cars will not be able
to move very fast.
If you have a road composed of many lanes, the traffic will move
much faster. This is the idea behind parallel simulations. When
you participate, your computer becomes one of the lanes in the
parallel simulations. The more computers that participate in
Folding@Home, the more lanes of simulation are available to us.
Even though a sample protein may take a long time
to fold when simulated on a single computer, if we run 10,000 or
more parallel simulations independent of each other we can expect
to observe a small number of the proteins fold. For example, if
we run 10,000 simulations simultaneously, we could expect about
25 of those simulations to show folding. Afterwards, the
simulations that folded give us very valuable information and
data.
Image courtesy of www.thinkquest.org
APPENDIX - Mathematical description of parallel simulations
In a typical protein folding process, if we plot the
percentage folded over a period of time we get a graph like the one below
(%folded = 1 - exp(-t/T), expressed as a fraction, where t is the
elapsed time and T is the average folding time):
The fastest proteins fold in about 10 µs (1 µs
= 10^-6 seconds, or one millionth of a second).
A single computer can simulate about one nanosecond per day (one nanosecond
= 1 ns = 10^-9 seconds, or one billionth of a second)
because of all the different possibilities that exist for an unfolded
protein before it reaches the final stable, folded state. Since
10 µs is 10,000 ns, a single computer would take about 10,000 days,
or roughly 30 years, to simulate even the fastest-folding protein.
The good news is that 10 µs is only the average folding time, and
within this time span some proteins may fold much
faster than others just by chance. If you examine the graph below,
the initial fraction of folded proteins can be approximated
by a linear relation (%folded = t/T).
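The two relations above, and the single-computer estimate, can be checked numerically. The following is a minimal sketch, not part of the Folding@Home software; the function names are hypothetical, and the one-nanosecond-per-day simulation rate is simply the figure quoted in the text.

```python
import math

def fraction_folded(t, T):
    """Exact fraction folded after time t, for average folding time T."""
    return 1.0 - math.exp(-t / T)

def fraction_folded_linear(t, T):
    """Short-time approximation t/T, reasonable only when t is much smaller than T."""
    return t / T

def single_computer_years(folding_time_s, simulated_s_per_day=1e-9):
    """Years one computer needs, assuming it simulates 1 ns per day (figure from the text)."""
    days = folding_time_s / simulated_s_per_day
    return days / 365.25

T = 10e-6  # average folding time: 10 microseconds, in seconds

# Compare the exact curve with the linear approximation at several times.
for t in (1e-9, 10e-9, 100e-9, 1e-6):
    exact = fraction_folded(t, T)
    approx = fraction_folded_linear(t, T)
    print(f"t = {t:.0e} s  exact = {exact:.6f}  linear = {approx:.6f}")

print(f"single computer: about {single_computer_years(T):.1f} years")
```

The two columns agree closely while t is a small fraction of T and drift apart as t approaches T, which is why the linear form is used only for the first few nanoseconds.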
In the first 10 nanoseconds (one nanosecond is
a billionth of a second) of simulation, the ratio of folded proteins
to the total number of proteins is approximately equal to the elapsed
time divided by the average folding time. So in the first 10 ns,
if we approximate %folded by t/T, with t = 10 ns and T = 10 µs, then:
%folded = t/T = 10 ns / 10 µs = 0.001 = 0.1%
In other words, in the first 10 nanoseconds about one
thousandth of the protein sample may have folded. If we run 10,000
parallel simulations, then:
10,000 x 0.001 = 10 folded proteins.
So with 10,000 parallel, independent simulations
we can expect about 10 of the simulated proteins to fold.
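The arithmetic above can be reproduced, and compared with a small random experiment, in a few lines. This is an illustrative sketch only: the function names are hypothetical, and it assumes each simulated protein's folding time is an independent draw from an exponential distribution with mean T, which is the model implied by %folded = 1 - exp(-t/T).

```python
import math
import random

def expected_folded(n_sims, t, T):
    """Expected number of simulations that fold within time t."""
    return n_sims * (1.0 - math.exp(-t / T))

def simulate_folded(n_sims, t, T, seed=0):
    """Count random folding events within time t across n_sims independent runs."""
    rng = random.Random(seed)
    # expovariate takes the rate 1/T and returns one random folding time.
    return sum(1 for _ in range(n_sims) if rng.expovariate(1.0 / T) < t)

n_sims = 10_000
t = 10e-9   # 10 ns of simulated time per run
T = 10e-6   # 10 microsecond average folding time

print(f"expected folded:  {expected_folded(n_sims, t, T):.2f}")
print(f"one random trial: {simulate_folded(n_sims, t, T)}")
```

A single random trial will usually not give exactly 10, because folding within the window is a chance event; the count scatters around the expected value from run to run.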
Alternatively, if we know the fraction of proteins
folded and the time it took, then using the above equations we can
determine the folding time constant T for the protein and compare it
with experimental results. This is one way to validate the computer
simulation approach experimentally. As the graph below shows,
there is close agreement between the folding time constants determined
by computer simulation and those measured experimentally.
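Running the calculation in reverse, as described above, means solving %folded = 1 - exp(-t/T) for T. The sketch below shows one way to do that under the same exponential model; the function names are hypothetical, and the input numbers are the illustrative ones from this appendix, not measured results.

```python
import math

def estimate_time_constant(n_folded, n_sims, t):
    """Solve 1 - exp(-t/T) = n_folded/n_sims for the time constant T."""
    if not 0 < n_folded < n_sims:
        raise ValueError("need at least one folded and one unfolded simulation")
    fraction = n_folded / n_sims
    # log1p(-fraction) computes ln(1 - fraction) accurately for small fractions.
    return -t / math.log1p(-fraction)

def estimate_time_constant_linear(n_folded, n_sims, t):
    """Short-time version: from fraction = t/T, T = t * n_sims / n_folded."""
    return t * n_sims / n_folded

t = 10e-9  # each simulation covered 10 ns
T_exact = estimate_time_constant(10, 10_000, t)
T_linear = estimate_time_constant_linear(10, 10_000, t)
print(f"T (exact)  = {T_exact * 1e6:.3f} microseconds")
print(f"T (linear) = {T_linear * 1e6:.3f} microseconds")
```

With only a handful of folding events the estimate carries a large statistical uncertainty, so a comparison with experiment is more meaningful when many simulations have folded.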