RESEARCH INTERESTS

Applications of physical simulation and Bayesian statistics/machine learning to biologically and biomedically important questions

Protein misfolding diseases: Alzheimer's Disease (AD) and Huntington's Disease

Many diseases result from protein misfolding, i.e. the inability of a protein to self-assemble (or "fold") into the right structure. For example, AD is a result of the undesired aggregation of Aβ peptides. Surprisingly, the toxic species are the small oligomers (4-16 monomers) of Aβ. Since Aβ itself is small (42 residues), simulation approaches using our advanced methodology for kinetics and thermodynamics should be able to shed light on the structure and stability of these oligomers. We are also applying these methods to protein misfolding diseases more broadly, starting with Huntington's Disease.

Small molecule drug design: Linking genes to drugs to biochemical pathways

The future of much of biology and chemistry lies at the connection between genes, small molecules (drugs), and biochemical pathways. To unravel these connections, we apply machine learning, Bayesian statistics, atomistic simulation, bioinformatics, and cheminformatics methods to the problem of linking drug efficacy and side effects to genomics and systems biology. My group also has expertise in related synergistic areas, such as theoretical physical chemistry, structural biology, computer science, and large-scale distributed computing. By combining our methods with the Folding@home distributed computing project (currently the most powerful supercomputer in the world, with almost 10 petaflops of performance), we have a unique opportunity to push the state of the art in these and related areas. Finally, via collaborations with biotechs, pharmaceutical companies, and experimental groups interested in drug design, we can directly test our predictions, thus strengthening our methods as well as the direct impact of our results.

Biophysics of cellular environments: Folding in vivo

While understanding the nature of folding in vitro is a challenging biophysical question, understanding folding in vivo is the dominant biological question. In collaboration with several experimental groups, we are now performing simulations of folding under biologically relevant conditions, i.e. in models of cellular conditions that include the relevant biochemical machinery. While these simulations will be extremely demanding, they should provide insight in ways that were previously impossible.

Moreover, we believe that the future of biophysics lies in its direct connection to cellular environments, and we are working to pioneer new methods that can directly tackle this set of challenging problems.


Biophysical Chemistry: Studying biophysical questions using model systems

Protein folding

For several decades, understanding how proteins self-assemble (or "fold") has been a challenging problem in physical chemistry with important ramifications for structural biology and nanotechnology. Moreover, protein folding is an important paradigm for many other difficult problems in structural biology and physical chemistry. Our goal has been to develop novel computational methods that greatly push the envelope in folding simulation, with the aim of directly and quantitatively predicting all possible experimental observables. Using novel algorithms and the power of Folding@home, we have been able, for the first time, to simulate folding dynamics directly from the sequence.

RNA folding

RNA folding presents many additional challenges in understanding molecular self-assembly when compared with protein folding. In particular, RNA molecules are considerably larger, electrostatics plays a much more dominant and complex role, and the nature of tertiary interactions is considerably more subtle. Through tight coupling with experimental collaborators, we are examining RNA folding on many scales, from atomistic simulations of small RNA motifs to simulations of the entire Tetrahymena ribozyme.

Role of water and co-solvents in protein kinetics and thermodynamics

Water and other co-solvents (such as urea) play an active role in biomolecular self-assembly. Indeed, the hydrophobic effect is a dominant driving force. How does water influence the nature of biomolecular structure formation, and does it play a structural (rather than general continuum) role? Using fully atomistic simulation with quantitative comparison to experiment, we can now start to detail the answers to these questions.


Theoretical Chemistry: Breaking fundamental barriers in molecular simulation

New paradigms for supercomputing: world-wide distributed computing and Graphics Processing Units (GPUs)

Current atomistic simulations are greatly limited by the available computational power. In order to even attempt a direct comparison to experiment, many simulations would need to be run for thousands of years. Distributed computing opens the door to new possibilities. Using 100,000 CPUs distributed throughout the world ("Folding@home") and well-designed algorithms, one can turn 100,000 CPU days (roughly 300 years!) into one day of simulation.
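
The arithmetic behind this claim can be sketched in a few lines. This is only a back-of-the-envelope illustration: the function names and the parallel-efficiency parameter are assumptions introduced here, and the ideal one-day figure holds only if the work splits perfectly across all CPUs.

```python
# Back-of-the-envelope arithmetic for the distributed-computing claim.
# All names here are illustrative, not part of any Folding@home software.

DAYS_PER_YEAR = 365.25


def cpu_days_to_years(cpu_days: float) -> float:
    """Convert a serial workload measured in CPU days into CPU years."""
    return cpu_days / DAYS_PER_YEAR


def wall_clock_days(cpu_days: float, n_cpus: int, efficiency: float = 1.0) -> float:
    """Wall-clock days needed when the work is spread over n_cpus.

    efficiency is the assumed fraction of ideal parallel speedup (0 < e <= 1).
    """
    if n_cpus <= 0 or not (0.0 < efficiency <= 1.0):
        raise ValueError("n_cpus must be positive and efficiency in (0, 1]")
    return cpu_days / (n_cpus * efficiency)


serial_years = cpu_days_to_years(100_000)                      # a bit under 300 years
ideal_days = wall_clock_days(100_000, 100_000)                 # perfect scaling
half_days = wall_clock_days(100_000, 100_000, efficiency=0.5)  # 50% efficiency

print(f"Serial cost: {serial_years:.0f} CPU years")
print(f"Wall clock at perfect scaling: {ideal_days:.1f} day(s)")
print(f"Wall clock at 50% efficiency:  {half_days:.1f} day(s)")
```

The efficiency parameter is there to make the point of the surrounding text explicit: the one-day figure is reached only when the algorithm keeps all machines usefully busy.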

Starting in 2005, we have pushed to a new computing paradigm: Graphics Processing Units (GPUs). Our efforts have led to extremely powerful molecular dynamics software, which is also part of Folding@home. Moreover, this work led to Folding@home on game consoles, such as the PS3. The combined power of the GPUs and PS3s has made Folding@home the most powerful supercomputer cluster in the world, approaching the 10 petaflop scale.

Long timescale kinetics: MD simulations of millisecond events in all-atom detail

While the fastest proteins fold in tens of microseconds to milliseconds, atomistic simulations are limited to the nanosecond regime. How can we break this fundamental impasse? While using 100,000 CPUs with distributed computing can supply the raw horsepower, well-designed algorithms are clearly needed to use that power efficiently. Indeed, just as 100,000 grad students can't work together to finish 300 years of work in one day, folding simulations must be designed to be parallelized to this scale. By taking advantage of the nature of folding kinetics (single-exponential behavior of single-domain proteins), one can devise natural ways to speed up folding simulation 100,000x using distributed computing. This allows us to simulate folding on the millisecond timescale.
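
One way to see why single-exponential kinetics helps is the following minimal sketch. It assumes an idealized two-state folder whose waiting time to fold is exponentially distributed with rate k, so a single short trajectory of length t folds with probability 1 - exp(-k t), and a large ensemble of independent short trajectories yields enough folding events to estimate k. The rate, trajectory length, ensemble size, and function names are all hypothetical values chosen for illustration; this is a toy statistical model, not the production algorithm.

```python
import math
import random


def simulate_fold_count(rate_per_us: float, traj_length_us: float,
                        n_traj: int, seed: int = 0) -> int:
    """Count how many of n_traj independent trajectories fold within traj_length_us.

    Each trajectory's folding time is drawn from an exponential distribution,
    which is the single-exponential (two-state) assumption.
    """
    rng = random.Random(seed)
    folded = 0
    for _ in range(n_traj):
        if rng.expovariate(rate_per_us) <= traj_length_us:
            folded += 1
    return folded


def estimate_rate(n_folded: int, n_traj: int, traj_length_us: float) -> float:
    """Invert P(fold by t) = 1 - exp(-k t) to estimate k from the folded fraction."""
    fraction = n_folded / n_traj
    if fraction >= 1.0:
        raise ValueError("all trajectories folded; use shorter trajectories")
    return -math.log(1.0 - fraction) / traj_length_us


# Hypothetical numbers: mean folding time of 10 microseconds, 20 ns trajectories.
true_rate = 0.1        # per microsecond
traj_length = 0.02     # microseconds (20 ns)
n_traj = 100_000

n_folded = simulate_fold_count(true_rate, traj_length, n_traj)
k_est = estimate_rate(n_folded, n_traj, traj_length)

print(f"{n_folded} of {n_traj} short trajectories folded")
print(f"Estimated rate: {k_est:.3f} per microsecond (true value {true_rate})")
```

In this toy model no single trajectory comes close to the mean folding time, yet the ensemble still recovers the rate, because for k t much less than 1 the expected number of folding events grows linearly with the number of trajectories.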

Sampling algorithms for more precise free energy calculation

Another great challenge in physical chemistry simulation is the ability to calculate free energies to chemical accuracy and precision (e.g. to 1 kcal/mol). With such capabilities, one could use simulation in drug lead discovery and refinement. We are developing novel means to use distributed computing to make a fundamental advance, with 1 kcal/mol accuracy in absolute ΔG calculation as our goal.
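
To give a sense of what a 1 kcal/mol target means in practice, the sketch below uses the standard thermodynamic relation ΔG = RT ln(Kd) (with a 1 M standard state) to convert a free energy error into a multiplicative error in a dissociation constant. The temperature of 298.15 K, the example binding free energy, and the function names are assumptions made for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol K)


def kd_from_dg(dg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (in M, 1 M standard state) from a binding free energy."""
    rt = R_KCAL_PER_MOL_K * temperature_k
    return math.exp(dg_kcal_per_mol / rt)


def kd_error_factor(dg_error_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Multiplicative uncertainty in Kd implied by an error in the free energy."""
    rt = R_KCAL_PER_MOL_K * temperature_k
    return math.exp(abs(dg_error_kcal_per_mol) / rt)


factor = kd_error_factor(1.0)    # effect of a 1 kcal/mol error
kd_example = kd_from_dg(-10.0)   # hypothetical ligand with a ΔG of -10 kcal/mol

print(f"A 1 kcal/mol error changes Kd by a factor of about {factor:.1f}")
print(f"ΔG = -10 kcal/mol corresponds to Kd of about {kd_example:.1e} M")
```

Under these assumptions, an error of 1 kcal/mol corresponds to roughly a fivefold uncertainty in the predicted binding constant near room temperature.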