Adaptation is the central process in evolution, yet fundamental questions about its nature are still unanswered: How often does adaptation involve the rise and fixation of a new beneficial mutation versus other modes of adaptation, such as polygenic adaptation or selection that varies over time? What role does adaptation play relative to random genetic drift, and what are typical selection coefficients of adaptive mutations? Does adaptation primarily leave the signatures of classic selective sweeps, where a single haplotype rises in the population, or "soft" sweeps, where several different haplotypes all carrying the adaptive allele sweep through the population simultaneously? How common are incomplete sweeps and what mechanisms underlie them?
The advent of population genomics has opened up exciting new possibilities for in-depth studies of adaptation that now allow such questions to be addressed systematically. From these studies, a new picture of adaptation is emerging that challenges evolutionary theory: adaptation not only seems to occur much more frequently than assumed by the neutral theory of molecular evolution, it also often appears to proceed in modes rather different from the classic selective sweep model. In my research I focus on elucidating the precise modes and rates at which adaptation operates and on refining population genetics theory accordingly. Some specific research projects are:
To understand the population parameters that allow for rapid adaptation in eukaryotes, we studied the signatures of a well-documented example of particularly strong adaptation: the evolution of pesticide resistance in D. melanogaster. We found that this complex adaptation did not proceed by a classic hard sweep, but instead showed signatures of a soft sweep, where the adaptive allele arose independently on several different haplotypes. Our theoretical analysis further revealed that for cases of strong adaptation in species with large census population sizes, soft sweeps should actually be the norm rather than the exception. We are currently analyzing whole-genome population data of D. melanogaster to assess whether recent adaptations more commonly involved hard or soft sweeps in this species. Preliminary results indicate that strong adaptation indeed primarily shows the signatures of soft sweeps. We are also modeling how the dramatic fluctuations in population size that fly populations undergo each year are expected to affect the likelihood of observing soft sweeps.
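The dependence on population size can be illustrated with a minimal sketch. It assumes a standard approximation based on the Ewens sampling formula, in which the chance that a sample of adaptive alleles traces back to a single mutational origin is a product over the sample; this is an illustration and not necessarily the analysis used in our study. The function name and the parameter theta (the population-scaled mutation rate toward the adaptive allele) are assumptions of the sketch.

```python
def prob_soft_sweep(theta, n):
    """Probability that n sampled adaptive alleles stem from more than one
    independent mutational origin (Ewens-sampling-formula approximation).

    theta: population-scaled mutation rate toward the adaptive allele
           (hypothetical parameter name; grows with population size).
    n:     number of sampled chromosomes carrying the adaptive allele.
    """
    p_single_origin = 1.0
    for i in range(1, n):
        p_single_origin *= i / (i + theta)
    return 1.0 - p_single_origin


if __name__ == "__main__":
    # Larger theta (larger populations or mutational targets) makes
    # multiple origins, and hence soft sweeps, increasingly likely.
    for theta in (0.001, 0.01, 0.1, 1.0, 10.0):
        print(theta, prob_soft_sweep(theta, n=100))
```

Under this approximation the probability of a soft sweep rises steadily with theta, which is why large populations are expected to favor soft sweeps.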
Together with Richard Neher from the Max-Planck Institute in Tuebingen I recently developed a new approach to measure the selection coefficients of hard and soft sweeps from deep population diversity data. In contrast to previous methods, which typically analyze the reduction in diversity caused by a sweep, our method utilizes the novel variation that arises from mutations occurring on the sweeping haplotypes. When applying this method to HIV populations, we again observed several examples of strong adaptation involving both hard and soft sweeps.
One striking result of recent studies is that adaptation often results in only incomplete sweeps, where the adaptive allele does not become fixed in the population. For example, when fruit flies were evolved over 600 generations of laboratory selection for accelerated development, many polymorphisms changed their frequencies in response to selection, yet none of these variants ever reached fixation. Consistent with this observation, when we measured the population frequencies of polymorphisms in D. melanogaster at different times of the year in the wild, we found hundreds of polymorphisms that systematically cycle between seasons, often showing frequency differences on the order of 20% or larger.
One possible mechanism for generating incomplete sweeps is heterozygote advantage, which can cause the adaptive mutation to be maintained at an intermediate population frequency. We derived an intriguing theoretical explanation for why heterozygote advantage should indeed be common during adaptation in diploid species. One way to see this is to consider that when a new mutation first arises in a diploid population, it primarily exists in heterozygotes. In order for the variant to become more common in the population, it thus needs to be beneficial in heterozygotes. However, the mutation does not actually need to be beneficial in homozygotes. If selection is stabilizing and mutations are sufficiently large, homozygotes can then often "overshoot" the fitness optimum. These mutations will be maintained at intermediate population frequencies and result in adaptive dynamics that differ quite substantially from the classic picture. Strikingly, in this scenario adaptation promotes rather than exhausts genetic variation.
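The overshooting argument can be made concrete with a small sketch. It assumes a single trait under Gaussian stabilizing selection, a mutation whose phenotypic effect is additive (heterozygotes shift the trait by one step, homozygotes by two), and the standard deterministic one-locus recursion for a random-mating diploid population. All function names and parameter values are hypothetical choices for illustration, not the model from our study.

```python
import math


def gaussian_fitness(z, optimum, width=1.0):
    """Fitness of phenotype z under Gaussian stabilizing selection."""
    return math.exp(-((z - optimum) ** 2) / (2.0 * width ** 2))


def genotype_fitnesses(effect, optimum, width=1.0):
    """Fitnesses of wild-type, heterozygote, and mutant homozygote.

    The wild type sits at phenotype 0; each mutant copy adds `effect`.
    """
    w_wt = gaussian_fitness(0.0, optimum, width)
    w_het = gaussian_fitness(effect, optimum, width)
    w_hom = gaussian_fitness(2.0 * effect, optimum, width)
    return w_wt, w_het, w_hom


def next_frequency(p, w_wt, w_het, w_hom):
    """One generation of selection on mutant allele frequency p."""
    q = 1.0 - p
    mean_w = p * p * w_hom + 2.0 * p * q * w_het + q * q * w_wt
    return (p * p * w_hom + p * q * w_het) / mean_w


def iterate(p0, generations, w_wt, w_het, w_hom):
    """Iterate the deterministic recursion for a number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w_wt, w_het, w_hom)
    return p


def equilibrium_frequency(w_wt, w_het, w_hom):
    """Stable interior equilibrium, or None without heterozygote advantage."""
    if not (w_het > w_wt and w_het > w_hom):
        return None
    return (w_het - w_wt) / (2.0 * w_het - w_wt - w_hom)


if __name__ == "__main__":
    # A large mutation: the heterozygote lands on the optimum, while the
    # homozygote overshoots it by as much as the wild type falls short.
    w = genotype_fitnesses(effect=1.0, optimum=1.0)
    print(equilibrium_frequency(*w), iterate(0.001, 2000, *w))
```

In this sketch a small mutation (for example an effect of 0.2 with the optimum at 1.0) has no interior equilibrium and simply sweeps, whereas a large one is held at an intermediate frequency.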
Under the paradigm of the neutral theory of molecular evolution, where the bulk of natural molecular variation is assumed to be selectively neutral, the effects of linkage between different polymorphisms, so-called Hill-Robertson interference (HRI), have generally been neglected in population genetic models. However, recent studies have revealed that in many species adaptation appears to be much more frequent than assumed by the neutral theory. In D. melanogaster, for example, applications of the McDonald-Kreitman (MK) test suggest that more than 50% of the amino-acid-changing substitutions were adaptive in this species, implying that HRI from recurrent selective sweeps might also be common. In addition, there is accumulating evidence that many polymorphisms in natural populations are slightly deleterious, and such polymorphisms are expected to generate another kind of HRI, so-called background selection.
These findings raise the question of whether it is indeed reasonable to neglect HRI when modeling evolutionary dynamics, and to what extent population genetic methods built on this assumption are biased under realistic scenarios. Since the MK test itself assumes that most observable polymorphisms are selectively neutral, this also raises a fundamental problem of consistency.
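To show where the neutrality assumption enters, here is a minimal sketch of a widely used MK-based estimator of the fraction of adaptive amino-acid substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). The function name and the counts in the example are hypothetical and are not data from any study.

```python
def mk_alpha(d_n, d_s, p_n, p_s):
    """Estimate the fraction of adaptive nonsynonymous substitutions.

    d_n, d_s: nonsynonymous and synonymous divergence counts.
    p_n, p_s: nonsynonymous and synonymous polymorphism counts.

    The estimator treats the polymorphism ratio p_n / p_s as the neutral
    expectation for the divergence ratio, so slightly deleterious
    polymorphisms or linked selection can bias it.
    """
    if d_n == 0 or p_s == 0:
        raise ValueError("d_n and p_s must be non-zero")
    return 1.0 - (d_s * p_n) / (d_n * p_s)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    print(mk_alpha(d_n=100, d_s=100, p_n=50, p_s=100))
```

Because the polymorphism counts serve as the neutral reference, any selection acting on them feeds directly into the estimate.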
We are currently investigating the effects of frequent adaptation and its interplay with deleterious mutations on the patterns of molecular variation and evolution using large-scale forward simulations. In our simulations the processes of mutation, recombination, and selection are modeled explicitly in all individuals of the population, and it is thus possible to investigate arbitrary distributions of fitness effects of new mutations. We use these simulations to evaluate the consistency of commonly used approaches to infer evolutionary parameters from population diversity and divergence data.
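As a rough illustration of what such a forward simulation involves, the following sketch evolves a small haploid Wright-Fisher population with explicit mutation, recombination, and selection at a handful of biallelic loci. It is a toy model under strong simplifying assumptions (haploid genomes, multiplicative fitness, two parents per offspring) and not our simulation code; all names and parameter values are hypothetical. Per-locus selection coefficients are passed in as a list, so they can be drawn from any distribution of fitness effects.

```python
import random


def simulate(pop_size, effects, generations, mu, r, seed=1):
    """Forward-simulate a haploid Wright-Fisher population.

    effects: per-locus selection coefficients of the derived allele
             (each must be greater than -1).
    mu:      per-locus, per-generation mutation probability.
    r:       probability of a crossover after each locus.
    Returns the final population as a list of 0/1 allele lists.
    """
    rng = random.Random(seed)
    n_loci = len(effects)
    pop = [[0] * n_loci for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: multiplicative fitness across loci.
        weights = []
        for genome in pop:
            w = 1.0
            for allele, s in zip(genome, effects):
                if allele:
                    w *= 1.0 + s
            weights.append(w)
        new_pop = []
        for _ in range(pop_size):
            # Parents are sampled in proportion to fitness.
            src, other = rng.choices(pop, weights=weights, k=2)
            child = []
            for i in range(n_loci):
                child.append(src[i])
                # Recombination: switch the template parent.
                if rng.random() < r:
                    src, other = other, src
            # Mutation: flip the allele at each locus with probability mu.
            for i in range(n_loci):
                if rng.random() < mu:
                    child[i] = 1 - child[i]
            new_pop.append(child)
        pop = new_pop
    return pop


if __name__ == "__main__":
    # Example: selection coefficients drawn from an arbitrary distribution.
    rng = random.Random(42)
    effects = [rng.gauss(0.0, 0.02) for _ in range(10)]
    final = simulate(pop_size=200, effects=effects, generations=100,
                     mu=1e-3, r=0.1)
    print([sum(g[i] for g in final) / len(final) for i in range(10)])
```

Because every individual is tracked explicitly, linkage effects between selected sites arise naturally in the simulated data rather than having to be modeled separately.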
In order to fully understand the role of adaptation, it is essential to first understand the non-adaptive forces that shape evolution, such as mutation, purifying selection, and demography. During my PhD at the Max-Planck-Institute for Molecular Genetics in Berlin I investigated the elementary patterns of mutational processes by applying analytic and modeling methods together with comparative genomics approaches to genome-level datasets. These studies yielded interesting findings on how insertions and deletions contribute to genome evolution [5,6], how they shape statistical properties of genomes [7,8], and how this can affect commonly used bioinformatics methods such as sequence alignment [9,10,11].
At Stanford, I developed a new method to estimate the rates and patterns of mutation from the low-frequency polymorphism data gathered from deep sequencing projects. This method overcomes many of the problems of indirect estimates from divergence or heterozygosity, which typically suffer from unknown selective and demographic biases.
We also developed a maximum-likelihood framework to infer the strength of purifying selection under complex demographies and applied this approach to a particular family of transposable elements in D. melanogaster. This study highlighted the importance of accounting for demographic history when inferring selection.
In another research project we investigated the interplay between mutational biases and purifying selection. Surprisingly, we were able to show that mutational biases can cause constrained sequences to evolve faster than expected under neutrality, provided that selection is weak and mutational biases favor the states that selection disfavors. We investigated how this phenomenon can, in practice, affect comparative genomics methods used for the detection of constraints. This study demonstrated that accounting for mutational biases and weak selection is necessary to accurately infer regions of the genome evolving under purifying selection.
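The effect can be reproduced in a minimal two-state sketch. It assumes that each site is either in a preferred or an unpreferred state, that substitutions occur one at a time, and that the fixation probability of a new mutation relative to a neutral one is S / (1 - exp(-S)) for a scaled selection coefficient S. The function names, the rates u and v, and the example values are hypothetical; this is an illustration of the idea and not necessarily the model analyzed in our study.

```python
import math


def relative_fixation_rate(S):
    """Fixation probability relative to a neutral mutation.

    S is the population-scaled selection coefficient of the new mutation
    (positive if favored, negative if disfavored).
    """
    if abs(S) < 1e-12:
        return 1.0
    return S / (1.0 - math.exp(-S))


def relative_substitution_rate(u, v, S):
    """Equilibrium substitution rate relative to the neutral rate.

    u: mutation rate from the preferred to the unpreferred state.
    v: mutation rate from the unpreferred to the preferred state.
    S: scaled strength of selection favoring the preferred state.
    """
    f_adv = relative_fixation_rate(S)    # unpreferred -> preferred
    f_del = relative_fixation_rate(-S)   # preferred -> unpreferred
    # Equilibrium fraction of sites in the preferred state.
    x_pref = v * f_adv / (v * f_adv + u * f_del)
    rate = x_pref * u * f_del + (1.0 - x_pref) * v * f_adv
    neutral_rate = 2.0 * u * v / (u + v)
    return rate / neutral_rate


if __name__ == "__main__":
    # Mutation biased tenfold toward the state that selection disfavors.
    for S in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(S, relative_substitution_rate(u=10.0, v=1.0, S=S))
```

With a tenfold bias toward the disfavored state, this ratio exceeds 1 for weak selection and falls below 1 once selection becomes strong; without mutational bias it stays at or below 1.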