SLiM (Selection on Linked Mutations) is a forward population genetic simulation program for studying linkage effects such as hitchhiking under recurrent selective sweeps, background selection, and Hill-Robertson interference. The program can incorporate complex scenarios of demography and population substructure, various models for selection and dominance of new mutations, realistic gene and chromosome structure, and user-defined recombination maps. Special emphasis was placed on the ability to model and track individual selective sweeps, both complete and partial. While retaining all capabilities of a forward simulation, SLiM uses optimized algorithms and data structures that enable simulations on the scale of entire eukaryotic chromosomes in reasonably large populations. All these features are implemented in an easy-to-use C++ command-line program.
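
For readers unfamiliar with forward simulation, the basic idea can be sketched in a few lines of Python. This is not SLiM code and does not reflect SLiM's algorithms or input format: it assumes a haploid Wright-Fisher population and a single beneficial allele at one unlinked site, so it shows none of the linkage effects SLiM is designed for. The function name and all parameters are hypothetical.

```python
import random


def wright_fisher_sweep(pop_size, sel_coeff, start_count, generations, seed=0):
    """Follow the copy number of one beneficial allele forward in time.

    Assumptions (illustrative only): haploid population of constant size,
    non-overlapping generations, relative fitness 1 + sel_coeff for carriers.
    """
    rng = random.Random(seed)
    count = start_count
    trajectory = [count]
    for _ in range(generations):
        if count == 0 or count == pop_size:
            # Allele is lost or fixed; without new mutation nothing changes.
            trajectory.append(count)
            continue
        freq = count / pop_size
        # Expected carrier frequency among offspring after selection.
        p_sel = freq * (1 + sel_coeff) / (1 + sel_coeff * freq)
        # Genetic drift: each offspring independently samples a parent type.
        count = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        trajectory.append(count)
    return trajectory


if __name__ == "__main__":
    traj = wright_fisher_sweep(pop_size=500, sel_coeff=0.05,
                               start_count=25, generations=200)
    print("final copy number:", traj[-1])
```

Each generation applies selection to the expected frequency and then resamples the whole population, which is the step that makes forward simulations expensive for large populations and long chromosomes.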

Example of an evolutionary scenario that can be modeled with SLiM (example 2 in the documentation): The scenario features a population split with a subsequent bottleneck, followed by the introgression of an adaptive mutation from the source population into the new subpopulation.


Source code and examples

SLiM is distributed under the GNU General Public License (GPL)


PW Messer (2013)
SLiM: Simulating Evolution with Selection and Linkage. Genetics. 194:1037

Indel Trace Extension

The indel trace extension method evaluates whether an inserted sequence segment is in fact a duplication of an adjacent sequence segment, rather than just a random piece of DNA. Furthermore, it makes it possible to analyze whether duplicates were already present at the insertion site before the duplication event occurred. Likewise, the method can detect whether a deletion event removed one copy of a preexisting duplicate.

The figure on the right illustrates an example of the trace extension method. The top part of the figure shows the dot-matrix of an indel. Different relations between the indel length l and its trace extension d distinguish four classes of events, shown in the bottom part of the figure.
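
The idea of comparing an indel with its flanking sequence can be sketched as follows. This is not the distributed source code, and the exact definition of d and of the four event classes should be taken from the paper and documentation. The sketch assumes one simple reading: d is the number of consecutive matches along the dot-matrix diagonal offset by the indel length l, counted outward in both directions from the start of the inserted segment. The function name and argument names are hypothetical.

```python
def trace_extension(seq, start, length):
    """Count consecutive matches between seq[i] and seq[i + length]
    around an inserted segment seq[start:start + length].

    Assumed definition (illustrative only): matches are counted to the
    right starting at `start` and to the left starting at `start - 1`,
    and the two run lengths are added.
    """
    # Extend to the right: inserted segment versus the sequence after it.
    right = 0
    while (start + length + right < len(seq)
           and seq[start + right] == seq[start + length + right]):
        right += 1
    # Extend to the left: sequence before the insert versus the insert's end.
    left = 0
    while (start - 1 - left >= 0
           and seq[start - 1 - left] == seq[start + length - 1 - left]):
        left += 1
    return left + right


if __name__ == "__main__":
    l = 4
    random_insert = "GGTA" + "CCCC" + "TGAT"        # insert unrelated to flanks
    tandem = "GGTA" + "CATG" + "CATG" + "TCCA"      # insert copies its neighbor
    preexisting = "GGTA" + "CATG" * 3 + "TCCA"      # neighbor was already doubled
    for name, s in [("random", random_insert), ("tandem", tandem),
                    ("preexisting", preexisting)]:
        print(name, "l =", l, "d =", trace_extension(s, 4, l))
```

Under this assumed definition the three toy sequences give d = 0, d = l, and d = 2l respectively, which illustrates how the relation between l and d can separate a random insertion from a tandem duplication and from a duplication at a site that already carried two copies.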

Source code and documentation


PW Messer, PF Arndt (2007)
The majority of recent short DNA insertions in the human genome are tandem duplications. Mol Biol Evol. 24:1190


Long-range correlations in DNA are characterized by a power-law decaying autocorrelation function of the sequence composition. Given a DNA sequence as input, CorGen can measure its composition correlation function and determine amplitude and decay exponent of present long-range correlations. The obtained parameters can then be used to generate random sequences with the same correlation parameters and average sequence composition as the query sequence. CorGen can also generate sequences with user specified long-range correlations and GC-content.
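
The measurement step can be sketched in Python as follows. This is not CorGen's implementation and its estimator may differ: the sketch assumes a binary GC/AT encoding, the correlation function C(r) = <x_i x_{i+r}> - <x>^2 estimated with the global mean, and a power law C(r) = A r^(-alpha) fitted by ordinary least squares on log-log axes using only lags with positive C(r). All function names are hypothetical.

```python
import math


def gc_correlation(seq, max_lag):
    """Estimate the GC composition correlation C(r) for r = 1..max_lag."""
    x = [1.0 if base in "GCgc" else 0.0 for base in seq]
    n = len(x)
    mean = sum(x) / n
    corr = {}
    for r in range(1, max_lag + 1):
        pairs = n - r
        corr[r] = sum((x[i] - mean) * (x[i + r] - mean)
                      for i in range(pairs)) / pairs
    return corr


def fit_power_law(corr):
    """Fit C(r) = A * r**(-alpha) by least squares in log-log space.

    Only lags with C(r) > 0 are used (an assumption of this sketch).
    Returns (A, alpha).
    """
    pts = [(math.log(r), math.log(c)) for r, c in corr.items() if c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two lags with positive correlation")
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    slope = sxy / sxx
    amplitude = math.exp(my - slope * mx)
    return amplitude, -slope


if __name__ == "__main__":
    # Synthetic check: an exact power law is recovered by the fit.
    exact = {r: 0.5 * r ** -0.7 for r in range(1, 101)}
    print(fit_power_law(exact))
```

On real sequences the estimate of C(r) becomes noisy at large r, so the range of lags included in the fit matters; the sketch leaves that choice to the caller.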

CorGen homepage


PW Messer, PF Arndt (2006)
CorGen -- measuring and generating long-range correlations for DNA sequence analysis. Nucleic Acids Res. 34:W692