LASER: a maximum likelihood toolkit for detecting temporal shifts in diversification rates from molecular phylogenies.

Rates of species origination and extinction can vary over time during evolutionary radiations, and it is possible to reconstruct the history of diversification using molecular phylogenies of extant taxa only. Maximum likelihood methods provide a useful framework for inferring temporal variation in diversification rates. LASER is a package for the R programming environment that implements maximum likelihood methods based on the birth-death process to test whether diversification rates have changed over time. LASER contrasts the likelihood of phylogenetic data under models where diversification rates have changed over time to alternative models where rates have remained constant over time. Major strengths of the package include the ability to detect temporal increases in diversification rates and the inference of diversification parameters under multiple rate-variable models of diversification. The program and associated documentation are freely available from the R package archive at http://cran.r-project.org.


Introduction
Recent years have seen an explosive proliferation of DNA sequence data for molecular phylogenetic analyses and a commensurate increase in the use of these data to draw inferences about macroevolutionary processes. A particularly active area of research involves the use of molecular phylogenies to study variation in rates of species origination and extinction, both among lineages (Slowinski and Guyer, 1989;Mooers and Heard, 1997) and over time (Nee et al. 1994;Paradis, 1997).
Likelihood Analysis of Speciation and Extinction Rates (LASER) is a package for the R programming environment that facilitates model-based analyses of diversifi cation rates. LASER is the fi rst software package to implement tests for temporal variation in diversifi cation rates using likelihood methods based on the birth-death process (Nee et al. 1994). LASER is licensed under the GNU General Public License and complements the existing R libraries 'ape' (Paradis et al. 2004) and 'apTreeshape' (Bortolussi et al. 2006), which provide functions for phylogenetic tree manipulation and the analysis of among-lineage heterogeneity in diversifi cation rates.
LASER was written to address several limitations of existing software for analyzing the tempo of diversifi cation. Approaches such as the gamma statistic (Pybus and Harvey, 2000) and survival analysis (Paradis, 1997), which are implemented in the R library 'ape' (Paradis et al. 2004), test for departures from the pure-birth model of cladogenesis, and can only be used to infer temporal decreases in diversifi cation rates (Nee, 2001;Rabosky, 2006). These methods are thus unable to address many questions of interest to evolutionary biologists, such as whether temperate faunas experienced elevated speciation rates during the Pleistocene (Weir and Schluter, 2004). Furthermore, existing methods suffer reduced power to detect temporal decreases in diversifi cation rates when clades have diversifi ed under high background extinction rates (Rabosky 2006).
LASER fi ts a candidate set of rate-variable diversifi cation models to phylogenetic data and contrasts the likelihood of the data under these models to alternatives where speciation and extinction rates have remained constant over time. The null hypothesis that diversifi cation rates have not changed over time is tested using the statistical approach described in Rabosky (2006). The test statistic for constancy of diversifi cation rates is computed as where AIC RC is the Akaike Information Criterion (AIC) score for the best-fi t rate-constant model of diversifi cation, and AIC RV is the AIC score for the best-fi t rate-variable model under consideration. Thus, a positive ΔAIC RC value suggests that the data are best approximated by a rate-variable model of diversifi cation. Although several previous studies have used the AIC to distinguish among rate-constant and rate-variable models of diversifi cation (Barraclough and Vogler, 2002;Turgeon et al. 2005), Rabosky (2006) found that this approach results in high Type I error rates unless critical values of the ΔAIC RC distribution are explicitly addressed through simulation.
The LASER package provides a comprehensive toolkit for computing ΔAIC RC for test phylogenies and for comparing the observed ΔAIC RC statistic to its distribution under the null hypothesis. This is the fi rst available approach that can detect temporal increases in diversifi cation rates, and extensive simulation has shown that the method has greater power than other methods to detect temporal declines in diversifi cation rates when clades have diversifi ed under elevated background extinction rates (Rabosky, 2006).
Additional strengths of the model-fitting approach implemented in the LASER include the ability to test hypotheses of rate variation while estimating relevant diversification parameters. Furthermore, the package can be used to test a priori hypotheses of temporal rate variation. The R programming environment used by LASER provides great fl exibility, and likelihood functions in the package can easily be tailored to a variety of statistical applications. For example, one can generate posterior distributions of diversifi cation parameters using the posterior distribution of phylogenetic tree topologies and branch lengths sampled using Markov chain Monte Carlo (MCMC) methods (e.g. Huelsenbeck and Ronquist, 2001).

Usage
LASER operates on sets of branching times derived from ultrametric phylogenetic trees, and provides functions for obtaining branching times from several input formats, including the widely used 'Newick' (parenthetic) tree format. Likelihoods and parameter estimates can be obtained for a range of rate-variable diversifi cation models, including logistic and exponential density-depedent models and multi-rate birth-death models. Additional functions permit batch processing of multiple phylogenies to obtain the null distribution of ΔAIC RC or posterior distributions of diversifi cation parameters.
The function fi tdAICrc computes the ΔAIC RC test statistic for a test phylogeny using arguments that specify the candidate set of rate-variable models to be considered. The null distribution of the test statistic is obtained by either simulating branching times with the function yuleSim or by importing simulated trees using the function getBtimes.batch. The latter function is particularly useful for the analysis of phylogenies with incomplete taxon sampling, because incomplete sampling can result in a spurious decline in diversifi cation rates over time (Pybus and Harvey, 2000). To address this problem in LASER, one can simply generate rate-constant phylogenies with incomplete sampling using PhyloGen (Rambaut, 2002) or other software and import the trees into LASER to tabulate the null distribution of the ΔAIC RC test statistic.
A call to the function fi tdAICrc.batch will then generate the null distribution of the test statistic and return the probability of the observed ΔAIC RC index under the null hypothesis. Functions are available to call any rate-variable and rate-constant diversification models individually, and additional functions permit exploration of diversifi cation patterns for any user-defi ned temporal interval.

Diversifi cation Models
Rate-constant diversifi cation models implemented in LASER include the pure birth model, with a constant speciation rate λ > 0, and the birth-death model, with λ > 0 and extinction rate μ ≥ 0. Seven rate-variable diversifi cation models are provided, including density-dependent and multi-rate variants of the pure birth and birth-death models. The package includes both logistic and exponential density-dependent speciation models. Under the logistic density-dependent model of cladogenesis, the speciation rate λ at time t is modeled as where λ 0 is the initial speciation rate, N t is the number of lineages at time t in a reconstructed phylogeny, and K is analogous to the carrying capacity parameter of population ecology. Speciation rates are modeled under an exponential density-dependent process as x controls the magnitude of the rate change with respect to the number of lineages at any point in time in the reconstructed phylogenetic tree.
Multi-rate variants of the pure birth and birthdeath model assume the existence of one or more breakpoints in time, such that a clade has diversifi ed under one set of diversifi cation parameters before the breakpoint and another set of parameters after the breakpoint. For example, LASER includes a two-rate pure birth model with three parameters: the initial speciation rate, the fi nal speciation rate, and the time of the rate shift. Turgeon et al. (2005), tested whether Holarctic damselflies in the genus Enallagma showed evidence for increased diversification rates during the Quaternary. They found evidence for a recent increase in speciation rates, suggesting a role for Pleistocene glacial cycles in damselfl y diversifi cation. Their conclusions were based on a model-fi tting approach similar to that described above. However, as noted in Rabosky (2006), this method can result in high Type I error rates. To explicitly address this problem, I used LASER to compute ΔAIC RC for the Enallagma phylogeny from Turgeon et al. (2005) and to tabulate the null distribution of the statistic as follows: Figure 1. Distribution of the ΔAIC RC test statistic for 5000 rate-constant phylogenies of the same size as the Enallagma phylogeny. The calculated ΔAIC RC for Enallagma was 13.0372 and indicates a highly signifi cant temporal increase in the net diversifi cation rate over time (p = 0.0016). data.bt <-getBtimes(fi le = 'enallagma.tre') summary <-fi tdAICrc(data.bt)

Example: Holarctic Damselfl y Radiation
The fi rst line creates a vector data.bt of the branching times for Enallagma by reading the parenthetic tree stored in enallagma.tre. The function fi tdAICrc generates an object, summary, that contains the results of fi tting all rate-variable and rate-constant models to the data. The observed ΔAIC RC statistic for Enallagma is 13.0372, and the best fi t model is a three-parameter rate-variable model specifying a 12.5-fold increase in the speciation rate over time. The signifi cance of the observed ΔAIC RC statistic was assessed with the following commands: null.bt <-yuleSim(37,5000) fi tdAICrc.batch (null.bt, stat = 13.0372) The fi rst line simulates 5000 phylogenies with the same number of tips as the Enallagma tree (37) under the null hypothesis of rate-constancy and stores them in object null.bt. The set of simulated phylogenies is analyzed using fi tdAICrc. batch, which approximates the probability of the observed ΔAIC RC statistic under the null hypothesis. In this case, the observed ΔAIC RC statistic indicates a highly signifi cant departure from the null hypothesis of rate-constancy (p = 0.0016; Fig. 1), supporting the conclusions of Turgeon et al. (2005).

Conclusion
LASER fits multiple rate-variable and rateconstant models of diversifi cation to reconstructed phylogenies using maximum likelihood. Its main strength includes the use of Monte Carlo simulation to control for elevated Type I error rates associated with likelihood-based analyses of diversifi cation. LASER is the fi rst available package that can detect temporal increases in diversifi cation rates, and has considerable power to detect temporal declines in diversifi cation rates when clades have diversifi ed under high background extinction rates. As a freely available package for the R programming environment, it is fl exible and platform-independent, and can easily be tailored to a variety of user-specifi c applications.