Abstract
Sequence alignment is an indispensable computational tool in modern molecular biology. The model underlying biological sequence alignment is of interest to physicists because it approximates the statistical mechanics of DNA and protein annealing, while bearing an intimate relationship to models of directed polymers in random media. Recent methods for determining the statistics of random sequence alignments have reduced the computation time to less than 1 s, opening up some interesting possibilities for online computation with biological search engines. Before implementation, however, the methods required an objective technique for computing regression coefficients pertinent to an asymptotic regime. Typically, physicists estimate parameters pertinent to an asymptotic regime subjectively: They eyeball their data; estimate the asymptotic regime where the regression model holds with reasonable accuracy; and then regress data only within the estimated asymptotic regime. Our publicly available computer program arrp replaces the subjective assessment of the asymptotic regime with an objective change-point detection method, increasing confidence in the scientific objectivity of the parameter estimates. Asymptotic regression has potential applications across most of physics.
- Received 2 March 2011
DOI:https://doi.org/10.1103/PhysRevE.84.031914
Published by the American Physical Society