Abstract

Motivation: At present there is no age estimate for the different protein structures found in nature. Occurrence studies have made it clear that different folds arose at different points in evolutionary time. An estimate of the age of each fold would be a starting point for many investigations into protein structure evolution, that is, into how we arrived at the set of folds we see today. It would also be a powerful tool in protein structure classification, allowing us to reassess the available hierarchical methods and perhaps suggest improvements.

Results: We have created the first relative age estimation technique for protein folds. Our method is based on constructing parsimonious scenarios that can describe occurrence patterns in a phylogeny of species. The ages presented are shown to be robust to the choice of tree and data type used to generate them. They correlate with other previously used protein age estimators, but appear to be far more discriminating than any previously suggested technique. The age estimates are relative rather than absolute, but they already offer intriguing insights, such as the very different age patterns of α/β folds compared with small folds: the α/β folds appear on average to be far older than their small-fold counterparts.

Availability: Example trees and additional material are available at http://www.stats.ox.ac.uk/~abeln/foldage

Contact: deane@stats.ox.ac.uk

Supplementary information: http://www.stats.ox.ac.uk/~abeln/foldage