Comparative Methods for Gene Structure Prediction in Homologous Sequences

Christian N. S. Pedersen, Tejs Scharling


The increasing number of sequenced genomes motivates the use of evolutionary patterns to detect genes. We present a series of comparative methods for gene finding in homologous prokaryotic or eukaryotic sequences. Based on a model of legal genes and a similarity measure between genes, we find the pair of legal genes of maximum similarity. We develop methods based on genes models and alignment based similarity measures of increasing complexity, which take into account many details of real gene structures, e.g. the similarity of the proteins encoded by the exons. When using a similarity measure based on an exiting alignment, the methods run in linear time. When integrating the alignment and prediction process which allows for more fine grained similarity measures, the methods run in quadratic time. We evaluate the methods in a series of experiments on synthetic and real sequence data, which show that all methods are competitive but that taking the similarity of the encoded proteins into account really boost the performance.

Full Text:


This website uses cookies to allow us to see how the site is used. The cookies cannot identify you or any content at your own computer.

ISSN: 0909-0878 

Hosted by the State and University Library and Aarhus University Library