DIMACS TR: 2001-02
Multiple sequence alignment using the quasi-concave function optimization based on the DIALIGN combinatorial structures
Authors: Leonid Shvartser, Casimir Kulikowski, Ilya Muchnik
ABSTRACT
Multiple sequence alignment is usually considered as an optimization problem,
which has a statistical and a structural component. It is known that in the
problem of protein sequence alignment a processed sample is too small and not
representative in the statistical sense though this information can be
sufficient if an appropriate structural model is used. In order to utilize
this information a new structural description of the pairwise alignment
results union has been developed. It is shown that if the structure is
restored then Multiple Sequence Alignment is achieved. Introduced structure
represents the set of local maximums of quasi-concave set function on a lower
semi lattice, which in turn is a union of the set-theoretical intervals. This
union is a set of the consistent subsets of diagonals, introduced by B.
Morgenstern, A. Dress, and T. Werner (1996). Algorithm for local maximums
search on proposed structure has been developed. It consists of an
alternation of the Forward and Backward passes. The Backward pass in this
algorithm is a rigorous while the Forward pass is based on heuristics.
Multiple alignment of 5 protein sequences are used as an illustration of the
proposed algorithm.
Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2001/2001-02.ps.gz
DIMACS Home Page