On the alignment space

Conf Proc IEEE Eng Med Biol Soc. 2005:2006:244-7. doi: 10.1109/IEMBS.2005.1616389.

Abstract

This paper studies sequences with generalized errors, which are called mutations in bioinformatics, and generalized error-correcting codes. Such sequences are discussed in bioinformatics, computer science, and information theory, each field with different aims. First, we define the alignment distance and the Levenshtein distance by means of expansion sequences and discuss their properties and relations. We then introduce the modular structure theory in order to describe expansion sequences rigorously. We show that the expansion modular structures of sequences form a Boolean algebra. As an application of the modular structure theory, we give a new and more rigorous proof of the triangle inequality for the alignment distance. Finally, we study the definition and construction of generalized error-correcting codes and list some optimal codes of small length.
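
For readers unfamiliar with the Levenshtein distance mentioned above, the following is a minimal sketch of its standard dynamic-programming formulation (unit cost for insertion, deletion, and substitution). It is not the expansion-sequence definition used in the paper, whose details the abstract does not give. The function names `levenshtein` and `triangle_holds`, the two-letter alphabet, and the length bound are assumptions made for illustration only. The second function checks the triangle inequality by brute force on short words, which illustrates the property but proves nothing.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance with unit-cost insert, delete, substitute."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def triangle_holds(alphabet: str = "AC", max_len: int = 3) -> bool:
    """Exhaustively check d(x, z) <= d(x, y) + d(y, z) on short words.

    The alphabet and length bound are illustrative assumptions.
    """
    words = [""] + [
        "".join(p)
        for n in range(1, max_len + 1)
        for p in product(alphabet, repeat=n)
    ]
    # Precompute all pairwise distances once.
    d = {(x, y): levenshtein(x, y) for x in words for y in words}
    return all(
        d[x, z] <= d[x, y] + d[y, z]
        for x in words for y in words for z in words
    )
```

Calling `levenshtein("kitten", "sitting")` computes the distance between two words, and `triangle_holds()` reports whether the inequality holds on every triple of words of length at most 3 over the chosen alphabet.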