Motivation: A new method for finding subtle patterns in sequences is introduced. It approximates the multiple correlations among residues with pairwise correlations, at a learning cost of O(m^2 n), where n is the number of training sequences, each of length m. The method is well suited to modeling splice sites in human DNA, which are reported to have higher-order dependencies.
Results: In computational experiments, the prediction accuracy of our model surpassed that of previously reported Markov models for the prediction of acceptor sites in human DNA.
Availability: The C++ source code is available on request from the authors.
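Illustration: The O(m^2 n) learning cost stated in the Motivation can be made concrete with a minimal sketch. The Python code below is not the authors' C++ implementation and does not reproduce their model. It only assumes that learning requires co-occurrence counts for every pair of positions, and the function name pairwise_counts and the toy sequences are hypothetical. Visiting all m(m-1)/2 position pairs in each of n sequences gives the quadratic-in-m, linear-in-n cost.

```python
from collections import Counter
from itertools import combinations


def pairwise_counts(seqs):
    """Count co-occurring residues for every pair of positions (i < j).

    Hypothetical helper for illustration only: it gathers the raw
    pairwise statistics, not the model described in the abstract.
    """
    m = len(seqs[0])
    if any(len(s) != m for s in seqs):
        raise ValueError("all sequences must have the same length")
    counts = Counter()
    for s in seqs:                              # n training sequences
        for i, j in combinations(range(m), 2):  # m*(m-1)/2 position pairs
            counts[(i, j, s[i], s[j])] += 1
    return counts


# Toy data (made up): n = 3 sequences of length m = 4.
seqs = ["ACGT", "ACGA", "TCGT"]
counts = pairwise_counts(seqs)
print(counts[(0, 1, "A", "C")])  # 2
print(sum(counts.values()))      # 18, i.e. 3 sequences * 6 position pairs
```

The total number of counter updates is n * m(m-1)/2, which is where the O(m^2 n) bound comes from under the stated assumption.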