Small Coupling Expansion for Multiple Sequence Alignment

7 Oct 2022 · Louise Budzynski, Andrea Pagnani

The alignment of biological sequences such as DNA, RNA, and proteins is one of the basic tools for detecting evolutionary patterns and for characterizing functional and structural features shared by homologous sequences in different organisms. Typically, state-of-the-art bioinformatics tools are based on profile models that assume the different sites of a sequence are statistically independent. In recent years, it has become increasingly clear that homologous sequences show complex patterns of long-range correlations along the primary sequence, a consequence of natural evolution selecting genetic variants under the constraint of preserving the functional and structural determinants of the sequence. Here, we present a new alignment algorithm based on message-passing techniques that overcomes the limitations of profile models. Our method relies on a new perturbative small-coupling expansion of the free energy of the model, which takes a linear chain approximation as the $0^\mathrm{th}$ order of the expansion. We test the potential of the algorithm against standard competing strategies on several biological sequences.
