Modeling angles in proteins and circular genomes using multivariate angular distributions based on multiple nonnegative trigonometric sums

Stat Appl Genet Mol Biol. 2014 Feb;13(1):1-18. doi: 10.1515/sagmb-2012-0012.

Abstract

Fernández-Durán, J. J. (2004): "Circular distributions based on nonnegative trigonometric sums," Biometrics, 60, 499-503, developed a family of univariate circular distributions based on nonnegative trigonometric sums. In this work, we extend this family to the multivariate case by using multiple nonnegative trigonometric sums to model the joint distribution of a vector of angular random variables. Practical examples of such vectors include the wind directions at different monitoring stations, the directions taken by an animal on different occasions, the times at which a person performs different daily activities, and the dihedral angles of a protein molecule. We apply the proposed family of multivariate distributions to three real data sets: two from the study of protein structure and one from genomics. The first consists of a bivariate vector of dihedral angles in proteins. For the second, which concerns orthologous genes in pairs of circular genomes, we compare the fit of the proposed multivariate model with that of the bivariate generalized von Mises model of Shieh, G. S., S. Zheng, R. A. Johnson, Y.-F. Chang, K. Shimizu, C.-C. Wang, and S.-L. Tang (2011): "Modeling and comparing the organization of circular genomes," Bioinformatics, 27(7), 912-918. The third consists of observed values of three dihedral angles in γ-turns in a protein and serves as an example of trivariate angular data. In addition, we present a simulation algorithm to generate realizations from the proposed multivariate angular distribution.
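To make the idea of a density built from multiple nonnegative trigonometric sums more concrete, the following Python sketch may help. It assumes the squared-modulus form, in which the density is the squared absolute value of a complex trigonometric sum whose coefficients are scaled so that the density integrates to one; the abstract itself does not state this parametrization. The function names `normalize`, `nnts_density`, and `nnts_sample` are hypothetical, and the sampler is a plain acceptance-rejection scheme with a uniform proposal, which is not necessarily the simulation algorithm presented in the paper.

```python
import numpy as np


def normalize(c):
    """Scale coefficients so that sum |c|^2 = 1 / (2*pi)^d (assumed constraint)."""
    c = np.asarray(c, dtype=complex)
    return c / np.sqrt((np.abs(c) ** 2).sum() * (2 * np.pi) ** c.ndim)


def nnts_density(theta, c):
    """Evaluate f(theta) = |sum_k c[k] * exp(i * k . theta)|^2 at n points.

    theta: array of shape (n, d) with angles in radians.
    c: complex array of shape (M1 + 1, ..., Md + 1).
    """
    theta = np.atleast_2d(np.asarray(theta, dtype=float))
    c = normalize(c)
    n, d = theta.shape
    if d != c.ndim:
        raise ValueError("theta must have one column per dimension of c")
    s = np.broadcast_to(c, (n,) + c.shape)
    for j in range(d):
        k = np.arange(c.shape[j])
        basis = np.exp(1j * np.outer(theta[:, j], k))  # shape (n, Mj + 1)
        # Contract the leading coefficient axis against the basis, point by point.
        s = np.einsum("nk...,nk->n...", s, basis)
    return np.abs(s) ** 2


def nnts_sample(c, size, rng=None):
    """Draw `size` angle vectors by acceptance-rejection (illustrative only)."""
    rng = np.random.default_rng(rng)
    c = normalize(c)
    d = c.ndim
    # Triangle inequality: f(theta) <= (sum |c|)^2 for every theta.
    bound = np.abs(c).sum() ** 2
    accepted, count = [], 0
    while count < size:
        cand = rng.uniform(0.0, 2 * np.pi, size=(size, d))
        keep = rng.uniform(0.0, bound, size=size) < nnts_density(cand, c)
        accepted.append(cand[keep])
        count += int(keep.sum())
    return np.concatenate(accepted)[:size]
```

Under this assumed form the density is nonnegative by construction, and averaging it over a sufficiently fine regular grid on [0, 2π)^d and multiplying by (2π)^d recovers one up to floating-point error. The sampler becomes slow when the density is sharply peaked, because the uniform proposal is then rejected most of the time.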

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • DNA, Circular / chemistry*
  • Genome, Archaeal
  • Genome, Bacterial
  • Likelihood Functions
  • Models, Molecular*
  • Multivariate Analysis
  • Proteins / chemistry*
  • Software

Substances

  • DNA, Circular
  • Proteins