NETWORK ASSISTED ANALYSIS TO REVEAL THE GENETIC BASIS OF AUTISM

Ann Appl Stat. 2015;9(3):1571-1600. doi: 10.1214/15-AOAS844. Epub 2015 Nov 2.

Abstract

While studies show that autism is highly heritable, the nature of the genetic basis of this disorder remains elusive. Based on the idea that highly correlated genes are functionally interrelated and more likely to affect risk, we develop a novel statistical tool to find additional potential autism risk genes by combining genetic association scores with gene co-expression in specific brain regions and periods of development. The gene dependence network is estimated using a novel partial neighborhood selection (PNS) algorithm, in which node-specific properties are incorporated into network estimation for improved statistical and computational efficiency. We then adopt a hidden Markov random field (HMRF) model to combine the estimated network and the genetic association scores in a systematic manner. The proposed modeling framework can be naturally extended to incorporate additional structural information concerning the dependence between genes. Using currently available genetic association data from whole exome sequencing studies and brain gene expression levels, the proposed algorithm successfully identified 333 genes that plausibly affect autism risk.
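
To make the HMRF step concrete, the following is a minimal sketch of how a gene network and per-gene association scores might be combined. It is not the authors' implementation. It assumes an Ising-type prior on hidden 0/1 risk labels, Gaussian emissions for the z-scores, a network that is already given (the PNS estimation step is not shown), and hand-fixed parameters updated by iterated conditional modes. The function names and the parameters `b`, `c`, `mu`, and `sd` are hypothetical and chosen only for illustration.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def icm_hmrf(z, neighbors, b=-2.0, c=1.5, mu=2.5, sd=1.0,
             init_threshold=2.0, max_sweeps=50):
    """Label genes as risk (1) or non-risk (0) by iterated conditional modes.

    Assumed model (illustrative only):
      prior:     log-odds of label_i = 1 given neighbors is b + c * (# risk neighbors)
      emission:  z_i ~ N(0, 1) if label_i = 0, and N(mu, sd^2) if label_i = 1
    All parameters are fixed here; a full analysis would estimate them from data.
    """
    # Start from a simple threshold on the association scores.
    labels = [1 if zi > init_threshold else 0 for zi in z]
    for _ in range(max_sweeps):
        changed = False
        for i, zi in enumerate(z):
            # Network contribution: more risk-labelled neighbors raise the prior odds.
            prior_log_odds = b + c * sum(labels[j] for j in neighbors[i])
            # Data contribution: how much better the risk component explains z_i.
            lik_log_ratio = log_normal_pdf(zi, mu, sd) - log_normal_pdf(zi, 0.0, 1.0)
            new_label = 1 if prior_log_odds + lik_log_ratio > 0 else 0
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:
            break  # labels are stable
    return labels


# Toy data: genes 1 and 3 have the same moderate score, but only gene 1
# is connected to a strongly associated gene (gene 0).
z_scores = [3.2, 1.6, 0.1, 1.6]
neighbors = {0: [1], 1: [0], 2: [3], 3: [2]}
labels = icm_hmrf(z_scores, neighbors)
print(labels)  # [1, 1, 0, 0]
```

In this toy example, two genes with identical association scores receive different labels purely because of their network neighbors, which is the intuition behind combining co-expression with association evidence.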

Keywords: Autism spectrum disorder; hidden Markov random field; neighborhood selection; network estimation; risk gene discovery.