In silico prediction of long intergenic non-coding RNAs in sheep

Genome. 2016 Apr;59(4):263-75. doi: 10.1139/gen-2015-0141. Epub 2016 Feb 19.

Abstract

Long non-coding RNAs (lncRNAs) are transcribed RNA molecules >200 nucleotides in length that do not encode proteins and serve as key regulators of diverse biological processes. Recently, thousands of long intergenic non-coding RNAs (lincRNAs), a class of lncRNAs, have been identified in mammals using massively parallel sequencing technologies. The availability of the sheep (Ovis aries) genome sequence has enabled genome-wide prediction of non-coding RNAs. This is the first study to identify lincRNAs using RNA-seq data from eight different sheep tissues: brain, heart, kidney, liver, lung, ovary, skin, and white adipose. A computational pipeline was used to characterize 325 high-confidence putative lincRNAs from these eight tissues using criteria such as GC content, exon number, gene length, co-expression analysis, stability, and tissue-specificity scores. Sixty-four putative lincRNAs displayed tissue-specific expression. The highest numbers of tissue-specific lincRNAs were found in skin and brain. All novel lincRNAs that aligned to human and mouse lincRNAs showed conserved synteny. The protein-coding genes closest to the lincRNAs were enriched for 11 significant GO terms, including limb development, appendage development, striated muscle tissue development, and multicellular organismal development. The findings reported here have important implications for the study of the sheep genome.
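
Three of the criteria named above (the >200-nucleotide length threshold, GC content, and a tissue-specificity score) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state which tissue-specificity score was used, so the widely used tau index is assumed here, and the function names and expression values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; not the pipeline used in the study.

TISSUES = ["brain", "heart", "kidney", "liver",
           "lung", "ovary", "skin", "white adipose"]


def passes_length_filter(seq: str, min_length: int = 200) -> bool:
    """True if the transcript is longer than min_length nucleotides."""
    return len(seq) > min_length


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tau_index(expression: list[float]) -> float:
    """Tau tissue-specificity index (assumed score, not from the source).

    0 means equal expression in every tissue; 1 means expression in
    exactly one tissue.
    """
    n = len(expression)
    peak = max(expression) if expression else 0.0
    if n < 2 or peak <= 0:
        return 0.0
    return sum(1 - x / peak for x in expression) / (n - 1)


# Hypothetical expression values (e.g. FPKM) for one transcript, one per tissue.
example_expression = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.5, 0.0]
most_expressed = TISSUES[example_expression.index(max(example_expression))]
```

With these made-up values the transcript is expressed only in skin, so `tau_index(example_expression)` evaluates to 1.0; a transcript with identical expression in all eight tissues would score 0.0.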

Keywords: RNA-seq; comparative genomics; lncRNA; sheep.

MeSH terms

  • Animals
  • Base Composition
  • Computational Biology
  • Computer Simulation
  • Exons
  • High-Throughput Nucleotide Sequencing
  • Organ Specificity
  • RNA, Long Noncoding / genetics*
  • Sequence Analysis, RNA
  • Sheep / genetics*
  • Transcriptome

Substances

  • RNA, Long Noncoding