Integrating human genome database into electronic health record with sequence alignment and compression mechanism

J Med Syst. 2012 Aug;36(4):2587-97. doi: 10.1007/s10916-011-9731-0. Epub 2011 May 11.

Abstract

With the initial completion of the Human Genome Project, the post-genomic era has begun. Although the human genome map has been decoded, the roles played by each sequence segment are not yet fully understood. Meanwhile, with the rapid expansion of sequence information, data compilation and data storage are becoming increasingly important. In this paper, a "Human genome database system" is designed and implemented at National Taiwan University Hospital (NTUH). Through this system, doctors can store and manage experimental sequence data. The system's main contribution is its integration of sequence alignment and data compression modules. By embedding the NCBI alignment program blastall [1], it automatically aligns uploaded sequences and searches for their corresponding genomic positions. In addition, the system encodes the differences between sequences and compresses them effectively, reducing storage requirements with a compression ratio of 12.28. It also offers a variety of query methods: users can quickly retrieve the data of interest by entering keywords such as specimen number, GI, and sequence position. The electronic health record (EHR) in the Health Information System (HIS) of NTUH is also integrated into this system, so doctors can use this information to investigate the relationship between diseases and genes. With this system, a personal genetic healthcare environment can be established in the future.
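
The idea of storing only the differences between sequences can be illustrated with a minimal sketch. The abstract does not describe the system's actual encoding format, so everything below is an assumption made for illustration: the sample is taken to be already aligned to a reference of equal length, only substitutions are handled (no insertions or deletions), the function names are hypothetical, and the compression ratio is taken to mean original size divided by compressed size.

```python
def encode_differences(reference, sample):
    """Return (position, base) pairs where sample differs from reference.

    Assumes both sequences are already aligned and of equal length;
    insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have equal length")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def decode_differences(reference, diffs):
    """Rebuild the sample sequence from the reference and the stored pairs."""
    bases = list(reference)
    for position, base in diffs:
        bases[position] = base
    return "".join(bases)


def compression_ratio(original_size, compressed_size):
    """Original size divided by compressed size (assumed definition)."""
    return original_size / compressed_size


# Illustrative sequences, not data from the paper.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
diffs = encode_differences(reference, sample)
restored = decode_differences(reference, diffs)
```

Under the assumed definition, a ratio of 12.28 would mean the compressed data occupies roughly 1/12.28, or about 8%, of the original size.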

MeSH terms

  • Algorithms
  • Computer Systems
  • Data Compression*
  • Databases, Genetic*
  • Databases, Nucleic Acid
  • Electronic Health Records*
  • Genome, Human*
  • Humans
  • Internet
  • Software Design
  • Systems Integration*