Text Mining and Data Modeling of Karyotypes to aid in Drug Repurposing Efforts

Stud Health Technol Inform. 2015;216:1037.

Abstract

Karyotyping, the visual examination and recording of chromosomal abnormalities, is commonly used in diagnosing and treating disease. Karyotypes are written in the International System for Human Cytogenetic Nomenclature (ISCN), a notation that is not machine-readable, which precludes full computational analysis of these genomic data. In response, we developed a cytogenetic platform that translates ISCN karyotypes into a machine-readable model suitable for computational analysis. Here we use cytogenetic data from the National Cancer Institute (NCI)-curated Mitelman database [1] to create a structured karyotype language. Drug-gene-disease triplets are then generated by a computational pipeline that connects public drug-gene interaction data sources, in order to identify potential drug repurposing opportunities.
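To make the two steps in the abstract concrete, the following is a minimal Python sketch, not the authors' platform. It assumes a deliberately small subset of ISCN (a chromosome count, a sex complement, and simple structural terms such as t(9;22)(q34;q11)); everything else is kept as unparsed text. The names Aberration, parse_karyotype, and make_triplets are hypothetical, and the two lookup tables are hand-written illustrations standing in for the breakpoint-to-gene and drug-gene interaction sources, which the abstract does not name.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Aberration:
    """One structural abnormality (hypothetical record type for illustration)."""
    kind: str            # ISCN abbreviation, e.g. "t" (translocation), "del", "inv"
    chromosomes: tuple   # e.g. ("9", "22")
    breakpoints: tuple   # e.g. ("q34", "q11")


# Matches only simple terms of the form kind(chromosomes)(breakpoints).
ABERRATION_RE = re.compile(r"^([a-z]+)\(([^()]+)\)\(([^()]+)\)$")


def parse_karyotype(iscn):
    """Turn a simple ISCN string into a dict; unsupported terms go to 'unparsed'."""
    fields = [f.strip() for f in iscn.split(",")]
    record = {
        "chromosome_count": fields[0],   # kept as text; ISCN also allows ranges
        "sex": fields[1] if len(fields) > 1 else "",
        "aberrations": [],
        "unparsed": [],
    }
    for term in fields[2:]:
        match = ABERRATION_RE.match(term)
        if match:
            kind, chroms, bands = match.groups()
            record["aberrations"].append(
                Aberration(kind, tuple(chroms.split(";")), tuple(bands.split(";")))
            )
        else:
            record["unparsed"].append(term)
    return record


def make_triplets(record, disease, band_to_genes, gene_to_drugs):
    """Join breakpoints to genes, then genes to drugs, giving (drug, gene, disease)."""
    triplets = set()
    for ab in record["aberrations"]:
        # Pair each chromosome with its breakpoint, e.g. "9" + "q34" -> "9q34".
        for chrom, band in zip(ab.chromosomes, ab.breakpoints):
            for gene in band_to_genes.get(chrom + band, ()):
                for drug in gene_to_drugs.get(gene, ()):
                    triplets.add((drug, gene, disease))
    return sorted(triplets)


# Hand-written toy tables (assumptions for this example, not the paper's data).
BAND_TO_GENES = {"9q34": ["ABL1"], "22q11": ["BCR"]}
GENE_TO_DRUGS = {"ABL1": ["imatinib"]}
```

With these toy tables, parse_karyotype("46,XX,t(9;22)(q34;q11)") yields one translocation record, and make_triplets(record, "chronic myeloid leukemia", BAND_TO_GENES, GENE_TO_DRUGS) returns the single triplet ("imatinib", "ABL1", "chronic myeloid leukemia"). A real ISCN parser would need to handle many more constructs, such as numerical gains and losses, clones, and uncertain breakpoints.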

MeSH terms

  • Antineoplastic Agents / classification
  • Antineoplastic Agents / therapeutic use*
  • Data Mining / methods*
  • Databases, Genetic / classification
  • Databases, Pharmaceutical / classification
  • Drug Repositioning / methods*
  • Humans
  • Karyotype*
  • Natural Language Processing
  • Neoplasms / drug therapy*
  • Neoplasms / genetics*
  • Pharmacogenomic Testing / methods
  • PubMed

Substances

  • Antineoplastic Agents