Laplacian Regularized Sparse Representation Based Classifier for Identifying DNA N4-Methylcytosine Sites via L2,1/2-Matrix Norm

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):500-511. doi: 10.1109/TCBB.2021.3133309. Epub 2023 Feb 3.

Abstract

N4-methylcytosine (4mC) is an important epigenetic modification in DNA sequences. Detecting 4mC sites is time-consuming, and computational methods based on machine learning have provided effective help in identifying them. To further improve prediction performance, we propose a Laplacian Regularized Sparse Representation based Classifier with L2,1/2-matrix norm (LapRSRC). We also use the kernel trick to derive a kernel LapRSRC for nonlinear modeling. Matrix factorization is employed to solve for the sparse representation coefficients of all test samples with respect to the training set, and an efficient iterative algorithm is proposed to solve the objective function. We evaluate our model on six benchmark 4mC datasets and eight UCI datasets. The results show that the performance of our method is better than or comparable to that of the compared methods.
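The two regularization ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the row-wise definition of the L2,p quasi-norm used here (with p = 1/2), the unnormalized graph Laplacian L = D - W, and the function names `l2p_norm` and `graph_laplacian` are all assumptions chosen for illustration.

```python
import numpy as np


def l2p_norm(X, p=0.5):
    """Row-wise L2,p quasi-norm (assumed definition, not taken from the paper):
    (sum_i ||X[i, :]||_2 ** p) ** (1 / p). With p = 1/2 it favors row sparsity."""
    row_norms = np.linalg.norm(X, axis=1)  # Euclidean norm of each row
    return float(np.sum(row_norms ** p) ** (1.0 / p))


def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric affinity matrix W
    (assumed form; the paper may use a normalized variant)."""
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # degree matrix
    return D - W


# Hypothetical toy inputs: a coefficient matrix with one all-zero row,
# and a two-node graph with a single edge of weight 1.
A = np.array([[4.0, 0.0], [0.0, 0.0], [0.0, 9.0]])
W = np.array([[0.0, 1.0], [1.0, 0.0]])

print(l2p_norm(A))         # row norms are 4, 0, 9
print(graph_laplacian(W))  # each row sums to zero
```

All-zero rows contribute nothing to the quasi-norm, which is why penalties of this family are commonly used to encourage row-sparse coefficient matrices.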

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • DNA / genetics
  • Epigenesis, Genetic / genetics
  • Machine Learning*

Substances

  • DNA