Cybercrime: Identification and Prediction Using Machine Learning Techniques

K Veena; K Meena; Ramya Kuppusamy; Yuvaraja Teekaraman; Ravi V Angadi; Amruth Ramesh Thelkar

doi:10.1155/2022/8237421

Comput Intell Neurosci. 2022 Aug 27;2022:8237421. doi: 10.1155/2022/8237421. eCollection 2022.

Authors

K Veena¹, K Meena², Ramya Kuppusamy³, Yuvaraja Teekaraman⁴, Ravi V Angadi⁵, Amruth Ramesh Thelkar⁶

Affiliations

¹ Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai 600119, India.
² Department of Computer Science and Engineering, GITAM School of Technology, GITAM-Bangalore 561203, India.
³ Department of Electrical and Electronics Engineering, Sri Sairam College of Engineering, Bangalore 562106, India.
⁴ Department of Electronic and Electrical Engineering, The University of Sheffield, Sheffield S1 3JD, UK.
⁵ Department of Electrical & Electronics Engineering, Presidency University, Bangalore City 560064, India.
⁶ Faculty of Electrical & Computer Engineering, Jimma Institute of Technology, Jimma University, Jimma, Ethiopia.

Abstract

In the cyber age, cybercrime is spreading extensively. Supervised classification methods such as the support vector machine (SVM) and K-nearest neighbor (KNN) models are employed to classify cybercrime data. Likewise, the unsupervised mode of classification uses K-means clustering, the Gaussian mixture model, and cluster quasi-random via fuzzy C-means clustering and fuzzy clustering. Neural networks are employed to detect synthetic identity theft. These clustering techniques form clusters that extract crime data from the overall data. The dataset used for cybercrime detection is taken from CBS open data StatLine. The attributes used describe crime victims through personal characteristics, with a total of 1000 user identities. To analyze performance, the training and testing data are varied. Finally, the criminal is identified using the best technique: in the unsupervised method, the Gaussian mixture model shows enhanced detection performance, achieving an accuracy of 76.56% in detecting the criminal. In the supervised method, classification with the SVM classifier achieves an accuracy of 89%. Performance metrics for several attributes are computed in terms of true positive (TP), false positive (FP), true negative (TN), false negative (FN), false alarm rate (FAR), detection rate (DR), accuracy (ACC), recall, precision, specificity, sensitivity, and Fowlkes-Mallows scores. The expectation-maximization (EM) algorithm is employed to assess the performance of the Gaussian mixture model.
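The performance metrics listed in the abstract can all be derived from the four confusion-matrix counts. The following is a minimal sketch, assuming the standard textbook definitions (the abstract does not state the exact formulas the authors used): detection rate, recall, and sensitivity are taken as TP / (TP + FN), false alarm rate as FP / (FP + TN), and the Fowlkes-Mallows score as the geometric mean of precision and recall. The function name `confusion_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, fp, tn, fn):
    """Compute common detection metrics from confusion-matrix counts.

    Assumes the standard definitions; the paper's own formulas may differ.
    """
    recall = _ratio(tp, tp + fn)        # also detection rate / sensitivity
    precision = _ratio(tp, tp + fp)
    specificity = _ratio(tn, tn + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "detection_rate": recall,
        "recall": recall,
        "sensitivity": recall,
        "precision": precision,
        "specificity": specificity,
        "false_alarm_rate": _ratio(fp, fp + tn),
        # Geometric mean of precision and recall (assumed definition).
        "fowlkes_mallows": sqrt(precision * recall),
    }


# Hypothetical counts for illustration only; not results from the paper.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, false alarm rate and specificity always sum to 1, and detection rate, recall, and sensitivity are the same quantity, so the abstract's list contains fewer independent measures than names.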

MeSH terms

Algorithms
Cluster Analysis
Machine Learning*
Neural Networks, Computer
Support Vector Machine*