Two novel outlier detection approaches based on unsupervised possibilistic and fuzzy clustering

Zeynel Cebeci; Cagatay Cebeci; Yalcin Tahtali; Lutfi Bayyurt

doi:10.7717/peerj-cs.1060

PeerJ Comput Sci. 2022 Sep 27;8:e1060. doi: 10.7717/peerj-cs.1060. eCollection 2022.

Authors

Zeynel Cebeci¹, Cagatay Cebeci², Yalcin Tahtali³, Lutfi Bayyurt³

Affiliations

¹ Department of Animal Science, Faculty of Agriculture, Cukurova University, Adana, Turkey.
² Department of Electronics & Electrical Engineering, University of Strathclyde, Glasgow, United Kingdom.
³ Department of Agriculture, Faculty of Agriculture, Tokat Gaziosmanpasa University, Tokat, Turkey.

Abstract

Outliers are data points that deviate significantly from the other points in a data set because they arise from different mechanisms or unusual processes. Outlier detection is an intensively studied research topic, used to identify novelties, frauds, anomalies, deviations, or exceptions, and for data cleansing in data science. In this study, we propose two novel outlier detection approaches that use typicality degrees, the partitioning result of unsupervised possibilistic clustering algorithms. The proposed approaches identify atypical data points whose typicality falls below a predefined threshold, a possibilistic level at which a point is evaluated as an outlier. Experiments on synthetic and real data sets showed that the proposed approaches can successfully detect outliers without regard to the structure and distribution of the features in multidimensional data sets.
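The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify how the two proposed approaches differ. The sketch assumes that cluster centers and per-cluster scale parameters are already available from some possibilistic clustering run, uses the standard possibilistic c-means typicality formula, and flags a point when its highest typicality over all clusters falls below a threshold. The names `typicality`, `flag_outliers`, `etas`, and `alpha`, and all data values, are hypothetical.

```python
def typicality(point, center, eta, m=2.0):
    """Possibilistic c-means style typicality of a point for one cluster.

    Assumption: t = 1 / (1 + (d^2 / eta)^(1 / (m - 1))), where d is the
    Euclidean distance to the cluster center, eta is the cluster's scale
    parameter, and m > 1 is the fuzzifier.
    """
    d2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def flag_outliers(points, centers, etas, alpha=0.1, m=2.0):
    """Flag points whose best typicality over all clusters is below alpha.

    Assumption: taking the maximum over clusters is one plausible way to
    call a point atypical; the paper may define its criteria differently.
    """
    flags = []
    for x in points:
        best = max(typicality(x, c, e, m) for c, e in zip(centers, etas))
        flags.append(best < alpha)
    return flags


# Hypothetical toy data: two tight groups and one distant point.
points = [(0.0, 0.0), (0.2, -0.1), (5.0, 5.0), (5.1, 4.9), (20.0, -15.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]  # assumed output of a clustering run
etas = [1.0, 1.0]                   # assumed per-cluster scale parameters

flags = flag_outliers(points, centers, etas, alpha=0.1)
print(flags)  # [False, False, False, False, True]
```

In this toy setting only the distant point is flagged, because its typicality is about 0.0016 for both clusters. The threshold `alpha` plays the role of the predefined possibilistic level mentioned in the abstract; its value here is arbitrary.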

Keywords: Anomaly detection; Data analysis; Fuzzy and possibilistic clustering; Outlier detection; Unsupervised learning.

Grants and funding

This study has been funded by the Unit of Scientific Research Projects of Çukurova University in Adana, Turkey (grant number FBA-2019-10285). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.