TY - JOUR
T1 - Using Self-Organizing Maps for Binary Classification with Highly Imbalanced Datasets
AU - Almendra, Vinicius
AU - Enăchescu, Denis
JO - International Journal of Numerical Analysis Modeling Series B
VL - 3
SP - 238
EP - 254
PY - 2014
DA - 2014/05
SN - 5
DO - http://doi.org/
UR - https://global-sci.org/intro/article_detail/ijnamb/232.html
KW - unsupervised learning
KW - self-organizing maps
KW - imbalanced datasets
KW - supervised learning
AB - Highly imbalanced datasets occur in domains like fraud detection, fraud prediction, and clinical diagnosis of rare diseases, among others. These datasets are characterized by a prevalent class (e.g. legitimate sellers) and a relatively rare one (e.g. fraudsters). Although small in proportion, the observations belonging to the minority class can be of crucial importance. In this work we extend an unsupervised learning technique, Self-Organizing Maps, to use labeled data for binary classification under a constraint on the proportion of false positives. The resulting technique was applied to two highly imbalanced real datasets, achieving good results while being easier to interpret.