Overlap-Based Undersampling for Improving Imbalanced Data Classification

Pattaramon Vuttipittayamongkol, Eyad Elyan, Andrei Petrovski, Chrisina Jayne

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Classification of imbalanced data remains an important field in machine learning. Several methods have been proposed to address the class imbalance problem, including data resampling, adaptive learning, and cost-adjusting algorithms. Data resampling methods are widely used due to their simplicity and flexibility. Most existing resampling techniques aim at rebalancing the class distribution. However, class imbalance is not the only factor that affects the performance of the learning algorithm. Class overlap has been shown to have a higher impact on the classification of imbalanced datasets than the dominance of the negative class. In this paper, we propose a new undersampling method that eliminates negative instances from the overlapping region and hence improves the visibility of the minority instances. Testing and evaluating the proposed method on 36 public imbalanced datasets showed statistically significant improvements in classification performance.
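The abstract does not say how the overlapping region is identified, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple nearest-neighbour criterion: a negative (majority) instance is treated as lying in the overlapping region if at least one of its k nearest neighbours is positive (minority). The function name `overlap_undersample`, the parameter `k`, and the label values 0 and 1 are all hypothetical choices made for illustration.

```python
import math


def overlap_undersample(X, y, k=3, negative_label=0, positive_label=1):
    """Remove negative instances that have a positive instance among
    their k nearest neighbours (an assumed overlap criterion, not the
    criterion used in the paper)."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != negative_label:
            keep.append(i)  # minority instances are never removed
            continue
        # Indices of the k nearest other points by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )[:k]
        # Keep the negative instance only if no neighbour is positive.
        if all(y[j] != positive_label for j in neighbours):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: four negatives clustered near the origin, and one negative
# sitting right next to the two positives.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
y = [0, 0, 0, 0, 0, 1, 1]

X_res, y_res = overlap_undersample(X, y, k=2)
print(len(X), "->", len(X_res), "instances after undersampling")
```

On this toy data the negative point at (1.0, 1.0) is the only one whose nearest neighbours are positive, so it is the only instance removed. A brute-force neighbour search keeps the sketch dependency-free; for real datasets a spatial index would be the more practical choice.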

Original language: English
Title of host publication: Intelligent Data Engineering and Automated Learning – IDEAL 2018 - 19th International Conference, Proceedings
Editors: Hujun Yin, Paulo Novais, David Camacho, Antonio J. Tallón-Ballesteros
Publisher: Springer-Verlag
Pages: 689-697
Number of pages: 9
ISBN (Print): 9783030034924
Publication status: Published - 2018
Event: 19th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2018 - Madrid, Spain
Duration: 21 Nov 2018 to 23 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11314 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2018
Country/Territory: Spain
City: Madrid
Period: 21/11/18 to 23/11/18

Bibliographical note

Publisher Copyright:
© 2018, Springer Nature Switzerland AG.

Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.
