Entropy and gravitation based dynamic radius nearest neighbor classification for imbalanced problem

Authors:

Highlights:

Abstract

In imbalanced problems, the unequal number of samples in different classes poses great challenges to traditional classifiers, especially Nearest Neighbor (NN) classifiers. When an NN-based classifier deals with an imbalanced problem, its decision criterion makes the classification result data-dependent and therefore biased towards the majority class. To overcome this drawback of NN-based classifiers, a metaheuristic NN-based algorithm named the Gravitational Fixed Radius Nearest Neighbor classifier (GFRNN) was proposed, which addresses imbalanced problems by drawing on Newton's law of universal gravitation. However, GFRNN still has three major problems: it neglects the distribution of samples, calculates data mass unreasonably, and uses an improper distance metric. To this end, this paper proposes an Entropy and Gravitation based Dynamic Radius Nearest Neighbor algorithm (EGDRNN). Unlike GFRNN, EGDRNN determines the radius dynamically and quickly. EGDRNN also uses entropy information so that samples at different locations have different importance. Finally, by using a general Lp-norm to calculate the distance between two samples, the classification performance is greatly improved. The experimental results show that the proposed EGDRNN not only achieves the highest classification accuracy but also has the lowest running time among all compared algorithms.
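The gravitation-style radius neighbor rule described in the abstract can be illustrated with a small sketch. This is not the authors' GFRNN or EGDRNN implementation: the abstract gives no formulas, so the inverse-class-frequency mass, the inverse-square force, the nearest-sample fallback, and the names `lp_distance` and `gravitational_radius_predict` are all assumptions made for illustration. The entropy weighting and the dynamic radius selection of EGDRNN are not reproduced here; the radius is simply passed in.

```python
import numpy as np


def lp_distance(points, query, p=2.0):
    """Lp (Minkowski) distance from each row of `points` to `query`."""
    return np.sum(np.abs(points - query) ** p, axis=-1) ** (1.0 / p)


def gravitational_radius_predict(X_train, y_train, x, radius, p=2.0, eps=1e-12):
    """Predict the label of `x` by the total 'gravitational' pull per class.

    Assumptions (not taken from the paper):
      * mass of a sample = (largest class size) / (size of its own class),
        so minority samples pull harder;
      * force of one sample = mass / distance**2;
      * if no training sample lies inside the radius, fall back to the
        nearest sample(s).
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x = np.asarray(x, dtype=float)

    classes, counts = np.unique(y_train, return_counts=True)
    mass = {c: counts.max() / n for c, n in zip(classes, counts)}

    d = lp_distance(X_train, x, p)
    inside = d <= radius
    if not inside.any():
        inside = d == d.min()  # assumed fallback: nearest neighbor only

    forces = {}
    for c in classes:
        selected = inside & (y_train == c)
        # eps avoids division by zero when x coincides with a training sample
        forces[c] = float(np.sum(mass[c] / (d[selected] ** 2 + eps)))
    return max(forces, key=forces.get)


if __name__ == "__main__":
    # Toy data: five majority samples (label 0) and one minority sample (label 1).
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [3, 3]]
    y = [0, 0, 0, 0, 0, 1]
    print(gravitational_radius_predict(X, y, [2.2, 2.2], radius=2.0))
    print(gravitational_radius_predict(X, y, [0.4, 0.4], radius=1.0))
```

In this toy setting the single minority sample is given five times the mass of a majority sample, which is one simple way to counteract the majority bias the abstract attributes to plain NN rules. Changing `p` switches the distance metric without touching the rest of the rule.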

Keywords: Information entropy, Gravitational force, Nearest neighbor rules, Imbalanced problem, Lp-norm

Review history: Received 11 January 2019, Revised 31 December 2019, Accepted 2 January 2020, Available online 7 January 2020, Version of Record 7 March 2020.

Official URL: https://doi.org/10.1016/j.knosys.2020.105474