Overlap pattern synthesis with an efficient nearest neighbor classifier
作者:
Highlights:
•
摘要
Nearest neighbor (NN) classifier is the most popular non-parametric classifier. It is a simple classifier with no design phase and shows good performance. Important factors affecting the efficiency and performance of NN classifier are (i) memory required to store the training set, (ii) classification time required to search the nearest neighbor of a given test pattern, and (iii) due to the curse of dimensionality the number of training patterns needed by it to achieve a given classification accuracy becomes prohibitively large when the dimensionality of the data is high. In this paper, we propose novel techniques to improve the performance of NN classifier and at the same time to reduce its computational burden. These techniques are broadly based on: (i) overlap based pattern synthesis which can generate a larger number of artificial patterns than the number of input patterns and thus can reduce the curse of dimensionality effect, (ii) a compact representation of the given set of training patterns called overlap pattern graph (OLP-graph) which can be incrementally built by scanning the training set only once and (iii) an efficient NN classifier called OLP-NNC which directly works with OLP-graph and does implicit overlap based pattern synthesis. A comparison based on experimental results is given between some of the relevant classifiers. The proposed schemes are suitable for applications dealing with large and high dimensional datasets like those in data mining.
论文关键词:Nearest neighbor classifier,Pattern synthesis,Compact representation,Data mining
论文评审过程:Received 8 May 2003, Revised 10 May 2004, Accepted 8 October 2004, Available online 16 February 2005.
论文官网地址:https://doi.org/10.1016/j.patcog.2004.10.007