Fast implementations of nearest neighbor classifiers

Authors:

Highlights:

Abstract

Standard implementations of non-parametric classifiers have large computational requirements. Parzen classifiers use the distances of an unknown vector to all N prototype samples, and consequently exhibit O(N) behavior in both memory and time. We describe four techniques for expediting nearest neighbor methods: replacing the linear search with a new kd tree method, exhibiting approximately O(N^{1/2}) behavior; employing an L∞ instead of L2 distance metric; using variance-ordered features; and rejecting prototypes by evaluating distances in low-dimensionality subspaces. We demonstrate that variance-ordered features yield significant efficiency gains over the same features linearly transformed to have uniform variance. We give results for a large OCR problem, but note that the techniques expedite recognition for arbitrary applications. Three of the four techniques preserve recognition accuracy.
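To make the variance-ordering and prototype-rejection ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation: the function names, the `subspace_dim` parameter, and the use of squared L2 partial distances are illustrative assumptions. Prototypes whose partial distance over the leading high-variance features already exceeds the best full distance found so far are rejected without computing the remaining terms. The kd tree and L∞-metric components described in the abstract are not shown.

```python
# Illustrative sketch only: variance-ordered features plus partial-distance
# rejection for 1-NN search. All names and parameters are assumptions, not
# taken from the paper.
import numpy as np

def variance_order(prototypes):
    """Return feature indices sorted by decreasing variance."""
    return np.argsort(prototypes.var(axis=0))[::-1]

def nearest_neighbor_partial(query, prototypes, labels, order, subspace_dim=8):
    """1-NN search with early rejection using partial squared L2 distances.

    A prototype whose distance in the first `subspace_dim` (highest-variance)
    features already exceeds the best full distance found so far is rejected
    without evaluating the remaining coordinates.
    """
    q = query[order]
    P = prototypes[:, order]
    best_dist, best_label = np.inf, None
    for p, lab in zip(P, labels):
        # Partial distance over the leading high-variance subspace.
        d = np.sum((q[:subspace_dim] - p[:subspace_dim]) ** 2)
        if d >= best_dist:
            continue  # reject: cannot beat the current nearest neighbor
        # Accumulate the remaining coordinates, stopping early when possible.
        for j in range(subspace_dim, len(q)):
            d += (q[j] - p[j]) ** 2
            if d >= best_dist:
                break
        if d < best_dist:
            best_dist, best_label = d, lab
    return best_label, best_dist
```

Because the features are visited in order of decreasing variance, the partial sums grow quickly and most prototypes are rejected after only a few terms; this is the intuition behind the reported gains over uniform-variance features.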

Keywords: Non-parametric classifiers, kd Trees, k Nearest neighbors, Distance metrics, Karhunen-Loève transform, OCR

Article history: Received 16 January 1996, Revised 13 June 1996, Accepted 10 July 1996, Available online 7 June 2001.

Article link: https://doi.org/10.1016/S0031-3203(96)00098-2