A data driven procedure for density estimation with some applications

Authors:

Highlights:

Abstract

This paper deals with probability density estimation using a kernel-based approach in which the window size of the kernel is found by a data-driven procedure. It is shown theoretically that, under certain assumptions, the estimated densities on bounded sets can be asymptotically unbiased when the window width is obtained from the minimal spanning tree of the observed data. The theoretical development, initially carried out on R^2, is applicable to higher-dimensional spaces. The results are verified experimentally on bounded sets with different types of distributions. The behaviour of the estimator on an unbounded set, as in the case of a Gaussian density, is also seen experimentally to be good. Some applications of the proposed density estimation technique are demonstrated. One application is a representative point detection algorithm, which can be applied to data reduction and outlier rejection. Another application involves detecting the border points of a dot pattern as well as finding a thinned version of the dot pattern.
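The idea of taking the kernel window width from the minimal spanning tree of the data can be illustrated with a minimal sketch. The abstract does not state the exact rule that maps the tree to a width, nor which kernel is used, so two assumptions are made here purely for illustration: the width `h` is taken as the mean edge length of the Euclidean minimal spanning tree, and the kernel is an isotropic Gaussian. The function names (`mst_edge_lengths`, `mst_window_width`, `kde`) are hypothetical and do not come from the paper.

```python
import math
import random


def mst_edge_lengths(points):
    """Return the n-1 edge lengths of the Euclidean minimal spanning tree.

    Uses Prim's algorithm on the complete graph, O(n^2) time.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from each point to the tree
    best[0] = 0.0
    lengths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the first point starts the tree and adds no edge
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_window_width(points):
    """ASSUMPTION: window width = mean MST edge length (not the paper's stated rule)."""
    lengths = mst_edge_lengths(points)
    return sum(lengths) / len(lengths)


def kde(x, points, h):
    """Kernel density estimate at x with an isotropic Gaussian kernel (assumed)."""
    d = len(x)
    norm = len(points) * (h * math.sqrt(2.0 * math.pi)) ** d
    total = sum(math.exp(-math.dist(x, p) ** 2 / (2.0 * h * h)) for p in points)
    return total / norm


if __name__ == "__main__":
    random.seed(0)
    # Sample from a bounded set: the uniform distribution on the unit square.
    sample = [(random.random(), random.random()) for _ in range(500)]
    h = mst_window_width(sample)
    print("window width:", h)
    print("estimate at centre:", kde((0.5, 0.5), sample, h))
```

Because the edge lengths of the tree shrink as more points fall into the same bounded region, a width derived from them also shrinks with the sample size without any manual tuning, which is the data-driven behaviour the abstract describes.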

Keywords: Probability density estimation, Kernel method, Window selection, Minimal spanning tree, Bounded set, Asymptotically unbiased estimator, Representative point

Review history: Received 3 July 1995, Accepted 28 February 1996, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/0031-3203(96)00028-3