A new distance measure for non-identical data with application to image classification

Authors:

Highlights:

• Empirical evidence is provided that real-world data is non-identically distributed.

• PBR, the first distance measure to account for non-identical data, is proposed.

• PBR was tested in 6 test applications using 12 benchmark data sets.

• PBR outperforms state-of-the-art measures for most data sets.

• Avoiding the identical distribution assumption can improve classification.

Keywords: Poisson-Binomial distribution, Semi-metric distance, Non-identical data, Distance measure, Image classification, Image recognition

Review history: Received 24 May 2016, Revised 12 October 2016, Accepted 15 October 2016, Available online 18 October 2016, Version of Record 28 October 2016.

Paper URL: https://doi.org/10.1016/j.patcog.2016.10.018