PCA-based drift and shift quantification framework for multidimensional data

Authors: Igor Goldenberg, Geoffrey I. Webb

Abstract

Concept drift is a serious problem confronting machine learning systems in a dynamic and ever-changing world. To manage concept drift, it may be useful to first quantify it by measuring the distance between the distributions that generate the data before and after the drift. Few methods exist for doing so with multidimensional numeric data. This paper provides an in-depth analysis of the PCA-based change detection approach, identifies shortcomings of existing methods, and shows how the approach can be used to measure a drift rather than merely detect it.
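To make the idea of measuring a drift concrete, the following is a minimal sketch and not the authors' procedure. It assumes PCA is fitted on the pre-drift window only, that both windows are projected onto those components, and that each projected marginal is compared with a histogram-based Hellinger distance. The function names `hellinger` and `pca_drift_score`, the bin count, and the choice of taking the maximum over components are all illustrative assumptions.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete distributions; lies in [0, 1]."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def pca_drift_score(before, after, n_components, bins=20):
    """Hypothetical drift score: largest per-component Hellinger distance.

    Assumption of this sketch: PCA is fitted on `before` only, and the
    marginal distributions are estimated with shared-edge histograms.
    """
    mean = before.mean(axis=0)
    # Rows of vt are the principal directions of the centred "before" window.
    _, _, vt = np.linalg.svd(before - mean, full_matrices=False)
    components = vt[:n_components]

    proj_before = (before - mean) @ components.T
    proj_after = (after - mean) @ components.T

    distances = []
    for k in range(components.shape[0]):
        # Use common bin edges so both histograms are directly comparable.
        lo = min(proj_before[:, k].min(), proj_after[:, k].min())
        hi = max(proj_before[:, k].max(), proj_after[:, k].max())
        edges = np.linspace(lo, hi, bins + 1)
        p, _ = np.histogram(proj_before[:, k], bins=edges)
        q, _ = np.histogram(proj_after[:, k], bins=edges)
        distances.append(hellinger(p / p.sum(), q / q.sum()))
    return max(distances)


# Synthetic illustration: one window from the same distribution, one shifted.
rng = np.random.default_rng(0)
reference = rng.normal(size=(2000, 5))
no_drift = rng.normal(size=(2000, 5))
drifted = rng.normal(size=(2000, 5)) + 3.0

print("no drift:", pca_drift_score(reference, no_drift, n_components=5))
print("drifted: ", pca_drift_score(reference, drifted, n_components=5))
```

Because the score is a distance bounded between 0 and 1 rather than a yes/no decision, it gives a magnitude for the change, which is the distinction the abstract draws between measuring and detecting drift. Histogram estimates are sensitive to the bin count and window size, so the values should be read as rough indications only.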

Keywords: Principal component analysis, Drift detection, Hellinger distance

Paper URL: https://doi.org/10.1007/s10115-020-01438-3