Data-driven missing data imputation in cluster monitoring system based on deep neural network

作者:Jie Lin, NianHua Li, Md Ashraful Alam, Yuqing Ma

摘要

Due to cluster instability, not in the cluster monitoring system. This paper focuses on the missing data imputation processing for the cluster monitoring application and proposes a new hybrid multiple imputation framework. This new imputation approach is different from the conventional multiple imputation technologies in the fact that it attempts to impute the missing data for an arbitrary missing pattern with a model-based and data-driven combination architecture. Essentially, the deep neural network, as the data model, extracts deep features from the data and deep features are further calculated then by a regression or data-driven strategies and used to create the estimation of missing data with the arbitrary missing pattern. This paper gives evidence that if we can train a deep neural network to construct the deep features of the data, imputation based on deep features is better than that directly on the original data. In the experiments, we compare the proposed method with other conventional multiple imputation approaches for varying missing data patterns, missing ratios, and different datasets including real cluster data. The result illustrates that when data encounters larger missing ratio and various missing patterns, the proposed algorithm has the ability to achieve more accurate and stable imputation performance.

论文关键词:Missing data, Cluster monitoring, Deep belief networks, Multiple imputation

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-019-01560-y