Missing data imputation on biomedical data using deeply learned clustering and L2 regularized regression based on symmetric uncertainty

Authors:

Highlights:

• A novel missing data imputation approach is proposed for high dimensional health datasets.

• The proposed approach ensures that maximum information is utilized during imputation.

• Symmetric uncertainty and L2-regularized regression are used to determine the imputed value.

• Deep learning preserves the global structure of the dataset and clustering preserves the local structure.

• The proposed approach outperforms the other approaches across different classifiers.
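
The highlights name two standard building blocks, symmetric uncertainty and L2-regularized (ridge) regression. The following is a minimal sketch of how those two pieces could fit together, not the authors' implementation: it ranks candidate features by symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and then fits a one-predictor ridge model to estimate a missing value. The function names, the toy data, the penalty value, and the single-predictor simplification are all assumptions made for illustration; the deep clustering step is omitted entirely.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so nothing is shared
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy
    return 2.0 * mutual_info / (hx + hy)


def ridge_fit_1d(x, y, lam):
    """Closed-form ridge fit for one predictor; the intercept is not penalized."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


# Hypothetical toy data: rows where the target attribute is observed.
# Values are already discrete; continuous attributes would need binning first.
target = [2.0, 4.0, 6.0, 8.0]
features = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],  # informative about the target
    "feature_b": [5.0, 5.0, 5.0, 5.0],  # constant, carries no information
}

# Pick the feature most strongly associated with the target.
scores = {name: symmetric_uncertainty(col, target) for name, col in features.items()}
best = max(scores, key=scores.get)

# Fit ridge on the observed rows, then impute a row whose target is missing.
slope, intercept = ridge_fit_1d(features[best], target, lam=1.0)
imputed = slope * 5.0 + intercept  # the incomplete row has feature_a = 5.0
print(best, scores, imputed)
```

With a positive penalty the fitted slope is shrunk toward zero relative to ordinary least squares, which is the usual motivation for L2 regularization when predictors are many or correlated.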


Keywords: Deeply learned clustering, L2 regularization, Missing data imputation, Biomedical datasets

Review history: Received 22 January 2020, Revised 8 November 2021, Accepted 8 November 2021, Available online 6 December 2021, Version of Record 16 December 2021.

Paper URL: https://doi.org/10.1016/j.artmed.2021.102214