Selection of key features for PM2.5 prediction using a wavelet model and RBF-LSTM
Authors: Yi-Chung Chen, Dong-Chi Li
Abstract
PM2.5 prediction has received much attention from researchers in recent years, as PM2.5 has been proven to have a major impact on human health. High-precision PM2.5 predictions would greatly benefit both the general public and governments. However, most existing studies on predicting PM2.5 or other air pollutants focus on model design and rarely discuss feature selection. Conventionally, researchers either choose a few types of meteorological data based on suggestions from experts or conduct simple correlation analyses to identify the types of meteorological data that are most correlated with the predicted pollutant. However, these methods suffer from two shortcomings. (1) Changes in PM2.5 values are influenced by low-frequency pollution from other places and high-frequency pollution from the local area. Furthermore, the meteorological data associated with the two pollution sources are generally different. Changes at different frequencies in each type of meteorological data should therefore be considered separately, so that the meteorological data correlated with pollutants of each frequency can be identified precisely. (2) Datasets used for PM2.5 predictions generally contain high-dimensional meteorological data. Conventional correlation analysis methods are not effective at identifying features in such datasets, making it difficult to select useful features and negatively affecting prediction accuracy. This study therefore proposes two concepts to address these issues: (1) a wavelet model to decompose each type of meteorological data into multiple sub-time series with different frequencies and (2) the design of a novel radial basis function long short-term memory model that can analyze the outputs of the radial basis function to extract key features identified by deep learning models. This approach is faster and simpler than other methods using deep learning models to extract key features.
Application of these concepts will greatly enhance prediction accuracy and reduce costs regardless of the prediction model used. We use three years of historical meteorological data from Central Taiwan to demonstrate the effectiveness of the proposed methods.
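The wavelet step described in the abstract, splitting one meteorological series into sub-series of different frequencies, can be pictured with a minimal sketch. The abstract does not say which wavelet family is used, so the Haar wavelet below is an assumption, as are the function names and the sample temperature readings, which are purely illustrative.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums keep the slow (low-frequency) trend.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences keep the fast (high-frequency) changes.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return [approx_L, detail_L, ..., detail_1], ordered low to high frequency."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Hypothetical hourly temperature readings (not data from the paper).
temperature = [20.0, 22.0, 21.0, 23.0, 30.0, 28.0, 29.0, 31.0]
bands = haar_decompose(temperature, levels=2)

for name, band in zip(["approx_2", "detail_2", "detail_1"], bands):
    print(name, [round(v, 3) for v in band])
```

Under this sketch, each band could then be compared against the PM2.5 series separately, which is the motivation the abstract gives for treating low-frequency and high-frequency changes as distinct candidate features.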
Keywords: PM2.5 prediction, Feature selection, Deep learning model, Wavelet analysis
Paper URL: https://doi.org/10.1007/s10489-020-02031-5