PM2.5 concentration prediction using hidden semi-Markov model-based times series data mining

Authors:

Highlights:

Abstract

This paper presents a novel framework and methodology, based on hidden semi-Markov models (HSMMs), for predicting high PM2.5 concentration values. Because it lacks an explicit time structure and retains only a short-term memory of past history, a standard hidden Markov model (HMM) has limited power to model the temporal structure of prediction problems. To overcome these limitations, we develop HSMMs by adding temporal structure to HMMs and use them to predict PM2.5 concentration levels. As a model-driven statistical learning method, an HSMM assumes that both data and a mathematical model are available. In contrast to data-driven statistical prediction models such as neural networks, an HSMM allows a mathematical functional mapping to be established between the model parameters and the selected input variables. In the proposed framework, the states of the HSMMs represent PM2.5 concentration levels. The model parameters are estimated with a modified forward–backward training algorithm, for which the re-estimation formulae are derived. The trained HSMMs can then be used to predict high PM2.5 concentration levels. The framework and methodology are validated on a real-world application: predicting high PM2.5 concentrations at O’Hare airport in Chicago. The results show that the HSMMs provide accurate predictions of high PM2.5 concentration levels for the next 24 h.
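To make the difference between an HMM and an HSMM concrete, the following minimal Python sketch samples a 24-hour sequence of concentration levels from an explicit-duration model. It is an illustration only, not the paper's method: the three state labels, the transition probabilities, the duration distributions, and the helper names `draw` and `sample_segments` are all assumptions chosen for the example, and no training or prediction step is shown.

```python
import random

# Hypothetical three-level discretization of PM2.5 (labels are illustrative only).
STATES = ["low", "moderate", "high"]

# Assumed transition probabilities between *different* states. Self-transitions
# are absent because time spent in a state is governed by the duration model.
TRANS = {
    "low":      {"moderate": 0.8, "high": 0.2},
    "moderate": {"low": 0.5, "high": 0.5},
    "high":     {"low": 0.3, "moderate": 0.7},
}

# Assumed explicit duration distributions: P(stay d hours | state), d = 1..4.
DURATION = {
    "low":      {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4},
    "moderate": {1: 0.3, 2: 0.4, 3: 0.2, 4: 0.1},
    "high":     {1: 0.5, 2: 0.3, 3: 0.1, 4: 0.1},
}


def draw(dist, rng):
    """Sample one key from a {value: probability} dictionary."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_segments(n_hours, start_state, rng):
    """Return a list of (state, duration) segments covering exactly n_hours."""
    segments = []
    state, elapsed = start_state, 0
    while elapsed < n_hours:
        # Draw how long the process stays in this state; truncate the
        # final segment so the total length equals n_hours.
        d = min(draw(DURATION[state], rng), n_hours - elapsed)
        segments.append((state, d))
        elapsed += d
        # Move to a different state once the duration has elapsed.
        state = draw(TRANS[state], rng)
    return segments


rng = random.Random(0)  # fixed seed so the run is repeatable
segments = sample_segments(24, "low", rng)
# Expand the segments into one state label per hour.
hourly = [state for state, d in segments for _ in range(d)]
print(segments)
```

In a plain HMM, the time spent in a state is implicitly geometric because it arises from repeated self-transitions. Here the `DURATION` table replaces that with an arbitrary distribution per state, which is the kind of temporal structure the abstract describes adding.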

Keywords: Air pollution, Hidden semi-Markov model, PM2.5 concentration prediction

Review history: Available online 24 December 2008.

Paper URL: https://doi.org/10.1016/j.eswa.2008.12.017