Exon prediction using empirical mode decomposition and Fourier transform of structural profiles of DNA sequences

Authors:

Highlights:

Abstract

Spectrum analysis approaches such as the Fourier transform, the wavelet transform, and autoregressive models have been applied successfully to exon prediction because they require no training data or prior knowledge. Detecting short exons, however, remains difficult: traditional methods often give unsatisfactory results because they cannot correctly identify the spectral patterns of short exons. In this article, we propose an improved exon prediction method based on empirical mode decomposition and the Fourier transform. The proposed approach represents DNA sequences numerically by their structural features, which helps to reveal significant patterns that are rarely observed with traditional methods. The structural profile is used to detect probable exons by examining the peaks of the local 1/3-frequency spectrum within a sliding window. The data in the window are first decomposed by empirical mode decomposition into a collection of intrinsic mode functions; the first intrinsic mode function is then used to compute the local spectrum by the fast Fourier transform. We compare our method with the traditional Fourier transform method using binary representation and with the recently proposed paired spectral content method. Experiments on a randomly selected human genome dataset and the GENSCAN benchmark dataset show that our method enhances the signal-to-noise ratio of the analyzed sequences and improves the prediction accuracy for short exons.
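
The sliding-window idea in the abstract can be illustrated with a minimal sketch of the traditional baseline it mentions, the Fourier transform with binary representation. This is not the proposed method: the structural profile values and the empirical mode decomposition step are not specified in the abstract and are omitted here. The function names `period3_power` and `sliding_period3`, the window length, and the step size are all assumptions chosen for illustration.

```python
import cmath


def period3_power(window):
    """Spectral power at frequency 1/3, summed over the four binary
    indicator sequences (one per base) of the given window.

    Assumption: plain binary (indicator) representation, i.e. the
    traditional baseline, not the structural profile of the paper.
    """
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the indicator sequence:
        # only positions holding `base` contribute a term.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, ch in enumerate(window)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


def sliding_period3(seq, window=351, step=3):
    """Return (start, power) pairs for each window position.

    The defaults for `window` and `step` are illustrative assumptions.
    """
    seq = seq.upper()
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Peaks in the returned power values would mark candidate coding regions under this baseline. In the approach the abstract describes, the window data would instead come from the structural profile, and the spectrum would be computed on the first intrinsic mode function obtained by empirical mode decomposition.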

Keywords: DNA sequence analysis, Exon prediction, Discrete Fourier transform, Empirical mode decomposition, Structural features of DNA

Article history: Received 1 November 2010, Revised 9 August 2011, Accepted 14 August 2011, Available online 23 August 2011.

Article URL: https://doi.org/10.1016/j.patcog.2011.08.016