Predicting protein secondary structure using a mixed-modal SVM method in a compound pyramid model

作者:

Highlights:

摘要

Accurate protein secondary structure prediction plays an important role in direct tertiary structure modeling, and can also significantly improve sequence analysis and sequence-structure threading for structure and function determination. Hence improving the accuracy of secondary structure prediction is essential for future developments throughout the field of protein research.In this article, we propose a mixed-modal support vector machine (SVM) method for predicting protein secondary structure. Using the evolutionary information contained in the physicochemical properties of each amino acid and a position-specific scoring matrix generated by a PSI-BLAST multiple sequence alignment as input for a mixed-modal SVM, secondary structure can be predicted at significantly increased accuracy. Using a Knowledge Discovery Theory based on the Inner Cognitive Mechanism (KDTICM) method, we have proposed a compound pyramid model, which is composed of three layers of intelligent interface that integrate a mixed-modal SVM (MMS) module, a modified Knowledge Discovery in Databases (KDD∗) process, a mixed-modal back propagation neural network (MMBP) module and so on.Testing against data sets of non-redundant protein sequences returned values for the Q3 accuracy measure that ranged from 84.0% to 85.6%,while values for the SOV99 segment overlap measure ranged from 79.8% to 80.6%. When compared using a blind test dataset from the CASP8 meeting against currently available secondary structure prediction methods, our new approach shows superior accuracy.Availability: http://www.kdd.ustb.edu.cn/protein_Web/.

论文关键词:Protein secondary structure prediction,Physicochemical properties,Mixed-modal SVM,Compound pyramid model

论文评审过程:Received 26 January 2010, Revised 15 August 2010, Accepted 11 October 2010, Available online 13 October 2010.

论文官网地址:https://doi.org/10.1016/j.knosys.2010.10.002