Rough set-based SAR analysis: An inductive method
Abstract
This paper uses the rough set algorithm as a new methodology for building structure–activity relationship (SAR) models, in which it acts as both a feature selector and a nonlinear rule generator. An SAR model expressed as human-readable if-then rules was developed for the inhibition of the serine/threonine kinase CDK1/cyclinB by compounds from the indirubin inhibitor family. The feature selection ability of the rough set algorithm was compared with the built-in approaches in Weka (CfsSubsetEval and ConsistencySubsetEval) under leave-one-out (LOO) and 10-fold cross-validation. A rule-based SAR model was built from a training set of 31 objects using a reduct generated by the rough set algorithm, and its predictive ability was evaluated on an external test set of 16 compounds. Established approaches, including decision tree learners, a neural network, a support vector classifier and LogitBoost, were used to benchmark the performance of the rough set method. The results indicate that the rough set method can play an important role in data preprocessing and model building for nonlinear SAR analysis. The advantages and limitations of rough set-based SAR analysis are discussed. The results agree well with the available understanding from cocrystal structures and 3D QSAR models.
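To make the idea of a reduct and of if-then rule generation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small, consistent decision table of binary descriptors; the attribute names (`h_donor`, `bulky_sub`, `polar`), the table values and the function names are hypothetical and do not come from the paper. It searches exhaustively for the smallest attribute subsets that still separate the decision classes, then reads one rule off each distinct descriptor combination.

```python
from itertools import combinations

# Hypothetical toy decision table (NOT data from the paper):
# each row is (condition attribute values, decision class).
ATTRS = ("h_donor", "bulky_sub", "polar")
TABLE = [
    ((1, 0, 1), "active"),
    ((1, 0, 0), "active"),
    ((0, 0, 1), "inactive"),
    ((0, 1, 1), "inactive"),
    ((1, 1, 0), "inactive"),
    ((1, 1, 1), "inactive"),
]


def is_consistent(table, idx):
    """True if objects that agree on the attributes in idx share one decision."""
    seen = {}
    for values, decision in table:
        key = tuple(values[i] for i in idx)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def minimal_reducts(table, n_attrs):
    """Smallest attribute subsets that keep the table consistent.

    Exhaustive search, so only suitable for a handful of attributes.
    Assumes the full table is itself consistent.
    """
    for size in range(1, n_attrs + 1):
        found = [idx for idx in combinations(range(n_attrs), size)
                 if is_consistent(table, idx)]
        if found:
            return found
    return []


def rules(table, idx, names):
    """One if-then rule per distinct value combination on the reduct."""
    out = {}
    for values, decision in table:
        cond = tuple((names[i], values[i]) for i in idx)
        out[cond] = decision
    return out


reducts = minimal_reducts(TABLE, len(ATTRS))
for cond, decision in rules(TABLE, reducts[0], ATTRS).items():
    print("IF", " AND ".join(f"{n}={v}" for n, v in cond), "THEN", decision)
```

Exhaustive search is chosen here only for readability; practical rough set tools rely on heuristics or discernibility-matrix methods because the number of candidate subsets grows exponentially with the number of descriptors.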
Keywords: Rough set, Structure–activity relationship, Feature selection, Cyclin-dependent kinase, Indirubins
Publication history: Available online 9 December 2009.
Official URL: https://doi.org/10.1016/j.eswa.2009.12.008