isGPT: An optimized model to identify sub-Golgi protein types using SVM and Random Forest based feature selection

Authors:

Highlights:

• Classification and regression analysis for sub-Golgi protein types.

• Protein representation using n-grams, gapped dipeptides, position specific n-grams.

• Feature selection using importance score provided by Random Forest model.

• Prediction model built using SVM (linear kernel).

• No dependency on evolutionary information of proteins.

• Fast and accurate predictor.
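The sequence representations named in the highlights can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `gap` and `window` parameters, the frequency normalization, and in particular the definition of "position specific n-grams" used here (n-grams tagged with their index within the first few residues) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_features(seq, n=2):
    """Relative frequency of each contiguous n-gram over the standard alphabet.

    N-grams containing non-standard residues still count toward the total
    but get no feature of their own.
    """
    total = max(len(seq) - n + 1, 1)
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    return {"".join(p): counts["".join(p)] / total
            for p in product(AMINO_ACIDS, repeat=n)}


def gapped_dipeptide_features(seq, gap=1):
    """Relative frequency of residue pairs separated by `gap` residues."""
    step = gap + 1
    total = max(len(seq) - step, 1)
    counts = Counter(seq[i] + seq[i + step] for i in range(len(seq) - step))
    return {f"{a}-{gap}-{b}": counts[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


def position_specific_ngrams(seq, n=1, window=5):
    """Sparse indicators for the n-gram found at each of the first `window`
    positions (an assumed definition, chosen only for this sketch)."""
    last = min(window, len(seq) - n + 1)
    return {f"pos{i}:{seq[i:i + n]}": 1 for i in range(last)}


if __name__ == "__main__":
    example = "MKVLAAGI"  # hypothetical toy sequence, not a real Golgi protein
    features = {}
    features.update(ngram_features(example, n=2))
    features.update(gapped_dipeptide_features(example, gap=1))
    features.update(position_specific_ngrams(example, n=1, window=3))
    print(len(features), "features extracted")
```

Feature dictionaries of this kind could then be turned into fixed-length vectors, ranked by Random Forest importance scores, and passed to a linear-kernel SVM, as the highlights describe. None of these features require evolutionary information such as sequence profiles.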


Keywords: Sub-Golgi Apparatus, Classification, Regression, Support vector machine, Random Forest

Article history: Received 5 October 2017, Revised 13 November 2017, Accepted 17 November 2017, Available online 26 November 2017, Version of Record 5 February 2018.

Official paper URL: https://doi.org/10.1016/j.artmed.2017.11.003