iRNA-PseTNC: identification of RNA 5-methylcytosine sites using hybrid vector space of pseudo nucleotide composition
Authors: Shahid Akbar, Maqsood Hayat, Muhammad Iqbal, Muhammad Tahir
Abstract
5-methylcytosine (m5C) sites play a major role in numerous biological processes and are commonly reported in both cellular DNA and RNA. The enzymatic mechanism and biological functions of m5C sites in DNA have been a focus of research for the last few decades. Likewise, investigators have targeted m5C sites in RNA because of their cellular functions, positioning, and formation mechanism. Several rudimentary roles of m5C in RNA have been explored so far, but much remains to be improved. Initially, RNA methylcytosine sites were identified through experimental methods, which were laborious, error-prone, and time-consuming owing to the limited availability of recognized structures. Given the significance of m5C in RNA, scientists have shifted their attention from structure-based to sequence-based prediction. In this regard, an intelligent computational model is proposed to identify m5C sites in RNA with high precision. Three RNA sequence formulation methods, namely pseudo dinucleotide composition, pseudo trinucleotide composition, and pseudo tetranucleotide composition, are applied to extract diverse and highly discriminative numerical features. Subsequently, the resulting vector spaces are fused into a hybrid space so that each compensates for the weaknesses of the others. Several learning hypotheses are examined to select the best operational engine, that is, the one that most reliably identifies the pattern of the target class. The strength and generalization of the proposed model are measured using two different cross-validation tests. The reported outcomes show that the proposed model achieves 3% higher accuracy than the best existing approach in the literature.
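The fusion of three sequence encodings into one hybrid vector can be illustrated with a minimal sketch. It assumes plain normalized k-mer frequencies (k = 2, 3, 4) as a simplified stand-in for the pseudo dinucleotide, trinucleotide, and tetranucleotide compositions; the paper's exact formulations may differ. The function names and the example sequence are hypothetical and are not taken from the paper.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides


def kmer_composition(seq, k):
    """Normalized frequency of every length-k word over ACGU in seq.

    Simplified stand-in (an assumption) for a pseudo k-tuple
    nucleotide composition; returns a vector of length 4**k.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing non-ACGU symbols
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def hybrid_features(seq):
    """Concatenate the di-, tri-, and tetranucleotide vectors (16 + 64 + 256)."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for k in (2, 3, 4):
        features.extend(kmer_composition(seq, k))
    return features


# Hypothetical sequence, used only to show the shape of the hybrid vector.
example = "GGACUCGAUCCGAUCGGCAUCGAUCGAUCGGAUCGAUCGAC"
print(len(hybrid_features(example)))
```

Under these assumptions each sequence maps to a fixed-length 336-dimensional vector, which could then be passed to a classifier such as the SVM named in the keywords and evaluated by cross-validation.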
Keywords: methylcytosine sites, PseTNC, PseTetraNC, hybrid features, SVM, cross validation test
Official paper URL: https://doi.org/10.1007/s11704-018-8094-9