Computational prediction of protein ubiquitination sites mapping on Arabidopsis thaliana
Article
Article Title | Computational prediction of protein ubiquitination sites mapping on Arabidopsis thaliana |
---|---|
ERA Journal ID | 17808 |
Article Category | Article |
Authors | Mosharaf, Md. Parvez; Hassan, Md. Mehedi; Ahmed, Fee Faysal; Khatun, Mst. Shamima; Moni, Mohammad Ali; Mollah, Md. Nurul Haque |
Journal Title | Computational Biology and Chemistry |
Journal Citation | 85 |
Article Number | 107238 |
Number of Pages | 7 |
Year | 2020 |
Place of Publication | United Kingdom |
ISSN | 1476-9271; 1476-928X |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.compbiolchem.2020.107238 |
Web Address (URL) | https://www.sciencedirect.com/science/article/pii/S1476927118309654 |
Abstract | Among protein post-translational modifications (PTMs), ubiquitination is considered one of the most significant processes, as it can regulate cellular functions and is involved in various diseases. Identifying ubiquitination sites is therefore important for understanding the mechanisms of ubiquitination-related biological processes. Both experimental and computational approaches are available for identifying ubiquitination sites from the protein sequences of different species. Experimental approaches are time-consuming, laborious and costly; in silico prediction is a faster, easier and more cost-effective alternative. Moreover, the sequence patterns around ubiquitination sites differ between species, which calls for species-specific predictors. In this study, we therefore propose a novel computational method for identifying ubiquitination sites from A. thaliana protein sequences that is also robust against outlying observations. In a comparative study of two encoding schemes and three classifiers, the random forest (RF) based predictor under the CKSAAP encoding scheme, trained with a 1:1 ratio of positive (ubiquitinated) to negative (non-ubiquitinated) samples, was selected as the best predictor. The proposed predictor achieved an area under the ROC curve (AUC) of 0.91 in 5-fold cross-validation on the training dataset and 0.86 on the independent test dataset of A. thaliana. The proposed RF-based predictor also performed much better than the existing ubiquitination site predictors for A. thaliana. |
Keywords | Arabidopsis thaliana species; CKSAAP encoding; Protein sequences; Random forest; Ubiquitination sites |
Contains Sensitive Content | Does not contain sensitive content |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Funder | Ministry of Science and Technology, Taiwan |
Byline Affiliations | University of Rajshahi, Bangladesh; Kyushu Institute of Technology, Japan; Jashore University of Science and Technology, Bangladesh; University of Sydney |
https://research.usq.edu.au/item/yy7yz/computational-prediction-of-protein-ubiquitination-sites-mapping-on-arabidopsis-thaliana
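
The abstract names CKSAAP (composition of k-spaced amino acid pairs) as the selected encoding scheme. The sketch below illustrates the general idea of that encoding and is not the authors' implementation: the window sequence, the maximum gap `k_max=3`, and the function name `cksaap` are assumptions made for illustration, since the record does not state the window size or gap range used in the article. For each gap k, it counts every ordered pair of standard amino acids separated by k residues and divides by the number of pair positions in the window.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered pairs of them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=3):
    """Encode a peptide window as k-spaced pair frequencies for k = 0..k_max.

    k_max=3 is an illustrative default, not a value taken from the article.
    Pairs containing non-standard characters (e.g. padding) are ignored.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        # Number of positions where a pair with k residues in between fits.
        n_positions = len(window) - k - 1
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        denom = n_positions if n_positions > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


# Hypothetical 21-residue window centred on a candidate lysine (K).
vec = cksaap("MKTAYIAKQRKISFVKSHFSR", k_max=3)
print(len(vec))  # 4 gap values x 400 pairs = 1600 features
```

Vectors of this kind, computed for ubiquitinated and non-ubiquitinated windows, could then be passed to a classifier such as a random forest and scored by cross-validated AUC, as the abstract describes.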