Please use this identifier to cite or link to this item: http://cmuir.cmu.ac.th/jspui/handle/6653943832/52075
Full metadata record
DC Field | Value | Language
dc.contributor.author | Phasit Charoenkwan | en_US
dc.contributor.author | Watshara Shoombuatong | en_US
dc.contributor.author | Hua Chin Lee | en_US
dc.contributor.author | Jeerayut Chaijaruwanich | en_US
dc.contributor.author | Hui Ling Huang | en_US
dc.contributor.author | Shinn Ying Ho | en_US
dc.date.accessioned | 2018-09-04T09:20:44Z | -
dc.date.available | 2018-09-04T09:20:44Z | -
dc.date.issued | 2013-09-03 | en_US
dc.identifier.issn | 19326203 | en_US
dc.identifier.other | 2-s2.0-84883364817 | en_US
dc.identifier.other | 10.1371/journal.pone.0072368 | en_US
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84883364817&origin=inward | en_US
dc.identifier.uri | http://cmuir.cmu.ac.th/jspui/handle/6653943832/52075 | -
dc.description.abstract | Existing methods for predicting protein crystallization obtain high accuracy using various types of complemented features and complex ensemble classifiers, such as support vector machine (SVM) and Random Forest classifiers. It is desirable to develop a simple and easily interpretable prediction method with informative sequence features to provide insights into protein crystallization. This study proposes an ensemble method, SCMCRYS, to predict protein crystallization, for which each classifier is built by using a scoring card method (SCM) with estimating propensity scores of p-collocated amino acid (AA) pairs (p = 0 for a dipeptide). The SCM classifier determines the crystallization of a sequence according to a weighted-sum score. The weights are the composition of the p-collocated AA pairs, and the propensity scores of these AA pairs are estimated using a statistic with optimization approach. SCMCRYS predicts the crystallization using a simple voting method from a number of SCM classifiers. The experimental results show that the single SCM classifier utilizing dipeptide composition with accuracy of 73.90% is comparable to the best previously-developed SVM-based classifier, SVM_POLY (74.6%), and our proposed SVM-based classifier utilizing the same dipeptide composition (77.55%). The SCMCRYS method with accuracy of 76.1% is comparable to the state-of-the-art ensemble methods PPCpred (76.8%) and RFCRYS (80.0%), which used the SVM and Random Forest classifiers, respectively. This study also investigates mutagenesis analysis based on SCM and the result reveals the hypothesis that the mutagenesis of surface residues Ala and Cys has large and small probabilities of enhancing protein crystallizability considering the estimated scores of crystallizability and solubility, melting point, molecular weight and conformational entropy of amino acids in a generalized condition. The propensity scores of amino acids and dipeptides for estimating the protein crystallizability can aid biologists in designing mutation of surface residues to enhance protein crystallizability. The source code of SCMCRYS is available at http://iclab.life.nctu.edu.tw/SCMCRYS/. © 2013 Charoenkwan et al. | en_US
dc.subject | Agricultural and Biological Sciences | en_US
dc.subject | Biochemistry, Genetics and Molecular Biology | en_US
dc.subject | Medicine | en_US
dc.title | SCMCRYS: Predicting Protein Crystallization Using an Ensemble Scoring Card Method with Estimating Propensity Scores of P-Collocated Amino Acid Pairs | en_US
dc.type | Journal | en_US
article.title.sourcetitle | PLoS ONE | en_US
article.volume | 8 | en_US
article.stream.affiliations | National Chiao Tung University Taiwan | en_US
article.stream.affiliations | Chiang Mai University | en_US
Appears in Collections: CMUL: Journal Articles
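
Illustrative note on the abstract: the weighted-sum scoring and voting it describes can be sketched roughly as follows. This is a minimal sketch written for this record, not the SCMCRYS source code. It assumes that a p-collocated pair means two residues with p residues between them (consistent with p = 0 being a dipeptide), that a sequence is called crystallizable when its score exceeds a threshold, and that "simple voting" means a majority vote. All function names are hypothetical, and any propensity scores or thresholds passed in are placeholders, not values estimated in the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def pair_composition(seq, p=0):
    """Fraction of each ordered AA pair whose members have p residues between them."""
    gap = p + 1
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - gap):
        pair = seq[i] + seq[i + gap]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in PAIRS}
    return {pair: count / total for pair, count in counts.items()}


def scm_score(seq, propensity, p=0):
    """Weighted sum: pair composition (weights) times pair propensity scores."""
    comp = pair_composition(seq, p)
    return sum(comp[pair] * propensity.get(pair, 0.0) for pair in PAIRS)


def scm_predict(seq, propensity, threshold, p=0):
    """Assumed decision rule: crystallizable if the score exceeds the threshold."""
    return scm_score(seq, propensity, p) > threshold


def vote(seq, classifiers):
    """Assumed majority vote over (p, propensity, threshold) triples."""
    votes = sum(scm_predict(seq, prop, thr, p) for p, prop, thr in classifiers)
    return 2 * votes > len(classifiers)
```

In this sketch each classifier in the ensemble is just a triple of gap size, score table, and threshold, so an ensemble over several values of p is a plain list. How the paper actually estimates the propensity scores (the "statistic with optimization approach") is not covered here.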


