Computational approaches to isoelectric point (pI) prediction have had little success, achieving a rough accuracy level of only 30%. In 2005, Matthew Conte studied the relationship between sequence characteristics and the accuracy of isoelectric point prediction. His results indicated that charges on adjacent amino acids could have a significant impact on the overall predicted pI of a protein. In this study we introduce an evolutionary computation approach aimed at accounting for these problem dipeptides. For each possible dipeptide involving charged amino acids (7 chargeable groups, giving 7 × 7 = 49 possibilities), the algorithm predicts a pKa value that, when included in the pI prediction algorithm, should result in a more accurate prediction. Accounting for these adjacent charged amino acids improved the pI prediction for those proteins with the greatest deviation between experimental and predicted pI (ΔpI > 0.7). However, the improvement did not generalize: incorporating these values had the reverse effect on the remaining proteins, most notably those from the most accurate data set (ΔpI < 0.1). While this research lays a foundation for improving the pI prediction algorithm, further exploration is needed to increase overall accuracy.
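To make the idea of dipeptide-specific pKa values concrete, the following is a minimal sketch of how such values might enter a standard pI calculation. It is not the thesis's implementation. It assumes the seven chargeable groups are the side chains of D, E, C, Y, H, K, and R, uses illustrative textbook-style pKa values, and adopts a hypothetical rule in which a residue's pKa is replaced when the pair formed with its left-hand neighbor appears in an override table. The names `net_charge`, `predict_pi`, and `pair_pka` are invented for illustration, and the evolutionary search that would tune the 49 override values is not shown.

```python
from itertools import product

# Illustrative side-chain pKa values (assumed for this sketch, not from the thesis).
SIDE_CHAIN_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1,
                  "H": 6.0, "K": 10.5, "R": 12.5}
ACIDIC = {"D", "E", "C", "Y"}          # groups that carry -1 when deprotonated
N_TERM_PKA, C_TERM_PKA = 8.0, 3.6      # assumed terminal pKa values

# All ordered pairs of chargeable residues: 7 x 7 = 49 candidate dipeptides.
DIPEPTIDES = list(product(SIDE_CHAIN_PKA, repeat=2))


def _group_charge(pka, ph, acidic):
    """Henderson-Hasselbalch fractional charge of a single ionizable group."""
    if acidic:
        return -1.0 / (1.0 + 10 ** (pka - ph))
    return 1.0 / (1.0 + 10 ** (ph - pka))


def net_charge(seq, ph, pair_pka=None):
    """Net charge of a peptide at a given pH.

    pair_pka is a hypothetical override table mapping (left, right) residue
    pairs to the pKa used for the right-hand residue.
    """
    pair_pka = pair_pka or {}
    charge = _group_charge(N_TERM_PKA, ph, acidic=False)
    charge += _group_charge(C_TERM_PKA, ph, acidic=True)
    for i, res in enumerate(seq):
        if res not in SIDE_CHAIN_PKA:
            continue  # residue has no ionizable side chain in this model
        pka = SIDE_CHAIN_PKA[res]
        if i > 0:
            # Use the dipeptide-specific pKa when one is supplied.
            pka = pair_pka.get((seq[i - 1], res), pka)
        charge += _group_charge(pka, ph, acidic=res in ACIDIC)
    return charge


def predict_pi(seq, pair_pka=None, tol=1e-4):
    """Find the pH at which the net charge is zero, by bisection.

    Net charge decreases monotonically with pH, so bisection on [0, 14] works.
    """
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid, pair_pka) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    baseline = predict_pi("ADDK")
    adjusted = predict_pi("ADDK", {("D", "D"): 5.0})  # arbitrary example override
    print(f"baseline pI {baseline:.2f}, with D-D override {adjusted:.2f}")
```

In a setup like this, an evolutionary algorithm would treat the 49 entries of `pair_pka` as the genome and score each candidate by the deviation between predicted and experimental pI over a set of proteins.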
Department, Program, or Center
Thomas H. Gosnell School of Life Sciences (COS)
Parkin, Chris, "An Evolutionary Computation Approach to Optimization of Isoelectric Point Prediction in Proteins" (2006). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus