Adam Handen


Intrinsically disordered proteins (IDPs) are polypeptide sequences that do not form a rigid three-dimensional structure when isolated in the cytosol of the cell. These sequences are common in most genomes and are usually involved in many protein-protein interactions. IDPs also play a key role in many diseases, including cancer and Huntington's disease. However, IDPs are difficult to study because of their amorphous shape, high mutation rate, and unique amino acid composition. These obstacles make homology studies especially difficult.

This study focused on generating a new substitution matrix designed to aid in homology studies of IDPs. The matrix was generated using a genetic algorithm (GA). GAs are alternative hill-climbing methods for finding solutions in complex problem spaces. A GA models the evolutionary process found in nature by "breeding" candidate solutions to the problem until one of sufficient quality is produced.
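The "breeding" loop described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the selection scheme, crossover and mutation operators, parameter values, and in particular `toy_fitness` are assumptions chosen only to keep the example short and runnable. A real fitness function would score how well a candidate matrix separates homologous from non-homologous sequences.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in sorted order


def random_matrix(rng, low=-4, high=4):
    """Random symmetric matrix, stored as a dict keyed by sorted residue pairs."""
    return {(a, b): rng.randint(low, high)
            for i, a in enumerate(AMINO_ACIDS)
            for b in AMINO_ACIDS[i:]}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each entry is inherited from one parent at random."""
    return {k: (parent_a[k] if rng.random() < 0.5 else parent_b[k])
            for k in parent_a}


def mutate(matrix, rng, rate=0.05, step=1):
    """Nudge a small fraction of entries up or down by `step`."""
    return {k: (v + rng.choice((-step, step)) if rng.random() < rate else v)
            for k, v in matrix.items()}


def evolve(fitness, pop_size=30, generations=40, elite=2, seed=0):
    """Breed matrices for a fixed number of generations; return the fittest."""
    rng = random.Random(seed)
    population = [random_matrix(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection (assumed)
        children = list(ranked[:elite])     # elitism: best candidates survive unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=fitness)


def toy_fitness(matrix):
    """Placeholder objective (hypothetical): reward identities, penalize mismatches."""
    return sum(v if a == b else -v for (a, b), v in matrix.items())


best = evolve(toy_fitness)
```

Because the best candidates are carried over unchanged each generation, the top fitness in the population never decreases, although nothing guarantees that the search reaches a global optimum.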

The GA implemented in this study produced a substitution matrix for differentiating between homologous and non-homologous proteins containing disordered regions. The matrix showed some correlation with the patterns of evolution found in disordered proteins and with their general sequence makeup. However, when compared to a commonly used substitution matrix, BLOSUM, the GA's solution did not show significant improvement. The results nonetheless serve as a general proof of concept and suggest that, given modifications to the GA, more time, or more resources, a substitution matrix capable of outperforming BLOSUM may be achievable.
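One way to picture how a substitution matrix can be judged on this task is sketched below. This is an illustration under stated assumptions, not the evaluation procedure of this study: the sequences are toy strings, the alignment is ungapped, `identity_matrix` is a hypothetical stand-in for a real matrix such as BLOSUM, and `separation` is an assumed rank-based measure.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in sorted order


def identity_matrix(match=1, mismatch=-1):
    """Hypothetical stand-in matrix: fixed scores for matches and mismatches."""
    return {(a, b): (match if a == b else mismatch)
            for i, a in enumerate(AMINO_ACIDS)
            for b in AMINO_ACIDS[i:]}


def pair_score(seq_a, seq_b, matrix):
    """Sum substitution scores over an ungapped, equal-length alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(matrix[tuple(sorted((a, b)))] for a, b in zip(seq_a, seq_b))


def separation(matrix, homologous_pairs, non_homologous_pairs):
    """Fraction of (homologous, non-homologous) combinations where the
    homologous pair receives the strictly higher score."""
    pos = [pair_score(a, b, matrix) for a, b in homologous_pairs]
    neg = [pair_score(a, b, matrix) for a, b in non_homologous_pairs]
    wins = sum(p > n for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
matrix = identity_matrix()
homologous = [("SEEK", "SEEK"), ("PEPS", "PEQS")]
non_homologous = [("SEEK", "PGGA")]
result = separation(matrix, homologous, non_homologous)
```

A value near 1.0 means the matrix ranks homologous pairs above non-homologous ones; comparing this value for two matrices on the same data is one simple way to ask whether one outperforms the other.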

Degree Name

Bioinformatics (MS)

Department, Program, or Center

Thomas H. Gosnell School of Life Sciences (COS)


Michael Osier


Physical copy available through RIT's The Wallace Library at: QP551 .H36 2013


RIT – Main Campus
