Balanced prediction of protein functions: a hybrid approach using homologies and protein interactions

Nguyen, C., Mannino, M., Gardner, K., and Cios, K.
Journal of Bioinformatics and Computational Biology, Vol. 6, Issue 1, pp. 203–222

We introduce a new hybrid algorithm, ClusFCM, which combines clustering and fuzzy cognitive maps to predict protein function. ClusFCM takes advantage of protein homologies and the protein interaction network to improve the low recall of existing prediction methods. ClusFCM exploits the fact that proteins of known function tend to cluster together, and it deduces functions not only through a protein's direct interactions with other known proteins, but also from other proteins in the network. We use ClusFCM to annotate protein functions for Saccharomyces cerevisiae (yeast), Caenorhabditis elegans (worm), and Drosophila melanogaster (fly), using protein-protein interaction data from the General Repository for Interaction Datasets (GRID) database and functional labels from Gene Ontology (GO) terms. The algorithm’s performance is compared with four state-of-the-art methods for function prediction: Majority, χ² statistics, Markov random field, and Functional Flow. The comparison uses the Matthews correlation coefficient, harmonic mean, and receiver operating characteristic (ROC) curves as measures. The results indicate that ClusFCM predicts protein functions with high recall while not lowering precision.
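
To make the evaluation measures concrete, the following is a minimal sketch of how the Matthews correlation coefficient and a harmonic mean could be computed from confusion-matrix counts for one function label. The abstract does not define "harmonic mean"; the sketch assumes it means the harmonic mean of precision and recall. The function names and the example counts are illustrative and do not come from the paper.

```python
import math


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def harmonic_mean(precision, recall):
    """Harmonic mean of precision and recall (assumed meaning of 'harmonic mean')."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only (not results from the paper).
p, r = precision_recall(tp=40, fp=10, fn=20)
hm = harmonic_mean(p, r)
mcc = matthews_corrcoef(tp=40, tn=30, fp=10, fn=20)
```

Under this reading, a method that raises recall without lowering precision increases the harmonic mean, which is the kind of balance the abstract claims for ClusFCM.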