Drug Development Research 1994 July 6(33): 125-32

 

Estimating Protein Function From Combinatorial Sequence Data Using Decision Algorithms and Neural Networks

Goldman, E.R., Füllen, and Youvan, D.C.

Palo Alto Institute of Molecular Medicine, Mountain View, CA 94043, USA.

FSPM-Strukturbildungsprozesse, Universität Bielefeld, Bielefeld, Germany
 
Correlations between protein sequences and phenotypes were explored using databases of combinatorial cassette mutants of pigment-protein complexes. Heuristically formulated decision algorithms and computer-implemented neural networks were compared for their accuracy in classifying mutants into phenotypic categories. For the databases examined, decision algorithms employing very simple rules classified mutants correctly 80-84% of the time, based only on the amino acid sequence of the mutagenized region. Such decision algorithms did not require any rules involving site-to-site interactions; rather, they performed well because specific critical sites in the protein are stringent and accept only a restricted set of amino acids. In some cases, neural networks scored almost 10% higher than decision algorithms on the same databases (i.e., 94%). However, the success of the primitive decision algorithms and perceptrons at sorting sequences into categories suggests that linear effects predominate in the classification of a mutant's phenotype. Such methods should be generally applicable to the broad spectrum of databases currently being generated in combinatorial chemistry and biology experiments.
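
The kind of site-by-site decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the positions, the allowed residue sets, the sequences, and the phenotype labels below are all invented for illustration, and the names ALLOWED, classify, accuracy, and TOY_DATA are hypothetical. The only idea taken from the abstract is that each critical site is checked independently against a restricted set of amino acids, with no site-to-site interaction terms.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rule table (an assumption, not from the paper):
# index within the mutagenized region -> residues tolerated at that site.
ALLOWED: Dict[int, Set[str]] = {
    0: set("AGS"),
    2: set("LIVM"),
}


def classify(seq: str, allowed: Dict[int, Set[str]] = ALLOWED) -> bool:
    """Predict 'functional' only if every critical site holds a tolerated residue.

    Each site is tested on its own; no pairwise (site-to-site) rules are used.
    """
    return all(seq[pos] in residues for pos, residues in allowed.items())


def accuracy(labelled: List[Tuple[str, bool]],
             allowed: Dict[int, Set[str]] = ALLOWED) -> float:
    """Fraction of labelled mutants whose predicted class matches the label."""
    hits = sum(classify(seq, allowed) == label for seq, label in labelled)
    return hits / len(labelled)


# Made-up mutants with made-up phenotype labels (True = functional).
TOY_DATA: List[Tuple[str, bool]] = [
    ("AKLT", True),
    ("GRVT", True),
    ("WKLT", False),
    ("AKDT", False),
    ("SKWT", True),   # violates the site-2 rule, so the simple rule misclassifies it
]

if __name__ == "__main__":
    print(f"accuracy on toy data: {accuracy(TOY_DATA):.2f}")
```

On this toy set the rule misclassifies one of five mutants, which illustrates how a purely per-site rule can do reasonably well while still missing cases that a more flexible classifier might capture.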

 

 
