CyberDope

CyberDope (Version 1.2) is a user-friendly, interactive program for protein engineering and molecular evolution that restricts the sequence complexity of combinatorial mutagenesis in a logical fashion by using novel algorithms and expert heuristics. CyberDope calculates optimized nucleotide mixtures based on either the maximization of group probability or the minimization of a 21-dimensional sum-of-the-squares-of-differences fitting function. The program's output is reported at "machine resolution", so no change is required in the conventional procedures by which commercial DNA synthesizers construct oligonucleotides. CyberDope has been used to create optimized cassettes from phylogenetically related sequences or from random mutagenesis data. With a 16-site library, a ten million-fold amplification over random cassette mutagenesis has been reported (Bio/Technology*, 11:1548-1552).

Tic-Tac-Toe Example

The genetic code can be visualized in 3-D to illustrate how target sets of amino acids can be selectively encoded by mixing nucleotides. Consider how one might use this representation to encode a target set of the aromatic amino acids: Phenylalanine (F), Tryptophan (W), and Tyrosine (Y). Highlighting these residues and reading the corresponding nucleotides off the three axes yields one possible solution.

[3D Code] [3D Code Marked]

The target set {F,W,Y} is optimally encoded by T(TAG)(TG), which reduces the total triplet complexity from the 64 possibilities of a random NNN doping to 1 x 3 x 2 = 6 possibilities. Leucine (L), Cysteine (C), and the stop signal (X) are unavoidable cross-products of the {F,W,Y} target set. This is a relatively easy problem to solve using the 3-D graphic. More difficult problems, including weighted proportions of amino acids, are readily calculated using the CyberDope program. CyberDope is used with well-established molecular genetics and protein engineering techniques developed by KAIROS staff: TSM, Target Set Mutagenesis; REM, Recursive Ensemble Mutagenesis; EEM, Exponential Ensemble Mutagenesis.
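The count of six triplets and the three cross-products can be checked by expanding the dope against the standard genetic code. The following is a minimal sketch, not CyberDope itself; the helper names parse_dope and expand are hypothetical, and the stop signal is written as X to match the notation used here.

```python
import re
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; "X" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def parse_dope(text):
    """Split a dope such as 'T(TAG)(TG)' into one base mixture per codon position."""
    positions = []
    for group, single in re.findall(r"\(([ACGT]+)\)|([ACGTN])", text):
        mix = group or single
        positions.append("ACGT" if mix == "N" else mix)  # N means any base
    return positions

def expand(text):
    """List every triplet that the dope can produce."""
    return ["".join(c) for c in product(*parse_dope(text))]

codons = expand("T(TAG)(TG)")
encoded = {CODON_TABLE[c] for c in codons}
cross_products = encoded - set("FWY")

print(len(codons), codons)        # number of triplets and the triplets themselves
print(sorted(cross_products))     # residues encoded in addition to the target set
```

The same two helpers accept any dope written in the notation used on this page, for example NN(GC) or (GAT)(TC)(GC).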

CyberDoping Examples

CyberDope offers three options for codon usage (NNN, NN(GC), and NN(GT)) and two error functions. The Maximum Group Probability (MaxGP) error function will encode all amino acids in the target set regardless of their frequency of occurrence. The Minimum Sum of Squares of Differences (MinSSD) error function takes into account the frequency of occurrence of amino acids in the target set and will drop infrequently used amino acids from the dope if doing so lowers the nucleotide complexity significantly.
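To see what the three codon-usage options mean in practice, the sketch below counts the triplets, amino acids, and stop codons that each one allows under the standard genetic code. It is an illustration only; the names OPTIONS and summarize are hypothetical and are not part of CyberDope.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; "X" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Each option as one base mixture per codon position (N = A, C, G, or T).
OPTIONS = {
    "NNN":    ["ACGT", "ACGT", "ACGT"],
    "NN(GC)": ["ACGT", "ACGT", "GC"],
    "NN(GT)": ["ACGT", "ACGT", "GT"],
}

def summarize(dope):
    """Count triplets, distinct amino acids, and stop codons allowed by a dope."""
    residues = [CODON_TABLE["".join(c)] for c in product(*dope)]
    return {
        "codons": len(residues),
        "amino_acids": len(set(residues) - {"X"}),
        "stop_codons": residues.count("X"),
    }

summary = {name: summarize(dope) for name, dope in OPTIONS.items()}
for name, stats in summary.items():
    print(name, stats)
```

Both restricted options halve the number of triplets while still reaching all 20 amino acids, and each leaves TAG as the only remaining stop codon.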

The first example output from CyberDope shows the results of calculations involving the target sets {K,N,R,S,T} and {F,I,L,M,W,A,Y}, which were also reduced to experimental practice in a publication by Goldman and Youvan (Bio/Technology*, 10:1557-1561). For each dope, the single-letter amino acid code is followed by its probability of occurrence. For {K,N,R,S,T} the program finds five equally optimized dopes. The next best set of dopes also encodes KNRST exclusively; however, the relative frequencies of the amino acids show greater disparity.
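The {K,N,R,S,T} case is small enough to explore by brute force. The sketch below assumes that every position is an equimolar mixture of one to four bases (15 mixtures per position, 3,375 dopes in all), which is a simplification: CyberDope's actual search and nucleotide ratios are not described here. The names MIXES, distribution, and exclusive_dopes are hypothetical. Dopes that encode the target set and nothing else are grouped by the spread between their most and least probable amino acids.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from itertools import combinations, product

# Standard genetic code, codons ordered T, C, A, G at each position; "X" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# All 15 non-empty base mixtures for one codon position (assumed equimolar).
MIXES = ["".join(m) for r in range(1, 5) for m in combinations(BASES, r)]

def distribution(dope):
    """Exact probability of each encoded residue under equimolar mixing."""
    counts = Counter(CODON_TABLE["".join(c)] for c in product(*dope))
    total = sum(counts.values())
    return {aa: Fraction(n, total) for aa, n in counts.items()}

def exclusive_dopes(target):
    """Dopes that encode every residue in the target set and no others."""
    target = set(target)
    return [d for d in product(MIXES, repeat=3) if set(distribution(d)) == target]

def show(dope):
    """Write a dope in the page's notation, e.g. A(CAG)(TA)."""
    return "".join(m if len(m) == 1 else f"({m})" for m in dope)

hits = exclusive_dopes("KNRST")
groups = defaultdict(list)
for dope in hits:
    dist = distribution(dope)
    groups[max(dist.values()) - min(dist.values())].append(dope)

for spread, dopes in sorted(groups.items()):
    print(f"spread {spread}:", ", ".join(show(d) for d in dopes))
```

Under these assumptions the search finds nine exclusive dopes: five share the flattest distribution (K, N, R, and S at 1/6 each, T at 1/3) and the other four are more uneven, which is consistent with the five equally optimized dopes and the next best set described above.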

In a second example, it is not possible to encode FILMWAY exclusively due to the structure of the genetic code. Amino acids CDEGKNRSTVX are also encoded. If we use MinSSD on the target set {F,I,L,M,W,A,Y}, we see that Alanine is no longer encoded and the set of unwanted amino acids has shrunk to CKNRS. If Alanine is important, as indicated by the relative weighting values in the target set {5A, 2F, 2I, 2L, 2M, 1W, 1Y}, CyberDope calculations demonstrate that the dope (GAT)(TC)(GC) will encode FILMA and STV; however, W and Y are dropped.
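The trade-off between the two error functions can be illustrated on the weighted example. The sketch below rests on assumptions that are not confirmed on this page: mixtures are equimolar, group probability is taken as the product of the target amino acids' probabilities (so a single missing residue drives it to zero, in line with MaxGP keeping every target residue), and the sum of squares of differences is taken over 21 symbols (20 amino acids plus stop, X) between the normalized target weights and the encoded probabilities. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position; "X" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYYXXCCXWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYMBOLS = sorted(set(AMINO))  # 20 amino acids plus "X": 21 dimensions

def distribution(dope):
    """Probability of each of the 21 symbols under equimolar mixing (assumed)."""
    counts = Counter(CODON_TABLE["".join(c)] for c in product(*dope))
    total = sum(counts.values())
    return {s: counts[s] / total for s in SYMBOLS}

def group_probability(dist, target):
    """Assumed definition: product of the target residues' probabilities."""
    return prod(dist[aa] for aa in target)

def ssd(dist, weights):
    """Assumed definition: squared differences summed over all 21 symbols."""
    total = sum(weights.values())
    return sum((weights.get(s, 0) / total - dist[s]) ** 2 for s in SYMBOLS)

weights = {"A": 5, "F": 2, "I": 2, "L": 2, "M": 2, "W": 1, "Y": 1}
dist = distribution(["GAT", "TC", "GC"])          # the dope (GAT)(TC)(GC)
encoded = {s for s, p in dist.items() if p > 0}

gp = group_probability(dist, weights)             # zero, because W and Y are dropped
err = ssd(dist, weights)

# For comparison: T(TAG)(TG) against the unweighted target {F,W,Y}.
gp_aromatic = group_probability(distribution(["T", "TAG", "TG"]), "FWY")

print(sorted(encoded), gp, round(err, 4), gp_aromatic)
```

With these definitions, a dope that drops W and Y can never win under a product-style group probability, whereas a sum-of-squares criterion can still accept it when the heavily weighted residues are matched well.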

Since the number of unweighted target sets is 2^21 (2,097,152 subsets of the 20 amino acids plus the stop signal X) and the number of weighted sets is even greater, it is important to develop a feel for the program and try alternative strategies. Turnaround time for trying a new dope (or another strategy) is only a few seconds.

*Note: Nature Biotechnology; previously Bio/Technology

