by Pengfei Tian, Robert B. Best
Most foldable protein sequences adopt only a single native fold. Recent protein design studies have, however, created protein sequences which fold into different structures apon changes of environment, or single point mutation, the best characterized example being the switch between the folds of the GA and GB binding domains of streptococcal protein G. To obtain further insight into the design of sequences which can switch folds, we have used a computational model for the fitness landscape of a single fold, built from the observed sequence variation of protein homologues. We have recently shown that such coevolutionary models can be used to design novel foldable sequences. By appropriately combining two of these models to describe the joint fitness landscape of GA and GB, we are able to describe the propensity of a given sequence for each of the two folds. We have successfully tested the combined model against the known series of designed GA/GB hybrids. Using Monte Carlo simulations on this landscape, we are able to identify pathways of mutations connecting the two folds. In the absence of a requirement for domain stability, the most frequent paths go via sequences in which neither domain is stably folded, reminiscent of the propensity for certain intrinsically disordered proteins to fold into different structures according to context. Even if the folded state is required to be stable, we find that there is nonetheless still a wide range of sequences which are close to the transition region and therefore likely fold switches, consistent with recent estimates that fold switching may be more widespread than had been thought.