Journal of Molecular Biology
Novel Strategy for Protein Exploration: High-throughput Screening Assisted with Fuzzy Neural Network
Introduction
To create proteins with desirable properties from a natural protein by mutation and selection, exhaustive screening processes, including library construction and assay of numerous samples from the library, are required.1 The size of the library, which determines the labor of screening, expands enormously when such a massive combinatorial space of protein sequences is to be explored. For example, to examine the mutational effect of only five residues in a protein, 3,200,000 (= 20^5) variants would theoretically have to be covered. In such cases, a conventional screening strategy that experimentally creates every mutation is not feasible, owing to the limited throughput of assay devices and the incompleteness of selection that is inherent in high-throughput technology.
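The combinatorial growth quoted above can be checked with a short calculation. The following is only an illustrative sketch: the function name `library_size` is hypothetical, and the only assumption is that each mutated position may hold any of the 20 standard amino acids.

```python
def library_size(n_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct variants when each of n_positions may hold
    any of alphabet_size residues (20 standard amino acids assumed)."""
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return alphabet_size ** n_positions


# Print the theoretical library size for one to five mutated positions.
for n in range(1, 6):
    print(f"{n} position(s): {library_size(n):,} variants")
```

Each additional randomized position multiplies the library by 20, which is why exhaustive coverage quickly outruns the throughput of any assay.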
Here, we propose a novel screening strategy introducing bioinformatic analysis to assist, revise, and integrate the high-throughput screening (HTS) process for the efficient exploration of combinatorial sequence space. As a model study, this strategy was applied to explore a novel enzyme with inverted enantioselectivity.
The importance of enantiomerically pure compounds has been widely expanding in pharmaceutical, agricultural, and synthetic organic chemistry fields.2, 3, 4 Enzymes, which are the biological enantioselective proteins, have been the key tool for the effective synthesis of these compounds.
In the genetic engineering of enzymes, evolutionary methods, which include random mutagenesis combined with HTS, have proved effective in tuning the enantioselectivity of enzymes.4, 5, 6, 7 Error-prone polymerase chain reaction (PCR) followed by saturation mutagenesis has succeeded in effectively inverting the enantioselectivity of hydantoinase toward d,l-5-(2-methylthioethyl) hydantoin.8 Reetz et al. have reported the effectiveness of combining error-prone PCR with DNA shuffling for obtaining a lipase variant of Pseudomonas aeruginosa with complete enantioselectivity inversion.9, 10 The same strategy has been successfully applied to other enzymes.11 These methods have increased the probability of obtaining a wider variation in samples from a limited source. However, the screening process still requires a great deal of labor, involving several cycles of screening thousands of variants with every round of random mutation, saturation mutation, or shuffling.
One of the directional approaches in library screening is the use of focused mutation at rationally determined positions in the enzyme. Recently, such a strategy has been shown to be effective in obtaining a bacterial lipase mutant with inverted enantioselectivity.12 In this study, the mutation sites were rationally determined based on a three-dimensional (3D) structural model of the intermediate complex between the enzyme and an enantioactive substrate, which had already been used as a model substrate for a directed evolution experiment on an esterase,13 and mutational variants were obtained using a novel high-throughput technology, namely SIMPLEX (single-molecule-PCR-linked in vitro expression).14, 15, 16
Here, we attempted to construct a new strategy to effectively screen lipases with inverted enantioselectivity, from the (S)-form substrate to the (R)-form substrate, using a fuzzy neural network (FNN). The basic idea of our strategy is to feed the data from HTS into the FNN and to use the result of the analysis as feedback to design a more effective experiment for obtaining the objective enzyme.
FNN is a type of artificial neural network that automatically constructs complex model structures by learning the hidden relationship between input and output data, and it functions as a predictor.17 Compared with conventional artificial neural networks, FNN has an advantageous feature, namely the “fuzzy layer.” This layer enables interpretation of the model structure and extraction of the quantified relationship between input and output values as a rule, designated a “fuzzy rule.”16 Such a feature is significant for a prediction program in protein design, where conventional artificial neural networks function only as a black-box tool. We have utilized this feature of FNN in a wide range of research fields to predict significant factors and factor combinations in complex phenomena, such as industrial manufacturing,18, 19, 20 coffee taste modeling,21 peptide prediction,22 and gene profiles from microarray data.23, 24
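To make the notion of an interpretable “fuzzy rule” concrete, the following is a minimal sketch of fuzzy inference with two membership grades (“low” and “high”) per input. It is not the model used in this study: the sigmoid membership functions, the input names, and the rule weights are all hypothetical, and the weights are fixed by hand here, whereas an FNN would learn them from the screening data.

```python
import itertools
import math


def memberships(x, center=0.5, slope=10.0):
    """Grades of 'low' and 'high' for a scaled input x in [0, 1].
    Complementary sigmoids are assumed, so the two grades sum to 1."""
    high = 1.0 / (1.0 + math.exp(-slope * (x - center)))
    return {"low": 1.0 - high, "high": high}


def fnn_predict(inputs, rule_weights):
    """Weighted sum of rule consequents.

    inputs: dict mapping input name -> scaled value.
    rule_weights: dict mapping a tuple of labels (one per input, in
    sorted name order) to that rule's consequent weight.
    """
    names = sorted(inputs)
    grades = [memberships(inputs[name]) for name in names]
    total = 0.0
    for labels in itertools.product(("low", "high"), repeat=len(names)):
        # Firing strength of a rule = product of its membership grades.
        firing = math.prod(g[label] for g, label in zip(grades, labels))
        total += firing * rule_weights[labels]
    return total


# Hypothetical rule table for two residue descriptors (illustrative values).
RULES = {
    ("low", "low"): 0.1,
    ("low", "high"): 0.9,
    ("high", "low"): 0.2,
    ("high", "high"): 0.3,
}

score = fnn_predict({"descriptor_a": 0.0, "descriptor_b": 1.0}, RULES)
print(round(score, 3))
```

Because every weight is attached to a readable combination of labels, the largest entry in the table can be read directly as a rule of the form “if descriptor_a is low and descriptor_b is high, then the output is high,” which is the kind of interpretation a black-box network does not offer.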
To our knowledge, this work is the first trial in which high-throughput technology and computational technology have actually been combined in an interactive manner for effective enzyme engineering. Our objective was, with less labor and without prior knowledge, to use data from the first screening as clues for the subsequent protein engineering and, in turn, to explore novel enzymes that were missed in the HTS. The scheme, the experimental results, and future prospects are discussed.
Section snippets
Data acquisition of enantioselective lipase variants and experimental scheme
As a model study, an integrated protein exploration strategy combining high-throughput technology (in this case, SIMPLEX) and FNN, a program that automatically extracts rules to interpret complex phenomena, was used for obtaining new enzymes with inverted enantioselectivity. The research scheme is described briefly in Figure 1. In normal HTS, only the few selected winners give us meaningful information (Figure 1(a)–(d)), and the others are discarded. On the other hand, in our proposed strategy
Discussion
The objective of the present study was to establish a novel strategy for exploring proteins with a selective activity, by combining HTS technology with bioinformatic analysis (Figure 1). Here, as a model case, we attempted to explore lipases with the objective enantioselectivity. It is important to note that in most bioinformatic studies, the prediction results are only validated by in silico data and rarely validated by following up with actual experiments. However, in our study, we designed
Screening of inverted enantioselective lipase from variant library
The variant library of the Burkholderia cepacia KWI-56 lipase was constructed by the SIMPLEX (single-molecule-PCR-linked in vitro expression) method, which combines single-molecule PCR with cell-free protein synthesis. The lipase variants were screened for enantioselectivity inverted from the (S)-form substrate to the (R)-form substrate. To assay the lipase activity, 10 μl of cell-free reaction solution from the variant library was added to 90 μl of lipase assay solution (2 mM of (R)- or (S)-p-nitrophenyl
Acknowledgements
This study was partly supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (Nos. 16360411, 15360439 and 17206082). We also acknowledge the Hori Informational Science Foundation for financial support.
References (28)
- et al. High-throughput screening of enzyme libraries. Curr. Opin. Biotechnol. (2000)
- Enzymes in the synthesis of chiral drugs. Enzyme Microb. Technol. (1993)
- et al. Novel methods for biocatalyst screening. Curr. Opin. Chem. Biol. (2001)
- et al. Directed evolution of an enantioselective lipase. Chem. Biol. (2000)
- et al. Directed evolution of epoxide hydrolase from A. radiobacter toward higher enantioselectivity by error-prone PCR and DNA shuffling. Chem. Biol. (2004)
- et al. Inverting enantioselectivity of Burkholderia cepacia KWI-56 lipase by combinatorial mutation and high-throughput screening using single-molecule PCR and in vitro expression. J. Mol. Biol. (2003)
- et al. High-throughput, cloning-independent protein library construction by combining single-molecule DNA amplification with in vitro expression. J. Mol. Biol. (2002)
- et al. Fuzzy control of bioprocess. J. Biosc. Bioeng. (2000)
- et al. Fuzzy neural network-based prediction of motif for MHC class II binding peptides. J. Biosc. Bioeng. (2001)
- et al. Selection of causal gene sets for lymphoma prognostication from expression profiling and construction of prognostic fuzzy neural network models. J. Biosc. Bioeng. (2003)
- A simple method for displaying the hydropathic character of a protein. J. Mol. Biol.
- The characterization of amino acid sequences in proteins by statistical methods. J. Theor. Biol.
- Enzymatic synthesis of chiral intermediates for pharmaceuticals. J. Ind. Microbiol. Biotechnol.
- Recent progress in biomolecular engineering. Biotechnol. Prog.