INTRODUCTION

Protein microarray technology has made enormous progress in the last decade, increasingly becoming an important research tool for the study and detection of proteins, protein-protein interactions and numerous other biotechnological applications (1–4). Protein microarrays have advantages over more traditional methods for the study of molecular interactions: they consume little sample and have potential for miniaturization. Protein microarrays displaying multiple biologically active proteins simultaneously have the potential to provide high-throughput protein analysis in the same way DNA arrays did for genomics research a decade ago. This feature is extremely important for the analysis of protein interactions at the proteome scale. The transition from DNA to protein microarrays, however, has required the development of specially tailored protein immobilization methods that preserve protein structure and biological function after the immobilization step. Several technologies have been developed in the last few years that allow the site-specific immobilization of proteins onto solid supports for the rapid production of protein microarrays using high-throughput expression systems, such as cell-free expression systems (5–7). The development of appropriate detection systems to monitor protein interactions has also been an important challenge for the optimal use of protein microarrays. Techniques such as fluorescence imaging, mass spectrometry (MS) and surface plasmon resonance (SPR) have recently been developed and adapted to interface with protein microarrays. During the last decade, a number of excellent reviews have appeared in the literature describing the concept, preparation, analysis and applications of protein microarrays, highlighting the increasing importance of this technology (1–4,8).
The aim of this review is to summarize the latest developments in protein microarray technology in the areas of protein immobilization, novel protein detection schemes and applications of this promising technology.

PROTEIN MICROARRAYS

Protein microarrays are usually divided into two groups: functional protein microarrays and protein-detecting microarrays (Fig. 1) (2,9). Functional protein microarrays are made by the immobilization of different purified proteins, protein domains or functional peptides. These types of microarrays are generally used to study molecular interactions and screen potential interacting partners. Protein-detecting microarrays, on the other hand, are made by the immobilization of capture reagents that specifically recognize particular proteins from complex mixtures. These microarrays are used for protein profiling, i.e. quantification of protein abundances and evaluation of post-translational modifications in complex mixtures.

Fig. 1

Common formats used for the preparation of protein microarrays. Functional protein microarrays (A) are used to study and identify new molecular interactions between proteins, small molecules or enzyme substrates, for example. Protein-detecting microarrays (B) are used to identify proteins from complex mixtures. In the sandwich format (B, left), captured proteins are detected by a secondary antibody typically labeled with a fluorescent dye to facilitate detection and quantification. In contrast to antibody microarrays, lysate microarrays (B, right) are typically immobilized onto nitrocellulose-coated glass slides (FAST slides) and detected using fluorescent-labeled solution-phase specific antibodies.

Functional Protein Microarrays

Understanding the network of molecular interactions that defines a particular proteome is one of the main goals of functional proteomics. Functional protein microarrays provide an extremely powerful tool to accomplish this daunting task, especially when assessing the activity of families of related proteins. In 2000, Schreiber and co-workers showed that purified recombinant proteins could be microarrayed onto chemically derivatized glass slides without seriously affecting their molecular and functional integrity (10). More recently, Snyder and co-workers have been able to immobilize ≈5,800 proteins from Saccharomyces cerevisiae onto microscope glass slides (11). This protein chip was then probed with different phospholipids to identify several lipid-binding proteins. The same authors also used this proteome chip for the identification of substrates for 87 different protein kinases (12). Using this microarray data set in combination with protein-protein interaction and transcription factor binding data, the authors were able to reveal several novel regulatory modules in yeast (12). Using a similar approach, Dinesh-Kumar and co-workers were able to construct a protein microarray containing 2,158 unique Arabidopsis thaliana proteins. This array was used for the identification of 570 phosphorylation substrates of mitogen-activated protein kinases, which included several transcription factors involved in the regulation of development, host immune defense, and stress responses (13). The analysis of proteome-wide microarrays from yeast was also recently used to find unexpected non-chromatin substrates for the essential nucleosomal acetyl transferase of H4 (NuA4) complex (14). In this interesting work, the authors discovered that the metabolic enzyme phosphoenolpyruvate carboxykinase is a natural substrate of NuA4 and that its acetylation is critical for regulating the chronological lifespan of yeast (14).
In another example, human proteome arrays were used for the detection of autoimmune response markers in several human cancers (15,16). Kirschner and co-workers have also used human proteome arrays to identify novel substrates of the anaphase-promoting complex (17). This was accomplished by probing the arrays with cell extracts that replicate the mitotic checkpoint and anaphase release and then probing the captured proteins with antibodies specific for detecting poly-ubiquitination (17). Functional protein microarrays have also been used to study families of interacting protein domains. Bedford and co-workers have shown that several protein domains (FF, FHA, PH, PDZ, SH2, SH3, and WW) can be immobilized in a microarray format while retaining their ability to mediate specific interactions (18). Similar approaches were used to study the interactions associated with WW domains in yeast (19), to study the interactions between Kaposi-sarcoma viral proteins and the host endocytic machinery (20), and to evaluate the interactions between different proline-rich peptides derived from the myelin basic protein and several SH3 domains (21).

Functional protein domain microarrays can also be used to quantify protein interactions. For example, in 2004 Blackburn and co-workers used microarrays containing multiple variants of the transcription factor p53 to study and quantify their DNA-binding preferences (22). By using fluorescent-labeled DNA probes, the authors were able to produce binding isotherms and extract the equilibrium dissociation constant for every p53 variant (22). MacBeath and co-workers have also used a similar approach to quantify the interactions of several human SH2 and PTB domains with different phosphotyrosine-containing peptides derived from human ErbB receptors (Fig. 2) (23). This type of protein microarray provides a unique way to study the binding properties of complete families of proteins and/or protein domains in an unbiased way. In addition, such arrays have the potential to generate data that, when collected in a quantitative way, could be used for training predictive models of molecular recognition (24–26). As a recent example, MacBeath and co-workers used functional microarrays containing multiple murine PDZ protein domains to screen potential interactions with 217 genome-encoded peptides derived from the murine proteome (24,25). The data generated were used to train a multidomain selectivity model that was able to predict PDZ domain-peptide interactions across the mouse proteome. Interestingly, the models showed that PDZ domains are not grouped into discrete functional classes; instead, they are uniformly distributed throughout the selectivity space. This finding strongly suggests that the PDZ domains across the proteome are optimized to minimize cross-reactivity (24,25).
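As an illustration of how a dissociation constant can be extracted from a binding isotherm of the kind described above, the following is a minimal sketch and not the procedure used in the cited studies. It assumes a simple 1:1 binding model, in which the fluorescence signal is Fmax·c/(c + Kd) at probe concentration c, and fits Kd by scanning a log-spaced grid. The function names, concentration units (μM) and the noise-free titration data are all hypothetical.

```python
def langmuir(conc, kd, fmax):
    """Signal predicted by a 1:1 binding isotherm (assumed model)."""
    return fmax * conc / (conc + kd)


def fit_kd(concs, signals, kd_min=1e-3, kd_max=1e3, steps=6001):
    """Least-squares estimate of (Kd, Fmax) by scanning Kd on a log grid.

    For each trial Kd the best Fmax has a closed form, because the
    model is linear in Fmax.
    """
    best = None
    for i in range(steps):
        kd = kd_min * (kd_max / kd_min) ** (i / (steps - 1))
        x = [c / (c + kd) for c in concs]
        fmax = sum(s * xi for s, xi in zip(signals, x)) / sum(xi * xi for xi in x)
        sse = sum((s - fmax * xi) ** 2 for s, xi in zip(signals, x))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]


# Hypothetical, noise-free titration generated with Kd = 0.5 uM, Fmax = 1000.
concs = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
signals = [langmuir(c, 0.5, 1000.0) for c in concs]

kd, fmax = fit_kd(concs, signals)
print(f"Kd ~ {kd:.3f} uM, Fmax ~ {fmax:.0f}")
```

With real array data, each spot (protein variant) would be fitted independently, and replicate spots and measurement noise would have to be taken into account before interpreting the fitted constants.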

Fig. 2

Quantitative interaction networks of tyrosine kinases associated with the Erb family of receptors, which were determined using protein microarrays displaying 96 SH2 and 37 PTB domains. The SH2 and PTB protein domains were probed with fluorescently labeled phosphopeptides representing the different tyrosine phosphorylation sites on the Erb kinases. The readout of peptide binding was monitored and quantified by fluorescence. The interaction maps (bottom panel) were constructed from the quantitative interaction data (156). Reprinted from reference (156) with permission from Elsevier.

Protein-Detecting Microarrays

As described above, functional protein microarrays allow high-throughput screening and quantification of protein interactions on a proteome-wide scale, thus providing an unbiased perspective on the connectivity of the different protein-protein interaction networks. Establishing how this information flows through these interacting networks, however, requires measuring the abundance and post-translational modifications of many proteins from complex biological mixtures. Protein-detecting microarrays are ideal reagents for this type of analysis. One of the most frequently used strategies to prepare this type of microarray involves the use of monoclonal antibodies as specific protein capture reagents. Antibodies have been classically well suited for this task, since there are a large number of commercially available specific antibodies, which can be easily immobilized onto solid supports (4,27–30). However, the potential problems associated with the use of antibodies for chip assembly, which might manifest themselves through moderate expression yields and by issues related to the stability and solubility of these large proteins, have led to the exploration of alternative protein scaffolds as a source for new, more effective and stable protein capture reagents (24,31,32). Suitable protein scaffolds that have been proposed include fibronectin domains, the Z domain of protein A, lipocalins and cyclotides, among others.

In general, antibody microarrays are well suited for detecting changes in the abundances of proteins in biological samples with a relatively large dynamic range (33). For example, Haab and co-workers made use of antibody microarrays for serum-protein profiling in order to identify potential biomarkers in prostate cancer (33). Using this approach, the authors were able to identify five proteins (immunoglobulins G and M, α1-anti-chymotrypsin, villin and the Von Willebrand factor) that had significantly different levels of expression between the prostate cancer samples and control samples from healthy individuals.

In a fashion similar to a sandwich ELISA, antibody microarrays quite often make use of a second antibody directed towards a different epitope of the protein to be analyzed. This facilitates the detection and quantification of the corresponding analyte. This approach has been used for monitoring changes in the phosphorylation state of host proteins (34), including receptor tyrosine kinases (35), and for serum protein profiling to identify new biomarkers in prostate cancer (36), among other applications. The use of this approach is usually limited, however, by the availability of suitable antibodies that can be used for capture and detection. Moreover, the detection step requires the simultaneous use of multiple fluorescent-labeled antibodies, which may increase background signal as well as the risk of cross-reactive binding as the number of antibodies increases. A way to overcome this problem is to label the proteins in the biological sample to be analyzed using one or more fluorescent dyes (37). This approach allows one to perform ratiometric comparisons between different samples by using spectrally distinct fluorophores. This strategy has been employed for the discovery of molecular biomarkers in different types of human cancer (38–40). It should be highlighted, however, that non-specific chemical labeling of proteins introduces chemical modifications on their surface and, therefore, may alter antibody recognition and lead to false signals. Also, this approach requires the homogeneous labeling of proteins across different samples, which in most cases cannot be completely guaranteed. These drawbacks can, in principle, be avoided by using a label-free detection scheme. However, nearly all of the different methods available for this task (see below) still lack the sensitivity required for most biological applications.
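The ratiometric comparison described above can be sketched as follows. This is only an illustrative calculation, not a protocol from the cited work: it assumes that each capture spot yields one background-corrected intensity per dye channel, and it centers the log2 ratios on the array median as a crude correction for unequal labeling of the two samples. The spot names, intensities and the median-centering step are all assumptions introduced for the example.

```python
import math
from statistics import median


def log2_ratios(sample, reference):
    """Per-spot log2(sample/reference), centered on the array median.

    `sample` and `reference` map spot names to positive intensities
    measured in two spectrally distinct dye channels.
    """
    raw = {spot: math.log2(sample[spot] / reference[spot]) for spot in sample}
    center = median(raw.values())  # assumed global normalization step
    return {spot: value - center for spot, value in raw.items()}


# Hypothetical intensities for three antibody spots.
sample = {"ab1": 4000.0, "ab2": 1000.0, "ab3": 2000.0}
reference = {"ab1": 1000.0, "ab2": 1000.0, "ab3": 1000.0}

ratios = log2_ratios(sample, reference)
for spot, value in ratios.items():
    print(spot, round(value, 2))
```

Median centering presumes that most analytes do not change between the two samples; when that assumption fails, a dye-swap replicate or spiked-in standards would be a more defensible basis for normalization.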

Although antibody microarrays are well suited for protein profiling, proteome-wide applications have not been accomplished yet. This is mainly due to the lack of available, well-validated antibodies. An ingenious solution proposed by Lauffenburger and co-workers, however, is to use a combination of different experimental approaches with the data generated by microarrays (41,42). In this work, the authors combined data gathered from antibody microarrays, enzymatic assays, immunoblotting, and flow cytometry to assemble a network of ≈10,000 interactions in HT-29 cells treated with different combinations of cytokines (41). All of this information was later used to uncover mechanisms of crosstalk involving pro- and anti-apoptotic signals induced by different cytokines (42).

Protein Lysate Microarrays

An interesting alternative to antibody microarrays is to immobilize cell lysates and then use specific monoclonal antibodies to identify and quantify the presence of a particular analyte in the corresponding lysate. This technology was first described by Liotta and co-workers to monitor pro-survival checkpoint proteins as a function of cancer progression (43). The same approach has recently been used for the discovery and validation of specific biomarkers for disease diagnosis and patient stratification. Utz and co-workers (44) have also made use of lysate microarrays to study the kinetics of intracellular signaling by tracking 62 phosphorylation sites in stimulated Jurkat cells, which allowed them to discover a previously unknown connection between T-cell receptor activation and Raf-1 activity (44).

In protein lysate microarrays, every spot in the microarray contains the entire set of biological proteins to be analyzed. This means that in order to analyze the abundance and modification states of different proteins present in the lysate, it is necessary to prepare as many copies of the array as there are proteins to be analyzed. Lysate microarrays also denature the proteins to be analyzed during the immobilization step onto the solid support. This makes it impossible to study complex protein-protein interactions and requires the use of specific and well-validated antibodies for the recognition of specific continuous protein epitopes. This is a serious limitation of this technique, since it only allows the analysis of proteins that have already been discovered and for which suitable antibodies are available. In this regard, it should be noted that the majority of commercially available antibodies typically show substantial cross-reactivity and, therefore, are not appropriate for this type of approach. Only antibodies able to provide a single band in a standard Western blot should be used. Moreover, the blocking and detection protocols, as well as the composition of the lysis buffer, have been shown to substantially affect antibody performance (45), indicating that further developments are required for the widespread use of this technology.

NOVEL APPROACHES FOR PROTEIN IMMOBILIZATION

The immobilization of proteins onto solid supports has traditionally relied on non-specific adsorption (46,47) or covalent crosslinking of naturally occurring chemical groups within proteins (47–49). These approaches usually provide a random orientation of the immobilized protein onto the solid support, which may compromise the structural and/or functional integrity of the protein (50). This is a key issue for the fabrication of functional protein microarrays as described above. The use of recombinant affinity tags as capture reagents offers site-specific immobilization. The most commonly used affinity tags include biotin/avidin (51–53), His-tag/Ni2+-nitrilotriacetic acid (11,54) and glutathione-S-transferase (GST)/glutathione (GSH) (12,55).

Immobilization of antibodies through the Fc region onto protein A- or protein G-coated surfaces has also been used for the creation of antibody microarrays (56,57). Additionally, thioredoxin (58), maltose-binding protein (59) and chitin-binding protein (60) have been developed as tags for the immobilization of the corresponding fusion proteins. Protein-DNA conjugates have also been recently reported for DNA-directed immobilization (DDI) of proteins onto complementary DNA microarrays (61,62). Most of these interactions, however, are reversible and not stable over time (63–67). The use of site-specific chemical ligation reactions for the immobilization of proteins overcomes this limitation by allowing the proteins to be arranged in a defined, controlled fashion with exquisite chemical control (see references (29,68,69) for recent reviews in this field). This type of reaction requires two unique and mutually reactive groups on the protein and the solid support used for the immobilization step (Fig. 3). Ideally, the reaction between these groups should be highly chemoselective and compatible with physiological conditions to avoid denaturation during the immobilization step (28,70). Finally, it would be desirable for these unique reactive groups to be directly engineered into the proteins to be immobilized by using standard recombinant expression techniques.

Fig. 3

Site-specific and covalent immobilization of a functional protein onto a chemically modified surface using a chemoselective ligation reaction.

Most of the chemoselective methods suitable for site-specific immobilization of proteins described in the literature rely on ligation methods originally designed for the chemical engineering of proteins (71–77). Key to these methods is the introduction of a unique reacting group at a defined position in the protein to be immobilized, which can later react in a chemoselective manner with a complementary group previously introduced into the surface (Fig. 3, see also references (4,27,29,69,78) for recent reviews).

Surface Modification

The most common solid supports employed for the immobilization of proteins in micro- and nano-biotechnology and biomedical applications involve the use of metals and silicon- and semiconductor-based substrates. Trialkoxysilanes such as 3-aminopropyl-trialkoxysilane (APS) or 3-mercaptopropyl-trialkoxysilane are typically employed for the chemical modification of silicon-based substrates for the introduction of amino (–NH2) and thiol (–SH) groups, respectively. These chemical groups can then be modified by the introduction of appropriate linkers allowing the chemoselective attachment of proteins. Long chain alkyl-trichlorosilanes are more reactive towards the silanol group than trialkoxysilanes and have also been employed for the chemical modification of silicon-based substrates. The higher reactivity of long alkyl-trichlorosilanes is due to the self-assembling properties of the long aliphatic chains, which result in the formation of highly ordered and densely packed monolayers with solid-state-like properties (79,80).

Compounds containing thiol or selenol (–SeH) groups can also be used to modify substrates based on transition metals, mostly gold and silver (80,81), or semiconductor materials (48). The chemical derivatization of gold surfaces using alkanethiols is by far one of the most commonly employed approaches (81,82). Our group has developed several synthetic schemes for the efficient preparation of modified alkanethiols (83,84) that were used for the selective immobilization of functional proteins onto gold and glass surfaces (83–86).

The use of organic polymeric materials, such as poly-dimethylsiloxane (PDMS), poly-methylmethacrylate (PMMA) and polycarbonate (PC), has also been explored as a potential alternative to inorganic solid supports for the production of protein microarrays (87,88). The use of these materials also requires the introduction of suitable reacting groups for the site-specific immobilization of proteins. Common techniques usually employed for this task involve the use of plasma oxidation followed by treatment with appropriate organosilanes for the functionalization of PDMS (89), treatment of PMMA with 1,6-hexanediamine for the introduction of reactive amino groups (90), or using sulfonation reactions on PC to provide sulfated-coated surfaces (29).

Protein Immobilization Using Expressed Protein Ligation

The use of Expressed Protein Ligation (EPL) for the site-specific immobilization of biologically active proteins onto solid supports has been pioneered by our group (84). This approach relies on the chemoselective reaction of recombinantly produced protein α-thioesters with surfaces containing N-terminal Cys residues. C-terminal α-thioester proteins can be readily expressed in Escherichia coli, using commercially available intein expression systems (91). This ligation reaction is exquisitely chemoselective under physiological-like conditions and results in the site-specific immobilization of the protein through its C-terminus. We have successfully used this approach for the production of protein arrays containing several biologically active proteins onto Cys-coated glass slides (84). Typically, the immobilization reaction is performed at room temperature for 18 h and requires a minimal protein concentration in the low μM range for acceptable levels of immobilization (84). Yao and co-workers have also reported a similar approach for the selective immobilization of N-terminal Cys-containing polypeptides (52) and proteins (92) onto solid supports derivatized with an α-thioester group.

Schneider-Mergener and co-workers have recently combined SPOT synthesis (93) and a thioester ligation for the creation of arrays containing more than 10,000 variants of WW protein domains (94). Using 22 different peptide ligands to probe the WW domain arrays, the authors were able to monitor more than 250,000 binding experiments (94).

Protein Immobilization Using the Staudinger Ligation Reaction

A modified version of the Staudinger ligation reaction has also been employed for the chemoselective immobilization of azido-containing proteins onto solid supports derivatized with a suitable phosphine (71,75,95–97). The azido function can be readily incorporated into recombinant proteins using E. coli methionine auxotroph strains (98,99). A reactive arylphosphine derivative can be easily introduced onto carboxylic- or amine-containing surfaces (63,97). It should be noted that when the protein to be immobilized has multiple methionine residues this type of immobilization is not site-specific. This limitation can be overcome, however, by using in vitro EPL for the site-specific introduction of an azido group at the C-terminus of recombinant proteins (97). This can also be accomplished by reacting the corresponding protein C-terminal α-thioesters with functional hydrazines containing the azido group for the site-specific introduction of this chemical group at the C-terminus of recombinant proteins (100).

Protein Immobilization Using “Click” Chemistry

The site-specific immobilization of azido- or alkyne-containing proteins onto alkyne- or azido-coated surfaces was recently accomplished by using the Cu(I)-catalyzed Huisgen 1,3-dipolar azide-alkyne cycloaddition, also known as “click” chemistry (101–103).

This is a very mild reaction that usually requires only the presence of Cu(I) as catalyst and is typically performed under physiological conditions. Under these conditions, the cycloaddition reaction is exquisitely regiospecific, affording only the 1,4-disubstituted 1,2,3-triazole. The catalyst Cu(I) is usually generated in situ by reduction of Cu(II) using reducing agents such as tris-[2-carboxyethyl]-phosphine hydrochloride (TCEP•HCl) or ascorbic acid (77).

Site-specific incorporation of an alkyne group at the C-terminus of recombinant proteins can also be accomplished by using in vitro EPL (101) or nucleophilic cleavage of intein fusion proteins with derivatized hydrazines (100). The alkyne function has also been introduced chemo-enzymatically into recombinant proteins by using protein farnesyltransferase (PFTase) (102,103). This approach allows the selective S-alkylation of the Cys residue located in C-terminal Cys-Aaa-Aaa-Xxx motifs (where Xxx = Ala, Ser) by farnesyl diphosphate analogs containing the alkyne function.

Taki and co-workers have also accomplished the introduction of the azido function onto the N-termini of proteins by using the enzyme L/F-transferase (104), which is known to catalyze the transfer of hydrophobic amino acids from an aminoacyl-tRNA to the N-terminus (105). This modification, called NEXT-A (N-terminal extension of protein by transferase and aminoacyl-tRNA synthetase), can be accomplished in one pot and can also work in the presence of other proteins or even in crude protein mixtures (106,107). The authors used this method to functionalize the N-terminus of lectin EW29Ch with p-azido-phenylalanine, which was then immobilized onto a solid support coated with 4-dibenzocyclooctynol (DIB) through a copper-free “click” chemistry ligation (108–110).

Waldmann and co-workers have also developed the “click sulfonamide reaction” (CSR) between sulfonyl azides and alkynes to immobilize proteins and other types of biomolecules onto solid supports (111). Using this approach the authors were able to immobilize a C-terminal alkyne-modified Ras-binding domain (RBD) of cRaf1 onto a sulfonyl azide modified surface. The resulting immobilized protein was biologically active and able to selectively bind to GppNHp-bound Ras but not to inactive GDP-bound Ras (111).

In principle, “click” chemistry can be used for the chemoselective immobilization of alkyne- or azido-containing recombinant proteins onto azido- or alkyne-coated surfaces, respectively. However, it has been recently reported that the immobilization of alkyne-modified proteins onto azide-coated surfaces proceeds more efficiently (101). This effect could be attributed to the fact that the alkyne function coordinates Cu(I) in solution more efficiently than the azido group, which could improve the immobilization reaction (101). As for the other ligation reactions mentioned above, the minimal concentration of protein required for acceptable levels of immobilization using this type of ligation is typically found in the low μM range (101,102).

Protein Immobilization Using Active Site-Directed Capture Ligands

The efficiency of the different ligation reactions described so far for the site-specific immobilization of proteins onto solid supports depends strongly on the protein concentration in order to reach acceptable levels of immobilization (84,101,102). This intrinsic limitation could be in principle minimized by introducing two complementary interacting moieties on the protein and the surface, thus allowing the formation of a transient and specific intermolecular complex. The formation of this complex should be able to bring both reactive groups in close proximity, which would facilitate the efficiency of the ligation reaction (see Fig. 4). In this case, the efficiency of the reaction should not be dictated only by the concentration of the protein to be immobilized but rather by the affinity constant between the two interacting complementary moieties.
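This argument can be made semi-quantitative with a simple equilibrium sketch. Assuming a 1:1 complex between the capture moiety on the surface and its partner on the protein, with the protein in large excess over the surface sites (both are illustrative assumptions, not taken from the cited work), the fraction θ of surface sites occupied at protein concentration [P] is:

```latex
\theta = \frac{[P]}{[P] + K_d}
```

Under this approximation θ exceeds 0.9 once [P] is about ten times K_d (θ = 10/11), so a tighter capture pair directly lowers the protein concentration needed to bring the reactive groups together on most surface sites.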

Fig. 4

Principle for site-specific protein immobilization using an active site-directed capture ligand approach.

Mrksich and co-workers have used this approach for the selective immobilization of cutinase fusion proteins onto surfaces coated with chlorophosphonate ligands (112) (Fig. 5). Cutinase is a 22 kDa serine esterase, which can selectively react with chlorophosphonate ligands (113). These ligands bind with high affinity to the active site of the enzyme by mimicking the tetrahedral transition state stabilized by the esterase during the hydrolysis of the ester function. Once the complex is formed, the side-chain of the catalytic serine residue in the esterase active site reacts covalently with the chlorophosphonate group to form a relatively stable phosphate bond (Fig. 5). This approach was used for the immobilization of calmodulin (112) and for the preparation of antibody arrays (114) onto gold-coated self-assembled monolayers derivatized with a chlorophosphonate capture ligand.

Fig. 5

A Site-specific immobilization of cutinase-fusion proteins using an active site-directed capture ligand. B Structure of F. solani cutinase enzyme free and bound to the inhibitor n-undecyl-O-methyl phosphonate chloride. The inhibitor is covalently bound through the side-chain hydroxyl group of the Ser120 residue, which is located at the active site of the enzyme (113).

Johnsson and co-workers have also used a similar approach for the site-specific immobilization of proteins but using human O6-alkylguanine-DNA alkyltransferase (AGT) as a protein capture reagent (115). These types of enzymes can accept a benzyl group from O6-benzylguanine (BG) derivatives, thus allowing the site-specific immobilization of AGT-fusion proteins onto O6-benzylguanine-coated slides (116).

Protein Immobilization by Protein Trans-splicing

The main limitation of the site-specific capture methods described above is that they rely on the use of enzymes as capture reagents, which remain attached to the surface once the immobilization step is complete. The production of protein arrays containing these large linkers could give rise to non-specific interactions, especially in applications involving the analysis of complex samples (11,87).

Our group has recently developed a new traceless capture ligand approach for the site-specific attachment of proteins to surfaces based on the protein trans-splicing process (85) (Fig. 6). In protein trans-splicing, the intein self-processing domain is split into two fragments, which are referred to as N-intein and C-intein (117,118). In this approach the N-intein fragment is fused to the C-terminus of the protein to be immobilized, and the C-intein fragment is immobilized onto the solid support. When both intein fragments interact, they bind to each other with high affinity (Kd ≈ 200 nM for the Ssp DnaE split-intein (85)), forming an active intein domain that can give rise to protein splicing in trans. This results in the immobilization of the protein of interest to the solid support at the same time that the split intein fragments are spliced out into solution (see Fig. 6). We have recently used this approach for the production of arrays containing several biologically active proteins onto chemically modified glass slides (85). The immobilization of proteins using trans-splicing is highly specific and efficient. For example, protein immobilization can be readily accomplished at concentrations in the low nM range (85). Importantly, the high specificity of protein trans-splicing allows the direct immobilization of proteins from complex mixtures, thus eliminating the need for purification and/or reconcentration of the proteins prior to the immobilization step. Furthermore, protein trans-splicing provides a completely traceless method of protein immobilization, since both intein fragments are spliced out into solution once the immobilization step is completed. Finally, protein trans-splicing was shown to be fully compatible with cell-free protein expression systems, which should facilitate high-throughput production of protein arrays (5,85).
More recently, we have also shown that the trans-splicing activity of the naturally occurring Ssp DnaE split-intein can be photomodulated by introducing photolabile backbone protecting groups on the C-intein polypeptide (119). This opens the intriguing possibility for light-activated immobilization of proteins onto solid supports, which should allow rapid production of protein arrays by using available photolithographic techniques (120).
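To put the affinity quoted above in perspective, the following minimal sketch estimates the equilibrium fraction of surface-bound C-intein that would be occupied by an N-intein fusion protein at a given concentration. It assumes a simple reversible 1:1 binding model with the soluble protein in large excess; this is an illustrative simplification and not the analysis used in (85). The function name and the example concentrations are hypothetical.

```python
def fraction_bound(protein_nM: float, kd_nM: float = 200.0) -> float:
    """Equilibrium occupancy of surface sites for 1:1 binding.

    Assumes the soluble protein is in large excess over surface sites,
    so its free concentration equals its total concentration.
    The default Kd of 200 nM is the value quoted for the Ssp DnaE split-intein.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nM / (protein_nM + kd_nM)


if __name__ == "__main__":
    # Hypothetical concentrations spanning the low nM to low uM range.
    for conc in (2.0, 20.0, 200.0, 2000.0):
        print(f"{conc:7.1f} nM -> fraction bound {fraction_bound(conc):.3f}")
```

Under this simplified model only a small fraction of sites is occupied at any instant in the low nM range. One plausible reading of the efficient immobilization reported at such concentrations is that each binding event is followed by splicing, which attaches the protein covalently and removes it from the binding equilibrium.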

Fig. 6

Site-specific immobilization of proteins onto solid supports through protein trans-splicing (85). Maltose binding protein (MBP) was directly immobilized from (a) the soluble cellular fraction of E. coli cells over-expressing MBP-IN, and (b) MBP-IN expressed in vitro using an in vitro transcription/translation expression system. MBP was detected using a fluorescently labeled specific antibody.

PROTEIN ARRAY TECHNOLOGIES BASED ON CELL-FREE EXPRESSION SYSTEMS

Protein arrays have traditionally been produced by cellular expression, purification and immobilization of individual proteins onto appropriate solid supports. The production of a large number of proteins using conventional expression systems, based on bacterial or eukaryotic cells, is usually a very time-consuming and labor-intensive process. Moreover, the presence of disulfide bonds, special requirements for folding and post-translational modifications in some proteins, especially those of human origin, may require more specialized expression systems such as mammalian cells or baculovirus. The stability of folded proteins in an immobilized state over long periods of storage is another potential issue when working with protein microarrays, especially if we consider the highly heterogeneous nature of proteins with regard to their physicochemical properties and stability characteristics.

The use of cell-free expression systems has been proposed as a potential solution to circumvent some of these issues. Because DNA arrays can, in principle, be readily synthesized and are physically homogeneous and stable, the issues associated with availability and stability should not apply in this case. Hence, cell-free expression systems have the potential to allow the immobilization of proteins at the same time they are produced by converting DNA arrays into protein arrays on demand (7,121).

Cell-Free Protein Expression Systems

Cell-free expression systems make use of cell extracts that contain all of the key molecular components for carrying out transcription and translation in vitro. Typically, these extracts can be purified from cell lysates of different types of cells. The most commonly used are obtained from E. coli, rabbit reticulocyte and wheat germ, although more specialized cell extracts from hyperthermophiles, hybridomas, insect, and human cells can also be employed (7). This large variety of available cell-free expression systems ensures that proteins can be expressed under different conditions (122). Cell-free systems have also been used for the introduction of different biophysical probes during translation for protein detection and/or immobilization (123–125).

An important aspect to consider when preparing in situ protein arrays is the level of protein expression. While many proteins can be readily expressed, others may require modifications in the expression protocol or to the protein construct, for example by fusing them to a well-expressed fusion protein. He and co-workers have shown that using fusion protein constructs containing the constant domain of immunoglobulin κ light chain can significantly improve the expression levels of many proteins in E. coli-based cell-free expression systems (126).

Protein In Situ Array (PISA)

In this method, proteins are produced directly from DNA in solution and then immobilized as they are produced onto the surface through a recognition tag sequence (Fig. 7A) (127). In general, the DNA constructs encoding the proteins can be generated by PCR using designed specific primers for the protein of interest, although expression plasmids can also be used. The DNA constructs are also designed with strong promoters, such as T7, and regulatory sequences required for in vitro initiation of transcription/translation. An affinity tag sequence is also usually encoded into the N- or C-terminus of the protein to facilitate its immobilization after the translation step (Fig. 7A).

Fig. 7

In situ methods for protein arraying by PISA (A), NAPPA (B) and puromycin-capture from RNA arrays (C).

In this approach, all the proteins are expressed in parallel using the appropriate in vitro transcription/translation systems. The protein translation reaction is carried out on the surface, which is precoated with a capture reagent able to specifically bind to the affinity tag and immobilize the proteins. This is typically accomplished by using His-tagged proteins and Ni2+-NTA coated surfaces, although other affinity tag/capture reagent combinations can also be used. Once the protein is translated and specifically immobilized onto the surface, any unbound material can be washed away.

The PISA method was originally demonstrated using a small set of proteins, which included several antibody fragments and the protein luciferase. These proteins were immobilized onto microtiter wells and magnetic beads (127). In this work, PISA was used in a macro format in which ≈25 μL of cell-free expression reaction was used for the immobilization of individual proteins. More recently, PISA has also been miniaturized (using ≈40 nL) and adapted for the direct production of microarrays onto glass slides. In this method, the transcription/translation reaction is performed for 2 h at 30°C before spotting (7).

Hoheisel and co-workers have further developed the miniaturization of PISA using an on-chip system based on a multiple spotting technique (MIST) (128). In this approach, the DNA template is first spotted (≈350 pL) on the surface, followed by the in vitro transcription/translation mixture on the same spot. The authors used His-tagged GFP as a model protein that was immobilized onto Ni2+-NTA-coated glass slides. It was estimated that with unpurified PCR products, as little as 35 fg (≈22,500 molecules) of DNA was sufficient for the detection of GFP expression in sub-nL volumes (128). The same authors also adapted the system for the high throughput expression of libraries by designing a single specific primer pair for the introduction of the required T7 promoter and terminator, and demonstrated the in situ expression using 384 randomly chosen clones from a human fetal brain library (128). In principle, the optimized and miniaturized version of PISA should be able to produce high-density protein microarrays containing as many as 13,000 spots per slide using a variety of different genomic sources in a relatively uncomplicated fashion.
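The relationship between the mass of DNA template and the number of molecules quoted above can be checked with a short calculation. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, a common rule of thumb and not a value taken from (128); the function names are hypothetical. Under that assumption, 35 fg corresponding to ≈22,500 molecules implies a template of roughly 1.4 kb.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one base pair of dsDNA


def dna_molecules(mass_g: float, length_bp: float) -> float:
    """Number of double-stranded DNA molecules in a given mass."""
    molar_mass = length_bp * BP_MASS_G_PER_MOL  # g/mol for the whole template
    return mass_g / molar_mass * AVOGADRO


def implied_length_bp(mass_g: float, molecules: float) -> float:
    """Template length (bp) implied by a mass and a molecule count."""
    molar_mass = mass_g * AVOGADRO / molecules
    return molar_mass / BP_MASS_G_PER_MOL


if __name__ == "__main__":
    mass = 35e-15  # 35 fg expressed in grams
    length = implied_length_bp(mass, 22_500)
    print(f"implied template length: {length:.0f} bp")
    print(f"molecules in 35 fg at that length: {dna_molecules(mass, length):.0f}")
```

The same functions can be used in the other direction, for example to estimate how much template must be spotted to deliver a target number of molecules of a construct of known length.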

Nucleic Acid Programmable Protein Array (NAPPA)

NAPPA is another approach that allows the on-chip transformation of DNA arrays into protein arrays (Fig. 7B). NAPPA was initially developed by LaBaer and co-workers, and uses transcription and translation from an immobilized DNA template (67,129), as opposed to PISA, where the DNA template is kept in solution. In NAPPA, the expression plasmids encoding the proteins as GST fusions are biotinylated and immobilized onto a glass slide previously coated with avidin and an anti-GST antibody, which acts as the protein capture reagent. This plasmid array is then used for in situ expression of the proteins using rabbit reticulocyte lysate or a similar cell-free expression system. Once the proteins are translated, they are immediately captured by the immobilized antibody within each spot. This process generates a protein array in which every protein is co-localized with the corresponding expression plasmid. In general, NAPPA provides good quality protein spots with limited lateral spreading, although some variation can be observed in the quality of the arrays generated by this approach.

The first demonstration of the NAPPA approach was carried out by the immobilization of 8 different cell cycle proteins, which were immobilized at a density of 512 spots per slide (67). It was estimated that ≈10 fmol of protein were captured on average per spot, ranging from 4 to 29 fmol for the different proteins, which was sufficient for functional studies. The authors used this protein array to map and identify new interactions between 29 human proteins involved in initiation of DNA replication. These data were used to establish the regulation of Cdt1 binding to select replication proteins and map its geminin-binding domain (67).

As with PISA, NAPPA allows the protein array to be generated in situ, thus removing any concerns about protein stability during storage. However, it requires the cloning of the genes of interest and biotinylation of the resulting expression plasmids to facilitate their immobilization onto the chip (Fig. 7B). Furthermore, the technology does not generate a pure protein microarray, but rather a mixed array in which the different GST fusion proteins are co-localized with their corresponding expression plasmids, avidin and the capture antibody.

In Situ Puromycin-Capture from mRNA Arrays

Tao and Zhu have ingeniously adapted the mRNA display technology for the production of protein microarrays by capturing the nascent polypeptides through puromycin (Fig. 7C) (130). In this approach, the PCR-amplified DNA construct is transcribed into mRNA in vitro, and the 3′-end of the mRNA is hybridized with a single-stranded DNA oligonucleotide modified with biotin and puromycin. These modified RNAs are then arrayed on a streptavidin-coated glass slide and allowed to react with a cell-free lysate for in vitro translation. During the translation step, the ribosome stalls when it reaches the RNA/DNA hybrid section of the molecule, and the DNA is then cross-linked to the nascent polypeptide through the puromycin moiety. Once the translation reaction is finished, the mRNA is digested with RNase, leaving a protein array immobilized through the C-termini to the DNA linker, which is immobilized through a biotin/streptavidin interaction to the surface. This technology was first exemplified by the immobilization of GST, two kinases, and two transcription factors (130). The transcription factors retained the ability to specifically bind DNA on the chip. This approach provides well-defined non-diffused protein spots as a result of the precise co-localization of the mRNA with puromycin and the 1:1 stoichiometry of mRNA versus protein. However, this method requires extra manipulations involving the reverse transcription and modification of the RNA before the spotting process, which may limit its practical use for the creation of large protein microarrays. Furthermore, the amount of protein produced is proportional to the amount of mRNA spotted, since there is no enzymatic amplification involved as in the PISA and NAPPA approaches.

DETECTION METHODS

In order to analyze, identify and quantify the proteins or any other type of biomolecules captured by the protein microarray, it is necessary to have detection methods that can provide high throughput analysis, high signal-to-noise ratio, good resolution, high dynamic range and reproducible results, with relatively low instrumentation costs. Most of the methods available for this task can be classified as label-dependent and label-free detection methods (see references (1,131) for recent reviews).

Label-Dependent Methods of Detection

Fluorescence-based detection is probably the most commonly used method in protein microarrays. This is mainly due to its simplicity, relatively high sensitivity and compatibility with already available DNA-array scanners. Protein-detecting microarrays usually employ a sandwich assay fluorescence-based detection system in which captured proteins are detected by a secondary fluorescently labeled antibody (Fig. 1). This assay provides a higher specificity than the immunoassay based on a single antibody, since it reduces potential cross-reactivity issues. The sensitivity of fluorescence detection can also be improved by using the rolling circle amplification (RCA) method, which has been successfully applied for the profiling of different cytokines with detection limits in the fM range (132,133). The main limitation of these methods, however, is that they require two distinct capture reagents per protein to be analyzed, which means that if there are 1,000 proteins to be analyzed, at least 2,000 antibodies are required.

Specific fluorescence biosensing probes have also been used for the quantitative analysis of protein phosphorylation and protein kinase activity on functional protein microarrays. For example, the Pro-Q Diamond dye is a novel fluorescent phosphorylation sensor that allows the detection of phosphoproteins at sub-picogram levels of sensitivity (134). Hamachi and co-workers have also developed a fluorescence-based method for imaging monophosphorylated polypeptides by using bis-(Zn2+-dipicolylamine)-based artificial sensors (135). Such chemical approaches do not require the use of anti-target antibodies and therefore represent a good approach for high throughput screening of protein phosphorylation and kinase activity.

The use of fluorescently labeled substrates immobilized in a microarray format has also been reported for studying enzymatic specificity in a high throughput manner. Ellman and co-workers have used this approach to determine the P-site substrate specificity of several serine and cysteine proteases (136). In their work, the fluorophore 7-amino-4-methylcoumarin (AMC) was covalently attached to a peptide microarray containing different amino acids at the different P-site positions. The corresponding sequence preferences were determined by analyzing the remaining fluorescence on the chip after performing the proteolytic reaction. Yao and co-workers have also used a similar approach for screening the activities of different types of enzymes, including proteases, epoxide hydrolases, and phosphatases, by linking the substrate to the surface through a fluorogenic linker (137). The same authors have also developed a different approach for the activity-based detection of enzymes using a microarray format, in which the samples containing the enzymes to be analyzed are immobilized onto surfaces and then visualized with fluorescently labeled mechanism-based inhibitors (138).

The protein fingerprinting (PFP) technique is another fluorescence-based detection method that has been employed for the analysis of protein microarrays. This approach makes use of fluorophore-labeled capture reagents that change their fluorescent properties once they bind to the target protein. By comparing the resulting patterns, the proteins of interest can be identified and, at the same time, any signal arising from non-specific interactions can be discriminated (139,140). This approach does not use high affinity capture reagents, such as antibodies, but rather uses relatively weak binders such as synthetic polypeptides.

Other label-dependent methods include the use of radioactivity, especially for enzymatic reactions such as phosphorylation, due to its sensitivity and specificity. For example, Schreiber and co-workers have used it to monitor kinase activity in combination with radioisotope-labeled ATP (10). Snyder and co-workers have employed this approach to study the activities and substrate preferences of 119 different protein kinases (87). The use of radioisotope-labeled molecules, however, may raise safety concerns, thus limiting its potential for high throughput analysis. The use of chemiluminescence-based detection schemes also provides high selectivity and sensitivity, but with a limited resolution and dynamic range (141).

Label-Free Detection

The use of fluorescence-based detection methods is one of the most commonly employed approaches for the detection of proteins. However, there are several limitations to this approach. For example, labeling of proteins in samples or of specific protein capture reagents such as antibodies may alter the surface of the proteins and therefore their binding properties. Labeling is also very time-consuming, especially when a multitude of samples need to be labeled. Another potential issue is the variability in labeling efficiency of proteins across different samples. This is a critical issue, especially when non-specific labeling techniques are employed, since small variations in the temperature and reaction duration, for example, can seriously influence the efficiency of protein labeling.

These limitations have sparked the development of novel label-free detection schemes involving mass spectrometry (MS)- and optical spectroscopy-based measurements (131,142).

In particular, MS-based detection has already been used for the discovery of disease-associated biomarkers (143). For example, the use of surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF) MS allows the detection of captured proteins without the need for labeling (144). In fact, SELDI has been widely used for the discovery and detection of biomarkers associated with several types of cancer (145–150). More recently, Becker and Engelhard have also used matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) for the direct read-out of protein/protein interactions using protein-DNA microarrays generated by DNA-directed immobilization (DDI) (61) (see reference (62) for a recent review in this field). The authors used this approach for the rapid detection of activated Ras in cell lysates from several cell lines.

Another well-established label-free detection method is surface plasmon resonance (SPR). SPR can also provide kinetic information on binding events. In this approach, the appropriate capture reagents are immobilized onto a gold surface, and quantification of the captured proteins is carried out by measuring the change in the reflection angle of light after hitting the gold surface (151). For example, an SPR imaging method was recently used for the high throughput screening of molecules able to target the interaction between the retinoblastoma tumor suppressor RB and the human papillomavirus (HPV) E7 proteins (152). The E7 protein is produced by high-risk human papillomavirus (HPV) and induces degradation of the retinoblastoma tumor suppressor RB through a direct interaction, and it has been suggested as a potential molecular target in cancer therapy. In this work, a glutathione-coated SPR chip was used for the immobilization of the E7 GST-fusion protein, which was then complexed with His-tagged RB protein in the presence of different RB-binding peptides derived from a motif of the E7 protein. Some of these peptides were shown to antagonize the interaction between His-tagged RB and GST-E7 in a concentration-dependent manner (152).

A conventional SPR system, however, can only use a single channel per experiment. The recent development of SPR microscopy allows the analysis of hundreds of biomolecular interactions simultaneously in large protein microarrays (>1,300 spots) allowing for qualitative screening and quantitative kinetics experiments in a high throughput format (153).
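The kinetic information that SPR provides is commonly interpreted with a 1:1 Langmuir binding model. The following minimal sketch shows how association and dissociation phases of a sensorgram follow from that model; it is a generic illustration and not the analysis used in (152) or (153). The function names and the rate constants in the example are hypothetical.

```python
import math


def association(t_s: float, conc_M: float, ka: float, kd: float, rmax: float) -> float:
    """Response during analyte injection for a 1:1 Langmuir interaction.

    ka is the association rate constant (1/(M*s)), kd the dissociation
    rate constant (1/s), rmax the response at full surface saturation.
    """
    kobs = ka * conc_M + kd            # observed rate constant, 1/s
    req = rmax * ka * conc_M / kobs    # steady-state response at this concentration
    return req * (1.0 - math.exp(-kobs * t_s))


def dissociation(t_s: float, r0: float, kd: float) -> float:
    """Response after the injection ends, decaying from r0 with rate kd."""
    return r0 * math.exp(-kd * t_s)


def equilibrium_constant(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant KD (M) from the two rate constants."""
    return kd / ka


if __name__ == "__main__":
    ka, kd, rmax = 1e5, 1e-3, 100.0      # hypothetical values
    kd_eq = equilibrium_constant(ka, kd)  # 10 nM for these values
    r_end = association(300.0, kd_eq, ka, kd, rmax)
    print(f"KD = {kd_eq:.2e} M")
    print(f"response after 300 s injection at KD: {r_end:.1f}")
    print(f"response 300 s into dissociation: {dissociation(300.0, r_end, kd):.1f}")
```

In an imaging or microscopy format, the same model would be fitted independently to the response of each spot, which is what makes parallel kinetic measurements across an array possible.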

The anomalous reflection (AR) technique is another spectroscopic detection scheme that has been suggested as an alternative to SPR. AR is a characteristic property of gold that causes a large decrease in the reflectivity of blue or purple light (380 nm < λ < 480 nm) on a gold surface upon adsorption of a transparent dielectric layer on its surface (154). The AR technique requires relatively less complex optics than the SPR systems and has the potential to offer miniaturized and parallelized measurements; therefore, it could be potentially suitable as a high-throughput analytical platform. This approach has been used so far with some success for analyzing biotin/avidin, calmodulin/synthetic α-helical peptides and T7-phage displayed-proteins and synthetic peptide interactions (154,155). At this point, however, AR-based detection of microarrays needs to be further developed for detection of multiplexed protein-protein interactions beyond the proof-of-concept.

APPLICATIONS

Some of the applications of protein microarrays have already been discussed in the previous sections. The most prominent applications include high-throughput proteomics, biomarker research and drug discovery. Several reviews focusing on the biomedical applications of protein microarrays have been published recently (3,8).

Proteomics

Functional protein microarrays are ideal bioanalytical platforms to carry out high-throughput proteomics. Perhaps the most advanced example of this application to date was reported by MacBeath and co-workers to study the phosphorylation states of the ErbB-receptor kinase family using functional protein microarrays (23,156). The first three members of the ErbB family of receptor tyrosine kinases, ErbB1-3, are involved in the activation of a wide variety of signaling pathways that are frequently misregulated in cancer. ErbB4, on the other hand, is not involved in tumorigenesis and has been shown to have a protective role in some cancers. In order to study in more detail the role of this receptor tyrosine kinase, the authors first used tandem mass spectrometry to identify 19 sites of tyrosine phosphorylation on ErbB4. These phosphopeptides were then used to probe a functional protein microarray containing 96 SH2 and 37 PTB protein domains encoded in the human genome. The data obtained were used to build a quantitative interaction network for ErbB4 and to identify several new interactions, which led to the finding that ErbB4 can bind and activate STAT1 (Fig. 2).

Deng and co-workers have also studied protein-protein and protein-DNA interactions on a global scale in the plant A. thaliana by making use of functional microarrays (157). The authors created a microarray containing up to 802 different transcription factors from A. thaliana. The proteins were expressed using a yeast expression system and arrayed onto FAST glass slides, which are commercially available slides coated with a nitrocellulose membrane. The resulting microarray was probed with different fluorescently labeled oligonucleotides containing known binding sites for several transcription factors of the AP2/ERF family. Using this approach, the authors were able to confirm known interactions and identify 48 new ones. These included four transcription factors that were able to bind the evening element and showed an expected clock-regulated gene expression pattern, thus providing a basis for further functional analysis of their roles in circadian-regulated gene expression (157). The same authors also used this microarray for detecting novel protein-protein interactions and were able to discover four novel partners that interact with transcription factor HY5, which is a key regulator of photomorphogenesis in A. thaliana (157).

It should be highlighted, however, that the production of whole-proteome microarrays is a technically challenging task, since it requires the isolation of a large number of functional proteins. Furthermore, the analysis of whole-proteome microarrays is complicated by the fact that they only represent particular time snapshots of the proteome. Moreover, proteins not only differ in structure and function but also in their cellular localization, turnover rates and, more importantly, abundance. However, the use of this technology in proteomic research still provides an unprecedented ability to monitor the biomolecular interactions of thousands of samples in parallel, which by far outweighs all the difficulties and limitations associated with their use and preparation.

Biomarker Research

The use of protein microarrays in biomarker research has received special interest in the areas of viral diagnostics and cancer research. For example, the examination and identification of particular protein profiles in early-stage cancers could lead to early detection of tumors and the development of improved therapies for cancer patients. Antibody-based microarrays are by far the most frequently used in biomarker profiling and discovery for cancer research. For example, Cordon-Cardo and co-workers have used an antibody array composed of 254 different antibodies to discriminate bladder cancer patients from control patients (40). Snyder and co-workers have also used protein microarrays to profile antibodies against human severe acute respiratory syndrome (SARS) virus and related coronaviruses (158). In their study, the authors used 82 different coronavirus GST-fusion proteins, which were expressed in yeast and arrayed onto FAST glass slides. These arrays were used to profile the sera of two patient groups (more than 600 samples obtained from patients in China and Canada) with ≈90% accuracy (158). Using this approach, it was possible to distinguish patients infected with SARS and HCoV-229E, two different human coronaviruses. These results were further validated by statistical methods and an indirect immuno-fluorescence test, and also showed that microarray profiling was similar in sensitivity to standard indirect immuno-fluorescence tests but was more specific (158).
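The terms accuracy, sensitivity and specificity used above have standard definitions in terms of true and false classifications. The short sketch below computes them from a confusion matrix; the function name is hypothetical and the counts in the example are invented for illustration only, not taken from (158).

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: infected samples called positive/negative by the assay.
    tn/fp: uninfected samples called negative/positive by the assay.
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one infected and one uninfected sample")
    return {
        "sensitivity": tp / (tp + fn),   # fraction of infected samples detected
        "specificity": tn / (tn + fp),   # fraction of uninfected samples cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Invented counts for illustration; not data from any cited study.
    for name, value in diagnostic_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Two assays can therefore have similar sensitivity while differing in specificity, since the two quantities are computed from disjoint sets of samples.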

LaBaer and co-workers have also used protein microarrays generated by the NAPPA approach for tumor antigen profiling in breast cancer (16). In this work, sera from breast and ovarian cancer patients were tested for p53-specific antibodies using a microarray displaying 1,705 different non-redundant tumor antigens. These results were also corroborated by standard indirect immuno-blotting techniques (16).

The described examples are just a sample of the recent applications of protein microarrays in biomarker profiling and discovery, and illustrate the great potential of this technology in biomedical applications.

Drug Discovery

Protein microarrays have also been used in drug discovery for target identification and validation. In 2004, Schreiber and co-workers described for the first time the use of a protein microarray for high-throughput screening of small molecules (159). In this work, the authors used a protein microarray obtained by spotting different His-tagged and GST-fusion proteins onto chemically modified glass slides. These arrays were used to screen the molecular targets of six small-molecule inhibitors of rapamycin (SMIR) that were previously identified for their ability to rescue growth of yeast cells exposed to rapamycin in a phenotype-based chemical genetic suppressor assay. To facilitate the screening process, the SMIRs were conjugated to biotin, and the bound SMIRs were then detected using fluorescently labeled streptavidin. These results allowed the identification of a previously unknown member of the target of rapamycin (TOR) signaling pathway (159).

Protein microarrays can also be used in an indirect fashion for screening and selecting small molecules able to antagonize protein interactions. For example, antibody arrays can be used to screen and/or profile the proteome for changes in protein expression and/or post-translational modifications, such as phosphorylation, induced by the presence or absence of a particular drug candidate.

Sokolov and Cadet have used protein microarrays to study the correlation between the levels of expression of different proteins and the behavioral phenotype of mice treated with methamphetamine (METH) (160). METH abuse has been shown to stimulate aggressive behaviors in humans and in other animals. The authors found that mice treated chronically with METH demonstrated increased aggressiveness and hyper-locomotion when compared to an untreated control group. In this work, a total of 378 different monoclonal antibodies specific for proteins related to signal transduction, oncogene products, cell cycle regulation, cell structure, apoptosis, and neurobiology, among others, were used to prepare the protein-detecting array (160). This antibody microarray was incubated with proteins extracted from the brains of untreated and METH-treated mice and labeled with fluorescent dyes. The data revealed a decrease in the natural abundance of the proteins Erk2 and 14-3-3e in the striata of the mice chronically treated with METH. Since the kinase Erk2 is thought to be the principal component of the classical mitogen-activated protein (MAP) kinase pathway and protein 14-3-3e is an inhibitor and substrate of protein kinase C, the reduction in these two proteins suggests that repeated exposure to METH might alter MAP kinase-related pathways involved in behavioral change (160).

These examples clearly illustrate the potential of protein microarrays for drug discovery applications. Despite the numerous advantages in the preparation and analysis of these types of reagents, their use in drug discovery has been limited so far.

CONCLUDING REMARKS

The aim of this review is to highlight the latest developments in the preparation, analysis and biotechnological applications of protein microarrays. Just before MacBeath and Schreiber reported for the first time the use of protein microarrays in 2000 (10), the concept of using protein microarray technology was simply regarded as a dream. A decade later, the number of publications on protein microarray technologies has increased dramatically. There are approximately 32,000 publications indexed in PubMed (http://www.ncbi.nlm.nih.gov/pubmed) under the keyword protein microarrays. We have seen numerous examples that show protein microarrays are a very valuable tool for the study of whole proteomes (11–13,18,23,24), protein identification and profiling for early diagnosis of diseases such as cancer (16,40) or viral infections (158) and for drug identification and validation (159,160).

Despite the large number of successful examples in the use of protein microarrays in biomedical and biotechnological applications during the last 10 years, there are still, however, some challenges that need to be tackled. For example, most of the methods commonly employed for the immobilization of proteins onto solid supports rely on non-site-specific immobilization techniques (10,46,47,49,161). The use of these methods usually results in the proteins being displayed in random orientations on the surface, which may compromise the biological activity of the immobilized proteins and/or provide false results (162). This issue has been addressed over the last few years by the development of novel site-specific immobilization approaches which involve the use of chemoselective ligation reactions (52,84,92,97,101,102), active site-directed capture ligands (112,116,163–165) and protein splicing (68,85), among others.

The expression and purification of thousands of proteins without compromising their structural and biological activity is also a challenging task. The use of cell-free expression systems in combination with nucleic acid arrays, which are more readily available and easier to prepare, has been shown to give good results to produce in situ protein arrays from DNA (67,127,129) and RNA arrays (130). The combination of these approaches with site-specific and traceless methods of protein immobilization such as protein trans-splicing (68,85) shows great promise.

The introduction of label-free detection methods, such as surface plasmon resonance and mass spectrometry, also shows great promise to simplify the use of protein microarray analysis, since labeling of the interacting partners will no longer be required.

The standardization of protein microarray production is another issue that needs to be improved. At this time, most of the methods used by the scientific community for preparing and analyzing protein microarrays are not completely standardized. The adoption of stringent standards by the scientific community for the production and analysis of these valuable reagents should, in principle, allow the generation of data that could be compared and exchanged across different studies and different research groups.

None of these challenges is insurmountable; in fact, as we have seen in this review, much progress has been made over the last decade to address them. At this point, we strongly believe that protein microarray technology is on the brink of becoming a standard research technique, in the same way that DNA microarray technology is today.