Introduction

Holocarboxylase synthetase (HLCS) has a pivotal role in essential biotin-dependent metabolic pathways and epigenetic phenomena in humans. The unique role of HLCS in intermediary metabolism is due to its catalytic activity as the sole ligase in the human proteome that can catalyze the covalent binding of biotin to carboxylases.1 Biotinylated carboxylases are key enzymes in the metabolism of glucose, fatty acids, and leucine.2 Acetyl-CoA carboxylases 1 and 2 catalyze key reactions in fatty acid synthesis and the inhibition of mitochondrial fatty acid uptake, respectively; 3-methylcrotonyl-CoA carboxylase catalyzes an essential step in leucine metabolism; propionyl-CoA carboxylase (PCC) catalyzes a key reaction in the metabolism of odd-chain fatty acids; and pyruvate carboxylase is a key enzyme in gluconeogenesis.

In epigenetic pathways, HLCS catalyzes the covalent binding of biotin to histones H1, H3, H4 and, to a lesser extent, H2A.3, 4, 5, 6 Biotinylated histones have roles in the transcriptional repression of genes and repeat sequences.7, 8 Importantly, evidence suggests that K12-biotinylated histone H4 contributes towards the transcriptional repression of retrotransposons, and that low abundance of K12-biotinylated histone H4 in HLCS- or biotin-deficient cells is linked with activation of retrotransposons and chromosomal abnormalities.9 Our observation that biotinylation is a rare natural histone modification3, 6 was independently confirmed by other laboratories.10, 11

Consistent with the important roles of HLCS in intermediary metabolism and epigenetics, no living HLCS null individual has ever been reported, suggesting embryonic lethality. HLCS knockdown studies (30% residual activity) produced phenotypes such as decreased life span and heat resistance in Drosophila melanogaster12 and aberrant gene regulation in human cell lines.8, 9, 13 Mutations have been identified and characterized in the human HLCS gene; these mutations cause a substantial decrease in HLCS activity and metabolic abnormalities.14, 15 Homozygous severe HLCS deficiency has been reported to be uniformly fatal.16 In addition, three independent cancer and patent databases correlate HLCS loss or mutation with human tumors.17, 18, 19

HLCS is present in both nuclear and extranuclear structures.20, 21 Nuclear HLCS is a chromatin protein;12 its binding to chromatin is mediated by physical interactions with histones H3 and H4.5 Our knowledge of HLCS regulation is guided primarily by the following observations: (i) Both the expression of HLCS and its nuclear translocation depend on biotin status in human cell lines.13 (ii) The human HLCS promoter has been tentatively identified,22 but not yet characterized in great detail. (iii) The expression of HLCS is repressed by miR-539.23 (iv) The HLCS-dependent biotinylation of histones depends on cross-talk with cytosine methylation. We reported that histone biotinylation is substantially impaired when cytosine methylation marks are erased by treating cells with 5-aza-2′-deoxycytidine.9 The effects of 5-aza-2′-deoxycytidine on HLCS expression are mediated, in part, by demethylation of the promoters in the two human miR-153 genes, leading to high levels of miR-153 and, subsequently, miR-153-dependent degradation of HLCS mRNA.24

Human HLCS is a single copy gene, which spans 14 exons and about 250 000 basepairs.25 The following domains have been identified in human HLCS: N-terminal domain (amino acids M1–F446), central domain (F471–S575), linker domain (T610–V668), and C-terminal domain (H669–R718).26 The central domain contains the binding sites for biotin and ATP; N-terminal, central, and C-terminal domains participate in the binding of the various apocarboxylases. To date, 2572 single-nucleotide polymorphisms (SNPs) have been reported for human HLCS,27 but the biological importance of these SNPs is unknown. In this study, we integrated biochemistry, structural biology, and molecular biology techniques to characterize the effects of SNPs on the catalytic activity of HLCS. As a secondary goal, we determined whether biotin supplementation might restore the activities of HLCS variants to wild-type levels.

Materials and methods

Selection of HLCS candidate SNPs

HLCS polymorphisms were analyzed in silico to identify those SNPs that are most likely to alter catalytic activity. First, we identified exon/intron boundaries and limited subsequent analyses to SNPs in exons.27 Second, we identified those polymorphisms that cause changes in the amino acid sequence of HLCS. Third, we selected those amino acid substitutions that are most likely to alter protein/substrate interactions, for example, substitutions of a basic for a hydrophobic amino acid, or substitutions in the N-terminal, central, or C-terminal domain.26 Based on this procedure, we identified five SNPs for subsequent biochemical characterization (Table 1). The rare L216R variant (mutant) is known to have an activity near zero and was used as control.28
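The three filtering steps can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in this study: the domain boundaries are those listed in the Introduction, whereas the `Variant` record, the coarse side-chain classes, and the function names are assumptions introduced for the example.

```python
from typing import NamedTuple, Optional

# Domain boundaries (inclusive residue numbers) as listed in the Introduction.
DOMAINS = {
    "N-terminal": (1, 446),
    "central": (471, 575),
    "linker": (610, 668),
    "C-terminal": (669, 718),
}

# Coarse side-chain classes; this grouping is an illustrative assumption.
SIDE_CHAIN_CLASS = {
    "V": "hydrophobic", "L": "hydrophobic", "F": "hydrophobic",
    "R": "basic", "Q": "polar", "G": "glycine",
}


class Variant(NamedTuple):
    """Hypothetical record for one polymorphism."""
    ref: str        # wild-type amino acid (one-letter code)
    position: int   # residue number in HLCS
    alt: str        # variant amino acid (one-letter code)
    exonic: bool    # True if the SNP lies in an exon


def domain_of(position: int) -> Optional[str]:
    """Return the HLCS domain containing the residue, or None."""
    for name, (start, end) in DOMAINS.items():
        if start <= position <= end:
            return name
    return None


def is_candidate(v: Variant) -> bool:
    """Apply the three selection steps to one variant."""
    if not v.exonic:        # step 1: exons only
        return False
    if v.ref == v.alt:      # step 2: amino acid sequence must change
        return False
    # Step 3: substitution in a substrate-binding domain or a change
    # in side-chain class.
    in_binding_domain = domain_of(v.position) in {
        "N-terminal", "central", "C-terminal"
    }
    class_change = SIDE_CHAIN_CLASS.get(v.ref) != SIDE_CHAIN_CLASS.get(v.alt)
    return in_binding_domain or class_change


# The variants named in this study, plus one hypothetical synonymous SNP.
variants = [
    Variant("V", 96, "F", True),
    Variant("V", 96, "L", True),
    Variant("G", 510, "R", True),
    Variant("Q", 699, "R", True),
    Variant("A", 640, "A", True),  # hypothetical, filtered out in step 2
]
candidates = [v for v in variants if is_candidate(v)]
```

With these boundaries, `domain_of` places V96 in the N-terminal domain, G510 in the central domain, and Q699 in the C-terminal domain.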

Table 1 HLCS variants

Recombinant proteins

In previous studies we created plasmid HCS-pET41a(+), which codes for full-length human HLCS fused to N-terminal glutathione S-transferase (GST), S tag, and both N-terminal and C-terminal 6 × his tag (114.6 kDa).5 Recombinant HLCS (rHLCS) was expressed and purified using GSTrap FF Columns on an ÄKTA protein purification system (GE Healthcare; Piscataway, NJ, USA) as described.29 HLCS protein was quantified using the bicinchoninic acid method (Pierce, Rockford, IL, USA); protein purity, identity, and integrity were confirmed by gel electrophoresis and staining with coomassie blue and an antibody to the C-terminus in human HLCS.12 Previously, we have demonstrated that rHLCS has biological activity in vitro.5 Also, we have demonstrated the open reading frame in our plasmid rescues Saccharomyces cerevisiae in which the HLCS ortholog biotin protein ligase had been knocked out, if HLCS is subcloned into a yeast expression vector.26

HLCS variants were created by site-directed mutagenesis, using HCS-pET41a(+) as template. Briefly, HCS-pET41a(+) was digested with EcoRI and XhoI and subcloned into pBluescript II SK (+) vector (Stratagene; Wilmington, DE, USA) to create ‘HLCS-pBluescript II SK (+)’. Mutations were introduced using the GeneTailor site-directed mutagenesis system following the manufacturer's instructions (Invitrogen; Carlsbad, CA, USA). Mutations were confirmed by sequencing. Mutant plasmids were digested with EcoRI and XhoI and re-inserted into the pET41a(+) vector. rHLCS variants were purified, and integrity and purity were tested as described above.

The polypeptide p67 comprises the 67 C-terminal amino acids in PCC, including the biotin-binding site K694; p67 is a well-established substrate for biotinylation by HLCS.4, 5 The biotin-free fraction of recombinant p67 was prepared as described.4

rHLCS enzyme kinetics

The activities of HLCS and its variants were quantified as described previously4 with the following modifications. The final concentrations of wild-type and variant HLCS were adjusted to 50 nM in assay mixtures; concentrations and purities of HLCS variants were confirmed by gel electrophoresis, using coomassie blue and anti-HLCS as probes. The concentrations of biotin were varied between 0 (control) and 24 μM in the analyses of enzyme kinetics. Mixtures were incubated at 37 °C for 3 h, when reactions were terminated by adding tricine loading buffer (Invitrogen) and boiling at 95 °C for 10 min. In preliminary studies we confirmed that biotinylation of p67 increases linearly under these conditions (data not shown), except when biotin reaches concentrations that saturate the enzyme. A total of 15 μl of sample was loaded onto 16% tricine gels (Invitrogen), and p67-bound biotin in transblots was probed by measuring infrared fluorescence of IRDye 800CW Streptavidin at 800 nm (channel 800CW) in an Odyssey imaging system (Licor, Lincoln, NE, USA). Michaelis constant (Km) for biotin and maximal velocity (Vmax) of HLCS were calculated based on the Michaelis–Menten equation by using non-linear regression analysis and GraphPad Prism 5.00 (GraphPad Software; La Jolla, CA, USA).30
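As an illustration of the non-linear regression step, the following Python sketch fits the Michaelis–Menten equation, v = Vmax·[S]/(Km + [S]), by least squares. It is not the GraphPad Prism procedure used in this study. The function names, the search bounds for Km, and the noise-free synthetic data (generated with Vmax = 10 and Km = 2 in arbitrary units) are assumptions made for the example. The sketch exploits the fact that, for a fixed Km, the best Vmax has a closed-form solution, so only Km needs a one-dimensional search; the search assumes that the residual sum of squares has a single minimum within the bounds.

```python
import math


def michaelis_menten(s, vmax, km):
    """Reaction velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_michaelis_menten(substrate, velocity, km_lo=1e-6, km_hi=1e3,
                         iterations=100):
    """Least-squares estimates (Vmax, Km).

    For a fixed Km the optimal Vmax is linear least squares; Km is found
    by golden-section search on log(Km) between km_lo and km_hi.
    """
    if not any(s > 0 for s in substrate):
        raise ValueError("at least one non-zero substrate concentration is needed")

    def best_vmax(km):
        f = [s / (km + s) for s in substrate]
        return sum(v * x for v, x in zip(velocity, f)) / sum(x * x for x in f)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = best_vmax(km)
        return sum((v - michaelis_menten(s, vmax, km)) ** 2
                   for s, v in zip(substrate, velocity))

    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_lo), math.log(km_hi)
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = sse(c), sse(d)
    for _ in range(iterations):
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = sse(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = sse(d)
    km = math.exp((a + b) / 2)
    return best_vmax(km), km


# Synthetic, noise-free data over the biotin range used in the assay (0-24 uM).
substrate = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 24.0]
velocity = [michaelis_menten(s, 10.0, 2.0) for s in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, velocity)
```

With real measurements, the residuals and the correlation coefficient of the fitted curve should be inspected before Km and Vmax are interpreted.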

HLCS structure and docking of p67

A three-dimensional (3D) model of wild-type HLCS comprising residues 457–717 (central domain, linker domain, and C-terminus) was built using MODELLER v9.9 (Departments of Pharmaceutical Sciences and Pharmaceutical Chemistry, University of California, San Francisco, CA, USA).31 Biotin-protein ligase from Pyrococcus horikoshii OT3 (pdb-code: 1wqw) has 32% sequence identity with human HLCS, and its biotinyl-5′-AMP bound form was used as the structural template. CLUSTALX (Conway Institute, University College Dublin, Dublin, Ireland) was used for sequence alignments,32 which were manually refined based on the predicted secondary structural information and functionally conserved residues. MODELLER v9.9 was used to improve the backbone residue conformations in loops. WEBLOGO (Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA) was used to illustrate the sequence conservation at the two variant sites,33 based on the multiple alignments of biotin-protein ligases downloaded from UniProtKB. MODELLER v9.9 was used to generate models for the G510R and Q699R variants, using the 3D structure of wild-type HLCS as template. The 3D structure of the HLCS substrate p67 (residues 657–728 in human PCC) was modeled using the 3D structure of the alpha subunit in PCC (46% sequence identity with human PCC) from Ruegeria pomeroyi as template (pdb-code: 3n6r).

HLCS/p67 docking experiments were conducted for wild-type HLCS and the G510R variant. Docking was conducted manually using the structure of the biotin protein ligase/biotin carboxyl carrier protein complex from Pyrococcus horikoshii OT3 (pdb-code: 2ejf) as the reference complex; p67 was positioned so that the reactive ɛ-amino group of K694 was at bonding distance to the carboxyl group of biotin in HLCS. The manually docked complexes were then refined using the FiberDock server.34 Six iterations of FiberDock were performed, starting with the refinement of only the clashing residues and progressively increasing the receptor and ligand flexibility and the side-chain optimization of the full interface residues. The orientation of p67 was identical for both wild-type HLCS and the G510R variant before the first iteration by the FiberDock protocol. Molecular visualization and superpositions were done using PYMOL/SWISS-PDB VIEWER (Schrodinger Sales Center, Portland, OR, USA).35, 36 The hydrogen bond interactions and non-bonded interactions were calculated using the LIGPLOT software suite (University College London Business, London, UK).37

Statistical analysis

Bartlett's test was used to confirm that variances were homogeneous. The significance of differences among HLCS variants was tested by one-way ANOVA, followed by Fisher's Protected Least Significant Difference procedure for posthoc testing. StatView 5.0.1 (SAS Institute; Cary, NC, USA) was used to perform all calculations. Differences were considered significant if P < 0.05. Data were expressed as mean±SD. Repeats represent independent samples assayed on distinct days.
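The arithmetic behind the one-way ANOVA and the Fisher's Least Significant Difference comparison can be sketched in Python as follows. This is an illustration only and not the StatView implementation. The function names and the three groups of made-up triplicate values are assumptions for the example. Bartlett's test and the conversion of F and t statistics into P values are omitted because the required distribution functions are not part of the Python standard library.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def lsd_t(group_a, group_b, ms_within):
    """t statistic of Fisher's LSD for one pairwise comparison.

    Uses the pooled within-group mean square from the ANOVA; in the
    'protected' procedure it is evaluated only if the overall F test
    is significant.
    """
    se = (ms_within * (1 / len(group_a) + 1 / len(group_b))) ** 0.5
    return (mean(group_a) - mean(group_b)) / se


# Made-up triplicates for three hypothetical groups (arbitrary units).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w, ms_w = one_way_anova(groups)
t_stat = lsd_t(groups[2], groups[0], ms_w)
```

The statistics are then compared with the critical values of the F distribution (df_between, df_within) and the t distribution (df_within) at the chosen significance level.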

Results

Enzyme kinetics analysis

HLCS variants Q699R and L216R had a lower affinity for biotin compared with wild-type HLCS (Table 2). For Q699R, Km was about 57% greater than in wild-type HLCS. For L216R, the enzyme activity was too low to permit meaningful quantification of Km. The affinity for biotin was not significantly altered in variants V96F, V96L, and G510R compared with wild-type HLCS.

Table 2 Km and Vmax values of wild-type rHLCS and its variants

HLCS variant Q699R was rescued by high concentrations of biotin, as evidenced by Vmax returning to wild-type levels in biotin-supplemented assay mixtures (Table 2). In contrast, variant L216R could not be rescued by supplemental biotin; Vmax was only 6% of that calculated for wild-type HLCS. Variants V96F and G510R showed a 22% and 27%, respectively, lower Vmax compared with wild-type HLCS, whereas Vmax of V96L was unaltered.

The Michaelis–Menten equation provided a good fit for the data points in enzyme kinetics analyses, as evidenced by the correlation coefficients of the fitted curves (Figure 1). Only for the L216R mutant was the correlation coefficient below 0.9, owing to the mutant's very low activity. Analysis of rHLCS and variants by gel electrophoresis suggests that recombinant proteins were >90% pure and that equal amounts of rHLCS were used in reaction mixtures (Figure 2).

Figure 1

Non-linear regression analysis of HLCS variants V96F (a), V96L (b), L216R (c), G510R (d), and Q699R (e) compared with wild-type HLCS. Data points were fitted by using the Michaelis–Menten equation (N=3 independent analyses for each HLCS variant). R, correlation coefficient; WT, wild-type.

Figure 2

Normalization of rHLCS concentrations for enzyme kinetics studies. Wild-type HLCS (WT) and variants were stained with coomassie blue (a) and probed with anti-HLCS (b).

HLCS structure and docking of p67

About 98% of the amino acid residues in the 3D model of HLCS are in the allowed regions of the Ramachandran plot (Figure 3a). The model has a backbone atom RMS deviation of 0.7 Å compared with the biotin protein ligase from Pyrococcus horikoshii OT3, mainly due to insertions in loops α1/β2, β3/α2, α2/β4, β6/β7, β7/α3, and α3/α4 described for the P. horikoshii template. These insertions also resulted in slightly longer α2 and α4 helices compared with the template. The biotin-binding site and adjacent amino acids in the central domain are highly conserved and have an RMSD of only 0.04 Å. The structure of the N-terminal domain could not be predicted because of the absence of a suitable template. Thus, we had to limit our 3D structure analysis of effects to SNPs in the central and C-terminal domains, that is, variants G510R and Q699R, respectively.
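For readers unfamiliar with the measure, the RMS deviation between two sets of matched atoms is the square root of the mean squared distance between corresponding atoms. The short Python sketch below shows this calculation under the assumption that the two structures have already been superimposed and that atoms are listed in the same order; the function name and the coordinates are hypothetical and do not come from the HLCS model.

```python
import math


def rmsd(coords_a, coords_b):
    """RMS deviation (same unit as the input) between matched 3D coordinates.

    Assumes the two structures are already superimposed and that the
    i-th atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical backbone atoms (in angstroms): every atom is shifted by 1 along z.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
deviation = rmsd(model, template)
```

Structure-comparison tools additionally determine the optimal superposition before computing this value; that step is not shown here.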

Figure 3

3D modeling of wild-type and variant HLCS, and substrate docking analysis. (a) 3D model of HLCS. Biotin-5′-AMP is shown as yellow sticks. The biotin-binding site is colored orange. The substrate p67-binding site is colored purple and the variant residues are shown in red (G510R) and blue (wild-type). The biotin-binding region and the C-terminal domain are highlighted with dotted circles. (b) Sequence conservation near residues 510 (top) and 699 (bottom); residues are identified by arrows. (c) Space-filled models of human PCC docked into the wild-type HLCS and the G510R variant. Biotin-5′-AMP is shown as red sticks. Residue 510 in HLCS and the reactive lysine residue in PCC are colored yellow. (d) Superposition of the p67-docked complexes (p67 not shown) of wild-type HLCS (blue) and the G510R variant (red) with undocked wild-type HLCS (green).

Based on the 3D model, the arginine-699 residue localizes in a solvent-exposed loop in the Q699R variant. Multiple sequence alignment of homologous sequences indicated that this position does not have a high degree of residue conservation (Figure 3b). Importantly, residue 699 is not close to the biotin-binding region and does not participate in the binding of p67 (see below). Thus, the 3D structure does not provide mechanistic insights into why biotin affinity is decreased in variant Q699R compared with wild-type HLCS.

In contrast, the 3D model offers useful information with regard to catalytic activity of the G510R variant. Residue 510 is located in the conserved β2/β3 loop in the biotin-binding site of the catalytic center in the central domain.38 Biotin forms hydrogen bonds with K506 and R508 in the loop (Online Supplementary Table 1). At position 508, the polypeptide makes a 90° turn, thereby positioning residues 509 and 510 away from the active site, so that they have no interactions with biotin. At position 510, the polypeptide again makes a 90° turn and, as seen from the sequence alignment, glycine is the highly preferred residue, with no occurrences of arginine in wild-type biotin protein ligases (Figure 3b). Glycine, due to its absence of a side chain, is probably indispensable for conserving the local geometry of the loop.

Based on our observation that high levels of biotin do not restore the activity of the G510R variant to that seen in wild-type HLCS in vitro (Table 2), we tested the possibility that this variant hinders interactions with carboxylases. Our 3D docking model is consistent with the theory that the arginine-for-glycine substitution impairs the binding of carboxylases to HLCS (Figure 3c). A prominent difference between the G510R variant and wild-type HLCS is the change in the position of the reactive K694 residue of p67, which is further away from the biotin molecule in the variant compared with wild-type HLCS. This effect is due to a change in the orientation of p67, apparently caused by the interaction with the large side chain of arginine in G510R compared with glycine in wild-type HLCS. The change is linked with altered hydrogen bond interactions of p67 in the case of G510R HLCS (Online Supplementary Table 2). Arginine-510 forms a new hydrogen bond with glycine-715 in p67. Importantly, the hydrogen bond found in the wild-type complex between biotin in HLCS and K694 of p67 is absent in the G510R variant (Online Supplementary Table 2). The binding of substrate to biotin protein ligase in Pyrococcus horikoshii OT3 is associated with opening of the active site loop and the C-terminal domain.38 This effect was visualized by superposition of the wild-type and G510R-docked complexes (Figure 3d), and further confirmed by assessing the changes in the hydrogen bond and non-bonded interactions between biotin and HLCS after docking with p67 (Online Supplementary Table 1). It appears that the changes in p67 binding hinder the movement of p67 in the active site and the C-terminal domain of HLCS. In contrast, the binding of biotin may not be affected in G510R, as most of the active site hydrogen bonds that are observed in the wild-type HLCS-p67 complex are retained in the variant enzyme-p67 complex (Online Supplementary Table 2).

Discussion

This is the first report to integrate biochemical, structural biology, and molecular biology approaches to characterize SNPs in human HLCS with regard to their effects on biotin affinity and catalytic activity. More than 2500 SNPs in HLCS were screened in silico to identify variants that cause amino acid substitutions in exons. Five such variants were identified in three of the four domains in the HLCS protein.26 Using enzyme kinetics analysis, we demonstrated that variant Q699R causes a decrease in substrate affinity, and that enzyme activity can be restored to wild-type activity by supplemental biotin in our in vitro assays. In contrast, the biotin affinity of variants V96F and G510R is not significantly different from that of wild-type HLCS, but the Vmax of these variants remains moderately below normal even when supplemental biotin is provided. Km and Vmax of the V96L variant are similar to those of wild-type HLCS, which is not surprising given the conservative nature of the amino acid substitution. The activity of L216R (a rare mutant rather than an SNP) is near zero, consistent with previous studies.28, 39 Collectively, our findings suggest that individuals with HLCS variants may benefit from supplemental biotin, yet to different extents depending on the genotype. Please note that the plasma levels of biotin are about 250 pM in apparently healthy adults on a mixed diet, and that levels can be increased 40 times by using over-the-counter biotin supplements providing 300–600 μg/day biotin;40, 41 these levels are still well below saturation of cellular biotin transporters.42 Based on these observations one can assume that the intracellular concentration of biotin will increase substantially in response to biotin supplementation. Note, however, that we do not recommend the widespread use of biotin supplements until a benefit has been demonstrated in vivo.

Our 3D and docking analyses provide interesting mechanistic insights into enzyme catalysis by the G510R variant. Apparently, the highly conserved glycine residue is crucial for interactions between HLCS and carboxylases. From a health perspective the G510R variant is a greater challenge than the Q699R variant. For the latter, enzyme activities can be restored to wild-type activities by supplemental biotin. For the former, enzyme activities can, theoretically, be restored to wild-type activities only by increasing the abundance of apo-carboxylases, which is difficult to achieve. One might consider biotin supplementation, which is known to increase the expression of 3-methylcrotonyl-CoA carboxylases in human cells.43

Based on the findings from this study scientists can now focus on those SNPs in HLCS that are most likely to be relevant to human health, that is, SNPs that alter the affinity for biotin. This knowledge is a major advance compared with using quantitative trait loci analysis with its notoriously low resolution at the genome level, which makes it difficult to link a specific gene with increased risk for disease. Previous studies suggest links between HLCS, cancer,17, 18, 19 and genome stability.9 Presumably, these effects are caused by the participation of HLCS in forming a multiprotein repressor complex at the chromatin level, and the de-repression of genes in HLCS-deficient individuals (see below). Consistent with this hypothesis, a recent publication provides evidence that HLCS knockdown is linked with de-repression of retrotransposons.9 Importantly, the roles of HLCS in preventing retrotransposition events9 might offer a mechanism to explain birth defects and reproductive failure observed in biotin-deficient animal models.10, 44, 45

Some of the effects of HLCS in epigenetic pathways might be mediated by physical interactions of HLCS with chromatin proteins other than HLCS-dependent biotinylation of lysine residues and histones. For example, we previously demonstrated that HLCS physically interacts with histone H3 (ref. 5), and we proposed that HLCS interacts with the methylated cytosine binding protein MeCP2.46 We envision that HLCS is an integral part of a gene repression complex that may also include histone methyl transferases, histone deacetylases and the nuclear co-repressor N-CoR. The possible effects of HLCS SNPs on the formation of such a repressor complex may be different from the effects on catalytic activity and will need to be investigated once all HLCS-binding partners have been identified.

We are reasonably confident that the Km values reported for HLCS in this study are similar to the concentrations of biotin in human tissues, notwithstanding the uncertainties with regard to the true concentrations of biotin in human tissues and cellular compartments. Evidence suggests that the concentrations of biotin in liver and kidney are about 3.3 and 4.5 μmol/kg, respectively, in pigs fed a normal diet containing 200 μg biotin/kg.47 Likewise, the concentration of biotin is about 500–900 fmol per cell in human embryonic palatal mesenchymal cells in culture,10 which translates into a concentration of about 6.8 μmol/l assuming a cellular volume of 0.3 nl.48 The tissue concentration of free biotin might be lower than that of protein-bound biotin,49 but this might be offset by local accumulation of free biotin in HLCS-rich microdomains. If the Km values reported for HLCS in this study are similar to the concentrations of biotin in human tissues, and if our in vitro findings can be recapitulated in vivo, then biotin supplementation might benefit individuals homozygous for the HLCS variant Q699R. Future studies will need to determine whether Q699R homozygous individuals have an increased risk for cancer, miscarriages, and birth defects in genome-wide association studies. Finally, our laboratory is currently investigating HLCS promoters and the effects of SNPs in these promoters on HLCS gene regulation. If any of these SNPs turn out to affect gene expression, they would need to be included in studies of the prevalence of HLCS SNPs.