Communication
Rationalization of Gene Regulation by a Eukaryotic Transcription Factor: Calculation of Regulatory Region Occupancy from Predicted Binding Affinities

https://doi.org/10.1016/S0022-2836(02)00894-X

Abstract

DNA-binding proteins regulate gene expression by binding preferentially to a set of related sequences. To quantify the correlation between gene regulation and the presence of sequence motifs, the affinity of a transcription factor for each variant of the binding site must be known or predicted. In addition, the contribution of multiple binding sites to the regulation of a single gene must be modeled. To predict the affinity of the yeast Leu3 transcription factor for genomic binding sites, we measured the in vitro equilibrium dissociation constants of 43 binding-site variants and established that the free energy of binding can be approximated as a sum of free energy contributions from each base-pair. This allows the prediction of an equilibrium dissociation constant for all potential binding sites in the genome and, therefore, their fractional occupancy at some assumed concentration of free Leu3. From the occupancy of individual sites, the probability that at least one site is occupied within a defined segment upstream of a gene was calculated for all genes in yeast. We find that this probability is substantially better correlated with regulation by Leu3 than is the number of binding sites. This is true whether the number of binding sites is based on a consensus definition of the binding site or on enumeration of all variants that have a predicted Kd value below some threshold. The occupancy calculation was best able to rationalize the Leu3-regulated gene set over a Leu3 concentration range that spans the Kd values for the best sites.
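
The additive free energy model and the occupancy of a single site, as described in the abstract, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the penalty matrix `TOY_DDG`, the 3 bp site length, the reference Kd of 1 nM, the temperature behind `RT_KCAL_PER_MOL`, and the function names are all assumptions made for the example. The sketch assumes that Kd scales as exp(ΔΔG/RT) relative to the best site and that occupancy follows a simple two-state binding isotherm.

```python
import math

RT_KCAL_PER_MOL = 0.593  # R*T near 25 °C; the temperature is an assumption

# Hypothetical free energy penalties (kcal/mol) relative to the best base at
# each position of a toy 3 bp site. These values are invented, not measured.
TOY_DDG = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 2.0},
    {"A": 1.5, "C": 0.0, "G": 0.9, "T": 1.1},
    {"A": 0.7, "C": 1.8, "G": 0.0, "T": 1.3},
]

def predicted_kd(site, ddg_matrix, kd_best):
    """Predict Kd by summing independent per-position penalties."""
    ddg = sum(ddg_matrix[i][base] for i, base in enumerate(site))
    return kd_best * math.exp(ddg / RT_KCAL_PER_MOL)

def fractional_occupancy(kd, free_protein):
    """Two-state occupancy; kd and free_protein must share units."""
    return free_protein / (free_protein + kd)

# Best toy site (zero penalty at every position) and a single-base variant.
kd_best_site = predicted_kd("ACG", TOY_DDG, kd_best=1.0)  # nM, assumed
kd_variant = predicted_kd("CCG", TOY_DDG, kd_best=1.0)

print(fractional_occupancy(kd_best_site, free_protein=1.0))
print(fractional_occupancy(kd_variant, free_protein=1.0))
```

When the free protein concentration equals the Kd of a site, that site is half occupied, which is why the assumed concentration matters most when it lies near the Kd values of the best sites.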

Section snippets

Derivation of a Leu3 binding-site model

The DNA-binding domain of Leu3 (amino acid residues 1–147) was expressed and purified to near-homogeneity as a fusion protein using the pMAL system in Escherichia coli.3,11,12 An electrophoretic mobility shift assay (EMSA) was then used to determine the affinity of this protein for 50 variants of the binding site (Table 1).13 Of the 50 variants, 29 were obtained by an in vitro selection procedure.14,15 The selection served as a way to obtain diverse variants that bind with moderate to

Calculation of occupancy and correlation with gene expression

Our purpose in deriving a model for binding affinity is to see how well that model can be used to rationalize gene regulation. In addition to a model for binding affinity, we need a model for how multiple binding sites contribute to gene regulation. As with the affinity model, the goal is to use models with as few parameters as possible and then add complexity to the model as necessary and when justified by the data. The simple regulatory model we have implemented here is that binding to any
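
As a rough illustration of how region-level occupancy differs from counting sites, the sketch below computes the probability that at least one site in a regulatory region is occupied and compares it with a simple count of sites below a Kd threshold. It assumes that sites are occupied independently of one another, which the text above does not state explicitly. The Kd values, the free protein concentration, the threshold, and the function names are hypothetical.

```python
def prob_any_site_occupied(kds, free_protein):
    """P(at least one site occupied), assuming sites bind independently."""
    p_none = 1.0
    for kd in kds:
        p_none *= kd / (free_protein + kd)  # probability this site is empty
    return 1.0 - p_none

def count_sites_below(kds, kd_threshold):
    """Alternative metric: number of sites with Kd below a threshold."""
    return sum(1 for kd in kds if kd < kd_threshold)

# Hypothetical regulatory regions (Kd in nM, values invented).
gene_a = [1.0]               # one strong site
gene_b = [50.0, 50.0, 50.0]  # three weak sites

free_leu3 = 1.0  # nM, assumed concentration of free protein

p_a = prob_any_site_occupied(gene_a, free_leu3)
p_b = prob_any_site_occupied(gene_b, free_leu3)
n_a = count_sites_below(gene_a, kd_threshold=100.0)
n_b = count_sites_below(gene_b, kd_threshold=100.0)

print(p_a, p_b)  # occupancy favors gene_a
print(n_a, n_b)  # site count favors gene_b
```

In this toy case the two metrics rank the genes in opposite order: the count treats every site under the threshold as equivalent, whereas the occupancy calculation weights each site by its predicted affinity.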

Prospects for improvement

More elaborate models for binding affinity might improve our ability to rationalize gene expression. It has been suggested, for example, that dependencies between neighboring base-pairs could be determined systematically by measuring the binding constants for all 16 variants of a particular dinucleotide, and then representing the data in a 16×(N−1) weight matrix of overlapping dinucleotides.22 This weight matrix has many more parameters than the standard 4×N matrix of independent base-pairs and
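
To make the difference in parameter count concrete, here is a minimal sketch of a weight matrix over overlapping dinucleotides alongside the standard independent base-pair matrix. The site length of 10, the example sequence, the single non-zero penalty, and all function names are illustrative assumptions; no measured dinucleotide energies are implied.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]  # 16 dinucleotides

def zero_dinuc_matrix(n):
    """16 x (N-1) matrix: one column of 16 entries per overlapping pair."""
    return [{d: 0.0 for d in DINUCS} for _ in range(n - 1)]

def dinuc_energy(site, matrix):
    """Sum contributions of the N-1 overlapping dinucleotides in `site`."""
    return sum(matrix[i][site[i:i + 2]] for i in range(len(site) - 1))

def param_counts(n):
    """Entries in the 4 x N mononucleotide and 16 x (N-1) dinucleotide matrices."""
    return 4 * n, 16 * (n - 1)

matrix = zero_dinuc_matrix(10)
matrix[0]["CC"] = 1.0  # hypothetical penalty for "CC" at the first position pair

print(param_counts(10))
print(dinuc_energy("CCGGTACCGG", matrix))
```

For a 10 bp site the dinucleotide matrix has 144 entries against 40 for the independent model, so considerably more binding measurements would be needed to determine it.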

References (24)

  • J.Y. Sze et al. Purification and structural characterization of transcriptional regulator Leu3 of yeast. J. Biol. Chem. (1993)
  • W.W. Wasserman et al. Identification of regulatory regions which confer muscle-specific gene expression. J. Mol. Biol. (1998)
  • D.S. Fields et al. Quantitative specificity of the Mnt repressor. J. Mol. Biol. (1997)
  • K. Zhou et al. Structure of yeast regulatory gene LEU3 and evidence that LEU3 itself is under general amino acid control. Nucl. Acids Res. (1987)
  • Y. Hu et al. The Saccharomyces cerevisiae Leu3 protein activates expression of GDH1, a key gene in nitrogen assimilation. Mol. Cell Biol. (1995)
  • P.S. Nielsen et al. Transcriptional regulation of the Saccharomyces cerevisiae amino acid permease gene BAP2. Mol. Gen. Genet. (2001)
  • P. Friden et al. LEU3 of Saccharomyces cerevisiae activates multiple genes for branched-chain amino acid biosynthesis by binding to a common decanucleotide core sequence. Mol. Cell Biol. (1988)
  • G.D. Stormo. DNA binding sites: representation and discovery. Bioinformatics (2000)
  • Clarke, N. D., Granek, J. A. (2002). Rank order metrics for quantifying the association of sequence features with gene...
  • M. Markstein et al. Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo. Proc. Natl Acad. Sci. USA (2002)
  • B.P. Berman et al. Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc. Natl Acad. Sci. USA (2002)
  • J.Y. Sze et al. Transcriptional regulator Leu3 of Saccharomyces cerevisiae: separation of activator and repressor functions. Mol. Cell Biol. (1993)