  • Full Paper
Mining microarray data to identify transcription factors expressed in naïve resting but not activated T lymphocytes

Abstract

Transcriptional repressors controlling the expression of cytokine genes have been implicated in a variety of physiological and pathological phenomena. An unknown repressor that binds to the distal NFAT element of the interleukin-2 (IL-2) gene promoter in naïve T-helper lymphocytes has been implicated in autoimmune phenomena and has emerged as a potentially important factor controlling the latency of HIV-1. The aim of this paper was to identify this repressor. We searched public microarray databases for DNA-binding proteins that are present in naïve resting T cells but are downregulated when the cells are activated. A Bayesian data-mining analysis uncovered 25 candidate factors. Of the 25, NFAT4 and the oncogene ets-2 bind to the common motif AAGGAG found in the HIV-1 LTR and IL-2 probes. The Ets-2 binding site contains the three Gs that have been shown to be important for binding of the unknown factor; hence, we considered it the likeliest candidate. Electrophoretic mobility shift assays confirmed cross-reactivity between the unknown repressor and anti-Ets-2 antibodies, and cotransfection experiments demonstrated the direct involvement of Ets-2 in silencing the IL-2 promoter. Designing experiments for transcription factor analysis using microarrays and Bayesian statistical methodologies provides a novel way toward elucidation of gene control networks.

References

  1. Shores EW, Love PE . TCR zeta chain in T cell development and selection. Curr Opin Immunol 1997; 9: 380–389.

  2. Valitutti S, Muller S, Cella M, Padovan E, Lanzavecchia A . Serial triggering of many T-cell receptors by a few peptide–MHC complexes. Nature 1995; 375: 148–151.

  3. Rogers PR, Huston G, Swain SL . High antigen density and IL-2 are required for generation of CD4 effectors secreting Th1 rather than Th0 cytokines. J Immunol 1998; 161: 3844–3852.

  4. Iezzi G, Karjalainen K, Lanzavecchia A . The duration of antigenic stimulation determines the fate of naive and effector T cells. Immunity 1998; 8: 89–95.

  5. Rao A . NF-ATp: a transcription factor required for the co-ordinate induction of several cytokine genes. Immunol Today 1994; 15: 274–281.

  6. Jain J, Loh C, Rao A . Transcriptional regulation of the IL-2 gene. Curr Opin Immunol 1995; 7: 333–342.

  7. Rao A, Luo C, Hogan PG . Transcription factors of the NFAT family: regulation and function. Annu Rev Immunol 1997; 15: 707–747.

  8. Serfling E, Berberich-Siebelt F, Chuvpilo S et al. The role of NF-AT transcription factors in T cell activation and differentiation. Biochim Biophys Acta 2000; 1498: 1–18.

  9. Chuvpilo S, Schomberg C, Gerwig R et al. Multiple closely-linked NFAT/octamer and HMG I(Y) binding sites are part of the interleukin-4 promoter. Nucleic Acids Res 1993; 21: 5694–5704.

  10. Szabo SJ, Gold JS, Murphy TL, Murphy KM . Identification of cis-acting regulatory elements controlling interleukin-4 gene expression in T cells: roles for NF-Y and NF-ATc. Mol Cell Biol 1993; 13: 4793–4805.

  11. Rooney JW, Hoey T, Glimcher LH . Coordinate and cooperative roles for NF-AT and AP-1 in the regulation of the murine IL-4 gene. Immunity 1995; 2: 473–483.

  12. Goldfeld AE, McCaffrey PG, Strominger JL, Rao A . Identification of a novel cyclosporin-sensitive element in the human tumor necrosis factor alpha gene promoter. J Exp Med 1993; 178: 1365–1379.

  13. McCaffrey PG, Goldfeld AE, Rao A . The role of NFATp in cyclosporin A-sensitive tumor necrosis factor-alpha gene transcription. J Biol Chem 1994; 269: 30445–30450.

  14. Cockerill PN, Bert AG, Jenkins F et al. Human granulocyte–macrophage colony-stimulating factor enhancer function is associated with cooperative interactions between AP-1 and NFATp/c. Mol Cell Biol 1995; 15: 2071–2079.

  15. Masuda ES, Tokumitsu H, Tsuboi A et al. The granulocyte–macrophage colony-stimulating factor promoter cis-acting element CLE0 mediates induction signals in T cells and is recognized by factors related to AP1 and NFAT. Mol Cell Biol 1993; 13: 7399–7407.

  16. Sikder SK, Mitra D, Laurence J . Identification of a novel cell-type and context specific enhancer within the negative regulatory element of the human immunodeficiency virus type 1 long terminal repeat. Arch Virol 1994; 137: 139–147.

  17. Luo C, Copeland NG, Jenkins NA et al. Normal function of the transcription factor NFAT1 in wasted mice. Chromosome localization of NFAT1 gene. Gene 1996; 180: 29–36.

  18. Mouzaki A, Weil R, Muster L, Rungger D . Silencing and trans-activation of the mouse IL-2 gene in Xenopus oocytes by proteins from resting and mitogen-induced primary T-lymphocytes. EMBO J 1991; 10: 1399–1406.

  19. Mouzaki A, Rungger D, Tucci A, Doucet A, Zubler RH . Occurrence of a silencer of the interleukin-2 gene in naive but not in memory resting T helper lymphocytes. Eur J Immunol 1993; 23: 1469–1474.

  20. Mouzaki A, Rungger D . Properties of transcription factors regulating interleukin-2 gene transcription through the NFAT binding site in untreated or drug-treated naive and memory T-helper cells. Blood 1994; 84: 2612–2621.

  21. Mouzaki A, Doucet A, Mavroidis E, Muster L, Rungger D . A repression–derepression mechanism regulating the transcription of human immunodeficiency virus type 1 in primary T cells. Mol Med 2000; 6: 377–390.

  22. Yang YH, Dudoit S, Luu P et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002; 30: e15.

  23. Newton MA, Kendziorski CM, Richmond CS, Blattner FR, Tsui KW . On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. J Comput Biol 2001; 8: 37–52.

  24. Baldi P, Long AD . A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001; 17: 509–519.

  25. Mouzaki A, Theodoropoulou M, Gianakopoulos I et al. Expression patterns of Th1 and Th2 cytokine genes in childhood idiopathic thrombocytopenic purpura (ITP) at presentation and their modulation by intravenous immunoglobulin G (IVIg) treatment: their role in prognosis. Blood 2002; 100: 1774–1779.

  26. Quackenbush J . Computational analysis of microarray data. Nat Rev Genet 2001; 2: 418–427.

  27. Chen Y, Dougherty ER, Bittner ML . Ratio-based decisions and the quantitative analysis of cDNA microarray images. J Biomed Opt 1997; 2: 364–374.

  28. Ideker T, Thorsson V, Siegel AF, Hood LE . Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data. J Comput Biol 2000; 7: 805–817.

  29. Tusher VG, Tibshirani R, Chu G . Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 2001; 98: 5116–5121.

  30. Perneger TV . What's wrong with Bonferroni adjustments. BMJ 1998; 316: 1236–1238.

  31. Greenfield A . Applications of DNA microarrays to the transcriptional analysis of mammalian genomes. Mamm Genome 2000; 11: 609–613.

  32. Scheffe H . Practical solutions of the Behrens–Fisher problem. J Am Stat Assoc 1970; 65: 1501–1508.

  33. Ebenhardt K, Guthie W . Should (X1–X2) have larger uncertainty than X1? Online Publications, Statistical Engineering Division, National Institute of Standards and Technology, 2001, 10-5-2003.

  34. Robinson GK . Properties of the Student's t and the Behrens–Fisher solution to the two means problem. Ann Stat 1976; 4: 963–971.

  35. Barnard GA . Comparing the means of independent samples. Appl Stat 1984; 33: 266–271.

  36. Oukka M, Ho IC, de la Brousse FC et al. The transcription factor NFAT4 is involved in the generation and survival of T cells. Immunity 1998; 9: 295–304.

  37. Morrow MA, Mayer EW, Perez CA, Adlam M, Siu G . Overexpression of the helix–loop–helix protein Id2 blocks T cell development at multiple stages. Mol Immunol 1999; 36: 491–503.

  38. Rengarajan J, Tang B, Glimcher LH . NFATc2 and NFATc3 regulate T(H)2 differentiation and modulate TCR-responsiveness of naive T(H)cells. Nat Immunol 2002; 3: 48–54.

  39. Degnan BM, Degnan SM, Naganuma T, Morse DE . The ets multigene family is conserved throughout the Metazoa. Nucleic Acids Res 1993; 21: 3479–3484.

  40. Shore P, Whitmarsh AJ, Bhaskaran R et al. Determinants of DNA-binding specificity of ETS-domain transcription factors. Mol Cell Biol 1996; 16: 3338–3349.

  41. Anderson MK, Hernandez-Hoyos G, Diamond RA, Rothenberg EV . Precise developmental regulation of Ets family transcription factors during specification and commitment to the T cell lineage. Development 1999; 126: 3131–3148.

  42. Alizadeh AA, Eisen MB, Davis RE et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–511.

  43. Lennon G, Auffray C, Polymeropoulos M, Soares MB . The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression. Genomics 1996; 33: 151–152.

  44. Sherlock G, Hernandez-Boussard T, Kasarskis A et al. The Stanford Microarray Database. Nucleic Acids Res 2001; 29: 152–155.

  45. Lee PM . Bayesian Statistics: An Introduction. Arnold: London, 199, pp 117–138.

  46. Maeder R . Programming in Mathematica. Addison-Wesley: Reading, MA, 1997.

  47. Wingender E, Chen X, Hehl R et al. TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 2000; 28: 316–319.

  48. Clipstone NA, Crabtree GR . Identification of calcineurin as a key signalling enzyme in T-lymphocyte activation. Nature 1992; 357: 695–697.

  49. Foecking MK, Hofstetter H . Powerful and versatile enhancer–promoter unit for mammalian expression vectors. Gene 1986; 45: 101–105.

  50. Gorman CM, Moffat LF, Howard BH . Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells. Mol Cell Biol 1982; 2: 1044–1051.

  51. Loredo T . From Laplace to supernova SN 1987a: Bayesian inference in astrophysics. In: Fougere F (ed). Maximum Entropy and Bayesian Methods. Kluwer Academic Publishers: Dordrecht, 1990, pp 81–142.

  52. Duong Q, Shorrock R . On Behrens–Fisher solutions. Statistician 1996; 45: 57–63.

  53. Bernardo JM, Smith AFM . Bayesian Theory. Wiley: New York, 1994.

Author information

Corresponding author

Correspondence to A Mouzaki.

Appendix

The Bayesian interpretation of probability and the Behrens–Fisher problem

Traditionally, probability is identified with the long-run relative frequency of occurrence of an event, either in a sequence of repeated experiments or in an ensemble of ‘identical’ systems. This view of probability is known as the ‘frequentist’ view; it is also called the ‘classical,’ ‘orthodox’ or ‘sampling theory’ view. It is the basis for many of the textbook statistical procedures currently in use. Bayesian probability theory (BPT) is founded on a much more general definition of probability. In BPT, probability is regarded as a real-number-valued measure of the plausibility of a proposition when incomplete knowledge does not allow us to establish its truth or falsehood with certainty. The measure is taken on a scale where 1 represents certainty of the truth of the proposition and 0 represents certainty of its falsehood. In the Bayesian framework probability theory is just common sense reduced to numbers, and probability represents the observer's belief that a certain event is true.51

The tool for updating one's beliefs about the plausibility of a hypothesis (H) given available data (E) and background information (ie context I) is given by Bayes' theorem:

$$\pi(H \mid E, I) = \frac{P(H \mid I)\, P(E \mid H, I)}{P(E \mid I)}$$

The left-hand term, π(H | E, I), is called the posterior probability, and it gives the probability of the hypothesis H after considering the effect of evidence E in context I. The P(H | I) term is the prior probability of H given I alone, that is, the belief in H before the evidence E is considered. The term P(E | H, I) is called the likelihood, and it gives the probability of the evidence assuming that the hypothesis H and background information I are true. The denominator, P(E | I), is independent of H and can be regarded as a normalizing or scaling constant. The information I is a conjunction of (at least) all of the other statements (background knowledge) relevant to determining P(H | I) and P(E | I). For notational convenience, and when the context is understood, I is dropped from the expressions.
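
As a minimal numerical sketch of this update rule (an illustration only, not part of the original analysis), the Python snippet below applies Bayes' theorem to a hypothesis H and its complement. The function name `posterior` and its argument names are assumptions introduced here; the denominator P(E) is expanded by the law of total probability over H and not-H.

```python
def posterior(prior_h, lik_e_given_h, lik_e_given_not_h):
    """Posterior probability of H given evidence E, by Bayes' theorem.

    prior_h            -- P(H), belief in H before seeing E
    lik_e_given_h      -- P(E | H), likelihood of E if H is true
    lik_e_given_not_h  -- P(E | not H), likelihood of E if H is false
    """
    # Normalizing constant P(E): total probability of the evidence
    # over the two mutually exclusive, exhaustive hypotheses.
    evidence = prior_h * lik_e_given_h + (1.0 - prior_h) * lik_e_given_not_h
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return prior_h * lik_e_given_h / evidence
```

With an even prior of 0.5 and likelihoods 0.8 and 0.2, the evidence term is 0.5 and the posterior for H is 0.4 / 0.5 = 0.8.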

The posterior distribution is the fundamental object of Bayesian analysis and contains the relevant information needed to reason further about a hypothesis.

Turning to our example of gene downregulation, we calculate the two posterior probabilities (which sum to one, since they represent mutually exclusive and exhaustive events): π(H0 | E, I) (the probability that a gene is downregulated given the microarray measurements) and π(H1 | E, I) (the probability that a gene is not downregulated given the microarray measurements).

The ratio of the two posterior probabilities gives the posterior odds (ie how much more likely the null hypothesis is than the alternative) and is called the POR. A ratio of 20, for example, implies that the null hypothesis is 20 times more likely than the alternative. A POR threshold essentially defines a region separating downregulated from upregulated and ‘no-change’ genes in a probabilistic manner. This follows from the simple probability calculus relation:

If the events of up- and downregulation are defined in terms of a symmetric arbitrary cutoff (C) for the difference δ, the equation above is written as

or

Substituting the definition of POR we obtain:

The calculation of the relevant posterior probabilities in the present situation is known as the two means or Behrens–Fisher (BF) problem. The latter, which represents the most common problem in applied statistics,32 is concerned with hypothesis tests for comparing the means of two normally distributed populations whose variances are not assumed to be equal. The original BF solution (known as the BF distribution) is an exact solution that can be derived when one adopts a Bayesian perspective; from the frequentist viewpoint one is left with approximations, notably the Welch–Satterthwaite test,52 which would be inadequate for small sample sizes.33
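
For orientation, the frequentist approximation mentioned above can be sketched in a few lines of Python. This is a generic textbook form of the Welch statistic with Welch–Satterthwaite degrees of freedom, not code from the paper; the function name `welch_t` is an assumption made for illustration.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Compares two sample means without assuming equal variances.
    Each sample needs at least two observations.
    """
    n1, n2 = len(sample1), len(sample2)
    # Squared standard errors of the two means (unbiased sample variance / n).
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Approximate degrees of freedom of the reference t distribution.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

When both groups have the same size n and the same sample variance, the degrees of freedom reduce to 2(n − 1), the value used by the pooled-variance t-test. With only a handful of arrays per condition the approximation rests on very few degrees of freedom, which is the small-sample concern raised in the text.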

To find the posterior probability of downregulation from the available data, one needs to calculate the right tail region of the BF distribution. In the analysis of the current data set, the prior used is the so-called reference prior,53 which considers all possible values of θ1, θ2 (mean expression levels for a single gene in the resting and activated state) and σ1, σ2 (dispersions of expression ratios) equally likely. The choice of this prior is motivated by the following:

  1. it is the option that would have the least impact on the final results,53 letting the data ‘speak for themselves,’

  2. lack of background data to guide the specification of informative priors for EST probes,

  3. a convenient default option due to the number of comparisons made (>10 000 in our case).
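
The right-tail calculation described above can be approximated by simulation. The sketch below is a hedged illustration, not the authors' implementation (their own code is not shown here). It assumes the common noninformative prior proportional to 1/(σ1σ2), under which each mean has an independent posterior of the form x̄ + (s/√n)·t with n − 1 degrees of freedom, so that the difference δ = θ1 − θ2 follows a Behrens–Fisher distribution. It further assumes that downregulation on activation corresponds to δ > C. All function and parameter names are hypothetical.

```python
import math
import random
import statistics


def sample_student_t(nu, rng):
    """Draw one Student-t variate with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return z / math.sqrt(chi2 / nu)


def posterior_prob_downregulated(resting, activated, cutoff=0.0,
                                 draws=20000, seed=0):
    """Monte Carlo estimate of P(theta_resting - theta_activated > cutoff | data).

    Assumes normal data and a prior proportional to 1/(sigma1 * sigma2);
    each group needs at least two measurements.
    """
    rng = random.Random(seed)
    n1, n2 = len(resting), len(activated)
    m1, m2 = statistics.mean(resting), statistics.mean(activated)
    se1 = statistics.stdev(resting) / math.sqrt(n1)
    se2 = statistics.stdev(activated) / math.sqrt(n2)
    hits = 0
    for _ in range(draws):
        # Independent posterior draws of the two means.
        theta1 = m1 + se1 * sample_student_t(n1 - 1, rng)
        theta2 = m2 + se2 * sample_student_t(n2 - 1, rng)
        if theta1 - theta2 > cutoff:
            hits += 1
    return hits / draws


def posterior_odds(p):
    """Odds p / (1 - p) in favor of a hypothesis with posterior probability p."""
    return math.inf if p >= 1.0 else p / (1.0 - p)


def prob_from_odds(odds):
    """Inverse of posterior_odds: probability implied by a given odds value."""
    return odds / (1.0 + odds)
```

Because the two posterior probabilities sum to one, an odds threshold maps directly to a probability threshold: a POR of 20 corresponds to a posterior probability of 20/21 ≈ 0.952 for the null hypothesis. Simulation is used here only for brevity; the tail area of the BF distribution can also be obtained by numerical integration.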

About this article

Cite this article

Argyropoulos, C., Nikiforidis, G., Theodoropoulou, M. et al. Mining microarray data to identify transcription factors expressed in naïve resting but not activated T lymphocytes. Genes Immun 5, 16–25 (2004). https://doi.org/10.1038/sj.gene.6364034

  • DOI: https://doi.org/10.1038/sj.gene.6364034
