Identification and characterization of the human long form of Sox5 (L-SOX5) gene
Introduction
The Sox (Sry-type HMG box) family of transcription factors is related to the testis-determining gene Sry and is defined by the presence of a high-mobility group (HMG) DNA-binding domain. Sox proteins interact with DNA through this domain in a sequence-specific manner, binding in the minor groove of DNA and inducing a significant bend (Ferrari et al., 1992, Connor et al., 1994). Sox proteins are therefore thought to function as architectural proteins that organize local chromatin structure, assemble other DNA-binding transcription factors and induce correct gene expression (Werner and Burley, 1997, Wolffe, 1994). Sox function is critical to a number of developmental processes, including sex determination (SOX9) (Foster et al., 1994, Wagner et al., 1994), lens development (Sox1, 2 and 3) (Kamachi et al., 1998), T-cell differentiation (Sox4) (van de Wetering et al., 1993), endocardial ridge development (Sox4) (Schilham et al., 1996), cardiac and skeletal muscle development (Sox6) (Hagiwara et al., 2000) and neural crest cell differentiation (Sox10) (Southard-Smith et al., 1998).
Sox proteins are categorized into six subfamilies based on sequence homology within the HMG box and other domains (Pevny and Lovell-Badge, 1997). The group D subfamily consists of Sox5, Sox6 and Sox13. Although many Sox proteins are encoded by a single exon, group D Sox genes contain multiple exons with conserved genomic structure (Wunderle et al., 1996, Argentaro et al., 2000). They also share conserved N-terminal domains, including a leucine zipper, coiled-coil domains and a glutamine-rich region (Q-box), as well as the highly conserved HMG domain (Kido et al., 1998), which are considered critical to their function.
Group D Sox genes are known to express two types of alternatively spliced transcripts, a short form and a long form (Hiraoka et al., 1998, Argentaro et al., 2000). The short form corresponds to part of the long form and lacks the characteristic N-terminal domains. The two forms exhibit different expression patterns. In mouse, the short forms of Sox5 and Sox6 are predominantly expressed in testis (Connor et al., 1995, Denny et al., 1992). In contrast, the long form of Sox6 is expressed in multiple tissues, especially in skeletal muscle, and the long form of Sox5 (L-Sox5) is primarily expressed in cartilage (Lefebvre et al., 1998). Co-expressed with Sox9 during chondrogenesis, L-Sox5 and Sox6 heterodimerize via their coiled-coil domains and activate the type II collagen gene (Col2A1), which encodes a major matrix protein of cartilage (Lefebvre et al., 1998).
In humans, only the short form of the SOX5 gene has previously been identified (Wunderle et al., 1996). This short form shares high homology with mouse Sox5. Human SOX6, which is predicted to interact with the long form of SOX5, has recently been identified (Cohen-Barak et al., 2001). These facts strongly support the existence of a long form of human SOX5 (L-SOX5). Identification of the human L-SOX5 gene will be an important first step toward determining the precise role of the group D SOX genes in human chondrogenesis and other developmental pathways.
Here we report the isolation and characterization of the human L-SOX5 gene. L-SOX5 contains all cardinal motifs common to group D SOX proteins. Like its mouse counterpart, L-SOX5 has multiple transcriptional start sites and multiple alternative splicing variants, but it shows a unique expression pattern in human tissues.
5′- and 3′-RACE
To extend the SOX5 sequence in the 5′- and 3′-directions, we performed rapid amplification of cDNA ends (RACE). Because mouse L-Sox5 is expressed in both liver and cultured chondrocytes, and human SOX5 is expressed in testis, we used three templates: Marathon-Ready human liver and testis cDNAs (Clontech, Palo Alto, CA) and a custom-made cDNA from cultured human chondrocytes, prepared with the Marathon cDNA Amplification Kit (Clontech) according to the manufacturer's instructions.
The 5′-RACE was
Identification of cDNAs encoding human L-SOX5
Through 5′-RACE analysis, we identified two distinct L-SOX5 transcripts. The longest cDNA sequence obtained by RACE analysis, along with its deduced amino acid sequence, is shown in Fig. 1 (DDBJ accession no. AB081588). The cDNA sequence contains an open reading frame that encodes a 763-amino-acid protein, which exceeds the length of the SOX5 short form (GDB 5584271) by up to 416 amino acids. The translation start site for the short form of SOX5 is located at codon 417 in L-SOX5. An in-frame
Discussion
We have identified the human L-SOX5 cDNA. The predicted human L-SOX5 protein, which is more than twice as large as the short form, contains motifs that are characteristic of group D Sox proteins but absent from the short form. These motifs are highly conserved between L-SOX5 and SOX6, and between human and mouse L-Sox5. This intra-familial and inter-species conservation highlights the importance of these motifs in the function of group D Sox genes. In mouse, L-Sox5 and Sox6 proteins form homo-
Acknowledgements
We thank Drs. Masayoshi Namba and Hidetoshi Okabe for their help with the cell line culture studies, and Mss. Aya Narita and Tomoko Kusadokoro for excellent technical assistance.
References (27)
- et al. Genomic characterisation and fine mapping of the human SOX13 gene. Gene (2000)
- et al. Cloning, characterization and chromosome mapping of the human SOX6 gene. Gene (2001)
- et al. The mouse Sox5 gene encodes a protein containing the leucine zipper and the Q box. Biochim. Biophys. Acta (1998)
- et al. Cloning and characterization of mouse mSox13 cDNA. Gene (1998)
- et al. Sox genes find their feet. Curr. Opin. Genet. Dev. (1997)
- et al. The transcription factors L-Sox5 and Sox6 are essential for cartilage formation. Dev. Cell (2001)
- et al. Autosomal sex reversal and campomelic dysplasia are caused by mutations in and around the SRY-related gene SOX9. Cell (1994)
- et al. Architectural transcription factors: proteins that remodel DNA. Cell (1997)
- et al. Cloning and characterization of SOX5, a new member of the human SOX gene family. Genomics (1996)
- et al. SOX9 directly regulates the type-II collagen gene. Nat. Genet. (1997)