Ab initio modelling of the N-terminal domain of the secretin receptors

https://doi.org/10.1016/S1476-9271(03)00020-3

Abstract

G protein-coupled receptors of the secretin family are activated by peptide hormones of about 30 residues in length. There is considerable sequence homology within both the hormone and receptor families. In addition to the integral membrane domain, the receptors possess a characteristic extracellular domain of about 120 residues, with conserved cysteine residues that are involved in disulphide bridge formation and tryptophans that have been shown to be critical for hormone binding. This extracellular domain does not have detectable homology to any known protein fold. In order to propose a structure for this domain we have used ab initio prediction methods combined with constraints based on experimental results for the disulphide connectivity. The results of computational tools for predicting secondary structure and accessibility, together with ligand binding and mutational data and other structural considerations, were used in the ab initio protein folding programs dragon and gadget and also the simpler program ramble, which was able to explore different permutations of disulphide bond connectivity, tryptophan side chain orientation and chain topology. The methods generated a limited number of plausible models, but no single unique solution was found under the constraints. One of these models was refined into a full atomic model that contained a possible peptide binding site comprising the most conserved residues.

Introduction

The G protein-coupled receptors (GPCRs) are membrane proteins present in all higher animals where they perform vital signalling functions between the external environment (vision, olfaction) and the nervous system, and between cells in the body (neuromuscular, endocrinological and metabolic control and CNS functioning). Dysfunction of these receptors due to mutation or interference with the normal mechanism of agonist action will be harmful to the organism, and many diseases arise directly from these kinds of molecular lesion.

For this reason, these receptors have been studied continuously for many years at the physiological, pharmacological and biochemical levels. In particular, many of them have been cloned and sequenced. More sequences are known for GPCRs, over 2000 at the present time (Horn et al., 1998b), than for any other class of proteins.

The GPCRs form a superfamily in which members within each of the constituent families have considerable sequence similarity. Reliable alignments have been carried out by several authors (Oliveira et al., 1994, Taylor and Jones, 1995) and a continuously updated database of sequences and alignments is maintained at a website at the CMBI, Nijmegen, Netherlands (http://www.gpcr.org/7tm/).

There is much lower sequence similarity between the families. The small but mutually homologous secretin family comprises almost 200 of the sequences in the superfamily. There is no detectable homology between this family and the other GPCR families (Frimurer and Bywater, 1999). Despite that, an alignment of the transmembrane domain of the secretin family to the much larger opsin family has been accomplished using alignment techniques that do not depend on residue identity or “similarity” (Frimurer and Bywater, 1999). For the extracellular domains, especially those located at the N-terminus, there is no sequence similarity whatever and, in addition, not even any similarity in length. Within the opsin class, these N-terminal domains can vary in length from only a few residues to over 30, while the secretins typically possess a large N-terminal domain of about 120 residues. There is considerable homology between the N-terminal residues within this class but not to any proteins with known three-dimensional structure. It has not been possible to crystallise these proteins either as isolated domains or in the intact state, coupled to the transmembrane domain. The challenge, which is addressed in this paper, is to propose a structure for these extracellular domains that is useful for explaining the mechanism of action of the receptors in this class.

The family of the secretin receptors is characterised by having an extracellular domain of about 120 residues attached to the N-terminus of the integral membrane domain (7TM). This extracellular domain (Nter) has been shown to be critical for binding of the hormones that activate these receptors, and for overall function (Vilardaga et al., 1995, Wilmen et al., 1996, Graziano et al., 1996, Van Eyll et al., 1996, Vilardaga et al., 1997, Wilmen et al., 1997, DeAlmeida and Mayo, 1998). It is, therefore, just as important to obtain structural information for the N-terminal domain as it is for the 7TM domain. Although it has been possible to isolate the N-terminal domain from GLP1R and to show that interactions between the hormone and this domain are responsible for much of the total binding energy (Vilardaga et al., 1995, Wilmen et al., 1996, Graziano et al., 1996, Van Eyll et al., 1996), the N-terminal domain has so far eluded all attempts at crystallisation. Structural biology plays an important role in furthering the understanding of the function of biological molecules and in stimulating the design of new biochemical experiments. Until crystal structures become available, carefully constructed models can serve as a very useful substitute. Attempts to construct a model for the N-terminal domain have been hampered by the lack of homology to any known protein structure. We, therefore, resolved to use a set of ab initio protein model construction tools in order to propose a structure for the N-terminal domain.

In this work, we focus on the receptors for glucagon and GLP1 and their interactions with their respective hormones. Both are members of the secretin receptor family, and it has been shown in several members of the family that the N-terminal domain is critical for binding the hormone (Vilardaga et al., 1995, Graziano et al., 1996, Van Eyll et al., 1996), in conjunction with some of the extracellular loops connecting transmembrane helices (Buggy et al., 1995, Di Paolo et al., 1998).

The peptides of the secretin family are all hormones; typical members are glucagon and the related GLP1. Like insulin, glucagon regulates blood glucose, but it does so by a different mechanism. In the liver it activates its receptor, which in turn stimulates the gluconeogenesis cascade. In a non-diabetic individual, glucagon is secreted in response to low blood glucose and inhibited when blood sugar levels are high. GLP1 acts at the beta cells of the pancreas, where it stimulates insulin secretion.

Because these two hormone/receptor systems are so similar (48.3% sequence identity for the hormones, 47.8% for the receptors), we performed folding studies only on GLP1R. The aim of this project was to predict the structure of the N-terminal domain of this receptor and, by inference from homology, of the other receptors of the secretin family.
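As an illustration of how identity figures such as those quoted above can be obtained, the following minimal Python sketch computes percent identity over a pairwise alignment. The convention used here (identical positions divided by aligned, non-gap positions) is an assumption, since the text does not state which denominator was used, and the two sequences are made-up toy strings, not the hormone or receptor sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the non-gap columns of a pairwise alignment.

    Assumption: the denominator is the number of columns in which neither
    sequence has a gap. Other conventions (e.g. shorter sequence length)
    give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (not real hormone sequences).
toy_a = "ACDEFGHIK-"
toy_b = "ACDQFGHLKW"

# 7 of the 9 gap-free columns are identical.
print(f"{percent_identity(toy_a, toy_b):.1f}%")
```

Because the result depends on the alignment and on the chosen denominator, identity values are only comparable when both are reported.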

Methods and data

In this section, a distinction is drawn between constraints that come from general principles of protein structure and those that have come specifically from the current application to the N-terminal domain. The former are generally incorporated into a “method” while the latter are usually referred to as “data”. For example, we have a section on general disulphide-bond geometry and another on the specific disulphides found in the N-terminal domain.
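The distinction can be made concrete for disulphides. The set of possible pairings follows from a general principle (each half-cysteine bonds to exactly one partner), whereas experimentally determined bonds act as data that prune this set. The Python sketch below is only an illustration of that idea and is not the code of ramble or any other program named in this paper; the cysteine positions, the choice of six cysteines and the "known" bond are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

Bond = Tuple[int, int]
Pairing = Tuple[Bond, ...]


def disulphide_pairings(cysteines: List[int]) -> Iterator[Pairing]:
    """Yield every way of pairing all cysteines into disulphide bonds."""
    if len(cysteines) % 2:
        raise ValueError("an even number of half-cysteines is required")
    if not cysteines:
        yield ()
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each possible partner, then recurse.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in disulphide_pairings(remaining):
            yield ((first, partner),) + tail


def consistent_with(pairing: Pairing, required: Sequence[Bond]) -> bool:
    """True if the pairing contains every experimentally required bond."""
    bonds = {frozenset(bond) for bond in pairing}
    return all(frozenset(bond) in bonds for bond in required)


# Hypothetical residue numbers for six cysteines (illustration only).
toy_cysteines = [10, 20, 30, 40, 50, 60]
all_pairings = list(disulphide_pairings(toy_cysteines))

# Hypothetical "data": suppose one bond is known from experiment.
kept = [p for p in all_pairings if consistent_with(p, [(10, 30)])]

print(len(all_pairings), len(kept))
```

For 2k half-cysteines there are (2k-1)!! complete pairings, so six cysteines give 15 candidates, and fixing one bond leaves 3. This small search space is what makes exhaustive exploration of connectivity feasible.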

Sequence searching and alignment

Sequences were gathered from the non-redundant sequence databank (NCBI) using the iterated search programs psiblast (Altschul et al., 1997) and quest (Taylor, 1998, Taylor and Brown, 2000). The latter program has the advantage that it automatically reduces the number of close homologues retained from the search by a series of heuristics. The resulting selection (aligned by the program multal, Taylor, 1988) is shown in Fig. 2, coloured to emphasise both the quality and quantity of amino acid

Conclusions

In this paper, we have described modelling work on GLP1R as a representative of the secretin family. There are in fact several subfamilies of the secretins, all of which have similar 7TM domains but differ at the level of both hormone and peptide ligand. We have confined our attention to the subfamily comprising GLP1R, glucagon, parathyroid hormone, vasoactive intestinal peptide, pituitary adenyl cyclase activating peptide and secretin itself. The corticotrophin releasing factor, together with

Abbreviations

For amino-acid residue names the standard single- and three-letter codes have been used throughout. A cysteine residue known to participate in a disulphide bond is referred to as a half-cysteine. GLP1 stands for the hormone glucagon-like peptide-1 (7:36) and GLP1R for its receptor. 7TM stands for the 7-helix transmembrane receptor domain.

References (48)

  • B Rost et al. Prediction of protein secondary structure at better than 70-percent accuracy. J. Mol. Biol. (1993)
  • A.A Salamov et al. Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J. Mol. Biol. (1995)
  • W.R Taylor. Dynamic databank searching with templates and multiple alignment. J. Mol. Biol. (1998)
  • B Van Eyll et al. Exchange of W39 by A within the N-terminal extracellular domain of the GLP-1 receptor results in a loss of receptor function. Peptides (1996)
  • J.P Vilardaga et al. Properties of chimeric secretin and VIP receptor proteins indicate the importance of the N-terminal domain for ligand discrimination. Biochem. Biophys. Res. Commun. (1995)
  • G Vriend. WHAT IF: a molecular modeling and drug design program. J. Mol. Graph. (1990)
  • A Wilmen et al. The isolated N-terminal extracellular domain of the glucagon-like peptide-1. FEBS Lett. (1996)
  • A Wilmen et al. Five out of six tryptophan residues in the N-terminal extracellular domain of the rat GLP-1 receptor are essential for its ability to bind GLP-1. Peptides (1997)
  • S.F Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. (1997)
  • A Aszódi et al. Folding polypeptide α-carbon backbones by distance geometry methods. Biopolymers (1994)
  • A Aszódi et al. Secondary structure formation in model polypeptide chains. Protein Eng. (1994)
  • A Aszódi et al. Protein fold determination using a small number of distance restraints
  • F.R Chalaoux et al. Molecular dynamics and accuracy of NMR structures: effects of error bounds and data removal. Proteins: Struct. Funct. Genet. (1999)
  • B.L De Groot et al. Prediction of protein conformational freedom from distance constraints. Proteins: Struct. Funct. Genet. (1997)

1 Present address: LION bioscience AG, Waldhofer Strasse 98, 69123 Heidelberg, Germany.