TGF-β family signaling
Cell signaling proteins of the TGF-β family have integral roles in nearly all aspects of metazoan biology, from regulating early embryonic patterning
1,2 and organizing the development of tissues and organs,
3 to maintaining tissues
4–6 and regulating cellular growth
7 and metabolism
8 in adults. Compared with proteins of the Wnt and Notch signaling families, which also have essential roles in development and maintaining homeostasis, the TGF-β family has been even more fruitful in its evolution, with five family members in
Caenorhabditis elegans, seven in
Drosophila melanogaster, and 33 in humans.
9 Signaling proteins of the family, commonly referred to as ligands, can be broken down into three subfamilies based on their phylogeny,
9 the bone morphogenetic protein (BMP) and growth and differentiation factor (GDF) subfamily, the activin (Act) subfamily, and the eponymous TGF-β subfamily.
Compared to most other signaling families, the TGF-β family has a relatively simple signaling pathway in which a dimeric signaling ligand composed of two cystine-knotted growth factor monomers tethered together in most, but not all, cases by a single inter-chain disulfide bond (
Figure 1(a)) assembles structurally similar, functionally distinct single pass receptor kinases, known as type I and type II receptors, into a hetero-tetrameric receptor complex (
Figure 1(b)). Close spatial proximity of the two receptor types in the context of the receptor heterotetramer leads to a transphosphorylation cascade in which the constitutively active type II receptor kinase phosphorylates and activates the autoinhibited type I receptor kinase;
10 the active type I receptor binds and phosphorylates the receptor regulated Smads (R-Smads), which in turn form a heterotrimeric complex with the co-mediator Smad, Smad4, that translocates to the nucleus to effect transcription of target genes with Smad binding elements.
11,12 Smads, which bind DNA directly through their N-terminal MH1 domain, do so weakly and are thus dependent on other co-activators and co-repressors to effect transcriptional responses.
13 Such a dependence on co-activators and co-repressors, coupled with their variation from cell to cell, underlies the strong cell- and context-dependent activities characteristic of signaling ligands of the family.
14
Mechanisms for diversification of TGF-β family signaling
Compared to the 33 signaling ligands in humans, there are far fewer receptors, with just seven type I and five type II receptors.
9 In addition, the type I receptors of the family couple to and activate only two classes of R-Smads, R-Smads 1, 5, 8 and R-Smads 2,3, that target distinct promoter elements. Type I receptors known as activin-like kinases 1, 2, 3, and 6 (Alk1, Alk2, Alk3, and Alk6) couple to activate R-Smads 1, 5, and 8, and the type I receptors Alk4, Alk5, and Alk7 couple to activate R-Smads 2 and 3.
15 Interestingly, most ligands of the BMP/GDF sub-class bind type I receptors that activate Smads 1, 5, and 8, while ligands of the activin and TGF-β sub-classes bind type I receptors that activate Smads 2 and 3,
16 and thus the ligands and receptors appear to have co-evolved to generate two functionally distinct classes of signaling. While this is an important mechanism for increasing functional diversity, it alone is insufficient to explain the diverse functions of the 33 TGF-β family proteins in humans.
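The receptor-to-Smad coupling described above can be written down as a small lookup table. The following Python sketch only restates the pairings given in the text; the names `TYPE_I_TO_RSMADS`, `rsmads_for`, and `same_rsmad_class` are hypothetical helpers introduced here for illustration and are not taken from any published tool.

```python
# Coupling of type I receptors to R-Smad classes, as stated in the text.
# All identifiers below are illustrative names, not an established API.
BMP_CLASS = ("Smad1", "Smad5", "Smad8")
TGFB_ACTIVIN_CLASS = ("Smad2", "Smad3")

TYPE_I_TO_RSMADS = {
    "Alk1": BMP_CLASS,
    "Alk2": BMP_CLASS,
    "Alk3": BMP_CLASS,
    "Alk6": BMP_CLASS,
    "Alk4": TGFB_ACTIVIN_CLASS,
    "Alk5": TGFB_ACTIVIN_CLASS,
    "Alk7": TGFB_ACTIVIN_CLASS,
}


def rsmads_for(receptor):
    """Return the R-Smads activated by a given type I receptor."""
    try:
        return TYPE_I_TO_RSMADS[receptor]
    except KeyError:
        raise ValueError("unknown type I receptor: %r" % receptor) from None


def same_rsmad_class(receptor_a, receptor_b):
    """True if two type I receptors couple to the same class of R-Smads."""
    return rsmads_for(receptor_a) == rsmads_for(receptor_b)


if __name__ == "__main__":
    for name in sorted(TYPE_I_TO_RSMADS):
        print(name, "->", ", ".join(rsmads_for(name)))
```

A table of this kind makes the limitation noted in the text explicit: seven type I receptors collapse onto only two R-Smad classes, so the coupling alone cannot distinguish the 33 ligands.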
Considerable efforts have been made over the past 10 to 15 years to better understand the mechanisms that endow the proteins of the TGF-β family with their distinctive activities
in vivo – these can be broadly characterized as follows: (a) regulatory pro-domains that block receptor binding sites, but that bind the ligands with affinities ranging from low to high, thus conferring varying degrees of latency (
Figure 1(c) and (d)),
17–20 (b) soluble or membrane-bound ligand-specific binding proteins that can either antagonize or potentiate receptor complex assembly and signaling,
9,21 (c) non-signaling complexes, such as between activin A, ActRII/ActRIIB, and Alk2, that antagonize signaling of other ligands, such as BMP-4, -7, and -9, that bind and signal through a common receptor,
22 (d) assembly of mixed receptor complexes with ligand homodimers, such as between TGF-β1, TβRII, and the type I receptors Alk1 and Alk5
23,24 or Alk2 and Alk5,
25 or (e) alteration of how a particular cell or group of cells interprets a given ligand, even if the ligand activates the same sub-class of R-Smads. Several mechanisms have been shown to be responsible for the latter, including (a) distinct affinities, and thus distinct ligand–receptor dynamics, for different ligand–receptor pairs that lead to distinct cellular responses
26 and (b) ligand heterodimers capable of binding two different type I receptors that activate the same class of R-Smads but, for reasons similar to those above, elicit distinct cellular responses.
27 Owing to space constraints, it is impossible in this mini-review to describe each of the mechanisms enumerated above; the authors instead refer readers to several other comprehensive reviews that cover these topics,
9,21,28 including the full-length review included in this volume by Goebel
et al. In this mini-review, the focus is on recent advances in the structural biology of two structurally homologous co-receptors of the TGF-β family, betaglycan and endoglin, and on how subtle but critical differences in their structures endow them with the ability to recognize and discriminate between distinct subsets of TGF-β family ligands. In the course of this review, we discuss probable mechanisms by which this imparts distinct functions to the ligands they bind
in vivo.
Betaglycan and endoglin and their cognate ligands
Betaglycan and endoglin were both initially discovered by affinity labeling experiments in which cells were exposed to radiolabeled TGF-β1 and a chemical crosslinking reagent.
29–31 Both were found to be glycoproteins of relatively high molecular weight (ca. 80–90 kDa for the core protein), but in contrast to betaglycan, which is found as a monomer and expressed in a variety of cell types, endoglin is a disulfide-linked dimer and is expressed almost exclusively on vascular endothelial cells, which express little to no betaglycan. Cloning of the genes encoding betaglycan
32,33 and endoglin
34 showed that they had the same overall domain structure, consisting of an N-terminal signal peptide, a large ectodomain, a single-spanning transmembrane helix, and a short (ca. 40 amino acid) cytoplasmic tail (
Figure 2(a) and (b)). Sequence comparisons show they are homologous over their entire length, yet there is significant variation, with the highest identity in their transmembrane (73%) and cytoplasmic (61%) domains, and the lowest in the N- and C-terminal portions of the ectodomain (20–21%). Sequence analysis showed that the ectodomain can be roughly divided into two halves, with the membrane-distal N-terminal half having no identifiable homology to other proteins, and the membrane-proximal C-terminal half having identifiable homology to the zona pellucida (ZP) family of proteins.
35 Betaglycan and endoglin both undergo proteolytic shedding, and thus can be found either membrane-bound or soluble.
32,36 In this mini-review, the focus is the membrane-bound forms as these are the ones that most directly affect receptor complex assembly and signaling; readers are referred to other reviews for a discussion of the soluble forms.
28,37
Endoglin, although initially identified as a receptor for TGF-β1 based on affinity labeling studies, was subsequently shown to bind the TGF-β family ligands BMP-9 and BMP-10 with high affinity.
38,39 In addition to binding endoglin, BMP-9 and -10 were also shown to bind the TGF-β family type I receptor Alk1 with high affinity, but poorly to other type I receptors, such as Alk3 and Alk6.
39–41 This was an important discovery as this pattern of binding was the opposite of that found for most BMPs, which bind type I receptors such as Alk3 and Alk6 with high affinity, but bind poorly to Alk1.
40 Together with the finding that most patients suffering from the vascular disorder hereditary hemorrhagic telangiectasia have mutations in either endoglin or Alk1, this led to the realization that endoglin and Alk1 were receptors for BMP-9 and BMP-10 in the vascular endothelium and that this induced signaling required for normal development and maintenance of the vasculature.
5
Betaglycan was shown to bind all three TGF-β isoforms with near nanomolar affinity, but with a slight preference for TGF-β2,
42 the TGF-β isoform which binds TβRII with an affinity 200–500 fold weaker than that of TGF-β1 and -β3 due to substitution of arginine residues with lysine in the loops connecting fingers 1–2 and 3–4 that engage acidic (Asp, Glu) residues on TβRII.
43–45 Because of betaglycan’s high affinity for the TGF-βs, and because it was shown to potentiate cellular responsiveness to TGF-β2 by 100- to 500-fold compared to cells that lack betaglycan, it was designated as the TGF-β type III receptor, TβRIII, and ascribed the role of co-receptor,
32,46 that is, a cell-surface receptor that potentiates the assembly of the signaling complex but does not itself directly participate in signal transduction. Genetic studies in mice provided additional direct support for betaglycan’s co-receptor function, as both the TGF-β2 and betaglycan null mice were inviable and shared significant phenotypic similarities, including severely impaired heart and liver development.
47–49
Betaglycan has also been shown to bind to the α-subunit of the TGF-β family heterodimer inhibin A (InhA) and to potentiate its ability to antagonize signaling of activin A (ActA), and thus production of follicle stimulating hormone β (FSH-β) in the anterior pituitary.
50 Betaglycan’s antagonism has been proposed to derive from its ability to potentiate binding of the activin type II receptor, ActRII, or the closely related activin type IIB receptor, ActRIIB, to the InhA β-subunit, and thus sequester ActA’s type II receptors, ActRII and ActRIIB, in a ternary complex incapable of recruiting a type I receptor
50–52 (
Figure 3(a)). Mice in which betaglycan was ablated in the pituitary, generated in recent genetic studies, had phenotypic characteristics largely consistent with this proposed mechanism of antagonism, yet only InhA-mediated, and not inhibin B (InhB)-mediated, suppression of FSH-β was impaired in cultured pituitaries from the betaglycan knockout mice.
53 Current genetic data therefore support a role for betaglycan in InhA-mediated antagonism of activin in the anterior pituitary, but another mechanism might be responsible for InhB antagonism. Betaglycan has also been shown to bind several BMPs and GDFs, including BMP-2, BMP-4, and GDF-5, and to influence their signaling,
54 although unlike TGF-βs and inhibins, this has been investigated using only cell-based methods and hence is not as deeply understood.
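To put the 200–500-fold weaker affinity of TGF-β2 for TβRII noted above on an energetic scale, a fold difference in dissociation constant can be converted to a free energy difference using the standard relation ΔΔG = RT ln(fold). The sketch below is a generic thermodynamic conversion, not a calculation taken from the cited studies; the function name `ddg_from_fold` and the default temperature of 298.15 K are assumptions made for illustration.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_fold(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) for a fold change in Kd.

    Uses ddG = R * T * ln(fold). The default temperature is an assumed
    value, not one reported in the text.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(fold)


if __name__ == "__main__":
    # Fold differences quoted in the text for TGF-beta2 versus
    # TGF-beta1/-beta3 binding to TbetaRII.
    for fold in (200, 500):
        print("%d-fold weaker: %.2f kcal/mol" % (fold, ddg_from_fold(fold)))
```

Under these assumptions, the quoted range corresponds to a difference of a little over 3 kcal/mol, which is the gap a co-receptor such as betaglycan would need to help offset.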
Biochemical insights into betaglycan-mediated potentiation of type II receptor binding
Because of betaglycan’s importance in potentiating assembly of the signaling complex, and because of the additional finding that β-arrestin associates with betaglycan’s C-terminal cytoplasmic tail and can lead to internalization of TGF-βs and their receptors,
59,60 considerable attention has been directed toward understanding how betaglycan engages its ligands. Based on initial affinity labeling experiments in which a ternary complex could be detected between TGF-β1, TβRII, and betaglycan, but not a quaternary complex that additionally included TβRI, it was proposed that betaglycan potentiates TGF-β signaling by a handoff mechanism, in which it first binds the ligand with high affinity, and once bound, potentiates TβRII binding.
46 Binding of TβRII is proposed to promote the binding and recruitment of TβRI, which in turn leads to the displacement of betaglycan, and assembly of the full type I-type II receptor signaling heterotetramer (
Figure 3(b)).
Though the structure of betaglycan bound to TGF-β has not been determined, considerable other biochemical and structural data have accrued that lend support to this model. Through studies of domain deletions, it has been shown that the N-terminal domain, which initially was named the endoglin-like domain but was later referred to as the orphan domain, and the membrane-proximal domain, which was initially named the uromodulin-like domain but was later referred to as the ZP domain, both directly bind the ligand but do not compete with one another, and thus occupy distinct sites on the ligand
42,61 (
Figure 2(a)). Based on additional deletion constructs, it was shown that only the C-terminal half of the ZP domain was required for ligand binding.
52 Binding studies with the purified full-length betaglycan extracellular domain, as well as the purified orphan domain and the C-terminal portion of the ZP domain (ZP-C), provided further detail by showing that betaglycan engages TGF-β homodimers asymmetrically with an overall 1:1 stoichiometry and fully blocks one of the TβRII binding sites via its ZP-C domain, but leaves the other site unoccupied.
62 Based on additional measurements by the same group, (a) betaglycan-bound TGF-β was shown to bind one molecule of TβRII and to do so with increased affinity relative to TGF-β alone, and (b) the orphan domain had to be displaced to allow TβRI to bind. Based on these findings, a more detailed hand-off mechanism was proposed in which betaglycan functions to bind and concentrate TGF-β2 on the cell surface, and thus promote the binding of TβRII by membrane-localization effects and allostery
62 (
Figure 3(c)). The proposed mechanism further suggests that the transition to the signaling complex is mediated by the recruitment of TβRI, which displaces the betaglycan orphan domain. The binding and recruitment of TβRI, together with the simultaneous displacement of the orphan domain, is likely driven by direct contact between TβRI:TβRII, as this was demonstrated previously in the structures of the TGF-β3:TβRII:TβRI and TGF-β1:TβRII:TβRI ternary complexes (
Figure 1(b), red dashed outline) and was shown to provide more than half of the total binding energy for binding and recruitment of TβRI.
63,64 Though the structure of the TGF-β:betaglycan complex has not yet been reported, recent structural data described below, including NMR-based identification of the betaglycan ZP-C binding site on TGF-β2,
65 as well as the X-ray structures of the betaglycan orphan domain and betaglycan ZP-C domain,
66–68 provide further detail that supports the model shown in
Figure 3(c).
Endoglin structure and proposed mechanism
Structures of the human endoglin orphan and ZP-C domains, as well as a 2:1 complex between the endoglin orphan domain and BMP-9, were recently determined using X-ray crystallography.
58 Endoglin’s orphan domain is composed of two tandem β-sandwich domains connected by two antiparallel β-strands (
Figure 5(a)); this structure was not represented by any structures previously deposited in the Protein Data Bank and was somewhat unusual in that its overall architecture consisted of a strand-bend-strand that began in domain 1 (O-D1), and then extended into domain 2 to complete the domain 2 β-sandwich; upon exiting the last β-strand of domain 2, the pattern repeated, thereby generating the full two-domain structure connected by two antiparallel β-strands (
Figure 5(a) and (b)). Sequence analysis shows that the two strand-bend-strand-β-sandwich motifs share 18% sequence identity, suggesting this arose as a result of an in-frame gene duplication.
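Percent identity figures such as the 18% quoted above are computed from a pairwise alignment of the two sequences. The sketch below shows one common convention, counting identical residues over aligned, non-gap columns; the function name `percent_identity` and the short example strings are hypothetical and are not the endoglin sequences, and other conventions (for example, dividing by the full alignment length) give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. This is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, made up purely for illustration.
    print(percent_identity("ACDE-F", "ACQEGF"))
```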
The structure of the 2:1 orphan domain:BMP-9 complex shows that the orphan domain engages the ligand symmetrically by forming a super β-sheet with finger 4 of the ligand through an exposed β-strand, β6, in O-D1 (
Figure 5(c)). Stabilization of this mode of binding is provided in two ways – first by residues extending from O-D1 β-strand 7 to form additional contacts with the backside of finger 4 of the growth factor, and second by avidity, since both full-length endoglin and BMP-9 are covalent dimers and bind in a bivalent manner. Such a manner of binding is consistent with that expected from the previous binding studies,
55,56 namely that binding of the type II receptors ActRII, ActRIIB, and likely BMPRII as well, will be blocked by binding of endoglin, while binding of the type I receptor Alk1, will not be blocked (as inferred by comparing orphan domain contact residues with BMP-9 in this structure, vs. ActRIIB and Alk1 contact residues with BMP-9 in the previously determined structure of BMP-9 bound to ActRIIB and Alk1
57).
Endoglin’s ZP domain, which harbors the cysteine residues responsible for covalent dimerization, was produced by recombinant overexpression in mammalian cells and was secreted as a mixture of monomer and covalent dimer. In the course of crystallization, the monomeric form selectively crystallized, and thus it was necessary to infer the sites of covalent dimerization based on surface exposed cysteines in the structure, together with dimerization assays with cysteine mutants. Interestingly, this identified two cysteine residues responsible for covalent dimerization of the ZP domain, Cys516, located in an exposed loop close to the C-terminal portion of the ZP-C domain, as well as Cys582, located in the structurally disordered juxtamembrane region. Informed by the identification of these two cysteines, Saito
et al. constructed a dimeric model for the ZP domain in which the two cysteines were within the proper distance for disulfide bond formation;
58 this yielded the V-shaped structure, reproduced in
Figure 5(d), in which each of the slightly curved arms of the V is formed by the two tightly connected immunoglobulin-like domains that comprise the ZP domain.
Success in determining both the orphan domain structure, and the ZP domain structure, allowed construction of a model for the full endoglin:BMP-9 complex (
Figure 5(e)). Positioning of the orphan domain:BMP-9 complex relative to the ZP dimer in this model was guided by knowledge that the ZP domain does not directly contact the ligand, and while this eliminates structures where the ligand would come in direct contact with the ZP domain, it nonetheless leaves significant uncertainty. In the model presented by Saito
et al., the ligand was positioned deeply into the V between the two ZP domains; this is plausible not only because the ligand can be readily accommodated without contacting the ZP domain, but also because it positions the C-terminus of the orphan domain, which emerges from O-D1, at a distance from the N-terminus of the ZP domain that can be readily spanned by the number of structurally disordered residues present between these two domains.
Overall, the endoglin structures that have been reported are not only consistent with the biochemical data and models for binding that have been previously reported, but they also provide important new details on how endoglin engages BMP-9 in an antibody-like manner, with the ZP domain being analogous to the Fc domain and the orphan domain being analogous to the Fab domain – this manner of binding allows endoglin to extend significantly outward and capture BMP-9 and BMP-10 with high affinity due to avidity. Another advantage of bivalent binding is that it also provides a mechanism for subsequent release, which is essential as endoglin must be fully displaced so that the ligand can bind two molecules of Alk1 and two molecules of type II receptor to assemble the full heterotetrameric signaling complex.
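The contribution of avidity to apparent affinity can be illustrated with a deliberately simplified two-step model: the first site binds with an intrinsic dissociation constant Kd, and the second, now tethered, site binds intramolecularly at an effective local concentration C_eff. Ignoring statistical factors, this gives Kd_app = Kd / (1 + C_eff / Kd). The sketch below implements only this textbook approximation; the function name `apparent_kd` and the numerical values are hypothetical and are not measured parameters for endoglin and BMP-9.

```python
def apparent_kd(intrinsic_kd, effective_conc):
    """Apparent Kd (M) for bivalent binding in a simplified avidity model.

    intrinsic_kd   -- Kd (M) of a single site, assumed identical for both
    effective_conc -- effective local concentration (M) of the second site
                      once the first is bound

    Statistical factors and linker strain are ignored, so this is an
    order-of-magnitude illustration only.
    """
    if intrinsic_kd <= 0:
        raise ValueError("intrinsic_kd must be positive")
    if effective_conc < 0:
        raise ValueError("effective_conc cannot be negative")
    return intrinsic_kd / (1.0 + effective_conc / intrinsic_kd)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    kd_single = 1e-6   # 1 uM monovalent interaction
    c_eff = 1e-3       # 1 mM effective concentration of the tethered site
    print("monovalent Kd:       %.2e M" % kd_single)
    print("bivalent apparent Kd: %.2e M" % apparent_kd(kd_single, c_eff))
```

In this model a modest monovalent interaction becomes much tighter when both sites are engaged, while each individual contact remains weak enough to dissociate, which is in keeping with the capture-and-release behavior described above.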
Betaglycan structure and proposed mechanism
Structures of the full-length betaglycan extracellular domain bound to TGF-β, or the ZP-C domain bound to InhA, are still lacking, yet there has been progress over the past five years that provides additional detail as to the precise structure of the complexes. One such effort employed NMR to identify the precise binding site on TGF-β2 for the betaglycan ZP-C domain.
65 To simplify the spectra and to obtain resolved signals in the context of the high molecular weight TGF-β2:ZP-C complex, these studies were performed by preparing deuterated Ile, Leu, Val ¹³C-methyl-labeled TGF-β2 and in turn titrating this with unlabeled ZP-C. In these studies, the signals that were most strongly perturbed all mapped to the underside of fingers 2, 3, and 4, and extended from the base of the heel helix to the tip of fingers 1–2 and 3–4, where TβRII binds (
Figure 6(a)). In order to validate this putative binding site, residues were substituted both within the binding site and on the outer surface of the knuckles of fingers 3 and 4, and the results obtained were fully consistent with the binding site identified by NMR, with the largest effects upon substitution of residues, such as Ile33, Ile92, and Glu99, on the underside of the fingers, but little to no effects upon substitution of any of the knuckle residues (
Figure 6(b)).
Included within this binding site were three residues, Ile92, Lys97, and Glu99, that are present in TGF-β1, -β2, and -β3 and inhibin α, but are of completely different character, that is acidic versus basic, hydrophobic versus charged, in almost all other TGF-β family members (
Figure 6(d)). Importantly, although this binding site differs from the one identified in the InhA α subunit based on analysis of a large collection of InhA α subunit variants and functional assays,
69 it nonetheless includes several of the same residues and is in the same region of the fingers, but is located on the inner surface of the fingers, rather than on the outer (knuckle) surface. It is therefore likely that the binding site for ZP-C in TGF-β2 and the inhibin A α subunit are the same and are located on the inner surface of the fingers as shown by the NMR data, though this still needs to be validated for InhA. It should also be noted that the identification of this new binding site provides a potential explanation for the observation that betaglycan mediates InhA, but not InhB, antagonism of activin in the pituitary,
53 since the NMR data showed that while most of ZP-C’s contact was with one monomer, there was nonetheless limited contact with the heel helix of the adjoining monomer. It is therefore possible that the betaglycan ZP-C domain is sensing amino acid differences in the heel helix of InhA versus InhB and, as a consequence, functioning to endow InhA, but not InhB, with activin-antagonist activity. In summary, the binding site for the ZP-C domain that has been identified is consistent with the previous observation that the ZP-C domain competes with TβRII, and thus is responsible for blocking one of the TβRII sites in the context of the full betaglycan:TGF-β2 complex, but it also provides insights as to how betaglycan selectively recognizes the ligands it targets, possibly including InhA versus InhB.
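The comparison of residue character at the three ZP-C contact positions discussed above (Ile92, Lys97, and Glu99) can be sketched as a simple classification. The coarse grouping of amino acids used below is a common simplification, and the second set of residues is entirely hypothetical, chosen only to illustrate a change of character; it does not represent any particular family member. The names `residue_class` and `character_changes` are likewise illustrative.

```python
# Coarse physicochemical grouping (a simplification; histidine is treated
# as basic here, and all ungrouped residues fall into "other").
ACIDIC = set("DE")
BASIC = set("KRH")
HYDROPHOBIC = set("AVILMFW")


def residue_class(one_letter_code):
    """Return a coarse class for a one-letter amino acid code."""
    aa = one_letter_code.upper()
    if aa in ACIDIC:
        return "acidic"
    if aa in BASIC:
        return "basic"
    if aa in HYDROPHOBIC:
        return "hydrophobic"
    return "other"


def character_changes(reference, other):
    """Positions (sorted) at which the coarse residue class differs."""
    changed = []
    for position in sorted(reference):
        if position not in other:
            continue
        if residue_class(reference[position]) != residue_class(other[position]):
            changed.append(position)
    return changed


# Residues named in the text at the ZP-C binding site.
CONSERVED_SITE = {92: "I", 97: "K", 99: "E"}

# Hypothetical residues at the same positions, made up for illustration.
HYPOTHETICAL_OTHER_LIGAND = {92: "D", 97: "E", 99: "K"}

if __name__ == "__main__":
    print(character_changes(CONSERVED_SITE, HYPOTHETICAL_OTHER_LIGAND))
```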
In the proposed mechanism for inhibin-mediated antagonism of activin, the betaglycan ZP-C domain binds to the α-subunit, thereby tethering inhibin to the membrane and in turn promoting the binding of ActRII or ActRIIB to the inhibin β subunit. Inhibins are known to be unable to bind and recruit type I receptors, such as ActRIB (Alk4), to initiate Smad2,3 signaling, but the mechanism for the inhibition of type I receptor binding is not fully understood. In 2012, Zhu
et al. showed that the extended N-terminus of the inhibin α-subunit is essential for inhibiting ActRIB binding, though whether this does so by blocking ActRIB binding at the type I site that lies at both α/β subunit interfaces, or only one of the α/β interfaces, was unclear. Importantly, the identification of the ZP-C binding site on the underside of the fingers described above
65 provides an explanation for blocking of type I receptor binding at least at one of the α/β interfaces, as this binding site overlaps extensively with the type I receptor site, which, for all known type I receptors of the family, includes residues both from the underside of the fingers and from the heel helix of the adjoining monomer. It is therefore conceivable that binding of type I receptors to inhibin is blocked at one of the α/β interfaces by the extended N-terminus of the α-subunit and at the other by the betaglycan ZP-C domain. In this way, the betaglycan ZP-C domain has two roles, one to capture inhibin on the membrane and promote type II receptor binding to the β-subunit, and another to prevent binding and recruitment of the type I receptor ActRIB. Structures of the ZP-C domains of rat and mouse betaglycan have been determined, and as shown, these are highly similar to one another, as well as to other ZP domain structures that have been determined, including endoglin ZP-C
67,68 (
Figure 6(c)). Several different regions have been identified in the ZP-C domain that might provide the binding site for TGF-β/Inhibin α, though these have not been reconciled.
67,68 One of these lies in the A-B loop, while the other lies in the F-G loop, both of which are highlighted in the models shown in
Figure 6(c). One possible strategy to reconcile these alternative binding sites, and also to provide detailed structural information for building a model of the TGF-β2:ZP-C complex, is to employ the same type of NMR approach that was used to identify the ZP-C binding site on TGF-β2. Another strategy would be to co-crystallize TGF-β2 with ZP-C or with the full betaglycan extracellular domain, though these efforts have been hampered either by limited solubility or by the high degree of flexibility due to the disordered linker that connects the orphan and ZP domains (A Hinck, unpublished observation), the latter being similar to the challenge encountered with full-length endoglin.
58
The structure of the betaglycan orphan domain was recently reported, although rather than crystallizing human or rat betaglycan, which have been extensively characterized, the authors crystallized the zebrafish betaglycan orphan domain, owing to its significantly improved crystallization propensity.
66 Overall, as one would expect based on the roughly 20% sequence identity between betaglycan and endoglin, the structure of the betaglycan orphan domain is similar to that of endoglin, albeit with a somewhat different orientation of the two β-sandwich domains relative to one another (
Figure 7(a)). One structurally minor but potentially significant difference between the two orphan domains is the insertion of a short α-helix-β-strand motif following β-strand 7 in O-D1 (
Figure 7(b)). Owing to the pairing of the newly inserted β-strand with the exposed β-strand 6, it would be expected to prevent a similar manner of binding as the endoglin orphan domain, due to steric clashes with the ligand (
Figure 7(c)). In order to investigate the possibility of an alternative manner of binding, Kim
et al. carried out binding studies with domain-deleted constructs of both TGF-β and the betaglycan orphan domain, and showed that the betaglycan orphan domain indeed binds in a different manner: it recognizes more than just the finger region of the ligand, and it requires both β-sandwich domains, not just O-D1, to engage TGF-β2 dimers with high affinity.
70 On the basis of these observations, together with small-angle X-ray scattering data, which provide low-resolution structural information, models were constructed of the TGF-β:TβRII:orphan domain complex. One of the models that best fit these constraints is shown in
Figure 7(d), and as shown, the orphan domain is nestled around the dimer, but in a manner that does not interfere with binding of either molecule of TβRII. Overall, the structure of the betaglycan orphan domain has provided new structural details that likely account for the alternative manner by which it engages its ligand compared to endoglin, something previously hinted at based on its altered stoichiometry of binding, but not directly demonstrated through structural studies.