Pathway: Collagen biosynthesis and modifying enzymes


Collagen biosynthesis and modifying enzymes

The biosynthesis of collagen is a multistep process. Collagen propeptides are cotranslationally translocated into the ER lumen. Propeptides undergo a number of post-translational modifications. Proline and lysine residues may be hydroxylated by prolyl 3-, prolyl 4- and lysyl hydroxylases. 4-hydroxyproline is essential for intramolecular hydrogen bonding and stability of the triple helical collagenous domain. In fibril forming collagens approximately 50% of prolines are 4-hydroxylated; the extent of this and of 3-hydroxyproline and lysine hydroxylation varies between tissues and collagen types (Kivirikko et al. 1972, 1992). Hydroxylysine residues can form cross-links between collagen molecules in fibrils, and are sites for glucosylation and galactosylation. Collagen peptides all have non-collagenous domains; collagens within the subclasses have common chain structures. These non-collagenous domains have regulatory functions; some are biologically active when cleaved from the main peptide chain. Fibrillar collagens all have a large triple helical domain (COL1) bordered by N- and C-terminal extensions, called the N- and C-propeptides, which are cleaved prior to formation of the collagen fibril. The C-propeptide, also called the NC1 domain, is highly conserved. It directs chain association during intracellular assembly of the procollagen molecule from three collagen propeptide alpha chains (Hulmes 2002). The N-propeptide has a short linker (NC2) connecting the main triple helix to a short minor one (COL2) and a globular N-terminal region NC3. NC3 domains are variable both in size and the domains they contain.

Collagen propeptides typically undergo a number of post-translational modifications. Proline and lysine residues are hydroxylated by prolyl 3-, prolyl 4- and lysyl hydroxylases. 4-hydroxyproline is essential for intramolecular hydrogen bonding and stability of the triple helical collagenous domain. Prolyl 4-hydroxylase may also have a role in alpha chain association, as no association of the C-propeptides of type XII collagen was seen in the presence of prolyl 4-hydroxylase inhibitors (Mazzorana et al. 1993, 1996). In fibril forming collagens approximately 50% of prolines are 4-hydroxylated; the extent of this is species dependent, with lower hydroxylation correlating with lower ambient temperature and thermal stability (Cohen-Solal et al. 1986, Notbohm et al. 1992). Similarly the extent of 3-hydroxyproline and lysine hydroxylation varies between tissues and collagen types (Kivirikko et al. 1992). Hydroxylysine residues can form cross-links between collagen molecules in fibrils, and are sites for glucosylation and galactosylation.

Collagen molecules fold and assemble through a series of distinct intermediates (Bulleid 1996). Individual collagen polypeptide chains are translocated co-translationally across the membrane of the endoplasmic reticulum (ER). Intra-chain disulfide bonds are formed within the N-propeptide, and hydroxylation of proline and lysine residues occurs within the triple helical domain (Kivirikko et al. 1992). When the peptide chain is fully translocated into the ER lumen the C-propeptide folds, the conformation being stabilized by intra-chain disulfide bonds (Doege and Fessler 1986). Pro alpha-chains associate via the C-propeptides (Byers et al. 1975, Bachinger et al. 1978), or NC2 domains for FACIT family collagens (Boudko et al. 2008), to form an initial trimer which can be stabilized by the formation of inter-chain disulfide bonds (Schofield et al. 1974, Olsen et al. 1976), though these are not a prerequisite for further folding (Bulleid et al. 1996). The triple helix then nucleates and folds in a C- to N- direction. The association of the individual chains and subsequent triple helix formation are distinct steps (Bachinger et al. 1980). The N-propeptides associate and in some cases form inter-chain disulfide bonds (Bruckner et al. 1978). Procollagen is released via carriers into the extracellular space (Canty & Kadler 2005). Fibrillar procollagens undergo removal of the C- and N-propeptides by procollagen C and N proteinases respectively, both Zn2+ dependent metalloproteinases. Propeptide processing is a required step for normal collagen I and III fibril formation, but collagens can retain some or all of their non-collagenous propeptides. Retained collagen type V and XI N-propeptides contribute to the control of fibril growth by sterically limiting lateral molecule addition (Fichard et al. 1995). Processed fibrillar procollagen is termed tropocollagen, which is considered to be the unit of higher order fibrils and fibres.
Tropocollagens of the fibril forming collagens I, II, III, V and XI spontaneously aggregate in vitro in a manner that has been compared with crystallization, commencing with a nucleation event followed by subsequent organized aggregation (Silver et al. 1992, Prockop & Fertala 1998). Fibril formation is stabilized by lysyl oxidase catalyzed crosslinks between adjacent molecules (Siegel & Fu 1976).

Collagen formation

Collagen is a family of at least 29 structural proteins derived from over 40 human genes (Myllyharju & Kivirikko 2004). It is the main component of connective tissue, and the most abundant protein in mammals, making up about 25% to 35% of whole-body protein content. A defining feature of collagens is the formation of trimeric left-handed polyproline II-type helical collagenous regions. The packing within these regions is made possible by the presence of the smallest amino acid, glycine, at every third residue, resulting in a repeating motif Gly-X-Y where X is often proline (Pro) and Y often 4-hydroxyproline (4Hyp). Gly-Pro-Hyp is the most common triplet in collagen (Ramshaw et al. 1998). Collagen peptide chains also have non-collagenous domains, with collagen subclasses having common chain structures. Collagen fibrils are mostly found in fibrous tissues such as tendon, ligament and skin. Other forms of collagen are abundant in cornea, cartilage, bone, blood vessels, the gut, and intervertebral disc. In muscle tissue, collagen is a major component of the endomysium, constituting up to 6% of muscle mass. Gelatin, used in food and industry, is collagen that has been irreversibly hydrolyzed. On the basis of their fibre architecture in tissues, the genetically distinct collagens have been divided into subgroups. Group 1 collagens have uninterrupted triple-helical domains of about 300 nm, forming large extracellular fibrils. They are referred to as the fibril-forming collagens, consisting of collagen types I, II, III, V, XI, XXIV and XXVII. Group 2 collagens are types IV and VII, which have extended triple helices (>350 nm) with imperfections in the Gly-X-Y repeat sequences. Group 3 are the short-chain collagens. These have two subgroups. Group 3A have continuous triple-helical domains (types VI, VIII and X).
Group 3B have interrupted triple-helical domains, referred to as the fibril-associated collagens with interrupted triple helices (FACIT collagens, Shaw & Olsen 1991). FACITs include collagen IX, XII, XIV, XVI, XIX, XX, XXI, XXII and XXVI plus the transmembrane collagens (XIII, XVII, XXIII and XXV) and the multiple triple helix domains and interruptions (Multiplexin) collagens XV and XVIII (Myllyharju & Kivirikko 2004). The non-collagenous domains of collagens have regulatory functions; several are biologically active when cleaved from the main peptide chain. Fibrillar collagen peptides all have a large triple helical domain (COL1) bordered by N and C terminal extensions, called the N- and C-propeptides, which are cleaved prior to formation of the collagen fibril. The intact form is referred to as a collagen propeptide, not procollagen, which is used to refer to the trimeric triple-helical precursor of collagen before the propeptides are removed. The C-propeptide, also called the NC1 domain, directs chain association during assembly of the procollagen molecule from its three constituent alpha chains (Hulmes 2002).
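
The Gly-X-Y repeat described above lends itself to a simple sequence check. The following is a minimal Python sketch, not a validated annotation tool: it assumes a one-letter amino acid string in which 'O' stands for 4-hydroxyproline (a common shorthand, adopted here as an assumption), and it assumes the reading frame starts at the first residue. The function names and the toy sequence are hypothetical and do not come from any collagen database.

```python
from collections import Counter


def gly_xy_triplets(seq):
    """Split a sequence into consecutive complete triplets, starting at residue 1."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_gly_xy_repeat(seq):
    """Return True if every complete triplet begins with glycine (G)."""
    triplets = gly_xy_triplets(seq)
    return bool(triplets) and all(t[0] == "G" for t in triplets)


def triplet_counts(seq):
    """Count how often each triplet occurs in the sequence."""
    return Counter(gly_xy_triplets(seq))


# Hypothetical toy sequence; 'O' is assumed to denote 4-hydroxyproline.
example = "GPOGPOGAOGPKGPO"
print(is_gly_xy_repeat(example))
print(triplet_counts(example).most_common(1))
```

A sequence with an imperfection in the repeat, such as those noted for the group 2 collagens, would fail the strict check; a more realistic tool would report the positions of interruptions rather than a single boolean.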

Fibril forming collagens are the most familiar and best studied subgroup. Collagen fibres are aggregates or bundles of collagen fibrils, which are themselves polymers of tropocollagen complexes, each consisting of three polypeptide chains known as alpha chains. Tropocollagens are considered the subunit of larger collagen structures. They are approximately 300 nm long and 1.5 nm in diameter, with a left-handed triple-helical structure, which becomes twisted into a right-handed coiled-coil 'super helix' in the collagen fibril. Tropocollagens in the extracellular space polymerize spontaneously with regularly staggered ends (Hulmes 2002). In fibrillar collagens the molecules are staggered by about 67 nm, a unit known as D that changes depending upon the hydration state. Each D-period contains slightly more than four collagen molecules so that every D-period repeat of the microfibril has a region containing five molecules in cross-section, called the 'overlap', and a region containing only four molecules, called the 'gap'. The triple-helices are arranged in a hexagonal or quasi-hexagonal array in cross-section, in both the gap and overlap regions (Orgel et al. 2006). Collagen molecules cross-link covalently to each other via lysine and hydroxylysine side chains. These cross-links are unusual, occurring only in collagen and elastin, a related protein.
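
The gap and overlap arithmetic above can be worked through with the approximate figures given in the text. The sketch below is an idealized calculation only: it takes the rounded values of 300 nm for molecule length and 67 nm for D as inputs, and the function name and returned keys are hypothetical. Measured gap and overlap lengths differ with hydration state and tissue, so the output should be read as an illustration of why a non-integer length-to-D ratio produces the two regions, not as reference data.

```python
import math


def stagger_geometry(molecule_length_nm=300.0, d_period_nm=67.0):
    """Idealized overlap/gap lengths for molecules staggered by one D-period."""
    ratio = molecule_length_nm / d_period_nm  # molecule length in units of D
    whole = math.floor(ratio)                 # complete D-periods spanned
    # The part of the molecule beyond the last whole D-period forms the overlap.
    overlap_nm = molecule_length_nm - whole * d_period_nm
    gap_nm = d_period_nm - overlap_nm
    return {
        "length_in_D": ratio,
        "overlap_nm": overlap_nm,
        "gap_nm": gap_nm,
        "molecules_in_overlap": whole + 1,
        "molecules_in_gap": whole,
    }


geometry = stagger_geometry()
for key, value in geometry.items():
    print(key, value)
```

With these rounded inputs the molecule spans a little under 4.5 D, which reproduces the five-molecule overlap and four-molecule gap cross-sections described in the text.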

The macromolecular structures of collagen are diverse. Several group 3 collagens associate with larger collagen fibers, serving as molecular bridges which stabilize the organization of the extracellular matrix. Type IV collagen is arranged in an interlacing network within the dermal-epidermal junction and vascular basement membranes. Type VI collagen forms distinct microfibrils called beaded filaments. Type VII collagen forms anchoring fibrils. Type VIII and X collagens form hexagonal networks. Type XVII collagen is a component of hemidesmosomes where it is complexed with alpha6beta4 integrin, plectin, and laminin-332 (de Pereda et al. 2009). Type XXIX collagen has been recently reported to be a putative epidermal collagen with highest expression in suprabasal layers (Soderhall et al. 2007). Collagen fibrils/aggregates arranged in varying combinations and concentrations in different tissues provide specific tissue properties. In bone, collagen triple helices lie in a parallel, staggered array with 40 nm gaps between the ends of the tropocollagen subunits, which probably serve as nucleation sites for the deposition of crystals of the mineral component, hydroxyapatite (Ca10(PO4)6(OH)2) with some phosphate. Collagen structure affects cell-cell and cell-matrix communication, tissue construction in growth and repair, and is changed in development and disease (Sweeney et al. 2006, Twardowski et al. 2007). A single collagen fibril can be heterogeneous along its axis, with significantly different mechanical properties in the gap and overlap regions, correlating with the different molecular organizations in these regions (Minary-Jolandan & Yu 2009).

Extracellular matrix organization

The extracellular matrix is a component of all mammalian tissues, a network consisting largely of the fibrous proteins collagen, elastin and associated microfibrils, fibronectin and laminins embedded in a viscoelastic gel of anionic proteoglycan polymers. It performs many functions in addition to its structural role; as a major component of the cellular microenvironment it influences cell behaviours such as proliferation, adhesion and migration, and regulates cell differentiation and death (Hynes 2009).

ECM composition is highly heterogeneous and dynamic, being constantly remodeled (Frantz et al. 2010) and modulated, largely by matrix metalloproteinases (MMPs) and growth factors that bind to the ECM influencing the synthesis, crosslinking and degradation of ECM components (Hynes 2009). ECM remodeling is involved in the regulation of cell differentiation processes such as the establishment and maintenance of stem cell niches, branching morphogenesis, angiogenesis, bone remodeling, and wound repair. Redundant mechanisms modulate the expression and function of ECM modifying enzymes. Abnormal ECM dynamics can lead to deregulated cell proliferation and invasion, failure of cell death, and loss of cell differentiation, resulting in congenital defects and pathological processes including tissue fibrosis and cancer.

Collagen is the most abundant fibrous protein within the ECM, constituting up to 30% of total protein in multicellular animals. Collagen provides tensile strength. It associates with elastic fibres, composed of elastin and fibrillin microfibrils, which give tissues the ability to recover after stretching. Other ECM proteins such as fibronectin, laminins, and matricellular proteins participate as connectors or linking proteins (Daley et al. 2008).

Chondroitin sulfate, dermatan sulfate and keratan sulfate proteoglycans are structural components associated with collagen fibrils (Scott & Haigh 1985; Scott & Orford 1981), serving to tether the fibril to the surrounding matrix. Decorin belongs to the small leucine-rich repeat proteoglycan family (SLRPs) which also includes biglycan, fibromodulin, lumican and asporin. All appear to be involved in collagen fibril formation and matrix assembly (Ameye & Young 2002).

ECM proteins such as osteonectin (SPARC), osteopontin and thrombospondins -1 and -2, collectively referred to as matricellular proteins (reviewed in Mosher & Adams 2012), appear to modulate cell-matrix interactions. In general they induce de-adhesion, characterized by disruption of focal adhesions and a reorganization of actin stress fibers (Bornstein 2009). Thrombospondin (TS)-1 and -2 bind MMP2. The resulting complex is endocytosed by the low-density lipoprotein receptor-related protein (LRP), clearing MMP2 from the ECM (Yang et al. 2001).

Osteopontin (SPP1, bone sialoprotein-1) interacts with collagen and fibronectin (Mukherjee et al. 1995). It also contains several cell adhesive domains that interact with integrins and CD44.

Aggrecan is the predominant ECM proteoglycan in cartilage (Hardingham & Fosang 1992). Its relatives include versican, neurocan and brevican (Iozzo 1998). In articular cartilage the major non-fibrous macromolecules are aggrecan, hyaluronan, and hyaluronan and proteoglycan link protein 1 (HAPLN1). The high negative charge density of these molecules leads to the binding of large amounts of water (Bruckner 2006). Hyaluronan is bound by several large proteoglycans belonging to the hyalectan family that form high-molecular weight aggregates (Roughley 2006), accounting for the turgid nature of cartilage.

The most significant enzymes in ECM remodeling are the matrix metalloproteinase (MMP) and a disintegrin and metalloproteinase with thrombospondin motifs (ADAMTS) families (Cawston & Young 2010). Other notable ECM degrading enzymes include plasmin and cathepsin G. Many ECM proteinases are initially present as precursors, activated by proteolytic processing. MMP precursors include an amino prodomain which masks the catalytic Zn-binding motif (Page-McCaw et al. 2007). This can be removed by other proteinases, often other MMPs. ECM proteinases can be inactivated by degradation, or blocked by inhibitors. Some of these inhibitors, including alpha2-macroglobulin, alpha1-proteinase inhibitor, and alpha1-antichymotrypsin, can inhibit a large variety of proteinases (Woessner & Nagase 2000). The tissue inhibitors of metalloproteinases (TIMPs) are potent MMP inhibitors (Brew & Nagase 2010).