Molecular Biology of the Gene
FIFTH EDITION
BAKER BELL GANN LEVINE LOSICK
Brief Contents
[Figure: The Genetic Code. Footnotes: Chain-terminating or "nonsense" codons. Also used in bacteria to specify the initiator formyl-Met-tRNAfMet.]
[Figure: base pairs (adenine labeled), with distances of 11.1 Å and 10.8 Å. Caption: The position and length of the hydrogen bonds between the base pairs.]
NOTFORSALE
Molecular Biology of the Gene
FIFTH EDITION
James D. Watson Cold Spring Harbor Laboratory
Tania A. Baker Massachusetts Institute of Technology
Stephen P. Bell Massachusetts Institute of Technology
Alexander Gann Cold Spring Harbor Laboratory Press
Michael Levine University of California, Berkeley
Richard Losick Harvard University
PEARSON Benjamin Cummings
CSHL PRESS
Benjamin Cummings
Publisher: Jim Smith
Associate Project Editors: Alexandra Fellowes, Jeanne Z

Cold Spring Harbor Laboratory Press
Publisher and Sponsoring Editor: John Inglis
Editorial Director: Alexander Gann
Editorial Development Manager: Jan Argentine
Project Manager and Developmental Editor: Kaaren Janssen
Project Coordinator: Maryliz Dickerson
Editorial Development Assistant: Nora Rice
Crystal structure images: Leemor Joshua-Tor
Cover concept sketch: Erica Beade, MBC Graphics
Cover Designers: Denise Weiss, Ed Atkeson
ISBN 0-321-22368-3

Copyright © 2004 Pearson Education, Inc., publishing as Benjamin Cummings, 1301 Sansome Street, San Francisco, CA 94111. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, 1900 E. Lake Avenue, Glenview, IL 60025. For information regarding permissions, call 847/486/2635.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.

If you purchased this book within the United States or Canada you should be aware that it has been wrongfully imported without the approval of the Publisher or the Author.
1 2 3 4 5 6 7 8 9 10-VHP-07 06 05 04 03
www.aw-bc.com
www.cshlpress.com
Preface
As the fifth edition of Molecular Biology of the Gene goes to press, completion of the human genome sequence is no longer news. This was not something that could safely have been anticipated when the first edition appeared in 1965; even when the fourth edition came out in 1987, few if any foresaw how quickly we would move into a world where whole genomes, not just individual genes, could be visualized and compared. There has been a comparable leap in the elucidation of protein structures as well. Thus, in the last few years, the structures of the huge molecular machines that drive the basic processes discussed in this book (DNA transcription, replication, protein synthesis, and so forth) have largely been solved at the atomic level, and many details of their inner workings revealed.

The new edition of Molecular Biology of the Gene reflects these advances, and many others besides. But when we sat down to plan this latest version, we were all of a mind that much of the organization and scope of the original book should be retained. This was not a matter of convenience: inevitably, in light of the dramatic changes that had taken place since the last edition, the vast bulk of the text had to be completely rewritten anyway, and all the art rendered afresh. No, the reasoning was simply that, more than ever in this genomic era, there seemed a need for a book that explained what genes are and how they work, and this was exactly what Molecular Biology of the Gene had originally been designed to do. Thus, we have resisted the temptation to become encyclopedic or to delve into allied disciplines, such as cell biology.

Also, we wanted the new edition to retain a focus on principles and concepts, another feature of its predecessors. And so we illustrate our discussion sparingly with experiments, which appear mainly in boxes. These considerations ensured the book did not become unwieldy. As stated by its author in the preface to the first edition: "Often I present a fact, and, because of lack of space, I cannot outline the experiments that demonstrate its validity. Given the choice between deleting an important principle or giving an experimental detail, I am inclined to state the principle."
The current incarnation of Molecular Biology of the Gene adheres unapologetically to this philosophy. An outline of this new edition will thus be familiar to anyone who has used the book before. We begin (in Part 1) with a series of chapters (modified in the current edition) that place the field of molecular biology in context. Those chapters summarize the history of genetics and molecular biology and also present the timeless chemical principles that determine the structure and function of macromolecules. The text thereafter is organized to follow a familiar flow of topics. The nature of the genetic material, its organization and its maintenance, are discussed in Part 2: in addition to chapters on DNA structure, replication, recombination, and repair, new to this part of the book is a chapter on chromosomes, chromatin, and the nucleosome. This addition reflects current appreciation of how the context in which a given gene is found influences its function and regulation.
The passage of information from gene to protein, so-called gene expression, is covered in Part 3; and, in Part 4 we describe the regulation of that process. As well as chapters on the basic mechanisms of gene regulation, Part 4 has chapters on the regulation of gene expression in animal development and in the evolution of animal diversity. These chapters again conform to a tradition established by earlier editions: always there has been a chapter or two linking basic mechanisms of molecular biology to pressing biological questions. In the current edition, these chapters investigate perhaps the most striking revelation to come from comparing the complete genome sequences of various animals: different animals, including humans, contain largely the same genes and so differences between those animals must result largely from changes in how those genes are expressed.

New to the current edition is the final part, Part 5, comprising chapters on experimental methods (the techniques of molecular biology, genomics, and bioinformatics) and on the model organisms whose study has revealed many of the underlying principles of molecular biology.

We alluded to the explosion in the numbers of atomic structures solved in the last few years. These include not only many of the enzymes that mediate the basic processes of molecular biology, and many of the proteins that regulate those processes, but the nucleosome as well. While it remains true that many of the basic concepts in molecular biology can be understood without reliance on structural detail (indeed it is one of the strengths of the field that this is the case), nevertheless, many mechanistic insights come only from seeing these details. Accordingly, where structures shed light on how the molecules in question work, we present them; and we do so in a consistent style throughout the book.

Each part opener includes a short text, outlining what will be covered in the coming chapters, and a few photographs. These pictures, from the Cold Spring Harbor Laboratory Archive, were all taken at the Laboratory on Long Island, the great majority at the Symposium hosted there almost every summer since 1933. Captions identify who is in each picture and when it was taken. We thank Clare Bunce and the CSHL Archive for help with these.

Parts of the current edition grew out of an introductory course on molecular biology taught by one of us (RL) at Harvard University, and this author is grateful to Steve Harrison and Jim Wang who contributed to this course in past years and whose influence is reflected in Chapter 6 and elsewhere.

We have shown sections of the manuscript to various colleagues and their comments have been most valuable, greatly improving the accuracy and accessibility of the text and figures. Specifically we thank: Jamie Cate, Richard Ebright, Mike Eisen, Chris Fromme, Ira Hall, Adrian Krainer, Karolin Luger, Bill McGinnis, Matt Michael, Lily Mirels, Nipam Patel, Craig Peterson, Mark Ptashne, Uttam RajBhandary, and Bruce Stillman. In addition, Craig Hunter drafted the section on the worm for Chapter 21. We also thank those who provided us with figures, or the wherewithal to create them, including: Sean Carroll, Seth Darst, Edward Egelman, Georg Halder, Stuart Kim, Bill McGinnis, Steve Paddock, Phoebe Rice, Matt Scott, Peter Sorger, Andrzej Stasiak, Tom Steitz, Dan Voytas, and Steve West.

We are most grateful to Leemor Joshua-Tor who rendered all the structure figures, often producing multiple versions and patiently helping us see which best [...] needed. We are also grateful to those who provided their software¹: Per Kraulis, Robert Esnouf, Ethan Merritt, and Barry Honig. Coordinates were obtained from the Protein Data Bank (www.rcsb.org/pdb/); and citations to those who solved each structure are included in the figure legends. Our art program was developed and rendered by a talented and enthusiastic team from the Dragonfly Media Group, led by Mike Demaray and Craig Durant. Renate Hellmiss helped to develop some of our initial sketches and provided early renderings of a number of figures. The cover image was rendered by Tomo Narashima from an author concept sketch by Erica Beade (MBC Graphics).

We thank those at Cold Spring Harbor Laboratory Press who handled development of this book. Jan Argentine, despite having to enforce the deadlines, was throughout less cajoling than she was tirelessly engaged in helping us solve the problems these presented. Maryliz Dickerson kept organized the mass of material we generated and Nora Rice helped coordinate author meetings and other aspects of the project. Denise Weiss and Ed Atkeson produced the cover design; and John Inglis, who initiated this collaboration, was on hand with advice at critical points in the process. Most of all, Kaaren Janssen, our editor, kept everything afloat with an energy, enthusiasm, and activity far beyond anything we could reasonably have asked for; things simply would not have got to this point without her.

We also wish to acknowledge the work of those at Benjamin Cummings who coordinated production of the book. Frank Ruggirello oversaw the process carried out by Jim Smith, Kay Ueno, Corinne Benson, Alexandra Fellowes, Jeanne Zalesky, and Donna Kalal. Ingrid Mount at Elm Street Publishing Services coped cheerfully with the many rounds of changes to art and text even very late in the process. Michele Sordi, while part of the Benjamin Cummings team, helped bring us all together in the first place.

And finally we gratefully acknowledge our families and friends who, throughout this period, provided such strong support, despite having to put up with our frequent absences and distractions.

James D. Watson
Tania A. Baker
Stephen P. Bell
Alexander Gann
Michael Levine
Richard Losick
¹ Per Kraulis granted permission to use MolScript (Kraulis P.J., 1991. MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. Journal of Applied Crystallography
About the Authors

JAMES D. WATSON was Director of Cold Spring Harbor Laboratory from 1968 to 1993 and is now its President. He spent his undergraduate years at the University of Chicago and received his Ph.D. in 1950 from Indiana University. Between 1950 and 1953, he did postdoctoral research in Copenhagen and Cambridge, England. While at Cambridge, he began the collaboration that resulted in the elucidation of the double-helical structure of DNA in 1953. (For this discovery, Watson, Francis Crick, and Maurice Wilkins were awarded the Nobel Prize in 1962.) Later in 1953, he went to the California Institute of Technology. He moved to Harvard in 1955, where he taught and did research on RNA synthesis and protein synthesis until 1976. He was the first Director of the National Center for Genome Research of the National Institutes of Health from 1989 to 1992. Dr. Watson was sole author of the first, second, and third editions of Molecular Biology of the Gene, and a co-author of the fourth edition. These were published in 1965, 1970, 1976, and 1987 respectively. Watson has also been involved in two other textbooks: he was one of the original authors of Molecular Biology of the Cell, and is also an author of Recombinant DNA: a Short Course.

TANIA A. BAKER is the Whitehead Professor of Biology at the Massachusetts Institute of Technology and an Investigator of the Howard Hughes Medical Institute. She received a B.S. in biochemistry from the University of Wisconsin, Madison, and a Ph.D. in biochemistry from Stanford University in 1988. Her graduate research was carried out in the laboratory of Professor Arthur Kornberg and focused on mechanisms of initiation of DNA replication. She did postdoctoral research in the laboratory of Dr. Kiyoshi Mizuuchi at the National Institutes of Health, studying the mechanism and regulation of DNA transposition. Her current research explores mechanisms and regulation of genetic recombination, enzyme-catalyzed protein unfolding, and ATP-dependent protein degradation. Professor Baker received the 2001 Eli Lilly Research Award from the American Society of Microbiology and the 2000 MIT School of Science Teaching Prize for Undergraduate Education. She is co-author (with Arthur Kornberg) of the book DNA Replication, Second Edition.

STEPHEN P. BELL is a Professor of Biology at the Massachusetts Institute of Technology and an Assistant Investigator of the Howard Hughes Medical Institute. He received B.A. degrees from the Department of Biochemistry, Molecular Biology, and Cell Biology and the Integrated Sciences Program at Northwestern University and a Ph.D. in biochemistry at the University of California, Berkeley in 1991. His graduate research was carried out in the laboratory of Dr. Robert Tjian and focused on eukaryotic transcription. He did postdoctoral research in the laboratory of Dr. Bruce Stillman at Cold Spring Harbor Laboratory, working on the initiation of eukaryotic DNA replication. His current research focuses on the mechanisms controlling the duplication of eukaryotic chromosomes. Professor Bell received the 2001 ASBMB-Schering Plough Scientific Achievement Award and the Everett Moore Baker Memorial Award for Excellence in Undergraduate Teaching at MIT in 1998.
ALEXANDER GANN is Editorial Director of Cold Spring Harbor Laboratory Press, and a faculty member of the Watson School of Biological Sciences at Cold Spring Harbor Laboratory. He received his B.Sc in microbiology from University College London and a Ph.D. in molecular biology from The University of Edinburgh in 1989. His graduate research was carried out in the laboratory of Noreen Murray and focused on DNA recognition by restriction enzymes. He did postdoctoral research in the laboratory of Mark Ptashne at Harvard, working on transcriptional regulation, and that of Jeremy Brockes at the Ludwig Institute of Cancer Research at University College London, where he worked on newt limb regeneration. He was a Lecturer at Lancaster University, England, from 1996 to 1999, before moving to Cold Spring Harbor Laboratory. He is co-author (with Mark Ptashne) of the book Genes & Signals (2002).

MICHAEL LEVINE is a Professor of Molecular and Cell Biology at the University of California, Berkeley, and is also Co-Director of the Center for Integrative Genomics. He received his B.A. from the Department of Genetics at the University of California, Berkeley, and his Ph.D. with Alan Garen in the Department of Molecular Biophysics and Biochemistry from Yale University in 1981. As a postdoctoral fellow with Walter Gehring and Gerry Rubin from 1982-1984, he studied the molecular genetics of Drosophila development. Professor Levine's research group currently studies the gene networks responsible for the gastrulation of the Drosophila and Ciona (sea squirt) embryos. He holds the F. Williams Chair in Genetics and Development at the University of California, Berkeley. He was awarded the Monsanto Prize in Molecular Biology from the National Academy of Sciences in 1996, and was elected to the American Academy of Arts and Sciences in 1996 and the National Academy of Sciences in 1998.

RICHARD M. LOSICK is the Maria Moors Cabot Professor of Biology, a Harvard College Professor, and a Howard Hughes Medical Institute Professor in the Faculty of Arts & Sciences at Harvard University. He received his A.B. in chemistry at Princeton University and his Ph.D. in biochemistry at the Massachusetts Institute of Technology. Upon completion of his graduate work, Professor Losick was named a Junior Fellow of the Harvard Society of Fellows when he began his studies on RNA polymerase and the regulation of gene transcription in bacteria. Professor Losick is a past Chairman of the Departments of Cellular and Developmental Biology and Molecular and Cellular Biology at Harvard University. He received the Camille and Henry Dreyfus Teacher-Scholar Award, is a member of the National Academy of Sciences, a Fellow of the American Academy of Arts and Sciences, a Fellow of the American Association for the Advancement of Science, a Fellow of the American Academy of Microbiology, and a former Visiting Scholar of the Phi Beta Kappa Society.
Detailed Contents

PART 1
CHEMISTRY AND GENETICS 1

CHAPTER 1
The Mendelian View of the World 5
MENDEL'S DISCOVERIES 6
The Principle of Independent Segregation 6
Box 1-1 Mendelian Laws 6
Some Alleles Are Neither Dominant Nor Recessive 8
Principle of Independent Assortment 8
CHROMOSOMAL THEORY OF HEREDITY 8
GENE LINKAGE AND CROSSING OVER 9
Box 1-2 Genes Are Linked to Chromosomes 10
CHROMOSOME MAPPING 12
THE ORIGIN OF GENETIC VARIABILITY THROUGH MUTATIONS 15
EARLY SPECULATIONS ABOUT WHAT GENES ARE AND HOW THEY ACT 16
PRELIMINARY ATTEMPTS TO FIND A GENE-PROTEIN RELATIONSHIP 16
Summary 17
Bibliography 18

CHAPTER 2
Nucleic Acids Convey Genetic Information 19
AVERY'S BOMBSHELL: DNA CAN CARRY GENETIC SPECIFICITY 20
Viral Genes Are Also Nucleic Acids 21
THE DOUBLE HELIX 21
Box 2-1 Chargaff's Rules 23
Finding the Polymerases that Make DNA 24
Experimental Evidence Favors Strand Separation during DNA Replication 26
THE GENETIC INFORMATION WITHIN DNA IS CONVEYED BY THE SEQUENCE OF ITS FOUR NUCLEOTIDE BUILDING BLOCKS 28
DNA Cannot Be the Template that Directly Orders Amino Acids during Protein Synthesis 28
Box 2-2 Evidence that Genes Control Amino Acid Sequence in Proteins 29
RNA Is Chemically Very Similar to DNA 30
THE CENTRAL DOGMA 31
The Adaptor Hypothesis of Crick 31
The Test-Tube Synthesis of Proteins 32
The Paradox of the Nonspecific-Appearing Ribosomes 32
Discovery of Messenger RNA (mRNA) 33
Enzymatic Synthesis of RNA upon DNA Templates 33
Establishing the Genetic Code 35
ESTABLISHING THE DIRECTION OF PROTEIN SYNTHESIS 37
Start and Stop Signals Are Also Encoded within DNA 38
THE ERA OF GENOMICS 38
Summary 39
Bibliography 40

CHAPTER 3
The Importance of Weak Chemical Interactions 41
CHARACTERISTICS OF CHEMICAL BONDS 41
Chemical Bonds Are Explainable in Quantum-Mechanical Terms 42
Chemic
THE CONCEPT OF FREE ENERGY 44
Keq Is Exponentially Related to ΔG 44
Covalent Bonds Are Very Strong 44
WEAK BONDS IN BIOLOGICAL SYSTEMS 45
Weak Bonds Have Energies between 1 and 7 kcal/mol 45
Weak Bonds Are Constantly Made and Broken at Physiological Temperatures 45
The Distinction between Polar and Nonpolar Molecules 45
Van der Waals Forces 46
Hydrogen Bonds 47
Some Ionic Bonds Are Hydrogen Bonds 47
Weak Interactions Demand Complementary Molecular Surfaces 48
Water Molecules Form Hydrogen Bonds 49
Weak Bonds between Molecules in Aqueous Solutions 49
Box 3-1 The Uniqueness of Molecular Shapes and the Concept of Selective Stickiness 50
Organic Molecules that Tend to Form Hydrogen Bonds Are Water Soluble 51
Hydrophobic "Bonds" Stabilize Macromolecules
The Advantages of ΔG between 2 and 5 kcal/mol
Weak Bonds Attach Enzymes to Substrates
Weak Bonds Mediate Most Protein:DNA and Protein:Protein Interactions 53
Summary 53
Bibliography 54

CHAPTER 4
The Importance of High-Energy Bonds 55
MOLECULES THAT DONATE ENERGY ARE THERMODYNAMICALLY UNSTABLE 55
ENZYMES LOWER ACTIVATION ENERGIES IN BIOCHEMICAL REACTIONS 57
FREE ENERGY IN BIOMOLECULES 58
High-Energy Bonds Hydrolyze with Large Negative ΔG 58
HIGH-ENERGY BONDS IN BIOSYNTHETIC REACTIONS 60
Peptide Bonds Hydrolyze Spontaneously 60
Coupling of Negative with Positive ΔG 61
ACTIVATION OF PRECURSORS IN GROUP TRANSFER REACTIONS 61
ATP Vers 62
Activation of Amino Acids by Attachment of AMP 63
Nucleic Acid Precursors Are Activated by the Presence of P~P 64
The Value of P~P Release in Nucleic Acid Synthesis 64
P~P Splits Characterize Most Biosynthetic Reactions 65
Summary 67
Bibliography 67

CHAPTER 5
Weak and Strong Bonds Determine Macromolecular Structure 69
HIGHER-ORDER STRUCTURES ARE DETERMINED BY INTRA- AND INTERMOLECULAR INTERACTIONS 69
DNA Can Form a Regular Helix 69
RNA Forms a Wide Variety of Structures 71
Chemical Features of Protein Building Blocks 71
The Peptide Bond 72
There Are Four Levels of Protein Structure 72
α Helices and β Sheets Are the Common Forms of Secondary Structure 74
Box 5-1 Determination of Protein Structure 75
THE SPECIFIC CONFORMATION OF A PROTEIN RESULTS FROM ITS PATTERN OF HYDROGEN BONDS 78
α Helices Come Together to Form Coiled-Coils 80
MOST PROTEINS ARE MODULAR, CONTAINING TWO OR THREE DOMAINS 81
Proteins Are Composed of a Surprisingly Small Number of Structural Motifs 81
Box 5-2 Large Proteins Are Often Constructed of Several Smaller Polypeptide Chains 82
Different Protein Functions Arise from Various Dom
WEAK BONDS CORRECTLY POSITION PROTEINS ALONG DNA AND RNA MOLECULES 84
Proteins Scan along DNA to Locate a Specific DNA-Binding Site 85
Diverse Strategies for Protein Recognition of RNA 86
ALLOSTERY: REGULATION OF A PROTEIN'S FUNCTION BY CHANGING ITS SHAPE 87
The Structural Basis of Allosteric Regulation Is Known for Examples Involving Small Ligands, Protein-Protein Inter
Bibliography 92

PART 2
MAINTENANCE OF THE GENOME 93

CHAPTER 6
The Structures of DNA and RNA 97
DNA STRUCTURE 98
DNA Is Composed of Polynucleotide Chains 98
Each Base Has Its Preferred Tautomeric Form 100
The Two Strands of the Double Helix Are Held Together by Base Pairing in an Antiparallel Orientation 100
The Two Chains of the Double Helix Have Complementary Sequences 101
Hydrogen Bonding Is Important for the Specificity of Base Pairing 102
Bases Can Flip Out from the Double Helix 102
DNA Is Usually a Right-Handed Double Helix 103
The Double Helix Has Minor and Major Grooves 103
The Major Groove Is Rich in Chemical Information 103
Box 6-1 DNA Has 10.5 Base Pairs per Turn of the Helix in Solution: The Mica Experiment 104
The Double Helix Exists in Multiple Conformations 106
DNA Can Sometimes Form a Left-Handed Helix 107
DNA Strands Can Separate (Denature) and Reassociate 108
Some DNA Molecules Are Circles 111
DNA TOPOLOGY 111
Linking Number Is an Invariant Topological Property of Covalently Closed, Circular DNA 112
Linking Number Is Composed of Twist and Writhe 112
Lk0 Is the Linking Number of Fully Relaxed cccDNA under Physiological Conditions 114
DNA in Cells Is Negatively Supercoiled 114
Nucleosomes Introduce Negative Supercoiling in Eukaryotes 115
Topoisomerases Can Relax Supercoiled DNA 115
Prokaryotes Have a Special Topoisomerase that Introduces Supercoils into DNA 116
Topoisomerases also Unknot and Disentangle DNA Molecules 117
Topoisomerases Use a Covalent Protein-DNA Linkage to Cleave and Rejoin DNA Strands 118
Topoisomerases Form an Enzyme Bridge and Pass DNA Segments through Each Other 118
DNA Topoisomers Can Be Separated by Electrophoresis 120
Ethidium Ions Cause DNA to Unwind 120
Box 6-2 Proving that DNA Has a Helical Periodicity of about 10.5 Base Pairs per Turn from the Topological Properties of DNA Rings 121
RNA STRUCTURE 122
RNA Contains Ribose and Uracil and Is Usually Single-Stranded 122
RNA Chains Fold Back on Themselves to Form Local Regions of Double Helix Similar to A-Form DNA 123
RNA Can Fold Up into Complex Tertiary Structures 124
Some RNAs Are Enzymes 125
The Hammerhead Ribozyme Cleaves RNA by the Formation of a 2',3' Cyclic Phosphate 125
Did Life Evolve from an RNA World? 126
Summary 126
Bibliography 127

CHAPTER 7
Chromosomes, Chromatin, and the Nucleosome 129
CHROMOSOME SEQUENCE AND DIVERSITY 130
Chromosomes Can Be Circular or Linear 130
Every Cell Maintains a Characteristic Number of Chromosomes 131
Genome Size Is Related to the Complexity of the Organism 133
The E. coli Genome Is Composed almost Entirely of Genes 134
More Complex Organisms Have Decreased Gene Density 134
Genes Make Up Only a Small Proportion of the Eukaryotic Chromosomal DNA 135
The Majority of Human Intergenic Sequences Are Composed of Repetitive DNA 137
CHROMOSOME DUPLICATION AND SEGREGATION 138
Eukaryotic Chromosomes Require Centromeres, Telomeres, and Origins of Replication to Be Maintained during Cell Division 138
Eukaryotic Chromosome Duplication and Segregation Occur in Separate Phases of the Cell Cycle 141
Chromosome Structure Changes as Eukaryotic Cells Divide 143
Sister Chromatid Cohesion and Chromosome Condensation Are Mediated by SMC Proteins 144
Mitosis Maintains the Parental Chromosome Number 146
The Gap Phases of the Cell Cycle Allow Time to Prepare for the Next Cell Cycle Stage while also Checking that the Previous Stage Is Finished Correctly 146
Meiosis Reduces the Parental Chromosome Number 148
Different Levels of Chromosome Structure Can Be Observed by Microscopy 150
THE NUCLEOSOME 151
Nucleosomes Are the Building Blocks of Chromosomes 151
Box 7-1 Micrococcal Nuclease and the DNA Associated with the Nucleosome 152
Histones Are Small, Positively 153
The Atomic Structure of the Nucleosome 154
Many DNA Sequence-Independent Contacts Mediate the Interaction between the Core Histones and DNA 156
The Histone N-Terminal Tails Stabilize DNA Wrapping around the Octamer 159
HIGHER-ORDER CHROMATIN STRUCTURE 160
Histone H1 Binds to the Linker DNA between Nucleosomes 160
Nucleosome Arrays Can Form More Complex Structures: the 30-nm Fiber 161
The Histone N-Terminal Tails Are Required for the Formation of the 30-nm Fiber 162
Further Compaction of DNA Involves Large Loops of Nucleosomal DNA 162
Histone Variants Alter Nucleosome Function 163
REGULATION OF CHROMATIN STRUCTURE 165
The Interaction of DNA with the Histone Octamer Is Dynamic 165
Nucleosome Remodeling Complexes Facilitate Nucleosome Movement 166
Some Nucleosomes Are Found in Specific Positions in vivo: Nucleosome Positioning 168
Modification of the N-Terminal Tails of the Histones Alters Chromatin Accessibility 169
Box 7-2 Determining Nucleosome Position in
Specific Enzymes Are Responsible for Histone Modification 173
Nucleosome Modification and Remodeling Work Together to Increase DNA Accessibility 174
NUCLEOSOME ASSEMBLY 175
Nucleosomes Are Assembled Immediately after DNA Replication 175
Assembly of Nucleosomes Requires Histone "Chaperones" 176
Summary 179
Bibliography 180

CHAPTER 8
l Bl THE CHEMISTRY OF DNA SYNTHESIS
The Replication of DNA
lH2
DNA Synthcsis RCQuircs Deoxynucleosidc Triphosphatcs and a Primer:Templare Junction
182
DNA Is Symhesizoo by Extending me J' Eml ohhe Primer IB3
8-'
Determinin.e: the Pu/arity
of a DNA He/icme
/ 96
Topoisomerast:S Remove Supcrcoils Produced by DNA UnwinJing at (he Replicarion Fork 198 Rcplication Fork Eoz)'ml.!S Extcnd rhe Rangc o(ONA Polymerasc Substratcs 199
DNA Polymerases Resemble a Hand rhar G rips che Primer:Temphne Junction
186 DNA Polymerases Are Prncessive Enzymes IBB
192
ONA Hclicases Un\V inJ m e Oouble Helix in AJvance of the Rcpl icatlon Fork 194
Box
THE MECHANISM OF DNA POLYMERASE 184 ONA Polymt'rases Use a Single Active Sire lO Camlyze ONA Synrhcsis 184
THE REPLlCATION FORK
194
Single-StranJed BinJing Proreins Srabi lize SingleStrandeJ ONA Prior f.O Replication 195
Hydrolysis of Pyrophosphares Is rhe Driving Foree for DNA Synrhesis 183
Exonuek"3Sl.'S Proofread Newly Sy mh~ ized ONA
RNA Primers Must Be Removed ro Complete DNA Rt'plicariún
THE SPECIALlZATION OF DNA POLYMERASES 200 191
ONA Polymerascs Are Sp!...>cializoo for Diffcrem Roles in rhe C e ll 100
I30rh StranJs of ONA Are Symhesized Together at the Replicarion Fork 192
Sl iding C hunps Dramatica ll y Inc reasc ONA Polymcmsc Proccssivi ty 201
The In itiation of a New Str.mtl o( DNA Rl-'Quin..'S an RNA Primer 193
Slitling C lamrs Are Opcnt_-d antl Placed on ONA by C lamp Loaders 204
NOTFORSALE
"vi
I"Jefoiled Conlen l~'
DNA SYNTHES1S AT THE REPLlCATlON FORK Z05 Box 8~2 ATP Control oI Prurem Flmction : LoadU'g a Sliding C lamp
ElJkaryotic Chromosomes Are Replicau.>- d Exactly O nce per Ccll C ycle 223 Prc-Replicarivc Complex Formation Direcrs me lniliation o[Replic
206
Imernctio ns benvcen Replicarion Fork Proteins Form me E. coli Replisome 210
lNITlATION OF DNA REPLlCATlON
Pre~ RC
Fo rmarion and Ac rivario n Is Regu latecl to A llow only a Single Roune! of Replicadon during Each Ce ll Cyele 225
ZIZ
Spccinc Gcn omic ONA Scquences Direc l [he Iniciarion úf DNA Replicación 212
S imilariries bc[wcen Eukaryoric anJ Prob.ryotic ONA Replicadon Inithnion 228
The Replicon Model of Replication In it¡alion 21l
F1NISHING REPLlCATION
RcplicétlOr &-qu~nCl__'5 InduJe Iniriator BimJing S itl-"l5 and Easily Unwound ONA lO
Type 11 Tupoisumcrast:s A re Requiroo ti) Separare Daughter ONA Mo leculcs
BINDING AND UNWINDlNG, ORIGIN SELECTION AND ACTIVATlON BY THE INITlATOR PROTElN 214 Bux 8~3 Thc Ick'JllificQ!ion oIOrigins of Replicario71 and Replicar/.JrS 214
Lagging S trand Synth csis Is Unablc ro Copy (he Extreme Ends of Linear Chromosomcs 229
Prot cin ~ Pro(cin
and Prmcin -DNA lnternctions Diren the lnitíation Process 2 17
CHAPTER
Telomerase Sol ves the End Replicario n Problcm by Extendíng (he)' End of (he Chromosome 232 232
BiblWgruph)'
2J3
9
The Mutability and Rcpair of ONA
235
REPLlCATION ERRORS AND THE1R REPA1R 236 The NawTe of Mutations 236
REPAlR OF DNA DAMAGE 246 Direct Reversal of ONA Damage
Expansion oI Triple Repeals
Causes Disease
247
Base Excision Repair Emymcs Remove. DamageJ
Sorne Replicarion Errors Escape Proofread ing
Box 9, I
228
Telomerase ls a Novel ONA Polymcrast" thal Ooes NOI R~quire an Exogcnous Templare 230
Summary
Box 8-4 E. coli DNA Repücation 1.5 Regulated IryDNA ·ATP Le",barulSeqA 21 7 Box 8;5 The Replicaoon Factary HypodJ.esis 221
228
Bases by a
237
Base-Flipping
Mechanism
248
Nucleotide Excision Repair Enzymes Cleave Damaged DNA on Either Side of the Lesion 250
237
Mismatch Repair Removes Errors that Escape Proofreading 238
Recombination Repairs DNA Breaks by Retrieving Sequence Information from Undamaged DNA 253
DNA DAMAGE
DNA Undergoes Damage Spontaneously from Hydrolysis and Deamination 242
Translesion DNA Synthesis Enables Replication to Proceed across DNA Damage 254 Box 9-3 The Y-Family of DNA Polymerases 256
Box 9-2 The Ames Test
Summary
242
243
DNA Is Damaged by Alkylation, Oxidation, and Radiation 244 Mutations Are also Caused by Base Analogs and Intercalating Agents 245
257
Bibliography
258
CHAPTER
10
Homologous Recombination at the Molecular Level 259 MODELS FOR HOMOLOGOUS RECOMBINATION 259 The Holliday Model Illustr
HOMOLOGOUS RECOMBINATION PROTEIN MACHINES 268 The RecBCD Helicase/Nuclease Processes Broken DNA Molecules for Recombination 269 RecA Protein Assembles on Single-Stranded DNA and Promotes Strand Invasion 272 Newly Base-Paired Partners Are Established within the RecA Filament 274 RecA Homologs Are Present in All Organisms 275 RuvAB Complex Specifically Recognizes Holliday Junctions and Promotes Branch Migration 276 RuvC Cleaves Specific DNA Strands at the Holliday Junction to Finish Recombination 276
CHAPTER
HOMOLOGOUS RECOMBINATION IN EUKARYOTES 278 Homologous Recombination Has Additional Functions in Eukaryotes 278 Homologous Recombination Is Required for Chromosome Segregation during Meiosis 279 Programmed Generation of Double-Stranded DNA Breaks Occurs during Meiosis 279 MRX Protein Processes the Cleaved DNA Ends for Assembly of the RecA-like Strand-Exchange Proteins 282 Dmc1 Is a RecA-like Protein that Specifically Functions in Meiotic Recombination 282 Many Proteins Function Together to Promote Meiotic Recombination 284
MATING-TYPE SWITCHING
285
Mating-Type Switching Is Initiated by a Site-Specific Double-Strand Break 286 Mating-Type Switching Is a Gene Conversion Event, Not Associated with Crossing Over 286
GENETIC CONSEQUENCES OF THE MECHANISM OF HOMOLOGOUS RECOMBINATION
288
Gene Conversion Occurs because DNA Is Repaired during Recombination 289 Summary
290
Bibliography
291
11
Site-Specific Recombination and Transposition of DNA CONSERVATIVE SITE-SPECIFIC RECOMBINATION 294 Site-Specific Recombination Occurs at Specific DNA Sequences in the Target DNA 294 Site-Specific Recombinases Cleave and Rejoin DNA Using a Covalent Protein-DNA Intermediate 296 Serine Recombinases Introduce Double-Stranded Breaks in DNA and then Swap Strands to Promote Recombination 298 Tyrosine Recombinases Break and Rejoin One Pair of DNA Strands at a Time 299
293
Structures of Tyrosine Recombinases Bound to DNA Reveal the Mechanism of DNA Exchange 300
Box 11-1 Application of Site-Specific Recombination to Genetic Engineering
302
BIOLOGICAL ROLES OF SITE-SPECIFIC RECOMBINATION 302
λ Integrase Promotes the Integration and Excision of a Viral Genome into the Host Cell Chromosome 303
Phage λ Excision Requires a New DNA-Bending Protein 304
The Hin Recombinase Inverts a Segment of DNA Allowing Expression of Alternative Genes 305 Hin Recombination Requires a DNA Enhancer
306
Recombinases Convert Multimeric Circular DNA Molecules into Monomers 307 There Are Other Mech
TRANSPOSITION
310
Some Genetic Elements Move to New Chromosomal Locations by Transposition 310 There Are Three Principal Classes of Transposable Elements 311
DNA Transposases and Retroviral Integrases Are Members of a Protein Superfamily 321 Box 11-2 The Pathway of Retroviral cDNA Formation 322 Poly-A Retrotransposons Move by a "Reverse Splicing" Mechanism
324 EXAMPLES OF TRANSPOSABLE ELEMENTS AND THEIR REGULATION 327 IS4-Family Transposons Are Compact Elements with Multiple Mechanisms for Copy Number Control 327 Box 11-3 Maize Elements and the Discovery of Transposons 328
DNA Transposons Carry a Transposase Gene, Flanked by Recombination Sites 312
Tn10 Transposition Is Coupled
Transposons Exist as Both Autonomous and Nonautonomous Elements 313
Phage Mu Is an Extremely Robust Transposon
Viral-like Retrotransposons and Retroviruses Carry Terminal Repeat Sequences and Two Genes Important for Recombination 313
Mu Uses Target Immunity to Avoid Transposing into Its Own DNA 331 Tc1/Mariner Elements Are Extremely Successful DNA Elements in Eukaryotes 334
Poly-A Retrotransposons Look Like Genes
to Cellular DNA Replication
314
LINEs Promote Their Own Transposition and Even Transpose Cellular RNAs 336
The Intermediate in Cut-and-Paste Transposition Is Finished by Gap Repair 316
V(D)J RECOMBINATION
There Are Multiple Mechanisms for Cleaving the Nontransferred Strand during DNA Transposition 316 DNA Transposition by a Replicative Mechanism
3
337
The Early Events in V(D)J Recombination Occur by a Mechanism Similar to Transposon Excision 339 318
Viral-like Retrotransposons and Retroviruses Move Using an RNA Intermediate 320
CHAPTER
331
Yeast Ty Elements Transpose into Safe Havens in the Genome 335
DNA Transposition by a Cut-and-Paste Mechanism 314
PART
329
Summary
342
Bibliography
342
EXPRESSION OF THE GENOME
343
12
Mechanisms of Transcription
347
RNA POLYMERASES AND THE TRANSCRIPTION CYCLE 348
Transcription Initiation Involves Three Defined Steps 352
RNA Polymerases Come in Different Forms, but Share Many Features 348
THE TRANSCRIPTION CYCLE IN BACTERIA 353
Transcription by RNA Polymerase Proceeds in a Series of Steps 350
Bacterial Promoters Vary in Strength and Sequence, but Have Certain Defining Features 353
The σ Factor Mediates Binding of Polymerase to the Promoter 354 Box 12-1 Consensus Sequences 355 Transition to the Open Complex Involves Structural Changes in RNA Polymerase and in the Promoter DNA 356 Transcription Is Initiated by RNA Polymerase without the Need for a Primer 358
TBP Binds to and Distorts DNA Using a β Sheet Inserted into the Minor Groove 366
RNA Polymerase Synthesizes Several Short RNAs before Entering the Elongation Phase 358 The Elongating Polymerase Is a Processive Machine that Synthesizes and Proofreads RNA 359 Box 12-2 The Single-Subunit RNA Polymerases 360 Transcription Is Terminated by Signals within the RNA Sequence 361
A New Set of Factors Stimulate Pol II Elongation and RNA Proofreading 370 Elongating Polymerase Is Associated with a New Set of Protein Factors Required for Various Types of RNA Processing 371
TRANSCRIPTION IN EUKARYOTES 363 RNA Polymerase II Core Promoters Are Made up of Combinations of Four Different Sequence Elements 363 RNA Polymerase II Forms a Pre-Initiation Complex with General Transcription Factors at the Promoter 364
The Other General Transcription Factors also Have Specific Roles in Initiation 367 In Vivo, Transcription Initiation Requires Additional Proteins, Including the Medi
RNA Polymerases I and III Recognize Distinct Promoters, Using Distinct Sets of Transcription Factors, but still Require TBP 374 Summary
376 377
Bibliography
CHAPTER 13 RNA Splicing 379 THE CHEMISTRY OF RNA SPLICING 380 Sequences within the RNA Determine Where Splicing Occurs 380
The Intron Is Removed in a Form Called a Lariat as the Flanking Exons Are Joined 381 Exons from Different RNA Molecules Can Be Fused by Trans-Splicing 383
Group I Introns Release a Linear Intron Rather than a Lariat 388 Box 13-1 Converting Group I Introns into Ribozymes 389 How Does the Spliceosome Find the Splice Sites Reliably? 391
ALTERNATIVE SPLICING 394
Single Genes Can Produce Multiple Products by Alternative Splicing 394 Alternative Splicing Is Regulated by Activators and Repressors 396 Box 13-2 Adenovirus and the Discovery Assembly, Rearrangements, and Cat
THE SPLICEOSOME MACHINERY 383 RNA Splicing Is Carried Out by a Large Complex Called the Spliceosome 383 SPLICING PATHWAYS 385
EXON SHUFFLING
mRNA TRANSPORT
401
Exons Are Shuffled by Recombination to Produce Genes Encoding New Proteins
RNA EDITING
400
406
Once Processed, mRNA Is Packaged and Exported from the Nucleus into the Cytoplasm for Translation
406
404
RNA Editing Is Another Way of Altering the Sequence of an mRNA 404
Summary
408
Bibliography
409
14 Translation 411 CHAPTER
MESSENGER RNA
412
Polypeptide Chains Are Specified by Open-Reading Frames 412 Prokaryotic mRNAs Have a Ribosome Binding Site that Recruits the Translational Machinery 413 Eukaryotic mRNAs Are Modified at Their 5' and 3' Ends to Facilitate Translation 414
TRANSFER RNA
415
tRNAs Are Adaptors between Codons and Amino Acids 415 tRNAs Share a Common Secondary Structure that Resembles a Cloverleaf 416 tRNAs Have an L-Shaped Three-Dimensional Structure 417
ATTACHMENT OF AMINO ACIDS TO tRNA 417 tRNAs Are Charged by the Attachment of an Amino Acid to the 3' Terminal Adenosine Nucleotide " iel
or
Box 14-1 Selenocysteine 423 THE RIBOSOME 423 The Ribosome Is Composed of
New Amino Acids Are Attached to the C-Terminus of the Growing Polypeptide Chain 427 Peptide Bonds Are Formed by Transfer of the Growing Polypeptide Chain from One tRNA to Another 428 Ribosomal RNAs Are Both Structural and Catalytic Determinants of the Ribosome 428 The Ribosome Has Three Binding Sites for tRNA 429 Channels through the Ribosome Allow the mRNA and Growing Polypeptide to Enter and/or Exit the Ribosome 430
INITIATION OF TRANSLATION
432
Prokaryotic mRNAs Are Initially Recruited to the Small Subunit by Base-Pairing to rRNA 433 A Specialized tRNA Ch
by the 5' Cap 435 The Start Codon Is Found by Scanning Downstream from the 5' End of the mRNA 437 Translation Initiation Factors Hold Eukaryotic mRNAs in Circles 438
Box 14-2 uORFs and IRESs: Exceptions that Prove the Rule 439 TRANSLATION ELONGATION 440 Aminoacyl-tRNAs Are Delivered to the A Site by Elongation Factor EF-Tu 441 The Ribosome Uses Multiple Mechanisms to Select Against Incorrect Aminoacyl-tRNAs 441 The Ribosome Is a Ribozyme 442 Peptide Bond Formation and the Elongation Factor EF-G Drive Translocation of the tRNAs
EF-G Drives Translocation by Displacing the tRNA Bound to the A Site 445 EF-Tu-GDP
of Translation 447 TERMINATION OF TRANSLATION
448
Release Factors Terminate Translation in Response to Stop Codons 448 Short Regions of Class I Release Factors Recognize Stop Codons and Trigger Release of the Peptidyl Ch
GDP/GTP Exchange and GTP Hydrolysis Control the Function of the Class II Release Factor 450 The Ribosome Recycling Factor Mimics a tRNA 450
TRANSLATION-DEPENDENT REGULATION OF mRNA AND PROTEIN STABILITY 452 The SsrA RNA Rescues Ribosomes that Translate Broken mRNAs 452 Box 14-4 Antibiotics Arrest Cell Division by Blocking Specific Steps in Translation 453 Eukaryotic Cells Degrade mRNAs that Are Incomplete or that Have Premature Stop Codons 456 Summary
458
Bibliography 459
15
The Genetic Code 461 THE CODE IS DEGENERATE
THREE RULES GOVERN THE GENETIC CODE 469
461
Perceiving Order in the Makeup of the Code 462 Wobble in the Anticodon
Three Kinds of Point Mutations Alter the Genetic Code 470
463
Three Codons Direct Chain Termination
How the Code Was Cracked
Genetic Proof that the Code Is Read in Units of Three 471
461
464
SUPPRESSOR MUTATIONS CAN RESIDE IN THE SAME OR A DIFFERENT GENE 471
Stimulation of Amino Acid Incorporation by Synthetic mRNAs 465 Poly-U
Codes for Polyphenylalanine
Intergenic Suppression Involves Mutant tRNAs
466
Nonsense Suppressors also Read Normal Termination Signals 474
Mixed Copolymers Allowed Additional Codon Assignments 467 Transfer RNA Binding to Defined Trinucleotide Codons 468
THE CODE IS NEARLY UNIVERSAL
Codon Assignments from Repeating Copolymers 468
477 Bibliography 477
Proving the Validity of the Genetic Code
474
475
Summary
PART CHAPTER
472
4
REGULATION
479
16
Gene Regulation in Prokaryotes
483
PRINCIPLES OF TRANSCRIPTIONAL REGULATION 483
Some Activators Work by Allostery and Regulate Steps after RNA Polymerase Binding 485
Gene Expression Is Controlled by Regulatory Proteins 483 Many Promoters Are Regulated by Activators that Help RNA Polymerase Bind DNA and by Repressors
Action at a Distance and DNA Looping 486 Cooperative Binding and Allostery Have Many Roles in Gene Regulation 487 Antitermination and Beyond: Not All of Gene
that Block that Binding
484
Regulation Targets Transcription Initiation
487
REGULATION OF TRANSCRIPTION INITIATION: EXAMPLES FROM BACTERIA 488
Ribosomal Proteins Are Translational Repressors of Their Own Synthesis 506
An Activator and a Repressor Together Control the lac Genes 488 CAP and Lac Repressor Have Opposing Effects on RNA Polymerase Binding to the lac Promoter 489 Box 16-1 Detecting DNA-Binding Sites 490 CAP Has Separate Activating and DNA-Binding Surfaces 492
Box 16-4 Riboswitches 509 THE CASE OF PHAGE λ: LAYERS OF REGULATION
512
Alternative Patterns of Gene Expression Control Lytic and Lysogenic Growth 513 Regulatory Proteins and Their Binding Sites
514
λ Repressor Binds to Operator Sites Cooperatively 515 Box 16-5 Concentration, Affinity,
CAP and Lac Repressor Bind DNA Using a Common Structural Motif 493 Box 16-2 Activator Bypass Experiments 493 The Activities of Lac Repressor and CAP Are Controlled Allosterically by Their Signals
and Cooperative Binding 516 Repressor and Cro Bind in Different Patterns to Control Lytic and Lysogenic Growth 517 496
Box 16-3 Jacob, Monod, and the Ideas Behin
Lysogenic Induction Requires Proteolytic Cleavage of λ Repressor 518 Negative Autoregulation of Repressor Requires Long-Distance Interactions and a Large
Combinatorial Control: CAP Controls Other Genes As Well 499
DNA Loop
Alternative σ Factors Direct RNA Polymerase to Alternative Sets of Promoters 499 NtrC and MerR: Transcriptional Activators that Work by Allostery Rather than by Recruitment 500 NtrC Has ATPase Activity and Works from DNA Sites Far from the Gene 500
519
Another Activator, λ cII, Controls the Decision between Lytic and Lysogenic Growth upon Infection of a New Host 520
Box 16-6 Genetic Approaches that Identified Genes Involved in the Lytic/Lysogenic Choice 521 Growth Conditions of E. coli Control the Stability of cII Protein and thus the Lytic/Lysogenic Choice 522
MerR Activates Transcription by Twisting Promoter DNA 501 Some Repressors Hold RNA Polymerase at the Promoter Rather than Excluding It
502
AraC and Control of the araBAD Operon by Antiactivation 503
EXAMPLES OF GENE REGULATION AT STEPS AFTER TRANSCRIPTION INITIATION 504
Transcriptional Antitermination in λ Development 523 Retroregulation: An Interplay of Controls on RNA Synthesis and Stability Determines int Gene Expression 524 Summary
525
Bibliography
526
Amino Acid Biosynthetic Operons Are Controlled by Premature Transcription Termination 504 CHAPTER
17
Gene Regulation in Eukaryotes
529
CONSERVED MECHANISMS OF TRANSCRIPTIONAL REGULATION FROM YEAST TO MAMMALS 531 Activators Have Separate DNA Binding and Activ
Eukaryotic Regulators Use a Range of DNA-Binding Domains, but DNA Recognition Involves the Same Principles as Found in Bacteria
534
Activating Regions Are Not Well-Defined Structures
536
RECRUITMENT OF PROTEIN COMPLEXES TO GENES BY EUKARYOTIC ACTIVATORS
Activators and Repressors Sometimes Come in Pieces 555
537
Activators Recruit the Transcriptional Machinery to
GENE "SILENCING" BY MODIFICATION OF HISTONES AND DNA 556
the Gene 537 Box 17-2 Chromatin Immunoprecipitation 539
Silencing in Yeast Is Mediated by Deacetylation and Methylation of Histones 556
Activators also Recruit Nucleosome Modifiers that Help the Transcription Machinery Bind at the Promoter 540 Action at a Distance: Loops
Histone Modifications and the Histone Code Hypothesis
of Genes Requires Locus Control Regions 543
SIGNAL INTEGRATION AND COMBINATORIAL CONTROL 544 Activators Work Together Synergistically to
Integrate Signals 544 Signal Integration: the HO Gene Is Controlled by Two Regulators; One Recruits Nucleosome Modifiers and the Other Recruits Mediator 546 Signal Integration: Cooperative Binding of Activators at the Human β-Interferon Gene 546 Combinatorial Control Lies at the Heart of the Complexity and Diversity of Eukaryotes 547 Combinatorial Control of the Mating-Type Genes from Saccharomyces cerevisiae 548
TRANSCRIPTIONAL REPRESSORS 549 SIGNAL TRANSDUCTION AND THE CONTROL OF TRANSCRIPTIONAL REGULATORS 551 Signals Are Often Communicated to Transcriptional Regulators through Signal Transduction Pathways 551 Signals Control the Activities of Eukaryotic Transcriptional Regulators in a Variety of Ways 552 CHAPTER
558
DNA Methylation Is Associated with Silenced Genes in Mammalian Cells 558 Some States of Gene Expression Are Inherited through Cell Division even when the Initiating Signal Is No Longer Present 560 Box 17-3 λ Lysogens and the Epigenetic Switch 562
EUKARYOTIC GENE REGULATION AT STEPS AFTER TRANSCRIPTION INITIATION 562 Some Activators Control Transcriptional Elongation rather than Initiation
562
The Regulation of Alternative mRNA Splicing Can Produce Different Protein Products in Different Cell Types 563 Expression of the Yeast Transcriptional Activator Gcn4 Is Controlled at the Level of Translation 565
RNAS IN GENE REGULATION
567
Double-Stranded RNA Inhibits Expression of Genes Homologous to that RNA 568 Short Interfering RNAs (siRNAs) Are Produced from dsRNA and Direct Machinery that Switches Off Genes in Various Ways 568 MicroRNAs Control the Expression of some Genes during Development 570 Summary 571 Bibliography 572
18
Gene Regulation during Development
575
THREE STRATEGIES BY WHICH CELLS ARE
Box 18-1 Microarray Assays: Theory and Practice 577
INSTRUCTED TO EXPRESS SPECIFIC SETS OF GENES DURING DEVELOPMENT 576 Some mRNAs Become Localized within Eggs and Embryos due to an Intrinsic Polarity in the
Gradients of Secreted Signaling Molecules Can Instruct Cells to Follow Different Pathways of Development based on Their Location 578 EXAMPLES OF THE THREE STRATEGIES
Cytoskeleton
576
FOR ESTABLISHING DIFFERENTIAL
Cell-to-Cell Contact and Secreted Cell Signaling Molecules both Elicit Changes in Gene Expression in Neighboring Cells 576
GENE EXPRESSION 580 The Localized Ash1 Repressor Controls Mating Type in Yeast by Silencing the HO Gene 580
Box 18-2 Review of Cytoskeleton: Asymmetry and Growth 582
Cell-t
Segmentation Is Initiated by Localized RNAs at the Anterior and Posterior Poles of the Unfertilized Egg 599 The Bicoid Gradient Regulates the Expression of Segmentation Genes in a Concentration-Dependent Fashion 601
B. subtilis 584 Box 18-3 Overview of Ciona Development 585
Hunchback Expression Is also Regulated at the Level of Translation 602
A Skin-Nerve Regulatory Switch Is Controlled by Notch Signaling in the Insect CNS 587
The Gradient of Hunchback Repressor Establishes Different Limits of Gap Gene Expression 603
A Gradient of the Sonic Hedgehog Morphogen Controls the Formation of Different Neurons in the Vertebrate Neural Tube 588
Hunchback and Gap Proteins Produce Segmentation Stripes of Gene Expression 604 Box 18-6 Bioinformatics Methods for Identification
THE MOLECULAR BIOLOGY OF DROSOPHILA EMBRYOGENESIS 590 An Overview of Drosophila Embryogenesis 590
of Complex Enhancers 605 Gap Repressor Gradients Produce many Stripes
A Morphogen Gradient Controls Dorsal-Ventral Patterning of the Drosophila Embryo 590
Short-Range Transcriptional Repressors Permit Different Enhancers to Work Independently of one Another within the Complex eve Regulatory Region 608
A Localized mRNA Initiates Muscle Differentiation in the Sea Squirt Embryo 584
Box 18-4 Overview of Drosophila Development 592 Box 18-5 The Role of Activator Synergy in Development 597 CHAPTER
of Gene Expression
607
609 Bibliography 610
Summary
19
Comparative Genomics and the Evolution of Animal Diversity MOST ANIMALS HAVE ESSENTIALLY THE SAME GENES
613
Subtle Changes in an Enhancer Sequence Can
614
Produce New Patterns of Gene Expression
623
The Misexpression of Ubx Changes the Morphology of Fruit Fly 624 Changes in Ubx Function Modify the Morphology of Fruit Fly Embryos 626
How Does Gene Duplication Give Rise to Biological Diversity? 616 Box 19-1 Gene Duplication and the Importance of Regulatory Evolution 616 Box 19-2 Duplication of Globin Genes Produces New Expression Patterns and Diverse Protein Functions 618 Box 19-3 Creation of New Genes Drives Bacterial Evolution 618
Changes in Ubx Target Enhancers Can Alter
THREE WAYS GENE EXPRESSION IS CHANGED DURING EVOLUTION 619 EXPERIMENTAL MANIPULATIONS THAT ALTER ANIMAL MORPHOLOGY 620
Arthropods Are Remarkably Diverse
Changes in Pax6 Expression Create Ectopic Eyes Changes in Antp Expression Transform Antennae
into Legs
622
Importance of Protein Function: Interconversion of ftz and Antp 622
me
Patterns of Gene Expression 627 Box 19-4 The Homeotic Genes of Drosophila Are Organized in Special Chromosome Clusters
621
627
MORPHOLOGICAL CHANGES IN CRUSTACEANS AND INSECTS 630 630
Changes in Ubx Expression Explain Modifications in Limbs among the Crust
GENOME EVOLUTION AND HUMAN ORIGINS 635
The Evolutionary Origins of Human Speech
How FOXP2 Fosters Speech in Humans 637
Humans Contain Surprisingly Few Genes 635 The Human Genome Is very Similar to that of the Mouse and Virtually Identical to the Chimp 636
PART
CHAPTER
The Future of Comparative Genome Analysis Summary
Bibliography
640
5 METHODS 643
647
INTRODUCTION 647 NUCLEIC ACIDS 648
Shotgun Sequencing a Bacterial Genome
Electrophoresis through a Gel Separates DNA and RNA Molecules According to Size 648
of Large Genome Sequences 664 Box 20-2 Sequenators Are Used for High Throughput Sequencing 665
at Particular Sites
663
The Shotgun Strategy Permits a Partial Assembly
Restriction Endonucleases Cleave DNA Molecules
649
DNA Hybridization Can Be Used to Identify Specific DNA Molecules
651
Hybridization Probes Can Identify Electrophoretically-Separated DNAs and RNAs 652 Isolation of Specific Segments of DNA DNA Cloning
638
639
20
Techniques of Molecular Biology
637
653
The Paired-End Strategy Permits the Assembly of Large Genome Scaffolds 666 Genome-Wide Analyses 667 Comparative Genome Analysis 669
PROTEINS 672 Specific Proteins Can Be Purified
from Cell Extracts 671
654
Cloning DNA in Plasmid Vectors
654
Vector DNA Can Be Introduced into Host Organisms
by Transformation 655 Libraries of DNA Molecules Can Be Created by Cloning 656
Purification of a Protein Requires a Specific Assay 673
Preparation of Cell Extracts Containing Active Proteins
673
Hybridization Can Be Used to Identify a Specific
Proteins Can Be Separated from One Another Using Column Chromatography 673
Clone in a DNA Library 657 Chemically Synthesized Oligonucleotides
Affinity Chrom
657
The Polymerase Chain Reaction (PCR) Amplifies DNAs by Repeated Rounds of DNA Replication in vitro 658
Separation of Proteins on Polyacrylamide Gels
Nested Sets of DNA Fragments
Protein Molecules Can Be Directly Sequenced Proteomics 677
Reveal Nucleotide Sequences
660
Box 20-1 Forensics and the Polymerase Chain Reaction 661
675
Antibodies Visu
Bibliography
619
676
CHAPTER
21
Model Organisms
681
BACTERIOPHAGE 682 Assays of Phage Growth
THE NEMATODE WORM, Caenorhabditis elegans 696
684
The Single-Step Growth Curve
C.
Phage Crosses and Complementation Tests Transduction and Recombinant DNA
elegans
Has a Very Rapid Life Cycle 696 C. elegans Is Composed of Relatively Few,
685 685
Well Studied Cell Lineages 697
686
The Cell Death Pathway Was Discovered in C. elegans 698 RNAi Was Discovered in C. elegans 698 THE FRUIT FLY, Drosophila melanogaster Drosophila Has a Rapid Life Cycle 699 The First Genome Maps Were Produced
BACTERIA 687 Assays of Bacterial Growth 687 Bacteria Exchange DNA by Sexual Conjugation, Phage-Mediated Transduction, and DNA-Mediated Transformation 688 Bacterial Plasmids Can Be Used as Cloning Vectors 689
in Drosophila
699
700
Transposons Can Be Used to Generate Insertional Mutations and Gene and Operon Fusions 689
Genetic Mosaics Permit the Analysis of Lethal Genes in Adult Flies 702
Studies on the Molecular Biology of Bacteria Have Been Enhanced by Recombinant DNA Technology, Whole-Genome Sequencing, and Tr
The Yeast FLP Recombinase Permits the Efficient
Biochemical Analysis Is Especially Powerful in Simple Cells with Well-Developed Tools of Traditional and Molecular Genetics 691
THE HOUSE MOUSE, Mus musculus 705 Mouse Embryonic Development Depends on Stem Cells 706 It Is Easy to Introduce Foreign DNA
Bacteria Are Accessible to Cytological Analysis
691
Phage and Bacteria Told Us Most of the Fundamental Things about the Gene 692
BAKER'S YEAST, Saccharomyces cerevisiae 693 The Existence of Haploid and Diploid Cells Facilitate Genetic Analysis of S. cerevisiae 693 Generating Precise Mutations in Yeast Is Easy
Production of Genetic Mosaics
It Is Easy to Create Transgenic Fruit Flies that Carry Foreign DNA 703
into the Mouse Embryo
Index
707
Homologous Recombination Permits the Selective Ablation of Individual Genes Mice Exhibit Epigenetic Inheritance Bibliography
694
S. cerevisiae Has a Small, Well-Characterized Genome 694 S. cerevisiae Cells Change Shape as They Grow
703
695
713
71 J
709
707
Class Testers and Reviewers

We wish to thank all of the instructors for their thoughtful suggestions and comments, including:

Chapter Reviewers
Ann Aguanno, Marymount Manhattan College
Charles F. Austerberry, Creighton University
David G. Bear, University of New Mexico Health Sciences Center
Margaret E. Beard, Holy Cross
Gail S. Begley, Northeastern University
Sanford Bernstein, San Diego State University
Michael Blaber, Florida State University
Nicole Bournias, California State University, San Bernardino
John Boyle, Mississippi State University
Suzanne Bradshaw, University of Cincinnati
John C. Burr, University of Texas at Dallas
Michael A. Campbell, Pennsylvania State University, Erie, The Behrend College
Shirley Coomber, King's College, University of London
Anne Cordon, University of Toronto
Sumana Datta, Texas A&M University
Jeff DeJong, University of Texas at Dallas
Jurgen Denecke, University of Leeds
Susan M. DiBartolomeis, Millersville University
Santosh R. D'Mello, University of Texas at Dallas
Robert J. Duronio, University of North Carolina, Chapel Hill
Steven W. Edwards, University of Liverpool
Allen Gathman, Southeast Missouri State University
Anthony D. M. Glass, University of British Columbia
Elliott S. Goldstein, Arizona State University
Ann Grens, Indiana University, South Bend
Gregory B. Hecht, Rowan University
Robert B. Helling, University of Michigan
David C. Higgs, University of Wisconsin, Parkside
Mark Kainz, Colgate University
Gregory M. Kelly, University of Western Ontario
Ann Kleinschmidt, Allegheny College
Dan Krane, Wright State University
Mark Levinthal, Purdue University
Gary J. Lindquester, Rhodes College
Curtis Loer, University of San Diego
Virginia McDonough, Hope College
Michael J. McPherson, University of Leeds
Victoria Meller, Tufts University
William L. Miller, North Carolina State University
Dragana Miskovic, University of Waterloo
David Mullin, Tulane University
Jeffrey D. NeWm
About the CD and Website

The student CD-ROM for Molecular Biology of the Gene provides resources to help students visualize difficult concepts, explore complex processes, and review their understanding of the most challenging material presented in this course. This easy-to-use electronic resource provides students with rapid access to twenty interactive tutorials, thirteen structural animations, and critical thinking exercises that can be assigned by instructors. The tutorials contain animations that are broken out step by step, so that students can focus on mastering one element at a time. Every tutorial concludes with an "Apply Your Knowledge" activity, where students are presented with a problem and then guided through to the solution with interactive animations and multiple choice questions. The structural animations run in CHIME, an application that automatically converts the information needed to define the three-dimension
PART 1
CHEMISTRY AND GENETICS

OUTLINE
• Chapter 1 The Mendelian View of the World
• Chapter 2 Nucleic Acids Convey Genetic Information
• Chapter 3 The Importance of Weak Chemical Interactions
• Chapter 4 The Importance of High-Energy Bonds
• Chapter 5 Weak and Strong Bonds Determine Macromolecular Structure
nlike Ihe resl ofthis book. Ihe Ilve chapters lhat make up Part 1 contain materiallargely unchanged from earlier editions. This is because the malerial remains as importan' as over-even in these days of genome sequcncing. Specillcally. Chaplers 1 and 2 provide an historical accounl of how thc fleld of genelics and the molecular basis of genetics was established. Key ideas and experiments are described. Chapters 3, 4. and 5 present Ihe chemistry lhal li es al tha heart of molecular bíology. We will discuss the fundamental chemical principIes Ihat underlie Ihe structliTcs of the macromolecules tbal figure so prominenlJy throughout the rest of Ihe book-ONA, RNA. and prote¡n-snd. the interactions between Ihose molecuJes. While the bulk of Ihe material is releined from earlier edítions. some of it has been reorganized and more recent examples have been ¡nduded. Chapler 1 addresses the founding evenls in the history of gcnelics from the dassic work oí Gregor Mende! up to that of Oswald T. Avery. We will discuss cverylhing froro MondeJ's famOlls experimenls on peas. which uncovered the basic laws of heredity, lo Avery's shocking (al Ihe time) revelation tha! DNA is the genetic material. Chilpler 2. covers the subsequent revolution of molecular biology. [rom Walson and Crick's proposal that the structure of ONA is a double helix, through the elucidation of the genetic code and the "central dogma" (ONA "makes" RNA which ·'makes" protein). This chapler condudes with a discussion of recent developments stcmming from the com· plete sequencing of the genomes of maoy organisms. and the impact Ihis has on modorn biology. The basic chemistry presented in Chapters 3 through 5 rocuses on the nature of chemical bond s-both weak and strong-and describes Iheir roles in biology. Our discussion opens. in Chapler 3, with weak chemicaJ interactions. namely hydrogen bonds, and van der Waals and hydrophobk inleractions. 
These rorces mediate most interactions be· tween macromolecules-between proteins. or bP.tween proteins and DNA. for example. Tbese wcak boncls are critical for Ihe activity and reguJalion of the majority of cellular processes. Thu s. e nzymes bind Iheir substrates using weak chemical intcractionsj and transcriplional regulators bind siles on DNA lo switch genes on and off usíng Ihe same c1ass ol bonds. Individual weak interactions are very weak índeed. and thus díssociate quickly afler forming. This reversibility is important for their roles in biology. Inside cells, molecules musl interncl dynamically (reversiblyl or the whole system wouJd seize up. At the same time. certain inleractions must, at least in the short lerm. be stable. To accom· modate these apparently conmcting demands. multiple weak internctions tend lo be used together. Strong bonds ho1d together the componenls lhal make up each macromólecule. 'Thus. proteins are made up of amino acids Iinked in a specific order by slrong bonds. alld ONA is mede up of similarly Iinked nucleolides. (The atoms thal make up the ami no acids and nudeolides are also ¡oined together by strong bonds.) These bonds IJre described in Chapter 4. In Chapler 5, we see how the strong and weak bonds together give macromolecules distinctive three-dimensional shapes (and thus bestow upon them specific functions). Thus. just as weak bonds mediate interacHons between macromolecules. so lOO they Ret betwcen. for ex::imple. nonadjacenl amino acids wilhin a given protein. In so doing. lhey determine how the primary chain of amino acids folds into a
three-dimensional shape. Likewise, it is weak bonds that hold together the two chains of the DNA molecule.

We also consider, in Chapter 5, how the function of a protein can be regulated. One way is by changing the shape of the protein, a mechanism called allosteric regulation. Thus, in one conformation, a given protein may perform a specific enzymatic function, or bind a specific target molecule. In another conformation, however, it may lose that ability. Such a change in shape can be triggered by the binding of another protein or a small molecule such as a sugar. In other cases, an allosteric effect can be induced by a covalent modification. For example, attaching one or more phosphate groups to a protein can trigger a change in the shape of that protein. Another way a protein can be controlled is by regulating when it is brought into contact with a target molecule. In this way a given protein can be recruited to work on different target proteins in response to different signals.
PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES

Vernon Ingram, Marshall Nirenberg, and Matthias Staehelin, 1963 Symposium on Synthesis and Structure of Macromolecules. Ingram demonstrated that genes control the amino acid sequence of proteins; the mutation causing sickle- ...

Raymond Appleyard, George Bowen, and Martha Chase, 1953 Symposium on Viruses. Appleyard and Bowen, both phage geneticists, are here shown with Chase, who, in 1952, together with Alfred Hershey, did the simple experiment that finally convinced most people that the genetic material is DNA (Chapter 2).

Melvin Calvin, Francis Crick, George Gamow, and James Watson, 1963 Symposium on Synthesis and Structure of Macromolecules. Calvin won the 1961 Nobel Prize for his work on CO2 assimilation by plants. For their proposed structure of DNA, Crick and Watson shared in the 1962 Nobel Prize for Medicine (Chapter 2). Gamow, a physicist attracted to the problem of the genetic code (Chapters 2 and 14), founded an informal group of like-minded scientists called the RNA Tie Club. (He is wearing the club tie, which he designed, in this picture.)

Max Perutz, 1971 Symposium on Structure and Function of Proteins at the Three-Dimensional Level. Perutz shared, with John Kendrew, the 1962 Nobel Prize for Chemistry; using X-ray crystallography, and after 25 years of effort, they were the first to solve the atomic structures of proteins, hemoglobin and myoglobin respectively (Chapter 5).

Calvin Bridges, 1934 Symposium on Aspects of Growth. Bridges (shown reading the newspaper) was part of T.H. Morgan's famous "fly group" that pioneered the development of the fruit fly Drosophila as a model genetic organism (Chapters 1 and 21). With him is Dr. J. Buchholtz.

Joan Steitz and Fritz Lipmann, 1969 Symposium on The Me ... of this book. Lipmann showed that the high-energy phosphate group in ATP is the source of energy that drives many biological processes (Chapter 4). For this he shared in the 1953 Nobel Prize for Medicine.
CHAPTER 1
The Mendelian View of the World
It is easy to consider human beings unique among living organisms. We alone have developed complicated languages that allow meaningful and complex interplay of ideas and emotions. Great civilizations have developed and changed our world's environment in ways inconceivable for any other form of life. There has always been a tendency, therefore, to think that something special differentiates humans from every other species. This belief has found expression in the many forms of religion through which we seek the origin and explore the reasons for our existence and, in so doing, try to create workable rules for conducting our lives. Little more than a century ago, it seemed natural to think that, just as every human life begins and ends at a fixed time, the human species and all other forms of life must also have been created at a fixed moment.

This belief was first seriously questioned 140 years ago, when Charles Darwin and Alfred R. Wallace proposed their theories of evolution, based on the selection of the most fit. They stated that the various forms of life are not constant but continually give rise to slightly different animals and plants, some of which adapt to survive and multiply more effectively. At the time of this theory, they did not know the origin of this continuous variation, but they did correctly realize that these new characteristics must persist in the progeny if such variations are to form the basis of evolution.

At first, there was a great furor against Darwin, most of it coming from people who did not like to believe that humans and the rather obscene-looking apes could have a common ancestor, even if this ancestor had lived some 10 million years ago. There was also initial opposition from many biologists who failed to find Darwin's evidence convincing. Among these was the famous naturalist Jean L. Agassiz, then at Harvard, who spent many years writing against Darwin and Darwin's champion, Thomas H. Huxley, the most successful of the popularizers of evolution. But by the end of the nineteenth century, the scientific argument was almost complete; both the current geographic distribution of plants and animals and their selective occurrence in the fossil records of the geologic past were explicable only by postulating that continuously evolving groups of organisms had descended from a common ancestor. Today, evolution is an accepted fact for everyone except a fundamentalist minority, whose objections are based not on reasoning but on doctrinaire adherence to religious principles.

An immediate consequence of Darwinian theory is the realization that life first existed on our Earth more than 4 billion years ago in a simple form, possibly resembling the bacteria, the simplest variety of life known today. The existence of such small bacteria tells us that the essence of the living state is found in very small organisms. Evolutionary theory further suggests that the basic principles of life apply to all living forms.
OUTLINE
• Mendel's Discoveries
• Chromosomal Theory of Heredity
• Gene Linkage and Crossing Over
• Chromosome Mapping
• The Origin of Genetic Variability through Mutations
• Early Speculations about What Genes Are and How They Act
• Preliminary Attempts to Find a Gene-Protein Relationship
MENDEL'S DISCOVERIES

Gregor Mendel's experiments traced the results of breeding experiments (genetic crosses) between strains of peas differing in well-defined characteristics, like seed shape (round or wrinkled), seed color (yellow or green), pod shape (inflated or wrinkled), and stem length (long or short). His concentration on well-defined differences was of great importance; many breeders had previously tried to follow the inheritance of more gross qualities, like body weight, and were unable to discover any simple rules about their transmission from parents to offspring (see Box 1-1, Mendelian Laws).

The Principle of Independent Segregation

After ascertaining that each type of parental strain bred true (that is, produced progeny with particular qualities identical to those of the parents), Mendel performed a number of crosses between parents (P) differing in single characteristics (such as seed shape or seed color).
Box 1-1 Mendelian Laws

The most striking attribute of a living cell is its ability to transmit hereditary properties from one cell generation to another. The existence of heredity must have been noticed by early humans, who witnessed the passing of characteristics, like eye or hair color, from parents to offspring. Its physical basis, however, was not understood until the first years of the twentieth century, when, during a remarkable period of creative activity, the chromosomal theory of heredity was established.

Hereditary transmission through the sperm and egg became known by 1860, and in 1868 Ernst Haeckel, noting that sperm consists largely of nuclear material, postulated that the nucleus is responsible for heredity. Almost 20 years passed before the chromosomes were singled out as the active factors, because the details of mitosis, meiosis, and fertilization had to be worked out first. When this was accomplished, it could be seen that, unlike other cellular constituents, the chromosomes are equally divided between daughter cells. Moreover, the complicated chromosomal changes that reduce the sperm and egg chromosome number to the haploid number during meiosis became understandable as necessary for keeping the chromosome number constant. These facts, however, merely suggested that chromosomes carry heredity.

Proof came at the turn of the century with the discovery of the basic rules of heredity. The concepts were first proposed by Gregor Mendel in 1865 in a paper entitled "Experiments on Plant Hybrids" given to the Natural Science Society at Brno. In his presentation, Mendel described in great detail the patterns of transmission of traits in pea plants (which we discuss in detail below), his conclusions about the principles of heredity, and their relevance to the controversial theories of evolution. The climate of scientific opinion, however, was not favorable, and these ideas were completely ignored, despite some early efforts on Mendel's part to interest the prominent biologists of his time. In 1900, 16 years after Mendel's death, three plant breeders working independently on different systems confirmed the significance of Mendel's forgotten work. Hugo de Vries, Karl Correns, and Erich Tschermak, all doing experiments related to Mendel's, reached similar conclusions before they knew of Mendel's work.
All the progeny (F1 = first filial generation) had the appearance of one parent only. For example, in a cross between peas having yellow seeds and peas having green seeds, all the progeny had yellow seeds. The trait that appears in the F1 progeny is called dominant, whereas the trait that does not appear in F1 is called recessive.

The meaning of these results became clear when Mendel set up genetic crosses between F1 offspring. These crosses gave the important result that the recessive trait reappeared in approximately 25% of the F2 progeny, whereas the dominant trait appeared in 75% of those offspring. For each of the seven traits he followed, the ratio in F2 of dominant to recessive traits was always approximately 3:1. When these experiments were carried to a third (F3) progeny generation, all the F2 peas with recessive traits bred true (produced progeny with the recessive traits). Those with dominant traits fell into two groups: one-third bred true (produced only progeny with the dominant trait); the remaining two-thirds again produced mixed progeny in a 3:1 ratio of dominant to recessive.

Mendel correctly interpreted his results as follows (Figure 1-1): the various traits are controlled by pairs of factors (which we now call genes), one factor derived from the male parent, the other from the female. For example, pure-breeding strains of round peas contain two versions (or alleles) of the roundness gene (RR), whereas pure-breeding wrinkled strains have two copies of the wrinkledness (rr) allele. The round-strain gametes each have one gene for roundness (R); the wrinkled-strain gametes each have one gene for wrinkledness (r). In a cross between RR and rr, fertilization produces an F1 plant with both alleles (Rr). The seeds look round because R is dominant over r. We refer to the appearance or physical structure of an individual as its phenotype, and to its genetic composition as its genotype. Individuals with identical phenotypes may possess different genotypes; thus, to determine the genotype of an organism, it is frequently necessary to perform genetic crosses for several generations. The term homozygous refers to a gene pair in which both the maternal and paternal genes are identical (for example, RR or rr). In contrast, those gene pairs in which paternal and maternal genes are different (for example, Rr) are called heterozygous.

One or several letters or symbols may be used to represent a particular gene. The dominant allele of the gene may be indicated by a capital letter (R), by a superscript + (r+), or by a + standing alone. In our discussions here, we use the first convention, in which the dominant allele is represented by a capital letter.
FIGURE 1-1 How Mendel's first law (independent segregation) explains the 3:1 ratio of dominant to recessive phenotypes among the F2 progeny. R represents the dominant gene and r the recessive gene. The round seed represents the dominant phenotype, the wrinkled seed the recessive phenotype.
... independently transmitted and so are able to segregate independently during the formation of sex cells. This principle of independent segregation is frequently referred to as Mendel's first law.
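The reasoning behind the 3:1 ratio can be checked by enumerating gamete pairings. The sketch below is only an illustration, assuming that each parent contributes one of its two alleles with equal probability and that R is completely dominant over r; the function names `gametes`, `cross`, and `phenotype` are hypothetical helpers, not anything defined in the text.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Return the gametes of a one-gene genotype such as 'Rr': one allele each."""
    return list(genotype)


def cross(parent1, parent2):
    """Count offspring genotypes over all equally likely gamete pairings."""
    offspring = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # Sorting puts the uppercase allele first, so 'rR' is counted as 'Rr'.
        offspring["".join(sorted(g1 + g2))] += 1
    return offspring


def phenotype(genotype):
    """Assume complete dominance: one R allele is enough for round seeds."""
    return "round" if "R" in genotype else "wrinkled"


f1 = cross("RR", "rr")   # pure-breeding round x pure-breeding wrinkled
f2 = cross("Rr", "Rr")   # F1 plants crossed among themselves

f2_phenotypes = Counter()
for genotype, count in f2.items():
    f2_phenotypes[phenotype(genotype)] += count

print(dict(f1))             # every F1 plant is Rr
print(dict(f2))             # genotypes RR, Rr, rr
print(dict(f2_phenotypes))  # round versus wrinkled
```

Under these assumptions the F2 genotypes fall out as one RR, two Rr, and one rr per four pairings, which is also why one-third of the dominant-looking F2 plants breed true and two-thirds do not.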
Some Alleles Are Neither Dominant Nor Recessive

In the crosses reported by Mendel, one member of each gene pair was clearly dominant to the other. Such behavior, however, is not universal. Sometimes the heterozygous phenotype is intermediate between the two homozygous phenotypes. For example, the cross between a pure-breeding red snapdragon (Antirrhinum) and a pure-breeding white variety gives F1 progeny of the intermediate pink color. If these F1 progeny are crossed among themselves, the resulting F2 progeny contain red, pink, and white flowers in the proportion of 1:2:1 (Figure 1-2). Thus, it is possible here to distinguish heterozygotes from homozygotes by their phenotype. We also see that Mendel's laws do not depend on whether one allele of a gene pair is dominant over the other.
FIGURE 1-2 The inheritance of flower color in the snapdragon. One parent is homozygous for red flowers (AA) and the other homozygous for white flowers (aa). No dominance is present, and the heterozygous F1 flowers are pink. The 1:2:1 ratio of red, pink, and white flowers in the F2 progeny is shown by appropriate coloring.

Principle of Independent Assortment
Mendel extended his breeding experiments to peas differing by more than one characteristic. As before, he started with two strains of peas, each of which bred pure when mated with itself. One of the strains had round yellow seeds; the other, wrinkled green seeds. Since round and yellow are dominant over wrinkled and green, the entire F1 generation produced round yellow seeds. The F1 generation was then crossed within itself to produce a number of F2 progeny, which were examined for seed appearance (phenotype). In addition to the two original phenotypes (round yellow; wrinkled green), two new types (recombinants) emerged: wrinkled yellow and round green.

Again Mendel found he could interpret the results by the postulate of genes, if he assumed that each gene pair was independently transmitted to the gamete during sex-cell formation. This interpretation is shown in Figure 1-3. Any one gamete contains only one type of allele from each gene pair. Thus, the gametes produced by an F1 (RrYy) will have the composition RY, Ry, rY, or ry, but never Rr, Yy, YY, or RR. Furthermore, in this example, all four possible gametes are produced with equal frequency. There is no tendency of genes arising from one parent to stay together. As a result, the F2 progeny phenotypes appear in the ratio nine round yellow, three round green, three wrinkled yellow, and one wrinkled green, as depicted in the Punnett square, named after the British mathematician who introduced it, in the lower part of Figure 1-3. This principle of independent assortment is frequently called Mendel's second law.
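The Punnett-square argument for the 9:3:3:1 ratio can be reproduced by listing the four F1 gametes and pairing them in all sixteen ways. This is a minimal sketch, assuming equal gamete frequencies and complete dominance of R and Y as described above; the genotype string layout ('RrYy') and the helper names are illustrative choices, not part of the text.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Gametes of a two-gene genotype like 'RrYy': one allele from each pair."""
    shape_pair, color_pair = genotype[:2], genotype[2:]
    return [s + c for s, c in product(shape_pair, color_pair)]


def phenotype(shape_alleles, color_alleles):
    """Complete dominance assumed: R (round) over r, Y (yellow) over y."""
    shape = "round" if "R" in shape_alleles else "wrinkled"
    color = "yellow" if "Y" in color_alleles else "green"
    return f"{shape} {color}"


def f2_phenotype_counts(f1_genotype):
    """Fill in a Punnett square for a self-cross and tally the phenotypes."""
    counts = Counter()
    for g1, g2 in product(gametes(f1_genotype), repeat=2):
        # Position 0 of each gamete is the shape allele, position 1 the color allele.
        counts[phenotype(g1[0] + g2[0], g1[1] + g2[1])] += 1
    return counts


f1_gametes = gametes("RrYy")
f2 = f2_phenotype_counts("RrYy")
print(f1_gametes)  # the four gamete types, never Rr, Yy, YY, or RR
print(dict(f2))    # phenotype counts out of 16 squares
```

Because each gamete is built by taking exactly one allele from each pair, combinations such as Rr or YY cannot arise, which mirrors the statement that a gamete carries only one allele of each gene.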
FIGURE 1-3 How Mendel's second law (independent assortment) operates. In this example, the inheritance of yellow (Y) and green (y) seed color is followed together with the inheritance of round (R) and wrinkled (r) seed shape. The genotypes of the various parents and progeny are indicated by letter combinations, and four different phenotypes are distinguished by appropriate shading.

CHROMOSOMAL THEORY OF HEREDITY

A principal reason for the original failure to appreciate Mendel's discovery was the absence of firm facts about the behavior of chromosomes during meiosis and mitosis. This knowledge was available, however, when Mendel's laws were confirmed in 1900 and was seized upon in 1903 by American biologist Walter S. Sutton. In his classic paper "The Chromosomes in Heredity," Sutton emphasized the importance of the fact that the diploid chromosome group consists of two morphologically similar sets and that, during meiosis, every gamete receives only one chromosome of each homologous pair. He then used this fact to explain Mendel's results by assuming that genes are parts of the chromosome. He postulated that the yellow- and green-seed genes are carried on a certain pair of chromosomes and that the round- and wrinkled-seed genes are carried on a different pair. This hypothesis immediately explains the experimentally observed 9:3:3:1 segregation ratios. Although Sutton's paper did not prove the chromosomal theory of heredity, it was immensely important, for it brought together for the first time the independent disciplines of genetics (the study of breeding experiments) and cytology (the study of cell structure).
GENE LINKAGE AND CROSSING OVER

Mendel's principle of independent assortment is based on the fact that genes located on different chromosomes behave independently during meiosis. Often, however, two genes do not assort independently because they are located on the same chromosome (linked genes; see Box 1-2, Genes Are Linked to Chromosomes). Many
Box 1-2 Genes Are Linked to Chromosomes

Initially, all breeding experiments used genetic differences already existing in nature. For example, Mendel used seeds obtained from seed dealers, who must have obtained them from farmers. The existence of alternative forms of the same gene (alleles) raises the question of how they arose. One obvious hypothesis states that genes can change (mutate) to give rise to new genes (mutant genes). This hypothesis was first seriously tested, beginning in 1908, by the great American biologist Thomas Hunt Morgan and his young collaborators, geneticists Calvin B. Bridges, Hermann J. Muller, and Alfred H. Sturtevant. They worked with the tiny fly Drosophila melanogaster. The first mutant found was a male with white eyes instead of the normal red eyes. The white-eyed variant appeared spontaneously in a culture bottle of red-eyed flies. Because essentially all Drosophila found in nature have red eyes, the gene leading to red eyes was referred to as the wild-type gene; the gene leading to white eyes was called a mutant gene (allele).

The white-eye mutant gene was immediately used in breeding experiments (Box 1-2 Figure 1), with the striking result that the behavior of the allele completely paralleled the distribution of an X chromosome (that is, was sex-linked). This finding immediately suggested that this gene might be located on the X chromosome, together with those genes controlling sex. This hypothesis was quickly confirmed by additional genetic crosses using newly isolated mutant genes. Many of these additional mutant genes also were sex-linked.
Box 1-2 FIGURE 1 The inheritance of a sex-linked gene in Drosophila. Genes located on sex chromosomes can express themselves differently in male and female progeny, because if there is only one X chromosome present, recessive genes on this chromosome are always expressed. Here are two crosses, both involving a recessive gene (w for white eye) located on the X chromosome. (a) The male parent is a white-eyed (wY) fly, and the female is homozygous for red eyes (WW). (b) The male has red eyes (WY) and the female white eyes (ww). The letter Y stands here not for an allele, but for the Y chromosome, present in male Drosophila in place of a homologous X chromosome. There is no gene on the Y chromosome corresponding to the w or W gene on the X chromosome.
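The two crosses in Box 1-2 Figure 1 can be worked through by pairing each egg with each sperm. The sketch below is illustrative only: it assumes equal gamete frequencies, uses the box's notation (W, w, and Y for the Y chromosome), and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def eye_color(genotype):
    """W (red) is dominant over w (white); the Y chromosome carries no eye-color gene."""
    return "red" if "W" in genotype else "white"


def sex(genotype):
    """In Drosophila, flies carrying a Y chromosome are male."""
    return "male" if "Y" in genotype else "female"


def cross(mother, father):
    """Pair each maternal gamete with each paternal gamete, all equally likely.

    Genotypes are two-character strings: 'WW', 'Ww', or 'ww' for females and
    'WY' or 'wY' for males, where 'Y' marks the Y chromosome, not an allele.
    """
    tally = Counter()
    for egg, sperm in product(mother, father):
        child = egg + sperm
        tally[(sex(child), eye_color(child))] += 1
    return tally


cross_a = cross(mother="WW", father="wY")  # (a) red female x white male
cross_b = cross(mother="ww", father="WY")  # (b) white female x red male
f2_a = cross(mother="Ww", father="WY")     # F1 of cross (a) mated together

print(dict(cross_a))
print(dict(cross_b))
print(dict(f2_a))
```

In cross (b) every son is white-eyed because his single X chromosome comes from the white-eyed mother, which is the sex-linked pattern the box describes.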
examples of nonrandom assortment were found as soon as a large number of mutant genes became available for breeding analysis. In every well-studied case, the number of linked groups was identical with the haploid chromosome number. For example, there are four groups of linked genes in Drosophila and four morphologically distinct chromosomes in a haploid cell.

Linkage, however, is in effect never complete. The probability that two genes on the same chromosome will remain together during meiosis ranges from just less than 100% to nearly 50%. This variation in linkage suggests that there must be a mechanism for exchanging genes on homologous chromosomes. This mechanism is called crossing over. Its cytological basis was first described by Belgian cytologist F. A. Janssens. At the start of meiosis, through the process of synapsis, the homologous chromosomes form pairs with their long axes parallel. At this stage, each chromosome has duplicated to form two chromatids. Thus, synapsis brings together four chromatids (a tetrad), which coil about one another. Janssens postulated that, possibly because of tension resulting from this coiling, two of the chromatids might sometimes break at a corresponding place on each. These events could create four broken ends, which might rejoin crossways, so that a section of each of the two chromatids would be joined to a section of the other (Figure 1-4). In this manner, recombinant chromatids might be produced that contain a segment derived from each of the original homologous chromosomes. Formal proof of Janssens's hypothesis that chromosomes physically interchange material during synapsis came more than 20 years later, when in 1931, Barbara McClintock and Harriet B. Creighton, working at Cornell University with the corn plant Zea mays, devised an elegant cytological demonstration of chromosome breakage and rejoining (Figure 1-5).
FIGURE 1-4 Janssens's hypothesis of crossing over. (Panel labels: synapsis of duplicated chromosomes to form tetrads; each chromatid breaks at point of contact and fuses with a portion of ...)

FIGURE 1-5 Demonstration of physical exchanges between homologous chromosomes. In most organisms, pairs of homologous chromosomes have identical shapes. Occasionally, however, the two members of a pair are not identical; one is marked by the presence of extrachromosomal material or compacted regions that reproducibly form knob-like structures. McClintock and Creighton found one such pair and used it to show that crossing over involves actual physical exchanges between the paired chromosomes. In the experiment shown here, the homozygous c, wx progeny had to arise by crossing over between the C and wx loci. When such c, wx offspring were cytologically examined ...
CHROMOSOME MAPPING

Thomas Hunt Morgan and his students, however, did not await formal cytological proof of crossing over before exploiting the implication of Janssens's hypothesis. They reasoned that genes located close together on a chromosome would assort with one another much more regularly (close linkage) than genes located far apart on a chromosome. They immediately saw this as a way to locate (map) the relative positions of genes on chromosomes and thus to produce a genetic map. The way they used the frequencies of the various recombinant classes is very straightforward. Consider the segregation of three genes all located on the same chromosome. The arrangement of the genes can be determined by means of three crosses, in each of which two genes are followed (two-factor crosses). A cross between AB and ab yields four progeny types: the two parental genotypes (AB and ab) and two recombinant genotypes (Ab and aB). A cross between AC and ac similarly gives two parental combinations as well as the Ac and aC recombinants, whereas a cross between BC and bc produces the parental types and the recombinants Bc and bC. Each cross will produce a specific ratio of parental to recombinant progeny. Consider, for example, the fact that the first cross gives 30% recombinants, the second cross 10%, and the third cross 25%. This tells us that genes a and c are closer together than a and b or b and c and that the genetic distances between a and b and b and c are more similar. The gene arrangement that best fits these data is a-c-b (Figure 1-6).

The correctness of gene order suggested by crosses of two gene factors can usually be unambiguously confirmed by three-factor crosses. When the three genes used in the preceding example are followed in the cross ABC x abc, six recombinant genotypes are found (Figure 1-7). They fall into three groups of reciprocal pairs. The rarest of these groups arises from a double crossover.
By looking for the least frequent class, it is often possible to instantly confirm (or deny) a postulated arrangement. The results in Figure 1-7 immediately confirm the order hinted at by the two-factor crosses. Only if the order is a-c-b does the fact that the rare recombinants are AcB and aCb make sense.

The existence of multiple crossovers means that the amount of recombination between the outside markers a and b (ab) is usually less than the sum of the recombination frequencies between a and c (ac) and c and b (cb). To obtain a more accurate approximation of the distance between the outside markers, we calculate the probability (ac x cb) that when a crossover occurs between c and b, a crossover also occurs between a and c, and vice versa (cb x ac). This probability subtracted from the sum of the frequencies expresses more accurately the amount of recombination. The simple formula

    ab = ac + cb - 2(ac)(cb)

is applicable in all cases where the occurrence of one crossover does not affect the probability of another crossover. Unfortunately,
FIGURE 1-6 Assignment of the tentative order of three genes on the basis of three two-factor crosses. (Recombination shown: a to b, 30%; c to b, 25%; a to c, 10%.)

FIGURE 1-7 The use of three-factor crosses to assign gene order. The least frequent pair of reciprocal recombinants must ... (Progeny classes shown for the cross ABC x abc: parental types, 67.5%; 7.5%, this class arises from a break between a and c; 22.5%, this class arises from a break between c and b; 2.5%, this class arises from a break between a and c and c and b.)
accurate mapping is often disturbed by interference phenomena, which can either increase or decrease the probability of correlated crossovers.

Using such reasoning, the Columbia University group headed by Morgan had by 1915 assigned locations to more than 85 mutant genes in Drosophila (Table 1-1), placing each of them at distinct spots on one of the four linkage groups, or chromosomes. Most importantly, all the genes on a given chromosome were located on a line. The gene arrangement was strictly linear and never branched. The genetic map of one of the chromosomes of Drosophila is shown in Figure 1-8. Distances between genes on such a map are measured in map units, which are related to the frequency of recombination between the genes. Thus, if the frequency of recombination between two genes is found to be 5%, the genes are said to be separated by five map units. Because of the high probability of double crossovers between widely spaced genes, such assignments of map units can be considered accurate only if recombination between closely spaced genes is followed.

Even when two genes are at the far ends of a very long chromosome, they assort together at least 50% of the time because of multiple crossovers. The two genes will be separated if an odd number of crossovers occurs between them, but they will end up together if an even number occurs between them. Thus, in the beginning of the genetic analysis of Drosophila, it was often impossible to determine whether two genes were on different chromosomes or at the opposite ends of one long chromosome. Only after large numbers of genes had been mapped was it possible to demonstrate convincingly that the number of linkage groups equalled the number of cytologically visible chromosomes. In 1915, Morgan, with his students Alfred H. Sturtevant, Hermann J. Muller, and Calvin B. Bridges, published their definitive book The Mechanism of Mendelian Heredity, which first announced the general validity of the chromosomal basis of heredity. We now rank this concept, along with the theories of evolution and the cell, as a major achievement in our quest to understand the nature of the living world.
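The mapping arithmetic used in this section can be checked with a few lines of code. This is a sketch under stated assumptions: the two-factor frequencies (30%, 10%, 25%) come from the text, the three-factor class percentages are read from Figure 1-7 (with the 7.5% class taken to be the a-c single crossover), crossovers are treated as independent as the formula requires, and the function names are hypothetical.

```python
def middle_gene(pairwise):
    """Return the gene lying between the other two.

    `pairwise` maps gene pairs to recombination fractions; the pair with the
    largest fraction is taken to be the two outside markers.
    """
    outer_pair = max(pairwise, key=pairwise.get)
    all_genes = {gene for pair in pairwise for gene in pair}
    (middle,) = all_genes - set(outer_pair)
    return middle


def outside_recombination(ac, cb):
    """ab = ac + cb - 2(ac)(cb), valid only when crossovers are independent."""
    return ac + cb - 2 * ac * cb


# Two-factor crosses from the text.
two_factor = {("a", "b"): 0.30, ("a", "c"): 0.10, ("b", "c"): 0.25}
order_middle = middle_gene(two_factor)
predicted_ab = outside_recombination(ac=0.10, cb=0.25)

# Three-factor classes (fractions of progeny); class labels are assumptions.
classes = {"parental": 0.675, "single_ac": 0.075, "single_cb": 0.225, "double": 0.025}
ac = classes["single_ac"] + classes["double"]      # every break between a and c
cb = classes["single_cb"] + classes["double"]      # every break between c and b
ab = classes["single_ac"] + classes["single_cb"]   # doubles restore the parental a-b pairing

print(order_middle)             # the gene placed in the middle
print(round(predicted_ab, 4))   # formula applied to ac and cb
print(round(ac, 4), round(cb, 4), round(ab, 4))
```

The simple sum ac + cb would give 35%, whereas both the formula and the three-factor classes give 30% for the outside markers, because double crossovers are counted in ac and in cb but leave a and b in their parental combination.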
TABLE 1-1 The 85 Mutant Genes Reported in Drosophila melanogaster in 1915*
Name
Region Affected
Name
RegiO" Affected
Body, dealh Wing Venation
GlOup I
Abnormal
Abdomen
Ba,
Eye
Lethal, 13 Minialure
Bifid Bow Cherry Chrome C leff
Venation Wing Eyecolor BodycolQt
Notch Reduplicated Auby Rudtmentary
VenaliOfl
Club Depressed
Wing Wing n ioral(
Sable Shilted
Donro
Sro" Skee SpocO Spo.
Eye color
LOO Wing
Body color Venation Wing Wing
W¡ng
Eosln Facel FOlked Furrovved Fusad
Eye color Ornmalidia Sptne
Grecn
Body color Wing Bodycolor
While Yellow
Eye color
Jaunty Limited little crossovel MOlula
Wing Abdominal band Chromosome 2 Orrma tidia
O live Plexus Purple
Body color Venahon Eye rolar ThoralO; rTlil rk
JaunlY lemon
Eye VenahOfl
Tan Truncate Vermilíon
Body color Antenno wing Eye color Bodycolo r
Group2 Anllered AplelQUs Ale Balloon
Co"",,"
Wlng Wing Wlng Venation BodycolOI Wil]g Thoral( mal!\:
Confluent
Vena tion
Cream 11 Curved ,,"ehs
Eye color Wing
Extr.l vein Fringed
VenatlOn Wing
Black Blistere
Speck Strap
Wing
Streak Tretoll Truncale Vestigial
Pattern Partern Wing wing
Pallem Wing E'ye color
Pinll
Eyecolot
Eye
Sepia
SlZe 01 body Body colOl Sizeof body
Sooly Spinetess Spread Trldent
Le9
Group 3
Sand
Beaded Crearn 11 1 Oeformed
Dwarl Ebony Gian! Kidney LaVo' crossing over Maroon Peach
Group4 Sen.
Eye
Rougll Sa!¡anln
Eye Eye color Eyecolot Bodycolor Spir le Wing
Eye color Eye color
Truncafe Whitehead While ocelli
Paltem Wing Pallern Simpleeye
Wing
Eyeless
Eye
Chrornosome 3
*The mutations fall into four linkage groups ...
FIGURE 1-8 The genetic map of chromosome 2 of Drosophila melanogaster.
THE ORIGIN OF GENETIC VARIABILITY THROUGH MUTATIONS
It now became possible to understand the hereditary variation that is found throughout the biological world and that forms the basis of the theory of evolution. Genes are normally copied exactly during chromosome duplication. Rarely, however, changes (mutations) occur in genes to give rise to altered forms, most (but not all) of which function less well than the wild-type alleles. This process is necessarily rare; otherwise, many genes would be changed during every cell cycle, and offspring would not ordinarily resemble their parents. There is, instead, a strong advantage in there being a small but finite mutation rate; it provides a constant source of new variability, necessary to allow plants and animals to adapt to a constantly changing physical and biological environment.

Surprisingly, however, the results of the Mendelian geneticists were not avidly seized upon by the classical biologists, then the authorities on the evolutionary relations between the various forms of life. Doubts were raised about whether genetic changes of the type studied by Morgan and his students were sufficient to permit the evolution of radically new structures, like wings or eyes. Instead, these biologists believed that there must also occur more powerful "macromutations," and that it was these events that allowed great evolutionary advances.

Gradually, however, doubts vanished, largely as a result of the efforts of the mathematical geneticists Sewall Wright, Ronald A. Fisher, and John Burdon Sanderson Haldane. They showed that, considering the great age of Earth, the relatively low mutation rates found for Drosophila genes, together with only mild selective advantages, would be sufficient to allow the gradual accumulation of new favorable attributes.

By the 1930s, biologists began to reevaluate their knowledge on the origin of species and to understand the work of the mathematical geneticists. Among these new Darwinians were biologist Julian Huxley (a grandson of Darwin's original publicist, Thomas Huxley), geneticist Theodosius Dobzhansky, paleontologist George Gaylord Simpson, and ornithologist Ernst Mayr. In the 1940s all four wrote major works, each showing from his special viewpoint how Mendelianism and Darwinism were indeed compatible.
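The claim that mild selective advantages suffice over long time spans can be illustrated with a minimal sketch. It is not the calculation made by Wright, Fisher, or Haldane; it assumes a simple deterministic, haploid selection model in which a favorable allele with selective advantage s changes frequency each generation as p' = p(1 + s)/(1 + sp). The function name, the starting frequency, and the target frequency are illustrative assumptions.

```python
def generations_to_spread(s, p0=1e-6, p_target=0.99):
    """Count generations for a favorable allele to rise from p0 to p_target.

    Assumed model (not taken from the text): deterministic haploid selection,
    where carriers of the allele leave (1 + s) offspring for every 1 left by
    non-carriers.
    """
    p = p0
    generations = 0
    while p < p_target:
        # Frequency after one round of selection.
        p = p * (1 + s) / (1 + s * p)
        generations += 1
    return generations


if __name__ == "__main__":
    for advantage in (0.001, 0.01, 0.1):
        n = generations_to_spread(advantage)
        print(f"s = {advantage}: {n} generations")
```

Under these assumptions, even a 1 percent advantage carries a rare allele to near fixation in a few thousand generations, a very short interval compared with the age of Earth.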
EARLY SPECULATIONS ABOUT WHAT GENES ARE AND HOW THEY ACT
Almost immediately after the rediscovery of Mendel's laws, geneticists began to speculate about both the chemical structure of the gene and the way it acts. No real progress could be made, however, because the chemical identity of the genetic material remained unknown. Even the realization that both nucleic acids and proteins are present in chromosomes did not really help, since the structure of neither was at all understood. The most fruitful speculations focused attention on the fact that genes must be, in some sense, self-duplicating. Their structure must be exactly copied every time one chromosome becomes two. This fact immediately raised the profound chemical question of how a complicated molecule could be precisely copied to yield exact replicas.

Some physicists also became intrigued with the gene, and when quantum mechanics burst on the scene in the late 1920s, the possibility arose that in order to understand the gene, it would first be necessary to master the subtleties of the most advanced theoretical physics. Such thoughts, however, never really took root, since it was obvious that even the best physicists or theoretical chemists would not concern themselves with a substance whose structure still awaited elucidation. There was only one fact that they might ponder: Muller and L. J. Stadler's independent 1927 discoveries that X-rays induce mutations. Since there is a greater possibility that an X-ray will hit a larger gene than a smaller gene, the frequency of mutations induced in a given gene by a given X-ray dose yields an estimate of the size of this gene. But even here, so many special assumptions were required that virtually no one, not even Muller and Stadler themselves, took the estimates very seriously.
PRELIMINARY ATTEMPTS TO FIND A GENE-PROTEIN RELATIONSHIP
The most fruitful early endeavors to find a relationship between genes and proteins examined the ways in which gene changes affect which proteins are present in the cell. At first these studies were difficult, since no one knew anything about the proteins that were present in structures such as the eye or the wing. It soon became clear that genes with simple metabolic functions would be easier to study than genes affecting gross structures. One of the first useful examples came from a study of a hereditary disease affecting amino acid metabolism. Spontaneous mutations occur in humans affecting the ability to metabolize the amino acid phenylalanine. When individuals homozygous for the mutant trait eat food containing phenylalanine, their inability to convert the amino acid to tyrosine causes a toxic level of phenylpyruvic acid to build up in the bloodstream. Such diseases, examples of "inborn errors of metabolism," suggested to English physician Archibald E. Garrod, as early as 1909, that the wild-type gene is responsible for the presence of a particular enzyme, and that in a homozygous mutant, the enzyme is congenitally absent.

Garrod's general hypothesis of a gene-enzyme relationship was extended in the 1930s by work on flower pigments by Haldane and Rose Scott-Moncrieff in England, studies on the hair pigment of the guinea pig by Wright in the United States, and research on the pigments of insect eyes by A. Kuhn in Germany and by Boris Ephrussi and George W. Beadle, working first in France and then in California. In all cases, the evidence revealed that a particular gene affected a particular step in the formation of the respective pigment whose absence changed, say, the color of a fly's eyes from red to ruby. However, the lack of fundamental knowledge about the structures of the relevant enzymes ruled out deeper examination of the gene-enzyme relationship, and no assurance could be given either that most genes control the synthesis of proteins (by then it was suspected that all enzymes were proteins) or that all proteins are under gene control.

As early as 1936, it became apparent to the Mendelian geneticists that future experiments of the sort successful in elucidating the basic features of Mendelian genetics were unlikely to yield productive evidence about how genes act. Instead, it would be necessary to find biological objects more suitable for chemical analysis. They were aware, moreover, that contemporary knowledge of nucleic acid and protein chemistry was completely inadequate for a fundamental chemical attack on even the most suitable biological systems. Fortunately, however, the limitations in chemistry did not deter them from learning how to do genetic experiments with chemically simple molds, bacteria, and viruses. As we shall see, the necessary chemical facts became available almost as soon as the geneticists were ready to use them.
SUMMARY
Heredity is controlled by chromosomes, which are the cellular carriers of genes. Hereditary factors were first discovered and described by Mendel in 1865, but their importance was not realized until the start of the twentieth century. Each gene can exist in a variety of different forms called alleles. Mendel proposed that a hereditary factor (now known to be a gene) for each hereditary trait is given by each parent to each of its offspring. The physical basis for this behavior is the distribution of homologous chromosomes during meiosis: one (randomly chosen) of each pair of homologous chromosomes is distributed to each haploid cell. When two genes are on the same chromosome, they tend to be inherited together (linked). Genes affecting different characteristics are sometimes inherited independently of each other, because they are located on different chromosomes. In any case, linkage is seldom complete because homologous chromosomes attach to each other during meiosis and often break at identical spots and rejoin crossways (crossing over). Crossing over transfers genes initially located on a paternally derived chromosome onto gene groups originating from the maternal parent.

Different alleles from the same gene arise by inheritable changes (mutations) in the gene itself. Normally, genes are extremely stable and are copied exactly during chromosome duplication; mutation occurs only rarely and usually has harmful consequences. Mutation does, however, play a positive role, since the accumulation of rare favorable mutations provides the basis for genetic variability that is presupposed by the theory of evolution.

For many years, the structure of genes and the chemical ways in which they control cellular characteristics were a mystery. As soon as large numbers of spontaneous mutations had been described, it became obvious that a one gene-one characteristic relationship does not exist and that all complex characteristics are under the control of many genes. The most sensible idea, postulated by Garrod in 1909, was that genes affect the synthesis of enzymes. However, the tools of Mendelian geneticists (organisms such as the corn plant, the mouse, and even the fruit fly Drosophila) were not suitable for ...
CHAPTER
Nucleic Acids Convey Genetic Information
That special molecules might carry genetic information was appreciated by geneticists long before the problem claimed the attention of chemists. By the 1930s, geneticists began speculating as to what sort of molecules could have the kind of stability that the gene demanded, yet be capable of permanent, sudden change to the mutant forms that must provide the basis of evolution. Until the mid-1940s, there appeared to be no direct way to attack the chemical essence of the gene. It was known that chromosomes possessed a unique molecular constituent, deoxyribonucleic acid (DNA), but there was no way to show that this constituent carried genetic information, as opposed to serving merely as a molecular scaffold for a still undiscovered class of proteins especially tailored to carry genetic information. It was generally assumed that genes would be composed of amino acids because, at that time, they appeared to be the only biomolecules with sufficient complexity.

It therefore made sense to approach the nature of the gene by asking how genes function within cells. In the early 1940s, research on the mold Neurospora, spearheaded by George W. Beadle and Edward Tatum, was generating increasingly strong evidence supporting the 30-year-old hypothesis of Archibald E. Garrod that genes work by controlling the synthesis of specific enzymes (the one gene-one enzyme hypothesis). Thus, given that all known enzymes had, by this time, been shown to be proteins, the key problem was the way genes participate in the synthesis of proteins. From the very start of serious speculation, the simplest hypothesis was that genetic information within genes determines the order of the 20 different amino acids within the polypeptide chains of proteins.

In attempting to test this proposal, intuition was of little help even to the best biochemists, since there is no logical way to use enzymes as tools to determine the order of each amino acid added to a polypeptide chain. Such schemes would require, for the synthesis of a single type of protein, as many ordering enzymes as there are amino acids in the respective protein. But since all enzymes known at that time were themselves proteins (we now know that RNA can also act as an enzyme in a few instances), still additional ordering enzymes would be necessary to synthesize the ordering enzymes. This situation clearly poses a paradox, unless we assume a fantastically interrelated series of syntheses in which a given protein has many different enzymatic specificities. With such an assumption, it might be possible (and then only with great difficulty) to visualize a workable cell. It did not seem likely, however, that most proteins would be found to carry out multiple tasks. In fact, all the current knowledge pointed to the opposite conclusion of one protein, one function.
OUTLINE
• Avery's Bombshell: DNA Can Carry Genetic Specificity
• The Double Helix
• The Genetic Information within DNA Is Conveyed by the Sequence of Its Four Nucleotide Building Blocks
• The Central Dogma
• Establishing the Direction of Protein Synthesis
• The Era of Genomics
AVERY'S BOMBSHELL: DNA CAN CARRY GENETIC SPECIFICITY
That DNA might be the key genetic molecule emerged most unexpectedly from studies on pneumonia-causing bacteria. In 1928 English microbiologist Frederick Griffith made the startling observation that nonvirulent strains of the bacteria became virulent when mixed with their heat-killed pathogenic counterparts. That such transformations from nonvirulence to virulence represented hereditary changes was shown by using descendants of the newly pathogenic strains to transform still other nonpathogenic bacteria. This raised the possibility that when pathogenic cells are killed by heat, their genetic components remain undamaged. Moreover, once liberated from the heat-killed cells, these components can pass through the cell wall of the living recipient cells and undergo subsequent genetic recombination with the recipient's genetic apparatus (Figure 2-1). Subsequent research has confirmed this genetic interpretation. Pathogenicity reflects the action of the capsule gene, which codes for a key enzyme involved in the synthesis of the carbohydrate-containing capsule that surrounds most pneumonia-causing bacteria. When the S (smooth) allele of the capsule gene is present, a capsule is formed around the cell that is necessary for pathogenesis (the formation of a capsule also gives a smooth appearance to the colonies formed from these cells). When the R (rough) allele of this gene is present, no capsule is formed and the respective cells are not pathogenic.

Within several years after Griffith's original observation, extracts of the killed bacteria were found capable of inducing hereditary transformations, and a search began for the chemical identity of the transforming agent. At that time, the vast majority of biochemists still believed that genes were proteins. It therefore came as a great surprise when in 1944, after some ten years of research, U.S. microbiologist Oswald T. Avery and his colleagues at the Rockefeller Institute in New York, Colin M. MacLeod and Maclyn McCarty, made the momentous announcement that the active genetic principle was DNA (Figure 2-2). Supporting their conclusion were key experiments showing that the transforming activity of their highly purified active fractions was destroyed by pancreatic deoxyribonuclease, a recently purified enzyme that specifically degrades DNA molecules to their nucleotide building blocks and has no effect on the integrity of protein molecules or RNA. The addition of either pancreatic ribonuclease (which degrades RNA) or various proteolytic (protein-destroying) enzymes had no influence on the transforming activity.

FIGURE 2-1 Transformation of a genetic characteristic of a bacterial cell (Streptococcus pneumoniae) by addition of heat-killed cells of a genetically different strain. Here we show an R cell receiving a chromosomal fragment containing the capsule gene from a heat-treated S cell. Since most R cells receive other chromosomal fragments, the efficiency of transformation for a given gene is usually less than 1%.
Viral Genes Are Also Nucleic Acids
Even more important confirmatory evidence came from chemical studies with viruses and virus-infected cells. By 1950 it was possible to obtain a number of essentially pure viruses and to determine which types of molecules were present in them. This work led to the very important generalization that all viruses contain nucleic acid. Since there was at that time a growing realization that viruses contain genetic material, the question immediately arose as to whether the nucleic acid component was the carrier of viral genes. A crucial test of the question came from isotopic study of the multiplication of T2, a bacterial virus (bacteriophage, or phage) containing a DNA core and a protective shell built up by the aggregation of a number of different protein molecules. In these experiments, performed in 1952 by Alfred D. Hershey and Martha Chase working at Cold Spring Harbor Laboratory on Long Island, the protein coat was labeled with the radioactive isotope 35S and the DNA core with the radioactive isotope 32P. The labeled virus was then used to follow the fates of the phage protein and nucleic acid as phage multiplication proceeded, particularly to see which labeled atoms from the parental virus entered the host cell and later appeared in the progeny phage.

Clear-cut results emerged from these experiments; much of the parental nucleic acid and none of the parental protein was detected in the progeny phage (Figure 2-3). Moreover, it was possible to show that little of the parental protein even enters the bacteria; instead, it stays attached to the outside of the bacterial cell, performing no function after the DNA component has passed inside. This point was neatly shown by violently agitating infected bacteria after the entrance of the DNA; the protein coats were shaken off without affecting the ability of the bacteria to form new phage particles.

With some viruses it is now possible to do an even more convincing experiment. For example, purified DNA from the mouse virus polyoma can enter mouse cells and initiate a cycle of viral multiplication producing many thousands of new polyoma particles. The primary function of viral protein is thus to protect its genetic nucleic acid component in its movement from one cell to another. Thus no reason exists for proteins to play any part in the structure of a gene.
FIGURE 2-2 Isolation of a chemically pure transforming agent. (Source: Adapted from Stahl F.W. 1964. The mechanics of inheritance, Fig. 2.3. Copyright © 1964. Reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ.)

FIGURE 2-3 Demonstration that only the DNA component of T2 carries the genetic information and that the protein coat serves only as a protective shell.

THE DOUBLE HELIX
While work was proceeding on the X-ray analysis of protein structure, a smaller number of scientists were trying to solve the X-ray diffraction pattern of DNA. The first diffraction patterns were taken in 1938 by William Astbury using DNA supplied by Ola Hammarsten and Torbjörn Caspersson. It was not until the early 1950s that high-quality X-ray diffraction photographs were taken by Maurice Wilkins and Rosalind Franklin (Figure 2-4). These photographs suggested not only that the underlying DNA structure was helical but that it was composed of more than one polynucleotide chain, either two or three. At the same time, the covalent bonds of DNA were being unambiguously established. In 1952 a group of organic chemists working in the laboratory of Alexander Todd showed that 3'-5' phosphodiester bonds regularly link together the nucleotides of DNA (Figure 2-5).

FIGURE 2-4 The key X-ray photograph involved in the elucidation of the DNA structure. This photograph, taken by Rosalind Franklin at King's College, London, in the winter of 1952-1953, confirmed the guess that ...

FIGURE 2-5 A portion of a DNA polynucleotide chain, showing the 3'-5' phosphodiester linkages that connect the nucleotides. Phosphate groups connect the 3' carbon of one nucleotide with the 5' carbon of the next.

In 1951, because of interest in Linus Pauling's α helix protein motif (which we shall consider in Chapter 5), an elegant theory of diffraction of helical molecules was developed by William Cochran, Francis H. Crick, and Vladimir Vand. This theory made it easy to test possible DNA structures on a trial-and-error basis. The correct solution, a complementary double helix (see Chapter 6), was found in 1953 by Crick and James D. Watson, then working in the laboratory of Max Perutz and John Kendrew. Their arrival at the correct answer depended largely on finding the stereochemically most favorable configuration compatible with the X-ray diffraction data of Wilkins and Franklin.

In the double helix, the two DNA chains are held together by hydrogen bonds (a weak noncovalent chemical bond; see Chapter 3) between pairs of bases on the opposing strands (Figure 2-6). This base pairing is very specific: the purine adenine only base-pairs to the pyrimidine thymine, while the purine guanine only base-pairs to the pyrimidine cytosine. In double-helical DNA, the number of A residues must be equal to the number of T residues, while the number of G and C residues must likewise be equal (see Box 2-1, Chargaff's Rules). As a result, the sequences of the bases of the two chains of a given double helix have a complementary relationship, and the sequence of any DNA strand exactly defines that of its partner strand.

The discovery of the double helix initiated a profound revolution in the way many geneticists analyzed their data. The gene was no longer a mysterious entity, the behavior of which could be investigated only by genetic experiments. Instead, it quickly became a real molecular object about which chemists could think objectively, as they did about smaller molecules such as pyruvate and ATP. Most of the excitement, however, came not merely from the fact that the structure was solved, but also from the nature of the structure. Before the answer was known, there had always been the worry that it would turn out to be dull, revealing nothing about how genes replicate and function. Fortunately, however, the answer was immensely exciting. The two intertwined strands of complementary structures suggested that one strand serves as the specific surface (template) upon which the other strand is made (Figure 2-6). If this hypothesis were true, then the fundamental problem of gene replication, about which geneticists had puzzled for so many years, was, in fact, conceptually solved.
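The statement that the sequence of one strand exactly defines that of its partner, and that a double helix built this way must contain equal numbers of A and T and of G and C, can be sketched in a few lines. This is an illustration only: the function names are hypothetical, strands are treated as plain strings over A, T, G, and C, and strand orientation is ignored (the partner base is simply written at the same position).

```python
from collections import Counter

# Base-pairing rules stated in the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the partner base for every position of the given strand."""
    return "".join(PAIR[base] for base in strand)


def duplex_base_counts(strand):
    """Count bases in the double helix formed by a strand and its partner."""
    return Counter(strand) + Counter(partner_strand(strand))


if __name__ == "__main__":
    template = "AAGCTTGACA"  # arbitrary example sequence
    counts = duplex_base_counts(template)
    print(partner_strand(template))
    print(dict(counts))
```

A single strand may have any composition, but the duplex counts always satisfy A = T and G = C, because every base contributes its partner to the opposite strand. Applying `partner_strand` twice returns the original sequence, which is the sense in which each strand can serve as a template for the other.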
FIGURE 2-6 The replication of DNA. The newly synthesized strands are shown in orange.

Box 2-1 Chargaff's Rules
Biochemist Erwin Chargaff used a technique called "paper chromatography" to analyze the nucleotide composition of DNA. By 1949 his data showed not only that the four different nucleotides are not present in equal amounts, but also that the exact ratios of the four nucleotides vary from one species to another (Box 2-1 Table 1). These findings opened up the possibility that it is the precise arrangement of nucleotides within a DNA molecule that confers its genetic specificity.

Chargaff's experiments also showed that the relative ratios of the four bases were not random. The number of adenine (A) residues in all DNA samples was equal to the number of thymine (T) residues, while the number of guanine (G) residues equaled the number of cytosine (C) residues. In addition, regardless of the DNA source, the ratio of purines to pyrimidines was always approximately one (purines = pyrimidines). The fundamental significance of the A = T and G = C relationships (Chargaff's rules) could not emerge, however, until serious attention was given to the three-dimensional structure of DNA.

BOX 2-1 TABLE 1 Data Leading to the Formulation of Chargaff's Rules

Source                     Adenine to   Thymine to   Adenine to   Guanine to   Purines to
                           Guanine      Cytosine     Thymine      Cytosine     Pyrimidines
Ox                         1.29         1.43         1.04         1.00         1.1
Human                      1.56         1.75         1.00         1.00         1.0
Hen                        1.45         1.29         1.06         0.91         0.99
Salmon                     1.43         1.43         1.02         1.02         1.02
Wheat                      1.22         1.18         1.00         0.97         0.99
Yeast                      1.67         1.92         1.00         1.20         1.0
Hemophilus influenzae      1.74         1.54         1.07         0.91         1.0
Escherichia coli K2        1.05         0.95         1.09         0.99         1.0
Avian tubercle bacillus    0.4          0.4          1.09         1.08         1.1
Serratia marcescens        0.7          0.7          0.95         0.86         0.9
Bacillus schatz            0.7          0.6          1.12         0.89         1.0

Source: After Chargaff E. et al. 1949. J. Biol. Chem. 177: 405.
Finding the Polymerases that Make DNA
Rigorous proof that a single DNA chain is the template that directs the synthesis of a complementary DNA chain had to await the development of test-tube (in vitro) systems for DNA synthesis. These came much faster than anticipated by molecular geneticists, whose world until then had been far removed from that of the biochemist well versed in the procedures needed for enzyme isolation. Leading this biochemical assault on DNA replication was U.S. biochemist Arthur Kornberg, who by 1956 had demonstrated DNA synthesis in cell-free extracts of bacteria. Over the next several years, Kornberg went on to show that a specific polymerizing enzyme was needed to catalyze the linking together of the building-block precursors of DNA. Kornberg's studies revealed that the nucleotide building blocks for DNA are energy-rich precursors (dATP, dGTP, dCTP, and dTTP; Figure 2-7). Further studies identified a single polypeptide, DNA polymerase I (DNA Pol I), that was capable of catalyzing the synthesis of new DNA strands. It links the nucleotide precursors by 3'-5' phosphodiester bonds (Figure 2-8). Furthermore, it works only in the presence of DNA, which is needed to order the four nucleotides in the polynucleotide product.

FIGURE 2-7 The nucleotides of DNA. The structures of the different components of each of the four nucleotides are shown.

FIGURE 2-8 Enzymatic synthesis of a DNA chain catalyzed by DNA polymerase I.

DNA Pol I depends on a DNA template to determine the sequence of the DNA it is synthesizing. This was first demonstrated by allowing the enzyme to work in the presence of DNA molecules that contained varying amounts of A:T and G:C base pairs. In every case, the enzymatically synthesized product had the base ratios of the template DNA (Table 2-1). During this cell-free synthesis, no synthesis of proteins or any other molecular class occurs, unambiguously eliminating any non-DNA compounds as intermediate carriers of genetic specificity. Thus there is no doubt that DNA is the direct template for its own formation.

TABLE 2-1 A Comparison of the Base Composition of Enzymatically Synthesized DNA and Their DNA Templates

                             Base Composition of the Enzymatic Product    (A+T)/(G+C)   (A+T)/(G+C)
Source of DNA Template       Adenine   Thymine   Guanine   Cytosine       In Product    In Template
Micrococcus lysodeikticus    0.15      0.15      0.35      0.35           0.41          0.39
  (a bacterium)
Aerobacter aerogenes         0.22      0.22      0.28      0.28           0.80          0.82
  (a bacterium)
Escherichia coli             0.25      0.25      0.25      0.25           1.00          0.97
Calf thymus                  0.29      0.28      0.21      0.22           1.32          1.35
Phage T2                     0.32      0.32      0.18      0.18           1.78          1.84
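The comparison in Table 2-1 rests on one piece of arithmetic, the ratio (A+T)/(G+C). The sketch below recomputes that ratio from the rounded base fractions listed for the enzymatic product and sets it beside the ratio the table reports. The dictionary and function names are illustrative, and because the listed fractions are rounded to two decimals, the recomputed values are expected to agree with the reported ones only to within a few hundredths.

```python
# Base fractions (A, T, G, C) of the enzymatic product, as listed in Table 2-1.
PRODUCT_COMPOSITION = {
    "Micrococcus lysodeikticus": (0.15, 0.15, 0.35, 0.35),
    "Aerobacter aerogenes": (0.22, 0.22, 0.28, 0.28),
    "Escherichia coli": (0.25, 0.25, 0.25, 0.25),
    "Calf thymus": (0.29, 0.28, 0.21, 0.22),
    "Phage T2": (0.32, 0.32, 0.18, 0.18),
}

# (A+T)/(G+C) reported in Table 2-1 for the product.
REPORTED_PRODUCT_RATIO = {
    "Micrococcus lysodeikticus": 0.41,
    "Aerobacter aerogenes": 0.80,
    "Escherichia coli": 1.00,
    "Calf thymus": 1.32,
    "Phage T2": 1.78,
}


def at_gc_ratio(a, t, g, c):
    """Return (A+T)/(G+C) for the given base fractions."""
    return (a + t) / (g + c)


if __name__ == "__main__":
    for source, fractions in PRODUCT_COMPOSITION.items():
        computed = at_gc_ratio(*fractions)
        reported = REPORTED_PRODUCT_RATIO[source]
        print(f"{source}: computed {computed:.2f}, reported {reported:.2f}")
```

The ratio spans more than a fourfold range across the five templates, which is what makes the close match between product and template informative.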
Experimental Evidence Favors Strand Separation During DNA Replication
Simultaneously with Kornberg's research, in 1958 Matthew Meselson and Frank W. Stahl, then at the California Institute of Technology, carried out an elegant experiment in which they separated daughter DNA molecules and, in so doing, showed that the two strands of the double helix permanently separate from each other during DNA replication (Figure 2-9). Their success was due in part to the use of the heavy isotope 15N as a tag to differentially label the parental and daughter DNA strands. Bacteria grown in a medium containing the heavy isotope 15N have denser DNA than bacteria grown under normal conditions with 14N. Also contributing to the success of the experiment was the development of procedures for separating heavy from light DNA in density gradients of heavy salts like cesium chloride. When high centrifugal forces are applied, the solution becomes more dense at the bottom of the centrifuge tube (which, when spinning, is the farthest from the axis of rotation). When the correct initial solution density is chosen, the individual DNA molecules will move to the central region of the centrifuge tube where their density equals that of the salt solution. In this situation, the heavy molecules will form a band at a higher density (closer to the bottom of the tube) than the lighter molecules.

If bacteria containing heavy DNA are transferred to a light medium (containing 14N) and allowed to grow, the precursor nucleotides available for use in DNA synthesis will be light; hence, DNA synthesized after transfer will be distinguishable from DNA made before transfer. If DNA replication involves strand separation, definite predictions can be made about the density of the DNA molecules found after various growth intervals in a light medium. After one generation of growth, all the DNA molecules should contain one heavy strand and one light strand and thus be of intermediate hybrid density. This result is exactly what Meselson and Stahl observed. Likewise, after two generations of growth, half the DNA molecules were light and half hybrid, just as strand separation predicts.
[Figure 2-9 panel labels: bacteria growing in 15N, all DNA is heavy; transfer to 14N medium; continued growth in 14N medium; DNA isolated from the cells is mixed with CsCl solution (density about 1.7 g/ml) and placed in an ultracentrifuge; solution centrifuged at 140,000 × g for about 48 hr. Bands of light, hybrid (14N-15N), and heavy DNA are shown before transfer to 14N, one cell generation after transfer to 14N, and two generations after transfer to 14N. The location of DNA molecules within the centrifuge cell can be determined by ultraviolet optics.]
DNA replication was thus shown to be a semiconservative process in which the single strands of the double helix remain intact (are conserved) during a replication process that distributes one parental strand into each of the two daughter molecules (thus, the "semi" in semiconservative). These experiments ruled out two other models at the time: the conservative and the dispersive replication schemes (Figure 2-10). In the conservative model, both of the parental strands were proposed to remain together and the two new strands of DNA would form an entirely new DNA molecule. In this model, light DNA would be formed after one cell generation. In the dispersive model, which was favored by many at the time, the DNA strands were proposed to be broken as frequently as every ten base pairs and used to prime the synthesis of similarly short regions of DNA. These short DNA fragments would subsequently be joined to form complete DNA strands. This complex model would lead to DNA strands that would be composed of both old and new DNA (thus nonconservative) and would only approach fully light DNA after many generations of growth.
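The band patterns that the semiconservative and conservative models predict can be worked out with a small simulation. The sketch below is only an illustration under simplifying assumptions: each DNA molecule is reduced to a pair of strands marked "H" (15N) or "L" (14N), every molecule replicates once per generation, and the function names are invented for this example. The dispersive model is left out because it requires tracking fragments within a strand.

```python
from collections import Counter
from fractions import Fraction


def semiconservative(molecule):
    """Strands separate; each parental strand pairs with a newly made light strand."""
    first, second = molecule
    return [(first, "L"), (second, "L")]


def conservative(molecule):
    """The parental duplex stays together; the copy consists of two new light strands."""
    return [molecule, ("L", "L")]


def band_fractions(model, generations):
    """Fraction of molecules in each density band after growth in 14N medium."""
    population = [("H", "H")]  # before transfer: fully heavy DNA
    for _ in range(generations):
        population = [daughter for m in population for daughter in model(m)]
    names = {0: "light", 1: "hybrid", 2: "heavy"}
    counts = Counter(names[m.count("H")] for m in population)
    return {band: Fraction(counts[band], len(population))
            for band in ("heavy", "hybrid", "light")}


for model in (semiconservative, conservative):
    for generation in (1, 2):
        fractions = band_fractions(model, generation)
        summary = ", ".join(f"{band} {value}" for band, value in fractions.items())
        print(f"{model.__name__}, generation {generation}: {summary}")
```

Under these assumptions only the semiconservative rule gives all-hybrid DNA after one generation and a half-hybrid, half-light mixture after two, which is the pattern Meselson and Stahl observed.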
FIGURE 2-9 Use of a cesium chloride (CsCl) density gradient to demonstrate the separation of complementary strands during DNA replication.
FIGURE 2-10 Three possible mechanisms for DNA replication. When the structure of DNA was discovered, several models were proposed to explain how it was replicated; three are illustrated here. The experiments proposed by Meselson and Stahl clearly distinguished among these models, demonstrating that DNA was replicated semiconservatively. (Panel labels: distributive, semiconservative, conservative.)
THE GENETIC INFORMATION WITHIN DNA IS CONVEYED BY THE SEQUENCE OF ITS FOUR NUCLEOTIDE BUILDING BLOCKS

The finding of the double helix had effectively ended any controversy about whether DNA was the primary genetic substance. Even before strand separation during DNA replication was experimentally verified, the main concern of molecular geneticists had turned to how the genetic information of DNA functions to order amino acids during protein synthesis (see Box 2-2, Evidence that Genes Control Amino Acid Sequences in Proteins). With all DNA chains capable of forming double helices, the essence of their genetic specificity had to reside in the linear sequences of their four nucleotide building blocks. Thus, as information-containing entities, DNA molecules were by then properly regarded as very long words (as we shall see later, they are now best considered very long sentences) built up from a four-letter alphabet (A, G, C, and T). Even with only four letters, the number of potential DNA sequences (4^N, where N is the number of letters in the sequence) is very, very large for even the smallest of DNA molecules; a virtually infinite number of different genetic messages can exist. Now we know that a typical bacterial gene is made up of approximately 1,000 base pairs. The number of potential genes of this size is 4^1000, a number that is orders of magnitude larger than the number of known genes in every organism.

DNA Cannot Be the Template that Directly Orders Amino Acids during Protein Synthesis

Although DNA must carry the information for ordering amino acids, it was quite clear that the double helix itself could not be the template for protein synthesis. Ruling out a direct role for DNA were experiments showing that protein synthesis occurs at sites where DNA is absent. Protein synthesis in all eukaryotic cells occurs in the cytoplasm, which is separated by the nuclear membrane from the chromosomal DNA.
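To give a sense of scale for the 4^N estimate made earlier in this section, the short sketch below counts the possible sequences for a few lengths, including the roughly 1,000 base pairs of a typical bacterial gene. The function name is illustrative only; Python integers are exact, so no approximation is involved.

```python
def count_sequences(n_bases):
    """Number of distinct sequences of length n_bases over the alphabet A, G, C, T."""
    return 4 ** n_bases


for n in (2, 3, 10, 1000):
    total = count_sequences(n)
    print(f"N = {n}: 4^N is a number with {len(str(total))} decimal digits")
```

For N = 1000 the count has 603 decimal digits, which is why the number of possible genes of this size so greatly exceeds the number of genes actually known.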
Box 2-2 Evidence that Genes Control Amino Acid Sequence in Proteins
The first experimental evidence that genes (DNA) control amino acid sequences arose from the study of the hemoglobin present in humans suffering from the genetic disease sickle-cell anemia. If an individual has the S allele of the β-globin gene (which encodes one of the two polypeptides that together form hemoglobin) present in both homologous chromosomes, a severe anemia results, characterized by the red blood cells having a sickle-cell shape. If only one of the two alleles of the β-globin gene is of the S form, the anemia is less severe and the red blood cells appear almost normal in shape. The type of hemoglobin in red blood cells is likewise correlated with the genetic pattern. In the SS case, the hemoglobin is abnormal, characterized by a solubility different from that of normal hemoglobin, whereas in the +S condition, half the hemoglobin is normal and half sickle.

Wild-type hemoglobin molecules are constructed from two kinds of polypeptide chains: α chains and β chains (see Box 2-2 Figure 1). Each chain has a molecular weight of about 16,100 daltons. Two α chains and two β chains are present in each molecule, giving hemoglobin a molecular weight of about 64,400 daltons. The α chains and β chains are controlled by distinct genes so that a single mutation will affect either the α chain or the β chain, but not both. In 1957, Vernon M. Ingram at Cambridge University showed that sickle hemoglobin differs from normal hemoglobin by the change of one amino acid in the β chain: at position 6, the glutamic acid residue found in wild-type hemoglobin is replaced by valine. Except for this one change, the entire amino acid sequence is identical in normal and mutant hemoglobin. Because this change in amino acid sequence was observed only in patients with the S allele of the β-globin gene, the simplest hypothesis is that the S allele of the gene encodes the change in the β-globin chain. Subsequent studies of amino acid sequences in hemoglobin isolated from other forms of anemia completely supported this proposal; sequence analysis showed that each specific anemia is characterized by a single amino acid replacement at a unique site along the polypeptide chain (Box 2-2 Figure 2).
BOX 2-2 FIGURE 1 Formation of wild-type and sickle-cell hemoglobin. (Source: Illustration, Irving Geis. Rights owned by Howard Hughes Medical Institute. Not to be reproduced without permission.)
BOX 2-2 FIGURE 2 A summary of some established amino acid substitutions in human hemoglobin variants.
A second information-containing molecule therefore had to exist that obtains its genetic specificity from DNA, after which it moves to the cytoplasm to function as the template for protein synthesis. Attention from the start focused on the still functionally obscure second class of nucleic acids, RNA. Torbjörn Caspersson and Jean Brachet had found RNA to reside largely in the cytoplasm; and it was easy to imagine single DNA strands, when not serving as templates for complementary DNA strands, acting as templates for complementary RNA chains.
RNA Is Chemically Very Similar to DNA

Mere inspection of RNA structure shows how it can be exactly synthesized on a DNA template. Chemically, it is very similar to DNA. It, too, is a long, unbranched molecule containing four types of nucleotides linked together by 3' → 5' phosphodiester bonds (Figure 2-11). Two differences in its chemical groups distinguish RNA from DNA. The first is a minor modification of the sugar component (Figure 2-12). The sugar of DNA is deoxyribose, whereas RNA contains ribose, identical to deoxyribose except for the presence of an additional OH (hydroxyl) group. The second difference is that RNA contains no thymine, but instead contains the closely related pyrimidine uracil. Despite these differences, however, polyribonucleotides have the potential for forming complementary helices of the DNA type. Neither the additional hydroxyl group, nor the absence of the methyl group found in thymine but not in uracil, affects RNA's ability to form double-helical structures held together by base pairing. Unlike DNA, however, RNA is typically found in the cell as a single-stranded molecule. If double-stranded RNA helices are formed, they most often are composed of two parts of the same single-stranded RNA molecule.

FIGURE 2-11 A portion of a polyribonucleotide (RNA) chain. Elements in red are distinct from DNA.

FIGURE 2-12 Distinctions between the nucleotides of RNA and DNA. A nucleotide of DNA is shown next to a nucleotide of RNA. All RNA nucleotides have the sugar ribose (instead of deoxyribose for DNA), which has a hydroxyl group on carbon 2 (shown in red). In addition, RNA has the pyrimidine base uracil instead of thymine. The three other bases that occur in DNA and RNA are identical. (Structures shown: deoxythymidine-5'-phosphate, containing thymine (DNA), and uridine-5'-phosphate, containing uracil and ribose (RNA).)

THE CENTRAL DOGMA

By the fall of 1953, the working hypothesis was adopted that chromosomal DNA functions as the template for RNA molecules, which subsequently move to the cytoplasm, where they determine the arrangement of amino acids within proteins. In 1956, Francis Crick referred to this pathway for the flow of genetic information as the central dogma.

[Duplication] DNA --(Transcription)--> RNA --(Translation)--> Protein

Here the arrows indicate the directions proposed for the transfer of genetic information. The arrow encircling DNA signifies that DNA is the template for its self-replication. The arrow between DNA and RNA indicates that RNA synthesis (transcription) is directed by a DNA template. Correspondingly, the synthesis of proteins (translation) is directed by an RNA template. Most importantly, the last two arrows were presented as unidirectional; that is, RNA sequences are never determined by protein templates, nor was DNA then imagined ever to be made on RNA templates. That proteins never serve as templates for RNA has stood the test of time. However, as we will see in Chapter 11, RNA chains sometimes do act as templates for DNA chains of complementary sequence. Such reversals of the normal flow of information are very rare events compared with the enormous number of RNA molecules made on DNA templates. Thus, the central dogma as originally proclaimed approximately 50 years ago still remains essentially valid.

The Adaptor Hypothesis of Crick

At first it seemed simplest to believe that the RNA templates for protein synthesis were folded up to create cavities on their outer surfaces specific for the 20 different amino acids. The cavities would be so shaped that only one given amino acid would fit, and in this way RNA would provide the information to order amino acids during protein synthesis. By 1955, however, Crick became disenchanted with this conventional wisdom, arguing that it would never work. In the first place, the specific chemical groups on the four bases of RNA (A, U, G, and C) should mostly interact with water-soluble groups. Yet, the specific side groups of many amino acids (for example, leucine, valine, and phenylalanine) strongly prefer interactions with water-insoluble (hydrophobic) groups. In the second place, even if somehow RNA could be folded so as to display some hydrophobic surfaces, it seemed at the time unlikely that an RNA template would be used to discriminate accurately between chemically very similar amino acids like glycine and alanine or valine and isoleucine, both pairs differing only by the presence of single methyl (CH3) groups. Crick thus proposed that prior to incorporation into proteins, amino acids are first attached to specific adaptor molecules, which in turn possess unique surfaces that can bind specifically to bases on the RNA templates.
The Test-Tube Synthesis of Proteins

The discovery of how proteins are synthesized required the development of cell-free extracts capable of carrying on the essential synthetic steps. These were first effectively developed beginning in 1953 by Paul C. Zamecnik and his collaborators. Key to their success were the recently available radioactively tagged amino acids, which they used to mark the trace amounts of newly made proteins, as well as high-quality, easy-to-use, preparative ultracentrifuges for fractionation of their cellular extracts. Early on, the cellular site of protein synthesis was pinpointed to be the ribosomes, small RNA-containing particles in the cytoplasm of all cells actively engaged in protein synthesis (Figure 2-13).

Several years later, Zamecnik, by then collaborating with Mahlon B. Hoagland, went on to make the seminal discovery that prior to their incorporation into proteins, amino acids are first attached to what we now call transfer RNA (tRNA) molecules by a class of enzymes called aminoacyl synthetases. Transfer RNA accounts for some 10% of all cellular RNA (Figure 2-14). To nearly everyone except Crick, this discovery was totally unexpected. He had, of course, previously speculated that his proposed "adaptors" might be short RNA chains, since their bases would be able to base-pair with appropriate groups on the RNA molecules that served as the templates for protein synthesis. As we shall relate later in greater detail (Chapter 14), the transfer RNA molecules of Zamecnik and Hoagland are in fact the adaptor molecules postulated by Crick. Each transfer RNA contains a sequence of adjacent bases (the anticodon) that bind specifically during protein synthesis to successive groups of bases (codons) along the RNA templates.

FIGURE 2-13 Electron micrograph of ribosomes attached to the endoplasmic reticulum. This electron micrograph (105,000x) shows a portion of a pancreatic cell. The upper right portion shows a portion of the mitochondrion and the lower left shows a large number of ribosomes attached to the endoplasmic reticulum. Some ribosomes exist free in the cytoplasm; others are attached to the membranous endoplasmic reticulum. (Source: Courtesy of K. R. Porter.)
The Paradox of the Nonspecific-Appearing Ribosomes

About 85% of cellular RNA is found in ribosomes, and since its absolute amount is greatly increased in cells engaged in large-scale protein synthesis (for example, pancreas and liver cells and rapidly growing bacteria), ribosomal RNA (rRNA) was initially thought to be the template for ordering amino acids. But once the ribosomes of E. coli were carefully analyzed, several disquieting features emerged. First, all E. coli ribosomes, as well as those from all other organisms, are composed of two unequally sized subunits, each containing RNA, that either stick together or fall apart in a reversible manner, depending on the surrounding ion concentration. Second, all the rRNA chains within the small subunits are of similar chain lengths (about 1,500 bases in E. coli), as are the rRNA chains of the large subunits (about 3,000 bases). Third, the base composition of both the small and large rRNA chains is approximately the same (high in G and C) in all known bacteria, plants, and animals, despite wide variations in the AT/GC ratios of their respective DNA. This was not to be expected if the rRNA chains were in fact a large collection of different RNA templates made off a large number of different genes. Thus, neither the small nor large class of rRNA had the feel of template RNA.
Discovery of Messenger RNA (mRNA)

Cells infected with phage T4 provided the ideal system to find the true template. Following infection by this virus, cells stop synthesizing E. coli RNA; the only RNA synthesized is transcribed off the T4 DNA. Most strikingly, not only does T4 RNA have a base composition very similar to T4 DNA, but it does not bind to the ribosomal proteins that normally associate with rRNA to form ribosomes. Instead, after first attaching to previously existing ribosomes, T4 RNA moves across their surface to bring its bases into positions where they can bind to the appropriate tRNA-amino acid precursors for protein synthesis (Figure 2-15). In so acting, T4 RNA orders the amino acids and is thus the long-sought-for RNA template for protein synthesis. Because it carries the information from DNA to the ribosomal sites of protein synthesis, it is called messenger RNA (mRNA). The observation of T4 RNA binding to E. coli ribosomes, first made in the spring of 1960, was soon followed with evidence for a separate messenger class of RNA within uninfected E. coli cells, thereby definitively ruling out a template role for any rRNA. Instead, in ways that we shall discuss more extensively in Chapter 14, the rRNA components of ribosomes, together with some 50 different ribosomal proteins that bind to them, serve as the factories for protein synthesis, functioning to bring together the tRNA-amino acid precursors into positions where they can read off the information provided by the messenger RNA templates.

Only some 4% of total cellular RNA is mRNA. This RNA shows the expected large variations in length, depending on the polypeptides for which it codes. Hence, it is easy to understand why mRNA was first overlooked. Because only a small segment of mRNA is attached at a given moment to a ribosome, a single mRNA molecule can simultaneously be read by several ribosomes. Most ribosomes are found as parts of polyribosomes (groups of ribosomes translating the same mRNA), which can include more than 50 members (Figure 2-16).
FIGURE 2-14 Yeast alanine tRNA structure, as determined by Robert W. Holley and his associates. The anticodon in this tRNA recognizes the codon for alanine in the mRNA. Several modified nucleosides exist in the structure: ψ = pseudouridine, T = ribothymidine, DHU = 5,6-dihydrouridine, I = inosine, m1G = 1-methylguanosine, m1I = 1-methylinosine, and m2G = N,N-dimethylguanosine.

FIGURE 2-15 Transcription and translation. The nucleotides of mRNA are assembled to form a complementary copy of one strand of DNA. Each group of three is a codon that is complementary to a group of three nucleotides in the anticodon region of a specific tRNA molecule. When base pairing occurs, an amino acid carried at the other end of the tRNA molecule is added to the growing protein chain.

Enzymatic Synthesis of RNA upon DNA Templates

As messenger RNA was being discovered, the first of the enzymes that transcribe RNA off DNA templates was being independently isolated in the labs of biochemists Jerard Hurwitz and Sam B. Weiss. Called RNA polymerases, these enzymes function only in the presence of DNA, which serves as the template upon which single-stranded RNA chains are made, and use the nucleotides ATP, GTP, CTP, and UTP as precursors (Figure 2-17). In bacteria, the same enzyme makes each of the major RNA classes (ribosomal, transfer, and messenger), using appropriate segments of chromosomal DNA as their templates. Direct evidence that DNA lines up the correct ribonucleotide precursors came from seeing how the RNA base composition varied with the addition of DNA molecules of different AT/GC ratios. In every enzymatic synthesis, the RNA AU/GC ratio was roughly similar to the DNA AT/GC ratio (Table 2-2).
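The comparison summarized in Table 2-2 can be checked with a few lines of arithmetic. The sketch below recomputes the (A+U)/(G+C) ratio of each synthesized RNA from its base composition and prints it beside the (A+T)/(G+C) ratio of the DNA template. The numbers are read from Table 2-2; the variable and function names are made up for this illustration.

```python
# RNA base fractions (A, U, G, C) and the template's (A+T)/(G+C) ratio, as read from Table 2-2.
TABLE_2_2 = {
    "T2": ((0.31, 0.34, 0.18, 0.17), 1.84),
    "Calf thymus": ((0.31, 0.29, 0.19, 0.21), 1.35),
    "Escherichia coli": ((0.24, 0.24, 0.26, 0.26), 0.97),
    "Micrococcus lysodeikticus": ((0.17, 0.16, 0.33, 0.34), 0.39),
}


def au_gc_ratio(a, u, g, c):
    """(A+U)/(G+C) ratio of an RNA with the given base fractions."""
    return (a + u) / (g + c)


for source, (rna_bases, dna_ratio) in TABLE_2_2.items():
    rna_ratio = au_gc_ratio(*rna_bases)
    print(f"{source}: RNA (A+U)/(G+C) = {rna_ratio:.2f}, DNA (A+T)/(G+C) = {dna_ratio:.2f}")
```

The recomputed RNA ratios rise and fall with the DNA ratios across the four templates, as expected if the DNA lines up the ribonucleotide precursors by base pairing.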
FIGURE 2-16 Diagram of a polyribosome. Each ribosome attaches at a start signal at the 5' end of an mRNA chain and synthesizes a polypeptide as it proceeds along the molecule. Several ribosomes may be attached to one mRNA molecule at one time; the entire assembly is called a polyribosome. (Labels: growing polypeptide, complete polypeptide, stop, ribosome subunits released.)

FIGURE 2-17 Enzymatic synthesis of RNA upon a DNA template, catalyzed by RNA polymerase. (Labels: RNA strand, site of nucleotide addition.)
During transcription, only one of the two strands of DNA is used as a template to make RNA. This makes sense, because the messages carried by the two strands, being complementary but not identical, are expected to code for completely different polypeptides. The synthesis of RNA always proceeds in a fixed direction, beginning at the 5' end and concluding with the 3'-end nucleotide (see Figure 2-17).

By this time, there was firm evidence for the postulated movement of RNA from the DNA-containing nucleus to the ribosome-containing cytoplasm. By briefly exposing cells to radioactively labeled precursors, then adding a large excess of unlabeled precursors (a "pulse chase" experiment), mRNA synthesized during a short time window was labeled. These studies showed that mRNA is synthesized in the nucleus. Within an hour, most of this RNA had left the nucleus to be observed in the cytoplasm (Figure 2-18).
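The two rules of transcription stated above, that only one DNA strand serves as template and that RNA grows from its 5' end toward its 3' end, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the twelve-base sequence is invented, the function name is arbitrary, and the enzyme is reduced to base pairing alone.

```python
# Base added to the RNA chain opposite each base of the DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3):
    """Return the RNA, written 5'->3', made on a template strand written 5'->3'.

    The template is read from its 3' end, so the RNA grows 5'->3'
    and runs antiparallel to the template.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5_to_3))


# A hypothetical duplex: the two strands are complementary and antiparallel.
strand_one = "ATGTTTGGCTAA"
strand_two = "TTAGCCAAACAT"

print(transcribe(strand_two))  # AUGUUUGGCUAA
print(transcribe(strand_one))  # UUAGCCAAACAU
```

The two possible transcripts are complementary to each other but not identical, so each would carry a different message, which is why only one strand of a given gene is used as the template.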
TABLE 2-2 Comparison of the Base Composition of Enzymatically Synthesized RNAs with the Base Composition of Their Double-Helical DNA Templates

                                          Composition of the RNA Bases         (A+U)/(G+C)   (A+T)/(G+C)
Source of DNA Template                    Adenine  Uracil  Guanine  Cytosine   Observed      in DNA
T2                                        0.31     0.34    0.18     0.17       1.86          1.84
Calf thymus                               0.31     0.29    0.19     0.21       1.50          1.35
Escherichia coli                          0.24     0.24    0.26     0.26       0.92          0.97
Micrococcus lysodeikticus (a bacterium)   0.17     0.16    0.33     0.34       0.49          0.39

Establishing the Genetic Code

Given the existence of 20 amino acids but only four bases, groups of several nucleotides must somehow specify a given amino acid. Groups of two, however, would specify only 16 (4 × 4) amino acids. So from 1954, the start of serious thinking about what the genetic code might be like, most attention was given to how triplets (groups of three) might work, even though they obviously would provide more permutations (4 × 4 × 4) than needed if each amino acid was specified by only a single triplet. The assumption of colinearity was then very important. It held that successive groups of nucleotides along a DNA chain code for successive amino acids along a given polypeptide chain. That colinearity does in fact exist was shown by elegant mutational analysis on bacterial proteins, carried out in the early 1960s by Charles Yanofsky and Sydney Brenner. Equally important were the genetic analyses by Brenner and Crick, which in 1961 first established that groups of three nucleotides are used to specify individual amino acids.

But which specific groups of three bases (codons) determine which specific amino acids could only be learned by biochemical analysis. The major breakthrough came when Marshall Nirenberg and Heinrich Matthaei, then working together, observed in 1961 that the addition of the synthetic polynucleotide poly U (UUUUU ...) to a cell-free system capable of making proteins leads to the synthesis of polypeptide chains containing only the amino acid phenylalanine. The nucleotide groups UUU thus must specify phenylalanine. Use of increasingly more complex, defined polynucleotides as synthetic messenger RNAs rapidly led to the identification of more and more codons. Particularly important in completing the code was the use of polynucleotides like AGUAGU, put together by organic chemist Har Gobind Khorana. These further defined polynucleotides were critical to test more specific sets of codons. Completion of the code in 1966 revealed that 61 out of the 64 possible permuted groups corresponded to amino acids, with most amino acids being encoded by more than one nucleotide triplet (Table 2-3).
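The counting argument for triplets can be made explicit. The following sketch, with an illustrative function name and defaults chosen for this example, finds the shortest group of nucleotides that offers at least as many combinations as there are amino acids.

```python
def min_codon_length(n_amino_acids=20, n_bases=4):
    """Smallest group size k for which n_bases ** k is at least n_amino_acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


for k in (1, 2, 3):
    print(f"groups of {k}: {4 ** k} possible combinations")
print("shortest workable group size:", min_codon_length())
```

Groups of two give only 16 combinations, fewer than the 20 amino acids, whereas groups of three give 64, leaving room for several codons per amino acid.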
FIGURE 2-18 Demonstration that RNA is synthesized in the nucleus and moves to the cytoplasm. (a) Autoradiograph of a cell (Tetrahymena) exposed to radioactive cytidine for 15 minutes. Superimposed on a photograph of a thin section of the cell is a photograph of an exposed silver emulsion. Each dark spot represents the path of an electron emitted from a 3H (tritium) atom that has been incorporated into RNA. Almost all the newly made RNA is found within the nucleus. (b) Autoradiograph of a similar cell exposed to radioactive cytidine for 12 minutes and then allowed to grow for 88 minutes in the presence of nonradioactive cytidine. Practically all the label incorporated into RNA in the first 12 minutes has left the nucleus and moved into the cytoplasm. (Source: Courtesy of D.M. Prescott, University of Colorado Medical School; reproduced from 1964. Progr. Nucleic Acid Res. III: 35, with permission.)
TABLE 2-3 The Genetic Code

                            second position
first                                                               third
position       U            C            A            G           position

   U        UUU Phe      UCU Ser      UAU Tyr      UGU Cys           U
            UUC Phe      UCC Ser      UAC Tyr      UGC Cys           C
            UUA Leu      UCA Ser      UAA stop     UGA stop          A
            UUG Leu      UCG Ser      UAG stop     UGG Trp           G

   C        CUU Leu      CCU Pro      CAU His      CGU Arg           U
            CUC Leu      CCC Pro      CAC His      CGC Arg           C
            CUA Leu      CCA Pro      CAA Gln      CGA Arg           A
            CUG Leu      CCG Pro      CAG Gln      CGG Arg           G

   A        AUU Ile      ACU Thr      AAU Asn      AGU Ser           U
            AUC Ile      ACC Thr      AAC Asn      AGC Ser           C
            AUA Ile      ACA Thr      AAA Lys      AGA Arg           A
            AUG Met      ACG Thr      AAG Lys      AGG Arg           G

   G        GUU Val      GCU Ala      GAU Asp      GGU Gly           U
            GUC Val      GCC Ala      GAC Asp      GGC Gly           C
            GUA Val      GCA Ala      GAA Glu      GGA Gly           A
            GUG Val      GCG Ala      GAG Glu      GGG Gly           G
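As an illustration of how Table 2-3 is used, the sketch below stores the code as a lookup table and reads an RNA three bases at a time. It is a minimal sketch, not a model of the ribosome: one-letter amino acid symbols are used for compactness (F is phenylalanine, M is methionine, and so on), "*" marks the three stop codons, and reading simply begins at the first base supplied. The names in the code are chosen for this example.

```python
from itertools import product

BASES = "UCAG"
# Amino acids for all 64 codons, ordered by first, second, then third base,
# each running U, C, A, G; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read successive triplets from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


sense_codons = sum(1 for amino_acid in CODON_TABLE.values() if amino_acid != "*")
print("codons that specify amino acids:", sense_codons)
print("poly U ->", translate("UUUUUUUUU"))
print("poly AGU ->", translate("AGUAGUAGU"))
```

Poly U is read as a run of phenylalanine (F), matching the Nirenberg and Matthaei result, and the table holds 61 codons for amino acids plus 3 stop codons.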
ESTABLISHING THE DIRECTION OF PROTEIN SYNTHESIS

The nature of the genetic code, once determined, led to further questions about how a polynucleotide chain directs the synthesis of a polypeptide. As we have seen here and shall discuss in more detail in Chapter 6, polynucleotide chains (both DNA and RNA) are synthesized in a 5' → 3' direction. But what about the growing polypeptide chain? Is it assembled in an amino-terminal to carboxyl-terminal direction, or the opposite? This question was answered in a classic experiment in which a cell-free system was used for carrying out protein synthesis. The cell-free system was created using an extract from immature red blood cells (known as reticulocytes) from a rabbit, which are efficient factories for the synthesis of the α- and β-globin subunits of hemoglobin. The cell-free system was treated with a radioactive amino acid for a very few seconds (less than the time required to synthesize a complete globin chain) after which protein synthesis was immediately stopped. A brief radioactive labeling regime of this kind is known as a pulse or pulse-labeling. Next, globin chains that had completed their growth during the period of the pulse-labeling were separated from incomplete chains by gel electrophoresis (Chapter 20). The full-length polypeptides were then treated with an enzyme, the protease trypsin, that cleaves proteins at particular sites in the polypeptide chain, thereby generating a series of peptide fragments. In the final step of the experiment, the amount of radioactivity that had been incorporated into each peptide fragment was measured (Figure 2-19).
FIGURE 2-19 Incorporation of label into a growing polypeptide chain. The experimental details are described in the text. (a) Distribution of radioactivity among completed chains after a short period of labeling. (b) Incorporation of label plotted as a function of position of the peptide within the completed chain (from the NH2 end to the COOH end).
Keep in mind that the globin chains were at various stages of completion during the period of the pulse (Figure 2-19a). Thus, nascent chains that had only just started to be synthesized would be unlikely to have reached completion during the period of the pulse because the time of the pulse-labeling was less than the time required to synthesize a complete globin chain. On the other hand, globin chains that were almost full length would be highly likely to have reached completion during the pulse. Also, keep in mind that only chains that had reached full length during the time of the pulse were isolated and subjected to trypsin treatment. It therefore follows that the trypsin-generated peptides with the least amount of radioactive amino acid (normalized to the size of the peptide) should have derived from regions of the globin protein that were the first to be synthesized. Conversely, peptides with the greatest amount of radioactivity should have derived from regions of the protein that were the last to be synthesized.

The results of the experiment are shown in Figure 2-19b. As you can see, radioactive labeling was lowest for peptides from the amino-terminal region of globin and greatest for peptides from the carboxyl-terminal region. We therefore conclude that the direction of protein synthesis is from the amino-terminus to the carboxyl-terminus. In other words, during protein synthesis the first amino acid to be incorporated into the nascent chain is the amino acid at the amino-terminal end of the protein and the last to be incorporated is at the carboxyl-terminus.
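The logic of the pulse-labeling experiment can be reproduced with a toy model. The sketch below rests on stated simplifications that are not part of the original experiment: chains grow one residue per time step from the amino terminus, nascent chains of every length are equally common when the pulse begins, every residue added during the pulse is labeled, and the pulse is shorter than the time needed to make a whole chain. The function and parameter names are invented for the illustration.

```python
def label_profile(chain_length, pulse_length):
    """Fraction of completed chains carrying label at each position.

    Position 0 is the amino terminus. Assumes 1 <= pulse_length < chain_length.
    """
    # A chain with n residues already made when the pulse begins can finish
    # during the pulse only if its remaining chain_length - n residues fit
    # within pulse_length, that is, if n >= chain_length - pulse_length.
    completed_starts = range(chain_length - pulse_length, chain_length)
    profile = []
    for position in range(chain_length):
        # The residue at this position is labeled if it was added during the pulse.
        labeled = sum(1 for n in completed_starts if n <= position)
        profile.append(labeled / len(completed_starts))
    return profile


print(label_profile(chain_length=10, pulse_length=4))
```

In this model the label is absent near the amino terminus and rises steadily toward the carboxyl terminus, which is the shape of the curve in Figure 2-19b and the basis for the conclusion about the direction of synthesis.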
Start and Stop Signals Are Also Encoded within DNA

Initially, it was guessed that translation of an mRNA molecule would commence at one end and finish when the entire mRNA message had been read into amino acids. But, in fact, translation both starts and stops at internal positions. Thus, signals must be present within DNA (and its mRNA products) to initiate and terminate translation. First to be worked out were the stop signals. Three separate codons (UAA, UAG, and UGA), first known as nonsense codons, do not direct the addition of a particular amino acid. Instead, these codons serve as translational stop signals (sometimes called stop codons). More complicated is the way translational start signals are encoded. The amino acid methionine starts all polypeptide chains, but the triplet (AUG) that codes for these initiating methionines also codes for methionine residues that have internal locations. The AUG codons at which polypeptide chains start are preceded by specific purine-rich blocks of nucleotides that serve to attach mRNA to ribosomes (see Chapter 14).
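How internal start and stop signals delimit a coding region can be sketched in a few lines. This is a deliberate simplification: the code takes the first AUG it finds as the start, whereas the text notes that true start codons are marked by a preceding purine-rich block, which is not modeled here. The example sequence and the function name are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_open_reading_frame(mrna):
    """Return the codons from the first AUG up to, but not including, the first
    in-frame stop codon; return None if there is no AUG or no in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return None  # ran off the end of the message without meeting a stop codon


print(find_open_reading_frame("CCAUGUUUGGCUAACC"))  # ['AUG', 'UUU', 'GGC']
```

The bases before the AUG and after the UAA are present in the message but are not read into amino acids, which is the sense in which translation starts and stops at internal positions.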
THE ERA OF GENOMICS

With the elucidation of the central dogma, it became clear by the mid-1960s how the genetic blueprint contained in the nucleotide sequence could determine phenotype. This meant that profound insights into the nature of living things and their evolution would be revealed from DNA sequences. In recent years the advent of rapid, automated DNA sequencing methods has led to the determination of complete genome sequences for a wide variety of organisms. Even the human genome, a single copy of which is composed of more than 3 billion base pairs, has been elucidated and shown to contain more than 30,000 genes. During the upcoming years, many more complete genome assemblies will be available from a broad spectrum of organisms, including poplars, sponges, jellyfish, crustaceans, sea urchins, frogs, and dogs.

In the future it should be possible to extend the interpretation of genome sequences beyond the identification of genes and their encoded proteins. Other classes of DNA sequences mediate replication, chromosome pairing, recombination, and gene regulation. It is possible to envision a day when comparative DNA sequence analysis will reveal basic insights into the origins of complex behavior in humans, such as the acquisition of language, as well as the mechanisms underlying the evolutionary diversification of animal body plans.

The purpose of the forthcoming chapters is to provide a firm foundation for understanding how DNA functions as the template for biological complexity. The remaining chapters in Part 1 review the basic chemistry and biology relevant to the main themes of this book. Part 2, Maintenance of the Genome, describes the structure of the genetic material and its faithful duplication. Part 3, Expression of the Genome, shows how the genetic instructions contained in DNA are converted into proteins. Part 4, Regulation, describes strategies for differential gene activity that are used to generate complexity within organisms (for example, embryogenesis) and diversity among organisms (for example, evolution). Finally, Part 5, Methods, describes various laboratory techniques, bioinformatics approaches, and model systems that are commonly used to investigate biological problems.
SUMMARY
The discovery that DNA is the genetic material can be traced to experiments performed by Griffith, who showed that nonvirulent strains of bacteria could be genetically transformed with a substance derived from a heat-killed pathogenic strain. Avery, McCarty, and MacLeod subsequently demonstrated that the transforming substance was DNA. Further evidence that DNA is the genetic material was obtained by Hershey and Chase in experiments with radio-labeled bacteriophage. Building on Chargaff's rules and Franklin and Wilkins' X-ray diffraction studies, Watson and Crick proposed a double-helical structure of DNA. In this model, two polynucleotide chains are twisted around each other to form a regular double helix. The two chains within the double helix are held together by hydrogen bonds between pairs of bases. Adenine is always joined to thymine, and guanine is always bonded to cytosine. The existence of the base pairs means that the sequences of nucleotides along the two chains are not identical, but complementary. The finding of this relationship suggested a mechanism for the replication of DNA in which each strand serves as a template for its complement. Proof for this hypothesis came from (a) the observation of Meselson and Stahl that the two strands of each double helix separate during each round of DNA replication, and (b) Kornberg's discovery of an enzyme that uses single-stranded DNA as a template for the synthesis of a complementary strand.

As we have seen, according to the "central dogma" information flows from DNA to RNA to protein. This transformation is achieved in two steps. First, DNA is transcribed into an RNA intermediate (messenger RNA), and second, the mRNA is translated into protein. Translation of the mRNA requires RNA adaptor molecules called tRNAs. The key characteristic of the genetic code is that each triplet codon is recognized by a tRNA, which is associated with a cognate amino acid. Out of 64 (4 × 4 × 4) potential codons, 61 are used to specify the 20 amino acid building blocks of proteins, whereas 3 are used to provide chain-terminating signals. Knowledge of the genetic code allows us to predict protein coding sequences from DNA sequences. The advent of rapid DNA sequencing methods has ushered in a new era of genomics, in which complete genome sequences are being determined for a wide variety of organisms, including humans.
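The codon arithmetic in the summary can be checked with a short sketch. It assumes the standard genetic code, in which the three chain-terminating codons are UAA, UAG, and UGA; the variable names are illustrative only.

```python
from itertools import product

BASES = "UCAG"                       # the four RNA bases
STOP_CODONS = {"UAA", "UAG", "UGA"}  # chain-terminating codons (standard code, assumed here)

# Every ordered triplet of bases is a potential codon: 4 x 4 x 4 of them.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that are not stop signals specify amino acids.
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]

print(len(codons), "potential codons")
print(len(sense_codons), "codons specifying amino acids")
print(len(STOP_CODONS), "chain-terminating codons")
```

Because 61 codons specify only 20 amino acids, most amino acids must be specified by more than one codon.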
CHAPTER

The Importance of Weak Chemical Interactions

The macromolecules that will preoccupy us throughout this book, and those of most concern to molecular biologists, are proteins and nucleic acids. These are made of amino acids and nucleotides respectively, and in both cases the constituents are joined by covalent bonds to make polypeptide (protein) and polynucleotide (nucleic acid) chains. Covalent bonds are strong, stable bonds, and essentially never break spontaneously within biological systems. But weaker bonds also exist, and indeed are vital for life, partly because they can form and break under the physiological conditions present within cells. Weak bonds mediate the interactions between enzymes and their substrates, and between macromolecules: most strikingly, as we shall see in later chapters, between proteins and DNA, or proteins and other proteins. But equally important, weak bonds also mediate interactions between different parts of individual macromolecules, determining the shape of those molecules and hence their biological function. Thus, although a protein is a linear chain of covalently linked amino acids, its shape and function are determined by the stable three-dimensional structure it adopts. That shape is determined by a large collection of individually weak interactions that form between amino acids that do not need to be adjacent in the primary sequence. Likewise, it is the weak, noncovalent bonds that hold the two chains of a DNA double helix together.

In this chapter we consider the nature of chemical bonds, concentrating in large part on the weak bonds so vital to the proper function of all biological macromolecules. In particular we describe what it is that gives weak bonds their weak character. These bonds include van der Waals bonds, hydrophobic bonds, hydrogen bonds, and ionic bonds.
OUTLINE
• Characteristics of Chemical Bonds
• The Concept of Free Energy
• Weak Bonds in Biological Systems
CHARACTERISTICS OF CHEMICAL BONDS
A chemical bond is an attractive force that holds atoms together. Aggregates of finite size are called molecules. Originally, it was thought that only covalent bonds hold atoms together in molecules; now, weaker attractive forces are known to be important in holding together many macromolecules. For example, the four polypeptide chains of hemoglobin are held together by the combined action of several weak bonds. It is thus now customary also to call weak positive interactions chemical bonds, even though they are not strong enough, when present singly, to effectively bind two atoms together.

Chemical bonds are characterized in several ways. An obvious characteristic of a bond is its strength. Strong bonds almost never fall apart at physiological temperatures. This is why atoms united by covalent bonds always belong to the same molecule. Weak bonds are easily broken, and when they exist singly, they exist fleetingly. Only when present in ordered groups do weak bonds last a long time. The strength of a bond is correlated with its length, so that two atoms connected by a strong bond are always closer together than the same two atoms held together by a weak bond. For example, two hydrogen atoms bound covalently to form a hydrogen molecule (H:H) are 0.74 Å apart, whereas the same two atoms held together by van der Waals forces are 1.2 Å apart.

Another important characteristic is the maximum number of bonds that a given atom can make. The number of covalent bonds that an atom can form is called its valence. Oxygen, for example, has a valence of two: It can never form more than two covalent bonds. There is more variability in the case of van der Waals bonds, in which the limiting factor is purely steric. The number of possible bonds is limited only by the number of atoms that can touch each other simultaneously. The formation of hydrogen bonds is subject to more restrictions. A covalently bonded hydrogen atom usually participates in only one hydrogen bond, whereas an oxygen atom seldom participates in more than two hydrogen bonds.

The angle between two bonds originating from a single atom is called the bond angle. The angle between two specific covalent bonds is always approximately the same. For example, when a carbon atom has four single covalent bonds, they are directed tetrahedrally (bond angle = 109°). In contrast, the angles between weak bonds are much more variable.

Bonds differ also in the freedom of rotation they allow. Single covalent bonds permit free rotation of bound atoms (Figure 3-1), whereas double and triple bonds are quite rigid. Bonds with partial double-bond character, such as the peptide bond, are also quite rigid. For that reason, the carbonyl (C=O) and imino (N=C) groups bound together by the peptide bond must lie in the same plane (Figure 3-2). Much weaker ionic bonds, on the other hand, impose no restrictions on the relative orientations of bonded atoms.

FIGURE 3-1 Rotation about the C5-C6 bond in glucose. This carbon-carbon bond is a single bond, and so any of the three configurations, (a), (b), or (c), may occur.

FIGURE 3-2 The planar shape of the peptide bond. Shown here is a portion of an extended polypeptide chain. Almost no rotation is possible about the peptide bond because of its partial double-bond character (see middle panel). All the atoms in the gray area must lie in the same plane. Rotation is possible, however, around the remaining two bonds, which make up the polypeptide configurations. (Source: Adapted from Pauling L. 1960. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry, 3rd edition, p. 495. Copyright © 1960 Cornell University. Used by permission of the publisher.)

Chemical Bonds Are Explainable in Quantum-Mechanical Terms
The nature of the forces, both strong and weak, that give rise to chemical bonds remained a mystery to chemists until the quantum theory of the atom (quantum mechanics) was developed in the 1920s. Then, for the first time, the various empirical laws about how chemical bonds are formed were put on a firm theoretical basis. It was realized that all chemical bonds, weak as well as strong, are based on electrostatic forces. Quantum mechanics provided explanations for covalent bonding by the sharing of electrons and also for the formation of weaker bonds.

Chemical-Bond Formation Involves a Change in the Form of Energy
The spontaneous formation of a bond between two atoms always involves the release of some of the internal energy of the unbonded atoms and its conversion to another energy form. The stronger the bond, the greater the amount of energy released upon its formation. The bonding reaction between two atoms A and B is thus described by
$$A + B \longrightarrow AB + \text{energy}$$ [Equation 3-1]
where AB represents the bonded aggregate. The rate of the reaction is directly proportional to the frequency of collision between A and B. The unit most often used to measure energy is the calorie, the amount of energy required to raise the temperature of 1 gram of water from 14.5 °C to 15.5 °C.

[Equation 3-2]
The amount of energy that must be added to break a bond is exactly equal to the amount that was released upon formation of the bond. This equivalence follows from the first law of thermodynamics, which states that energy (except as it is interconvertible with mass) can be neither made nor destroyed.
Equilibrium between Bond Making and Breaking
Every bond is thus a result of the combined actions of bond-making and bond-breaking forces. When an equilibrium is reached in a closed system, the number of bonds forming per unit time will equal the number of bonds breaking. Then the proportion of bonded atoms is described by the following mass-action formula:

$$K_{eq} = \frac{\mathrm{conc}^{AB}}{\mathrm{conc}^{A} \times \mathrm{conc}^{B}}$$ [Equation 3-3]

where $K_{eq}$ is the equilibrium constant, and $\mathrm{conc}^{A}$, $\mathrm{conc}^{B}$, and $\mathrm{conc}^{AB}$ are the concentrations of A, B, and AB, respectively, in moles per liter. Whether we start with only free A and B, with only the molecule AB, or with a combination of AB and free A and B, at equilibrium the proportions of A, B, and AB will reach the concentrations given by $K_{eq}$.
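The independence of the equilibrium state from the starting mixture can be illustrated numerically. The following is a minimal sketch, not a general solver: it assumes the single reaction A + B ⇌ AB in a closed system, and the function name, the value of the equilibrium constant, and the starting concentrations are all hypothetical choices made for illustration.

```python
import math

def equilibrium(k_eq, a0, b0, ab0):
    """Return equilibrium (conc_A, conc_B, conc_AB) in mol/L for A + B <-> AB.

    a0, b0, ab0 are the starting concentrations of free A, free B, and AB.
    """
    a_total = a0 + ab0  # all A in the system, free or bonded
    b_total = b0 + ab0  # all B in the system, free or bonded
    # With x = conc_AB, K = x / ((a_total - x)(b_total - x)) rearranges to
    # K x^2 - (K (a_total + b_total) + 1) x + K a_total b_total = 0.
    s = k_eq * (a_total + b_total) + 1.0
    disc = s * s - 4.0 * k_eq * k_eq * a_total * b_total
    x = (s - math.sqrt(disc)) / (2.0 * k_eq)  # smaller root keeps every concentration non-negative
    return a_total - x, b_total - x, x

K_EQ = 10.0  # hypothetical equilibrium constant, L/mol

# Three starting mixtures holding the same total amounts of A and B.
starts = [
    (1.0, 1.0, 0.0),  # only free A and B
    (0.0, 0.0, 1.0),  # only AB
    (0.5, 0.5, 0.5),  # a combination
]
results = [equilibrium(K_EQ, *start) for start in starts]
for start, (a, b, ab) in zip(starts, results):
    print(start, "->", round(a, 4), round(b, 4), round(ab, 4), "ratio", round(ab / (a * b), 4))
```

All three starting mixtures settle at the same concentrations, and the ratio conc AB / (conc A × conc B) equals the chosen constant in each case.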
THE CONCEPT OF FREE ENERGY
There is always a change in the form of energy as the proportion of bonded atoms moves toward the equilibrium concentration. Biologically, the most useful way to express this energy change is through the physical chemist's concept of free energy, denoted by the symbol G, which honors the great nineteenth-century physicist Josiah Gibbs. We shall not give a rigorous description of free energy in this text nor show how it differs from the other forms of energy. For this, the reader must refer to a chemistry text that discusses the second law of thermodynamics. It must suffice to say here that free energy is energy that has the ability to do work.

The second law of thermodynamics tells us that a decrease in free energy (ΔG is negative) always occurs in spontaneous reactions. When equilibrium is reached, however, there is no further change in the amount of free energy (ΔG = 0). The equilibrium state for a closed collection of atoms is thus the state that contains the least amount of free energy.

The free energy lost as equilibrium is approached is either transformed into heat or used to increase the amount of entropy. We shall not attempt to define entropy here except to say that the amount of entropy is a measure of the amount of disorder. The greater the disorder, the greater the amount of entropy. The existence of entropy means that many spontaneous chemical reactions (those with a net decrease in free energy) need not proceed with an evolution of heat. For example, when sodium chloride (NaCl) is dissolved in water, heat is absorbed rather than released. There is, nonetheless, a net decrease in free energy because of the increase in disorder of the sodium and chlorine ions as they move from a solid to a dissolved state.
$K_{eq}$ Is Exponentially Related to ΔG
Clearly, the stronger the bond, and hence the greater the change in free energy (ΔG) that accompanies its formation, the greater the proportion of atoms that must exist in the bonded form. This commonsense idea is quantitatively expressed by the physical-chemical formula

$$\Delta G = -RT \ln K_{eq} \quad \text{or} \quad K_{eq} = e^{-\Delta G / RT}$$ [Equation 3-4]

where R is the universal gas constant, T is the absolute temperature, ln is the logarithm (of $K_{eq}$) to the base e, $K_{eq}$ is the equilibrium constant, and e = 2.718. Insertion of the appropriate values of R (1.987 cal/deg-mol) and T (298 at 25 °C) tells us that ΔG values as low as 2 kcal/mol can drive a bond-forming reaction to virtual completion if all reactants are present at molar concentrations (Table 3-1).

TABLE 3-1 The Numerical Relationship between the Equilibrium Constant and ΔG at 25 °C

K_eq        ΔG, kcal/mol
0.001        4.089
0.01         2.726
0.1          1.363
1.0          0
10.0        -1.363
100.0       -2.726
1000.0      -4.089
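Equation 3-4 is easy to evaluate directly. The sketch below uses the values of R and T quoted in the text, with R converted to kcal so that ΔG comes out in kcal/mol; the function names are illustrative. The results agree with Table 3-1 to within rounding in the last digit.

```python
import math

R = 1.987e-3  # universal gas constant in kcal/(deg-mol), i.e. 1.987 cal/(deg-mol)
T = 298.0     # absolute temperature at 25 degrees C

def delta_g_from_k(k_eq):
    """Free-energy change in kcal/mol for a given equilibrium constant (Equation 3-4)."""
    return -R * T * math.log(k_eq)

def k_from_delta_g(delta_g):
    """Equilibrium constant for a given free-energy change in kcal/mol (Equation 3-4)."""
    return math.exp(-delta_g / (R * T))

for k_eq in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0):
    print(k_eq, round(delta_g_from_k(k_eq), 3))
```

Each tenfold change in the equilibrium constant corresponds to roughly 1.36 kcal/mol at this temperature, which is why free-energy changes of only a few kcal/mol shift an equilibrium so strongly.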
Covalent Bonds Are Very Strong
The ΔG values accompanying the formation of covalent bonds from free atoms, such as hydrogen or oxygen, are very large and negative in sign, usually -50 to -110 kcal/mol. Equation 3-4 tells us that $K_{eq}$ of the bonding reaction will be correspondingly large, and so the concentration of hydrogen or oxygen atoms existing unbound will be very small. For example, with a ΔG value of -100 kcal/mol, if we start with 1 mol/L of the reacting atoms, only one in 10^40 atoms will remain unbound when equilibrium is reached.
WEAK BONDS IN BIOLOGICAL SYSTEMS
The main types of weak bonds important in biological systems are the van der Waals bonds, hydrophobic bonds, hydrogen bonds, and ionic bonds. Sometimes, as we shall soon see, the distinction between a hydrogen bond and an ionic bond is arbitrary.

Weak Bonds Have Energies between 1 and 7 kcal/mol
The weakest bonds are the van der Waals bonds. These have energies (1 to 2 kcal/mol) only slightly greater than the kinetic energy of heat motion. The energies of hydrogen and ionic bonds range between 3 and 7 kcal/mol. In liquid solutions, almost all molecules form a number of weak bonds to nearby atoms. All molecules are able to form van der Waals bonds, whereas hydrogen and ionic bonds can form only between molecules that have a net charge (ions) or in which the charge is unequally distributed. Some molecules thus have the capacity to form several types of weak bonds. Energy considerations, however, tell us that molecules always have a greater tendency to form the stronger bond.

Weak Bonds Are Constantly Made and Broken at Physiological Temperatures
The energy of the strongest weak bond is only about ten times larger than the average energy of kinetic motion (heat) at 25 °C (0.6 kcal/mol). As there is a significant spread in the energies of kinetic motion, many molecules with sufficient kinetic energy to break the strongest weak bond always exist at physiological temperatures.
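A rough feel for this spread in energies can be obtained from a Boltzmann factor. The sketch below is an order-of-magnitude estimate only: it assumes that the fraction of molecules with thermal energy above E is approximately exp(-E/RT), which is a simplification rather than a treatment given in the text, and the function name is illustrative.

```python
import math

R = 1.987e-3  # gas constant in kcal/(deg-mol)
T = 298.0     # 25 degrees C on the absolute scale

def fraction_above(energy_kcal_per_mol):
    """Approximate fraction of molecules with thermal energy above the given value.

    Assumes a simple Boltzmann factor exp(-E/RT); an order-of-magnitude estimate only.
    """
    return math.exp(-energy_kcal_per_mol / (R * T))

print("RT at 25 degrees C:", round(R * T, 2), "kcal/mol")
for bond_energy in (1.0, 3.0, 7.0):  # the range of weak-bond energies quoted in the text
    print(bond_energy, "kcal/mol ->", fraction_above(bond_energy))
```

Under this assumption only a few molecules per million exceed 7 kcal/mol at any instant, but given the enormous number of molecules in even a tiny volume, that is still a very large absolute number.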
The Distinction between Polar and Nonpolar Molecules
All forms of weak interactions are based on attractions between electric charges. The separation of electric charges can be permanent or temporary, depending on the atoms involved. For example, the oxygen molecule (O:O) has a symmetric distribution of electrons between its two oxygen atoms, so each of its two atoms is uncharged. In contrast, there is a nonuniform distribution of charge in water (H:O:H), in which the bond electrons are unevenly shared (Figure 3-3). They are held more strongly by the oxygen atom, which thus carries a considerable negative charge, whereas the two hydrogen atoms together have an equal amount of positive charge. The center of the positive charge is on one side of the center of the negative charge. A combination of separated positive and negative charges is called an electric dipole moment. Unequal electron sharing reflects dissimilar affinities of the bonding atoms for electrons. Atoms that have a tendency to gain electrons are called electronegative atoms. Electropositive atoms have a tendency to give up electrons.

Molecules (such as H₂O) that have a dipole moment are called polar molecules. Nonpolar molecules are those with no effective dipole moments. In methane (CH₄), for example, the carbon and hydrogen atoms have similar affinities for their shared electron pairs, so neither the carbon nor the hydrogen atom is noticeably charged.

The distribution of charge in a molecule can also be affected by the presence of nearby molecules, particularly if the affected molecule is polar. The effect may cause a nonpolar molecule to acquire a slightly polar character. If the second molecule is not polar, its presence will still alter the nonpolar molecule, establishing a fluctuating charge distribution. Such induced effects, however, give rise to a much smaller separation of charge than is found in polar molecules, resulting in smaller interaction energies and correspondingly weaker chemical bonds.

FIGURE 3-3 The structure of a water molecule. (Labels: van der Waals radius of hydrogen; covalent bond length; van der Waals radius of oxygen; direction of dipole moment.)

FIGURE 3-4 Variation of van der Waals forces with distance. The atoms shown in this diagram are atoms of the inert rare gas argon. (Panels: weak van der Waals attraction; very strong van der Waals attraction; van der Waals attraction just balanced by repulsive forces, owing to interpenetration of outer electron shells.) (Source: Adapted from Pauling L. 1953. General chemistry, 2nd edition, p. 322. Copyright 1953 by W.H. Freeman. Used with permission.)
FIGURE 3-5 Drawings of several molecules with the van der Waals radii of the atoms shown in purple, blue, and orange. (Molecules shown: acetate, glycine, guanine.)

Van der Waals Forces
Van der Waals bonding arises from a nonspecific attractive force originating when two atoms come close to each other. It is based not on the existence of permanent charge separations, but rather on the induced fluctuating charges caused by the nearness of molecules. It therefore operates between all types of molecules, nonpolar as well as polar. It depends heavily on the distance between the interacting groups, since the bond energy is inversely proportional to the sixth power of distance (Figure 3-4).

There also exists a more powerful van der Waals repulsive force, which comes into play at even shorter distances. This repulsion is caused by the overlapping of the outer electron shells of the atoms involved. The van der Waals attractive and repulsive forces balance at a certain distance specific for each type of atom. This distance is the so-called van der Waals radius (Table 3-2 and Figure 3-5). The van der Waals bonding energy between two atoms separated by the sum of their van der Waals radii increases with the size of the respective atoms. For two average atoms, it is only about 1 kcal/mol, which is just slightly more than the average thermal energy of molecules at room temperature (0.6 kcal/mol).
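The balance between attraction and repulsion can be sketched with a simple model potential. The inverse sixth-power attraction comes from the text; the inverse twelfth-power repulsion is an assumption (the conventional Lennard-Jones form), and the contact distance of 3.4 Å and well depth of 1 kcal/mol are hypothetical values chosen to resemble two average atoms.

```python
def lennard_jones(r, r_min, depth):
    """Interaction energy (kcal/mol) of two atoms a distance r apart (angstroms).

    The r**-6 attraction follows the text; the r**-12 repulsion is the
    conventional Lennard-Jones choice and is an assumption of this sketch.
    The energy is lowest, equal to -depth, at r = r_min.
    """
    ratio6 = (r_min / r) ** 6
    return depth * (ratio6 * ratio6 - 2.0 * ratio6)

R_MIN = 3.4  # hypothetical sum of two van der Waals radii, in angstroms
DEPTH = 1.0  # hypothetical bonding energy at contact, in kcal/mol

for r in (3.0, 3.4, 4.0, 5.0, 6.8):
    print(r, round(lennard_jones(r, R_MIN, DEPTH), 4))
```

In this model the energy is most favorable at the contact distance, becomes repulsive when the atoms are pushed closer, and falls to a few percent of its contact value when the separation is doubled.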
This means that van der Waals forces are an effective binding force at physiological temperatures only when several atoms in a given molecule are bound to several atoms in another molecule. Then the energy of interaction is much greater than the dissociating tendency resulting from random thermal movements. For several atoms to interact effectively, the molecular fit must be precise, since the distance separating any two interacting atoms must not be much greater than the sum of their van der Waals radii (Figure 3-6). The strength of interaction rapidly approaches zero when this distance is only slightly exceeded. Thus, the strongest type of van der Waals contact arises when a molecule contains a cavity exactly complementary in shape to a protruding group of another molecule, as is the case with an antigen and its specific antibody (Figure 3-7). In this instance, the binding energies sometimes can be as large as 20 to 30 kcal/mol, so that antigen-antibody complexes seldom fall apart. The bonding pattern of polar molecules is rarely dominated by van der Waals interactions, since such molecules can acquire a lower energy state (lose more free energy) by forming other types of bonds.
TABLE 3-2 Van der Waals Radii of the Atoms in Biological Molecules

Atom                                   van der Waals radius (Å)
H                                      1.2
N                                      1.5
O                                      1.4
P                                      1.9
S                                      1.85
CH₃ group                              2.0
Half thickness of aromatic molecule    1.7

FIGURE 3-6 The arrangement of molecules in a layer of a crystal formed by the amino acid glycine. The packing of the molecules is determined by the van der Waals radii of the groups, except for the N-H···O contacts, which are shortened by the formation of hydrogen bonds. (Source: Adapted from Pauling L. 1960. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry, 3rd edition, p. 262. Copyright © 1960 Cornell University. Used by permission of the publisher.)
Hydrogen Bonds
A hydrogen bond is formed between a covalently bound donor hydrogen atom with some positive charge and a negatively charged, covalently bound acceptor atom (Figure 3-8). For example, the hydrogen atoms of the amino (-NH₂) group are attracted by the negatively charged keto (-C=O) oxygen atoms. Sometimes, the hydrogen-bonded atoms belong to groups with a unit of charge (such as NH₃⁺ or COO⁻). In other cases, both the donor hydrogen atoms and the negative acceptor atoms have less than a unit of charge.

The biologically most important hydrogen bonds involve hydrogen atoms covalently bound to oxygen atoms (O-H) or nitrogen atoms (N-H). Likewise, the negative acceptor atoms are usually nitrogen or oxygen. Table 3-3 lists some of the most important hydrogen bonds. In the absence of surrounding water molecules, bond energies range between 3 and 7 kcal/mol, the stronger bonds involving the greater charge differences between donor and acceptor atoms. Hydrogen bonds are thus weaker than covalent bonds, yet considerably stronger than van der Waals bonds. A hydrogen bond, therefore, will hold two atoms closer together than the sum of their van der Waals radii, but not so close together as a covalent bond would hold them.

Hydrogen bonds, unlike van der Waals bonds, are highly directional. In the strongest hydrogen bonds, the hydrogen atom points directly at the acceptor atom (Figure 3-9). If it points more than 30° away, the bond energy is much less. Hydrogen bonds are also much more specific than van der Waals bonds, since they demand the existence of molecules with complementary donor hydrogen and acceptor groups.

TABLE 3-3 Approximate Bond Lengths of Biologically Important Hydrogen Bonds

Bond          Approximate H bond length (Å)
O-H···O       2.70 ± 0.10
O-H···O⁻      2.63 ± 0.10
O-H···N       2.88 ± 0.13
N-H···O       3.04 ± 0.13
N⁺-H···O      2.93 ± 0.10
N-H···N       3.10 ± 0.13

FIGURE 3-7 Antibody-antigen interaction. The structure shows the complex between Fab D1.3 and lysozyme. (Fischmann T.O., Bentley G.A., Bhat T.N., Boulot G., Mariuzza R.A., Phillips S.E., Tello D., and Poljak R.J. 1991. J. Biol. Chem. 266: 12915.)

FIGURE 3-8 Example of hydrogen bonds in biological molecules. (Labels include: hydrogen bond between two hydroxyl groups; hydrogen bond between a charged amino group and a charged carboxyl group.)

Some Ionic Bonds Are Hydrogen Bonds
Many organic molecules possess ionic groups that contain one or more units of net positive or negative charge. The negatively charged mononucleotides, for example, contain phosphate groups, which are negatively charged, whereas each amino acid (except proline) has a negative carboxyl group (COO⁻) and a positive amino group (NH₃⁺), both of which carry a unit of charge. These charged groups are usually neutralized by nearby, oppositely charged groups. The electrostatic forces acting between the oppositely charged groups are called ionic bonds. Their average bond energy in an aqueous solution is about 5 kcal/mol.

In many cases, either an inorganic cation like Na⁺, K⁺, or Mg²⁺ or an inorganic anion like Cl⁻ or SO₄²⁻ neutralizes the charge of ionized organic molecules. When this happens in aqueous solution, the neutralizing cations and anions do not carry fixed positions because inorganic ions are usually surrounded by shells of water molecules and so do not directly bind to oppositely charged groups. Thus, in water solutions, electrostatic bonds to surrounding inorganic cations or anions are usually not of primary importance in determining the molecular shapes of organic molecules.

On the other hand, highly directional bonds result if the oppositely charged groups can form hydrogen bonds to each other. For example, COO⁻ and NH₃⁺ groups are often held together by hydrogen bonds. Since these bonds are stronger than those that involve groups with less than a unit of charge, they are correspondingly shorter. A strong hydrogen bond can also form between a group with a unit charge and a group having less than a unit charge. For example, a hydrogen atom belonging to an amino group (NH₂) bonds strongly to an oxygen atom of a carboxyl group (COO⁻).
Weak Interactions Demand Complementary Molecular Surfaces
Weak binding forces are effective only when the interacting surfaces are close. This proximity is possible only when the molecular surfaces have complementary structures, so that a protruding group (or positive charge) on one surface is matched by a cavity (or negative charge) on another. That is, the interacting molecules must have a lock-and-key relationship. In cells, this requirement often means that some molecules hardly ever bond to other molecules of the same kind, because such molecules do not have the properties of symmetry necessary for self-interaction. For example, some polar molecules contain donor hydrogen atoms and no suitable acceptor atoms, whereas other molecules can accept hydrogen bonds but have no hydrogen atoms to donate. On the other hand, there are many molecules with the necessary symmetry to permit strong self-interaction in cells. Water is the most important example of this.
Water Molecules Form Hydrogen Bonds
Under physiological conditions, water molecules rarely ionize to form H⁺ and OH⁻ ions. Instead, they exist as polar H-O-H molecules with both the hydrogen and oxygen atoms forming strong hydrogen bonds. In each water molecule, the oxygen atom can bind to two external hydrogen atoms, whereas each hydrogen atom can bind to one adjacent oxygen atom. These bonds are directed tetrahedrally (Figure 3-10), so in its solid and liquid forms, each water molecule tends to have four nearest neighbors, one in each of the four directions of a tetrahedron. In ice, the bonds to these neighbors are very rigid and the arrangement of molecules fixed. Above the melting temperature (0 °C), the energy of thermal motion is sufficient to break the hydrogen bonds and to allow the water molecules to change their nearest neighbors continually. Even in the liquid form, however, at any given instant most water molecules are bound by four strong hydrogen bonds.
FIGURE 3-9 Directional properties of hydrogen bonds. (a) The vector along the covalent O-H bond points directly at the acceptor oxygen, thereby forming a strong bond. (b) The vector points away from the oxygen atom, resulting in a much weaker bond.
Weak Bonds between Molecules in Aqueous Solutions
The average energy of a secondary bond, though small compared to that of a covalent bond, is nonetheless strong enough compared to heat energy to ensure that most molecules in aqueous solution will form secondary bonds to other molecules. The proportion of bonded to nonbonded arrangements is given by Equation 3-4, corrected to take into account the high concentration of molecules in a liquid. It tells us that interaction energies as low as 2 to 3 kcal/mol are sufficient at physiological temperatures to force most molecules to form the maximum number of strong secondary bonds.
The specific structure of a solution at a given instant is markedly influenced by which solute molecules are present, not only because molecules have specific shapes, but also because molecules differ in which types of secondary bonds they can form. Thus, a molecule will tend to move until it is next to a molecule with which it can form the strongest possible bond.
Solutions, of course, are not static. Because of the disruptive influence of heat, the specific configuration of a solution is constantly changing from one arrangement to another of approximately the same energy content. Equally important in biological systems is the fact that metabolism is continually transforming one molecule into another and so automatically changing the nature of the secondary bonds that can be formed. The solution structure of cells is thus constantly disrupted not only by heat motion, but also by the metabolic transformations of the cell's solute molecules.

FIGURE 3-10 Diagram of a lattice. The energy gained by forming specific hydrogen bonds between water molecules favors the arrangement of the molecules in adjacent tetrahedrons. Oxygen atoms are indicated by large circles.

Box 3-1 The Uniqueness of Molecular Shapes and the Concept of Selective Stickiness
Even though most cellular molecules are built up from only a small number of chemical groups, such as OH, NH₂, and CH₃, there is great specificity as to which molecules tend to lie next to each other. This is because each molecule has unique bonding properties. One very clear demonstration comes from the specificity of stereoisomers. For example, proteins are always constructed from L-amino acids, never from their mirror images, the D-amino acids (Box 3-1 Figure 1). Although the D- and L-amino acids have identical covalent bonds, their binding properties to asymmetric molecules are often very different. Thus, most enzymes are specific for L-amino acids. If an L-amino acid is able to attach to a specific enzyme, the D-amino acid is unable to bind.
Most molecules in cells can make good "weak" bonds with only a small number of other molecules, partly because most molecules in biological systems exist in an aqueous environment. The formation of a bond in a cell therefore depends not only on whether two molecules bind well to each other, but also on whether bond formation is overall more favorable than the alternative bonds that can form with solvent water molecules.

BOX 3-1 FIGURE 1 The two stereoisomers of the amino acid alanine (panels: L-alanine, D-alanine). (Source: Adapted from Pauling L. 1960. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry, 3rd edition, p. 465. Copyright © 1960 Cornell University. Used by permission of the publisher. And from Pauling L. 1953. General chemistry, 2nd edition, p. 498. Copyright 1953 by W. H. Freeman. Used with permission.)
Organic Molecules That Tend to Form Hydrogen Bonds Are Water Soluble
The energy of hydrogen bonds per atomic group is much greater than that of van der Waals contacts; thus, molecules will form hydrogen bonds in preference to van der Waals contacts. For example, if we try to mix water with a compound that cannot form hydrogen bonds, such as benzene, the water and benzene molecules rapidly separate from each other, the water molecules forming hydrogen bonds among themselves while the benzene molecules attach to one another by van der Waals bonds. It is therefore impossible to insert a nonhydrogen-bonding organic molecule into water.
On the other hand, polar molecules such as glucose and pyruvate, which contain a large number of groups that form excellent hydrogen bonds (such as =O or OH), are soluble in water (that is, they are hydrophilic as opposed to hydrophobic). While the insertion of such groups into a water lattice breaks water-water hydrogen bonds, it results simultaneously in the formation of hydrogen bonds between the polar organic molecule and water. These alternative arrangements, however, are not usually as energetically satisfactory as the water-water arrangements, so that even the most polar molecules ordinarily have only limited solubility. Thus, almost all the molecules that cells acquire, either through food intake or through biosynthesis, are somewhat insoluble in water. These molecules, by their thermal movements, randomly collide with other molecules until they find complementary molecular surfaces on which to attach and thereby release water molecules for water-water interactions.
Hydrophobic "Bonds" Stabilize Macromolecules
The strong tendency of water to exclude nonpolar groups is frequently referred to as hydrophobic bonding. Some chemists like to call all the bonds between nonpolar groups in a water solution hydrophobic bonds (Figure 3-11). In a sense this term is a misnomer, for the phenomenon that it seeks to emphasize is the absence, not the presence, of bonds. (The bonds that tend to form between the nonpolar groups are due to van der Waals attractive forces.) On the other hand, the term hydrophobic bond is often useful, since it emphasizes the fact that nonpolar groups will try to arrange themselves so that they are not in contact with water molecules. Hydrophobic bonds are important both in the stabilization of proteins and complexes of proteins with other molecules and in the partitioning of proteins into membranes. They may account for as much as one-half the total free energy of protein folding.

FIGURE 3-11 Examples of van der Waals (hydrophobic) bonds between the nonpolar side groups of amino acids. The hydrogens are not indicated individually. For the sake of clarity, the van der Waals radii are reduced by 20%. The structural formulas adjacent to each space-filling drawing indicate the arrangement of the atoms. (a) Phenylalanine-leucine bond. (b) Phenylalanine-phenylalanine bond. (Source: Adapted from Scheraga H.A. The proteins, 2nd edition, p. 527. Copyright © Harold Scheraga. Used with permission.)

Consider, for example, the different amounts of energy generated when the amino acids alanine and glycine are bound, in water, to a third molecule that has a surface complementary to alanine. A methyl group is present in alanine but not in glycine. When alanine is bound to the third molecule, the van der Waals contacts around the methyl group yield 1 kcal/mol of energy, which is not released when glycine is bound instead. From Equation 3-4, we know that this small energy difference alone would give only a factor of 6 between the binding of alanine and glycine. However, this calculation does not take into consideration the fact that water is trying to exclude alanine much more than glycine. The presence of alanine's CH₃ group upsets the water lattice much more seriously than does the hydrogen atom side group of glycine. At present, it is still difficult to predict how large a correction factor must be introduced for this disruption of the water lattice by the hydrophobic side groups. It is likely that the water tends to exclude alanine, thrusting it toward a third molecule, with a hydrophobic force of approximately 2 to 3 kcal/mol larger than the forces excluding glycine.
We thus arrive at the important conclusion that the energy difference between the binding of even the most similar molecules to a third molecule (when the difference between the similar molecules involves a nonpolar group) is at least 2 to 3 kcal/mol greater in the aqueous interior of cells than under nonaqueous conditions. Frequently, the energy difference is 3 to 4 kcal/mol, since the molecules involved often contain polar groups that can form hydrogen bonds.
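The alanine and glycine comparison above turns an energy difference into a ratio of binding strengths. The sketch below shows that conversion in Python, assuming that Equation 3-4 has the form ΔG = −RT ln Keq (the form used later in the book) and assuming a temperature of 25 °C; the function name `binding_ratio` and the gas-constant value are illustrative choices, not taken from the text. Under these assumptions a 1 kcal/mol difference gives a ratio between 5 and 6, the same order as the factor quoted above, while a 3 kcal/mol difference gives a ratio above 100.

```python
import math

# Gas constant in kcal/(mol*K); a standard value, assumed here.
R_KCAL = 1.987e-3


def binding_ratio(ddg_kcal, temp_c=25.0):
    """Ratio of two equilibrium constants whose binding free energies
    differ by ddg_kcal (kcal/mol), assuming Delta G = -RT ln Keq."""
    temp_k = temp_c + 273.15
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    # 1 kcal/mol: van der Waals contacts of the methyl group alone.
    # 3 and 4 kcal/mol: differences once water exclusion is included.
    for ddg in (1.0, 3.0, 4.0):
        print(f"energy difference {ddg:.0f} kcal/mol -> ratio {binding_ratio(ddg):.1f}")
```

The exact ratio shifts with the assumed temperature, which is why the sketch reports a range rather than a single agreed figure.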
The Advantage of ΔG between 2 and 5 kcal/mol
We have seen that the energy of just one secondary bond (2 to 5 kcal/mol) is often sufficient to ensure that a molecule preferentially binds to a selected group of molecules. Moreover, these energy differences are not so large that rigid lattice arrangements develop within a cell; that is, the interior of a cell never crystallizes, as it would if the energy of secondary bonds were several times greater. Larger energy differences would mean that the secondary bonds seldom break, resulting in low diffusion rates incompatible with cellular existence.
Weak Bonds Attach Enzymes to Substrates
Secondary forces are necessarily the basis by which enzymes and their substrates initially combine with each other. Enzymes do not indiscriminately bind all molecules, having noticeable affinity only for their own substrates. Since enzymes catalyze both directions of a chemical reaction, they must have specific affinities for both sets of reacting molecules. In some cases, it is possible to measure an equilibrium constant for the binding of an enzyme to one of its substrates (Equation 3-4), which consequently enables us to calculate the ΔG upon binding. This calculation in turn hints at which types of bonds may be involved. For ΔG values between 5 and 10 kcal/mol, several strong secondary bonds are the basis of specific enzyme-substrate interactions. Also worth noting is that the ΔG of binding is never exceptionally high; thus, enzyme-substrate complexes can be both made and broken apart rapidly as a result of random thermal movement. This explains why enzymes can function quickly, sometimes as often as 10^6 times per second. If enzymes were bound to their substrates, or more importantly to their products, by more powerful bonds, they would act much more slowly.
Weak Bonds Mediate Most Protein:DNA and Protein:Protein Interactions
As we will see throughout the book, interactions between proteins and DNA, and between proteins and other proteins, lie at the heart of how cells detect and respond to signals, express genes, replicate, repair, and recombine their DNA, and so on, as well as how those processes are regulated. Again, these interactions are mediated by weak chemical bonds of the sort we have described in this chapter. Despite the low energy of each individual bond, affinity in these interactions, and specificity as well, results from the combined effects of many such bonds between any two interacting molecules. In Chapter 5 we return to these matters with a detailed look at how proteins are built, how they adopt particular structures, and how they bind DNA and each other.
SUMMARY
Many important chemical events in cells do not involve the making or breaking of covalent bonds. The cellular location of most molecules depends on weak, or secondary, attractive or repulsive forces. In addition, weak bonds are important in determining the shape of many molecules, especially very large ones. The most important of these we [...] the result would be a release of free energy (negative ΔG). For the bond to be broken, this same amount of free energy must be supplied. Because the formation of covalent bonds between atoms usually involves a very large negative ΔG, covalently bound atoms almost never separate spontaneously. In contrast, the ΔG values accompanying the formation of weak bonds are only several times larger than the average thermal energy of molecules at physiological temperatures. Single weak bonds are thus frequently being made and broken in living cells.
Molecules having polar (charged) groups interact quite differently from nonpolar molecules (in which the charge is symmetrically distributed). Polar molecules can form good hydrogen bonds, whereas nonpolar molecules can form only van der Waals bonds. The most important polar molecule is water. Each water molecule can form four hydrogen bonds to other water molecules. Although polar molecules tend to be soluble in water (to various degrees), nonpolar molecules are insoluble because they cannot form hydrogen bonds with water molecules.
Every distinct molecule has a unique molecular shape that restricts the number of molecules with which it can form strong secondary bonds. Strong secondary interactions demand both a complementary (lock-and-key) relationship between the two bonding surfaces and the involvement of many atoms. Although molecules bound together by only one or two secondary bonds frequently fall apart, a collection of these weak bonds can result in a stable aggregate. The fact that double-helical DNA never falls apart spontaneously demonstrates the extreme stability possible in such an aggregate.
BIBLIOGRAPHY
General References
Branden C. and Tooze J. 1999. Introduction to protein structure. Garland Publishing, New York.
Creighton T.E. 1992. Proteins: Structure and molecular properties, 2nd edition. W.H. Freeman, New York.
Creighton T.E. 1983. Proteins. Freeman, San Francisco.
Donohue J. 1968. Selected topics in hydrogen bonding. In Structural chemistry and molecular biology (ed. A. Rich and N. Davidson), pp. 443-465. Freeman, S [...]
[...] ence: A guide to enzyme catalysis and protein folding. W.H. Freeman, New York.
Gray H.B. 1964. Electrons and chemical bonding. Benjamin Cummings, Menlo Park, California.
Klotz I.M. 1967. Energy changes in biochemical reactions. Academic Press, New York.
Kyte J. 1995. Mechanism in protein chemistry. Garland Publishing, New York.
Kyte J. 1995. Structure in protein chemistry. Garland Publishing, New York.
Lehninger A.L. 1971. Bioenergetics, 3rd edition. Benjamin Cummings, Menlo Park, California.
Lesk A. 2000. Introduction to protein architecture: The structural biology of proteins. Oxford University Press, New York.
Marsh R.E. 1968. Some comments on hydrogen bonding in purine and pyrimidine bases. In Structural chemistry and molecular biology (ed. A. Rich and N. Davidson), pp. 485-489. Freeman, San Francisco.
Morowitz H.J. 1970. Entropy for biologists. Academic Press, New York.
Pauling L. 1960. The nature of the chemical bond, 3rd edition. Cornell University Press, Ithaca, New York.
Tinoco I. (ed.), Sauer K., Wang J.C., Puglisi J.D. 2001. Physical chemistry: Principles and applications in life sciences, 4th edition. Prentice Hall College Division, Upper Saddle River, New Jersey.
CHAPTER
The Importance of High-Energy Bonds
In the previous chapter we looked at the formation of weak bonds from the thermodynamic viewpoint. Each time a potential weak bond was considered, the question was posed, Does its formation involve a gain or a loss of free energy? Only when ΔG is negative does the thermodynamic equilibrium favor a reaction. This same approach is equally valid for covalent bonds. The fact that enzymes are usually involved in the making or breaking of a covalent bond does not in any sense alter the requirement of a negative ΔG.
On superficial examination, however, many of the important covalent bonds in cells appear to be formed in violation of the laws of thermodynamics, particularly those bonds joining small molecules together to form large polymeric molecules. The formation of such bonds involves an increase in free energy. Originally, this fact suggested to some people that cells had the unique ability to work in violation of thermodynamics and that this property was, in fact, the real "secret of life."
Now, however, it is clear that these biosynthetic processes do not violate thermodynamics but rather are based on different reactions from those originally postulated. Nucleic acids, for example, do not form by the condensation of nucleoside phosphates; glycogen is not formed directly from glucose residues; proteins are not formed by the union of amino acids. Instead, the monomeric precursors, using energy present in ATP, are first converted to high-energy "activated" precursors, which then spontaneously (with the help of specific enzymes) unite to form larger molecules. In this chapter, we shall illustrate these ideas by concentrating on the thermodynamics of peptide (protein) and phosphodiester (nucleic acid) bonds. First, however, we must briefly look at some general thermodynamic properties of covalent bonds.
OUTLINE
Molecules That Donate Energy Are Thermodynamically Unstable
Enzymes Lower Activation Energies in Biochemical Reactions
Free Energy in Biomolecules
High-Energy Bonds in Biosynthetic Reactions
Activation of Precursors in Group Transfer Reactions
MOLECULES THAT DONATE ENERGY ARE THERMODYNAMICALLY UNSTABLE
There is great variation in the amount of free energy possessed by specific molecules. This is because covalent bonds do not all have the same bond energy. As an example, the covalent bond between oxygen and hydrogen is considerably stronger than the bond between hydrogen and hydrogen, or oxygen and oxygen. The formation of an O-H bond at the expense of O-O or H-H will thus release energy. Energy considerations, therefore, tell us that a sufficiently concentrated mixture of oxygen and hydrogen will be transformed into water.
A molecule thus possesses a larger amount of free energy if linked together by weak covalent bonds than if it is linked together by strong bonds. This idea seems almost paradoxical at first glance since it means that the stronger the bond, the less energy it can give off. But the notion automatically makes sense when we realize that an atom that has formed a very strong bond has already lost a large amount of free energy in this process. Therefore, the best food molecules (molecules that donate energy) are those molecules that contain weak covalent bonds and are therefore thermodynamically unstable.
For example, glucose is an excellent food molecule since there is a great decrease in free energy when it is oxidized by oxygen to yield carbon dioxide and water. On the other hand, carbon dioxide, composed of strong covalent double bonds between carbon and oxygen, known as carbonyl bonds, is not a food molecule in animals. In the absence of the energy donor ATP, carbon dioxide cannot be transformed spontaneously into more complex organic molecules, even with the help of specific enzymes. Carbon dioxide can be used as a primary source of carbon in plants only because the energy supplied by light quanta during photosynthesis results in the formation of ATP.
The chemical reactions by which molecules are transformed into other molecules containing less free energy do not occur at significant rates at physiological temperatures in the absence of a catalyst. This is because even a weak covalent bond is, in reality, very strong and is only rarely broken by thermal motion within a cell. For a covalent bond to be broken in the absence of a catalyst, energy must be supplied to push apart the bonded atoms. When the atoms are partially apart, they can recombine with new partners to form stronger bonds. In the process of recombination, the energy released is the sum of the free energy supplied to break the old bond plus the difference in free energy between the old and the new bond (Figure 4-1).
FIGURE 4-1 The energy of activation of a chemical reaction: (A-B) + (C-D) → (A-D) + (C-B). This reaction is accompanied by a decrease in free energy. (Plot of free energy against progress of reaction, with the activated state, the activation energy, and the ΔG of the reaction marked.)

The energy that must be supplied to break the old covalent bond in a molecular transformation is called the activation energy. The activation energy is usually less than the energy of the original bond because molecular rearrangements generally do not involve the production of completely free atoms. Instead, a collision between the two reacting molecules is required, followed by the temporary formation of a molecular complex called the activated state. In the activated state, the close proximity of the two molecules makes each other's bonds more labile, so that less energy is needed to break a bond than when the bond is present in a free molecule. Most reactions of covalent bonds in cells are therefore described by

$$(\mathrm{A{-}B}) + (\mathrm{C{-}D}) \longrightarrow (\mathrm{A{-}D}) + (\mathrm{C{-}B}) \qquad \text{[Equation 4-1]}$$
The mass action expression for such a reaction is

$$K_{eq} = \frac{\mathrm{conc}_{A-D} \times \mathrm{conc}_{C-B}}{\mathrm{conc}_{A-B} \times \mathrm{conc}_{C-D}} \qquad \text{[Equation 4-2]}$$

where $\mathrm{conc}_{A-B}$, $\mathrm{conc}_{C-D}$, and so on, are the concentrations of the several reactants in moles per liter. Here, also, the value of $K_{eq}$ is related to ΔG by Equation 4-3 (see also Table 4-1):

$$\Delta G = -RT \ln K_{eq} \quad \text{or} \quad K_{eq} = e^{-\Delta G/RT} \qquad \text{[Equation 4-3]}$$

Because energies of activation are generally between 20 and 30 kcal/mol, activated states practically never occur at physiological temperatures. High activation energies are thus barriers preventing spontaneous rearrangements of cellular covalent bonds. These barriers are enormously important. Life would be impossible if they did not exist, for all atoms would be in the state of least possible energy. There would be no way to temporarily store energy for future work. On the other hand, life would also be impossible if means were not found to selectively lower the activation energies of certain reactions. This also must happen if cell growth is to occur at a rate sufficiently fast so as not to be seriously impeded by random destructive forces, such as ionization or ultraviolet radiation.

TABLE 4-1 The Relationship between Keq and ΔG (ΔG = −RT ln Keq)

Keq       ΔG (kcal/mol)
10^-6      8.2
10^-5      6.8
10^-4      5.5
10^-3      4.1
10^-2      2.7
10^-1      1.4
10^0       0.0
10^1      −1.4
10^2      −2.7
10^3      −4.1
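The entries of Table 4-1 follow directly from Equation 4-3. The short Python sketch below recomputes them, assuming a temperature of 25 °C (298.15 K) and a gas constant of 1.987 × 10⁻³ kcal/(mol·K); the temperature is an assumption, since the table itself does not state one, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K); assumed standard value
TEMP_K = 298.15     # 25 degrees C; assumed, not stated with the table


def delta_g(keq, temp_k=TEMP_K):
    """Free-energy change (kcal/mol) from Delta G = -RT ln Keq."""
    return -R_KCAL * temp_k * math.log(keq)


def table_rows():
    """(exponent, Delta G) pairs for Keq = 10^-6 ... 10^3."""
    return [(n, delta_g(10.0 ** n)) for n in range(-6, 4)]


if __name__ == "__main__":
    for exponent, dg in table_rows():
        print(f"Keq = 10^{exponent:<3d}  Delta G = {dg:6.2f} kcal/mol")
```

Each factor of 10 in Keq corresponds to roughly 1.4 kcal/mol under these assumptions, which is a convenient rule of thumb when reading the rest of the chapter.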
ENZYMES LOWER ACTIVATION ENERGIES IN BIOCHEMICAL REACTIONS
Enzymes are absolutely necessary for life. The function of enzymes is to speed up the rate of the chemical reactions requisite to cellular existence by lowering the activation energies of molecular rearrangements to values that can be supplied by the heat of motion (Figure 4-2). When a specific enzyme is present, there is no longer an effective barrier preventing the rapid formation of the reactants possessing the lowest amounts of free energy. Enzymes never affect the nature of an equilibrium: They merely speed up the rate at which it is reached. Thus, if the thermodynamic equilibrium is unfavorable for the formation of a molecule, the presence of an enzyme can in no way bring about the molecule's accumulation.
Because enzymes must catalyze essentially every cellular molecular rearrangement, knowing the free energy of various molecules cannot by itself tell us whether an energetically feasible rearrangement will, in fact, occur. The rate of the reactions must always be considered. Only if a cell possesses a suitable enzyme will the reaction be important.
FIGURE 4-2 Enzymes (colored curve) lower activation energies and thus speed up the rate of the reaction. Note that ΔG remains the same because the equilibrium position remains unaltered. (Plot of free energy against progress of reaction, comparing the activation energy of the uncatalyzed reaction with that of the enzyme-catalyzed reaction.)
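To get a feel for why a 20 to 30 kcal/mol barrier is rarely crossed at physiological temperatures, and why lowering it matters so much, the sketch below uses a simple Boltzmann factor, exp(−Ea/RT), as a stand-in for the fraction of collisions energetic enough to reach the activated state. This is an assumption brought in for illustration (an Arrhenius-style estimate that ignores any pre-exponential factor), not a formula given in the text, and the lowered barrier of 15 kcal/mol is a hypothetical value chosen only to show the scale of the effect.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K); assumed standard value
BODY_TEMP_K = 310.15  # 37 degrees C; assumed physiological temperature


def boltzmann_factor(ea_kcal, temp_k=BODY_TEMP_K):
    """Approximate fraction of molecules with energy >= ea_kcal."""
    return math.exp(-ea_kcal / (R_KCAL * temp_k))


def rate_enhancement(ea_uncatalyzed, ea_catalyzed, temp_k=BODY_TEMP_K):
    """Relative speed-up from lowering the barrier, same assumptions."""
    return math.exp((ea_uncatalyzed - ea_catalyzed) / (R_KCAL * temp_k))


if __name__ == "__main__":
    uncatalyzed = 25.0  # kcal/mol, inside the 20-30 range quoted in the text
    catalyzed = 15.0    # kcal/mol, hypothetical enzyme-lowered barrier
    print(f"fraction over uncatalyzed barrier: {boltzmann_factor(uncatalyzed):.2e}")
    print(f"fraction over catalyzed barrier:   {boltzmann_factor(catalyzed):.2e}")
    print(f"estimated speed-up:                {rate_enhancement(uncatalyzed, catalyzed):.2e}")
```

Note that neither function involves the ΔG of the reaction: in this picture the enzyme changes only the barrier height, which is consistent with the statement that the equilibrium position is unaltered.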
FREE ENERGY IN BIOMOLECULES
Thermodynamics tells us that all biochemical pathways must be characterized by a decrease in free energy. This is clearly the case for degradative pathways, in which thermodynamically unstable food molecules are converted to more stable compounds, such as carbon dioxide and water, with the evolution of heat. All degradative pathways have two primary purposes: (1) to produce the small organic fragments necessary as building blocks for larger organic molecules and (2) to conserve a significant fraction of the free energy of the original food molecule in a form that can do work. This latter purpose is accomplished by coupling some of the steps in degradative pathways with the simultaneous formation of high-energy molecules such as ATP, which can store free energy.
Not all the free energy of a food molecule is converted into the free energy of high-energy molecules. If this were the case, a degradative pathway would not be characterized by a decrease in free energy, and there would be no driving force to favor the breakdown of food molecules. Instead, we find that all degradative pathways are characterized by a conversion of at least one-half the free energy of the food molecule into heat or entropy. For example, it is estimated that in cells, approximately 40% of the free energy of glucose is used to make new high-energy compounds, the remainder being dissipated into heat energy and entropy.
High-Energy Bonds Hydrolyze with Large Negative ΔG
A high-energy molecule contains one or more bonds whose breakdown by water, called hydrolysis, is accompanied by a large decrease in free energy (5 kcal/mol). The specific bonds whose hydrolysis yields these large negative ΔG values are called high-energy bonds, a somewhat misleading term, since it is not the bond energy but the free energy of hydrolysis that is high. Nonetheless, the term high-energy bond is generally employed, and for convenience, we shall continue this usage by marking high-energy bonds with the symbol ~.
The energy of hydrolysis of the average high-energy bond (7 kcal/mol) is very much smaller than the amount of energy that would be released if a glucose molecule were to be completely degraded in one step (688 kcal/mol). A one-step breakdown of glucose would be inefficient in making high-energy bonds. This is undoubtedly the reason why biological glucose degradation requires so many steps. In this way, the amount of energy released per degradative step is of the same order of magnitude as the free energy of hydrolysis of a high-energy bond.
The most important high-energy compound is ATP. It is formed from inorganic phosphate (Ⓟ) and ADP, using energy obtained either from degradative reactions or from the sun, a process known as photosynthesis. There are, however, many other important high-energy compounds. Some are directly formed during degradative reactions; others are formed using some of the free energy of ATP. Table 4-2 lists the most important types of high-energy bonds. All involve either phosphate or sulfur atoms. The high-energy pyrophosphate bonds of ATP arise from the union of phosphate groups. The pyrophosphate linkage (Ⓟ~Ⓟ) is not, however, the only kind of high-energy phosphate bond: The attachment of a phosphate group to the oxygen atom of a carboxyl group creates a high-energy acyl bond.
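The comparison made earlier in this section between the free energy of glucose degradation and that of a single high-energy bond can be checked with simple arithmetic. The Python sketch below uses only the figures quoted in the text (688 kcal/mol, 7 kcal/mol, and the approximately 40% conserved fraction); the function names are illustrative, and the result is a rough energy-bookkeeping estimate under those figures, not a measured yield of ATP.

```python
GLUCOSE_DG_KCAL = 688.0      # free energy of complete glucose degradation (from the text)
BOND_DG_KCAL = 7.0           # average high-energy bond hydrolysis (from the text)
CONSERVED_FRACTION = 0.40    # approximate fraction conserved (from the text)


def bond_equivalents(total_kcal, per_bond_kcal):
    """How many average high-energy bonds the total energy corresponds to."""
    return total_kcal / per_bond_kcal


def conserved_bonds(total_kcal, per_bond_kcal, fraction):
    """Bond equivalents if only `fraction` of the energy is conserved."""
    return fraction * total_kcal / per_bond_kcal


if __name__ == "__main__":
    print(f"bond equivalents in one glucose: "
          f"{bond_equivalents(GLUCOSE_DG_KCAL, BOND_DG_KCAL):.1f}")
    print(f"bond equivalents at 40% conservation: "
          f"{conserved_bonds(GLUCOSE_DG_KCAL, BOND_DG_KCAL, CONSERVED_FRACTION):.1f}")
```

Because a single step releasing 688 kcal/mol could at best be coupled to one bond of about 7 kcal/mol, splitting the degradation into many small steps is what allows dozens of bond equivalents to be captured.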
It is now clear that high-energy bonds involving sulfur atoms play almost as important a role in energy metabolism as those involving phosphorus. The most important molecule containing a high-energy sulfur bond is acetyl-CoA. This bond is the main source of energy for fatty acid biosynthesis.
The wide range of ΔG values of high-energy bonds (see Table 4-2) means that calling a bond "high-energy" is sometimes arbitrary. The usual criterion is whether its hydrolysis can be coupled with another reaction to effect an important biosynthesis. For example, the negative ΔG accompanying the hydrolysis of glucose-6-phosphate is 3 to 4 kcal/mol. But this ΔG is not sufficient for efficient synthesis of peptide bonds, so this phosphate ester bond is not included among high-energy bonds.

TABLE 4-2 Important Classes of High-Energy Bonds

Class                      Molecular Example                 Reaction                            ΔG of Reaction (kcal/mol)
Pyrophosphate              Ⓟ~Ⓟ (pyrophosphate)               Ⓟ~Ⓟ ⇌ Ⓟ + Ⓟ                         −6
Nucleoside diphosphates    adenosine-Ⓟ~Ⓟ (ADP)               ADP ⇌ AMP + Ⓟ                       −6
Nucleoside triphosphates   adenosine-Ⓟ~Ⓟ~Ⓟ (ATP)             ATP ⇌ ADP + Ⓟ                       −7
                                                             ATP ⇌ AMP + Ⓟ~Ⓟ
Enol phosphates            phosphoenolpyruvate (PEP)         PEP ⇌ pyruvate + Ⓟ                  −12
Aminoacyl adenylates       aminoacyl adenylate (AMP~AA)      AMP~AA ⇌ AMP + AA                   −7
Guanidinium phosphates     creatine phosphate (creatine~Ⓟ)   creatine~Ⓟ ⇌ creatine + Ⓟ           −8
Thioesters                 acetyl-CoA                        acetyl-CoA ⇌ CoA-SH + acetate       −8
60
T{¡elmpoI1oJJ(:e of 1·ligh.E:llergy BOllds
HIGH-ENERGY BONOS IN BIOSYNTHETIC REACTIONS o
i
>. Q)
.~
unfavorable: A_
B
e_D
e
very favorable:
- B_ D_
e E
E progress of reacllon
FI e u RE 4-3 Free-en«gy c::hanges in
a multi-slep metabolic pathway, A - B e ..... o - E. Twosle1>S(A-B.mdC-D) do not fao.u the A ..... E directJon of the reaction, since they have small positNe lJ.G values. Howevel, lhey are ¡nsigJlIficanl owing lo !he very large negatille lJ.G va!ucs provided in Sleps B ..... e and D ...... E. Therefore. !he O'o€ral! -00
reachOn favors the A ...... E~ .
The construction of a large molecule from smaller building blocks often requires the input of free energy. Yet a biosynthetic pathway, like a degradative pathway, would not exist if it were not characterized by a net decrease in free energy. This means that many biosynthetic pathways demand an external source of free energy. These free-energy sources are the high-energy compounds. The making of many biosynthetic bonds is coupled with the breakdown of a high-energy bond, so that the net change of free energy is always negative. Thus, high-energy bonds in cells generally have a very short life. Almost as soon as they are formed during a degradative reaction, they are enzymatically broken down to yield the energy needed to drive another reaction to completion.

Not all the steps in a biosynthetic pathway require the breakdown of a high-energy bond. Often, only one or two steps involve such a bond. Sometimes this is because the ΔG, even in the absence of an externally added high-energy bond, favors the biosynthetic direction. In other cases, ΔG is effectively zero or may even be slightly positive. These small positive ΔG values, however, are not significant so long as they are followed by a reaction characterized by the hydrolysis of a high-energy bond. Rather, it is the sum of all the free-energy changes in a pathway that is significant, as shown in Figure 4-3. It does not really matter that the Keq of a specific biosynthetic step is slightly (80:20) in favor of degradation if the Keq of the succeeding step is 100:1 in favor of the forward biosynthetic direction.

Likewise, not all the steps in a degradative pathway generate high-energy bonds. For example, only two steps in the lengthy glycolytic (Embden-Meyerhof) breakdown of glucose generate ATP. Moreover, there are many degradative pathways that have one or more steps requiring the breakdown of a high-energy bond. The glycolytic breakdown of glucose is again an example. It uses up two molecules of ATP for every four that it generates. Here, of course, as in every energy-yielding degradative process, more high-energy bonds must be made than consumed.
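The arithmetic behind the 80:20 and 100:1 example can be sketched in a few lines. This is only an illustration under stated assumptions: the ratios are read as product:reactant equilibrium constants (0.25 and 100), the standard relation ΔG = -RT ln Keq is used, and the temperature of 25 °C is an assumption that the text does not specify.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 C); not given in the text


def delta_g_from_keq(keq, temp=T_KELVIN):
    """Free-energy change in kcal/mol from an equilibrium constant: dG = -RT ln K."""
    return -R_KCAL * temp * math.log(keq)


# Ratios quoted in the text, interpreted here as product:reactant.
keq_unfavorable = 20 / 80   # step that lies 80:20 in favor of degradation
keq_favorable = 100 / 1     # succeeding step, 100:1 in favor of synthesis

# Equilibrium constants of consecutive steps multiply, so their dG values add.
keq_overall = keq_unfavorable * keq_favorable
dg_steps = [delta_g_from_keq(keq_unfavorable), delta_g_from_keq(keq_favorable)]
dg_overall = sum(dg_steps)

print(f"overall Keq: {keq_overall:.0f}")
print(f"step dG values (kcal/mol): {[round(dg, 2) for dg in dg_steps]}")
print(f"overall dG (kcal/mol): {dg_overall:.2f}")
```

Under these assumptions the first step has a small positive ΔG and the second a larger negative one, so the sum is negative and the two-step sequence runs in the biosynthetic direction.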
Peptide Bonds Hydrolyze Spontaneously

The formation of a dipeptide and a water molecule from two amino acids requires a ΔG of 1 to 4 kcal/mol, depending on which amino acids are being joined. These positive ΔG values by themselves tell us that polypeptide chains cannot form from free amino acids. In addition, we must take into account the fact that water molecules have a much, much higher concentration than any other cellular molecules (generally more than 100 times higher). All equilibrium reactions in which water participates are thus strongly pushed in the direction that consumes water molecules. This is easily seen in the definition of equilibrium constants. For example, the reaction forming a dipeptide,
amino acid(A) + amino acid(B) → dipeptide(A-B) + H2O   (Equation 4-3)

has the following equilibrium constant:

$$K_{eq} = \frac{conc_{A\text{-}B} \times conc_{H_2O}}{conc_{A} \times conc_{B}}$$   (Equation 4-4)
where concentrations are given in moles per liter. Thus, for a given Keq value (related to ΔG by the formula ΔG = -RT ln Keq), a much greater concentration of water means a correspondingly smaller concentration of the dipeptide. The relative concentrations are, therefore, very important. In fact, a simple calculation shows that hydrolysis may often proceed spontaneously even when the ΔG for the nonhydrolytic reaction is -3 kcal/mol. Thus, in theory, proteins are unstable and, given sufficient time, will spontaneously degrade to free amino acids. On the other hand, in the absence of specific enzymes, these spontaneous rates are too slow to have a significant effect on cellular metabolism. That is, once a protein is made, it remains stable unless its degradation is catalyzed by a specific enzyme.
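A rough numerical sketch makes the effect of the water concentration concrete. It rearranges Equation 4-4 to solve for the dipeptide concentration at equilibrium. The amino acid concentration (1 mM each) and the temperature (25 °C) are hypothetical values chosen for illustration; 55.5 M is the approximate molarity of pure water.

```python
import math

R_KCAL = 1.987e-3    # gas constant in kcal/(mol*K)
T_KELVIN = 298.15    # assumed temperature (25 C)
WATER_M = 55.5       # approximate molarity of pure water
AMINO_ACID_M = 1e-3  # hypothetical concentration of each free amino acid


def keq_from_delta_g(delta_g, temp=T_KELVIN):
    """Equilibrium constant from dG (kcal/mol), using dG = -RT ln Keq."""
    return math.exp(-delta_g / (R_KCAL * temp))


def dipeptide_conc(delta_g, conc_a, conc_b, conc_water=WATER_M):
    """Equation 4-4 solved for the dipeptide: [A-B] = Keq * [A] * [B] / [H2O]."""
    return keq_from_delta_g(delta_g) * conc_a * conc_b / conc_water


# The text quotes +1 to +4 kcal/mol for dipeptide formation.
for dg in (1.0, 4.0):
    conc = dipeptide_conc(dg, AMINO_ACID_M, AMINO_ACID_M)
    print(f"dG = +{dg} kcal/mol -> equilibrium dipeptide ~ {conc:.1e} M")
```

With these assumed inputs the equilibrium dipeptide concentration comes out many orders of magnitude below that of the free amino acids, which is the point the paragraph makes qualitatively.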
Coupling of Negative with Positive ΔG

Free energy must be added to amino acids before they can be united to form proteins. How this happens became clear with the discovery of the fundamental role of ATP as an energy donor. ATP contains three phosphate groups attached to an adenosine molecule (adenosine-Ⓟ~Ⓟ~Ⓟ). When one or two of the terminal ~Ⓟ groups are broken off by hydrolysis, there is a significant decrease of free energy.

Adenosine-Ⓟ~Ⓟ~Ⓟ + H2O → Adenosine-Ⓟ~Ⓟ + Ⓟ   (ΔG = -7 kcal/mol)   (Equation 4-5)

Adenosine-Ⓟ~Ⓟ~Ⓟ + H2O → Adenosine-Ⓟ + Ⓟ~Ⓟ   (ΔG = -8 kcal/mol)   (Equation 4-6)

Adenosine-Ⓟ~Ⓟ + H2O → Adenosine-Ⓟ + Ⓟ   (ΔG = -6 kcal/mol)   (Equation 4-7)
All these breakdown reactions have negative ΔG values considerably greater in absolute value (numerical value without regard to sign) than the positive ΔG values accompanying the formation of polymeric molecules from their monomeric building blocks. The essential trick underlying these biosynthetic reactions, which by themselves have a positive ΔG, is that they are coupled with the breakage of high-energy bonds, characterized by negative ΔG of greater absolute value. Thus, during protein synthesis, the formation of each peptide bond (ΔG = +0.5 kcal/mol) is coupled with the breakdown of ATP to AMP and pyrophosphate, which has a ΔG of -8 kcal/mol (see Equation 4-6). This results in a net ΔG of -7.5 kcal/mol, more than sufficient to ensure that the equilibrium favors protein synthesis rather than breakdown.
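The bookkeeping of a coupled reaction is simple addition, as the following minimal sketch shows. The ΔG values are those quoted in the text; the dictionary keys and function names are illustrative labels, not established nomenclature.

```python
# dG values in kcal/mol, as quoted in the text (Equations 4-5 to 4-7).
ATP_BREAKDOWN_DG = {
    "ATP -> ADP + phosphate": -7.0,
    "ATP -> AMP + pyrophosphate": -8.0,
    "ADP -> AMP + phosphate": -6.0,
}
PEPTIDE_BOND_DG = 0.5  # formation of one peptide bond, as quoted in the text


def net_delta_g(*step_values):
    """Net dG of a coupled reaction: the sum of the dG values of its steps."""
    return sum(step_values)


def is_favorable(delta_g):
    """A reaction proceeds spontaneously only if its net dG is negative."""
    return delta_g < 0


net = net_delta_g(PEPTIDE_BOND_DG, ATP_BREAKDOWN_DG["ATP -> AMP + pyrophosphate"])
print(f"peptide bond alone favorable? {is_favorable(PEPTIDE_BOND_DG)}")
print(f"coupled net dG = {net} kcal/mol, favorable? {is_favorable(net)}")
```

The peptide bond step alone is unfavorable, whereas the coupled total of -7.5 kcal/mol is strongly favorable.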
ACTIVATION OF PRECURSORS IN GROUP TRANSFER REACTIONS

When ATP is hydrolyzed to ADP and phosphate, most of the free energy is liberated as heat. Because heat energy cannot be used to make covalent bonds, a coupled reaction cannot be the result of two completely separate reactions, one with a positive ΔG, the other with a negative ΔG. Instead, a coupled reaction is achieved by two or more successive reactions. These are always group-transfer reactions: reactions, not involving oxidations or reductions, in which molecules exchange functional groups. The enzymes that catalyze these reactions are called transferases. Consider the reaction

(A-X) + (B-Y) → (A-B) + (X-Y).   (Equation 4-8)

In this example, group X is exchanged with component B. Group-transfer reactions are arbitrarily defined to exclude water as a participant. When water is involved,

(A-B) + (H-OH) → (A-OH) + (B-H).   (Equation 4-9)

This reaction is called a hydrolysis, and the enzymes involved are called hydrolases.

The group-transfer reactions that interest us here are those involving groups attached by high-energy bonds. When such a high-energy group is transferred to an appropriate acceptor molecule, it becomes attached to the acceptor by a high-energy bond. Group transfer thus allows the transfer of high-energy bonds from one molecule to another. For example, Equations 4-10 and 4-11 show how energy present in ATP is transferred to form GTP, one of the precursors used in RNA synthesis:

Adenosine-Ⓟ~Ⓟ~Ⓟ + Guanosine-Ⓟ → Adenosine-Ⓟ~Ⓟ + Guanosine-Ⓟ~Ⓟ   (Equation 4-10)

Adenosine-Ⓟ~Ⓟ~Ⓟ + Guanosine-Ⓟ~Ⓟ → Adenosine-Ⓟ~Ⓟ + Guanosine-Ⓟ~Ⓟ~Ⓟ.   (Equation 4-11)

The high-energy Ⓟ~Ⓟ group on GTP allows it to unite spontaneously with another molecule. GTP is thus an example of what is called an activated molecule; correspondingly, the process of transferring a high-energy group is called group activation.
ATP Versatility in Group Transfer

ATP synthesis has a key role in the controlled trapping of the energy of molecules that serve as energy donors. In both oxidative and photosynthetic phosphorylations, energy is used to synthesize ATP from ADP and phosphate:

Adenosine-Ⓟ~Ⓟ + Ⓟ + energy → Adenosine-Ⓟ~Ⓟ~Ⓟ   (Equation 4-12)

Because ATP is the original biological recipient of high-energy groups, it must be the starting point of a variety of reactions in which high-energy groups are transferred to low-energy molecules to give them the potential to react spontaneously. ATP's central role utilizes the fact that it contains two high-energy bonds whose splitting releases specific groups. This is seen in Figure 4-4, which shows three important groups arising from ATP: Ⓟ~Ⓟ, a pyrophosphate group; ~AMP, an adenosyl monophosphate group; and ~Ⓟ, a phosphate group. It is important to notice that these high-energy groups retain their high-energy quality only when transferred to an appropriate acceptor molecule. For example, although the transfer of a ~Ⓟ group to
FIGURE 4-4 Important group transfers involving ATP.
Activation of Amino Acids by Attachment of AMP

The activation of

+H3N-CHR-COO- + Adenosine-Ⓟ~Ⓟ~Ⓟ → +H3N-CHR-CO~Ⓟ-Adenosine + Ⓟ~Ⓟ   (Equation 4-13)
(In the equation, R represents the specific side group of the amino acid.) The enzymes that catalyze this type of reaction are called aminoacyl synthetases. Upon activation, an amino acid (AA) is thermodynamically capable of being efficiently used for protein synthesis. Nonetheless, the AA-AMP complexes are not the direct precursors of proteins. Instead, for a reason we shall explain in Chapter 14, a second group transfer must occur to transfer the amino acid, still activated at its carboxyl group, to the end of a tRNA molecule:

AA-AMP + tRNA → AA-tRNA + AMP.   (Equation 4-14)

A peptide bond then forms by the condensation of the AA-tRNA molecule onto the end of a growing polypeptide chain:

AA-tRNA + growing polypeptide chain (of n amino acids) → tRNA + growing polypeptide chain (of n + 1 amino acids)   (Equation 4-15)

Thus, the final step of this "coupled reaction," like that of all other coupled reactions, necessarily involves the removal of the activating group and the conversion of a high-energy bond into one with a lower free energy of hydrolysis. This is the source of the negative ΔG that drives the re
Nucleic Acid Precursors Are Activated by the Presence of Ⓟ~Ⓟ

Both types of nucleic acid, DNA and RNA, are built up from mononucleotide monomers, also called nucleoside phosphates. Mononucleotides, however, are thermodynamic

+ ADP   (Equation 4-17)

These triphosphates can then unite to form polynucleotides held together by phosphodiester bonds. In this group-transfer reaction, a pyrophosphate bond is broken and a pyrophosphate group released:

Deoxynucleoside-Ⓟ~Ⓟ~Ⓟ + growing polynucleotide chain (of n nucleotides) → Ⓟ~Ⓟ + growing polynucleotide chain (n + 1 nucleotides).   (Equation 4-18)

This reaction, unlike that which forms peptide bonds, does not have a negative ΔG. In fact, the ΔG is slightly positive (about 0.5 kcal/mol). Since polynucleotides obviously form, this situation immediately poses the question: What is the source of the necessary free energy?
The Value of Ⓟ~Ⓟ Release in Nucleic Acid Synthesis

The needed free energy comes from the splitting of the high-energy pyrophosphate group that is formed simultaneously with the high-energy phosphodiester bond. All cells contain a powerful enzyme, pyrophosphatase, which breaks down pyrophosphate molecules almost as soon as they are formed:

Ⓟ~Ⓟ → 2 Ⓟ   (ΔG = -7 kcal/mol).   (Equation 4-19)

The large negative ΔG means th
Ⓟ~Ⓟ Splits Characterize Most Biosynthetic Reactions

The synthesis of nucleic acids is not the only reaction where direction is determined by the release and splitting of Ⓟ~Ⓟ. In fact, essentially all biosynthetic reactions are characterized by one or more steps that release pyrophosphate groups. Consider, for example, the activation of an amino acid by the attachment of AMP. By itself, the transfer of a high-energy bond from ATP to the AA-AMP complex has a slightly positive ΔG. Therefore, it is the release and splitting of ATP's terminal pyrophosphate group that provides the negative ΔG that is necessary to drive the reaction.

The great utility of the pyrophosphate split is neatly demonstrated when we consider the problems that would arise if a cell attempted to synthesize nucleic acid from nucleoside diphosphates rather than triphosphates (Figure 4-5). Phosphate, rather than pyrophosphate, would be liberated as the backbone phosphodiester linkages were made. The phosphodiester linkages, however, are not stable in the presence of significant quantities of phosphate, because they are formed without a significant release of free energy. Thus, the biosynthetic reaction would be easily reversible; if phosphate were to accumulate, the reaction would begin to move in the direction of nucleic acid breakdown according to the law of mass action. Moreover, it is not feasible for a cell to remove the phosphate groups as soon as they are generated (thereby preventing this reverse reaction), as all cells require a significant internal level of phosphate to grow. In contrast, a sequence of reactions that liberate pyrophosphate and then rapidly break it down into two phosphates disconnects the liberation of phosphate from the nucleic acid biosynthesis reaction, and thereby prevents the possibility of reversing the biosynthetic reaction (see Figure 4-5). In consequence, it would be very difficult to accumulate enough phosphate in the cell to drive both reactions in the reverse, or breakdown, direction. It is clear that the use of nucleoside triphosphates as precursors of nucleic acids is not a matter of chance.

This same type of argument tells us why ATP, and not ADP, is the key donor of high-energy groups in all cells. At first this preference seemed arbitrary to biochemists. Now, however, we see that many reactions using ADP as an energy donor would occur equally well in both directions.
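The mass-action argument can be illustrated with the standard relation ΔG = ΔG° + RT ln Q, where Q is the ratio of product to reactant concentrations. The sketch below is hypothetical throughout: the standard ΔG of zero for the diphosphate scheme stands in for "no significant release of free energy," the concentrations are invented for illustration, the chain terms are assumed to cancel, and the temperature is taken as 25 °C. Only the +0.5 and -7 kcal/mol figures come from the text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 C)


def actual_delta_g(delta_g_standard, products, reactants, temp=T_KELVIN):
    """dG = dG0 + RT ln Q, with Q the product/reactant concentration ratio."""
    q = math.prod(products) / math.prod(reactants)
    return delta_g_standard + R_KCAL * temp * math.log(q)


# Scheme (a): chain(n) + nucleoside diphosphate <-> chain(n+1) + phosphate.
DG0_DIPHOSPHATE_STEP = 0.0   # hypothetical stand-in for "no significant release"
NDP_M = 1e-4                 # hypothetical nucleoside diphosphate concentration

dg_low_phosphate = actual_delta_g(DG0_DIPHOSPHATE_STEP, [1e-5], [NDP_M])
dg_high_phosphate = actual_delta_g(DG0_DIPHOSPHATE_STEP, [1e-2], [NDP_M])
print(f"scheme (a), little phosphate: dG = {dg_low_phosphate:+.2f} kcal/mol")
print(f"scheme (a), much phosphate:   dG = {dg_high_phosphate:+.2f} kcal/mol")

# Scheme (b): polymerization from triphosphates followed by pyrophosphate splitting,
# using the values quoted in the text.
DG_POLYMERIZATION = 0.5
DG_PYROPHOSPHATASE = -7.0
dg_triphosphate_route = DG_POLYMERIZATION + DG_PYROPHOSPHATASE
print(f"scheme (b), standard net dG = {dg_triphosphate_route} kcal/mol")
```

In this toy calculation the diphosphate scheme changes sign as phosphate accumulates, whereas the triphosphate route keeps a large negative net ΔG before any concentration effects are considered.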
FIGURE 4-5 Two scenarios for nucleic acid biosynthesis. (a) Synthesis of nucleic acids using nucleoside diphosphates. (b) Synthesis of nucleic acids using nucleoside triphosphates.
SUMMARY

The biosynthesis of many molecules appears, at a superficial glance, to violate the thermodynamic law that spontaneous reactions always involve a decrease in free energy (ΔG is negative). For example, the formation of proteins from amino acids has a positive ΔG. This paradox is removed when we realize that the biosynthetic reactions do not proceed as initially postulated. Proteins, for example, are not formed from free amino acids. Instead, the precursors are first enzymatically converted to high-energy activated molecules, which, in the presence of a specific enzyme, spontaneously unite to form the desired biosynthetic product. Many biosynthetic processes are thus the result of "coupled" reactions, the first of which supplies the energy that allows the spontaneous occurrence of the second reaction.

The primary energy source in cells is ATP. It is formed from ADP and inorganic phosphate, either during degradative reactions (such as fermentation or respiration) or during photosynthesis. ATP contains several high-energy bonds whose hydrolysis has a large negative ΔG. Groups linked by high-energy bonds are called high-energy groups. High-energy groups can be transferred to other molecules by group-transfer reactions, thereby creating new high-energy compounds. These derivative high-energy molecules are then the immediate precursors for many biosynthetic steps.

Amino acids are activated by the addition of an AMP group, originating from ATP, to form an AA-AMP molecule. The energy of the high-energy bond in the AA-AMP molecule is similar to that of a high-energy bond of ATP. Nonetheless, the group-transfer reaction proceeds to completion because the high-energy Ⓟ~Ⓟ molecule, created when the AA-AMP molecule is formed, is broken down by the enzyme pyrophosphatase to low-energy groups. Thus, the reverse reaction, Ⓟ~Ⓟ + AA-AMP → ATP + AA, cannot occur.

Almost all biosynthetic reactions result in the release of Ⓟ~Ⓟ. Almost as soon as it is made, it is enzymatically broken down to two phosphate molecules, thereby making a reversal of the biosynthetic reaction impossible. The great utility of the Ⓟ~Ⓟ split provides an explanation for why ATP, not ADP, is the primary energy donor. ADP cannot transfer a high-energy group and at the same time produce Ⓟ~Ⓟ groups as a by-product.
CHAPTER 5

Weak and Strong Bonds Determine Macromolecular Structure

DNA, RNA, and protein are all polymers of simple building blocks. As we learned in Chapter 4, synthesis of these polymers depends on the controlled, catalyzed linkage of activated building blocks. For DNA and RNA, these building blocks are nucleotides (see Figure 2-11). For proteins, the building blocks are the 20 amino acids donated from their activated intermediates, the donor tRNAs. Assembly of these chains requires breakage of multiple high-energy bonds for the addition of each building block. For all these molecules, the order of the constituent building blocks determines their genetic and biochemical function.

Weak bonds play a critical role in determining the structure and function of these polymers. The primary information of RNA, DNA, and proteins is the order of their covalently linked building blocks. Nevertheless, it is only after they have formed extensive additional weak bonds between their different parts that these polymers adopt characteristic shapes that allow them to carry out their functions. The hydrogen bonds and ionic, hydrophobic, and van der Waals interactions described in Chapter 3 direct proteins to form critical binding sites and DNA to assume its double-helical structure. Indeed, the disruption of these interactions (by heat or detergent, for example) without disruption of covalent bonds completely destroys the activity of all but a few biological polymers.

In this chapter we briefly describe the structure of biological macromolecules and the forces that control their shape. DNA and RNA are discussed briefly here and more thoroughly in Chapter 6. We then focus on the diverse structures of proteins. The final sections of the chapter focus on the interactions between proteins and nucleic acids, an activity central to many of the processes we will encounter in this book, and the control of protein function by allostery.

OUTLINE

• Higher-Order Structures Are Determined by Intra- and Intermolecular Interactions
• The Specific Conformation of a Protein Results from Its Pattern of Hydrogen Bonds
• Most Proteins Are Modular, Containing Two or Three Domains
• Weak Bonds Correctly Position Proteins along DNA and RNA Molecules
• Allostery: Regulation of a Protein's Function by Changing Its Shape
HIGHER-ORDER STRUCTURES ARE DETERMINED BY INTRA- AND INTERMOLECULAR INTERACTIONS

DNA Can Form a Regular Helix

DNA molecules usually have regular helical configurations. This is because most DNA molecules contain two antiparallel polynucleotide strands that have complementary structures (see Chapter 6 for more details). Both internal and external noncovalent bonds stabilize the structure. The two strands are held together by hydrogen bonds between pairs of complementary purines and pyrimidines (Figure 5-1). Adenine is always hydrogen-bonded to thymine, whereas guanine is hydrogen-bonded to cytosine. In addition, virtually all the surface atoms in the sugar and phosphate groups form bonds to water molecules.

FIGURE 5-1 The hydrogen-bonded base pairs of DNA. The figure shows the position and length of the hydrogen bonds between the base pairs. The covalent bonds between the atoms within each base are shown, but double and single bonds are not distinguished (see Figure 6-6 in the next chapter). (Labels: thymine, adenine, 11.1 Å; cytosine, guanine, 10.8 Å.)

The purine-pyrimidine base pairs are found in the center of the DNA molecule. This arrangement allows their flat surfaces to stack on top of each other, creating shared (π-π) electrons between the bases and limiting their contact with water. This arrangement, known as base stacking, would be much less satisfactory if only one polynucleotide chain were present. Because pyrimidines are smaller than the purines, single-stranded DNA would result in the unfavorable exposure of hydrophobic surface between adjacent bases. The presence of complementary base pairs in double-helical DNA makes a regular structure possible, since each base pair is of the same size.

The double-helical DNA molecule is very stable for two reasons. First, disruption of the double helix would bring the hydrophobic purines and pyrimidines into greater contact with water, which is very unfavorable. Second, double-stranded DNA molecules contain a very large number of weak bonds, arranged so that most of them cannot break without simultaneously breaking many others. Thus, for example, even though thermal motion is constantly breaking apart the purine-pyrimidine pairs at the ends of each molecule, the two chains do not usually fall apart because other hydrogen bonds in the molecule are still intact (Figure 5-2). Once a given bond is broken, the most likely next event is the reforming of the same hydrogen bonds to restore the original molecular configuration, rather than the breaking of additional bonds. Sometimes, of course, the first breakage is followed by a second, and so forth. Such multiple breaks, however, are quite rare, so that double helices held together by more than ten base pairs are very stable at room temperature.

FIGURE 5-2 The breaking of terminal base pairs in DNA by random thermal motion. The figure shows that once some bonds have broken at the termini, they can reform (lower left) or additional bonds can break.

When DNA strands do come apart without reforming, this typically starts at one end of the molecule and proceeds inward. This is because the interactions between the bases at the end of the DNA are the least supported by adjacent interactions. That is, they have only one neighboring base pair to help secure the interaction. As described in more detail below, the same principle, the use of multiple weak bonds, governs the stability of proteins.

Ordered collections of secondary bonds become less and less stable as their temperature is raised above physiological temperatures. At elevated temperatures, the simultaneous breakage of several weak bonds is more frequent. After a significant number have broken, a molecule usually loses its original form (the process of denaturation) and assumes an inactive, or denatured, configuration. Thus, as the temperature rises, more interactions are required to maintain the double-stranded nature of DNA.
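A deliberately crude model can give a feel for why simultaneous breakage of many weak bonds is rare and why it becomes more frequent on heating. The sketch treats each bond as an independent two-state system weighted by a Boltzmann factor. Both the independence assumption and the 2 kcal/mol bond energy are hypothetical simplifications; real base pairs are coupled to their neighbors, which is exactly the point made in the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_broken(bond_energy, temp):
    """Two-state estimate of the fraction of time one weak bond is broken."""
    weight = math.exp(-bond_energy / (R_KCAL * temp))  # Boltzmann factor
    return weight / (1.0 + weight)


def all_broken(n_bonds, bond_energy, temp):
    """Chance that n independent bonds are all broken at the same instant."""
    return fraction_broken(bond_energy, temp) ** n_bonds


BOND_ENERGY = 2.0  # hypothetical energy per weak bond, kcal/mol

for temp in (298.15, 368.15):  # roughly room temperature and 95 C
    single = fraction_broken(BOND_ENERGY, temp)
    ten = all_broken(10, BOND_ENERGY, temp)
    print(f"T = {temp:.0f} K: one bond broken ~ {single:.3f}, ten at once ~ {ten:.1e}")
```

Even in this toy model, the chance that ten bonds are broken together is vanishingly small at room temperature and rises steeply with temperature.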
RNA Forms a Wide Variety of Structures

In contrast to the highly regular structure of the DNA double helix, RNA is usually found as a single-stranded molecule. Some RNA molecules (such as messenger RNAs) function as transient carriers of genetic information and are constantly associated with proteins and thus do not have an independent, stable, tertiary fold. Other RNA molecules fold into unique tertiary structures. For these RNAs, intramolecular interactions between distinct regions lead to the formation of specific elements of secondary structure. These interactions are principally between the bases of the RNA and include traditional Watson-Crick base pairing, unusual base pairing found only in RNA, and hydrophobic base stacking.

RNA differs from DNA in that the ribose sugar of the backbone carries a 2'-hydroxyl group. In the folded structure of RNA molecules, these 2'-hydroxyl groups often participate in interactions that stabilize the structure. The binding of divalent metal ions (such as Mg2+, Mn2+, and Ca2+) to the RNA is often critical to the formation of a stable, folded conformation because these ions can shield the negative charge of the RNA backbone, allowing regions of the molecule to pack more closely together.

The precisely folded, compact nature of RNA tertiary structure is illustrated by the high-resolution structures of some important RNA molecules, for example, tRNA, a molecule that participates in protein synthesis (see Figure 14-16). These structures reveal that base stacking plays a major role in RNA conformation: for example, 72 out of the 76 bases in tRNA are involved in stacking interactions. As in the DNA double-helix structure, stacking of RNA bases on top of one another is energetically favorable. For this reason, short base-paired, helical regions of RNA stack on top of one another to form longer, discontinuous helical regions. These regions of stacked helices then pack against each other via additional tertiary interactions.

We have only briefly discussed the features of DNA and RNA structure here. In Chapter 6, we will describe in much more detail the interactions that govern the structures of these critical cellular molecules. For the remainder of this chapter we focus on the forces influencing the structure of proteins.
Chemical Features of Protein Building Blocks

In contrast to the four nucleotide building blocks used for RNA or DNA, the 20 amino acid building blocks used for protein synthesis are highly diverse. The common structural features of the amino acids are the central carbon (Cα) linked to a hydrogen, a primary amino group, and a carboxylic acid group (Figure 5-3). The fourth linkage is to a variable side chain called the R group. The R groups of the 20 amino acids can be categorized by their size, shape, and chemical composition (Figure 5-4). The R groups fall into four categories: neutral-nonpolar, neutral-polar, acidic, and basic. The neutral-nonpolar side chains are composed of simple carbon chains or aromatic rings and make principally hydrophobic contacts. The neutral-polar side chains include hydroxyl, sulfhydryl, amide, and imid

FIGURE 5-3 The common structural features of amino acids. (Labels: amino group, side chain R, carboxyl group.)
The Peptide Bond

The primary covalent linkage between amino acids in proteins is the peptide bond (Figure 5-5). This bond is made when the primary amine group of one amino acid is covalently joined to the carboxylic acid group of a second amino acid. This linkage has a partially double-bonded character. Because this type of bond involves more than one pair of electrons, rotation around this linkage is limited; completely free rotation about a bond is only possible when atoms are attached by single bonds. (For example, the methyl groups of ethane, H3C-CH3, rotate about the carbon-carbon bond.)

In contrast to the peptide bond, all of the other linkages in the peptide backbone are single bonds and thus rotate freely. Theoretically, these bonds could exist in an infinite number of conformations; however, in the context of a protein, steric interference between adjacent peptide groups limits their rotation. The orientation of adjacent planar peptide bonds can be described by two bond angles: φ and ψ (Figure 5-6). Within proteins, these angles are constrained by the need to maximize formation of secondary bonds among functional groups within the peptide backbone while minimizing steric interference.
There Are Four Levels of Protein Structure

The final three-dimensional structure or shape of a protein is formed through the sequential association of increasingly distant amino acids. The types of interactions observed within a protein can be divided into four classes (Figure 5-7). The linear sequence of amino acids in the polypeptide chain is the primary structure. Nearby amino acids associate with one another to form regions of secondary structure. The elements of secondary structure are usually formed through interactions between those parts of the amino acids that make up the polypeptide backbone rather than the side chains. As we will see below, α helices and β sheets are the elements of secondary structure. These elements pack together in a defined manner to generate a given polypeptide's tertiary structure, which is the overall conformation of a single polypeptide chain. Many proteins are composed of multiple polypeptide chains known as protein subunits. The manner in which these subunits associate with one another is referred to as the protein's quaternary structure.

The information contained within the primary structure is nearly always sufficient to determine the eventual tertiary structure of a polypeptide. This was demonstrated in a classic experiment in which the single-polypeptide enzyme ribonuclease was subjected to harsh conditions that interfere with hydrogen bonding and other weak chemical interactions, leading to the complete denaturation (or unfolding) of the polypeptide. When the denatured ribonuclease was restored to conditions that allow the formation of weak chemical bonds, the enzyme rapidly regained both its normal three-dimensional structure and RNA-cleaving activity. For a description of how protein structures are worked out experimentally, see Box 5-1, Determination of Protein Structure.

[Figure 5-4: structures of the 20 amino acids (not reproduced), arranged in four panels: neutral-nonpolar, neutral-polar, acidic, and basic amino acids. Amino acids shown: glycine (Gly, G), alanine (Ala, A), valine (Val, V), isoleucine (Ileu, I), leucine (Leu, L), phenylalanine (Phe, F), proline (Pro, P), methionine (Met, M), tryptophan (Trp, W), cysteine (Cys, C), serine (Ser, S), threonine (Thr, T), tyrosine (Tyr, Y), asparagine (Asn, N), glutamine (Gln, Q), aspartic acid (Asp, D), glutamic acid (Glu, E), lysine (Lys, K), arginine (Arg, R), histidine (His, H).]
FIGURE 5-5 Peptide bond. The brackets indicate the two amino acid residues that are joined by a peptide bond.

α Helices and β Sheets Are the Common Forms of Secondary Structure

The most stable arrangement of a polypeptide backbone is the α helix. This is a right-handed helix, repeating every 5.4 Å along the helical axis (Figure 5-8). This structure is preferred because the peptide backbone has favorable φ and ψ angles that accommodate a regular pattern of hydrogen bonding between carbonyl and imino groups on the same chain. The hydrogen-bonding potential of the peptide backbone is fully utilized to stabilize the structure. As a consequence of the precise geometry of the polypeptide chain, each turn of the α helix has 3.6 amino acids. If, for example, four amino acids were used per turn, the hydrogen bonds would not be so neatly formed, nor would the individual backbone atoms fit together so well.

Many amino acid sequences can adopt an α-helical secondary structure. This is because the structure of the α helix is stabilized by contacts between the nearly universal backbone atoms of the carbonyl and imino groups in the polypeptide chain. The only amino acid that lacks these atoms is proline, which cannot participate as a donor in the hydrogen bonding that stabilizes the helix because of its cyclic chemical structure. Thus, proline is a helix-breaking residue. Although their structures do not prevent it, glycine, tyrosine, and serine are also rarely found in α helices. Another consequence of the fact that α helices are constructed through exclusively backbone contacts is that the side chains project away from the helix. This puts these side chains in an ideal position to interact with another region of the protein or another macromolecule, such as DNA.

The second common secondary structural element is the β sheet (Figure 5-9). In contrast to the α helix, the β sheet is a highly extended form of the polypeptide backbone. Stabilization of the β sheet structure comes from alignment of regions of polypeptide in this extended
FIGURE 5-6 φ and ψ angles of rotation about the Cα-N and Cα-C bonds. The shaded areas represent the planes of the peptide bonds. (Source: Illustration, Irving Geis. Rights owned by Howard Hughes Medical Institute. Not to be reproduced without permission.)
FIGURE 5-7 Four levels of protein structure: primary, secondary, tertiary, and quaternary. (Source: Adapted from Branden C. and Tooze J. 1999. Introduction to protein structure, 2nd edition, p. 3, fig 1.1.)
Box 5-1 Determination of Protein Structure
There are two principal methods to determine the three-dimensional structure of proteins. The first to be developed was X-ray crystallography. This method relies on the formation of highly ordered crystals of pure protein. As with the original diffraction studies of DNA fibers, the irradiation of protein crystals with high-energy X-rays results in diffraction patterns that are related to the structure of the protein. More recently, nuclear magnetic resonance techniques have been developed to elucidate the conformation of smaller proteins. This technique exploits the magnetic properties of certain atoms (such as ¹H) to monitor how neighboring atoms influence each other. This information can be used to determine the relative location of specific atoms within the polypeptide chain, and these distances predict the overall structure of the protein (see Figure 5-7).
In principle it should be possible to predict a protein's three-dimensional structure from its primary amino acid sequence, because, after all, that information is sufficient for a protein to adopt a unique conformation. Although progress is being made in the prediction of protein structure based on amino acid sequence, the full determination of the energetic constraints of a particular sequence is still beyond the most powerful computational approaches. Nevertheless, prediction of certain secondary structural elements (such as the common α helix structure introduced below) is becoming increasingly reliable.
The increasingly large number of available experimentally determined structures has provided an important resource for making protein structure predictions based on amino acid sequence. These atomic structures have helped to define families of amino acid sequences that share related three-dimensional shapes. By comparing the sequences of proteins of unknown structure with those that have been determined, it is often possible to make structural predictions based on the identified similarity. Combining this information with computer algorithms that predict secondary structures is proving to be a powerful method for predicting how proteins fold. The long-term outlook is that these approaches will allow at least an approximate structure to be predicted for any protein from its primary sequence alone.
FIGURE 5-8 A polypeptide chain folded into a helical configuration called the α helix. (Source: Molecular structure adapted from Pauling L. 1960. The nature of the chemical bond and the structure of molecules and crystals: An introduction to modern structural chemistry, 3rd edition, p. 500. Copyright © 1960 Cornell University. Used by permission of the publisher.)
conformation such that hydrogen bonds can form between carbonyl groups of one β strand and NH groups on the adjacent strand. Typically, a region of β sheet is composed of four to six separate stretches of polypeptide (each forming an individual β strand), each eight to ten amino acids in length. In the β sheet, adjacent amino acids are related by a rotation of 180° and thus their respective side groups emerge from opposite sides of the β sheet (see Figure 5-9b).
β sheets come in predominantly one of two forms. These differ in the relative orientations of their chains (Figure 5-10). In one, the adjacent chains run in the same amino-to-carboxyl direction to produce a parallel β sheet. In the other, the adjacent chains run in opposite directions to yield an antiparallel β sheet. Although less common, there are also β sheets that have both parallel and antiparallel components. In both parallel and antiparallel β sheets, all the peptide groups lie approximately in the plane of the sheet. Structural studies have revealed that in most cases the individual strands of β sheets tend to be twisted along their length in a right-handed manner (Figure 5-11). Thus, instead of flat sheets of protein, regions of β sheet tend to curve to generate a compact protein module.
For a protein to fold properly, both the backbone and the side chains must adopt conformations that maximize favorable interactions. The α helix and β sheet are both very stable conformations of the polypeptide backbone. But for each side chain to make the maximum number of weak bonds, proteins have to adopt more varied
FIGURE 5-9 β sheets are held together by hydrogen bonds. (a) A β sheet is shown from above (top view). Note that the oxygens and nitrogens of the backbone are fully hydrogen-bonded. (b) A β sheet shown from a side view. This illustrates the location of the side groups, which alternate between emerging from above or below the plane of the β sheet. (Source: Molecular structure adapte
FIGURE 5-10 Two types of β sheets. (a) Parallel β sheet: schematic diagram showing the hydrogen bond pattern; note the chains run in the same amino-to-carboxyl direction. (b) Antiparallel β sheet: schematic diagram showing the hydrogen bonding pattern; note that the main-chain NH and O atoms within a β sheet are hydrogen-bonded to each other. (Source: Adapted from Branden C. and Tooze J. 1999. Introduction to protein structure, 2nd edition, p. 19, fig 2.6a and p. 18, fig 2.5b.)
FIGURE 5-11 β sheets twist in a right-handed manner along their length. The schematic shows the mixed structure of the E. coli protein thioredoxin. β strands are drawn as arrows from the amino to the carboxyl end of the protein. (Source: Adapted from Branden C. and Tooze J. 1999. Introduction to protein structure, 2nd edition, p. 20, fig 2.7a.)
FIGURE 5-12 Regular and irregular features of protein structures. Irregular configurations in the backbone (green) allow the maximum formation of secondary structures (β sheet in purple and α helix in turquoise) by other regions of the protein. The structure shown is that of the E1 protein of adenovirus. (Enemark E.J., Chen G., Vaughn D.E., Stenlund A.,
shapes. The three-dimensional structures of the polypeptide chains of proteins are thus compromises between the tendency of the backbones to form either α helices or β sheets and the tendency of the side groups to twist the backbone into less regular configurations that maximize the strength of the secondary bonds formed by those side groups (Figure 5-12).
As we discuss in more detail below, one of the strongest influences on protein folding can be attributed to the burial of hydrophobic (nonpolar) amino acid side groups into the core of the protein's structure. This leads to the prediction that in aqueous solutions, proteins containing very large numbers of nonpolar side groups will tend to internalize the nonpolar residues and be more stable than proteins containing mostly polar groups. If we disrupt a polar molecule held together by a large number of internal hydrogen bonds, the decrease in free energy is often small since the polar groups can then hydrogen-bond to water instead. On the other hand, when we disrupt molecules having many nonpolar groups, there is usually a much greater loss in free energy because the disruption necessarily inserts nonpolar groups into water.
THE SPECIFIC CONFORMATION OF A PROTEIN RESULTS FROM ITS PATTERN OF HYDROGEN BONDS
Whereas a portion of the energy stabilizing a protein is provided by hydrophobic interactions, the specific conformation of a protein structure is largely determined by hydrogen bonds. The energy associated with the hydrophobic stabilization of proteins has no directional component, whereas hydrogen bonds require precise distances and angles (see Figure 3-9 and Table 3-3). In general, all hydrogen-bond donors and acceptors within a protein's interior have suitable mates. Failure to make a hydrogen bond in the protein interior is energetically costly, at the rate of a few kilocalories per hydrogen bond. The vitally important role of hydrogen bonds in proteins is to destabilize incorrect structures as much as to stabilize the correct one.
The necessity of satisfying all the hydrogen-bond donors and acceptors on the polypeptide backbone (two per residue) drives formation of the large sections of α helices and β sheets found in most proteins. The only way that a polypeptide can traverse the nonaqueous interior of a protein, as it must, and satisfy the hydrogen-bonding necessity is through formation of regular secondary structures. Side chains do not have enough donors and acceptors to do the job. Thus, all large proteins contain significant regions of β sheets, α helices, or both.
Despite the small number of secondary-structure building blocks, the variety of protein structures that can be built from these is vast. Even proteins that are composed entirely of β sheets or α helices adopt structures spanning a wide range (Figure 5-13). Of course, some polypeptide sections must be less regular to allow their chains to turn at the ends of α helices and individual strands of β sheets (β strands). Turns are loops of amino acids that link α helices and β strands but do not exhibit a defined secondary structure themselves. Turns can vary in length from only a few amino acids to extended segments that are substantially longer. They are, however, generally relatively short so as to minimize the number of unfulfilled
FIGURE 5-13 Polypeptide chain folding. (a) Proteins composed of α helices: myoglobin and the N-terminal domain of λ repressor. (b) Proteins composed of β sheets: the Green Fluorescent Protein (GFP) and gamma crystallin. (c) Comparison of the N-terminal domain of λ repressor, composed of α helices, with the C-terminal domain of λ repressor, composed of β sheets. ((a) Vojtechovsky J., Berendzen J., Chu K., Schlichting I., and Sweet R.M. submitted and Beamer L.J. and Pabo C.O. 1992. J. Mol. Biol. 227: 177. (b) Ormo M., Cubitt A.B.,
hydrogen bonds that accompany their formation (for example, see Figure 5-14). In addition, the less regular structures of these loops are critical for the formation of binding sites for small molecules, the active sites of enzymes, and the surfaces involved in protein-protein interactions. This will become apparent in the three-dimensional protein structures we discuss in the rest of this chapter and the remainder of the text.
FIGURE 5-14 Adjacent antiparallel β strands are joined by hairpin loops. Schematic showing an example of a two-residue hairpin loop joining β strand 1 and β strand 2. The bonds within the hairpin loop (in shaded area at top of structure) are green.
FIGURE 5-15 The leucine zipper from the yeast transcription factor Gcn4. The leucine zipper is an example of a coiled-coil (see text). Here we show two views of the leucine zipper: from the side (on the left) and from above (on the right). (Ellenberger T.E., Brandl C.J., Struhl K., and Harrison S.C. 1992. Cell 71: 1223.) Images prepared with MolScript, BobScript, and Raster 3D.
α Helices Come Together to Form Coiled-Coils
Many polypeptides interact with one another through the supercoiling of α helices around each other. Typically, this can only occur when the nonpolar side chains along each α helix are arranged so that their side groups contact the other helix. The twisting of the helices around each other reflects the nonintegral (3.6 residues per turn) nature of the α helix, which allows the side groups to pack neatly together only when the α helices interact at an angle of 18° from parallel. If the α helices remained perfectly rigid, they could stay in contact for only a few residues. But by supercoiling in a left-handed direction, neatly packed, highly stable, coiled-coils are created (Figure 5-15).
One example of a coiled-coil is found in the leucine zipper family of DNA-binding proteins. These DNA-binding factors have two subunits that come together to form a dimer through the use of a coiled-coil region. This coiled-coil region is called a leucine zipper due to the repeating appearance of leucine or other amino acids with an aliphatic side group, such as valine or isoleucine. These leucines appear in a regular pattern as follows. If you consider two turns of an α helix, this will represent a segment of approximately seven amino acids. The aliphatic amino acids are located within each seven amino acid stretch at the first and fourth positions. This positioning ensures that one side of the α helix is aliphatic, since the first and fourth positions will be on the same face of the helix. These faces in two adjacent helices are packed against each other, burying their hydrophobic side chains away from the aqueous environment.
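The geometry behind the seven-residue repeat can be checked with a little arithmetic. The sketch below is only an idealized helical-wheel calculation, assuming a perfectly regular α helix with the 3.6 residues per turn and 5.4 Å repeat quoted in the text; the function names are illustrative and not taken from any library.

```python
# Idealized helical-wheel arithmetic for an alpha helix.
# Assumptions: a perfectly regular helix, 3.6 residues per turn and a
# 5.4 angstrom repeat (both values from the text). Names are illustrative.

RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4

DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN        # about 100 degrees
RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN  # about 1.5 angstroms


def wheel_angle(position):
    """Angle in degrees, within [-180, 180), of a 1-based residue
    position around the helix axis, measured relative to position 1."""
    raw = (position - 1) * DEGREES_PER_RESIDUE
    return (raw + 180.0) % 360.0 - 180.0


def heptad_positions(n_heptads):
    """Residue numbers at the first and fourth positions of each
    seven-residue stretch (for example 1, 4, 8, 11, ...)."""
    positions = []
    for h in range(n_heptads):
        positions.append(7 * h + 1)
        positions.append(7 * h + 4)
    return positions


if __name__ == "__main__":
    for pos in heptad_positions(3):
        print(f"residue {pos:2d}: {wheel_angle(pos):7.1f} degrees")
```

In this idealized model the first and fourth positions stay within one narrow sector of the wheel, but seven residues span 700° rather than exactly two turns (720°), so the sector drifts by about 20° per repeat. That drift is one way to picture why rigid, straight helices could stay in contact for only a few residues.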
MOST PROTEINS ARE MODULAR, CONTAINING TWO OR THREE DOMAINS
The subunits of soluble proteins vary in size from less than 100 to larger than 2,000 amino acid residues. The smallest polypeptides that form folded proteins have molecular weights of about 11,000 daltons (approximately 100 residues), but most are between 20,000 and 70,000 daltons for a single subunit. Proteins larger than about 20,000 daltons are often formed from two or more domains (Figure 5-16; see also Box 5-2, Large Proteins Are Often Constructed of Several Smaller Polypeptide Chains). The term domain is used to describe a part of the structure that appears separate from the rest, as if it would be stable in solution on its own, which is often the case. Typically, a single domain is formed from a continuous amino acid sequence and not portions of sequence scattered throughout the polypeptide. This is an important point when considering how multidomain proteins have evolved.
FIGURE 5-16 Pyruvate kinase is composed of distinct domains. The predominant domains of the enzyme are shown in turquoise, purple, and red. (Allen S.C. and Muirhead H. 1996. Acta Crystallogr. D Biol. Crystallogr. 52: 499.) Image prepared with MolScript, BobScript, and Raster 3D.
Proteins Are Composed of a Surprisingly Small Number of Structural Motifs
Determination of the first half-dozen protein structures showed a bewildering variety of protein folding motifs, implying the existence of an infinite number of protein structures. Now that we know the three-dimensional structures of thousands of proteins, however, it appears that a relatively small number of different domains account for most of the large variety of protein structures. Although an accurate estimate is not possible, the number of truly unique domain motifs will be orders of magnitude smaller than the number of unique proteins. Specific kinds of domain motifs are often associated with particular kinds of activities. One frequently observed motif has been termed the dinucleotide fold because it is frequently found in enzymes that bind
Box 5-2 Large Proteins Are Often Constructed of Several Smaller Polypeptide Chains
Most large proteins are regular aggregates of several smaller polypeptide chains. The relationship among the polypeptide chains making up such a protein is termed its quaternary structure. For example, the macromolecular complexes responsible for the synthesis of RNA (RNA polymerase) and protein (ribosome) are each assemblies of multiple subunits. The complexes are about 500,000 and 2,500,000 daltons, respectively, but do not include any individual subunits greater than 200,000 daltons. The ribosome is composed of both protein and RNA subunits. This type of factor is called a ribonuclear protein (RNP). Why are large protein complexes composed of multiple subunits rather than a single large subunit? The use of multiple subunits to build large protein complexes reflects a building principle applicable to all complex structures, nonliving as well as living. This principle states that it is much easier to reduce the impact of construction mistakes if faulty subunits can be discarded before they are incorporated into the final product. For example, let us consider two alternative W
ATP (Figure 5-17). This domain binds ATP through a central, parallel β sheet with α helices on both sides. The nucleotide binding site is on the carboxyl end of the β strands. What varies is the number and detailed arrangement of the α helices and, to a lesser extent, the order of the β strands. Related domains of similar structure serve the same function in many different proteins.
Different Protein Functions Arise from Various Domain Combinations
The various functional properties of proteins appear to arise from their modular construction in much the same way as computers with different specifications can be assembled from the appropriate modular components. Numerous examples can be given. There are, for example, many dehydrogenase enzymes, each working on a specific substrate. Each enzyme consists of two domains, one a common dinucleotide
FIGURE 5-17 Enzymes that bind ATP. The red arrows point to the ATP molecules bound within each structure. (a) RecA. (b) DnaA. ((a) Story R.M. and Steitz T.A. 1992. Nature 355: 374. (b) Erzberger J.P., Pirruccello M.M., and Berger J.M. 2002. EMBO J. 21: 4763-73.) Images prepare
binding domain that binds the coenzyme NAD+, the other a domain that binds substrate and has the catalytic site. The structure of the latter domain varies among different dehydrogenases.
The gene regulatory repressor and activator proteins provide another example of modular construction. The Lac repressor and the catabolite gene activator protein (CAP) of E. coli both contain multiple domains. The crystal structure of CAP shows two domains: a larger domain binds a molecule of cyclic AMP in its interior, while the smaller domain recognizes specific DNA sequences (Figure 5-18). There are significant amino acid sequence similarities between the cAMP-binding domain of CAP and the regulatory subunit of cAMP-dependent protein kinase, suggesting that the cAMP-binding domain of both proteins evolved from the same precursor. In CAP, this cAMP-binding domain is attached to the DNA-binding domain, so that changes in cAMP levels control transcription levels. In the kinase, the cAMP-binding domain regulates the activity of the first enzyme in a cascade of enzymes that result in the breakdown of stored glycogen.
FIGURE 5-18 CAP complex with cAMP interacting with bent DNA. The larger domain of CAP, shown in turquoise, binds cyclic AMP, shown in red and yellow in the center of that domain. The smaller, DNA-binding domain (shown in purple) recognizes specific DNA sequences (the double helix is shown in red and gray). (Schultz S.C., Shields G.C., and Steitz T.A. 1991. Science 253: 1001.) Image prepared with MolScript, BobScript, and Raster 3D.
WEAK BONDS CORRECTLY POSITION PROTEINS ALONG DNA AND RNA MOLECULES
DNA-binding proteins mediate many of the central processes in biology. The bonds that hold these proteins onto DNA are the same collection of weak bonds that give proteins, DNA, and RNA their own specific three-dimensional configurations. The most abundant DNA-binding proteins have a structural role in packaging and compacting the huge amount of DNA that must be fitted into the cell. For example, the nucleus of a human cell is only 10 μm (10⁻⁵ meter) across but contains roughly 2 meters of double-stranded DNA.
There are many ways that proteins can recognize DNA. Some protein-DNA interactions are specific for particular sequences of DNA, whereas others are more specific for DNA in specific conformations. For example, when DNA is unwound in the cell during DNA replication or recombination, the single strands are rapidly bound by single-stranded DNA-binding proteins (SSBs). These proteins bind with little sequence specificity but are highly specific for single- versus double-stranded DNA. To accomplish this specificity, the primary interactions between SSBs and the single-stranded DNA are through ionic or hydrogen bond interactions with the phosphate backbone or through intercalation of bulky ring-shaped side chains (for example, Tyr or Trp) between the bases (Figure 5-19).
Most DNA-binding proteins we will consider in this book recognize specific DNA sequences in double-stranded DNA. Such proteins are frequently involved in choosing specific sequences in the genome to act as sites for the initiation of transcription or replication, or other DNA transactions. Indeed, 2-3% of prokaryotic proteins and 6-7% of eukaryotic proteins are either known or predicted to be sequence-specific DNA-binding proteins. By far the most common mechanism for protein recognition of a specific DNA sequence is through the insertion of an α helix in the so-called major groove of the DNA (see Figure 5-20). As was evident in Figure 5-2 and is shown explicitly in Figure 6-1, the double helix has a wide groove known as the major groove and a narrow, or minor, groove. Recognition using an α helix that inserts in the major groove is advantageous for several reasons.
1. The width and depth of the major groove is a very good match to the
FIGURE 5-19 Protein-single-strand DNA interaction for single-strand DNA-binding protein (SSB). SSB is shown in gray and single-stranded DNA is shown in red. (Raghunathan S., Kozlov A.G., Lohman T.M., and Waksman
dimensions of an α helix. This match allows weak interactions to occur between the DNA and approximately half of the surface of the α helix.
2. The major groove is rich in hydrogen bond acceptors and donors located on the edges of the bases (see Figure 6-10). More importantly, the pattern of hydrogen bonding elements is distinct for each of the base pairs. This allows the pattern of hydrogen bond donors and acceptors to act as a code for the sequence of the DNA, in the same way that hydrogen bonding between the base pairs ensures the appropriate recognition of complementary DNA
sequences during DNA hybridization. A diagram of the pattern of hydrogen-bonding donor and acceptor residues in the major groove for each base pair illustrates the distinct pattern for each base pair (see Figure 6-10). Note that not only can a G:C base pair be easily distinguished from an A:T base pair, but A:T and T:A, and G:C and C:G base pairs can also be distinguished. In contrast, the pattern of base pairs in the minor groove has significantly less information and generally only allows the distinction of A:T and G:C.
3. α helices have a dipole moment that leads to their N-terminal end being positively charged. This positively-charged end frequently makes weak interactions with the phosphate backbone adjacent to the major groove.
The helix-turn-helix motif was the first protein motif involved in sequence-specific DNA binding to be identified. This motif is composed of two adjacent α helices that are separated by a short turn (Figure 5-21). One α helix, called the recognition helix, is responsible for DNA recognition. The second α helix is located approximately perpendicular to the first α helix. Although these two helices form the core of the DNA recognition motif, other nearby regions of helix-turn-helix DNA-binding proteins frequently stabilize the arrangement of these two α helices and contact the DNA. Other DNA-binding motifs also insert α helices into the major groove, such as the zinc finger and leucine zipper DNA-binding motifs (as we shall discuss in Chapter 17).
Whereas the use of an α helix is the predominant form of specific DNA recognition, some proteins do use different strategies. An extreme example of this is seen with the TATA-binding protein (TBP), which determines the site of transcriptional initiation at many eukaryotic promoters (see Chapter 12). TBP uses an extensive region of β sheet to recognize the minor groove of the so-called TATA-box (Figure 5-22).
So, in this case, we see the use of β sheet instead of α helix and interactions with the minor groove rather than the major groove (for a detailed discussion of this matter, see Chapter 12).
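The idea that groove hydrogen-bonding patterns act as a code for the DNA sequence can be illustrated with a short sketch. The four-letter and three-letter patterns below are not given in this section; they are the commonly tabulated groove codes (A = hydrogen bond acceptor, D = donor, H = nonpolar hydrogen, M = methyl group) and are included here as an assumption, as is the helper name `distinguishable_groups`.

```python
# Sketch: which base pairs can be told apart from the chemical groups
# exposed in each groove? The pattern strings are the commonly tabulated
# codes (A = acceptor, D = donor, H = nonpolar hydrogen, M = methyl);
# they are an assumption here, not values stated in this section.

MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}


def distinguishable_groups(patterns):
    """Group base pairs that present the same pattern in a groove.
    Base pairs in the same group cannot be told apart from that groove."""
    groups = {}
    for base_pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(base_pair)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    print("major groove:", distinguishable_groups(MAJOR_GROOVE))
    print("minor groove:", distinguishable_groups(MINOR_GROOVE))
```

Under these assumed patterns, the major groove yields four separate groups, one per base pair, while the minor groove yields only two (A:T with T:A, and G:C with C:G), which matches the statement above that the minor groove generally only distinguishes A:T from G:C.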
FIGURE 5-20 Schematic of interaction between the recognition helix of λ repressor monomer and major groove of operator DNA. (Source: Adapted from Jordan S.R. and Pabo C.O. 1988. Structure of the lambda complex at 2.5 Å resolution. Science 242: 893-899. Copyright © 1988 American Association for the Advancement of Science. Used with permission.)
Proteins Scan along DNA to Locate a Specific DNA-Binding Site
Many DNA-binding proteins make substantial contacts with the DNA backbone as well as with the specific base pairs of their recognition sites. Mediating these backbone contacts are patches of positively-charged amino acids located at sites very close to those that bind to the base pairs. These associations rely primarily on electrostatic attraction between these positive patches and the negatively-charged phosphate backbone of the DNA. Because the backbone has a similar negatively-charged surface, regardless of the sequence, these protein-DNA backbone contacts contribute substantially to both the specific and nonspecific affinity of a protein for DNA. Thus, even a highly specific DNA-binding protein will have a substantial affinity for nonspecific DNA sites as well. For example, the affinity of some well-characterized regulators of gene expression (such as the lactose repressor) for their recognition sequences is about 10⁵-fold greater than their affinity for nonspecific DNA. As a consequence, in the cell these proteins are typically bound at a number of nonspecific sites as well as at their specific target sequence. This is due to the much larger number of nonspecific sites compared to the specific sites. Indeed, every nucleotide in the genome
FIGURE 5-21 Geometry of λ repressor-operator complex. The schematic shows two monomers of λ repressor bound to the operator. The helices in each monomer are labeled 1 to 5. It is helix 3 which inserts into the major groove as shown in Figure 5-20. (Source: Adapted from Jordan S.R. and Pabo C.O. 1988. Structure of the lambda complex at 2.5 Å resolution. Science 242: 893-899, f. 2b, page 895. Copyright © 1988 American Association for the Advancement of Science. Used with permission.)
FIGURE 5-22 Structure of the TBP-TATA box complex. The backbone of TBP is shown in purple at the top of the figure; the DNA helix below is shown in gray and rose. (Nikolov D.B., Chen H., Halay E.D., Usheva A.A., Hisatake K., Lee D.K., Roeder R.G., and Burley S.K. 1995. Nature 377: 119.) Image prep
Leemor Joshua-Tor.
can be considered the beginning of a potential (and almost always nonspecific) binding site. Thus in E. coli, which has ~5 × 10⁶ bp in its circular genome, there would be ~5 × 10⁶ nonspecific binding sites. So, although the ratio of specific to nonspecific DNA binding affinity is high (10⁵-fold), the ratio of nonspecific-to-specific sites is even higher (5 × 10⁶-fold). This comparison explains why the cell would have to contain multiple copies of the repressor protein to ensure continued occupancy of the specific regulatory DNA-binding site. Under these conditions, most of the repressor protein molecules will be bound to nonspecific sites.
Nonspecific protein-DNA interactions are not just an unavoidable consequence of proteins using the charge of the DNA backbone in DNA recognition. These interactions are believed to speed up the rate at which a given regulatory protein finds its appropriate target. Nonspecifically-bound proteins are constrained, by their charge interaction, to diffuse linearly along DNA, rather than simply hopping on and off the DNA. This diffusion allows a DNA-binding protein to sample sites at random in their "search" for a specific binding site. By being restricted to linear movements, proteins will reach their targets faster than if they were free to diffuse throughout the cell.
A small subset of DNA-binding proteins do not merely diffuse on DNA, but instead, actively track along the DNA. These proteins use directional movement on DNA to perform key functions during DNA replication, repair, and recombination (see Chapters 8, 9, and 10). Because this movement is directional, it requires energy. Thus, these proteins hydrolyze ATP to direct changes in their binding to DNA.
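The comparison between affinity ratio and site ratio can be made concrete with a back-of-the-envelope calculation. The sketch below is only an illustration under simplifying assumptions that are not stated in the text: a single specific site, DNA sites far from saturated, and each bound protein molecule partitioning between site classes in proportion to (number of sites) × (relative affinity). The function and variable names are hypothetical.

```python
# Back-of-the-envelope partitioning of a DNA-binding protein between one
# specific site and many nonspecific sites.
# Assumptions (not from the text): equilibrium, sites far from saturated,
# and occupancy proportional to (number of sites) x (relative affinity).

NONSPECIFIC_SITES = 5e6   # roughly one per base pair of the E. coli genome (text)
SPECIFICITY_RATIO = 1e5   # specific vs. nonspecific affinity (text)


def fraction_on_specific_site(n_nonspecific, specificity_ratio, n_specific=1):
    """Fraction of DNA-bound protein expected on the specific site(s)."""
    weight_specific = n_specific * specificity_ratio
    weight_nonspecific = n_nonspecific * 1.0
    return weight_specific / (weight_specific + weight_nonspecific)


if __name__ == "__main__":
    f = fraction_on_specific_site(NONSPECIFIC_SITES, SPECIFICITY_RATIO)
    print(f"fraction on the specific site:  {f:.3f}")
    print(f"fraction on nonspecific sites: {1 - f:.3f}")
```

With these numbers only about 2% of the bound protein sits on the specific site at any moment, which is consistent with the statement that most repressor molecules are bound nonspecifically and that multiple copies are needed to keep the specific site occupied.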
Diverse Strategies for Protein Recognition of RNA
As introduced above, RNA is structurally more diverse than DNA. RNA-binding proteins have various roles in RNA function, from stabilizing the RNA to enzymatically processing the RNA. The structures of several RNA-binding proteins bound to their target molecules reveal various strategies for protein-RNA recognition.
Some RNA-binding proteins interact specifically with double-stranded RNA. In these cases, the proteins recognize features that distinguish the RNA from the DNA double helix. For example, the presence of the 2'-hydroxyl group is clearly a distinguishing feature of RNA, as is the fact that RNA forms predominantly an A-form helix (see Chapter 6), which has both deeper and narrower grooves than the B-form helix. In contrast to the DNA-binding proteins discussed above, these proteins do not engage the nucleic acid by inserting α helical regions into the RNA grooves.
Many important RNA-binding proteins bind to RNA molecules that are not in a regular helical conformation. Included are proteins that interact with messenger RNA molecules during transcription and RNA processing. Likewise, machineries that splice and translate RNA contain subunits consisting of RNA complexed with protein. The ribonuclear protein (RNP) motif is one of the most common protein sequence motifs that is dedicated to making specific RNA contacts. This 80 residue domain has a mixed α-β fold (Figure 5-23). It binds to stem-loop structures in RNA, as illustrated by the complex of the spliceosomal protein U1A (see Chapter 13) with U1 snRNA (see Figure 5-23). Clearly the shape of the RNA binding surface is specific for this structural motif in RNA.
ALLOSTERY: REGULATION OF A PROTEIN'S FUNCTION BY CHANGING ITS SHAPE
The binding of either small or large molecules (ligands) to a protein can cause a substantial change in the conformation of that protein. Such ligand-induced conformational changes can have a variety of
FIGURE 5-23 Structure of spliceosomal protein-RNA complex: U1A binds hairpin II of U1 snRNA. The protein is shown in gray; the U1 snRNA is shown in green. (Oubridge C., Ito N., Evans P.R., Teo C.H., and Nagai K. 1994. Nature 372: 432.)
effects, from increasing the affinity of the protein for a second ligand, to switching the enzymatic activity of a protein on or off. This is known as allosteric regulation and is a prevalent control mechanism in biological systems.
"Allostery" means "other shape," and the basic mechanism is as follows: A ligand binding at one site on a protein changes the shape of that protein. As a result of that change, an active site, or another binding site, elsewhere on the protein is altered in a way that increases or decreases its activity (Figure 5-24). Examples of proteins controlled in this way range from metabolic enzymes to transcriptional regulatory proteins.
The ligand (the allosteric effector) is very often a small molecule, such as a sugar or an amino acid. But allosteric regulation of a given protein can also be mediated by the binding of another protein, and a very similar effect can, in some cases, be triggered by enzymatic modification of a single amino acid residue within the regulated protein. We will see examples of allosteric regulation by all three mechanisms in this section.
The Structural Basis of Allosteric Regulation Is Known for Examples Involving Small Ligands, Protein-Protein Interactions, and Protein Modification
Here we consider the detailed structural basis for three cases of allosteric regulation. In one, the DNA-binding activity of a transcriptional regulator is controlled by the binding of a small molecule to that protein. In another, we see how a protein-protein interaction, and a protein phosphorylation event, can mediate allosteric regulation of an enzyme involved in cell division.
Small Molecule Effector: Lac Repressor Regulation by Allolactose
The lacI gene of E. coli encodes the lactose repressor (LacI). This protein (about which we will learn more in Chapter 16) is controlled allosterically; indeed, it was one of the earliest characterized examples of an allosterically controlled DNA-binding protein. The protein is involved in gene regulation, and, when bound to DNA, it prevents transcription of the genes required for the cell to use the sugar lactose as a carbon
FIGURE 5-24 Schematic view of how the binding of an end-product inhibitor inhibits an enzyme by causing an allosteric transformation. (Panel labels: regulator, substrate, enzyme-substrate.)
source. However, when lactose is present in the environment, a specific form of this sugar (β-1,6-allolactose) induces expression of the lactose genes. The allolactose inducer functions by directly binding to the Lac repressor protein and destabilizing its interaction with DNA. Structural analysis reveals that the Lac repressor changes shape upon inducer binding. (These structural studies used the artificial inducer molecule isopropyl-β-D-thiogalactoside [IPTG].) This change in shape, in turn, explains how the DNA-binding activity of the protein is weakened.
Lac repressor is a large protein (a tetramer of 155 kDa) and contains distinct domains involved in DNA binding, protein multimerization, and inducer binding. The very N-terminal region of the protein (amino acids 1 to 49) is a helix-turn-helix motif that specifically binds the DNA major groove within the control region of the promoter, as we have seen in the case of λ repressor. Adjacent to this region is an additional helix, known as the hinge helix, that makes minor groove contacts. The inducer-binding pocket, in contrast, is in the middle of the large core domain (composed of residues 62-333).
Comparing the DNA-bound structure of LacI with that of the protein free from DNA (and bound to inducer) provides a picture of why these two states are essentially mutually exclusive. Binding of inducer causes a distortion in the disposition of the N-terminal half of the large core domain. This conformational change, in turn, disrupts the structure of the hinge helix, which weakens DNA binding; the structure of the adjacent helix-turn-helix domain is rendered more flexible as well, a change likely to lower the protein's affinity for its specific DNA site (Figure 5-25).
The allosteric modification of the enzyme aspartate transcarbamoylase by its ligand, CTP, provides another example of a small molecule effector (Figure 5-26). In that case the ligand induces a well-characterized change in protein tertiary structure.
Protein Effector: Cdk Activation by Cyclin
We now turn to a case of allosteric regulation of an enzyme by the interaction between that enzyme and a regulatory protein. The enzyme (called Cdk2) is a member of a family of kinases known as cyclin-dependent kinases (Cdks) that regulate progression through the cell cycle. It is inactive until complexed with a regulatory protein called a cyclin. Binding of that second protein induces a conformational change that alters the structure of Cdk2 around its active site, partially activating its function. Further conformational changes induced by phosphorylation of a specific threonine residue nearby activate the enzyme further (see below).
The structural details of the allosteric event mediated by cyclin binding have been established. The structure of Cdk2, free from cyclin, looks very like that of other kinases. Two elements of Cdk2 structure are critical for its regulation: an α helix, called the PSTAIRE helix, and a flexible loop, called the T loop. These are both located near the kinase active site. Cyclin binding induces allosteric changes in the location of the T loop and PSTAIRE helix of Cdk2 (Figure 5-27). In the absence of a bound cyclin, the loop is located at the entrance to the active site and the helix is well away from that site. In this conformation, a glutamate residue critical to catalysis is held outside the active site. Binding of the cyclin results in the movement of the helix into the active site, allowing the critical glutamate residue to take part in catalysis. Cyclin binding also moves the loop away from the entrance of the active site, allowing access of the protein substrate.
FIGURE 5-25 Allosteric changes of the repressor. Each part of the figure shows a dimer of lac repressor. (a) The left side of the figure shows the dimer of the inducer-lac repressor complex. Binding of inducer causes a change in the structure that reduces affinity of repressor for the operator. (b) The right side of the figure shows the dimer in the absence of inducer. In this case, the hinge helices form and the N-terminal domain [...] operator sequence. (Source: Adapted [...] American Association for the Advancement of Science. Used with permission.)
FIGURE 5-26 The allosteric modification of aspartate transcarbamoylase (ATCase) by CTP. (Labels: catalytic polypeptide, substrate, CTP (inhibitor), regulatory polypeptide, inhibitor site.)
Phosphorylation as Effector: Cdk Activation by CAK
As we have just seen, Cdks are activated by binding cyclins. Full activation of Cdk requires a second allosteric change in that enzyme, mediated by phosphorylation. This phosphorylation takes place on a threonine residue within the T loop mentioned above. This modification leads to further
FIGURE 5-27 Cyclin-induced conformational changes in Cdk. (a) The monomeric kinase structure, shown in turquoise, is inactive. The position of the PSTAIRE helix holds a critical residue out of the catalytic center, where ATP is located, and the T loop blocks access to the protein substrate (not shown). (b) The structure shows the repositioning of the helix upon binding of cyclin (shown in purple) and the removal of the loop from the opening of the catalytic center. This complex is partially active. (c) Upon phosphorylation of the T loop (shown in red), the Cdk-cyclin complex becomes fully active. (Schulze-Gahmen U., De Bondt H.L., and Kim S.H. 1996. J Med Chem 39: 4540; Jeffrey P.D., Russo A.A., Polyak K., Gibbs E., Hurwitz J., Massagué J., and Pavletich N.P. 1995. Nature 376: 313; Russo A.A., Jeffrey P.D., and Pavletich N.P. 1996. Nat Struct Biol 3: 696.) Images prepared with MolScript, BobScript, and Raster3D.
reorganization of the active site of the Cdk. Once added, the phosphate group is bound by three arginines, each from a different region around the catalytic cleft. These interactions fix the catalytic cleft in a conformation favorable for high activity. The phosphorylation is performed by another kinase (called CAK). Many kinases are activated by a similar phosphorylation event.
The two events that together activate Cdks (binding of cyclin and phosphorylation) occur in that order. This is because cyclin binding not only increases the activity of the enzyme, but also makes the T loop accessible for phosphorylation by CAK.
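The ordered, two-step activation described above can be sketched as a small state model. This is a toy illustration only: the class name, method names, and activity labels are hypothetical, and the single rule it encodes (CAK cannot phosphorylate the T loop until cyclin is bound) is a simplification of the structural explanation given in the text.

```python
from dataclasses import dataclass


@dataclass
class Cdk:
    """Toy model of Cdk regulation (illustrative names, not a real API)."""
    cyclin_bound: bool = False
    t_loop_phosphorylated: bool = False

    def bind_cyclin(self):
        # Cyclin binding repositions the PSTAIRE helix and moves the T loop.
        self.cyclin_bound = True

    def phosphorylate_by_cak(self):
        # Simplifying assumption: the T loop is accessible to CAK only
        # after cyclin has bound.
        if not self.cyclin_bound:
            raise ValueError("T loop is not accessible to CAK without bound cyclin")
        self.t_loop_phosphorylated = True

    @property
    def activity(self):
        if self.cyclin_bound and self.t_loop_phosphorylated:
            return "fully active"
        if self.cyclin_bound:
            return "partially active"
        return "inactive"


cdk = Cdk()
states = [cdk.activity]
cdk.bind_cyclin()
states.append(cdk.activity)
cdk.phosphorylate_by_cak()
states.append(cdk.activity)
print(states)
```

Calling `phosphorylate_by_cak()` on a freshly created `Cdk()` raises an error in this sketch, which mirrors the statement that the two activating events occur in a fixed order.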
Not All Regulation of Proteins Is Mediated by Allosteric Events
Some proteins are controlled in ways that do not involve allostery. For example, one protein can recruit another to particular locations or substrates and in that way control what that protein acts on. When we discuss regulation of RNA polymerase (the enzyme that transcribes genes into mRNA), we will see that what (in that case) is usually meant by regulation is the choice of which gene is transcribed at any given time. This is done by regulatory proteins, which bind the DNA with one surface and the RNA polymerase with another. These interactions bring the enzyme to the gene (or genes) that bear appropriate binding sites for that particular regulator. This is an example of cooperative binding of proteins to DNA.
SUMMARY
DNA, RNA, and proteins are all polymers, each composed of a defined set of subunits joined by covalent bonds. For example, DNA is made up of chains of nucleotides, and proteins are chains of amino acids. The three-dimensional shape of each such polymer is further determined by multiple weak, or secondary, interactions between those subunits. Thus, in the case of DNA, hydrogen bonds and stacking interactions between the bases of nucleotides account for the double-helical character of that molecule. Likewise, the stable three-dimensional structure of a given protein requires multiple weak interactions between (nonadjacent) amino acids within the polypeptide chain. We discussed the nature of these weak bonds in Chapter 3; in this chapter we looked at how those weak interactions determine the shapes of molecules and the interactions between and among them, particularly proteins. (We shall consider the structures of DNA and RNA in more detail in Chapter 6.)
There are multiple levels to the structural organization of a protein. The initial covalent linkage of the amino acids is the primary structure. Each amino acid is linked to the next by a peptide bond. Secondary structure is formed by interactions between amino acids typically found rather near each other in the primary structure of the protein. The α helix and β sheet are examples of secondary structural elements. The tertiary structure of a protein is the final three-dimensional shape of a polypeptide chain, and is determined by the arrangement of the various elements of secondary structure in an energetically favorable way. For many proteins there is another level of structural organization: the quaternary structure. This refers to multimerization of individual polypeptide chains into dimer or higher-order structures. Many proteins work as multimers; hemoglobin is a tetramer, for example, and many DNA-binding proteins work as dimers.
Many native proteins contain several discrete folded sections (domains) that are stable by themselves and which arise from a continuous amino acid sequence. Combinations of such domains account for a large variety of all known proteins. The number of truly unique domains is probably only a few hundred. Each domain is often associated with a specific functional activity, for example, DNA binding.
The specific shape of each macromolecule restricts the number of other molecules with which it can interact. Strong secondary interactions between molecules demand both a complementary (lock-and-key) relationship between the two bonding surfaces and the involvement of many atoms. Although molecules bound together by only one or two secondary bonds frequently fall apart, a collection of these weak bonds can result in a quite stable complex. The fact that double-helical DNA does not fall apart spontaneously shows just how stable such complexes can be. Although complexes held together by multiple weak bonds are not observed to fall apart spontaneously, their assembly can occur spontaneously, with the correct bonds forming in a step-by-step manner (the principle of self-assembly).
The binding of specific proteins to specific sequences along DNA molecules also involves the formation of weak bonds, usually hydrogen bonds between groups on DNA bases and appropriate acceptor or donor groups on proteins. Most regulatory proteins use an α helix to recognize specific DNA sequences. That "recognition helix" fits into the major
groove of DNA, and the amino acids in the helix contact the edges of bases in a sequence-specific manner. These contacts are stabilized by the binding energy of the specific interactions. DNA-binding proteins also contain regions that allow nonspecific bonding to the DNA backbone. These nonspecific backbone interactions permit linear diffusion along DNA, allowing proteins to reach their specific target sequences more quickly. A few proteins use β sheets (rather than α helices) to recognize specific DNA sequences, and interactions with the minor groove, but these are much less common.
Proteins perform many functions, such as catalysis or DNA binding. These activities are commonly regulated by the binding of small ligands or other proteins to the protein in question, or through enzymatic modifications of residues within that protein. These ligands, or modifications, often regulate protein function through allostery. That is, the ligand binds (or the modification targets) a site on the protein separate from the region of that protein that mediates its main function (the active site of an enzyme, DNA-binding domain, etc.). This binding or modification triggers a change in the shape of the protein which increases or decreases the activity of the active site, or DNA-binding domain, essentially switching the protein on or off.
PART 2
MAINTENANCE OF THE GENOME

PART OUTLINE
• Chapter 6 The Structures of DNA and RNA
• Chapter 7 Chromosomes, Chromatin, and the Nucleosome
• Chapter 8 The Replication of DNA
• Chapter 9 The Mutability and Repair of DNA
• Chapter 10 Homologous Recombination at the Molecular Level
• Chapter 11 Site-Specific Recombination and Transposition of DNA
Part 2 is dedicated to the structure of DNA and the processes that propagate, maintain, and alter it from one cell generation to the next. In Chapters 6 through 11, we will examine DNA and its close relative, RNA, and address the following questions:
• How do the structures of DNA and RNA account for their functions?
• How are DNA molecules, which are extraordinarily long compared to the size of the cell, packaged within the nucleus?
• How is DNA replicated accurately and completely during the cell cycle, and how is this achieved with high fidelity?
• How is DNA protected from spontaneous and environmental damage, and how is damage, once inflicted, reversed?
• How are DNA sequences exchanged between DNA chromosomes in processes known as recombination and transposition?
In answering these questions, we will see that the DNA molecule is subject both to conservative processes that act to maintain it unaltered from generation to generation, and to other processes that bring about profound changes in the genetic material that help drive evolution.
In the cell, DNA is subjected to forces that peel apart its strands, twist it into topologically constrained structures, wrap it around and through protein assemblies, and break and reseal its backbone. These manipulations are mediated by myriad enzymes and molecular machines that propagate, maintain, and alter the genetic material.
Chapter 6 explores the structure of DNA in atomic detail, from the chemistry of its bases and backbone, to the base-pairing interactions and other forces that hold the two strands together. DNA is often topologically constrained, and Chapter 6 considers the biological effects of such constraints, together with enzymes that alter topology. This chapter also explores the structure of RNA. Despite the close similarity of its chemistry to that of DNA, RNA has its own distinctive structural features and properties, including the remarkable capacity to act as a catalyst in several cellular processes.
As we will learn in Chapter 7, DNA is not naked in the cell. Rather, it is packaged with specialized proteins in a structure called chromatin.
This packaging allows exceedingly long molecules to be accommodated in the cell and to be accurately segregated to daughter cells during cell division. Chromatin can be modified to increase or decrease the accessibility of the DNA. These changes contribute to ensuring it is replicated, recombined, and transcribed at the right time and in the right place. Chapter 7 introduces us to the histone and nonhistone components of chromatin, to the structure of chromatin, and to the enzymes that mediate chromatin modification.
The structure of DNA offered a likely mechanism for how genetic material is duplicated. Chapter 8 describes this copying mechanism in detail. We describe the semiconservative nature of DNA replication, and the elaborate collection of enzymes and other proteins required to carry it out.
But the replication machinery is not infallible. Each round of replication results in errors, which, if left uncorrected, would become mutations in daughter DNA molecules. In addition, DNA is a fragile molecule that undergoes damage spontaneously and from chemicals and radiation. Such damage must be detected and mended if the genetic material is to avoid rapidly accumulating an unacceptable load of
mutations. Chapter 9 is devoted to the mechanisms that detect and repair damage in DNA. Organisms from bacteria to humans rely on similar, and often highly conserved, mechanisms for preserving the integrity of their DNA. Failure of these systems has catastrophic consequences, such as cancer.
The final two chapters of Part 2 reveal a complementary aspect of DNA metabolism. In contrast to the conservative processes of replication and repair, which seek to preserve the genetic material with minimal alteration, the processes considered in these chapters are designed to bring about new arrangements of DNA sequences. Chapter 10 covers the topic of homologous recombination: the process of breakage and reunion by which very similar chromosomes (homologs) exchange equivalent segments of DNA. Homologous recombination allows the generation of genetic diversity, and also replacement of missing or damaged sequences. Two models for pathways of homologous recombination are described, as well as the fascinating set of molecular motors that search for homologous sequences between DNA molecules and then create and resolve the intermediates predicted by the pathway models.
Finally, Chapter 11 brings us to two specialized kinds of recombination known as site-specific recombination and transposition. These processes lead to the vast accumulation of some sequences within the genomes of many organisms, including humans. We will discuss the molecular mechanisms and biological consequences of these forms of genetic exchange.
PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES
Barbara McClintock and Robin Holliday, 1984 Symposium on Recombination at the DNA Level. McClintock proposed the existence of transposons to account for the results of her genetic studies with maize, carried out through the 1940s (Chapter 11); the Nobel Prize in recognition of this work came more than 30 years later, in 1983. Holliday proposed the fundamental model of homologous recombination which bears his name (Chapter 10).
Arthur Kornberg, 1978 Symposium on DNA: Replication and Recombination. Kornberg's extensive contributions to the study of DNA replication (Chapter 8) began with purifying the first enzyme that could synthesize DNA, a DNA polymerase from E. coli. His experiments showed that a DNA template was required for new DNA synthesis, confirming a prediction of the model for DNA replication proposed by Watson and Crick. For this work Kornberg shared in the 1959 Nobel Prize for Medicine.
Reiji Okazaki, 1968 Symposium on Replication of DNA in Microorganisms. Okazaki had at this time just shown how, during DNA replication, one of the new strands is synthesized in short fragments that are only later joined together. The existence of these "Okazaki fragments" explained how an enzyme that synthesizes DNA in only one direction can nevertheless make two strands of opposite polarity simultaneously (Chapter 8).
Matthew Meselson, 1968 Symposium on Replication of DNA in Microorganisms. Meselson, with Frank Stahl, demonstrated that DNA is replicated by a semi-conservative mechanism. This was once famously called "the most beautiful experiment in biology" (Chapter 2).
Franklin Stahl and Max Delbrück, 1958 Symposium on Exchange of Genetic Material: Mechanism and Consequences. Stahl was Meselson's partner in the experiment described above, and subsequently contributed much to our understanding of homologous recombination (Chapter 10). Delbrück was the influential cofounder of the so-called "Phage Group", a group of scientists that developed bacteriophage as the first model system of molecular biology (Chapter 21).
Paul Modrich, 1993 Symposium on DNA and Chromosomes. A pioneer in the DNA repair field (Chapter 9), Modrich worked out much of the mechanistic basis of mismatch repair.
CHAPTER 6
The Structures of DNA and RNA
The discovery that DNA is the prime genetic molecule, carrying all the hereditary information within chromosomes, immediately focused attention on its structure. It was hoped that knowledge of the structure would reveal how DNA carries the genetic messages that are replicated when chromosomes divide to produce two identical copies of themselves. During the late 1940s and early 1950s, several research groups in the United States and in Europe engaged in serious efforts, both cooperative and rival, to understand how the atoms of DNA are linked together by covalent bonds and how the resulting molecules are arranged in three-dimensional space. Not surprisingly, there initially were fears that DNA might have very complicated and perhaps bizarre structures that differed radically from one gene to another. Great relief, if not general elation, was thus expressed when the fundamental DNA structure was found to be the double helix. It told us that all genes have roughly the same three-dimensional form and that the differences between two genes reside in the order and number of their four nucleotide building blocks along the complementary strands.
Now, some 50 years after the discovery of the double helix, this simple description of the genetic material remains true and has not had to be appreciably altered to accommodate new findings. Nevertheless, we have come to realize that the structure of DNA is not quite as uniform as was first thought. For example, the chromosomes of some small viruses have single-stranded, not double-stranded, molecules. Moreover, the precise orientation of the base pairs varies slightly from base pair to base pair in a manner that is influenced by the local DNA sequence. Some DNA sequences even permit the double helix to twist in the left-handed sense, as opposed to the right-handed sense originally formulated for DNA's general structure. And while some DNA molecules are linear, others are circular.
Still additional complexity comes from the supercoiling (further twisting) of the double helix, often around cores of DNA-binding proteins.
Likewise, we now realize that RNA, which at first glance appears to be very similar to DNA, has its own distinctive structural features. It is principally found as a single-stranded molecule. Yet by means of intra-strand base pairing, RNA exhibits extensive double-helical character and is capable of folding into a wealth of diverse tertiary structures. These structures are full of surprises, such as nonclassical base pairs, base-backbone interactions, and knot-like configurations. Most remarkable of all, and of profound evolutionary significance, some RNA molecules are enzymes that carry out reactions that are at the core of information transfer from nucleic acid to protein.
Clearly, the structures of DNA and RNA are richer and more intricate than was at first appreciated. Indeed, there is no one generic structure for DNA and RNA. As we shall see in this chapter, there are in fact variations on common themes of structure that arise from the unique physical, chemical, and topological properties of the polynucleotide chain.
OUTLINE
• DNA Structure
• DNA Topology
• RNA Structure
[Figure 6-1 labels: hydrogen bond; base; sugar-phosphate backbone; 5' and 3' strand ends; helix width 20 Å (2 nm). Atom key: H, O, C in phosphate ester chain, C and N in bases, P.]
DNA STRUCTURE
DNA Is Composed of Polynucleotide Chains
The most important feature of DNA is that it is usually composed of two polynucleotide chains twisted around each other in the form of a double helix (Figure 6-1). The upper part of the figure (a) presents the structure of the double helix shown in a schematic form. Note that if inverted 180° (for example, by turning this book upside-down), the double helix looks superficially the same, due to the complementary nature of the two DNA strands. The space-filling model of the double helix, in the lower part of the figure (b), shows the components of the DNA molecule and their relative positions in the helical structure. The backbone of each strand of the helix is composed of alternating sugar and phosphate residues; the bases project inward but are accessible through the major and minor grooves.
Let us begin by considering the nature of the nucleotide, the fundamental building block of DNA. The nucleotide consists of a phosphate joined to a sugar, known as 2'-deoxyribose, to which a base is attached. The phosphate and the sugar have the structures shown in Figure 6-2. The sugar is called 2'-deoxyribose because there is no hydroxyl at position 2' (just two hydrogens). Note that the positions on the ribose are designated with primes to distinguish them from positions on the bases (see the discussion below).
We can think of how the base is joined to 2'-deoxyribose by imagining the removal of a molecule of water between the hydroxyl on the 1' carbon of the sugar and the base to form a glycosidic bond (Figure 6-2). The sugar and base alone are called a nucleoside. Likewise, we can imagine linking the phosphate to 2'-deoxyribose by removing a water molecule from between the phosphate and the hydroxyl on the 5' carbon to make a 5' phosphomonoester. Adding a phosphate (or more than one phosphate) to a nucleoside creates a nucleotide. Thus, by making a glycosidic bond between the base and the sugar, and by making a phosphoester bond between the sugar and the phosphoric acid, we have created a nucleotide (Table 6-1).
FIGURE 6-1 The helical structure of DNA. (a) Schematic model of the double helix. One turn of the helix (34 Å or 3.4 nm) spans approximately 10.5 base pairs. (b) Space-filling model of the double helix. The sugar and phosphate residues in each strand form the backbone, which is traced by the yellow, gray, and red circles, showing the helical twist of the overall molecule. The bases project inward but are accessible through major and minor grooves.
FIGURE 6-2 Formation of nucleotide by removal of water. The numbers of the carbon atoms in 2'-
TABLE 6-1 Adenine and Related Compounds

                   Base       Nucleoside           Nucleotide
Name               Adenine    2'-deoxyadenosine    2'-deoxyadenosine 5'-phosphate
Structure          (structural diagrams not reproduced)
Molecular weight   135.1      251.2                331.2
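The molecular weights in Table 6-1 are consistent with the picture of building a nucleotide by removing one water molecule per bond formed. The short check below is a sketch under stated assumptions: the weight of adenine comes from the table, while the values for 2'-deoxyribose, phosphoric acid, and water are standard figures supplied here for illustration and are not part of the table.

```python
# Molecular weights in g/mol.
ADENINE = 135.1          # from Table 6-1
DEOXYRIBOSE = 134.1      # assumption: standard value for 2'-deoxyribose (C5H10O4)
PHOSPHORIC_ACID = 98.0   # assumption: standard value for H3PO4
WATER = 18.0             # assumption: standard value for H2O

# Glycosidic bond: base + sugar, minus one water.
nucleoside = ADENINE + DEOXYRIBOSE - WATER

# 5' phosphomonoester: nucleoside + phosphoric acid, minus one water.
nucleotide = nucleoside + PHOSPHORIC_ACID - WATER

print(f"2'-deoxyadenosine: {nucleoside:.1f}")
print(f"2'-deoxyadenosine 5'-phosphate: {nucleotide:.1f}")
```

The computed values agree with the nucleoside and nucleotide entries in the table to one decimal place.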
Nucleotides are, in turn, joined to each other in polynucleotide chains through the 3'-hydroxyl of 2'-deoxyribose of one nucleotide and the phosphate attached to the 5'-hydroxyl of another nucleotide (Figure 6-3). This is a phosphodiester linkage in which the phosphoryl group between the two nucleotides has one sugar esterified to it
FIGURE 6-3 Detailed structure of polynucleotide polymer. The structure shows base pairing between purines (in blue) and pyrimidines (in yellow), and the phosphodiester linkages of the backbone. (Source: Adapted from Dickerson R.E. 1983. Scientific American 249: 94. Illustration, Irving Geis. Image from Irving Geis Collection/Howard Hughes Medical Institute. Not to be reproduced without permission.)
through a 3'-hydroxyl and a second sugar esterified to it through a 5'-hydroxyl. Phosphodiester linkages create the repeating, sugar-phosphate backbone of the polynucleotide chain, which is a regular feature of DNA. In contrast, the order of the bases along the polynucleotide chain is irregular. This irregularity as well as the long length is, as we shall see, the basis for the enormous information content of DNA.
The phosphodiester linkages impart an inherent polarity to the DNA chain. This polarity is defined by the asymmetry of the nucleotides and the way they are joined. DNA chains have a free 5'-phosphate or 5'-hydroxyl at one end and a free 3'-phosphate or 3'-hydroxyl at the other end. The convention is to write DNA sequences from the 5' end (on the left) to the 3' end, generally with a 5'-phosphate and a 3'-hydroxyl.
Each Base Has Its Preferred Tautomeric Form
FIGURE 6-4 Purines and pyrimidines. The dotted lines indicate the sites of attachment of the bases to the sugars. For simplicity, hydrogens are omitted from the sugars and bases in subsequent figures, except where pertinent to the illustration.
The bases in DNA fall into two classes, purines and pyrimidines. The purines are adenine and guanine, and the pyrimidines are cytosine and thymine. The purines are derived from the double-ringed structure shown in Figure 6-4. Adenine and guanine share this essential structure but with different groups attached. Likewise, cytosine and thymine are variations on the single-ringed structure shown in Figure 6-4. The figure also shows the numbering of the positions in the purine and pyrimidine rings. The bases are attached to the deoxyribose by glycosidic linkages at N1 of the pyrimidines or at N9 of the purines. Each of the bases exists in two alternative tautomeric states, which are in equilibrium with each other. The equilibrium lies far to the side of the conventional structures shown in Figure 6-4, which are the predominant states and the ones important for base pairing. The nitrogen atoms attached to the purine and pyrimidine rings are in the amino form in the predominant state and only rarely assume the imino configuration. Likewise, the oxygen atoms attached to the guanine and thymine normally have the keto form and only rarely take on the enol configuration. As examples, Figure 6-5 shows tautomerization of cytosine into the imino form (a) and guanine into the enol form (b). As we shall see, the capacity to form an alternative tautomer is a frequent source of errors during DNA synthesis.
The Two Strands of the Double Helix Are Held Together by Base Pairing in an Antiparallel Orientation
The double helix is composed of two polynucleotide chains that are held together by weak, noncovalent bonds between pairs of bases, as shown in Figure 6-3. Adenine on one chain is always paired with thymine on the other chain and, likewise, guanine is always paired with cytosine. The two strands have the same helical geometry but base pairing holds them together with the opposite polarity. That is, the base at the 5' end of one strand is paired with the base at the 3' end of the other strand. The strands are said to have an antiparallel orientation. This antiparallel orientation is a stereochemical consequence of the way that adenine and thymine, and guanine and cytosine, pair with each other.
FIGURE 6-5 Base tautomers. Amino ⇌ imino and keto ⇌ enol tautomerism. (a) Cytosine is usually in the amino form but rarely forms the imino configuration. (b) Guanine is usually in the keto form but is rarely found in the enol configuration.
The Two Chains of the Double Helix Have Complementary Sequences
The pairing between adenine and thymine, and between guanine and cytosine, results in a complementary relationship between the sequence of bases on the two intertwined chains and gives DNA its self-encoding character. For example, if we have the sequence 5'-ATGTC-3' on one chain, the opposite chain must have the complementary sequence 3'-TACAG-5'. The strictness of the rules for this "Watson-Crick" pairing derives from the complementarity both of shape and of hydrogen bonding properties between adenine and thymine and between guanine and cytosine (Figure 6-6). Adenine and thymine match up so that a hydrogen bond can form between the exocyclic amino group at C6 on adenine and the carbonyl at C4 in thymine; and likewise, a hydrogen bond can form between N1 of adenine and N3 of thymine. A corresponding arrangement can be drawn between a guanine and a cytosine, so that there is both hydrogen bonding and shape complementarity in this base pair as well. A G:C base pair has three hydrogen bonds, because the exocyclic NH2 at C2 on guanine lies opposite to, and can hydrogen bond with, a carbonyl at C2 on cytosine. Likewise, a hydrogen bond can form between N1 of guanine and N3 of cytosine and between the carbonyl at C6 of guanine and the exocyclic NH2 at C4 of cytosine. Watson-Crick base pairing requires that the bases are in their preferred tautomeric states. An important feature of the double helix is that the two base pairs have exactly the same geometry; having an A:T base pair or a G:C base pair between the two sugars does not perturb the arrangement of the sugars because the distance between the sugar attachment points is the same for both base pairs. Neither does T:A or C:G. In other words,
FIGURE 6-6 A:T and G:C base pairs. The figure shows hydrogen bonding between the bases.
there is an approximately twofold axis of symmetry that relates the two sugars and all four base pairs can be accommodated within the same arrangement without any distortion of the overall structure of the DNA. In addition, the base pairs can stack neatly on top of each other between the two helical sugar-phosphate backbones.
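The complementarity rule can be illustrated with a short sketch. The function names below are our own, chosen only for illustration; the pairing rules (A with T, G with C) and the example sequence 5'-ATGTC-3' come from the text above.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Return the partner strand, base by base, written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in strand_5_to_3)


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand rewritten in the conventional 5'->3' direction."""
    return complement_3_to_5(strand_5_to_3)[::-1]


top = "ATGTC"
print("5'-" + top + "-3'")
print("3'-" + complement_3_to_5(top) + "-5'")   # 3'-TACAG-5'
print("5'-" + reverse_complement(top) + "-3'")  # 5'-GACAT-3'
```

Because the strands are antiparallel, the same partner strand reads TACAG from its 3' end and GACAT from its 5' end.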
Hydrogen Bonding Is Important for the Specificity of Base Pairing
FIGURE 6-7 A:C incompatibility. The structure shows the inability of adenine to form the proper hydrogen bonds with cytosine. The base pair is therefore unstable.
The hydrogen bonds between complementary bases are a fundamental feature of the double helix, contributing to the thermodynamic stability of the helix and the specificity of base pairing. Hydrogen bonding might not, at first glance, appear to contribute importantly to the stability of DNA for the following reason. An organic molecule in aqueous solution has all of its hydrogen bonding properties satisfied by water molecules that come on and off very rapidly. As a result, for every hydrogen bond that is made when a base pair forms, a hydrogen bond with water is broken that was there before the base pair formed. Thus, the net energetic contribution of hydrogen bonds to the stability of the double helix would appe
FIGURE 6-8 Base flipping. Structure of isolated DNA, showing flipped cytosine residue and the small distortions to the adjacent base pairs. (Klimasauskas S., Kumar S., Roberts R.J., and Cheng X. 1994. Cell 76: 357. Image prepared with BobScript, MolScript, and Raster 3D.)
As we have seen, the energetics of the double helix favor the pairing of each base on one polynucleotide strand with the complementary base on the other strand. Sometimes, however, individual bases can protrude from the double helix in a remarkable phenomenon known as base flipping, shown in Figure 6-8. As we shall see in Chapter 9, certain enzymes that methylate bases or remove damaged bases do so with the base in an extra-helical configuration in which it is flipped out from the double helix, enabling the base to sit in the catalytic cavity of the enzyme. Furthermore, enzymes involved in homologous recombination and DNA repair are believed to scan DNA for homology or lesions by flipping out one base after another. This is not energetically expensive because only one base is flipped out at a time. Clearly, DNA is more flexible than might be assumed at first glance.
DNA Is Usually a Right-Handed Double Helix
Applying the handedness rule from physics, we can see that each of the polynucleotide chains in the double helix is right-handed. In your mind's eye, hold your right hand up to the DNA molecule in Figure 6-9 with your thumb pointing up and along the long axis of the helix and your fingers following the grooves in the helix. Trace along one strand of the helix in the direction in which your thumb is pointing. Notice that you go around the helix in the same direction as your fingers are pointing. This does not work if you use your left hand. Try it! A consequence of the helical nature of DNA is its periodicity. Each base pair is displaced (twisted) from the previous one by about 36°. Thus, in the X-ray crystal structure of DNA it takes a stack of about 10 base pairs to go completely around the helix (360°) (see Figure 6-1a). That is, the helical periodicity is generally 10 base pairs per turn of the helix. For further discussion, see Box 6-1, DNA Has 10.5 Base Pairs per Turn of the Helix in Solution: The Mica Experiment.
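The arithmetic behind the helical periodicity is simple enough to sketch. The two helper functions below are our own illustrative names; the 36° twist and the 10 and 10.5 base pairs per turn are the values quoted in the text.

```python
def base_pairs_per_turn(twist_deg_per_bp: float) -> float:
    """Number of stacked base pairs needed to complete one 360-degree turn."""
    return 360.0 / twist_deg_per_bp


def twist_per_base_pair(bp_per_turn: float) -> float:
    """Average rotation (in degrees) from one base pair to the next."""
    return 360.0 / bp_per_turn


# A twist of about 36 degrees per base pair gives 10 base pairs per turn.
print(base_pairs_per_turn(36.0))            # 10.0

# Conversely, 10.5 base pairs per turn corresponds to about 34.3 degrees per base pair.
print(round(twist_per_base_pair(10.5), 1))  # 34.3
```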
The Double Helix Has Minor and Major Grooves
As a result of the double-helical structure of the two chains, the DNA molecule is a long extended polymer with two grooves that are not equal in size to each other. Why are there a minor groove and a major groove? It is a simple consequence of the geometry of the base pair. The angle at which the two sugars protrude from the base pairs (that is, the angle between the glycosidic bonds) is about 120° (for the narrow angle or 240° for the wide angle) (see Figures 6-1b and 6-6). As a result, as more and more base pairs stack on top of each other, the narrow angle between the sugars on one edge of the base pairs generates a minor groove and the large angle on the other edge generates a major groove. (If the sugars pointed away from each other in a straight line, that is, at an angle of 180°, then the two grooves would be of equal dimensions and there would be no minor and major grooves.)
The Major Groove Is Rich in Chemical Information
The edges of each base pair are exposed in the major and minor grooves, creating a pattern of hydrogen bond donors and acceptors and of van der Waals surfaces that identifies the base pair (see Figure 6-10). The edge of an A:T base pair displays the following chemical groups in the following order in the major groove: a hydrogen bond acceptor (the N7 of adenine), a hydrogen bond donor (the exocyclic amino group on C6 of adenine), a hydrogen bond acceptor (the carbonyl group on C4 of
FIGURE 6-9 Left- and right-handed helices. The two polynucleotide chains in the double helix wrap around one another in a right-handed manner.
Box 6-1 DNA Has 10.5 Base Pairs per Turn of the Helix in Solution: The Mica Experiment
This value of 10 base pairs per turn varies somewhat under different conditions. A classic experiment that was carried out in the 1970s demonstrated that DNA adsorbed on a surface has somewhat greater than 10 base pairs per turn. Short segments of DNA were allowed to bind to a mica surface. The presence of 5'-terminal phosphates on the DNAs held them in a fixed orientation on the mica. The mica-bound DNAs were then exposed to DNAse I, an enzyme (a deoxyribonuclease) that cleaves the phosphodiester bonds in the DNA backbone. Because the enzyme is bulky, it is only able to cleave phosphodiester bonds on the DNA surface furthest from the mica (think of the DNA as a cylinder lying down on a flat surface) due to the steric difficulty of reaching the sides or bottom surface of the DNA. As a result, the length of the resulting fragments should reflect the periodicity of the DNA, the number of base pairs per turn. After the mica-bound DNA was exposed to DNAse, the resulting fragments were separated by electrophoresis in a polyacrylamide gel, a jelly-like matrix (Box 6-1 Figure 1; see also Chapter 20 for an explanation of gel electrophoresis). Because DNA is negatively charged, it migrates through the gel toward the positive pole of the electric field. The gel matrix impedes movement of the fragments in a manner that is proportional to their length such that larger fragments migrate more slowly than smaller fragments. When the experiment is carried out, we see clusters of DNA fragments of average sizes 10 and 11, 21, 31 and 32 base pairs and so forth, that is, in multiples of 10.5, which is the number of base pairs per turn. This value of 10.5 base pairs per turn is close to that of DNA in solution as inferred by other methods (see the section titled The Double Helix Exists in Multiple Conformations, below). The strategy of using DNAse to probe the structure of DNA is now used to analyze the interaction of DNA with proteins (see Chapter 17).
BOX 6-1 FIGURE 1 The mica experiment.
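To see why a periodicity of 10.5 base pairs per turn yields fragment clusters of 10 and 11, 21, and 31 and 32 base pairs, consider the following sketch. It rests on a simplifying assumption of our own: a cut can only fall at a whole number of base pairs, so each multiple of the periodicity is represented by the nearest whole lengths on either side of it. The function name is hypothetical.

```python
import math


def expected_fragment_sizes(bp_per_turn: float, turns: int) -> list[tuple[int, ...]]:
    """Whole-number fragment lengths bracketing each multiple of the periodicity.

    Assumption (ours): cleavage occurs only at integral base-pair positions,
    so a non-integral multiple shows up as the two neighboring integers.
    """
    sizes = []
    for n in range(1, turns + 1):
        length = n * bp_per_turn
        low, high = math.floor(length), math.ceil(length)
        sizes.append((low,) if low == high else (low, high))
    return sizes


print(expected_fragment_sizes(10.5, 3))  # [(10, 11), (21,), (31, 32)]
print(expected_fragment_sizes(10.0, 3))  # [(10,), (20,), (30,)]
```

Had the periodicity been exactly 10, the clusters would have fallen at 10, 20, and 30 base pairs; the doublets at 10 and 11 and at 31 and 32 are what point to a half-integral value.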
thymine) and a bulky hydrophobic surface (the methyl group on C5 of thymine). Similarly, the edge of a G:C base pair displays the following groups in the major groove: a hydrogen bond acceptor (at N7 of guanine), a hydrogen bond acceptor (the carbonyl on C6 of guanine), a hydrogen bond donor (the exocyclic amino group on C4 of cytosine), and a small nonpolar hydrogen (the hydrogen at C5 of cytosine).
FIGURE 6-10 Chemical groups exposed in the major and minor grooves from the edges of the base pairs. The letters in red identify hydrogen bond acceptors (A), hydrogen bond donors (D), nonpolar hydrogens (H), and methyl groups (M).
Thus, there are characteristic patterns of hydrogen bonding and of overall shape that are exposed in the major groove that distinguish an A:T base pair from a G:C base pair, and, for that matter, A:T from T:A, and G:C from C:G. We can think of these features as a code in which A represents a hydrogen bond acceptor, D a hydrogen bond donor, M a methyl group, and H a nonpolar hydrogen. In such a code, A D A M in the major groove signifies an A:T base pair, and A A D H stands for a G:C base pair. Likewise, M A D A stands for a T:A base pair and H D A A is characteristic of a C:G base pair. In all cases, this code of chemical groups in the major groove specifies the identity of the base pair. These patterns are important because they allow proteins to unambiguously recognize DNA sequences without having to open and thereby disrupt the double helix. Indeed, as we shall see, a principal decoding mechanism relies upon the ability of amino acid side chains to protrude into the major groove and to recognize and bind to specific DNA sequences. The minor groove is not as rich in chemical information and what information is available is less useful for distinguishing between base pairs. The small size of the minor groove is less able to accommodate amino acid side chains. Also, A:T and T:A base pairs and G:C and C:G base pairs look similar to one another in the minor groove. An A:T base pair has a hydrogen bond acceptor (at N3 of adenine), a nonpolar hydrogen (at C2 of adenine) and a hydrogen bond acceptor (the carbonyl on C2 of thymine). Thus, its code is A H A. But this code is the same if read
in the opposite direction, and hence an A:T base pair does not look very different from a T:A base pair from the point of view of the hydrogen-bonding properties of a protein poking its side chains into the minor groove. Likewise, a G:C base pair exhibits a hydrogen bond acceptor (at N3 of guanine), a hydrogen bond donor (the exocyclic amino group on C2 of guanine), and a hydrogen bond acceptor (the carbonyl on C2 of cytosine), representing the code A D A. Thus, from the point of view of hydrogen bonding, C:G and G:C base pairs do not look very different from each other either. The minor groove does look different when comparing an A:T base pair with a G:C base pair, but G:C and C:G, or A:T and T:A, cannot be easily distinguished (see Figure 6-10).
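The groove codes lend themselves to a small sketch that makes the contrast between the two grooves explicit. The codes for A:T and G:C are those given in the text; the assumption that flipping a base pair (A:T to T:A, G:C to C:G) simply reverses the order in which the groups are read is ours, as are the function and variable names.

```python
# Groove codes from the text: A = acceptor, D = donor, H = nonpolar hydrogen, M = methyl.
MAJOR_GROOVE = {"A:T": "ADAM", "G:C": "AADH"}
MINOR_GROOVE = {"A:T": "AHA", "G:C": "ADA"}


def flip(pair: str) -> str:
    """Swap the two bases of a pair, for example 'A:T' -> 'T:A'."""
    first, second = pair.split(":")
    return f"{second}:{first}"


def with_flipped_pairs(codes: dict[str, str]) -> dict[str, str]:
    """Add the flipped pairs, reading each code in the opposite direction."""
    full = dict(codes)
    for pair, code in codes.items():
        full[flip(pair)] = code[::-1]
    return full


major = with_flipped_pairs(MAJOR_GROOVE)
minor = with_flipped_pairs(MINOR_GROOVE)

print(major)  # {'A:T': 'ADAM', 'G:C': 'AADH', 'T:A': 'MADA', 'C:G': 'HDAA'}
print(minor)  # {'A:T': 'AHA', 'G:C': 'ADA', 'T:A': 'AHA', 'C:G': 'ADA'}

# Four distinct codes in the major groove, but only two in the minor groove.
print(len(set(major.values())), len(set(minor.values())))  # 4 2
```

The minor groove codes are palindromes, which is why reading them in either direction cannot tell A:T from T:A or G:C from C:G.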
The Double Helix Exists in Multiple Conformations
Early X-ray diffraction studies of DNA, which were carried out using concentrated solutions of DNA that had been drawn out into thin fibers, revealed two kinds of structures, the B and the A forms of DNA (Figure 6-11). The B form, which is observed at high humidity, most closely corresponds to the average structure of DNA under physiological conditions. It has 10 base pairs per turn, and a wide major groove and a narrow minor groove. The A form, which is observed under conditions of low humidity, has 11 base pairs per turn. Its major groove is narrower and much deeper than that of the B form, and its minor groove is broader and shallower. The vast majority of the DNA in the cell is in the B form, but DNA does adopt the A structure in certain DNA-protein complexes. Also, as we shall see, the A form is similar to the structure that RNA adopts when double helical. The B form of DNA represents an ideal structure that deviates in two respects from the DNA in cells. First, DNA in solution, as we have seen, is somewhat more twisted on average than the B form, having on
FIGURE 6-11 Models of the B, A, and Z forms of DNA. The sugar-phosphate backbone of each chain is on the outside in all structures (one purple and one green) with the bases (silver) oriented inward. Side views are shown at the top, and views along the helical axis at the bottom. (a) The B form of DNA, the usual form found in cells, is characterized by a helical turn every 10 base pairs (3.4 nm); adjacent stacked base pairs are 0.34 nm apart. The major and minor grooves are also visible. (b) The more compact A form of DNA has 11 base pairs per turn and exhibits a large tilt of the base pairs with respect to the helix axis. In addition, the A form has a central hole (bottom). This helical form is adopted by RNA-DNA and RNA-RNA helices. (c) Z DNA is a left-handed helix and has a zigzag (hence "Z") appearance. (Source: Courtesy of C. Kielkopf and P. B. Dervan.)
FIGURE 6-12 The propeller twist between the purine and pyrimidine base pairs of a right-handed helix. (a) The structure shows a sequence of three consecutive A:T base pairs with normal Watson-Crick bonding. (b) Propeller twist causes rotation of the bases about their long axis. (Source: Adapted from Aggarwal et al. 1988. Science 242: 899-907, figure 5b. Copyright 1988 American Association for the Advancement of Science. Used with permission.)
average 10.5 base pairs per turn of the helix. Second, the B form is an average structure whereas real DNA is not perfectly regular. Rather, it exhibits variations in its precise structure from base pair to base pair. This was revealed by comparison of the crystal structures of individual DNAs of different sequences. For example, the two members of each base pair do not always lie exactly in the same plane. Rather, they can display a "propeller twist" arrangement in which the two flat bases counter rotate relative to each other along the long axis of the base pair, giving the base pair a propeller-like character (Figure 6-12). Moreover, the precise rotation per base pair is not a constant. As a result, the width of the major and minor grooves varies locally. Thus, DNA molecules are never perfectly regular double helices. Instead, their exact conformation depends on which base pair (A:T, T:A, G:C, or C:G) is present at each position along the double helix and on the identity of neighboring base pairs. Still, the B form is for many purposes a good first approximation of the structure of DNA in cells.
DNA Can Sometimes Form a Left-Handed Helix
DNA containing alternating purine and pyrimidine residues can fold into left-handed as well as right-handed helices. To understand how DNA can form a left-handed helix, we need to consider the glycosidic bond that connects the base to the 1' position of 2'-deoxyribose. This bond can be in one of two conformations called syn and anti (Figure 6-13). In right-handed DNA, the glycosidic bond is always in the anti conformation. In the left-handed helix, the fundamental repeating unit usually is a purine-pyrimidine dinucleotide, with the glycosidic bond in the anti conformation at pyrimidine residues and in the syn conformation at purine residues. It is this syn conformation at the purine nucleotides that is responsible for the left-handedness of the helix. The change to the syn position in the purine residues to alternating anti-syn conformations gives the backbone of left-handed DNA a zigzag look (hence its designation of Z DNA; see Figure 6-11), which distinguishes it from right-handed forms. The rotation that effects the
FIGURE 6-13 Syn and anti positions of guanine in B and Z DNA. In right-handed B DNA, the glycosyl bond (colored red) connecting the base to the deoxyribose group is always in the anti position, while in left-handed Z DNA it rotates in the direction of the arrow, forming the syn conformation at the purine (here gu
(Source: Adapted from Wang A. J. H. et al. 1982. CSHSQB 47: 41. Copyright © 1982 Cold Spring Harbor Laboratory Press. Used with permission.)
change from anti to syn also causes the ribose group to undergo a change in its pucker. Note, as shown in Figure 6-13, that C3' and C2' can switch locations. In solution alternating purine-pyrimidine residues assume the left-handed conformation only in the presence of high concentrations of positively charged ions (for example, Na+) that shield the negatively charged phosphate groups. At lower salt concentrations, they form typical right-handed conformations. The physiological significance of Z DNA is uncertain and left-handed helices probably account at most for only a small proportion of a cell's DNA. Further details of the A, B, and Z forms of DNA are presented in Table 6-2.
DNA Strands Can Separate (Denature) and Reassociate
Because the two strands of the double helix are held together by relatively weak (noncovalent) forces, you might expect that the two strands could come apart easily. Indeed, the original structure for the double helix suggested that DNA replication would occur in just this manner. The complementary strands of the double helix can also be made to come apart when a solution of DNA is heated above physiological temperatures (to near 100°C) or under conditions of high pH, a process known as denaturation. However, this complete separation of DNA strands by denaturation is reversible. When heated solutions of denatured DNA are slowly cooled, single strands often meet their complementary strands and reform regular double helices (Figure 6-14). The capacity to renature denatured DNA molecules permits
TABLE 6-2 A Comparison of the Structural Properties of A, B, and Z DNAs as Derived from Single-Crystal X-Ray Analysis

Helix Type | A | B | Z
Overall proportions | Short and broad | Longer and thinner | Elongated and slim
Rise per base pair | 2.3 Å | 3.32 Å | 3.8 Å
Helix-packing diameter | 25.5 Å | 23.7 Å | 18.4 Å
Helix rotation sense | Right-handed | Right-handed | Left-handed
Base pairs per helix repeat | 1 | 1 | 2
Base pairs per turn of helix | ~11 | ~10 | 12
Rotation per base pair | 33.6° | 35.9° | -60° per 2 bp
Pitch per turn of helix | 24.6 Å | 33.2 Å | 45.6 Å
Tilt of base normals to helix axis | +19° | -1.2° | -9°
Base-pair mean propeller twist | +18° | +16° | ~0°
Helix axis location | Major groove | Through base pairs | Minor groove
Major-groove proportions | Extremely narrow but very deep | Wide and of intermediate depth | Flattened out on helix surface
Minor-groove proportions | Very broad but shallow | Narrow and of intermediate depth | Extremely narrow but very deep
Glycosyl-bond conformation | anti | anti | anti at C, syn at G

Source: Adapted from Dickerson R. E. et al. 1982. CSHSQB 47: 14. Copyright © 1982 Cold Spring Harbor Laboratory Press. Used with permission.
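Several entries in Table 6-2 are related by simple geometry, and a brief sketch shows how. Here the number of base pairs per turn is taken to be 360° divided by the rotation per base pair, and the pitch to be that number multiplied by the rise per base pair. The function names are our own, and the results should be read as approximate checks against the tabulated values for the A and B forms rather than as independent measurements.

```python
def bp_per_turn(rotation_deg_per_bp: float) -> float:
    """Base pairs needed for one full turn, given the rotation per base pair."""
    return 360.0 / rotation_deg_per_bp


def pitch_angstroms(rise_per_bp: float, rotation_deg_per_bp: float) -> float:
    """Length of one helical turn along the axis: base pairs per turn times rise."""
    return bp_per_turn(rotation_deg_per_bp) * rise_per_bp


# Rise (in angstroms) and rotation per base pair (in degrees) as listed in Table 6-2.
HELICES = {"A": (2.3, 33.6), "B": (3.32, 35.9)}

for name, (rise, rotation) in HELICES.items():
    turns = bp_per_turn(rotation)
    pitch = pitch_angstroms(rise, rotation)
    print(f"{name} form: {turns:.1f} bp per turn, pitch about {pitch:.1f} angstroms")
```

The values computed this way come out close to the tabulated 24.6 Å for the A form and 33.2 Å for the B form, with small differences attributable to rounding in the table.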
FIGURE 6-14 Reannealing and hybridization. A mixture of two otherwise identical double-stranded DNA molecules, one normal wild-type DNA and the other a mutant missing a short stretch of nucleotides (marked as region a in red), are denatured by heating. The denatured DNA molecules are allowed to renature by incubation just below the melting temperature. This treatment results in two types of renatured molecules. One type is composed of completely renatured molecules in which two complementary wild-type strands reform a helix and two complementary mutant strands reform a helix. The other type are hybrid molecules, composed of a wild-type and a mutant strand, exhibiting a short unpaired loop of DNA (region a).
for several indispensable techniques in molecular biology, such as Southern blot hybridization (see Chapter 20) and DNA microarray analysis (see Chapter 18, Box 18-1). Important insights into the properties of the double helix were obtained from classic experiments carried out in the 1950s in which the denaturation of DNA was studied under a variety of conditions. In these experiments, DNA denaturation was monitored by measuring the absorbance of ultraviolet light passed through a solution of DNA. DNA
maximally absorbs ultraviolet light at a wavelength of about 260 nm. It is the bases that are principally responsible for this absorption. When the temperature of a solution of DNA is raised to near the boiling point of water, the optical density, called absorbance, at 260 nm markedly increases, a phenomenon known as hyperchromicity. The explanation for this increase is that duplex DNA absorbs less ultraviolet light by about 40% than do individual DNA chains. This hypochromicity is due to base stacking, which diminishes the capacity of the bases in duplex DNA to absorb ultraviolet light.
If we plot the optical density of DNA as a function of temperature, we observe that the increase in absorption occurs abruptly over a relatively narrow temperature range. The midpoint of this transition is the melting point or Tm (Figure 6-15). Like ice, DNA melts: it undergoes a transition from a highly ordered double-helical structure to a much less ordered structure of individual strands. The sharpness of the increase in absorbance at the melting temperature tells us that the denaturation and renaturation of complementary DNA strands is a highly cooperative, zippering-like process. Renaturation, for example, probably occurs by means of a slow nucleation process in which a relatively small stretch of bases on one strand finds and pairs with its complement on the complementary strand (middle panel of Figure 6-14). The remainder of the two strands then rapidly zipper-up from the nucleation site to reform an extended double helix (lower panel of Figure 6-14).
The melting temperature of DNA is a characteristic of each DNA that is largely determined by the G:C content of the DNA and the ionic strength of the solution. The higher the percent of G:C base pairs in the DNA (and hence the lower the content of A:T base pairs), the higher the melting point (Figure 6-16). Likewise, the higher the salt concentration of the solution, the greater the temperature at which the DNA denatures. How do we explain this behavior? G:C base pairs contribute more to the stability of DNA than do A:T base pairs because of the greater number of hydrogen bonds for the former (three in a G:C base pair versus two for A:T) but also, importantly, because the stacking interactions of G:C base pairs with adjacent base pairs are more favorable than the corresponding interactions of A:T base pairs with their neighboring base pairs. The effect of ionic strength reflects another fundamental feature of the double helix. The backbones of the two DNA strands contain phosphoryl
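FIGURE 6-15 DNA denaturation curve. [Plot: temperature (°C) on the horizontal axis, with Tm marked; the curve rises from the double-stranded state at low temperature to the single-stranded state at high temperature.]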
groups which carry a negative charge. These negative charges are close enough across the two strands that if not shielded, they tend to cause the strands to repel each other, facilitating their separation. At high ionic strength, the negative charges are shielded by cations, thereby stabilizing the helix. Conversely, at low ionic strength the unshielded negative charges render the helix less stable.
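The definition of Tm as the midpoint of the absorbance transition can be made concrete with a short sketch. The absorbance readings below are invented for illustration (chosen so that the single-stranded plateau is about 40% higher than the duplex baseline), and the function name and the use of linear interpolation between neighboring readings are our own assumptions, not a description of how the original experiments were analyzed.

```python
def melting_temperature(temps: list[float], absorbances: list[float]) -> float:
    """Estimate Tm as the temperature at which absorbance is halfway between
    its lowest (duplex) and highest (single-stranded) values.

    Uses linear interpolation between adjacent readings; assumes the readings
    are ordered by increasing temperature.
    """
    half = (min(absorbances) + max(absorbances)) / 2.0
    for i in range(len(temps) - 1):
        a0, a1 = absorbances[i], absorbances[i + 1]
        if a0 <= half <= a1 and a1 != a0:
            t0, t1 = temps[i], temps[i + 1]
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("absorbance never crosses its midpoint")


# Hypothetical readings: temperature in degrees C, relative absorbance at 260 nm.
temps = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
a260 = [1.00, 1.00, 1.02, 1.10, 1.30, 1.39, 1.40]

print(round(melting_temperature(temps, a260), 1))  # 75.0
```

A sharper transition would concentrate the rise in absorbance into a narrower temperature interval without changing where the midpoint falls.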
Some DNA Molecules Are Circles
It was initially believed that all DNA molecules are linear and have two free ends. Indeed, the chromosomes of eukaryotic cells each contain a single (extremely long) DNA molecule. But now we know that some DNAs are circles. For example, the chromosome of the small monkey DNA virus SV40 is a circular, double-helical DNA molecule of about 5,000 base pairs. Also, most (but not all) bacterial chromosomes are circular; E. coli has a circular chromosome of about 5 million base pairs. Additionally, many bacteria have small autonomously replicating genetic elements known as plasmids, which are generally circular DNA molecules. Interestingly, some DNA molecules are sometimes linear and sometimes circular. The most well-known example is that of the bacteriophage λ, a DNA virus of E. coli. The phage λ genome is a linear double-stranded molecule in the virion particle. However, when the λ genome is injected into an E. coli cell during infection, the DNA circularizes. This occurs by base-pairing between single-stranded regions that protrude from the ends of the DNA and that have complementary sequences, also known as "sticky ends."
DNA TOPOLOGY
As DNA is a flexible structure, its exact molecular parameters are a function of both the surrounding ionic environment and the nature of the DNA-binding proteins with which it is complexed. Because their ends are free, linear DNA molecules can freely rotate to accommodate
FIGURE 6-16 Dependence of DNA denaturation on G + C content and on salt concentration. The greater the G + C content, the higher the temperature must be to denature the DNA strand. DNA from different sources was dissolved in solutions of low (red line) and high (green line) concentrations of salt at pH 7.0. The points represent the temperature at which the DNA denatured, graphed against the G + C content. (Source: Data from Marmur J. and Doty P. 1962. Journal of Molecular Biology 5: 120. Copyright © 1962, with permission from Elsevier Science.)
changes in the number of times the two chains of the double helix twist about each other. But if the two ends are covalently linked to form a circular DNA molecule and if there are no interruptions in the sugar-phosphate backbones of the two strands, then the absolute number of times the chains can twist about each other cannot change. Such a covalently closed, circular DNA is said to be topologically constrained. Even the linear DNA molecules of eukaryotic chromosomes are subject to topological constraints due to their extreme length, entrainment in chromatin, and interaction with other cellular components (see Chapter 7). Despite these constraints, DNA participates in numerous dynamic processes in the cell. For example, the two strands of the double helix, which are twisted around each other, must rapidly separate in order for DNA to be duplicated and to be transcribed into RNA. Thus, understanding the topology of DNA and how the cell both accommodates and exploits topological constraints during DNA replication, transcription, and other chromosomal transactions is of fundamental importance in molecular biology.
Linking Number Is an Invariant Topological Property of Covalently Closed, Circular DNA
Let us consider the topological properties of covalently closed, circular DNA, which is referred to as cccDNA. Because there are no interruptions in either polynucleotide chain, the two strands of cccDNA cannot be separated from each other without the breaking of a covalent bond. If we wished to separate the two circular strands without permanently breaking any bonds in the sugar-phosphate backbones, we would have to pass one strand through the other strand repeatedly (we will encounter an enzyme that can perform just this feat!). The number of times one strand would have to be passed through the other strand in order for the two strands to be entirely separated from each other is called the linking number (Figure 6-17). The linking number, which is always an integer, is an invariant topological property of cccDNA, no matter how much the shape of the DNA molecule is distorted.
Linking Number Is Composed of Twist and Writhe
The linking number is the sum of two geometric components called the twist and the writhe. Let us consider twist first. Twist is simply the number of helical turns of one strand about the other, that is, the number of times one strand completely wraps around the other strand. Consider a cccDNA that is lying flat on a plane. In this flat conformation, the linking number is fully composed of twist. Indeed, the twist can be easily determined by counting the number of times the two strands cross each other (see Figure 6-17a). The helical crossovers (twist) in a right-handed helix are defined as positive such that the linking number of DNA will have a positive value.

But cccDNA is generally not lying flat on a plane. Rather, it is usually torsionally stressed such that the long axis of the double helix crosses over itself, often repeatedly, in three-dimensional space (Figure 6-17b). This is called writhe. To visualize the distortions caused by torsional stress, think of the coiling of a telephone cord that has been overtwisted.

FIGURE 6-17 Topological states of covalently closed, circular (ccc) DNA. The figure shows conversion of the relaxed (a) to the negatively supercoiled (b) form of DNA. The strain in the supercoiled form may be taken up by supertwisting (b) or by local disruption of base pairing (c). Panel values for the 360-bp molecule: (a) Lk = 36, Tw = 36, Wr = 0; (b) Lk = 32, Tw = 36, Wr = -4; (c) Lk = 32, Tw = 32, Wr = 0. [Adapted from a diagram provided by Dr. M. Gellert.] (Source: Modified from Kornberg A. and ...)

Writhe can take two forms. One form is the interwound or plectonemic writhe, in which the long axis is twisted around itself, as depicted in Figure 6-17b and Figure 6-18a. The other form of writhe is a toroid or spiral in which the long axis is wound in a cylindrical manner, as often occurs when DNA wraps around protein (Figure 6-18b). The writhing number (Wr) is the total number of interwound and/or spiral writhes in cccDNA. For example, the molecule shown in Figure 6-17b has a writhing number of -4 (four negative writhes). Interwound writhe and spiral writhe are topologically equivalent to each other and are readily interconvertible geometric properties of cccDNA.

FIGURE 6-18 Two forms of writhe of supercoiled DNA. The figure shows interwound (a) and toroidal (b) writhe of cccDNA of the same length. (a) The interwound or plectonemic writhe is formed by twisting of the double helical DNA molecule over itself, as depicted in the example of a branched molecule. (b) Toroidal or spiral writhe is depicted in this example by cylindrical coils. (Source: Modified from Kornberg A. and Baker T.A. 1992. DNA Replication, Figure 1-22, p. 33. © 1992 by W.H. Freeman and Company. Used with permission.)

Also, twist and writhe are interconvertible. A molecule of cccDNA can readily undergo distortions that convert some of its twist to writhe or some of its writhe to twist without the breakage of any covalent bonds. The only constraint is that the sum of the twist number (Tw) and the writhing number (Wr) must remain equal to the linking number (Lk). This constraint is described by the equation:

$$Lk = Tw + Wr$$
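As a quick numerical check of this constraint, the following Python sketch uses the values printed on the panels of Figure 6-17 (a 360-bp circle). The `Topology` class and its method name are illustrative inventions, not part of any library; the point is only that twist and writhe may trade off freely while their sum stays fixed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Twist and writhe of a covalently closed, circular DNA (hypothetical helper)."""
    tw: float  # twist: helical turns of one strand about the other
    wr: float  # writhe: crossings of the duplex axis over itself

    @property
    def lk(self) -> float:
        # Linking number is the sum of twist and writhe: Lk = Tw + Wr
        return self.tw + self.wr

    def convert_writhe_to_twist(self, amount: float) -> "Topology":
        # Move `amount` from Wr into Tw; no bond is broken, so Lk is unchanged.
        return Topology(tw=self.tw + amount, wr=self.wr - amount)


relaxed = Topology(tw=36, wr=0)        # Figure 6-17a
supercoiled = Topology(tw=36, wr=-4)   # Figure 6-17b
# Figure 6-17c: the four negative writhes become four fewer helical turns.
unwound = supercoiled.convert_writhe_to_twist(-4)

print(relaxed.lk, supercoiled.lk, unwound.lk)
```

The supercoiled and locally unwound forms share the same linking number; only a change in Lk itself, which requires breaking a backbone bond, separates them from the relaxed form.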
$Lk^0$ Is the Linking Number of Fully Relaxed cccDNA under Physiological Conditions
Consider cccDNA that is free of supercoiling (that is, it is said to be relaxed) and whose twist corresponds to that of the B form of DNA in solution under physiological conditions (about 10.5 base pairs per turn of the helix). The linking number (Lk) of such cccDNA under physiological conditions is assigned the symbol $Lk^0$.
DNA in Cells Is Negatively Supercoiled
The extent of supercoiling is measured by the difference between Lk and $Lk^0$, which is called the linking difference:

$$\Delta Lk = Lk - Lk^0$$
FIGURE 6-19 Relaxing DNA with DNAse I.
If the $\Delta Lk$ of a cccDNA is significantly different from zero, then the DNA is torsionally strained and hence it is supercoiled. If $Lk < Lk^0$ and $\Delta Lk < 0$, then the DNA is said to be "negatively supercoiled." Conversely, if $Lk > Lk^0$ and $\Delta Lk > 0$, then the DNA is "positively supercoiled." For example, the molecule shown in Figure 6-17b is negatively supercoiled and has a linking difference of -4 because its Lk (32) is four less than that (36) for the relaxed form of the molecule shown in Figure 6-17a.

Because $\Delta Lk$ and $Lk^0$ are dependent upon the length of the DNA molecule, it is more convenient to refer to a normalized measure of supercoiling. This is the superhelical density, which is assigned the symbol $\sigma$ and is defined as:

$$\sigma = \Delta Lk / Lk^0$$
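To make these definitions concrete, here is a small Python sketch that computes $Lk^0$, $\Delta Lk$, and $\sigma$ for the 360-bp molecule of Figure 6-17. The function names are hypothetical, and the sketch assumes that $Lk^0$ can be estimated as the length in base pairs divided by the helical repeat. Figure 6-17 is drawn with 10 bp per turn (360 bp, Lk = 36) rather than the value of about 10.5 bp per turn quoted for DNA in solution, so the repeat is passed in as a parameter.

```python
def relaxed_linking_number(length_bp: float, bp_per_turn: float = 10.5) -> float:
    """Estimate Lk0 as length divided by helical repeat (an assumption of this sketch)."""
    return length_bp / bp_per_turn


def linking_difference(lk: float, lk0: float) -> float:
    """Delta Lk = Lk - Lk0; negative values mean negative supercoiling."""
    return lk - lk0


def superhelical_density(lk: float, lk0: float) -> float:
    """sigma = Delta Lk / Lk0, a length-independent measure of supercoiling."""
    return linking_difference(lk, lk0) / lk0


# Figure 6-17: 360 bp drawn at 10 bp per turn, so Lk0 = 36; panel (b) has Lk = 32.
lk0 = relaxed_linking_number(360, bp_per_turn=10)
delta_lk = linking_difference(32, lk0)
sigma = superhelical_density(32, lk0)
print(f"Lk0 = {lk0:.0f}, dLk = {delta_lk:.0f}, sigma = {sigma:.3f}")
```

Because $\sigma$ divides out the length, it allows molecules of different sizes to be compared directly, which the raw linking difference does not.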
Circular DNA molecules purified both from bacteria and eukaryotes are usually negatively supercoiled, having values of $\sigma$ of about -0.06. The electron micrograph shown in Figure 6-20 compares the structures of bacteriophage DNA in its relaxed form with its supercoiled form.

What does superhelical density mean biologically? Negative supercoils can be thought of as a store of free energy that aids in processes that require strand separation, such as DNA replication and transcription. Because Lk = Tw + Wr, negative supercoils can be converted into untwisting of the double helix (compare Figure 6-17a with 6-17b). Regions of negatively supercoiled DNA, therefore, have a tendency to partially unwind. Thus, strand separation can be accomplished more easily in negatively supercoiled DNA than in relaxed DNA.

The only organisms that have been found to have positively supercoiled DNA are certain thermophiles, microorganisms that live under conditions of extreme high temperatures, such as in hot springs. In this case, the positive supercoils can be thought of as a store of free energy that helps keep the DNA from denaturing at the elevated temperatures. In so far as positive supercoils can be converted into more twist (positively supercoiled DNA can be thought of as being overwound), strand separation requires more energy in thermophiles than in organisms whose DNA is negatively supercoiled.
Nucleosomes Introduce Negative Supercoiling in Eukaryotes
As we shall see in the next chapter, DNA in the nucleus of eukaryotic cells is packaged in small particles known as nucleosomes in which the double helix is wrapped almost two times around the outside circumference of a protein core. You will be able to recognize this wrapping as the toroid or spiral form of writhe. Importantly, it occurs in a left-handed manner. (Convince yourself of this by applying the handedness rule in your mind's eye to DNA wrapped around the nucleosome in Chapter 7, Figure 7-18.) It turns out that writhe in the form of left-handed spirals is equivalent to negative supercoils. Thus, the packaging of DNA into nucleosomes introduces negative superhelical density.
Topoisomerases Can Relax Supercoiled DNA
As we have seen, the linking number is an invariant property of DNA that is topologically constrained. It can only be changed by introducing interruptions into the sugar-phosphate backbone. A remarkable class of enzymes known as topoisomerases are able to do just this by introducing transient single-stranded or double-stranded breaks into the DNA.
FIGURE 6-20 Electron micrograph of supercoiled DNA. The upper electron micrograph is a relaxed (nonsupercoiled) DNA molecule of bacteriophage PM2. The lower electron micrograph shows the phage DNA in its supertwisted form. (Source: Electron micrographs courtesy of Wang J.C. 1982. Scientific American 247.)
FIGURE 6-21 Schematic for changing the linking number in DNA with topoisomerase II. Topoisomerase II binds to DNA, creates a double-stranded break, passes uncut DNA through the gap, then reseals the break.
Topoisomerases are of two general types. Type II topoisomerases change the linking number in steps of two. They make transient double-stranded breaks in the DNA through which they pass a segment of uncut duplex DNA before resealing the break. This type of reaction is shown schematically in Figure 6-21. Type II topoisomerases require the energy of ATP hydrolysis for their action. Type I topoisomerases, in contrast, change the linking number of DNA in steps of one. They make transient single-stranded breaks in the DNA, allowing the uncut strand to pass through the break before resealing the nick (Figure 6-22). In contrast to the type II topoisomerases, type I topoisomerases do not require ATP. How topoisomerases relax DNA and promote other related reactions in a controlled and concerted manner is explained below.
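The difference in step size can be illustrated with a toy calculation. The Python sketch below is a deliberately simplified model, not a description of enzyme kinetics: the function `relax` and its arguments are hypothetical, and it simply moves Lk toward $Lk^0$ in increments of one (type I) or two (type II) until no further step would bring it closer.

```python
def relax(lk: int, lk0: int, step: int) -> list[int]:
    """Return the successive linking numbers as Lk is moved toward Lk0.

    step = 1 mimics a type I topoisomerase, step = 2 a type II enzyme.
    This is a toy model: it stops when another step would not reduce |Lk - Lk0|.
    """
    if step not in (1, 2):
        raise ValueError("step must be 1 (type I) or 2 (type II)")
    path = [lk]
    while True:
        direction = 1 if lk < lk0 else -1
        candidate = lk + direction * step
        if abs(candidate - lk0) >= abs(lk - lk0):
            break  # the next strand passage would not bring Lk closer to Lk0
        lk = candidate
        path.append(lk)
    return path


print(relax(32, 36, step=1))  # type I: one strand passage per unit of Lk
print(relax(32, 36, step=2))  # type II: each duplex passage changes Lk by two
print(relax(31, 36, step=2))  # odd linking difference: steps of two cannot reach 36
```

With steps of two, the parity of Lk is preserved, so a molecule that starts an odd number of links from $Lk^0$ ends one link away from it in this model.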
Prokaryotes Have a Special Topoisomerase that Introduces Supercoils into DNA
Both prokaryotes and eukaryotes have type I and type II topoisomerases that are capable of removing supercoils from DNA. In addition, however, prokaryotes have a special type II topoisomerase known as DNA gyrase that introduces, rather than removes, negative supercoils. DNA gyrase is responsible for the negative supercoiling of chromosomes in prokaryotes. This negative supercoiling facilitates the unwinding of the DNA duplex, which stimulates many reactions of DNA including initiation of both transcription and DNA replication.
FIGURE 6-22 Schematic mechanism of action for topoisomerase I. The enzyme cuts a single strand of the DNA duplex, passes the uncut strand through the break, then reseals the break. The process increases the linking number by +1.
Topoisomerases also Unknot and Disentangle DNA Molecules
In addition to relaxing supercoiled DNA, topoisomerases promote several other reactions important to maintaining the proper DNA structure within cells. The enzymes use the same transient DNA break and strand passage reaction that they use to relax DNA to carry out these reactions.

Topoisomerases can both catenate and decatenate circular DNA molecules. Circular DNA molecules are said to be catenated if they are linked together like two rings of a chain (Figure 6-23a). Of these two activities, the ability of topoisomerases to decatenate DNA is of clear biological importance. As we will see in Chapter 8, catenated DNA molecules are commonly produced as a round of DNA replication is finished (see Figure 8-33). Topoisomerases play the essential role of unlinking these DNA molecules to allow them to separate into the two daughter cells for cell division. Decatenation of two covalently closed, circular DNA molecules requires passage of the two DNA strands of one molecule through a double-stranded break in the second DNA molecule. This reaction therefore depends on a type II topoisomerase. The requirement for decatenation explains why type II topoisomerases are essential cellular proteins. However, if at least one of the two catenated DNA molecules carries a nick or a gap, then a type I enzyme may also unlink the two molecules (Figure 6-23b).

Although we often focus on circular DNA molecules when considering topological issues, the long linear chromosomes of eukaryotic organisms also experience topological problems. For example, during a round of DNA replication, the two double-stranded daughter DNA molecules will often become entangled (Figure 6-23c). These sites of entanglement, just like the links between catenated DNA molecules, block the separation of the daughter chromosomes during mitosis. Therefore, DNA disentanglement, generally catalyzed by a type II topoisomerase, is also required for a successful round of DNA replication and cell division in eukaryotes.

On occasion, a DNA molecule becomes knotted (Figure 6-23d). For example, some site-specific recombination reactions, which we shall discuss in detail in Chapter 11, give rise to knotted DNA products. Once again, a type II topoisomerase can "untie" a knot in duplex DNA. If the DNA molecule is nicked or gapped, then a type I enzyme can also do this job.

FIGURE 6-23 Topoisomerases. (a) Type II topoisomerase: catenation and decatenation. (b) Type I topoisomerase: catenation and decatenation. (c) Type II topoisomerase. (d) Type II topoisomerase.
Topoisomerases Use a Covalent Protein-DNA Linkage to Cleave and Rejoin DNA Strands
To perform their functions, topoisomerases must cleave a DNA strand (or two strands) and then rejoin the cleaved strand (or strands). Topoisomerases are able to promote both DNA cleavage and rejoining without the assistance of other proteins or high-energy co-factors (for example, ATP; also see below) because they use a covalent-intermediate mechanism. DNA cleavage occurs when a tyrosine residue in the active site of the topoisomerase attacks a phosphodiester bond in the backbone of the target DNA (Figure 6-24). This attack generates a break in the DNA, whereby the topoisomerase is covalently joined to one of the broken ends via a phospho-tyrosine linkage. The other end of the DNA terminates with a free OH group. This end is also held tightly by the enzyme, as we will see below.

The phospho-tyrosine linkage conserves the energy of the phosphodiester bond that was cleaved. Therefore, the DNA can be resealed simply by reversing the original reaction: the OH group from one broken DNA end attacks the phospho-tyrosine bond, reforming the DNA phosphodiester bond. This reaction rejoins the DNA strand and releases the topoisomerase, which can then go on to catalyze another reaction cycle. Although, as noted above, type II topoisomerases require ATP hydrolysis for activity, the energy released by this hydrolysis is used to promote conformational changes in the topoisomerase-DNA complex rather than to cleave or rejoin DNA.
Topoisomerases Form an Enzyme Bridge and Pass DNA Segments through Each Other
Between the steps of DNA cleavage and DNA rejoining, the topoisomerase promotes passage of a second segment of DNA through the break. Topoisomerase function thus requires that DNA cleavage, strand passage, and DNA rejoining all occur in a highly coordinated manner. Structures of several different topoisomerases have provided insight into how the reaction cycle occurs. Here we will explain a model for how a type I topoisomerase relaxes DNA.

To initiate a relaxation cycle, the topoisomerase binds to a segment of duplex DNA in which the two strands are melted (Figure 6-25a). Melting of the DNA strands is favored in highly negatively supercoiled DNA (see above), making this DNA an excellent substrate for relaxation. One of the DNA strands binds in a cleft in the enzyme that places it near the active-site tyrosine, where it is cleaved to generate the covalent DNA-tyrosine intermediate (Figure 6-25b). The success of the reaction requires that the other end of the newly cleaved DNA is also tightly bound by the enzyme. After cleavage, the topoisomerase undergoes a large conformational change to open up a gap in the cleaved strand, with the enzyme bridging the gap. The second (uncleaved) DNA strand then passes through the gap and binds to a DNA-binding site in an internal "donut-shaped" hole in the protein (Figure 6-25c). After strand passage occurs, a second conformational change in the topoisomerase-DNA complex brings the cleaved DNA ends back together (Figure 6-25d); rejoining of the DNA strand occurs by attack of the OH end on the phospho-tyrosine bond.

FIGURE 6-24 Topoisomerases cleave DNA using a covalent tyrosine-DNA intermediate. (a) Schematic of the cleavage and rejoining reaction. For simplicity, only a single strand of DNA is shown. See Figure 6-25 for a more realistic picture. The same mechanism is used by type II topoisomerases, although two enzyme subunits are required, one to cleave each of the two DNA strands. Topoisomerases sometimes cut to the 5' side and sometimes to the 3' side. (b) Close-up view of the phospho-tyrosine covalent intermediate.
FIGURE 6-25 Model for the reaction cycle catalyzed by a type I topoisomerase. The figure shows a series of proposed steps for the relaxation of one turn of a negatively supercoiled plasmid DNA. The two strands of DNA are shown as dark gray (and not drawn to scale). The four domains of the protein are labeled in panel (a). Domain I is shown in red, II is blue, III is green, and IV is orange. (Source: Adapted from Champoux J. 2001. DNA topoisomerases. Annual Review of Biochemistry 70: 369-413. Copyright © 2001 by Annual Reviews. www.annualreviews.org.)
DNA Topoisomers Can Be Separated by Electrophoresis
Covalently closed, circular DNA molecules of the same length but of different linking numbers are called DNA topoisomers. Even though topoisomers have the same molecular weight, they can be separated from each other by electrophoresis through a gel of agarose (see Chapter 20 for an explanation of gel electrophoresis). The basis for this separation is that the greater the writhe, the more compact the shape of a cccDNA. Once again, think of how supercoiling a telephone cord causes it to become more compact. The more compact the DNA, the more easily (up to a point) it is able to migrate through the gel matrix (Figure 6-26). Thus, a fully relaxed cccDNA migrates more slowly than a highly supercoiled topoisomer of the same circular DNA. Figure 6-27 shows a ladder of DNA topoisomers resolved by gel electrophoresis. Molecules in adjacent rungs of the ladder differ from each other by a linking number difference of just one. Obviously, electrophoretic mobility is highly sensitive to the topological state of DNA (see Box 6-2, Proving that DNA Has a Helical Periodicity of about 10.5 Base Pairs per Turn from the Topological Properties of DNA Rings).

FIGURE 6-26 Schematic of electrophoretic separation of DNA topoisomers. Lane A represents relaxed or nicked circular DNA; lane B, linear DNA; lane C, highly supercoiled cccDNA; and lane D, a ladder of topoisomers.
Ethidium Ions Cause DNA to Unwind
Ethidium is a large, flat, multi-ringed cation. Its planar shape enables ethidium to slip, or intercalate, between the stacked base pairs of DNA (Figure 6-28). Because it fluoresces when exposed to ultraviolet light, and because its fluorescence increases dramatically after intercalation, ethidium is used as a stain to visualize DNA.

When an ethidium ion intercalates between two base pairs, it causes the DNA to unwind by 26°, reducing the normal rotation per base pair from ~36° to ~10°. In other words, ethidium decreases the twist of DNA. Imagine the extreme case of a DNA molecule that has an ethidium ion between every base pair. Instead of 10 base pairs per turn it would have 36! When ethidium binds to linear DNA or to a nicked circle, it simply causes the helical pitch to increase. But consider what happens when ethidium binds to covalently closed, circular DNA. The linking number of the cccDNA does not change (no covalent bonds are broken and resealed), but the twist decreases by 26° for each molecule of ethidium that has bound to the DNA. Because Lk = Tw + Wr, this decrease in Tw must be compensated for by a corresponding increase in Wr. If the circular DNA is initially negatively supercoiled (as is normally the case for circular DNAs isolated from cells), then the addition of ethidium will increase Wr. In other words, the addition of ethidium will relax the DNA. If enough ethidium is added, the negative supercoiling will be brought to zero, and if even more ethidium is added, Wr will increase above zero and the DNA will become positively supercoiled.

Box 6-2 Proving that DNA Has a Helical Periodicity of about 10.5 Base Pairs per Turn from the Topological Properties of DNA Rings
The observation that DNA topoisomers can be separated from each other electrophoretically is the basis for a simple experiment that proves that DNA has a helical periodicity of about 10.5 base pairs per turn in solution. Consider three cccDNAs of sizes 3,990, 3,995, and 4,011 base pairs that were relaxed to completion by treatment with type I topoisomerase. When subjected to electrophoresis through agarose, the 3,990- and 4,011-base-pair DNAs exhibit essentially identical mobilities. Due to thermal fluctuation, topoisomerase treatment actually generates a narrow spectrum of topoisomers, but for simplicity let us consider the mobility of only the most abundant topoisomer (that corresponding to the cccDNA in its most relaxed state). The mobilities of the most abundant topoisomers for the 3,990- and 4,011-base-pair DNAs are indistinguishable because the 21-base-pair difference between them is negligible compared to the sizes of the rings. The most abundant topoisomer for the 3,995-base-pair ring, however, is found to migrate slightly more rapidly than the other two rings even though it is only 5 base pairs larger than the 3,990-base-pair ring.

How are we to explain this anomaly? The 3,990- and 4,011-base-pair rings in their most relaxed states are expected to have linking numbers equal to $Lk^0$, that is, 380 in the case of the 3,990-base-pair ring (dividing the size by 10.5 base pairs) and 382 in the case of the 4,011-base-pair ring. Because Lk is equal to $Lk^0$, the linking difference ($\Delta Lk = Lk - Lk^0$) in both cases is zero and there is no writhe. But because the linking number must be an integer, the most relaxed state for the 3,995-base-pair ring would be either of two topoisomers having linking numbers of 380 or 381. However, $Lk^0$ for the 3,995-base-pair ring is 380.5. Thus, even in its most relaxed state, a covalently closed circle of 3,995 base pairs would necessarily have about half a unit of writhe (its linking difference would be 0.5), and hence it would migrate more rapidly than the 3,990- and 4,011-base-pair circles. To explain how rings that differ in length by 21 base pairs (two turns of the helix) have the same mobility, whereas a ring that differs in length by only 5 base pairs (about half a helical turn) exhibits a different mobility, we must conclude that DNA in solution has a helical periodicity of about 10.5 base pairs per turn.
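The arithmetic in Box 6-2 can be reproduced in a few lines. The Python sketch below assumes, as the box does, that $Lk^0$ is the ring size divided by 10.5 base pairs per turn; the function `most_relaxed` is a hypothetical helper that picks the integer linking number closest to $Lk^0$ and reports the residual linking difference that the ring must absorb.

```python
HELICAL_REPEAT = 10.5  # base pairs per turn, the value given in Box 6-2


def most_relaxed(length_bp: int, repeat: float = HELICAL_REPEAT) -> tuple[float, int, float]:
    """Return (Lk0, nearest integer Lk, residual Lk - Lk0) for a relaxed ring."""
    lk0 = length_bp / repeat
    lk = round(lk0)  # the linking number must be an integer
    return lk0, lk, lk - lk0


for size in (3990, 3995, 4011):
    lk0, lk, residual = most_relaxed(size)
    print(f"{size} bp: Lk0 = {lk0:.2f}, Lk = {lk}, residual = {residual:+.2f}")
```

For the 3,990- and 4,011-bp rings the residual is zero, whereas for the 3,995-bp ring $Lk^0$ is about 380.48, so neither 380 nor 381 matches it and roughly half a turn remains, in line with the box's rounded figure of 380.5.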
FIGURE 6-27 Separation of relaxed and supercoiled DNA by gel electrophoresis. Relaxed and supercoiled DNA topoisomers are resolved by gel electrophoresis. The speed with which the DNA molecules migrate increases as the number of superhelical turns increases. (Source: Courtesy of J.C. Wang.)
FIGURE 6-28 Intercalation of ethidium into DNA. Ethidium increases the spacing of successive base pairs, distorts the regular sugar-phosphate backbone, and decreases the twist of the helix.
Because the binding of ethidium increases Wr, its presence greatly affects the migration of cccDNA during gel electrophoresis. In the presence of nonsaturating amounts of ethidium, negatively supercoiled circular DNAs are more relaxed and migrate more slowly, whereas relaxed cccDNAs become positively supercoiled and migrate more rapidly.
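The effect of ethidium on twist and writhe can also be worked through numerically. The following Python sketch is an idealized bookkeeping exercise under stated assumptions: each bound ethidium removes 26° of twist (26/360 of a turn), Lk is fixed, and all of the compensating change appears as writhe. The function name and the example molecule (the Lk = 32, Tw = 36 circle of Figure 6-17b) are chosen for illustration only.

```python
UNWINDING_PER_ETHIDIUM_DEG = 26.0  # unwinding angle per intercalated ethidium


def writhe_after_ethidium(lk: float, tw: float, n_ethidium: int) -> float:
    """Writhe of a cccDNA after n ethidium ions bind, with Lk held constant.

    Assumes every bound ethidium lowers Tw by 26/360 of a turn and that
    Wr = Lk - Tw absorbs the entire change (an idealization).
    """
    tw_bound = tw - n_ethidium * UNWINDING_PER_ETHIDIUM_DEG / 360.0
    return lk - tw_bound


# Rotation per base pair: ~36 degrees normally, ~10 degrees at an intercalation site.
bp_per_turn_saturated = 360.0 / (36.0 - UNWINDING_PER_ETHIDIUM_DEG)
print(bp_per_turn_saturated)  # base pairs per turn if every step held an ethidium

for n in (0, 20, 55, 90):
    print(n, round(writhe_after_ethidium(lk=32, tw=36, n_ethidium=n), 2))
```

In this idealized picture, about 55 bound ethidium ions bring the writhe of the example molecule close to zero, and further binding makes it positive, which mirrors the progression from negatively supercoiled to relaxed to positively supercoiled DNA.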
RNA STRUCTURE

RNA Contains Ribose and Uracil and Is Usually Single-Stranded
We now turn our attention to RNA, which differs from DNA in three respects (Figure 6-29). First, the backbone of RNA contains ribose rather than 2'-deoxyribose. That is, ribose has a hydroxyl group at the 2' position. Second, RNA contains uracil in place of thymine. Uracil has the same single-ringed structure as thymine, except that it lacks the 5 methyl group. Thymine is in effect 5 methyl-uracil. Third, RNA is usually found as a single polynucleotide chain. Except for the case of certain viruses, RNA is not the genetic material and does not need to be capable of serving as a template for its own replication. Rather, RNA functions as the intermediate, the mRNA, between the gene and the protein-synthesizing machinery. Another function of RNA is as an adaptor, the tRNA, between the codons in the mRNA and amino acids. RNA can also play a structural role, as in the case of the RNA components of the ribosome. Yet another role for RNA is as a regulatory molecule, which through sequence complementarity binds to, and interferes with the translation of, certain mRNAs. Finally, some RNAs (including one of the structural RNAs of the ribosome) are enzymes that catalyze essential reactions in the cell. In all of these cases, the RNA is copied as a single strand off only one of the two strands of the DNA template, and its complementary strand does not exist. RNA is capable of forming long double helices, but these are unusual in nature.

FIGURE 6-29 Structural features of RNA. The figure shows the structure of the backbone of RNA, composed of alternating phosphate and ribose moieties. The features of RNA that distinguish it from DNA are highlighted.
RNA Chains Fold Back on Themselves to Form Local Regions of Double Helix Similar to A-Form DNA
Despite being single-stranded, RNA molecules often exhibit a great deal of double-helical character (Figure 6-30). This is because RNA chains frequently fold back on themselves to form base-paired segments between short stretches of complementary sequences. If the two stretches of complementary sequence are near each other, the RNA may adopt one of various stem-loop structures in which the intervening RNA is looped out from the end of the double-helical segment as in a hairpin, a bulge, or a simple loop.

The stability of such stem-loop structures is in some instances enhanced by the special properties of the loop. For example, a stem-loop with the "tetraloop" sequence UUCG is unexpectedly stable due to special base-stacking interactions in the loop (Figure 6-31). Base pairing can also take place between sequences that are not contiguous to form complex structures aptly named pseudoknots (Figure 6-32). The regions of base pairing in RNA can be a regular double helix or they can contain discontinuities, such as noncomplementary nucleotides that bulge out from the helix.
FIGURE 6-30 Double helical characteristics of RNA. In an RNA molecule having regions of complementary sequences, the intervening (noncomplementary) stretches of RNA may become "looped out" to form one of the structures illustrated in the figure: (a) hairpin, (b) bulge, (c) loop.

FIGURE 6-31 Tetraloop. Base stacking interactions promote and stabilize the tetraloop structure. The gray circles between the riboses shown in purple represent the phosphate moieties of the RNA backbone. Horizontal lines represent base stacking interactions. The structure shown is the C(UUCG)G tetraloop.
FIC:;URE 6~32 Pseudoknot. lhepseudo-
5'
knol structure is fooned l1)' base pairing
~~-- -
,.-
between nonrontiguous cornpIementary
",-
..
----
)0-"'",
."- , •....I....;..._ • • ...... , ....
"""""'"
:,'
:
I
.'
,,,
~
-:~
_H"'' ' O'>r-\
ribose
FICURE 6-33 G:tJbasepaíf. The
structure.sho.vs hydrogen bonds thal allow base
•
"
•
,/
3'
3'
A feature of RNA Ihat adds tú its propensity to fonn double-helical structures is an additional, non-Walson-Crick base pairo This is the C:U base pair, which has hydrogen bonds between N3 of uradl ond the carbonyl on e6 of guonine and between the carbonyl on C2 of uradl and Nt of guanine (Figure 6-33}. Because G:U base pairs can occur os well as the four convenlional. Watson-Crick base pairs, RNA chains have an enhanced capacity for self-complementarity. Thus, RNA freq uently exhibits local regions of base pnidng bul nol the long-rangc. regular helidlY ofDNA. The presence of 2'-hydroxyls in ¡he RNA bBckbone prevents RNA from adopt ing a B-form helix. Rather. double-helical RNA resembles the A-form structure of DNA. As such. the m inor grouve is wíde and s ha lluw. and hence accessible. bUI recall Ihat the minor groove offers HIIle sequence-specific information. Meallwhile. the ma jor groove is so narrow and dcep Ihat it is not ver}' accessible to ami no add side chaius from interacting proteins. Thus . the RNA do uble helix is quite disli nct from the DNA douule helix in ils det
piI,ril'1g lo OCOJr belvv'eer1 guanil'1e ¡md urnaL
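To illustrate how G:U pairs enlarge the capacity for self-complementarity, the Python sketch below checks whether two RNA segments can form an antiparallel stem, first with Watson-Crick pairs only and then allowing G:U pairs as well. The function names and the example sequences are made up for illustration; the sketch ignores loop length, base stacking, and all energetic considerations.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # the additional non-Watson-Crick pair


def can_pair(x: str, y: str, allow_gu: bool = True) -> bool:
    """True if bases x and y can pair under the chosen rules."""
    pair = (x, y)
    return pair in WATSON_CRICK or (allow_gu and pair in WOBBLE)


def forms_stem(five_prime: str, three_prime: str, allow_gu: bool = True) -> bool:
    """True if two equal-length segments (both written 5'->3') pair antiparallel."""
    if len(five_prime) != len(three_prime):
        return False
    # The first base of one segment pairs with the last base of the other.
    return all(
        can_pair(a, b, allow_gu)
        for a, b in zip(five_prime, reversed(three_prime))
    )


# Hypothetical hairpin: 5'-GGCGC UUCG GUGCC-3' (stem, tetraloop, stem)
left, loop, right = "GGCGC", "UUCG", "GUGCC"
print(forms_stem(left, right, allow_gu=False))  # Watson-Crick pairs only
print(forms_stem(left, right, allow_gu=True))   # G:U pairs permitted
```

In this made-up hairpin, one position of the stem is a G opposite a U, so the two arms are fully paired only when the G:U pair is permitted.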
RNA Can Fold Up into Complex Tertiary Structures
Freed of the constraint of forming long-range regular helices, RNA can adopt a wealth of tertiary structures. This is because RNA has enormous rotational freedom in the backbone of its non-base-paired regions. Thus, RNA can fold up into complex tertiary structures frequently involving unconventional base pairing, such as the base triples and base-backbone interactions seen in tRNAs (see, for example, the illustration of the U:A:U base triple in Figure 6-34). Proteins can assist the formation of tertiary structures by large RNA molecules, such as those found in the ribosome. Proteins shield the negative charges of backbone phosphates, whose electrostatic repulsive forces would otherwise destabilize the structure.

FIGURE 6-34 U:A:U base triple. The structure shows one example of hydrogen bonding that allows unusual triple base pairing.

Researchers have taken advantage of the potential structural complexity of RNA to generate novel RNA species (not found in nature) that have specific desirable properties. By synthesizing RNA molecules with randomized sequences, it is possible to generate mixtures of oligonucleotides representing enormous sequence diversity. For example, a mixture of oligoribonucleotides of length 20 and having four possible nucleotides at each position would have a potential complexity of $4^{20}$ sequences or about $10^{12}$ sequences! From mixtures of diverse oligoribonucleotides, RNA molecules can be selected biochemically that have particular properties, such as an affinity for a specific small molecule.
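The size of such a sequence pool is easy to verify. The short Python sketch below (the function name is illustrative) counts the distinct sequences of a given length over the four-letter RNA alphabet.

```python
def pool_complexity(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of `length` over an alphabet of the given size."""
    return alphabet_size ** length


n = pool_complexity(20)
print(n)            # exact count for 20-mers over A, C, G, U
print(f"{n:.1e}")   # the same number in scientific notation
```

The exact value, 1,099,511,627,776, is slightly more than $10^{12}$, and each additional randomized position multiplies the pool by four.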
Some RNAs Are Enzymes
It was widely believed for many years that only proteins could be enzymes. An enzyme must be able to bind a substrate, carry out a chemical reaction, release the product, and repeat this sequence of events many times. Proteins are well-suited to this task because they are composed of many different kinds of amino acids (20) and they can fold into complex tertiary structures with binding pockets for the substrate and small molecule co-factors and an active site for catalysis. Now we know that RNAs, which as we have seen can similarly adopt complex tertiary structures, can also be biological catalysts. Such RNA enzymes are known as ribozymes, and they exhibit many of the features of a classical enzyme, such as an active site, a binding site for a substrate, and a binding site for a co-factor, such as a metal ion.

One of the first ribozymes to be discovered was RNAse P, a ribonuclease that is involved in generating tRNA molecules from larger, precursor RNAs. RNAse P is composed of both RNA and protein; however, the RNA moiety alone is the catalyst. The protein moiety of RNAse P facilitates the reaction by shielding the negative charges on the RNA so that it can bind effectively to its negatively charged substrate. The RNA moiety is able to catalyze cleavage of the tRNA precursor in the absence of the protein if a small, positively charged counter ion, such as the polyamine spermidine, is used to shield the repulsive, negative charges. Other ribozymes carry out trans-esterification reactions involved in the removal of intervening sequences known as introns from precursors to certain mRNAs, tRNAs, and ribosomal RNAs in a process known as RNA splicing (see Chapter 13).
The Hammerhead Ribozyme Cleaves RNA by the Formation of a 2',3' Cyclic Phosphate
Before concluding our discussion of RNA, let us look in more detail at the structure and function of one particular ribozyme, the hammerhead. The hammerhead is a sequence-specific ribonuclease that is found in certain infectious RNA agents of plants known as viroids, which depend on self-cleavage to propagate. When the viroid replicates, it produces multiple copies of itself in one continuous RNA chain. Single viroids arise by cleavage, and this cleavage reaction is carried out by the RNA sequence around the junction. One such self-cleaving sequence is called the hammerhead because of the shape of its secondary structure, which consists of three base-paired stems (I, II, and III) surrounding a core of noncomplementary nucleotides required for catalysis (Figure 6-35). The tertiary structure of the hammerhead, however, looks more like a wishbone (Figure 6-36).

To understand how the hammerhead works, let us first look at how RNA undergoes hydrolysis under alkaline conditions. At high pH, the 2'-hydroxyl of the ribose in the RNA backbone can become deprotonated, and the resulting negatively charged oxygen can attack the scissile phosphate at the 3' position of the same ribose. This reaction breaks the RNA chain, producing a 2',3' cyclic phosphate and a free 5'-hydroxyl. Each ribose in an RNA chain can undergo this reaction, completely cleaving the parent molecule into nucleotides. (Why is DNA not similarly susceptible to alkaline hydrolysis?) Many protein ribonucleases also cleave their RNA substrates via the formation of a 2',3' cyclic phosphate. Working at normal cellular pH, these protein enzymes use a metal ion, bound at their active site, to activate the 2'-hydroxyl of the RNA. The hammerhead is a sequence-specific ribonuclease, but it too cleaves RNA via the formation of a 2',3' cyclic phosphate. Hammerhead-mediated cleavage involves a ribozyme-bound Mg²⁺ ion that deprotonates the 2'-hydroxyl at neutral pH, resulting in nucleophilic attack on the scissile phosphate (Figure 6-35b).

Because the normal reaction of the hammerhead is self-cleavage, it is not really a catalyst; each molecule normally promotes a reaction one time only, thus having a turnover number of one. But the hammerhead can be engineered to function as a true ribozyme by dividing the molecule into two portions: one, the ribozyme, that contains the catalytic core and the other, the substrate, that contains the cleavage site. The substrate binds to the ribozyme at stems I and III (Figure 6-35a). After cleavage, the substrate is released and replaced by a fresh uncut substrate, thereby allowing repeated rounds of cleavage.

FIGURE 6-35 Secondary structure of the hammerhead ribozyme. The molecule is shown with the two halves of each stem connected with a loop, but none of the three stems need be a loop: in fact, in the viroid, the two halves of stem III are not joined with a loop. (a) The figure shows the predicted secondary structures of the hammerhead ribozyme. Watson-Crick base-pair interactions are shown in red; the scissile bond is shown by a red arrow; approximate minimal substrate strands are labeled in blue; (U) uracil; (A) adenine; (C) cytosine; (G) guanine. (b) The hammerhead ribozyme cleavage reaction involves an intermediary state during which Mg(OH) in complex with the ribozyme (shown in green) acts as a general base catalyst to remove a proton from the 2'-hydroxyl of the active site cytosine (shown at position 17 in part (a)), and to initiate the cleavage reaction at the scissile phosphodiester bond at the active site. (Source: (a) Redrawn from McKay D.B. and Wedekind J.E. 1999. In The RNA World, 2nd edition (ed. R.F. Gesteland et al.), p. 267, Figure 1, part A. Cold Spring Harbor, NY. (b) Redrawn from Scott W.G. et al. 1995. Cell 81: 991, p. 992, Figure 1, part B.)

FIGURE 6-36 Tertiary structure of the hammerhead ribozyme. This view of the refined hammerhead ribozyme structure shows the conserved bases of stem III as well as the 3 bp augmenting helix that joins stem II (top left) to stem-loop III (bottom) highlighted in cyan, the CUGA uridine turn highlighted in red, and the active site cytosine (cut site at position 17) in green. (Scott W.G., Finch J.T., and Klug A. 1995. Cell 81: 991. Image prepared with MolScript, BobScript, and Raster 3D.)
Did Life Evolve from an RNA World?
The discovery of ribozymes has profoundly altered our view of how life might have evolved. We can now imagine that there was a primitive form of life based entirely on RNA. In this world, RNA would have functioned as the genetic material and as the enzymatic machinery. This RNA world would have preceded life as we know it today, in which information transfer is based on DNA, RNA, and protein. A hint that the protein world might have arisen from an RNA world is the discovery that the component in the ribosome that is responsible for the formation of the peptide bond, the peptidyl transferase, is an RNA molecule (see Chapter 14). Unlike RNAse P, the hammerhead, and other previously known ribozymes, which act on phosphorus centers, the peptidyl transferase acts on a carbon center to create the peptide bond. It thus links RNA chemistry to the most fundamental reaction in the protein world, peptide bond formation. Perhaps then the ribosome ribozyme is a relic of
SUMMARY
DNA is usually in the form of a right-handed double helix. The helix consists of two polydeoxynucleotide chains. Each chain is an alternating polymer of deoxyribose sugars and phosphates that are joined together via phosphodiester linkages. One of four bases protrudes from each sugar: adenine and guanine, which are purines, and thymine and cytosine, which are pyrimidines. While the sugar-phosphate backbone is regular, the order of bases is irregular and this is responsible for the information content of DNA. Each chain has a 5' to 3' polarity, and the two chains of the double helix are oriented in an antiparallel manner; that is, they run in opposite directions. Pairing between the bases holds the chains together. Pairing is mediated by hydrogen bonds and is specific: adenine on one chain is always paired with thymine on the other chain, whereas guanine is always paired with cytosine. This strict base pairing reflects the fixed locations of hydrogen atoms in the purine and pyrimidine bases in the forms of those bases found in DNA. Adenine and cytosine almost always exist in the amino as oppose
10.5 base pairs per turn and is free of writhe. If the linking number is decreased, then the DNA becomes torsionally stressed, and it is said to be negatively supercoiled. DNA in cells is usually negatively supercoiled by about 6%. The left-handed wrapping of DNA around nucleosomes introduces negative supercoiling in eukaryotes. In prokaryotes, which lack histones, the enzyme DNA gyrase is responsible for generating negative supercoils. DNA gyrase is a member of the type II family of topoisomerases. These enzymes change the linking number of DNA in steps of two by making a transient break in the double helix and passing a region of duplex DNA through the break. Some type II topoisomerases relax supercoiled DNA, whereas DNA gyrase generates negative supercoils. Type I topoisomerases also relax supercoiled DNAs, but do so in steps of one in which one DNA strand is passed through a transient nick in the other strand.

RNA differs from DNA in the following ways: its backbone contains ribose rather than 2'-deoxyribose; it contains the pyrimidine uracil in place of thymine; and it usually exists as a single polynucleotide chain, without a complementary chain. As a consequence of being a single strand, RNA can fold back on itself to form short stretches of double helix between regions that are complementary to each other. RNA allows a greater range of base pairing than does DNA. Thus, as well as A:U and C:G pairing, U can also pair with G. This capacity to form a non-Watson-Crick base pair adds to the propensity of RNA to form double-helical segments. Freed of the constraint of forming long-range regular helices, RNA can form complex tertiary structures, which are often based on unconventional interactions between bases and the sugar-phosphate backbone.

Some RNAs act as enzymes: they catalyze chemical reactions in the cell and in vitro. These RNA enzymes are known as ribozymes. Most ribozymes act on phosphorus centers, as in the case of the ribonuclease RNAse P. RNAse P is composed of protein and RNA, but it is the RNA moiety that is the catalyst. The hammerhead is a self-cleaving RNA, which cuts the RNA backbone via the formation of a 2',3' cyclic phosphate in a reaction that involves an RNA-bound Mg²⁺ ion. Peptidyl transferase is an example of a ribozyme that acts on a carbon center. This ribozyme, which is responsible for the formation of the peptide bond, is one of the RNA components of the ribosome. The discovery of RNA enzymes that can act on phosphorus or carbon centers suggests that life might have evolved from a primitive form in which RNA functioned both as the genetic material and as the enzymatic machinery.
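The supercoiling figures in the summary can be made concrete with a short Python sketch. It assumes that relaxed DNA has 10.5 bp per turn (from the text), reads "negatively supercoiled by about 6%" as a superhelical density of -0.06, and uses a hypothetical 5,250 bp circular plasmid; the function names are illustrative only and are not taken from the text.

```python
import math

BP_PER_TURN = 10.5  # helical repeat of relaxed DNA, as quoted in the summary


def relaxed_linking_number(n_bp: int) -> float:
    """Linking number (Lk0) of a relaxed, covalently closed circle of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def linking_deficit(n_bp: int, sigma: float = -0.06) -> float:
    """Change in linking number (Lk - Lk0) for superhelical density sigma."""
    return sigma * relaxed_linking_number(n_bp)


def topoisomerase_cycles(delta_lk: float, step: int) -> int:
    """Minimum number of strand-passage events needed to change Lk by |delta_lk|.

    step=2 models a type II enzyme, step=1 a type I enzyme.
    The rounding guards against floating-point noise before taking the ceiling.
    """
    return math.ceil(round(abs(delta_lk) / step, 9))


plasmid_bp = 5250  # hypothetical plasmid size, chosen for round numbers
lk0 = relaxed_linking_number(plasmid_bp)
delta_lk = linking_deficit(plasmid_bp)

print(f"Lk0 = {lk0:.1f}, deficit at 6% negative supercoiling = {delta_lk:.1f}")
print("type II events (steps of two):", topoisomerase_cycles(delta_lk, step=2))
print("type I events (steps of one):", topoisomerase_cycles(delta_lk, step=1))
```

Under these assumptions the plasmid carries a deficit of about 30 links, which a type II enzyme such as DNA gyrase could introduce in roughly half as many events as a type I enzyme would need to remove it.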
BIBLIOGRAPHY

Books
Cold Spring Harbor Symposium on Quantitative Biology. 1982. Volume 47: Structures of DNA. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Gesteland, R.F., Cech, T.R., and Atkins, J.F., eds. 1999. The RNA World, 2nd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Kornberg, A. and Baker, T.A. 1992. DNA Replication. W. H. Freeman, N.Y.
Saenger, W. 1984. Principles of Nucleic Acid Structure. Springer-Verlag, N.Y.
Sarma, R.H., ed. 1981. Biomolecular Stereodynamics. Vols. 1 and 2. Adenine Press, Guilderland, N.Y.

DNA Structure
Dickerson, R.E. 1983. The DNA helix and how it is read. Sci. Amer. 249: 94-111.
Franklin, R.E. and Gosling, R.G. 1953. Molecular configuration in sodium thymonucleate. Nature 171: 740-741.
Rich, A., Nordheim, A., and Wang, A.H.-J. 1984. The chemistry and biology of left-handed Z-DNA. Annu. Rev. Biochem. 53: 791-846.
Roberts, R.J. 1995. On base flipping. Cell 82(1): 9-12.
Wang, A.H., Fujii, S., van Boom, J.H., and Rich, A. 1983. Right-handed and left-handed double-helical DNA: Structural studies. Cold Spring Harb. Symp. Quant. Biol. 47 Pt 1: 33-44.
Watson, J.D. and Crick, F.H.C. 1953. Molecular structure of nucleic acids: A structure for deoxyribonucleic acids. Nature 171: 737-738.
Watson, J.D. and Crick, F.H.C. 1953. Genetical implications of the structure of deoxyribonucleic acids. Nature 171: 964-967.
Wilkins, M.H.F., Stokes, A.R., and Wilson, H.R. 1953. Molecular structure of deoxypentose nucleic acids. Nature 171: 738-740.

DNA Topology
Bauer, W.R., Crick, F.H.C., and White, J.H. 1980. Supercoiled DNA. Sci. Amer. 243: 118-133.
Boles, T.C., White, J.H., and Cozzarelli, N.R. 1990. Structure of plectonemically supercoiled DNA. J. Mol. Biol. 213: 931-951.
Champoux, J.J. 2001. DNA topoisomerases: Structure, function, and mechanism. Annu. Rev. Biochem. 70: 369-413.
Crick, F.H.C. 1976. Linking numbers and nucleosomes. Proc. Natl. Acad. Sci. 73: 2639-2643.
Dröge, P. and Cozzarelli, N.R. 1992. Topological structure of DNA knots and catenanes. Methods Enzymol. 212: 120-130.
Gellert, M. 1981. DNA topoisomerases. Annu. Rev. Biochem. 50: 879-910.
Wang, J.C. 2002. Cellular roles of DNA topoisomerases: A molecular perspective. Nat. Rev. Mol. Cell Biol. 3: 430-440.
Wasserman, S.A. and Cozzarelli, N.R. 1986. Biochemical topology: Applications to DNA recombination and replication. Science 232: 951-960.

RNA Structure
Doherty, E.A. and Doudna, J.A. 2001. Ribozyme structures and mechanisms. Annu. Rev. Biophys. Biomol. Struct. 30: 457-475.
McKay, D.B. and Wedekind, J.E. 1999. Small ribozymes. In The RNA World, 2nd edition (ed. Gesteland, R.F. et al.), pp. 265-286. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Uhlenbeck, O.C., Pardi, A., and Feigon, J. 1997. RNA structure comes of age. Cell 90: 833-840.
CHAPTER 7
Chromosomes, Chromatin, and the Nucleosome

OUTLINE
• Chromosome Sequence and Diversity
• Chromosome Duplication and Segregation
• The Nucleosome
• Higher-Order Chromatin Structure
• Regulation of Chromatin Structure
• Nucleosome Assembly

In Chapter 6, we considered the structure of DNA in isolation. Within the cell, however, DNA is associated with proteins and each DNA and its associated protein is called a chromosome. This organization holds true for prokaryotic and eukaryotic cells and even for viruses. Packaging of the DNA into chromosomes serves several important functions. First, the chromosome is a compact form of the DNA that readily fits inside the cell. Second, packaging the DNA into chromosomes serves to protect the DNA from damage. Completely naked DNA molecules are relatively unstable in cells. In contrast, chromosomal DNA is extremely stable, allowing the information encoded by the DNA to be reliably passed on. Third, only DNA packaged into a chromosome can be transmitted efficiently to both daughter cells each time a cell divides. Finally, the chromosome confers an overall organization to each molecule of DNA. This organization facilitates gene expression as well as the recombination between parental chromosomes that generates the diversity observed among different individuals of any organism.

Half of the molecular mass of a eukaryotic chromosome is protein. In eukaryotic cells, a given region of DNA with its associated proteins is called chromatin and the majority of the associated proteins are small, basic proteins called histones. Although not nearly as abundant, other proteins, frequently referred to as the non-histone proteins, are associated with the chromosome. These proteins include the numerous DNA-binding proteins that regulate the transcription, replication, repair, and recombination of cellular DNA. Each of these topics will be discussed in more detail in the next five chapters.

The proteins in chromatin perform another essential function: they compact the DNA. The following calculation makes the importance of this function clear. A human cell contains 3 × 10⁹ bp per haploid set of chromosomes. The thickness of each base pair (the "rise") is 3.4 Å. Therefore, if the DNA molecules in a haploid set of chromosomes were laid out end to end, the total length of DNA would be approximately 10¹⁰ Å, or 1 meter. For a diploid cell (as human cells typically are), this length is doubled to 2 meters. Since the diameter of a typical human cell nucleus is only 10-15 μm, it is obvious that the DNA must be compacted by several orders of magnitude to fit in such a small space. How is this achieved?

Most compaction in human cells (and all other eukaryotic cells) is the result of the regular association of DNA with histones to form structures called nucleosomes. The formation of nucleosomes is the first step in a process that allows the DNA to be folded into much more compact structures that reduce the linear length by as much as 10,000-fold. Compacting the DNA does not come without a cost. Association of the DNA with histones and other packaging proteins reduces the accessibility of the DNA. This reduced accessibility can interfere with the proteins that mediate replication, repair, recombination, and, perhaps most significantly, transcription of the DNA. Indeed, packaging of eukaryotic DNA results in a global repression of DNA transactions that must be overcome to allow enzymes such as DNA and RNA polymerases access to the DNA. The conflicting needs of compacting and accessing the DNA have focused attention on how chromatin structure is regulate
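The packaging calculation in the chapter introduction can be reproduced in a few lines of Python. The sketch below uses only the numbers quoted in the text (3 × 10⁹ bp per haploid genome, 3.4 Å per base pair) and, as an assumption, takes the lower end of the quoted nuclear diameter (10 μm) for the comparison; the constant and function names are illustrative.

```python
# Values quoted in the text
HAPLOID_GENOME_BP = 3e9        # base pairs per haploid human genome
RISE_PER_BP_ANGSTROM = 3.4     # thickness ("rise") of one base pair
ANGSTROM_PER_METER = 1e10


def dna_length_m(n_bp: float) -> float:
    """End-to-end length in meters of n_bp base pairs of double-helical DNA."""
    return n_bp * RISE_PER_BP_ANGSTROM / ANGSTROM_PER_METER


haploid_m = dna_length_m(HAPLOID_GENOME_BP)
diploid_m = 2 * haploid_m

# Assumption: lower bound of the quoted 10-15 micrometer nuclear diameter
NUCLEUS_DIAMETER_M = 10e-6
length_to_diameter = diploid_m / NUCLEUS_DIAMETER_M

print(f"haploid: {haploid_m:.2f} m, diploid: {diploid_m:.2f} m")
print(f"DNA length / nuclear diameter: {length_to_diameter:,.0f}")
```

The ratio of roughly 200,000 between the stretched-out DNA and the nuclear diameter is what the text means by compaction "by several orders of magnitude".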
CHROMOSOME SEQUENCE AND DIVERSITY
Before we discuss the structure of chromosomes in detail, it is important to understand the features of the DNA molecules that form their foundation. The recent sequencing of the genomes of numerous organisms has provided a wealth of information concerning the makeup of chromosomal DNAs and how their characteristics have changed as organisms have increased in complexity.
Chromosomes Can Be Circular or Linear
The traditional view is that prokaryotic cells have a single, circular chromosome and eukaryotic cells have multiple, linear chromosomes (Table 7-1). As more prokaryotic organisms have been studied, this view has been challenged. Although the most studied prokaryotes (such as E. coli and B. subtilis) do indeed have single circular chromosomes, there are now numerous examples of prokaryotic cells that have multiple chromosomes, linear chromosomes, or even both. In contrast, all eukaryotic cells have multiple linear chromosomes. Depending on the eukaryotic organism, the number of chromosomes typically varies from 2 to less than 50, but in rare instances can reach thousands (for example, in the macronucleus of the protozoa Tetrahymena, Table 7-1).

Circular and linear chromosomes each pose specific challenges that must be overcome for maintenance and replication of the genome. Circular chromosomes require topoisomerases to separate the daughter molecules after they are replicated. Without these enzymes, the two daughter molecules would remain interlocked, or catenated, with one another after replication. In contrast, the DNA ends of the linear eukaryotic chromosomes have to be protected from enzymes that normally degrade DNA ends and present a different set of difficulties during DNA replication, as we shall see in Chapter 8.

TABLE 7-1 Variation in Chromosome Makeup in Different Organisms
Species | Number of chromosomes | Chromosome copy number | Form of chromosome(s) | Genome size (Mb)
PROKARYOTES
Mycoplasma genitalium | 1 | 1 | Circular | 0.58
Escherichia coli K-12 | 1 | 1 | Circular | 4.6
Agrobacterium tumefaciens | 4 | 1 | 3 circular, 1 linear | 5.67
Sinorhizobium meliloti | 3 | 1 | Circular | 6.7
EUKARYOTES
Saccharomyces cerevisiae (budding yeast) | 16 | 1 or 2 | Linear | 12.1
Schizosaccharomyces pombe (fission yeast) | 3 | 1 or 2 | Linear | 12.5
C. elegans (roundworm) | 6 | 2 | Linear | 97
Arabidopsis thaliana (weed) | 5 | 2 | Linear | 125
Drosophila melanogaster (fruit fly) | 4 | 2 | Linear | 180
Tetrahymena thermophila (protozoa) | Micronucleus 5; Macronucleus 225 | Micronucleus 2; Macronucleus 10-10,000 | Linear | 220 (Micronucleus)
Fugu rubripes (fish) | 22 | 2 | Linear | 365
Mus musculus (mouse) | 19 + X and Y | 2 | Linear | 2,500
Homo sapiens | 22 + X and Y | 2 | Linear | 2,900

FIGURE 7-1 Comparison of typical prokaryotic and eukaryotic cell. (a) The diameter of a typical eukaryotic cell is ~10 μm. The typical prokaryotic cell is ~1 μm long. (b) Prokaryotic chromosomal DNA is located in the nucleoid and occupies a substantial portion of the internal region of the cell. Unlike the eukaryotic nucleus, the nucleoid is not separated from the remainder of the cell by a membrane. Plasmid DNA is shown in red. (c) Eukaryotic chromosomes are located in the membrane-bound nucleus. Haploid (1 copy) and diploid (2 copies) cells are distinguished by the number of copies of each chromosome present in the nucleus. (Source: Adapte

Every Cell Maintains a Characteristic Number of Chromosomes
Prokaryotic cells typically have only one complete copy of their chromosome(s) that is packaged into a structure called the nucleoid (Figure 7-1b). When prokaryotic cells are dividing rapidly, however, portions of the chromosome in the process of replicating are present in two and sometimes even four copies. Prokaryotes also frequently carry one or more smaller independent circular DNAs, called plasmids. Unlike the larger chromosomal DNA, plasmids typically are not essential for bacterial growth. Instead, they carry genes that confer desirable traits to the bacteria, such as antibiotic resistance. Also distinct from chromosomal DNA, plasmids can be present in many complete copies per cell.

The majority of eukaryotic cells are diploid; that is, they contain two copies of each chromosome (see Figure 7-1c). The two copies of a given chromosome are called homologs; one is derived from each parent. But not all cells in a eukaryotic organism are diploid; a subset of eukaryotic cells are either haploid or polyploid. Haploid cells contain a single copy of each chromosome and are involved in sexual reproduction (for example, sperm and eggs are haploid cells). Polyploid cells have more than two copies of each chromosome. Indeed, some organisms maintain the majority of their adult cells in a polyploid state. In extreme cases there can be hundreds or even thousands of copies of each chromosome. This type of global genome amplific
Genome Size Is Related to the Complexity of the Organism
Genome size (the length of DNA associated with one haploid complement of chromosomes) varies substantially between different organisms (Table 7-2). Because more genes are required to direct the formation of more complex organisms (at least when comparing bacteria, single-cell eukaryotes, and multicellular eukaryotes; see Chapter 19), it is not surprising that genome size is roughly correlated with an organism's apparent complexity. Thus, prokaryotic cells typically have genomes smaller than 10 megabases (Mb). The genomes of single-cell eukaryotes are typically less than 50 Mb, although the more complex protozoans can have genomes greater than 200 Mb. Multicellular organisms have even larger genomes that can reach sizes greater than 100,000 Mb.

Although there is a correlation between genome size and organism complexity, it is far from perfect. Many organisms of apparently similar complexities have very different genome sizes: a fruit fly has a genome approximately 25 times smaller than a locust and the rice genome is about 40 times smaller than wheat (see Table 7-2). In these examples, the number of genes rather than the expansion in genome size appears to be more closely related to organism complexity. This becomes clear when we examine the relative gene densities of different genomes.

TABLE 7-2 Comparison of the Gene Density in Different Organisms' Genomes
Species | Genome size (Mb) | Approximate number of genes | Gene density (genes/Mb)
PROKARYOTES (bacteria)
Mycoplasma genitalium | 0.58 | 500 | 860
Streptococcus pneumoniae | 2.2 | 2,300 | 1,060
Escherichia coli K-12 | 4.6 | 4,400 | 950
Agrobacterium tumefaciens | 5.7 | 5,400 | 960
Sinorhizobium meliloti | 6.7 | 6,200 | 930
EUKARYOTES
Fungi
Saccharomyces cerevisiae | 12 | 5,800 | 480
Schizosaccharomyces pombe | 12 | 4,900 | 410
Protozoa
Tetrahymena thermophila | 220 | >20,000 | >90
Invertebrates
Caenorhabditis elegans | 97 | 19,000 | 200
Drosophila melanogaster | 180 | 13,700 | 80
Strongylocentrotus purpuratus | 845 | ~22,000 | ~26
Locusta migratoria | 5,000 | nd | nd
Vertebrates
Fugu rubripes | 365 | >31,000 | >85
Homo sapiens | 2,900 | 27,000 | 9.3
Mus musculus | 2,500 | 29,000 | 12
Plants
Arabidopsis thaliana | 125 | 25,500 | 200
Oryza sativa (rice) | 430 | >45,000 | >100
Zea mays | 2,200 | >45,000 | >20
Fritillaria assyriaca (tulip) | 120,000 | nd | nd
nd = not determined
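The size comparisons in this section can be checked against the genome sizes listed in Table 7-2. The Python sketch below copies a handful of those values into a dictionary (the selection of species and the helper name `fold_difference` are illustrative choices, not part of the text) and computes how many times larger one genome is than another.

```python
# Genome sizes in Mb, copied from Table 7-2 (illustrative subset)
genome_size_mb = {
    "Escherichia coli K-12": 4.6,
    "Saccharomyces cerevisiae": 12,
    "Drosophila melanogaster": 180,
    "Locusta migratoria": 5000,
    "Homo sapiens": 2900,
}


def fold_difference(larger: str, smaller: str) -> float:
    """How many times larger the first genome is than the second."""
    return genome_size_mb[larger] / genome_size_mb[smaller]


locust_vs_fly = fold_difference("Locusta migratoria", "Drosophila melanogaster")
human_vs_ecoli = fold_difference("Homo sapiens", "Escherichia coli K-12")

print(f"locust / fruit fly: {locust_vs_fly:.1f}-fold")
print(f"human / E. coli:    {human_vs_ecoli:.0f}-fold")
```

With the tabulated values the locust genome comes out a little under 28 times the size of the fruit fly genome, in line with the "approximately 25 times" quoted in the text. Wheat is not listed in Table 7-2, so the rice comparison is left out here.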
The E. coli Genome Is Composed almost Entirely of Genes
The great majority of the single chromosome of the bacterium E. coli encodes proteins or structural RNAs (Figure 7-2). The majority of the noncoding sequences are dedicated to regulating gene transcription (as we shall see in Chapter 16). Because a single site of transcription initiation is often used to control the expression of several genes, even these regions are kept to a minimum in the genome. One critical element of the E. coli genome is not part of a gene: the E. coli origin of replication. This short chromosomal region is dedicated to directing the assembly of the replication machinery (as we shall discuss in Chapter 8). Despite its important role, this region is still very small, occupying only a few hundred base pairs of the 4.6 Mb E. coli genome.
More Complex Organisms Have Decreased Gene Density
What explains the dramatically different genome sizes of organisms of apparently similar complexity (such as the fruit fly and locust)? The differences are largely related to gene density. One simple measure of gene density is the average number of genes per Mb of genomic DNA. Thus, if an organism has 5,000 genes and a genome size of 50 Mb, then the gene density for that organism is 100 genes/Mb. When the gene densities of different organisms are compared, it becomes clear that different organisms use the gene-encoding potential of DNA with varying efficiencies. There is a rough inverse correlation between organism complexity and gene density: the less complex the organism, the higher the gene density. For example, the highest gene densities are found for viruses that in some instances use both strands of the DNA to encode overlapping genes. Although overlapping genes are rare, bacterial gene density is consistently near 1,000 genes/Mb.

Gene density in eukaryotic organisms is consistently lower and more variable than in their prokaryotic counterparts (see Table 7-2). Among eukaryotes, there is still a general trend for gene density to decrease with increasing organism complexity. The simple unicellular eukaryote S. cerevisiae has a gene density very close to prokaryotes (~500 genes/Mb). In contrast, the human genome is estimated to have a 50-fold lower gene density. In Figure 7-2 the amount of DNA sequence devoted to the expression of a related gene conserved across all organisms (the large subunit of RNA polymerase) is compared, illustrating the vast differences in gene density. Organisms with much larger genomes than humans are likely to have much lower gene densities. What is responsible for this reduction in gene density?

FIGURE 7-2 Comparison of the chromosomal gene density for different organisms. A representative 65 kb region of DNA is illustrated for each organism. The region that encodes the largest subunit of RNA polymerase (RNA Pol II for the eukaryotic cells) is indicated in red. Note how the number of genes encoded within the same length of DNA decreases as organism complexity increases. Panels: Escherichia coli (57 genes); Saccharomyces cerevisiae (31 genes); Drosophila melanogaster (9 genes); Human (2 genes). Key: genes, repeated sequences, introns, intergenic sequences, RNA polymerase gene. Scale: 0 to 60,000 bp.
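The gene density measure defined in this section is simple enough to compute directly. The Python sketch below reproduces the worked example from the text (5,000 genes in 50 Mb) and then, using the gene counts and genome sizes listed in Table 7-2 for S. cerevisiae and humans, checks the statement that the human genome has a roughly 50-fold lower gene density than yeast. The function name is an illustrative choice.

```python
def gene_density(n_genes: float, genome_size_mb: float) -> float:
    """Average number of genes per megabase of genomic DNA."""
    return n_genes / genome_size_mb


# Worked example from the text
example_density = gene_density(5_000, 50)

# Values taken from Table 7-2
yeast_density = gene_density(5_800, 12)       # S. cerevisiae
human_density = gene_density(27_000, 2_900)   # Homo sapiens
fold_lower = yeast_density / human_density

print(f"example organism: {example_density:.0f} genes/Mb")
print(f"S. cerevisiae:    {yeast_density:.0f} genes/Mb")
print(f"human:            {human_density:.1f} genes/Mb")
print(f"human density is about {fold_lower:.0f}-fold lower than yeast")
```

The yeast and human values land close to the 480 and 9.3 genes/Mb tabulated in Table 7-2, and their ratio is close to the 50-fold difference quoted in the text.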
Genes Make Up Only a Small Proportion of the Eukaryotic Chromosomal DNA
Two factors contribute to the decreased gene density observed in eukaryotic cells: increases in gene size and increases in the DNA between genes, called intergenic sequences. Individual genes are longer for two reasons. First, as organisms become increasingly complex, there is a significant increase in regions of DNA required to direct and regulate transcription, called regulatory sequences. Second, protein-encoding genes in eukaryotes frequently have discontinuous protein-coding regions. These interspersed non-protein-encoding regions, called introns, are removed from the RNA after transcription in a process called RNA splicing (Figure 7-3); we shall consider RNA splicing in detail in Chapter 13. The presence of introns can increase dramatically the length of DNA required to encode a gene (Table 7-3). For example, the average transcribed region of a human gene is about 27 kb (this should not be confused with the gene density), whereas the average protein-coding region of a human gene is 1.3 kb. A simple calculation reveals that only 5% of the average human protein-encoding gene directly encodes the desired protein. The remaining 95% is made up of introns. Consistent with their higher gene density, simpler eukaryotes have far fewer introns. For example, in the yeast S. cerevisiae, only 3.5% of genes have introns, none of which is greater than 1 kb (see Table 7-3).

An explosion in the amount of intergenic sequences in more complex organisms is responsible for the remaining decreases in gene density. Intergenic DNA is the portion of a genome that is not associated with the expression of proteins or structural RNAs. More than 60% of the human genome is composed of intergenic sequences and most of this DNA has no known function (Figure 7-4). There are two kinds of intergenic DNA: unique and repeated. About a quarter of the intergenic DNA is unique. These regions comprise many apparently nonfunctional relics, including nonfunctional mutant genes, gene fragments, and pseudogenes. The mutant genes and gene fragments arise from simple random mutagenesis or mistakes in DNA recombination. Pseudogenes arise from the action of an enzyme called reverse transcriptase (Figure 7-5 and Chapter 11). This enzyme copies RNA into double-stranded DNA (referred to as copy DNA or cDNA) but is only expressed by certain types of viruses that require this enzyme to reproduce. But, as a side effect of infection by such a virus, the cellular mRNAs can be copied into DNA, and the resulting DNA fragments reintegrated into the genome at a low rate. These copies are not expressed, however, because they lack the correct sequences to direct their expression (such sequences are generally not part of a gene's RNA product; see Chapter 12).

FIGURE 7-3 Schematic of RNA splicing. Transcription of pre-mRNA is initiated at the arrow shown above exon 1. This primary transcript is then processed (by splicing) to remove noncoding introns to produce messenger RNA. Labels in the figure: DNA, introns 1 and 2, primary RNA transcript, spliced mRNA.

TABLE 7-3 Contribution of Introns and Repeated Sequences to Different Genomes
Species | Gene density (genes/Mb) | Average number of introns per gene | Percentage of DNA that is repetitive
PROKARYOTES (bacteria)
Escherichia coli K-12 | 950 | 0 | <1
EUKARYOTES
Fungi
Saccharomyces cerevisiae | 480 | 0.04 | 3.4
Invertebrates
Caenorhabditis elegans | 200 | 5 | 6.3
Drosophila melanogaster | 80 | 3 | 12
Vertebrates
Fugu rubripes | 75 | 5 | 2.7
Homo sapiens | 8.5 | 6 | 46
Plants
Arabidopsis thaliana | 125 | 3 | nd
Oryza sativa (rice) | 470 | nd | 42
nd = not determined
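The "simple calculation" mentioned in this section, comparing the 27 kb average transcribed region of a human gene with its 1.3 kb protein-coding region, can be written out as a short Python check. The variable names are illustrative; the two lengths are the ones quoted in the text.

```python
# Average lengths for a human gene, as quoted in the text (in kb)
TRANSCRIBED_REGION_KB = 27.0
PROTEIN_CODING_KB = 1.3

coding_fraction = PROTEIN_CODING_KB / TRANSCRIBED_REGION_KB
noncoding_fraction = 1.0 - coding_fraction

print(f"protein-coding share of the transcribed region: {coding_fraction:.1%}")
print(f"remaining share of the transcribed region:      {noncoding_fraction:.1%}")
```

The coding share comes out just under 5%, which matches the rounded figures of 5% coding and 95% intron sequence given in the text.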
FIGURE 7-4 The organization and content of the human genome. The human genome is composed of many different types of DNA sequences, the majority of which do not encode proteins. The figure shows the distribution and amount of each of the various types of sequences. Labels in the figure: human genome, 3,200 Mb; genes, 48 Mb; gene-related sequences (pseudogenes, gene fragments, introns, UTRs), 1,152 Mb; repeats, 1,400 Mb; other intergenic regions, 600 Mb, of which unique, 510 Mb, and microsatellites, 90 Mb. (Source: Adapted from Brown T.A. 2002. Genomes, 2nd edition, p. 23, box 1.4. © 2002 BIOS Scientific Publishers. Used by permission. www.tandf.com.)
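The proportions quoted in the text for the human genome can be compared with the amounts labeled in Figure 7-4. The Python sketch below uses four of those labels (total genome, repeats, other intergenic regions, unique sequence) and assumes, as a reading of the figure rather than something the text states outright, that intergenic DNA is the sum of the repeats and the other intergenic regions.

```python
# Amounts in Mb, read from the labels of Figure 7-4
GENOME_MB = 3200
GENOME_WIDE_REPEATS_MB = 1400
OTHER_INTERGENIC_MB = 600
UNIQUE_INTERGENIC_MB = 510

# Assumption: intergenic DNA = repeats + other intergenic regions
intergenic_mb = GENOME_WIDE_REPEATS_MB + OTHER_INTERGENIC_MB


def percent(part: float, whole: float) -> float:
    """Share of `whole` made up by `part`, in percent."""
    return 100 * part / whole


intergenic_pct = percent(intergenic_mb, GENOME_MB)
unique_pct_of_intergenic = percent(UNIQUE_INTERGENIC_MB, intergenic_mb)
repeats_pct = percent(GENOME_WIDE_REPEATS_MB, GENOME_MB)

print(f"intergenic share of genome:       {intergenic_pct:.1f}%")
print(f"unique share of intergenic DNA:   {unique_pct_of_intergenic:.1f}%")
print(f"repeats as a share of the genome: {repeats_pct:.1f}%")
```

Under that assumption the figure agrees with the prose: a little over 60% of the genome is intergenic, about a quarter of the intergenic DNA is unique, and the repeats alone account for close to 45% of the genome.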
The Majority of Human Intergenic Sequences Are Composed of Repetitive DNA
Almost half of the human genome is composed of DNA sequences that are repeated many times in the genome. There are two general classes of repeated DNA: microsatellite DNA and genome-wide repeats. Microsatellite DNA is composed of very short (less than 13 bp), tandemly repeated sequences. The most common microsatellite sequences are dinucleotide repeats (for example, CACACACACACACACA). These repeats arise from difficulties in accurately duplicating the DNA and represent nearly 3% of the human genome.
FIGURE 7-5 Processed pseudogenes arise from integration of reverse transcribed mRNAs. When reverse transcriptase is present in a cell, mRNA molecules can be copied into dsDNA. In rare instances, these DNA molecules can integrate into the genome, creating pseudogenes. Because introns are rapidly removed from newly transcribed RNAs, these pseudogenes have the common characteristic of lacking introns. This distinguishes the pseudogene from the copy of the gene from which it was derived. In addition, pseudogenes lack the appropriate promoter sequences to direct their transcription, as these are not part of the mRNA from which they are
Genorne-wide repcats are much larger than thcir microsatellllo countnrparts_ Ench genome-wide repeat unit is greater than 100 bp in length and many are g:reater than 1 kb_ These sequences can be found either as single copies dispcrsed Ih1'oughoul Ihe gcnome, or as closely-spaced duslers, Although Ihere are numerous classes of such repeats, their cornmon fcaluro is thal all are fonns oftransposable elcments. Transposable elemenls are scquencf:s thal can "move" from one place in the gmlOme lo anolher. fn Iransposition, as this movenle nl is cal1cd, the cIernent movcs lo a llCW position in Ihe geflome, often Icaving the original copy behind . Thus. these scquences multiply and ac~ c LUuulatc throughout the genomc. Movement of transposable c1ements js a relativeJy rare evenl in human celJs. Nevertheless, over long periods of Ilme. thoso elr.;!ments have becn so s uccessful at propagating copies of themselves thal thoy now compriso approximately 45 % of the human genome_ In Chaptor 11 we will consider the mechanism by wruch transposahle e]ements movo atound the genome and how their movemenl is controllod lo preven! chromosome damago. Although we have discussed the nature 01' intergcnic sequcncc in the conlexl of the human gellome, many of the same features are fOllnd in olber organisms. For examplc, comparison of the known sequences of purtions or several plants with very large genomes (sllch as n18izc) indi· cates Lhat transposable elements are likcly to comprise an cvcn larger pereentage of thcse genomes. Similarly. even in tJle compact genomcs of E. coli and S, cerevisjae, thero aro examples of transposnble elements and microsatelJite repcats (sce Figure 7·2). The differcnce is lbat these elements have bccn less succcssfuJ al occupying the genomes of these simpler ol:gélnisms. 
This lack of success is likely a combination of inefficient duplication and/or more efficient elimination (either by repair events or by elimination of organisms in which duplication has occurred). Although it is tempting to refer to repeated DNA as junk DNA, the stable maintenance of these sequences over hundreds to thousands of generations suggests that intergenic DNA confers a positive value (or selective advantage) to the host organism.
CHROMOSOME DUPLICATION AND SEGREGATION

Eukaryotic Chromosomes Require Centromeres, Telomeres, and Origins of Replication to Be Maintained During Cell Division

There are several important DNA elements in eukaryotic chromosomes that are not genes and are not involved in regulating the expression of genes (Figure 7-6). These elements include origins of replication that direct the duplication of the chromosomal DNA, centromeres that act as "handles" for the movement of chromosomes into daughter cells, and telomeres that protect and replicate the ends of linear chromosomes. All these features are critical for the proper duplication and segregation of the chromosomes during cell division. We now look at each of these elements in more detail.

Origins of replication are the sites at which the DNA replication machinery assembles to initiate replication. They are found some 30-40 kb apart throughout the length of each eukaryotic chromosome. Prokaryotic chromosomes also require origins of replication. Unlike their eukaryotic counterparts, prokaryotic chromosomes typically have
FIGURE 7-6 Centromeres, origins of replication, and telomeres are required for eukaryotic chromosome maintenance. Each eukaryotic chromosome includes two telomeres, one centromere, and many origins of replication.
only a single site of replication initiation. In general, origins of replication are found in noncoding regions. The DNA sequences that are recognized as origins of replication are discussed in detail in Chapter 8.
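The spacing quoted above lends itself to a quick back-of-the-envelope estimate. The sketch below assumes origins are evenly spaced, which is an idealization, and uses a hypothetical 150 Mb chromosome that is not a figure from the text; `estimate_origin_count` is an illustrative helper name.

```python
def estimate_origin_count(chromosome_length_bp, spacing_bp):
    """Approximate number of origins if they were evenly spaced (an idealization)."""
    if spacing_bp <= 0:
        raise ValueError("spacing must be positive")
    return chromosome_length_bp // spacing_bp


# Hypothetical chromosome length chosen only for illustration.
ASSUMED_LENGTH_BP = 150_000_000

# The 30-40 kb spacing is the range given in the text.
low = estimate_origin_count(ASSUMED_LENGTH_BP, 40_000)   # wider spacing, fewer origins
high = estimate_origin_count(ASSUMED_LENGTH_BP, 30_000)  # tighter spacing, more origins
print(f"roughly {low} to {high} origins")
```

Under these assumptions a single eukaryotic chromosome would carry thousands of origins, in contrast to the single initiation site typical of a prokaryotic chromosome.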
Centromeres are required for the correct segregation of the chromosomes after DNA replication. The two copies of each replicated chromosome are called daughter chromosomes and they must be separated with one copy going to each of the two daughter cells. Like origins of replication, centromeres direct the formation of an elaborate protein complex, in this case called a kinetochore. The kinetochore interacts with the machinery that pulls the daughter chromosomes away from one another and into the two daughter cells. In contrast to the many origins of replication found on each eukaryotic chromosome, it is critical that each chromosome has one and only one centromere (Figure 7-7a). In the absence of a centromere, the replicated chromosomes segregate randomly, leading to frequent loss or duplication of chromosomes (Figure 7-7b). If present in multiple copies, centromeres can cause a single chromosome to be pulled into both daughter cells, leading to chromosome breakage (Figure 7-7c). Centromeres vary greatly in size. In the yeast S. cerevisiae, centromeres are less than 200 bp. In contrast, in the majority of eukaryotes, centromeres are >40 kb and are composed of largely repetitive DNA sequences (Figure 7-8).

Telomeres are located at the two ends of a linear chromosome. Like origins of replication and centromeres, telomeres are bound by a number of proteins. In this case, the proteins perform two important functions. First, telomeric proteins distinguish the natural ends of the chromosome from sites of chromosome breakage and other DNA breaks in
FIGURE 7-7 More or less than one centromere leads to chromosome loss or breakage. (a) Normal chromosomes have one centromere. After replication of a chromosome, each copy of the centromere directs the formation of a kinetochore. These two kinetochores then bind to opposite poles of the mitotic spindle and are pulled into the opposite sides of the cell prior to cell division. (b) Chromosomes lacking centromeres are rapidly lost from cells. In the absence of a centromere, chromosomes segregate randomly, which can produce divisions in which one daughter cell gets two copies of a chromosome and the other daughter cell is missing the same chromosome. (c) Chromosomes with two or more centromeres are frequently broken during segregation. If a chromosome has more than one centromere, it can be bound simultaneously to both poles of the mitotic spindle. When segregation is initiated, the opposing forces of the mitotic spindle frequently break chromosomes attached to both poles.
the cell. Ordinarily, DNA ends are the sites of frequent recombination and DNA degradation. The proteins that assemble at telomeres form a structure that is resistant to both of these events. Second, telomeres act as a specialized origin of replication that allows the cell to replicate the ends of the chromosomes. For reasons that will be described in detail in Chapter 8, the standard DNA replication machinery cannot completely replicate the ends of a linear chromosome. Telomeres facilitate end replication through the recruitment of an unusual DNA polymerase called telomerase. In contrast to most of the chromosome, a substantial portion of the telomere is maintained in a single-stranded form (Figure 7-9). Most telomeres have a simple repeating sequence that varies from organism to organism. This repeat is typically composed of a short TG-rich repeat. For example, human telomeres have the repeating sequence 5'-TTAGGG-3'. As we will see in Chapter 8, the repetitive nature of telomeres is a consequence of their unique method of replication.

FIGURE 7-8 Centromere size and composition varies dramatically between different organisms. S. cerevisiae centromeres are small and composed of nonrepetitive sequences. In contrast, the centromeres of other organisms such as the fruit fly, Drosophila melanogaster, and the fission yeast, Schizosaccharomyces pombe, are much larger and are largely composed of repetitive sequences. Only the central 4-7 kb of the S. pombe centromere is nonrepetitive. Panels: (a) S. cerevisiae, 125 bp; (b) S. pombe, 40-100 kb; (c) D. melanogaster, ~400 kb; (d) human, 240 kb to several Mb.
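The simple repeating structure of telomeric DNA makes it easy to recognize in a sequence. The following is a minimal sketch, not a method from the text: `longest_repeat_run` is a hypothetical helper that reports the longest stretch of back-to-back copies of the human repeat, and the example sequence is invented for illustration.

```python
import re

HUMAN_TELOMERE_REPEAT = "TTAGGG"  # the human repeat, written 5' to 3'


def longest_repeat_run(sequence, unit=HUMAN_TELOMERE_REPEAT):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Invented sequence: a short unique stretch followed by four telomeric repeats.
example = "ACGTTGCA" + "TTAGGG" * 4
print(longest_repeat_run(example))
```

Real telomeres carry far more copies than this toy example, and other organisms would need a different `unit`.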
Eukaryotic Chromosome Duplication and Segregation Occur in Separate Phases of the Cell Cycle

During cell division, the chromosomes must be duplicated and segregated into the daughter cells. In bacterial cells these events occur simultaneously. That is, as the DNA is replicated, the resulting two copies are separated into opposite sides of the cell. Although it is clear that these events are tightly regulated in bacteria, the details of how this regulation is achieved are poorly understood. In contrast, eukaryotic cells duplicate and segregate their chromosomes at distinct times during cell division. We will focus on these events for the remainder of our discussion of chromosomes.

The events required for a single round of cell division are collectively known as the cell cycle. Most eukaryotic cell divisions maintain the number of chromosomes in the daughter cells that were present in the parental cell. This type of division is called mitotic cell division. The mitotic cell cycle can be divided into four phases: G1, S, G2, and M (Figure 7-10). The key events involved in chromosome
FIGURE 7-9 The structure of a typical telomere. The repeated sequence (from human cells) is shown in a representative box. Note that the region of ssDNA at the 3' end of the chromosome can be hundreds of bases long.
FIGURE 7-10 The eukaryotic mitotic cell cycle. There are four stages of the eukaryotic cell cycle. Chromosomal replication occurs during S phase and chromosome segregation occurs during M phase. The G1 and G2 gap phases allow the cell to prepare for the next events in the cell cycle. For example, many eukaryotic cells use the G1 phase of the cell cycle to establish that the level of nutrients is sufficiently high to allow cell division.
propagation occur at distinct times during the cell cycle. DNA synthesis occurs during the synthesis, or S phase, of the cell cycle, resulting in the duplication of each chromosome (Figure 7-11). Each chromosome of the duplicated pair is called a chromatid, and the two chromatids of a given pair are called sister chromatids. Sister chromatids are held together after duplication through the action of a molecule called cohesin, which we describe below. The process that holds them together is called sister chromatid cohesion and this tethered state is maintained until the chromosomes segregate from one another.

Chromosome segregation occurs during mitosis, or the M phase of the cell cycle. We will consider the overall process of mitosis below, but first we focus on three key steps in the process (Figure 7-12). First, each pair of sister chromatids is bound to a structure called the mitotic spindle. This structure is composed of long protein fibers called microtubules that are attached to one of the two microtubule organizing
FIGURE 7-11 The events of S phase. Two major chromosomal events occur during S phase. DNA replication copies each chromosome completely, and shortly after replication has occurred, sister chromatid cohesion is established by placing ring-shaped cohesin molecules around the two copies of the recently replicated DNA. Each blue or red "lobe" represents an ssDNA molecule.
FIGURE 7-12 The events of mitosis (M phase). Three major events occur during mitosis. First, the two kinetochores of each linked sister-chromatid pair attach to opposite poles of the mitotic spindle. Once all kinetochores are bound to opposite poles, sister-chromatid cohesion is eliminated by destroying the cohesin ring. Finally, after cohesion is eliminated, the sister chromatids are segregated to opposite poles of the mitotic spindle.
centers (also called centrosomes in animal cells or spindle pole bodies in yeasts and other fungi). The microtubule organizing centers are located on opposite sides of the cell, forming "poles" toward which the microtubules pull the chromatids. Chromatid attachment is mediated by the kinetochore assembled at each centromere (Figure 7-6). Second, the cohesion between the chromatids is dissolved. Before cohesion is dissolved, it resists the pulling forces of the mitotic spindle. After cohesion is dissolved, the third major event in mitosis can occur: sister chromatid separation. In the absence of the counterbalancing force of chromatid cohesion, the chromatids are rapidly pulled toward opposite poles of the mitotic spindle. Thus, cohesion between the sister chromatids and attachment of sister chromatid kinetochores to opposite poles of the mitotic spindle play opposing roles that must be carefully coordinated for chromosome segregation to occur properly.
Chromosome Structure Changes as Eukaryotic Cells Divide

As chromosomes proceed through a round of cell division, their structure is altered numerous times; however, there are two main states for the chromosomes (Figure 7-13). The chromosomes are in their most compact form as cells proceed through mitosis or meiosis. The process that results in this compact form is called chromosome condensation. In this condensed state the chromosomes are completely disentangled from one another, greatly facilitating the segregation process. During the G1, S, and G2 phases (collectively referred to as interphase), the chromosomes are significantly less compact. Indeed, at these stages of the cell cycle, the chromosomes are likely to be highly intertwined, resembling more of a plate of spaghetti than the organized view of chromosomes during mitosis. Nevertheless, even during these stages the structure of the chromosomes changes. DNA replication
FIGURE 7-13 Changes in chromatin structure. Chromosomes are maximally condensed in M phase and decondensed throughout the rest of the cell cycle (G1, S, and G2 in mitotic cells). Together these decondensed stages are referred to as interphase.
requires the nearly complete disassembly and reassembly of the proteins associated with each chromosome. Immediately after DNA replication, sister-chromatid cohesion is established, linking the newly replicated chromatids to one another. As transcription of individual genes is turned on and off or up and down, there are associated changes in the structure of the chromosomes in those regions occurring throughout the cell cycle. Thus, the chromosome is a constantly changing structure that is more like an organelle than a simple string of DNA.
Sister Chromatid Cohesion and Chromosome Condensation Are Mediated by SMC Proteins

The key proteins that mediate sister chromatid cohesion and chromosome condensation are related to one another. The structural maintenance of chromosome (SMC) proteins are extended proteins that form defined pairs by interacting through lengthy coiled-coil domains (see Chapter 5). Together with non-SMC proteins they form multiprotein complexes that act to link two DNA helices together. An SMC-protein-containing complex called cohesin is required to link the two daughter DNA duplexes (sister chromatids) together after DNA replication. It is this linkage that is the basis for sister chromatid cohesion. The structure of cohesin is thought to be a large ring composed of two SMC proteins and a third non-SMC protein. Indeed, there is growing evidence that the mechanism of sister chromatid cohesion is that both daughter chromosomes pass through the center of the cohesin protein ring (Figure 7-14). In this model, proteolytic cleavage of the non-SMC subunit of cohesin results in the opening of the ring and the loss of cohesion.

The chromosome condensation that accompanies chromosome segregation also requires a related SMC-containing complex called condensin. Although less is known about the structure and function of this complex, it shares many of the features of the cohesin complex, suggesting that it too is a ring-shaped complex. If so, it may use its ring-like nature to induce chromosome condensation. For example, by linking different regions of the same chromosome together, condensin could readily reduce the overall linear length of the chromosome (Figure 7-14).
FIGURE 7-14 A speculative model for the structure of cohesins and condensins. Cohesins and condensins are components of the nuclear scaffold. Both play important roles in bringing distant or different regions of DNA together. The proposed ring-shaped structure of these proteins would allow a flexible, but strong link, between two regions of DNA. In this illustration, the SMC proteins are shown as green (cohesin) or blue (condensin). (Source: Haering C.H. 2002. Mol. Cell 9: 773-788.)
Mitosis Maintains the Parental Chromosome Number

We now return to the overall process of mitosis. Mitosis occurs in several stages (Figure 7-15). During prophase, the chromosomes condense into the highly compact form required for segregation. At the end of prophase, the nuclear envelope breaks down and the cell enters metaphase.
During metaphase, the mitotic spindle forms and the kinetochores of sister chromatids attach to the microtubules. Proper chromatid attachment is only achieved when the two kinetochores of a sister-chromatid pair are attached to microtubules emanating from opposite microtubule organizing centers. This type of attachment is called bivalent attachment (see Figure 7-15) and results in the microtubules exerting tension on the chromatid pair by pulling the sisters in opposite directions. Attachment of both chromatids to microtubules emanating from the same microtubule organizing center or attachment of only one chromatid of the pair, called monovalent attachment, does not result in tension and eventually leads to chromosome loss. The tension exerted by bivalent attachment is opposed by sister chromatid cohesion and results in all the chromosomes aligning in the middle of the cell between the two microtubule organizing centers (this position is called the metaphase plate). At this point, each sister chromatid is prepared to be segregated.

Chromosome segregation is triggered by proteolytic destruction of the cohesin molecules, resulting in the loss of sister chromatid cohesion. This loss occurs as cells enter anaphase, during which the sister chromatids separate and move to opposite sides of the cell. Once the two sisters are no longer held together, they cannot resist the outward pull of the microtubule spindle. Bivalent attachment ensures that the members of a sister-chromatid pair are pulled toward opposite poles and each daughter cell receives one copy of each duplicated chromosome. The final step of mitosis is telophase, during which the nuclear envelope reforms around each set of segregated chromosomes. At this point, cell division can be completed by physically separating the shared cytoplasm of the two presumptive cells in a process called cytokinesis.
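The distinction between bivalent and monovalent attachment described above can be expressed as a small decision rule. The sketch below is only illustrative: the function names and the "left"/"right" pole labels are assumptions introduced here, and the "unattached" category is added for completeness rather than taken from the text.

```python
def classify_attachment(pole_a, pole_b):
    """Classify a sister-chromatid pair by the spindle pole each kinetochore binds.

    Each argument is a pole label such as "left" or "right", or None if that
    kinetochore is not attached to any microtubule. The labels are arbitrary.
    """
    if pole_a is None and pole_b is None:
        return "unattached"
    if pole_a is not None and pole_b is not None and pole_a != pole_b:
        return "bivalent"    # sisters attached to opposite poles
    return "monovalent"      # same pole, or only one sister attached


def generates_tension(pole_a, pole_b):
    """Only bivalent attachment pulls the sisters in opposite directions."""
    return classify_attachment(pole_a, pole_b) == "bivalent"


print(classify_attachment("left", "right"))  # bivalent
print(classify_attachment("left", "left"))   # monovalent
print(classify_attachment("left", None))     # monovalent
```

In this simplified picture, a cell would be ready for anaphase only when every chromatid pair is classified as bivalent.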
The Gap Phases of the Cell Cycle Allow Time to Prepare for the Next Cell Cycle Stage while also Checking that the Previous Stage Is Finished Correctly

The remaining two phases of the eukaryotic cell cycle are gap phases. G1 occurs prior to DNA synthesis and G2 between S phase and M phase. The gap phases of the cell cycle serve two purposes. They provide time for the cell to prepare for the next phase of the cell cycle and to check that the previous phase of the cell cycle has been completed appropriately. For example, prior to entry into S phase, most cells must reach a certain size and level of protein synthesis to ensure that there will be adequate proteins and nutrients to complete the next round of DNA synthesis. If there is a problem with a previous step in the cell cycle, cell cycle checkpoints arrest the cell cycle to provide time for the cell to complete that step. For example, cells with damaged DNA arrest the cell cycle in G1 before DNA synthesis or in G2
FIGURE 7-15 Mitosis in detail. Prior to mitosis, the chromosomes are in a decondensed state called interphase. During prophase, chromosomes are condensed and detangled in preparation for segregation and the nuclear membrane surrounding the chromosomes breaks down in most eukaryotes. During metaphase, each sister-chromatid pair attaches to opposite poles of the mitotic spindle. Anaphase is initiated by the loss of sister-chromatid cohesion, resulting in the separation of sister chromatids. Telophase is distinguished by the loss of chromosome condensation and the reformation of the nuclear membrane around the two populations of segregated chromosomes. Cytokinesis is the final event of the cell cycle, during which the cellular membrane surrounding the two nuclei constricts and eventually completely separates into two daughter cells. All DNA molecules are double-stranded.
before mitosis to prevent either event from occurring with damaged chromosomes. This delay allows time for the damage to be repaired before the cell cycle continues.
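The ordering of the four phases and the DNA damage arrest described above can be sketched as a tiny state machine. This is a deliberately simplified model under stated assumptions: `next_phase` is a hypothetical helper, damage is reduced to a single boolean flag, and only the two arrest points named in the text (G1 and G2) are modeled.

```python
PHASES = ["G1", "S", "G2", "M"]  # order of the mitotic cell cycle


def next_phase(current, dna_damaged=False):
    """Return the phase the cell occupies after the next transition.

    Cells with damaged DNA stay in G1 (before DNA synthesis) or in G2
    (before mitosis) until the damage is repaired.
    """
    if current not in PHASES:
        raise ValueError(f"unknown phase: {current}")
    if dna_damaged and current in ("G1", "G2"):
        return current  # checkpoint arrest: do not advance
    return PHASES[(PHASES.index(current) + 1) % len(PHASES)]


print(next_phase("G1"))                    # S
print(next_phase("G1", dna_damaged=True))  # G1
print(next_phase("M"))                     # G1
```

A fuller model would add the other conditions the gap phases monitor, such as cell size and the level of protein synthesis before S phase.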
Meiosis Reduces the Parental Chromosome Number

A second type of eukaryotic cell division is specialized to produce cells that have half the number of chromosomes of the parental cell. Like the mitotic cell cycle, the meiotic cell cycle includes a G1, S, and an elongated G2 phase (Figure 7-16). During the meiotic S phase, each chromosome is replicated and the daughter chromatids remain associated as in the mitotic S phase. Cells that enter meiosis must be diploid and thus contain two copies of each chromosome, one derived from each parent. After DNA replication, these related sister-chromatid pairs, called homologs, pair with one another and recombine. Recombination between the homologs creates a physical linkage between the two homologs that is required to connect the two related sister-chromatid pairs during chromosome segregation. We will discuss the details of meiotic recombination in Chapter 10.

The most significant difference between the mitotic and meiotic cell cycles occurs during chromosome segregation. Unlike mitosis, during which there is a single round of chromosome segregation, chromosomes participating in meiosis go through two rounds of segregation known as meiosis I and II. Like mitosis, each of these segregation events includes a prophase, metaphase, and anaphase stage. During the metaphase of meiosis I, also called metaphase I, the homologs attach to opposite poles of the microtubule-based spindle. This attachment is mediated by the kinetochore. Because both kinetochores of each sister-chromatid pair are attached to the same pole of the microtubule spindle, this interaction is referred to as monovalent attachment (in contrast to the bivalent attachment seen in mitosis, in which the kinetochores of each sister-chromatid pair bind to opposite poles of the spindle). As in mitosis, the paired homologs initially resist the tension of the spindle pulling them apart. In the case of meiosis I, this is mediated through the physical connections between the homologs, or crossovers, that are induced by recombination. This resistance also requires sister-chromatid cohesion along the arms of the sister chromatids. When cohesion along the arms is eliminated during anaphase I, the homologs are released from one another and segregate to opposite poles of the cell. Importantly, the cohesion between the sisters is maintained near the centromere, resulting in the sister chromatids remaining paired.

The second round of segregation during meiosis, meiosis II, is very similar to mitosis. The major difference is that a round of DNA replication does not precede this segregation event. Instead, a spindle is formed in association with each of the two newly separated sister-chromatid pairs. As in mitosis, during metaphase II, these spindles attach in a bivalent manner to the kinetochores of each sister-chromatid pair. The cohesion that remains at the centromeres after meiosis I is critical to oppose the pull of the spindle. As in mitosis, anaphase II is initiated by the elimination of centromere cohesion. At this point there are four sets of chromosomes in the cell, each of which contains only one copy of each chromosome. A nucleus forms around each set of chromosomes, and then the cytoplasm is divided to form four haploid cells. These cells are now ready to mate to form new diploid cells.
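The bookkeeping behind "one replication, one segregation" versus "one replication, two segregations" can be checked with simple arithmetic. The sketch below is an illustration under simplifying assumptions: it only counts DNA molecules (chromatids) per cell and ignores which homolog goes where. The function names are hypothetical, and the value 46 is the familiar human diploid chromosome number, used here only as an example.

```python
def mitosis(chromosomes_per_cell):
    """One round of replication, then one round of segregation."""
    chromatids = 2 * chromosomes_per_cell  # S phase duplicates every chromosome
    return [chromatids // 2] * 2           # sisters separate into two cells


def meiosis(chromosomes_per_cell):
    """One round of replication, then two rounds of segregation."""
    if chromosomes_per_cell % 2:
        raise ValueError("a diploid cell has an even number of chromosomes")
    chromatids = 2 * chromosomes_per_cell    # S phase duplicates every chromosome
    after_meiosis_1 = [chromatids // 2] * 2  # homologs separate (chromatids per cell)
    # Meiosis II: sisters separate, so each cell splits into two again.
    return [count // 2 for count in after_meiosis_1 for _ in range(2)]


print(mitosis(46))  # two cells with the parental chromosome number
print(meiosis(46))  # four cells with half the parental chromosome number
```

With 46 chromosomes as input, mitosis gives two cells of 46 and meiosis gives four cells of 23, matching the reduction described above.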
FIGURE 7-16 Meiosis in detail. Like mitosis, meiosis can be divided into discrete stages. After DNA replication, homologous sister chromatids pair with one another to form structures with four related chromatids, joined by connections between the homologous chromosomes called chiasmata. During metaphase I, the two kinetochores of each sister-chromatid pair attach to the same pole, and the kinetochores of the homologous sister-chromatid pairs attach to opposite poles, creating tension that is resisted by the connection between the homologs. Entry into anaphase I is correlated with two events which together result in the separation of the homologous chromosomes from one another: the sister-chromatid cohesion is lost along the arms of the chromosomes and the chiasmata between the homologs are resolved. The sister chromatids remain attached through cohesion at the centromere. Meiosis II is very similar to mitosis. During meiotic metaphase II, two meiotic spindles are formed. As in mitotic metaphase, the kinetochores associated with each sister-chromatid pair attach to opposite poles of the meiotic spindles. During anaphase II, the remaining cohesion between the sisters is lost and the sister chromatids separate from one another. The four separate sets of chromosomes are then packaged into nuclei and separated into four cells to create four spores or gametes. All DNA molecules are double-stranded. (Source: Adapted from Murray A. and Hunt T. 1993. The cell cycle: An introduction, fig. 10.2. Copyright © 1993 by Oxford University Press, Inc. Used by permission of Oxford University Press, Inc.)
FIGURE 7-17 Forms of chromatin structure seen in the EM. (a) Electron micrographs of M phase and interphase DNA show the changes in the structure of chromatin. (b) Electron micrographs of different forms of chromatin in interphase cells show the 30-nm and 10-nm chromatin fibers (beads on a string). (Source: (a) Courtesy of Victoria Foe; © 2002 from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition. Reproduced by permission of Routledge Inc., part of The Taylor & Francis Group. (b) Courtesy of Barbara Hamkalo; © 2002 from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition. Reproduced by permission of Routledge Inc., part of The Taylor & Francis Group.)
Different Levels of Chromosome Structure Can Be Observed by Microscopy

Microscopy has long been used to observe chromosome structure and function. Indeed, long before it was clear that chromosomes were the source of the genetic information in the cell, their movements and changes during cell division were well understood. The compact nature of condensed mitotic chromosomes also makes them relatively easy to visualize even by simple light microscopy (Figure 7-17a). Indeed, it was in this form that chromosomes were first identified. Condensed chromosomes are also used to determine the chromosomal make-up of human cells to detect such abnormalities as chromosomal deletions or individuals with extra copies of a single chromosome.

Chromosomal DNA not in mitosis (that is, in interphase) is less compact (Figure 7-17a). In the electron microscope two states of chromatin are readily observed: fibers with a diameter of either 30 nm or 10 nm (Figure 7-17b). The 30-nm fiber is a more compact version of chromatin that is frequently folded into large loops reaching out from a protein core or scaffold. In contrast, the 10-nm fiber is a less compact form of chromatin that resembles a regular series of "beads on a string." These beads are nucleosomes. We will first focus on the nature of the nucleosome, including how nucleosomes are formed, and then describe how nucleosome-dependent structures control global effects on the accessibility of nuclear DNA.
THE NUCLEOSOME

Nucleosomes Are the Building Blocks of Chromosomes

The majority of the DNA in eukaryotic cells is packaged into nucleosomes. The nucleosome is composed of a core of eight histone proteins and the DNA wrapped around them. The DNA between each nucleosome (the "string" in the "beads on a string") is called linker DNA. By assembling into nucleosomes, the DNA is compacted approximately sixfold. This is far short of the 1,000- to 10,000-fold DNA compaction observed in eukaryotic cells. Nevertheless, this first stage of DNA packaging is essential for all the remaining levels of DNA compaction.

The DNA most tightly associated with the nucleosome, called the core DNA, is wound approximately 1.65 times around the outside of the histone octamer like thread around a spool (Figure 7-18). The length of DNA associated with each nucleosome can be determined using nuclease treatment (Box 7-1, Micrococcal Nuclease and the DNA Associated with the Nucleosome). The ~147 base pair length of this DNA is an invariant feature of nucleosomes in all eukaryotic cells. In contrast, the length of the linker DNA between nucleosomes is variable. Typically this distance is 20-60 bp and each eukaryote has a characteristic average linker DNA length (Table 7-4). The difference in average linker DNA length is likely to reflect the differences in the nature of larger structures formed by nucleosomal DNA
FIGURE 7-18 DNA packaged into nucleosomes. (a) Schematic of the packaging and organization of nucleosomes, showing nucleosomes separated by linker DNA (20-60 bp). (b) Crystal structure of a nucleosome showing DNA wrapped around the histone protein core. H2A is shown in red, H2B in yellow, H3 in purple, and H4 in green. Note that the colors of the different histone proteins here and in following structures are the same. (Luger K., Mader A.W., Richmond R.K., Sargent D.F., and Richmond T.J. 1997. Nature 389: 251-260.) Image prepared with BobScript, MolScript, and Raster 3D.
Box 7-1 Micrococcal Nuclease and the DNA Associated with the Nucleosome

Nucleosomes were first purified by treating chromosomes with a sequence-nonspecific nuclease called micrococcal nuclease. The ability of this enzyme to cleave DNA is primarily governed by the accessibility of the DNA. Thus, micrococcal nuclease cleaves protein-free DNA sequences rapidly and protein-associated DNA sequences poorly. Limited treatment of chromosomes with this enzyme results in a nuclease-resistant population of DNA molecules that are associated with histones. These DNA molecules are between 160 and 220 base pairs in length and are associated with two copies each of histones H2A, H2B, H3, and H4. On average, these particles include the DNA tightly associated with the nucleosome as well as one unit of linker DNA. More extensive micrococcal nuclease treatment degrades all of the linker DNA. The remaining minimal nucleosome includes only 147 bp of DNA and is called the nucleosome core particle.

The average length of DNA associated with each nucleosome can be measured in a simple experiment (Box 7-1 Figure 1). Chromatin is treated with the enzyme micrococcal nuclease, but this time only gently. This results in single cuts in some but not all of the linker DNA. After nuclease treatment, the DNA is extracted from all proteins (including the histones) and subjected to gel electrophoresis to separate the DNA by size. Electrophoresis reveals a "ladder" of fragments that are multiples of the average nucleosome-to-nucleosome distance. A ladder of fragments is observed because the micrococcal nuclease-treated chromatin is only partially digested. Thus, sometimes multiple nucleosomes will remain unseparated by digestion, leading to DNA fragments equivalent to all the DNA bound by these nucleosomes. Further digestion would result in all linker DNA being cleaved and the formation of nucleosome core particles and a single ~147 bp fragment.
BOX 7-1 FIGURE 1 Progressive digestion of nucleosomal DNA with MNase. (Panel labels: light digestion with nuclease; more extensive digestion releasing the core particle; gel electrophoresis with bands at 200, 400, 600, and 800 bp.) (Source: Courtesy of R.D. Kornberg.)
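The ladder described in Box 7-1 follows from simple arithmetic: a fragment that spans n uncut nucleosomes is roughly n times the nucleosome repeat length. The following is a minimal Python sketch of that idea, assuming an idealized repeat of 200 bp (a round number chosen for illustration) and perfectly sharp bands. The function names are hypothetical and are not part of any published analysis method.

```python
def ladder_fragment_sizes(repeat_bp, max_nucleosomes):
    """Idealized fragment lengths (bp) after light nuclease digestion.

    Assumption: a fragment spanning n intact nucleosomes is exactly
    n * repeat_bp long. Real bands are broader because cuts fall at
    variable positions within the linker DNA.
    """
    return [n * repeat_bp for n in range(1, max_nucleosomes + 1)]


def estimate_repeat_length(band_sizes_bp):
    """Estimate the repeat length from a ladder of band sizes.

    Assumption: the smallest band holds one nucleosome, the next two,
    and so on, with no missing bands.
    """
    ordered = sorted(band_sizes_bp)
    # Band k (1-based) holds k nucleosomes, so size / k estimates the repeat.
    per_band = [size / k for k, size in enumerate(ordered, start=1)]
    return sum(per_band) / len(per_band)


bands = ladder_fragment_sizes(repeat_bp=200, max_nucleosomes=4)
print(bands)                          # [200, 400, 600, 800]
print(estimate_repeat_length(bands))  # 200.0
```

Averaging size/k over several bands, rather than reading only the smallest band, reduces the effect of a single poorly resolved band on the estimate.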
TABLE 7-4 Average Lengths of Linker DNA in Various Organisms

Species               Nucleosome repeat length (bp)   Average linker DNA length (bp)
S. cerevisiae         160-165                         13-18
Sea urchin (sperm)    ~260                            ~110
D. melanogaster       ~180                            ~33
Human                 185-200                         38-53
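The two data columns of Table 7-4 are related by subtraction: the average linker length is the nucleosome repeat length minus the 147 bp of the core particle. A short Python sketch of that relationship follows. The repeat lengths are the values given in Table 7-4, and the function and variable names are illustrative only.

```python
CORE_PARTICLE_BP = 147  # DNA tightly bound in the nucleosome core particle


def linker_length(repeat_bp):
    """Average linker DNA length implied by a nucleosome repeat length."""
    return repeat_bp - CORE_PARTICLE_BP


# (low, high) repeat lengths in bp, as listed in Table 7-4.
repeat_lengths = {
    "S. cerevisiae": (160, 165),
    "Sea urchin (sperm)": (260, 260),
    "D. melanogaster": (180, 180),
    "Human": (185, 200),
}

for species, (low, high) in repeat_lengths.items():
    print(species, linker_length(low), linker_length(high))
```

For sea urchin sperm the subtraction gives 113 bp, which the table rounds to ~110 because the repeat length itself is only approximate.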
in each organism rather than differences in the nucleosomes themselves (see section on Higher-Order Chromatin Structure). In any cell there are stretches of DNA that are not packaged into nucleosomes. Typically these are regions of DNA engaged in gene expression, replication, or recombination. Although not bound by nucleosomes, these sites are typically associated with non-histone proteins that are either regulating or participating in these events. We will discuss the mechanisms that remove nucleosomes from DNA and maintain such regions of DNA in a nucleosome-free state below and in Chapter 17.
Histones Are Small, Positively-Charged Proteins
Histones are by far the most abundant proteins associated with eukaryotic DNA. Eukaryotic cells commonly contain five abundant histones: H1, H2A, H2B, H3, and H4. Histones H2A, H2B, H3, and H4 are the core histones and form the protein core around which nucleosomal DNA is wrapped. Histone H1 is not part of the nucleosome core particle. Instead, it binds to the linker DNA and is referred to as a linker histone. The four core histones are present in equal amounts in the cell, whereas H1 is half as abundant as the other histones. This is consistent with the finding that only one molecule of H1 is associated with each nucleosome (which contains two copies of each core histone).
Consistent with their close association with the negatively-charged DNA molecule, the histones have a high content of positively-charged amino acids (Table 7-5). Greater than 20% of the residues in each histone are either lysine or arginine. The core histones are also relatively small proteins ranging in size from 11 to 15 kilodaltons (kd), whereas histone H1 is about 20 kd.
The protein core of the nucleosome is a disc-shaped structure that assembles in an ordered fashion only in the presence of DNA. Without DNA, the core histones form intermediate assemblies in solution. A conserved region found in every core histone, called the histone-fold domain, mediates the assembly of these histone-only intermediates (Figure 7-19). The histone fold is composed of three α-helical regions separated by two short unstructured loops. In each case the histone fold mediates the formation of head-to-tail heterodimers of specific pairs of histones. H3 and H4 histones first form heterodimers that then come together to form a tetramer with two molecules each of H3 and H4. In contrast, H2A and H2B form heterodimers in solution but not tetramers.

TABLE 7-5 General Properties of the Histones

Histone type      Histone   Molecular weight (Mr)   % of Lysine and Arginine
Core histones     H2A       14,000                  20%
                  H2B       13,900                  22%
                  H3        15,400                  23%
                  H4        11,400                  24%
Linker histone    H1        20,800                  32%

FIGURE 7-19 The core histones share a common structural fold. (a) The four histones are diagramed as linear molecules. The regions of the histone fold motif that form α helices are indicated as cylinders. Note that there are adjacent regions of each histone that are structurally distinct, including additional α-helical regions. (b) The helical regions of two histones (here H2A and H2B) come together to form a dimer. H3 and H4 also use a similar interaction to form H3₂·H4₂ tetramers. (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 209, fig 4-26. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)

The assembly of a nucleosome involves the ordered association of these building blocks with DNA (Figure 7-20). First, the H3·H4 tetramer binds to DNA; then two H2A·H2B dimers join the H3·H4-DNA complex to form the final nucleosome (see Figure 7-18). We will discuss how this assembly process is accomplished in the cell later in the chapter.
The core histones each have an N-terminal extension, called a "tail," because it lacks a defined structure and is accessible within the intact nucleosome. This accessibility can be detected by treatment of nucleosomes with the protease trypsin (which specifically cleaves proteins after positively-charged amino acids). Treatment of nucleosomes with trypsin rapidly removes the accessible N-terminal tails of the histones but cannot cleave the tightly packed histone-fold regions (Figure 7-21). The exposed N-terminal tails are not required for the association of DNA with the histone octamer, as the DNA is still tightly associated with the nucleosome after protease treatment. Instead, the tails are the sites of extensive modifications that alter the function of individual nucleosomes. These modifications include phosphorylation, acetylation, and methylation on serine and lysine residues. We will return to the role of histone tail modification in nucleosome function later. Now, we turn to the detailed structure of the nucleosome.
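The trypsin experiment described above relies on a simple sequence rule: the protease cuts after positively-charged residues (lysine, K, and arginine, R). The following Python sketch shows that rule applied to a short peptide. The sequence is a hypothetical toy peptide, not a real histone tail, and the sketch ignores refinements such as reduced cleavage before proline and, most importantly, the fact that folded regions are physically inaccessible to the enzyme.

```python
BASIC_RESIDUES = {"K", "R"}  # lysine and arginine, one-letter codes


def tryptic_cleavage_sites(peptide):
    """Return 1-based positions of residues after which trypsin would cut.

    Simplification: every K or R is treated as cleavable, except at the
    very C-terminus, where there is no following bond to cut.
    """
    return [
        position
        for position, residue in enumerate(peptide, start=1)
        if residue in BASIC_RESIDUES and position < len(peptide)
    ]


def tryptic_fragments(peptide):
    """Split a peptide into the fragments produced by complete digestion."""
    fragments = []
    start = 0
    for site in tryptic_cleavage_sites(peptide):
        fragments.append(peptide[start:site])
        start = site
    fragments.append(peptide[start:])
    return fragments


toy_tail = "AGKGGRAAKTL"  # hypothetical sequence for illustration only
print(tryptic_cleavage_sites(toy_tail))  # [3, 6, 9]
print(tryptic_fragments(toy_tail))       # ['AGK', 'GGR', 'AAK', 'TL']
```

Because the tails are rich in lysine and arginine and lack a defined structure, a rule like this predicts many accessible cut sites there, which is consistent with their rapid removal.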
The Atomic Structure of the Nucleosome
The high-resolution three-dimensional structure of the nucleosome core particle (Figure 7-18b, 147 bp of DNA plus an intact histone
FIGURE 7-20 The assembly of a nucleosome. ... Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)
FIGURE 7-21 The N-terminal tails of the core histones are accessible to proteases. Treatment of nucleosomes with limiting amounts of proteases that cleave after basic amino acids (for example, trypsin) specifically removes the N-terminal "tails," leaving the histone core intact.
octamer) has revealed much about how it functions. The high affinity of the nucleosome for DNA, the distortion of the DNA when bound to the nucleosome, and the lack of DNA sequence specificity can each be explained by the nature of the interactions between the histones and the DNA. The structure also sheds light on the function and location of the N-terminal tails. Finally, the interaction between the DNA and the histone octamer allows an understanding of the dynamic nature of the nucleosome and the process of nucleosome assembly.
Although not perfectly symmetrical, the nucleosome has an approximate twofold axis of symmetry, called the dyad axis. This can be visualized by thinking of the face of the octamer disc as a clock with the midpoint of the 147 bp of DNA located at the 12 o'clock position (Figure 7-22). This places the ends of the DNA just short of 11 and 1 o'clock. A line drawn from 12 to 6 o'clock through the middle of the disc defines the dyad axis. Rotation of the nucleosome around this axis by 180° reveals a nearly identical view of the nucleosome to that observed prior to rotation.
The H3·H4 tetramers and H2A·H2B dimers each interact with a particular region of the DNA within the nucleosome (Figure 7-23). Of the 147 base pairs of DNA included in the structure, the histone-fold regions of the H3·H4 tetramer interact with the central 60 base pairs. The N-terminal region of H3 most proximal to the histone-fold region forms a fourth α helix that interacts with the final 13 bp at each end of the bound DNA (this region is distinct from the unstructured H3 N-terminal tail described above). If we picture the nucleosome with a clock face as described above, the H3·H4 tetramer forms the top half of the histone octamer. Importantly, histone H3·H4 tetramers occupy a key position in the nucleosome by binding the middle and both ends of the DNA. The two H2A·H2B dimers each associate with approximately 30 bp of DNA on either side of the central 60 bp of DNA bound by H3 and H4. Using the clock analogy again, the DNA associated with H2A·H2B is located from approximately 5 to 9 o'clock on either face of the nucleosome disc. Together, the two H2A·H2B dimers form the bottom part of the histone octamer located across the disc from the DNA ends.
The extensive interactions between the H3·H4 tetramer and the DNA help to explain the ordered assembly of the nucleosome (Figure 7-24). H3·H4 tetramer association with the middle and ends of the bound DNA would result in the DNA being extensively bent and constrained, making the association of H2A·H2B dimers relatively easy. In contrast, the relatively short length of DNA bound by H2A·H2B dimers is not sufficient to prepare the DNA for H3·H4 tetramer binding. This more limited association of H2A·H2B dimers has been hypothesized to facilitate their release as nucleosomal DNA is transcribed. Such a mechanism would allow RNA polymerase increased access to nucleosomal DNA during transcription.
Many DNA Sequence-Independent Contacts Mediate the Interaction between the Core Histones and DNA
A closer look at the interactions between the histones and the nucleosomal DNA reveals the structural basis for the binding and bending of the DNA within the nucleosome. Fourteen distinct sites of contact are observed, one for each time the minor groove of the DNA faces the
FIGURE 7-22 The nucleosome has an approximate twofold axis of symmetry. Three views of the atomic structure of the nucleosome are shown. Each shows a 90° rotation around the axis between the 12 and 6 o'clock positions of the view shown in Figure 7-21a. Note that a 180° rotation reveals a structure nearly identical to the original ...
FIGURE 7-23 Interactions of the histones with nucleosomal DNA. (a) H3·H4 bind the middle and the ends of the DNA. The DNA bound by the H3·H4 tetramer is shown in turquoise. (b) H2A·H2B bind 30 bp of DNA on one side of the nucleosome. The DNA bound by the H2A·H2B dimer is shown in orange. (Luger K., Mader A.W., Richmond R.K., Sargent D.F., and Richmond T.J. ...
FIGURE 7-24 Nucleosome lacking H2A and H2B. The H2A and H2B histones have been artificially removed ...
histone octamer (Figure 7-25). The association of DNA with the nucleosome is mediated by a large number (~140) of hydrogen bonds between the histones and the DNA. The majority of these hydrogen bonds are between the proteins and the oxygen atoms in the phosphodiester backbone near the minor groove of the DNA. Only seven hydrogen bonds are made between the protein side chains and the bases in the minor groove of the DNA. The large number of these hydrogen bonds (a typical sequence-specific DNA-binding protein only has about 20 hydrogen bonds with DNA) provides the driving force to bend the DNA. The highly basic nature of the histones also serves to mask the negative charge of the phosphates that would ordinarily resist DNA bending, which brings the phosphates on the inside of the bend into unfavorably close proximity. The basic nature of the histones also facilitates the close juxtaposition of the two adjacent DNA helices necessary to wrap the DNA more than once around the histone octamer.
The finding that all the sites of contact between the histones and the DNA involve either the minor groove or the phosphate backbone is consistent with the non-sequence-specific nature of the association
FIGURE 7-25 The sites of contact between the histones and the DNA. For clarity, only the interactions between a single H3·H4 dimer are shown. A subset of the parts of the histones that interact with the DNA are highlighted in red. Note that these regions cluster around the minor groove of the DNA. (Luger K., Mader A.W., Richmond R.K., Sargent D.F., and ...
of the histone octamer with DNA. Neither the phosphate backbone nor the minor groove is rich in base-specific information. Moreover, of the seven hydrogen bonds formed with the bases in the minor groove, none are with elements that distinguish between G:C and A:T base pairs (see Chapter 6, Figure 6-10).
The Histone N-Terminal Tails Stabilize DNA Wrapping around the Octamer
The structure of the nucleosome also tells us something about the histone N-terminal tails. The four H2B and H3 tails emerge from between the two DNA helices. Their path of exit is formed by two adjacent minor grooves, making a "gap" between the two DNA helices just big enough for a polypeptide chain (Figure 7-26a). Strikingly, the H2B and H3 tails emerge at approximately equal distances from one another around the octamer disc (at approximately 1 o'clock and 11 o'clock for the H3 tails and 4 o'clock and 8 o'clock for H2B). The H4 and H2A tails emerge from either the "top" or "bottom" face of the octamer and are located at 3 o'clock and 9 o'clock for H4 and 5 o'clock and 7 o'clock for H2A (Figure 7-26b). By emerging both between and on either side of the DNA helices, the histone tails serve as the grooves of a screw, directing the DNA to wrap around the histone octamer disc in a left-handed manner. As we discussed in Chapter 6, the left-handed nature of the DNA wrapping introduces negative supercoils in the DNA. The parts of the tails most proximal to the histone disc (and therefore not subject to the protease cleavage discussed above) also make some of the many hydrogen bonds between the histones and the DNA as they pass by the DNA.
FIGURE 7-26 The histone tails emerge from the core of the nucleosome at specific positions. (a) The side view illustrates that the H3 and H2B tails emerge from between the two DNA helices. In contrast, the H4 and H2A tails emerge either above or below both DNA helices. (Luger K., Mader A.W., Richmond R.K., Sargent D.F., and Richmond T.J. 1997. Nature 389: 251-260.) Image prepared with GRASP. (b) The position of the tails relative to the entry and exit of the DNA is shown here. This view reveals that the histone tails emerge at numerous positions relative to the DNA. (Davey C.A., Sargent D.F., Luger K., Mader A.W., and Richmond T.J. 2002. J. Mol. Biol. 319: 1097-1113.) Image prepared with BobScript, MolScript, and Raster 3D.
HIGHER-ORDER CHROMATIN STRUCTURE
Histone H1 Binds to the Linker DNA between Nucleosomes
Once nucleosomes are formed, the next step in the packaging of DNA is the binding of histone H1. Like the core histones, H1 is a small, positively-charged protein (Table 7-5). H1 interacts with the linker DNA between nucleosomes, further tightening the association of the DNA with the nucleosome. This can be detected by the increased protection of nucleosomal DNA from micrococcal nuclease digestion. Thus, in contrast to the 147 bp protected by the core histones, addition of histone H1 to a nucleosome protects an additional 20 bp of DNA from micrococcal nuclease digestion.
Histone H1 has the unusual property of binding two distinct regions of the DNA duplex. Typically, these two regions are part of the same DNA molecule associated with a nucleosome (Figure 7-27). The sites of H1 binding are located asymmetrically relative to the nucleosome. One of the two regions bound by H1 is the linker DNA at one end of the nucleosome. The second site of DNA binding is in the middle of the associated 147 bp (the only DNA duplex present at the dyad axis). Thus, the additional DNA protected from nuclease digestion described above is restricted to linker DNA on only one side of the nucleosome. By bringing these two regions of DNA into close proximity, H1 binding increases the length of the DNA wrapped tightly around the histone octamer.
H1 binding produces a more defined angle of DNA entry and exit from the nucleosome (Figure 7-28). This effect, which can be visualized in the electron microscope, results in the nucleosomal DNA taking on a distinctly zigzag appearance. The angles of entry and exit vary substantially depending on conditions (including salt concentration, pH, and the presence of other proteins). If we assume those angles are approximately 20° relative to the dyad axis, this would result in a pattern in which nucleosomes would alternate on either side of a central region of linker DNA bound by histone H1 (Figure 7-29).

FIGURE 7-27 Histone H1 binds two DNA helices. Upon interacting with a nucleosome, histone H1 binds to the linker DNA at one end of the nucleosome and the central DNA helix of the nucleosome-bound DNA (the middle of the 147 bp bound by the core histone octamer).
FIGURE 7-28 The addition of H1 leads to more compact nucleosomal DNA. The two images ... the presence of histone H1. (Source: Thoma et al. Involvement of histone H1 in the organization of the nucleosome. J. Cell Biology 83: 410, figs 4 & 6.)
FIGURE 7-29 Histone H1 induces tighter DNA wrapping around the nucleosome. The two illustrations show a comparison of the wrapping of DNA around the nucleosome in the presence and absence of histone H1. One histone H1 can associate with each nucleosome. Histone H1 binds to both linker DNA and the DNA helix located in the middle of the nucleosome-bound DNA.

Nucleosome Arrays Can Form More Complex Structures: the 30-nm Fiber
Binding of H1 stabilizes higher-order chromatin structures. In the test tube, as salt concentrations are increased, the addition of histone H1 results in the nucleosomal DNA forming a 30-nm fiber. This structure, which can also be observed in vivo, represents the next level of DNA compaction. More importantly, the incorporation of DNA into this fiber makes the DNA less accessible to many DNA-dependent enzymes (such as RNA polymerases).
There are two models for the structure of the 30-nm fiber. In the solenoid model, the nucleosomal DNA forms a superhelix containing approximately six nucleosomes per turn (see Figure 7-18a). This structure is supported by both EM and X-ray diffraction studies, which indicate that the 30-nm fiber has a helical pitch of approximately 11 nm. This is also the approximate diameter of the nucleosome disc, suggesting that the 30-nm fiber is composed of nucleosome discs stacked on edge in the form of a helix (Figure 7-30a). In this model, the flat surfaces on either face of the histone octamer disc are adjacent to each other and the DNA surface of the nucleosomes forms the outside accessible surface of the superhelix. The linker DNA is buried in the center of the superhelix, but it never passes through the axis of the fiber. Rather, the linker DNA circles around the central axis as the DNA moves from one nucleosome to the next.
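The solenoid geometry described above (approximately six nucleosomes per turn and a helical pitch of approximately 11 nm) implies a rough compaction factor for the 30-nm fiber. The Python sketch below works through that estimate. It assumes a nucleosome repeat of 200 bp (the upper end of the human range in Table 7-4) and a rise of 0.34 nm per base pair for extended B-form DNA, which is a standard value not stated in this section. The function name is illustrative, and the result is an order-of-magnitude estimate only.

```python
RISE_NM_PER_BP = 0.34  # rise per base pair of B-form DNA (assumed standard value)


def solenoid_compaction(repeat_bp, nucleosomes_per_turn=6, pitch_nm=11.0):
    """Ratio of extended DNA length to fiber length for one solenoid turn.

    Assumptions: every nucleosome carries exactly repeat_bp of DNA, and
    one turn of the solenoid advances the fiber by pitch_nm.
    """
    extended_nm = repeat_bp * nucleosomes_per_turn * RISE_NM_PER_BP
    return extended_nm / pitch_nm


# 6 nucleosomes x 200 bp x 0.34 nm/bp = 408 nm of DNA per 11 nm of fiber.
print(round(solenoid_compaction(200), 1))  # 37.1
```

Under these assumptions the linear length of the DNA is reduced by a factor in the high thirties, which shows how strongly the result depends on the repeat length and the number of nucleosomes per turn.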
FIGURE 7-30 Two models for the 30-nm chromatin fiber. (a) The solenoid model. Note that the linker DNA does not pass through the central axis of the superhelix and that the sides and entry and exit points of the nucleosomes are relatively inaccessible. (b) The "zigzag" model. In this model, the linker DNA frequently passes through the central axis of the fiber and the sides and even the entry and exit points are more accessible. (Source: Pollard T. and Earnshaw W. 2002. Cell biology, 1st edition, p. 202, f 13-6. Copyright © 2002. Reproduced by permission of W.B. Saunders Inc.)
An alternative model for the 30-nm fiber is the "zigzag" model (Figure 7-30b). This model is based on the zigzag pattern of nucleosomes formed upon H1 addition. In this case, the 30-nm fiber is a compacted form of these zigzag nucleosome arrays. Analysis of the spring-like nature of isolated 30-nm fibers supports this zigzag model. Unlike the solenoid model, the zigzag conformation requires the linker DNA to pass through the central axis of the fiber in a relatively straight form (see Figure 7-30b). Thus, longer linker DNA favors this conformation. Because the average linker DNA varies between different species (see Table 7-4), the form of the 30-nm fiber may not always be the same.
The Histone N-Terminal Tails Are Required for the Formation of the 30-nm Fiber
Core histones lacking their N-terminal tails are incapable of forming the 30-nm fiber. The most likely role of the tails is to stabilize the 30-nm fiber by interacting with adjacent nucleosomes. This model is supported by the three-dimensional structure of the nucleosome, which shows that the amino terminal tails of H2A, H3, and H4 each interact with adjacent nucleosomes in the crystal lattice (Figure 7-31). For example, the histone H4 N-terminus makes multiple hydrogen bonds with H2A and H2B on the surface of an adjacent nucleosome in the crystal. The residues of H2A and H2B that interact with the H4 tail are conserved across many eukaryotic organisms but are not involved in DNA binding or formation of the histone octamer. One possibility is that these regions of H2A and H2B are conserved to mediate internucleosomal interactions with the H4 tail. As we shall see below, the histone tails are frequent targets for modification in the cell. It is likely that these modifications influence the ability to form the 30-nm fiber and other higher-order nucleosome structures.

Further Compaction of DNA Involves Large Loops of Nucleosomal DNA
Together, the packaging of DNA into nucleosomes and the 30-nm fiber results in the compaction of the linear length of DNA by approximately 40-fold. This is still insufficient to fit 1-2 meters of DNA into a nucleus approximately 10⁻⁵ meters across. Additional folding of the 30-nm fiber is required to compact the DNA further. Although the exact nature of this folded structure remains unclear, one popular model proposes that the 30-nm fiber forms loops of 40-90 kb that are held together at their bases by a proteinaceous structure referred to as the nuclear scaffold (Figure 7-32). A variety of methods have been developed to identify proteins that are part of this structure, although the true nature of the nuclear scaffold remains mysterious.
Two classes of proteins that contribute to the nuclear scaffold have been identified. One of these is topoisomerase II (Topo II), which is abundant in both scaffold preparations and purified mitotic chromosomes. Treating cells with drugs that result in DNA breaks at the sites of Topo II DNA binding generates DNA fragments that are about 50 kb in size. This is similar to the size range observed for limited nuclease digestion of chromosomes and suggests that Topo II may be part of the mechanism that holds the DNA at the base of these loops.
The SMC proteins are also abundant components of the nuclear scaffold. As we discussed earlier (see section on Chromosome Duplication and Segregation), these proteins are key components of the machinery that condenses and holds daughter chromosomes together after chromosome duplication. The associations of these proteins with the nuclear scaffold may serve to enhance their functions by providing an underlying foundation for their interactions with chromosomal DNA.

Histone Variants Alter Nucleosome Function
The core histones are among the most conserved eukaryotic proteins; therefore, the nucleosomes formed by these proteins are very similar in all eukaryotes (Figure 7-33a). But there are several histone variants found in eukaryotic cells. Such unorthodox histones can replace one of the four standard histones to form alternate nucleosomes. Such nucleosomes may serve to demarcate particular regions of chromosomes or confer specialized functions to the nucleosomes into which they are incorporated. For example, H2A.z is a variant of H2A that is widely distributed in eukaryotic nucleosomes and is generally associated with transcribed regions of DNA. There is little change in the overall structure of a nucleosome containing this variant histone. Instead, the presence of the H2A.z histone inhibits nucleosomes from forming repressive chromatin structures, creating regions of easily accessible chromatin that are more compatible with transcription.

FIGURE 7-31 A speculative model for the stabilization of the 30-nm fiber by histone N-terminal tails. In this model the 30-nm fiber is illustrated using the "zigzag" model. Several different tail-histone core interactions are possible. Here the interactions are shown as between every alternate histone but they could also be with adjacent or more distant histones.
FIGURE 7-32 The higher-order structure of chromatin. (a) A transmission electron micrograph shows chromatin emerging from a central structure of a chromosome. The electron-dense regions are the nuclear scaffold that acts to organize the large amounts of DNA found in eukaryotic chromosomes. The bar represents 200 nm. (b) A model for the structure of a eukaryotic chromosome shows that the majority of the DNA is packaged into large loops of 30-nm fiber that are tethered to the nuclear scaffold at their base. Sites of active DNA manipulation (for example, sites of transcription or DNA replication) are in the form of 10-nm fiber or even naked DNA. (Source: (a) Courtesy of J.R. Paulson and U.K. Laemmli.)
FIGURE 7-33 Alteration of chromatin by incorporation of histone variants. (a) Transition between 10-nm and 30-nm fibers for standard histones. (b) Incorporation of CENP-A in place of histone H3 is proposed to act as a binding site for one or more components of the kinetochore.
A second histone variant, CENP-A, is associated with nucleosomes that include centromeric DNA. In this chromosomal region, CENP-A replaces the histone H3 subunits in nucleosomes. These nucleosomes are incorporated into the kinetochore, which mediates attachment of the chromosome to the mitotic spindle (see Figure 7-12). Compared to H3, CENP-A includes a substantial extension of the N-terminal tail region. Thus, like nucleosomes with H2A.z, it is unlikely that incorporation of CENP-A changes the core structure of the nucleosome. Instead, the extended tail of CENP-A may generate novel binding sites for other protein components of the kinetochore (Figure 7-33b). Given the critical role of the histone N-termini in the formation of higher-order chromatin structures, these changes may alter the interactions between nucleosomes at the centromere/kinetochore as well.
REGULATION OF CHROMATIN STRUCTURE
The Interaction of DNA with the Histone Octamer Is Dynamic
As we will learn in detail in Chapter 17, the incorporation of DNA into nucleosomes can have a profound impact on the expression of the genome. In many instances it is critical that nucleosomes can be moved or that their grip on the DNA can be loosened to allow access to particular regions of DNA. Consistent with this requirement, the association of the histone octamer with the DNA is inherently dynamic. In addition, there are factors that act on the nucleosome to increase or decrease the dynamic nature of this association. Together, these properties allow changes in nucleosome position and DNA association in response to the frequently changing needs for DNA accessibility.
Like all interactions mediated by noncovalent bonds, the association of any particular region of DNA with the histone octamer is not permanent: any individual region of the DNA will transiently be released from
FIGURE 7-34 A model for gaining access to nucleosome-associated DNA. Studies of the ability of sequence-specific DNA-binding proteins to bind nucleosomes suggest that unwrapping of the DNA from the nucleosome is responsible for accessibility of the DNA. Thus, DNA sites closest to the entry and exit points are the most accessible and sites closest to the midpoint of the bound DNA are least accessible.
tight interaction with the octamer now and then. This release is analogous to the occasional opening of the DNA double helix (as we discussed in Chapter 6). The dynamic nature of DNA binding to the histone core structure is important, because many DNA-binding proteins strongly prefer histone-free DNA. Such proteins can only recognize their binding site when it is released from the histone octamer or is contained in linker or nucleosome-free DNA.
As a result of intermittent, spontaneous unwrapping of DNA from the nucleosome, a protein can gain access to its DNA-binding sites with a probability of 1 in 1,000 to 1 in 100,000, depending on where the binding site is within the nucleosome. The more central the binding site, the less frequently it is accessible. Thus, a binding site near position 73 of the 147 base pairs tightly associated with a nucleosome is least frequently accessible, whereas binding sites near the ends (positions 1 or 147) of the nucleosomal DNA are most frequently accessible. These findings indicate that the mechanism of exposure is due to unwrapping of the DNA from the nucleosome, rather than to the DNA briefly coming off the surface of the histone octamer (Figure 7-34). It is important to note that these studies were performed on a population of individual nucleosomes in a test tube: the ability of DNA to unwrap from the nucleosome may be different for the large nucleosomal arrays in the cell.
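The position dependence described above can be pictured with a toy model. The Python sketch below interpolates the accessibility probability between the two figures quoted in the text (about 1 in 1,000 near the ends and about 1 in 100,000 near the midpoint). The log-linear interpolation is purely an assumption made for illustration; the text gives only the two extremes and the trend, not the shape of the curve, and the function names are hypothetical.

```python
import math

CORE_BP = 147     # base pairs tightly associated with the nucleosome
P_END = 1e-3      # approximate accessibility at the DNA ends (from the text)
P_CENTER = 1e-5   # approximate accessibility at the midpoint (from the text)


def distance_from_nearest_end(position):
    """Number of base pairs between a position (1-147) and the closer DNA end."""
    if not 1 <= position <= CORE_BP:
        raise ValueError("position must be between 1 and 147")
    return min(position - 1, CORE_BP - position)


def accessibility(position):
    """Toy estimate of the probability that a site is transiently exposed.

    ASSUMPTION: log10(probability) falls linearly with distance from the
    nearest end. This only encodes the unwrapping-from-the-ends trend.
    """
    max_distance = (CORE_BP - 1) // 2  # 73 bp from either end at the midpoint
    fraction = distance_from_nearest_end(position) / max_distance
    log_end = math.log10(P_END)
    log_center = math.log10(P_CENTER)
    return 10 ** (log_end + fraction * (log_center - log_end))


for site in (1, 37, 73, 147):
    print(site, accessibility(site))
```

A model in which the DNA lifted off the octamer surface at random would instead predict roughly equal accessibility at every position, which is why the observed dependence on position supports unwrapping from the ends.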
Nucleosome Remodeling Complexes Facilitate Nucleosome Movement
The stability of the histone octamer-DNA interaction is influenced by large protein complexes referred to as nucleosome remodeling complexes. These multi-protein complexes facilitate changes in nucleosome location or interaction with the DNA using the energy of ATP hydrolysis. These changes can come in three flavors: (1) "sliding" of the histone octamer along the DNA (Figure 7-35a), (2) the complete "transfer" of a histone octamer from one DNA molecule to another (Figure 7-35b), or (3) the "remodeling" of the nucleosome to allow increased access to the DNA (Figure 7-35c).
FIGURE 7-35 Nucleosome movement catalyzed by nucleosome remodeling activities. (a) Nucleosome movement by sliding along a DNA molecule exposes sites for DNA-binding proteins. (b) Nucleosome movement can alternatively occur by transfer of the nucleosome from one strand of DNA to another. (c) Remodeling allows association of a DNA-binding protein without altering its position on DNA.
All nucleosome remodeling complexes can facilitate nucleosome sliding; however, only a subset have the ability to transfer or remodel nucleosomes without altering their position on the DNA. The exact structural alterations of the nucleosome that lead to remodeling are not clear. Nevertheless, it is clear that the DNA associated with these "remodeled" nucleosomes is more accessible.
There are multiple types of nucleosome remodeling complexes in any given cell (Table 7-6). They can have as few as two subunits or more than 10 subunits. Although the ATP-hydrolyzing subunit is relatively well-conserved among these different complexes, the addition of different subunits can modulate function. For example, these complexes can include subunits that target them to particular chromosomal locations. In some instances, this targeting is mediated by interactions between subunits of the remodeling complex and DNA-bound transcription factors (Figure 7-36). In other instances, localization can be mediated through interactions with specific modifications of the histone subunits themselves (via chromo- or bromodomains, as we shall see below).
TABLE 7-6 Nucleosome Remodeling and Modifying Complexes
ATP-Dependent Chromatin Remodeling Complexes
Type        Number of subunits   Slide   Transfer   Restructure   Bromodomain/Chromodomain
SWI/SNF     8-11                 Yes     Yes        Yes           Bromodomain
ISWI        2-4                  Yes     No         No            No
Mi2/NuRD    8-10                 Yes     No         No            Chromodomain
FIGURE 7-36 Two modes of DNA-binding protein-dependent nucleosome positioning. (a) Association of many DNA-binding proteins with DNA is incompatible with the association of the same DNA with the histone octamer. Because a nucleosome requires more than 147 bp of DNA to form, if two such factors bind to the DNA less than this distance apart, the intervening DNA cannot assemble into a nucleosome. (b) A subset of DNA-binding proteins have the ability to bind to nucleosomes. Once bound to DNA, such proteins will facilitate the assembly of nucleosomes immediately adjacent to the protein's DNA-binding site. [Panel labels: (a) nucleosome-free; (b) nucleosome assembly, positioned nucleosome.]
Some Nucleosomes Are Found in Specific Positions in vivo: Nucleosome Positioning
Because of their dynamic interactions with DNA, most nucleosomes are not fixed in their locations. But there are occasions when restricting nucleosome location, or positioning nucleosomes as it is called, is beneficial. Typically, positioning a nucleosome allows the DNA-binding site for a regulatory protein to remain in the accessible linker DNA region. In many instances, such nucleosome-free regions are larger to allow extensive regulatory regions to remain accessible.
Nucleosome positioning can be directed by DNA-binding proteins or particular DNA sequences. In the cell, the most frequent method involves competition between nucleosomes and DNA-binding proteins. Just as many proteins cannot bind to DNA within a nucleosome, prior binding of a protein to a site on DNA can prevent association of the core histones with that stretch of DNA. If two such DNA-binding proteins are bound to sites positioned closer than the minimal region of DNA required to assemble a nucleosome (~150 bp), the DNA between the proteins will remain nucleosome-free (Figure 7-36a). Binding of additional proteins to adjacent DNA can further increase the size of a nucleosome-free region. In addition to this inhibitory mechanism of protein-dependent nucleosome positioning, some DNA-binding proteins interact tightly with adjacent nucleosomes, leading to nucleosomes preferentially assembling immediately adjacent to these proteins (Figure 7-36b).
A second method of nucleosome positioning involves particular DNA sequences that have a high affinity for the nucleosome. Because DNA bound in a nucleosome is bent, nucleosomes preferentially form on DNA that bends easily. A:T-rich DNA has an intrinsic tendency to bend toward the minor groove. Thus, A:T-rich DNA is favored in positions in which the minor groove faces the histone octamer. G:C-rich DNA has the opposite tendency and, therefore, is favored when the minor groove is facing away from the histone octamer (Figure 7-37). Each nucleosome will try to maximize this arrangement of A:T-rich and G:C-rich sequences. It is important to note that such alternating stretches of A:T-rich and G:C-rich DNA are rare. More importantly, despite being favored, such unusual sequences are not required for nucleosome assembly.
These mechanisms of nucleosome positioning influence the organization of nucleosomes in the genome. Despite this, the majority of nucleosomes are not tightly positioned. As you will learn in the chapters on eukaryotic transcription (Chapters 12 and 17), tightly positioned nucleosomes are most often found at sites directing the initiation of transcription. Although we have discussed positioning primarily as a method to ensure that a regulatory DNA sequence is accessible, a positioned nucleosome can just as easily prevent access to specific DNA sites by being positioned in a manner that overlaps the same sequence. Thus, positioned nucleosomes can have both positive and negative effects on the accessibility of nearby DNA sequences. An approach to mapping nucleosome locations is described in Box 7-2, Determining Nucleosome Position in the Cell.
FIGURE 7-37 Nucleosomes prefer to bind bent DNA. Specific DNA sequences can position nucleosomes. Because the DNA is bent severely during association with the nucleosome, DNA sequences that position nucleosomes are intrinsically bent. A:T base pairs have an intrinsic tendency to bend toward the minor groove and G:C base pairs have the opposite tendency. Sequences that alternate between A:T- and G:C-rich sequences with a periodicity of ~5 bp will act as preferred nucleosome binding sites. (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 211, f4-28. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.) [Figure labels: G:C-rich; histone octamer.]
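The competition between bound proteins and nucleosomes described above can be illustrated with a small calculation. The following is only a sketch: the function name `nucleosome_free_gaps`, the example footprint coordinates, and the use of 147 bp as a hard cutoff are assumptions made for illustration, and real footprints and thresholds will vary.

```python
# Minimal DNA length needed to wrap one histone octamer (from the chapter);
# treating it as a sharp cutoff is an assumption made for this sketch.
MIN_NUCLEOSOME_BP = 147


def nucleosome_free_gaps(sites, min_bp=MIN_NUCLEOSOME_BP):
    """Return the gaps between neighbouring bound proteins that are too
    short to assemble a nucleosome.

    `sites` is a list of (start, end) protein footprints in bp, with `end`
    exclusive. Each returned (start, end) pair is a stretch of DNA that is
    predicted to stay nucleosome-free.
    """
    ordered = sorted(sites)
    gaps = []
    for (_, left_end), (right_start, _) in zip(ordered, ordered[1:]):
        gap = right_start - left_end
        if 0 < gap < min_bp:
            gaps.append((left_end, right_start))
    return gaps


# Hypothetical footprints of three DNA-binding proteins.
sites = [(100, 120), (220, 240), (600, 620)]
print(nucleosome_free_gaps(sites))  # [(120, 220)]
```

In this example the first two proteins sit 100 bp apart, so the DNA between them cannot accommodate a nucleosome, whereas the 360-bp stretch before the third protein can.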
Modification of the N-Terminal Tails of the Histones Alters Chromatin Accessibility
When histones are isolated from cells, their N-terminal tails are typically modified with a variety of small molecules (Figure 7-38). Lysines in the tails are frequently modified with acetyl groups or methyl groups, and serines are subject to modification with phosphate. Typically, acetylated nucleosomes are associated with regions of the chromosomes that are transcriptionally active and deacetylated nucleosomes are associated with transcriptionally repressed chromatin. Unlike acetylation, methylation of different parts of the N-terminal tails is associated with both repressed and active chromatin, depending on the particular amino acid that is modified in the histone tail. Phosphorylation of the N-terminal tail of histone H3 is commonly observed in the highly condensed chromatin of mitotic chromosomes. It has been proposed that these modifications result in a "code" that can be read by the proteins involved in gene expression and other DNA transactions (Figure 7-38).
How does histone modification alter nucleosome function? One obvious change is that acetylation and phosphorylation each act to reduce the overall positive charge of the histone tails: acetylation of lysine neutralizes its positive charge (Figure 7-39), and phosphorylation of serine adds negative charge. This loss of positive charge reduces the affinity of the tails for the negatively charged backbone of the DNA. Equally important, modification of the histone tails affects the ability of nucleosome arrays to form more repressive higher-order chromatin structure. As we described above, histone N-terminal tails are required to form the 30-nm fiber, and modification of the tails modulates this function. For example, consistent with the association of acetylated histones with expressed regions of the genome, nucleosomes with this modification are significantly less likely to participate in the formation of the repressive 30-nm fiber.
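The charge argument above can be made concrete with a rough tally. The sketch below is illustrative only: it counts side-chain charges at roughly neutral pH, ignores the N-terminal amino group, and assigns phosphoserine a charge of about -2; these values, the function name `tail_charge`, and the chosen modification sites are assumptions for the example rather than figures from the text. The sequence is the first 30 residues of the histone H3 tail.

```python
# First 30 residues of the histone H3 N-terminal tail (1-based positions
# K4, K9, S10, K14, K18, K23, K27, and S28 fall in this stretch).
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAP"

# Approximate side-chain charges near neutral pH (illustrative assumption).
BASE_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

# Charge change contributed by each modification (illustrative assumption).
MOD_DELTA = {
    "acetyl": -1,   # acetyl-lysine loses its +1 charge
    "phospho": -2,  # phosphoserine carries roughly -2
    "methyl": 0,    # methyl-lysine or methyl-arginine keeps its +1 charge
}

# Residues on which each modification is expected in this simple model.
ALLOWED = {"acetyl": "K", "methyl": "KR", "phospho": "S"}


def tail_charge(sequence, mods=None):
    """Return the approximate net charge of a tail.

    `mods` maps 1-based residue positions to a modification name.
    """
    charge = sum(BASE_CHARGE.get(aa, 0) for aa in sequence)
    for pos, mod in (mods or {}).items():
        residue = sequence[pos - 1]
        if residue not in ALLOWED[mod]:
            raise ValueError(f"{mod} not expected on {residue}{pos}")
        charge += MOD_DELTA[mod]
    return charge


print(tail_charge(H3_TAIL))                              # 10
print(tail_charge(H3_TAIL, {9: "acetyl", 14: "acetyl"}))  # 8
print(tail_charge(H3_TAIL, {9: "methyl"}))               # 10
```

Under these assumptions, acetylation lowers the net positive charge while methylation leaves it unchanged, which matches the qualitative point that methylation must act through other means, such as creating binding sites.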
Box 7-2 Determining Nucleosome Position in the Cell
The significance of the location of nucleosomes adjacent to important regulatory sequences has led to the development of methods to monitor the location of nucleosomes in cells. Many of these methods exploit the ability of nucleosomes to protect DNA from digestion by micrococcal nuclease. As described in Box 7-1, micrococcal nuclease has a strong preference to cleave DNA between nucleosomes rather than DNA tightly associated with nucleosomes. This property can be used to map nucleosomes that are associated with the same position throughout a cell population (Box 7-2 Figure 1).
To map nucleosome location accurately, it is important to isolate the cellular chromatin and treat it with the appropriate amount of micrococcal nuclease with minimal disruption of the overall chromatin structure. This is typically achieved by gently lysing cells while leaving the nuclei intact. The nuclei are then briefly treated (typically for 1 minute) with several different concentrations of micrococcal nuclease, a protein small enough to rapidly diffuse into the nucleus. The goal of the titration is for micrococcal nuclease to cleave the region of interest only once in each cell. Once the DNA has been digested, the nuclei can be lysed and all the protein removed from the DNA. The sites of cleavage (and, more importantly, the sites not cle…
To identify the sites of cleavage in a particular region, it is necessary to create a defined end point for all the cleaved fragments and exploit the specificity of DNA hybridization. To create a defined end point, the purified DNA from each sample is cut with a restriction enzyme known to cleave adjacent to the site of interest. After separation by size using agarose gel electrophoresis, the DNA is denatured and transferred to a nitrocellulose membrane. This allows a labeled DNA probe of specific sequence to hybridize to the DNA (this is called a Southern blot and is described in more detail in Chapter 20). In this case, the DNA probe is carefully chosen to hybridize immediately adjacent to the restriction enzyme cleavage site at the site of interest. After hybridization and washing, the DNA probe will show the size of the fragments generated by micrococcal nuclease in the region of interest.
How do the fragment sizes reveal the location of positioned nucleosomes? DNA associated with positioned nucleosomes will be resistant to micrococcal nuclease digestion, leaving an ~160-200 bp region of DNA that is not cleaved. This will appear as a large gap in the ladder of DNA bands detected on the Southern blot. Frequently, there are arrays of positioned nucleosomes leading to a similar 160-200 bp periodicity to sites of cleavage and protection.
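The logic of reading positioned nucleosomes from the Southern blot can be sketched in a few lines. This is a simplified illustration, not a description of an established analysis pipeline: the function name `protected_regions`, the example band sizes, and the 147-bp threshold are assumptions. Because the probe abuts the restriction site, each band size is taken to be the distance from that site to one micrococcal nuclease cut.

```python
def protected_regions(fragment_sizes_bp, min_protected_bp=147):
    """Infer nuclease-protected intervals from band sizes on the blot.

    Each fragment size is the distance (bp) from the restriction site to a
    single micrococcal nuclease cut. A gap between consecutive cut positions
    that is at least `min_protected_bp` long is reported as a candidate
    positioned nucleosome.
    """
    cuts = sorted(set(fragment_sizes_bp))
    regions = []
    for left, right in zip(cuts, cuts[1:]):
        if right - left >= min_protected_bp:
            regions.append((left, right))
    return regions


# Hypothetical band sizes: clusters of cuts in linker DNA separated by
# ~170-bp stretches with no cuts.
sizes = [180, 200, 370, 390, 560, 575]
print(protected_regions(sizes))  # [(200, 370), (390, 560)]
```

With randomly placed nucleosomes, cuts would be spread across the region and no gap of this length would appear, which corresponds to the continuous ladder described in the box.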
BOX 7-2 FIGURE 1 Analysis of nucleosome positioning in the cell. The experimental steps in determining nucleosome positioning in the cell are illustrated. See box text for details. [Figure steps: isolate nuclei; induce double-strand breaks in linker DNA with micrococcal nuclease (MNase cut sites); isolate cleaved DNA and cut with restriction enzyme (RE cleavage sites); separate DNA on agarose gel and perform Southern blot with the Southern probe. Positioned nucleosomes give cleavage at specific regions between positioned nucleosomes; random nucleosomes give cleavage at many positions due to random nucleosome positions.]
FIGURE 7-38 Modifications of the histone N-terminal tails alter the function of chromatin. The sites of known histone modifications are illustrated on each histone. The majority of these modifications occur on the tail regions, but there are occasional modifications within the histone fold. The effects of histone modification are dependent on both the type of modification and the site of modification. The different types of modification observed on the histone H3 and histone H4 N-terminal tails are shown. (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 215, f4-35. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc., and Jenuwein T. and Allis C.D. 2001. Science 293: 1074-1080, figures 2 and 3. Copyright © 2001 American Association for the Advancement of Science. Used with permission.) [Figure labels: histone-fold domain; H2A; H2B; H3; H4. Outcomes shown include gene silencing, gene expression, gene silencing/heterochromatin, chromosome condensation, histone deposition, and transcription elongation.]
In addition to direct effects on nucleosomal function, modification of histone tails also generates binding sites for proteins (Figure 7-39b). Specific protein domains called bromodomains and chromodomains mediate these interactions. Bromodomain-containing proteins interact with acetylated histone tails and chromodomain-containing proteins interact with methylated histone tails. Many of the proteins that contain bromodomains are themselves associated with histone tail-specific acetyl transferases (Table 7-7). Such complexes can facilitate the maintenance of acetylated chromatin by further modifying regions that are already acetylated (as we shall discuss below). The association of chromodomain-containing proteins with histone tail-specific methylating enzymes suggests a similar mechanism for the maintenance of methylated nucleosomes (Table 7-7).
Other bromodomain- and chromodomain-containing proteins are not histone-modifying proteins but instead are proteins involved in regulating transcription or the formation of heterochromatin. For example, a key component of the transcription machinery called TFIID also includes a bromodomain. This domain directs the transcription machinery to sites of nucleosome acetylation, which contributes to the increased transcriptional activity of the DNA associated with acetylated nucleosomes. Similarly, nucleosome-remodeling complexes frequently include subunits with bromodomains (Table 7-7).
FIGURE 7-39 Effects of histone tail modifications. (a) The effect on the association with nucleosome-bound DNA. Unmodified and methylated histone tails are thought to associ… [Figure labels: unmodified; acetylated; methylated; bromodomain protein; chromodomain protein.]
Specific Enzymes Are Responsible for Histone Modification
The histone modifications we have just described are dynamic and are mediated by specific enzymes. Histone acetyl transferases catalyze the addition of acetyl groups to the lysines of the histone N-termini, whereas histone deacetylases remove these modifications. Similarly, histone methyl transferases add methyl groups to histones (histone demethylases have yet to be identified). A number of different histone acetyl transferases have been identified and are distinguished by their abilities to target different histones or even different lysines in the same histone tail. Similarly, each histone methyl transferase targets a specific lysine or arginine on specific histones (Table 7-7). Because different modifications have different effects on nucleosome function, the modification of a nucleosome with different histone acetyl transferases or methyl transferases can result in various effects on chromatin structure and function (see Figure 7-38).
Like their nucleosome remodeling complex counterparts, these modifying enzymes are part of large multiprotein complexes. Additional subunits play important roles in recruiting these enzymes to specific regions of the DNA. Similar to the nucleosome-remodeling complexes, these interactions can be with transcription factors bound to DNA or directly with modified nucleosomes. The recruitment of these enzymes to particular DNA regions is responsible for the distinct patterns of histone modification observed along the chromatin and is a major mechanism for modulating the levels of gene expression along the eukaryotic chromosome (see Chapter 17).
TABLE 7-7 Nucleosome Modifying Enzymes
Histone Acetyl-transferase Complexes
Type        Number of subunits   Catalytic subunit   Bromodomain/Chromodomain   Target histones
SAGA        15                   Gcn5                Bromodomain                H3 and H2B
PCAF        11                   PCAF                Bromodomain                H3 and H4
NuA3        3                    Sas3                Neither                    H3
NuA4        6                    Esa1                Chromodomain               H4
P300/CBP    1                    P300/CBP            Bromodomain                H2A, H2B, H3, and H4
Histone Deacetylase Complexes
Type           Number of subunits   Catalytic subunit(s)   Bromodomain/Chromodomain
Sin3 complex   7                    HDAC1/HDAC2            Neither
NuRD           9                    HDAC1/HDAC2            Chromodomain
SIR2 complex   3                    Sir2                   Neither
Histone Methylases
Name         Bromodomain/Chromodomain   Target histone
SUV39/CLR4   Chromodomain               H3 (Lysine 9)
SET1         Neither                    H3 (Lysine 4)
PRMT         Neither                    H3 (Arginine 3)
Nucleosome Modification and Remodeling Work Together to Increase DNA Accessibility
The combination of N-terminal tail modifications and nucleosome remodeling can dramatically change the accessibility of the DNA. As we will learn in Chapters 12 and 17, the protein complexes involved in these modifications are frequently recruited to sites of active transcription. Although the order of their function is not always the same, the combined action can result in a profound, but localized, change in DNA accessibility. Modification of N-terminal tails can reduce the ability of nucleosome arrays to form repressive structures, creating sites that can recruit other proteins, including nucleosome remodelers. Remodeling of the nucleosomes can then further increase the accessibility of the nucleosomal DNA to allow DNA-binding proteins access to their binding sites. In addition, these complexes can cause the sliding, or release, of the nucleosomes. In combination with the appropriate DNA-binding proteins or DNA sequences, these changes can result in the positioning or release of nucleosomes at specific sites on the DNA (Figure 7-40).
FIGURE 7-40 Chromatin remodeling complexes and histone modifying enzymes work together to alter chromatin structure. Sequence-specific DNA-binding proteins typically recruit these enzymes to specific regions of a chromosome. In the illustration, the first DNA-binding protein recruits a chromatin remodeling complex that modifies the adjacent nucleosome, increasing the accessibility of the associated DNA. This allows the binding of a second DNA-binding protein that recruits a histone acetyl transferase. By modifying the N-terminal tails of the adjacent nucleosomes, this enzyme changes the conformation of the chromatin from the 30-nm form to the more accessible 10-nm form. Although we show the order of association as chromatin remodeling complex then histone acetyl transferase, both orders are observed and can be equally effective. It is also true that re… compact and inaccessible chromatin.
NUCLEOSOME ASSEMBLY
Nucleosomes Are Assembled Immediately after DNA Replication
The duplication of a chromosome requires replication of the DNA and the reassembly of the associated proteins on each daughter DNA molecule. The latter process is tightly linked to DNA replication to ensure that the newly replicated DNA is rapidly packaged into nucleosomes. In Chapter 8 we will discuss the mechanisms of DNA replication in detail. Here we discuss the mechanisms that direct the assembly of nucleosomes after the DNA is replicated.
Although the replication of DNA requires the partial disassembly of the nucleosome, the DNA is rapidly repackaged in an ordered series of events. As discussed earlier, the first step in the assembly of nucleosomes on the DNA is the binding of an H3·H4 tetramer. Once the tetramer is bound, two H2A·H2B dimers associate to form the final nucleosome. H1 joins this complex last, presumably during the formation of higher-order chromatin assemblies.
To duplicate a chromosome, at least half of the nucleosomes on the daughter chromosomes must be newly synthesized. Are all the old histones lost and only new histones assembled into nucleosomes? If not, how are the old histones distributed between the two daughter chromosomes? The fate of the old histones is a particularly important issue given the effect modification of the histones can have on the accessibility of the resulting chromatin. If the old histones were lost completely, then chromosome duplication would erase any "memory" of the previously modified nucleosomes. In contrast, if the old histones were retained on a single chromosome, that chromosome would have a distinct set of modifications relative to the other copy of the chromosome.
FIGURE 7-41 The inheritance of histones after DNA replication. [Figure labels: old histones and new histones (H2A, H2B, H3, H4); direction of DNA replication; parental nucleosome; H3·H4 tetramer; H2A·H2B dimer.]
In experiments that differentially labeled old and new histones, it was found that the old histones are present on both of the daughter chromosomes (Figure 7-41). Mixing is not entirely random, however. H3·H4 tetramers and H2A·H2B dimers are composed of either all new or all old histones. Thus, as the replication fork passes, nucleosomes are broken down into their component subassemblies. H3·H4 tetramers appear to remain bound to one of the two daughter duplexes at random and are never released from DNA into the free pool. In contrast, the H2A·H2B dimers are released and enter the local pool available for new nucleosome assembly.
The distributive inheritance of old histones during chromosome duplication provides a mechanism for the accurate propagation of the parental pattern of histone modification. By this mechanism, old histones, no matter on which daughter chromosome they end up, tend to be found close, in location, to their position on the parental chromosome (Figure 7-42). This localized inheritance of modified histones provides a limited number of modifications in similar positions on each daughter chromosome. The ability of these modifications to recruit enzymes that add similar modifications to adjacent nucleosomes (see the discussion of bromodomains and chromodomains above) provides a simple mechanism to maintain similar states of modification after DNA replication has occurred. Such mechanisms are likely to play a critical role in the inheritance of chromatin states from one generation to another.
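The two-step logic described above (random distribution of old H3·H4 tetramers, followed by local spreading of their modifications) can be sketched as a toy simulation. Everything here is a simplifying assumption made for illustration: the function names `replicate` and `spread`, the 50:50 choice of daughter, the treatment of new tetramers as unmodified, and spreading to immediate neighbours only.

```python
import random


def replicate(parent, rng):
    """Distribute parental H3·H4 tetramers between two daughter duplexes.

    `parent` is a list of booleans, True where the parental tetramer carries
    a modification. Each old tetramer stays at its position on one daughter
    chosen at random; the other daughter receives a new tetramer there,
    modelled as unmodified (False).
    """
    daughter_a, daughter_b = [], []
    for modified in parent:
        if rng.random() < 0.5:
            daughter_a.append(modified)
            daughter_b.append(False)
        else:
            daughter_a.append(False)
            daughter_b.append(modified)
    return daughter_a, daughter_b


def spread(daughter):
    """Apply one round of copying a mark onto immediate neighbours."""
    n = len(daughter)
    return [
        bool(
            daughter[i]
            or (i > 0 and daughter[i - 1])
            or (i + 1 < n and daughter[i + 1])
        )
        for i in range(n)
    ]


# Hypothetical parental chromosome: a modified domain next to an unmodified one.
parent = [True] * 6 + [False] * 6
rng = random.Random(0)  # seeded only so the run is repeatable
a, b = replicate(parent, rng)
print(spread(a))
print(spread(b))
```

In this toy model the marks never spread more than one nucleosome beyond the parental domain per round, so the unmodified half of each daughter stays largely unmodified while gaps inside the modified half tend to be refilled.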
FIGURE 7-42 Inheritance of parental H3·H4 tetramers facilitates the inheritance of chromatin states. As a chromosome is replicated, the distribution of the parental H3·H4 tetramers results in the daughter chromosomes receiving the same modifications as the parent. The ability of these modifications to recruit enzymes that perform the same modifications facilitates the correct propagation of the same state of modification to the two daughter chromosomes. Acetylation is shown on the core regions of the histones for simplicity. In reality, this modification is generally on the N-terminal tails. [Figure labels: old and new histones (H2A, H2B, H3, H4); replication; H3·H4 tetramer; acetyl transferase binds acetylated histone tails; modification of adjacent "new" histones.]
Assembly of Nucleosomes Requires Histone "Chaperones"
The assembly of nucleosomes is not a spontaneous process. Early studies found that the simple addition of purified histones to DNA resulted in little or no nucleosome formation. Instead, the majority of the histones aggregate in a nonproductive form. For correct nucleosome assembly, it was necessary to raise salt concentrations to very high levels (>1 M NaCl) and then slowly reduce the concentration over many hours. Although useful for assembling nucleosomes for in vitro studies (such as for the structural studies of the nucleosome described earlier), elevated salt concentrations are not involved in nucleosome assembly in vivo.
Studies of nucleosome assembly under physiological salt concentrations identified factors required to direct the assembly of histones onto the DNA. These factors are negatively charged proteins that form complexes with either H3·H4 tetramers or H2A·H2B dimers (see Table 7-8) and escort them to sites of nucleosome assembly. Because they act to keep histones from interacting with the DNA nonproductively, these factors have been referred to as histone chaperones (see Figure 7-43).
How do the histone chaperones direct nucleosome assembly to sites of new DNA synthesis? Studies of the histone H3·H4 tetramer chaperone CAF-I reveal a likely answer. Nucleosome assembly directed by CAF-I requires that the target DNA is replicating. Thus, replicating DNA is marked in some way for nucleosome assembly. Interestingly, this mark is gradually lost after replication is completed. Studies of CAF-I-dependent assembly have determined that the mark is a ring-shaped sliding clamp protein called PCNA. As we will discuss in detail in Chapter 8, this factor forms a ring around the DNA duplex and is responsible for holding DNA polymerase on the DNA during DNA synthesis. After the polymerase is finished, PCNA is released from the DNA polymerase but is still linked to the DNA. In this condition, PCNA is available to interact with other proteins. CAF-I associates with the released PCNA and assembles H3·H4 tetramers preferentially on the PCNA-bound DNA. Thus, by associating with a component of the DNA replication machinery, CAF-I is directed to assemble nucleosomes at sites of recent DNA replication.
FIGURE 7-43 Chromatin assembly factors facilitate the assembly of nucleosomes. After the replication fork has passed, chromatin assembly factors chaperone free H3·H4 tetramers (CAF-I) and H2A·H2B dimers (NAP-I) to the site of newly replicated DNA. Once at the newly replicated DNA, these factors transfer their histone contents to the DNA. The CAF-I factors are recruited to the newly replicated DNA by interactions with DNA sliding clamps. These ring-shaped, auxiliary replication factors encircle the DNA and are released from the replication machinery as the replication fork moves. A more detailed description of DNA sliding clamps and their function in DNA replication is presented in Chapter 8. [Figure labels: old and new histones (H2A, H2B, H3, H4); replication; H3·H4 tetramer.]
TABLE 7-8 Properties of Histone Chaperones
Name     Number of subunits   Histones bound   Interaction with sliding clamp
CAF-I    4                    H3·H4            Yes
RCAF     1                    H3·H4            No
NAP-I    1                    H2A·H2B          No
SUMMARY
Within the cell, DNA is organized into large structures called chromosomes. Although the DNA forms the foundation for each chromosome, as much as half of each chromosome is composed of protein. Chromosomes can be either circular or linear; however, each cell has a characteristic number and composition of chromosomes. We now know the sequence of the entire genome of numerous organisms. These sequences have revealed that the underlying DNA of each organism's chromosomes is used more or less efficiently to encode proteins. Simple organisms tend to use the majority of DNA to encode protein; however, more complex organisms use only a small portion of their DNA to actually encode proteins or RNAs.
Cells must carefully maintain their complement of chromosomes as they divide. Each chromosome must have DNA elements that direct chromosome maintenance during cell division. All chromosomes must have one or more origins of replication. In eukaryotic cells, centromeres play a critical role in the segregation of chromosomes, and telomeres help to protect and replicate the ends of linear chromosomes. Eukaryotic cells carefully separate the events that duplicate and segregate chromosomes as cell division proceeds. Chromosome segregation can occur in one of two manners. During mitosis, a highly specialized apparatus ensures that one copy of each duplicated chromosome is delivered to each daughter cell. During meiosis, an additional round of chromosome segregation (without DNA replication) further reduces the number of chromosomes in the resulting daughter cells.
The combination of eukaryotic DNA and its associated proteins is referred to as chromatin. The fundamental unit of chromatin is the nucleosome, which is made up of two copies each of the core histones (H2A, H2B, H3, and H4) and approximately 147 bp of DNA. This protein-DNA complex serves two important functions in the cell: it compacts the DNA to allow it to fit into the nucleus and it restricts the accessibility of the DNA. This latter function is extensively exploited by the cell to regulate many different DNA transactions including gene expression.
The atomic structure of the nucleosome shows that the DNA is wrapped about 1.7 times around the outside of a disc-shaped, histone protein core. The interactions between the DNA and the histones are extensive but uniformly base nonspecific. The nature of these interactions explains both the bending of the DNA around the histone octamer and the ability of virtually all DNA sequences to be incorporated into a nucleosome. This structure also reveals the location of the N-terminal tails of the histones and their role in directing the path of the DNA around the histones.
Once DNA is packaged into nucleosomes, it has the ability to form more complex structures that allow additional compaction of the DNA. This process is facilitated by a fifth histone called H1. By binding the DNA associated with the nucleosome, H1 causes the DNA to wrap more tightly around the octamer. A more compact form of chromatin, the 30-nm fiber, is readily formed by arrays of H1-bound nucleosomes. This structure is more repressive than DNA packaged into nucleosomes alone. Current evidence suggests that the incorporation of DNA into this structure results in a dramatic reduction in its accessibility to the enzymes and proteins involved in transcription of the DNA.
The interaction of the DNA with the histones in the nucleosome is dynamic, allowing DNA-binding proteins intermittent access to the DNA. Nucleosome-remodeling complexes increase the accessibility of DNA incorporated into nucleosomes by increasing the mobility of nucleosomes. Three forms of mobility can be observed: sliding of the histone octamer along the DNA, complete transfer of the histone octamer from one DNA molecule to another, and more subtle remodeling of the protein-DNA interactions within the nucleosomes. These complexes are localized to particular regions of the genome to facilitate alterations in chromatin accessibility. A subset of nucleosomes is restricted to fixed positions in the genome and are said to be "positioned." Nucleosome positioning can be directed by DNA-binding proteins or particular DNA sequences.
Modification of the histone N-terminal tails also alters the accessibility of chromatin. The types of modifications include acetylation and methylation of lysines and phosphorylation of serines. Acetylation of N-terminal tails is frequently associated with regions of active gene expression. These modifications alter the properties of the nucleosome itself and also act as binding sites for proteins that influence the accessibility of the chromatin. These modifications also recruit enzymes that perform the same modification, leading to similar modification of adjacent nucleosomes. It is likely that this leads to the stable propagation of regions of modified nucleosomes/chromatin as the chromosomes are duplicated.
Nucleosomes are assembled immediately after the DNA is replicated, leaving little time during which the DNA is unpackaged. This involves the function of specialized histone chaperones that escort the H3·H4 tetramers and H2A·H2B dimers to the replication fork. During the replication of the DNA, nucleosomes are transiently disassembled. Histone H3·H4 tetramers and H2A·H2B dimers are randomly distributed to one or the other daughter molecules. On average, each new DNA molecule receives half old and half new histones. Thus, both chromosomes inherit modified histones which can then act as "see…
CHAPTER
The Replication of DNA

When the DNA double helix was discovered, the feature that most excited biologists was the complementary relationship between the bases on its intertwined polynucleotide chains. It seemed unimaginable that such a complementary structure would not be utilized as the basis for DNA replication. In fact, it was the self-complementary nature revealed by the DNA structure that finally led most biologists to accept Oswald T. Avery's conclusion that DNA, not some form of protein, was the carrier of genetic information (Chapter 2).

In our discussion of how templates act, we emphasized that two identical surfaces will not attract each other (Chapter 6). Instead, it is much easier to visualize the attraction of groups with opposite shape or charge. Thus, without any detailed structural knowledge, we might guess that a molecule as complicated as the gene could not be copied directly. Instead, replication would involve the formation of a molecule complementary in shape, and this, in turn, would serve as a template to make a replica of the original molecule. So, in the days before detailed knowledge of protein or nucleic acid structure, some geneticists wondered whether DNA served as a template for a specific protein that, in turn, served as a template for a corresponding DNA molecule.

But as soon as the self-complementary nature of DNA became known, the idea that protein templates might play a role in DNA replication was discarded. It was immensely simpler to postulate that each of the two strands of every parental DNA molecule served as a template for the formation of a complementary daughter strand. Although from the start this hypothesis seemed too good not to be true, experimental support nonetheless had to be generated. Happily, within five years of the discovery of the double helix, decisive evidence emerged for the separation of the complementary strands during DNA replication (see discussion of the Meselson and Stahl experiment in Chapter 2) and firm enzymological proof that DNA alone can function as the template for the synthesis of new DNA strands.

With these results, the problem of how genes replicate was in one sense solved. But in another sense, the study of DNA replication had only begun. As we will see in this chapter, the replication of even the simplest DNA molecule is a complex, multi-step process, involving many more enzymes than was initially anticipated following the discovery of the first DNA polymerizing enzyme. The replication of the large, linear chromosomes of eukaryotes is still more complex. These chromosomes require many start sites of replication to synthesize the entire chromosome in a timely fashion, and the initiation of replication must be carefully coordinated to ensure that all sequences are replicated exactly once.

In this chapter, we will first describe the basic chemistry of DNA synthesis and the function of the enzymes that catalyze this reaction.
OUTLINE
• The Chemistry of DNA Synthesis
• The Mechanism of DNA Polymerase
• The Replication Fork
• The Specialization of DNA Polymerases
• DNA Synthesis at the Replication Fork
• Initiation of DNA Replication
• Binding and Unwinding: Origin Selection and Activation by the Initiator Protein
• Finishing Replication
We will then discuss how the synthesis of DNA occurs in the context of an intact chromosome at structures called replication forks. An array of additional proteins is required to prepare the DNA for replication at these sites. The last part of the chapter focuses on the initiation and termination of DNA replication. DNA replication is tightly controlled in all cells and initiation is the step that is regulated. We will describe how replication initiation proteins unwind the DNA duplex at specific sites in the genome called origins of replication. Together, the proteins involved in DNA replication represent an intricate machine that performs this critical process with astounding speed, accuracy, and completeness.
THE CHEMISTRY OF DNA SYNTHESIS
DNA Synthesis Requires Deoxynucleoside Triphosphates and a Primer:Template Junction
For the synthesis of DNA to proceed, two key substrates must be present. First, new synthesis requires the four deoxynucleoside triphosphates: dGTP, dCTP, dATP, and dTTP (Figure 8-1a). Nucleoside triphosphates have three phosphoryl groups, which are attached via the 5' hydroxyl of the 2'-deoxyribose. The innermost phosphoryl group (that is, the group proximal to the deoxyribose) is called the α-phosphate, whereas the middle and outermost groups are called the β- and γ-phosphates, respectively.

The second important substrate for DNA synthesis is a particular arrangement of ssDNA and dsDNA called a primer:template junction (Figure 8-1b). As suggested by its name, the primer:template junction has two key components. The template provides the ssDNA that will direct the addition of each complementary deoxynucleotide. The primer is complementary to, but shorter than, the template. The primer must also have an exposed 3'OH adjacent to the single-stranded region of the template. It is this 3'OH that will be extended as new nucleotides are added. Formally, only the primer portion of the primer:template junction is a substrate for DNA synthesis, since only the primer is chemically modified during DNA synthesis. The template only provides the information necessary to pick which nucleotides are added. Nevertheless, both a primer and a template are essential for all DNA synthesis.
FIGURE 8-1 Substrates required for DNA synthesis. (a) The general structure of the 2'-deoxynucleoside triphosphates. The positions of the α-, β-, and γ-phosphates are labeled. (b) The structure of a generalized primer:template junction. The shorter primer strand is completely annealed to the longer DNA strand and must have a free 3'OH adjacent to a ssDNA region of the template. The longer DNA strand includes a region annealed to the primer and an adjacent ssDNA region that acts as the template for new DNA synthesis. New DNA synthesis extends the 3' end of the primer.
DNA Is Synthesized by Extending the 3' End of the Primer
The chemistry of DNA synthesis requires that the new chain grows by extending the 3' end of the primer (Figure 8-2). Indeed, this is a universal feature of the synthesis of both RNA and DNA. The phosphodiester bond is formed in an SN2 reaction in which the hydroxyl group at the 3' end of the primer strand attacks the α-phosphoryl group of the incoming nucleoside triphosphate. The leaving group for the reaction is pyrophosphate, which arises from the release of the β- and γ-phosphates of the nucleotide substrate.

The template strand directs which of the four nucleoside triphosphates is added. The nucleoside triphosphate that base-pairs with the template strand is highly favored for addition to the primer strand. Recall that the two strands of the double helix have an antiparallel orientation. This arrangement means that the template strand for DNA synthesis has the opposite orientation of the growing DNA strand.
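The template-directed extension of the primer's 3' end can be illustrated with a small sketch. This is only a toy model, not a description of any real software: the function names `extend_primer` and `replicate` are hypothetical, and we assume the template is written 3'→5' so that each template position lies directly opposite the primer position with the same index (the primer being written 5'→3', reflecting the antiparallel arrangement).

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Add one nucleotide to the 3' end of the primer (hypothetical helper).

    The template is written 3'->5' and the primer 5'->3', so position i of
    one strand is opposite position i of the other (antiparallel strands).
    """
    if len(primer_5to3) >= len(template_3to5):
        raise ValueError("no single-stranded template left to copy")
    # The primer must already be annealed (base-paired) to the template.
    for primer_base, template_base in zip(primer_5to3, template_3to5):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer is not annealed to the template")
    # The first unpaired template base selects the incoming nucleotide.
    next_template_base = template_3to5[len(primer_5to3)]
    return primer_5to3 + COMPLEMENT[next_template_base]


def replicate(template_3to5: str, primer_5to3: str) -> str:
    """Extend the primer one nucleotide at a time until the template is copied."""
    strand = primer_5to3
    while len(strand) < len(template_3to5):
        strand = extend_primer(template_3to5, strand)
    return strand


if __name__ == "__main__":
    # Template read 3'->5'; the annealed primer covers its first three bases.
    print(replicate("TACGGATC", "ATG"))
```

Note that growth is only ever possible at one end of the primer string, which mirrors the rule that new nucleotides are added exclusively to the 3'OH.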
FIGURE 8-2 Diagram of the mechanism of DNA synthesis. DNA synthesis is initiated by the nucleophilic attack on the α-phosphate of the incoming dNTP. This results in the extension of the 3' end of the primer by one nucleotide and the release of one molecule of pyrophosphate. Pyrophosphatase rapidly hydrolyzes the pyrophosphate into two phosphate molecules.

Hydrolysis of Pyrophosphate Is the Driving Force for DNA Synthesis
The addition of a nucleotide to a growing polynucleotide chain of length n is indicated by the following reaction:

$$\mathrm{XTP} + (\mathrm{XMP})_n \rightarrow (\mathrm{XMP})_{n+1} + \mathrm{PP_i}$$
But the free energy for this reaction is rather small (ΔG = -3.5 kcal/mole). What then is the driving force for the polymerization of nucleotides into DNA? Additional free energy is provided by the rapid hydrolysis of the pyrophosphate into two phosphate groups by an enzyme known as pyrophosphatase:

$$\mathrm{PP_i} \rightarrow 2\,\mathrm{P_i}$$
The net result of nucleotide addition and pyrophosphate hydrolysis is the breaking of two high-energy phosphate bonds. Therefore, DNA synthesis is a coupled process, with an overall reaction of:

$$\mathrm{XTP} + (\mathrm{XMP})_n \rightarrow (\mathrm{XMP})_{n+1} + 2\,\mathrm{P_i}$$
This is a highly favorable reaction with a ΔG of -7 kcal/mole, which corresponds to an equilibrium constant (Keq) of about 10⁵. Such a high Keq means that the DNA synthesis reaction is effectively irreversible.
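The link between the free energy values quoted above and the equilibrium constant follows from the standard relation ΔG = -RT ln Keq. The short sketch below checks the arithmetic. The temperature of 25 °C (298.15 K) is our assumption, since the text does not state one, and the function name is illustrative only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g_kcal: float, temp_k: float = 298.15) -> float:
    """Return Keq from a free energy change via delta_G = -R*T*ln(Keq).

    temp_k defaults to 25 degrees C, an assumed value not given in the text.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    # Nucleotide addition alone (-3.5 kcal/mole) versus the coupled
    # reaction that includes pyrophosphate hydrolysis (-7 kcal/mole).
    for delta_g in (-3.5, -7.0):
        print(f"dG = {delta_g} kcal/mole -> Keq ~ {equilibrium_constant(delta_g):.3g}")
```

Under this assumption the coupled reaction gives a Keq on the order of 10⁵, in line with the value stated above, whereas nucleotide addition alone gives a Keq of only a few hundred.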
THE MECHANISM OF DNA POLYMERASE
DNA Polymerases Use a Single Active Site to Catalyze DNA Synthesis
The synthesis of DNA is catalyzed by an enzyme called DNA polymerase. Unlike most enzymes, which have an active site dedicated to a single reaction, DNA polymerase uses a single active site to catalyze the addition of any of the four deoxynucleoside triphosphates. DNA polymerase accomplishes this catalytic flexibility by exploiting the nearly identical geometry of the A:T and G:C base pairs (remember that the dimensions of the DNA helix are largely independent of the DNA sequence). The DNA polymerase monitors the ability of the incoming nucleotide to form an A:T or G:C base pair rather than detecting the exact nucleotide that enters the active site (Figure 8-3). Only when a correct base pair is formed are the 3'OH of the primer and the α-phosphate of the incoming nucleoside triphosphate in the optimum position for catalysis to occur. Incorrect base-pairing leads to dramatically lower rates of nucleotide addition due to a catalytically unfavorable alignment of these substrates (see Figure 8-3b). This is an example of kinetic selectivity, in which an enzyme favors catalysis using one of several possible substrates by dramatically increasing the rate of bond formation only when the correct substrate is present. Indeed, the rate of incorporation of an incorrect nucleotide is as much as 10,000-fold slower than incorporation when base-pairing is correct.

DNA polymerases show an impressive ability to distinguish between ribo- and deoxyribonucleoside triphosphates. Although rNTPs are present at approximately ten-fold higher concentration in the cell, they are incorporated at a rate that is more than 1,000-fold lower than dNTPs. This discrimination is mediated by the steric exclusion of rNTPs from the DNA polymerase active site (Figure 8-4). In DNA polymerase, the nucleotide binding pocket is too small to allow the presence of a 2'OH on the incoming nucleotide. This space is occupied by two amino acids that make van der Waals contacts with the sugar ring. Interestingly, changing these amino acids to others with smaller side chains (for example, by changing a glutamate to an alanine) results in a DNA polymerase with significantly reduced discrimination between dNTPs and rNTPs.
FIGURE 8-3 Correctly paired bases are required for DNA polymerase-catalyzed nucleotide addition. (a) Schematic diagram of the attack of a primer 3'OH end on a correctly base-paired dNTP. (b) Schematic diagram of the consequence of incorrect base-pairing on catalysis by DNA polymerase. In the example shown, the incorrect A:A base pair displaces the α-phosphate of the incoming nucleotide. This incorrect
FIGURE 8-4 Schematic illustration of the steric constraints preventing catalysis using rNTPs by DNA polymerase. (a) Binding of a correctly base-paired dNTP to the DNA polymerase. Under these conditions, the 3'OH of the primer and the α-phosphate of the dNTP are in close proximity. (b) Addition of a 2'OH results in a steric clash with amino acids (the discriminator amino acids) in the nucleotide binding pocket. This results in the α-phosphate of the dNTP being displaced and a misalignment with the 3'OH of the primer, dramatically reducing the rate of catalysis.
DNA Polymerases Resemble a Hand that Grips the Primer:Template Junction
A molecular understanding of how the DNA polymerase catalyzes DNA synthesis has emerged from studies of the atomic structure of various DNA polymerases bound to primer:template junctions. These structures reveal that the DNA substrate sits in a large cleft that resembles a partially closed right hand (Figure 8-5). Based on the analogy to a hand, the three domains of the polymerase are called the thumb, fingers, and palm.

FIGURE 8-5 The three-dimensional structure of DNA polymerase resembles a right hand. (a) Schematic of DNA polymerase bound to a primer:template junction. The fingers, thumb, and palm are noted. The recently synthesized DNA is associated with the palm and the site of DNA catalysis is located in the crevice between the fingers and the thumb. The single-stranded region of the template strand is bent sharply and does not pass between the thumb and the fingers. (b) A similar view of the T7 DNA polymerase bound to DNA. The DNA is shown in a space-filling manner and the protein is shown as a ribbon diagram. The fingers and the thumb are composed of α helices. The palm domain is obscured by the DNA. The incoming dNTP is shown in red (for the base and the deoxyribose) and yellow

The palm domain is composed of a β sheet and contains the primary elements of the catalytic site. In particular, this region of DNA polymerase binds two divalent metal ions (typically Mg²⁺ or Zn²⁺) that alter the chemical environment around the correctly base-paired dNTP and the 3'OH of the primer (Figure 8-6). One metal ion reduces the affinity of the 3'OH for its hydrogen. This generates a 3'O⁻ that is primed for the nucleophilic attack on the α-phosphate of the incoming dNTP. The second metal ion coordinates the negative charges of the β- and γ-phosphates of the dNTP and stabilizes the pyrophosphate produced by joining the primer and the incoming nucleotide.

FIGURE 8-6 Two metal ions bound to DNA polymerase catalyze nucleotide addition. (a) Illustration of the active site of a DNA polymerase. The two metal ions (shown in green) are held in place by interactions with two highly conserved aspartate residues. Metal ion A primarily interacts with the 3'OH, resulting in reduced association between the O and the H. This leaves a nucleophilic 3'O⁻. Metal ion B interacts with the triphosphates of the incoming dNTP to neutralize their negative charge. After catalysis, the pyrophosphate product is stabilized through similar interactions with metal ion B (not shown). (b) Three-dimensional structure of the active site metal ions associated with the DNA polymerase, the 3'OH end of the primer, and the incoming nucleotide. The metal ions are shown in green and the remaining elements are shown in the same colors as in Figure 8-5b. The view of the polymerase shown here is roughly equivalent to rotating the image shown in Figure 8-5b ~180° around the axis of the DNA helix. (Doublié S., Tabor S., Long A.M., Richardson C.C., and Ellenberger T. 1998. Nature 391: 251.) Image prepared with BobScript, MolScript, and Raster3D.

In addition to its role in catalysis, the palm domain also monitors the accuracy of base-pairing for the most recently added nucleotides. This region of the polymerase makes extensive hydrogen bond contacts with base pairs in the minor groove of the newly synthesized DNA. These contacts are not base-specific but only form if the recently added nucleotides (whichever they may be) are correctly base-paired. Mismatched DNA in this region dramatically slows catalysis. The combination of the slowed catalysis and reduced affinity for the newly synthesized DNA allows the release of the primer:template from the polymerase active site and binding to a separate proofreading nuclease active site on the polymerase.

What are the roles of the fingers and the thumb? The fingers are also important for catalysis. Several residues located within the fingers bind to the incoming dNTP. More importantly, once a correct base pair is formed between the incoming dNTP and the template, the finger domain moves to enclose the dNTP (Figure 8-7). This closed form of the polymerase hand stimulates catalysis by moving the incoming nucleotide into close contact with the catalytic metal ions. The finger domain also associates with the template region, leading to a nearly 90° turn of the phosphodiester backbone of the template immediately after the active site. This bend serves to expose only the first template base after the primer at the catalytic site. This conformation of the template avoids any confusion concerning which template base is ready to pair with the next nucleotide to be added (Figure 8-8).
In contrast to the fingers and the palm, the thumb domain is not intimately involved in catalysis. Instead, the thumb interacts with the DNA that has been most recently synthesized (see Figure 8-9). This serves two purposes. First, it maintains the correct position of the primer and the active site. Second, the thumb helps to maintain a strong association between the DNA polymerase and its substrate. This association contributes to the ability of the DNA polymerase to add many dNTPs each time it binds a primer:template junction (see below). To summarize, an ordered series of events occurs o
DNA Polymerases Are Processive Enzymes
Catalysis by DNA polymerase is rapid. DNA polymerases are capable of adding as many as 1,000 nucleotides per second to a primer strand. The speed of DNA synthesis is largely due to the processive nature of DNA polymerase. Processivity is a characteristic of enzymes that operate on polymeric substrates. In the case of DNA polymerases, the degree of processivity is defined as the average number of nucleotides added each time the enzyme binds a primer:template junction. Each DNA polymerase has a characteristic processivity that can range from only a few nucleotides to more than 50,000 bases added per binding event (Figure 8-9).

FIGURE 8-7 DNA polymerase "grips" the template and the incoming nucleotide when a correct base pair is made. (a) An illustration of the changes in DNA polymerase structure after the incoming nucleotide base-pairs correctly to the template DNA. The primary change is a 40° rotation of one of the helices in the finger domain called the O-helix. In the open conformation this helix is distant from the incoming nucleotide. When the polymerase is in the closed conformation, this helix moves and makes several important interactions with the incoming dNTP. A tyrosine makes stacking interactions with the base of the dNTP and two charged residues associate with the triphosphate. The combination of these interactions positions the dNTP for catalysis mediated by the two metal ions bound to the DNA polymerase. (b) The structure of T7 DNA polymerase bound to its substrates in the closed conformation. The O-helix is shown in purple and the rest of the protein structure is shown as transparent for clarity. The critical tyrosine, lysine, and arginine can be seen behind the O-helix in pink. The base and the deoxyribose of the incoming dNTP are shown in red, the primer is shown in light gray, and the template strand is shown in dark gray. The two cat

FIGURE 8-8 Illustration of the path of the template DNA through the DNA polymerase. The recently replicated DNA is associated with the palm region of the DNA polymerase. At the active site, the first base of the single-stranded region of the template is in a position expected for double-stranded DNA. As one follows the template strand toward its 5' end, the phosphodiester backbone abruptly bends 90°. This results in the second and all subsequent single-stranded bases being place

The rate of DNA synthesis is dramatically increased by adding multiple nucleotides per binding event. It is the initial binding of polymerase to the primer:template junction that is the rate-limiting step. In a typical DNA polymerase reaction, it takes approximately one second for the DNA polymerase to locate and bind a primer:template junction. Once bound, addition of a nucleotide is very fast (in the millisecond range). Thus, a completely nonprocessive DNA polymerase would add approximately 1 base pair per second. In contrast, the fastest DNA polymerases add as many as 1,000 nucleotides per second by remaining associated with the template for multiple rounds of dNTP addition. Consequently, a highly processive polymerase increases the overall rate of DNA synthesis by as much as 1,000-fold compared to a completely nonprocessive enzyme.

FIGURE 8-9 DNA polymerases synthesize DNA in a processive manner. This illustration shows the difference between a processive and a nonprocessive DNA polymerase. Both DNA polymerases bind the primer:template junction. Upon binding, the nonprocessive enzyme adds a single dNTP to the 3' end of the primer and then is released from the new primer:template junction. In contrast, a processive DNA polymerase adds many dNTPs each time it binds to the template.

Increased processivity is facilitated by the ability of DNA polymerases to slide along the DNA template. Once bound to a primer:template junction, DNA polymerase interacts tightly with much of the double-stranded portion of the DNA in a sequence nonspecific manner. These interactions include electrostatic interactions between the phosphate backbone and the "thumb" domain, and interactions between the minor groove of the DNA and the palm domain (described above). The sequence-independent nature of these interactions permits the easy movement of the DNA even after it binds to polymerase. Each time a nucleotide is added to the primer strand, the DNA partially releases from the polymerase (the hydrogen bonds with the minor groove are broken but the electrostatic interactions with the thumb are maintained). The DNA then rapidly re-binds to the polymerase in a position that is shifted by one base pair using the same sequence nonspecific mechanism. Further increases in processivity are achieved through interactions between the DNA polymerase and a "sliding clamp" protein that completely encircles the DNA, as we shall discuss further below.
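The argument that processivity sets the overall rate of synthesis can be made concrete with a simple two-step timing model: each binding event costs one slow binding step followed by a number of fast addition steps. The sketch below is only an illustration. The binding time of one second comes from the text; the figure of exactly 1 millisecond per added nucleotide is our assumption (the text says only "in the millisecond range"), and the function name is hypothetical.

```python
def synthesis_rate(processivity: int,
                   bind_time_s: float = 1.0,
                   add_time_s: float = 0.001) -> float:
    """Average nucleotides added per second over one bind-extend-release cycle.

    processivity: nucleotides added per binding event.
    bind_time_s:  time to locate and bind a primer:template junction
                  (about one second, as stated in the text).
    add_time_s:   time per nucleotide addition (assumed to be 1 ms).
    """
    if processivity < 1:
        raise ValueError("processivity must be at least 1")
    cycle_time = bind_time_s + processivity * add_time_s
    return processivity / cycle_time


if __name__ == "__main__":
    for n in (1, 10, 1_000, 50_000):
        print(f"processivity {n:>6}: ~{synthesis_rate(n):.1f} nucleotides/second")
```

In this model a completely nonprocessive enzyme manages roughly one nucleotide per second, while an enzyme adding tens of thousands of nucleotides per binding event approaches the limit set by the addition step alone, consistent with the roughly 1,000-fold difference described above.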
Exonucleases Proofread Newly Synthesized DNA
A system based only on base-pair geometry and the complementarity between the bases is incapable of reaching the extraordinarily high levels of accuracy th
tions with the palm region. This altered geometry reduces the rate of nucleotide addition in much the same way that addition of an incorrectly paired dNTP reduces catalysis. Thus, when a mismatched nucleotide is added, it both decreases the rate of new nucleotide addition and increases the rate of proofreading exonuclease activity.

As with DNA synthesis, proofreading can occur without releasing the DNA from the polymerase (Figure 8-10). When a mismatched base pair is detected by the polymerase, the primer:template junction slides away from the DNA polymerase active site and into the exonuclease site. (This is because the mismatched DNA has a reduced affinity for the palm region.) After the incorrect base pair is removed, the correctly paired primer:template junction slides back into the DNA polymerase active site and DNA synthesis can continue.

In essence, proofreading exonucleases work like a "delete key" on a keyboard, removing only the most recent errors. The addition of a proofreading exonuclease greatly increases the accuracy of DNA synthesis. On average, DNA polymerase inserts one incorrect nucleotide for every 10⁵ nucleotides added. Proofreading exonucleases decrease the appearance of an incorrect paired base to one in every 10⁷ nucleotides added. This error rate is still significantly short of the actual rate of mutation observed in a typical cell (approximately one mistake in every 10¹⁰ nucleotides added). This additional level of accuracy is provided by the post-replication mismatch repair process that is described in Chapter 9.
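To get a feel for what these error rates mean in practice, the sketch below converts each of the three rates quoted above into an expected number of mistakes per round of replication. The genome size of 4.6 million base pairs is purely an illustrative assumption on our part (roughly the size of a typical bacterial chromosome) and does not come from the text; the constant and function names are likewise hypothetical.

```python
# Error rates per nucleotide added, as quoted in the text.
POLYMERASE_ALONE = 1e-5        # base selection by the polymerase only
WITH_PROOFREADING = 1e-7       # after the proofreading exonuclease
WITH_MISMATCH_REPAIR = 1e-10   # after post-replication mismatch repair

# Illustrative genome size (an assumption, not a value from the text).
GENOME_SIZE_BP = 4_600_000


def expected_errors(genome_size_bp: int, error_rate: float) -> float:
    """Expected number of misincorporated nucleotides per replication."""
    return genome_size_bp * error_rate


def fold_improvement(rate_before: float, rate_after: float) -> float:
    """How many times more accurate synthesis becomes between two stages."""
    return rate_before / rate_after


if __name__ == "__main__":
    stages = [
        ("polymerase alone", POLYMERASE_ALONE),
        ("plus proofreading", WITH_PROOFREADING),
        ("plus mismatch repair", WITH_MISMATCH_REPAIR),
    ]
    for label, rate in stages:
        errors = expected_errors(GENOME_SIZE_BP, rate)
        print(f"{label:>22}: ~{errors:.4g} errors per genome copy")
```

Under these assumptions, proofreading accounts for about a 100-fold gain in accuracy and mismatch repair for a further 1,000-fold, taking the expected number of errors per genome copy from tens to well below one.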
FIGURE 8-10 Proofreading exonucleases remove bases from the 3' end of mismatched DNA. (a) When an incorrect nucleotide is incorporated into the DNA by a polymerase, the rate of DNA synthesis is reduced and the affinity of the 3' end of the primer for the DNA polymerase active site is diminished. (b) When mismatched, the 3' end of the DNA has increased affinity for the proofreading exonuclease active site. Once bound at this active site, the mismatched nucleotide is removed. (c) Once the mismatched nucleotide is removed, the affinity of the properly base-paired DNA for the DNA polymerase active site is restored and DNA synthesis continues. (Source: Adapted from Baker T.A. and Bell S.P. 1998. Polymerases and the replisome: Machines within machines. Cell 92: 296, fig. 1b. Copyright © 1998 with permission from Elsevier.)

THE REPLICATION FORK
Both Strands of DNA Are Synthesized Together at the Replication Fork
Thus far we have discussed DNA synthesis in a relatively artificial context, that is, at a primer:template junction that is producing only one new strand of DNA. In the cell, both strands of the DNA duplex are replicated at the same time. This requires separation of the two strands of the double helix to create two template DNAs. The junction between the newly separated template strands and the unreplicated duplex DNA is known as the replication fork (Figure 8-11). The replication fork moves continuously toward the duplex region of unreplicated DNA, leaving in its wake two ssDNA templates that direct the formation of two daughter DNA duplexes.

The antiparallel nature of DNA creates a complication for the simultaneous replication of the two exposed templates at the replication fork. Because DNA is only synthesized by elongating a 3' end, only one of the two exposed templates can be replicated continuously as the replication fork moves. On this template strand, the polymerase simply "chases" the replication fork. The newly synthesized DNA strand directed by this template is known as the leading strand. Synthesis of the new DNA strand directed by the other ssDNA template is more problematic. This template directs the DNA polymerase to move in the opposite direction of the replication fork. The new DNA strand directed by this template is known as the lagging strand. As shown in Figure 8-11, this strand of DNA must be synthesized in a discontinuous fashion.
FIGURE 8-11 The replication fork. Newly synthesized DNA is indicated in red and RNA primers are indicated in green. The Okazaki fragments shown are artificially short for illustrative purposes. In the cell, Okazaki fragments can vary from 100 to more than 1,000 bases.
Although the leading strand DNA polymerase can replicate its template as soon as it is exposed, synthesis of the lagging strand must wait for movement of the replication fork to expose a substantial length of template before it can be replicated. Each time a substantial length of new lagging strand template is exposed, DNA synthesis is initiated and continues until it reaches the 5' end of the previous newly synthesized stretch of lagging strand DNA. The resulting short fragments of new DNA formed on the lagging strand are called Okazaki fragments and can vary in length from 1,000 to 2,000 nucleotides in bacteria and 100 to 400 nucleotides in eukaryotes. Shortly after being synthesized, Okazaki fragments are covalently joined together to generate a continuous, intact strand of new DNA. Okazaki fragments are, therefore, transient intermediates in DNA replication.
The Initiation of a New Strand of DNA Requires an RNA Primer
As described above, all DNA polymerases require a primer with a free 3'OH. They cannot initiate a new DNA strand de novo. How are new strands of DNA synthesis started? To accomplish this, the cell takes advantage of the ability of RNA polymerases to do what DNA polymerases cannot: start new RNA chains de novo. Primase is a specialized RNA polymerase dedicated to making short RNA primers (5-10 nucleotides long) on an ssDNA template. These primers are subsequently extended by DNA polymerase. Although DNA polymerases incorporate only deoxyribonucleotides into DNA, they can initiate synthesis using either an RNA primer or a DNA primer annealed to the DNA template.

FIGURE 8-12 Removal of RNA primers from newly synthesized DNA. The RNA primer is shown in green, and the newly synthesized DNA that replaces the RNA primer is shown in red.

Although both the leading and lagging strands require primase to initiate DNA synthesis, the frequency of primase function on the two strands is dramatically different (see Figure 8-11). Each leading strand requires only a single RNA primer. In contrast, the discontinuous synthesis of the lagging strand means that new primers are needed for each Okazaki fragment. Because a single replication fork can replicate millions of base pairs, synthesis of the lagging strand can require hundreds to thousands of Okazaki fragments and their associated RNA primers.

Unlike the RNA polymerases involved in mRNA, rRNA, and tRNA synthesis (see Chapter 12), primase does not require specific DNA sequences to initiate synthesis of a new RNA primer. Instead, primase is activated only when it associates with other DNA replication proteins, such as DNA helicase. These proteins are considered in more detail below. Once activated, primase synthesizes an RNA primer using the most recently exposed lagging strand template, regardless of sequence.
RNA Primers Must Be Removed to Complete DNA Replication
To complete DNA replication, the RNA primers used for the initiation must be removed and replaced with DNA (Figure 8-12). Removal of the RNA primers can be thought of as a DNA repair event, and this process shares many of the properties of excision DNA repair, a process covered in detail in Chapter 9.
To replace the RNA primers with DNA, an enzyme called RNAse H recognizes and removes most of each RNA primer. This enzyme specifically degrades RNA that is base-paired with DNA (hence the "H" in its name, which stands for hybrid in RNA:DNA hybrid). RNAse H removes all of the RNA primer except the ribonucleotide directly linked to the DNA end. This is because RNAse H can only cleave bonds between two ribonucleotides. The final ribonucleotide is removed by an exonuclease that degrades RNA or DNA from its 5' end.
Removal of the RNA primer leaves a gap in the double-stranded DNA that is an ideal substrate for DNA polymerase: a primer:template junction (see Figure 8-12). DNA polymerase fills this gap until every nucleotide is base-paired, leaving a DNA molecule that is complete except for a break in the backbone between the 3'OH and 5' phosphate of the repaired strand. This "nick" in the DNA can be repaired by an enzyme called DNA ligase. DNA ligase uses a high-energy co-factor (such as ATP) to create a phosphodiester bond between an adjacent 5' phosphate and 3'OH. Only after all RNA primers are replaced and the associated nicks are sealed is DNA synthesis complete.
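The order of the four enzymatic steps can be made concrete with a toy model. This is only a sketch under simplifying assumptions: a strand is written 5' to 3' as a string in which 'r' is a ribonucleotide and 'd' a deoxyribonucleotide, base identity and the template strand are ignored, and all function names are hypothetical.

```python
def rnase_h(fragment):
    """Remove the RNA primer except the ribonucleotide joined to DNA.
    RNAse H only cleaves bonds between two ribonucleotides."""
    primer_length = len(fragment) - len(fragment.lstrip("r"))
    removed = max(primer_length - 1, 0)
    return fragment[removed:], removed

def five_prime_exonuclease(fragment):
    """Remove the single remaining ribonucleotide from the 5' end."""
    if fragment.startswith("r"):
        return fragment[1:], 1
    return fragment, 0

def fill_gap(upstream, gap):
    """DNA polymerase extends the upstream 3'OH across the gap with DNA."""
    return upstream + "d" * gap

def ligate(upstream, downstream):
    """DNA ligase seals the nick between the 3'OH and the 5' phosphate."""
    return upstream + downstream

def replace_primer(upstream, downstream):
    """Run the four steps in order on two adjacent lagging strand fragments."""
    trimmed, removed_by_rnase_h = rnase_h(downstream)
    trimmed, removed_by_exo = five_prime_exonuclease(trimmed)
    extended = fill_gap(upstream, removed_by_rnase_h + removed_by_exo)
    return ligate(extended, trimmed)

# An 8-nucleotide primer followed by 4 nucleotides of DNA, next to 6
# nucleotides of upstream DNA (lengths chosen only for illustration).
print(replace_primer("d" * 6, "r" * 8 + "d" * 4))
```

The total length is conserved and no ribonucleotides remain, which mirrors the requirement that every primer be replaced and every nick sealed before synthesis is complete.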
DNA Helicases Unwind the Double Helix in Advance of the Replication Fork
DNA polymerases are generally poor at separating the two base-paired strands of duplex DNA. Therefore, at the replication fork, a second class of enzymes, called DNA helicases, catalyze the separation of the two strands of duplex DNA. These enzymes bind to and move directionally along ssDNA using the energy of nucleoside triphosphate (usually ATP) hydrolysis to displace any DNA strand that is annealed to the bound ssDNA. Typically, DNA helicases that act at replication forks are hexameric proteins that assume the shape of a ring (Figure 8-13). These ring-shaped protein complexes encircle one of the two single strands at the replication fork near the single-stranded:double-stranded junction.
FIGURE 8-13 DNA helicases separate the two strands of the double helix. When ATP is added to a DNA helicase bound to ssDNA, the helicase moves with a defined polarity on the ssDNA. In the instance illustrated, the DNA helicase has a 5'→3' polarity. This polarity means that the DNA helicase would be bound to the lagging strand template at the replication fork.
Like DNA polymerases, DNA helicases act processively. Each time they associate with substrate, they unwind multiple base pairs of DNA. The ring-shaped hexameric DNA helicases found at replication forks exhibit high processivity because they encircle the DNA. Release of the helicase from its DNA substrate therefore requires the opening of the hexameric protein ring, which is a rare event. Alternatively, the helicase can dissociate when it reaches the end of the DNA strand that it has encircled. Of course, this arrangement of enzyme and DNA poses problems for the binding of the DNA helicase to the DNA substrate in the first place. Thus, there are specialized mechanisms that assemble DNA helicases around the DNA in cells (see "Initiation of Replication" below). This topological linkage between proteins involved in DNA replication and their DNA substrates is a common mechanism to increase processivity.
Each DNA helicase moves along ssDNA in a defined direction. This property is a characteristic of each DNA helicase called its polarity (see Box 8-1, Determining the Polarity of a DNA Helicase). DNA helicases can have a polarity of either 5'→3' or 3'→5'. This direction is always defined according to the strand of DNA bound (or encircled for a ring-shaped helicase) rather than the strand that is displaced. In the case of a DNA helicase that functions on the lagging strand template of the replication fork, the polarity is 5'→3' to allow the DNA helicase to proceed toward the duplex region of the replication fork (see Figure 8-13). As is true for all enzymes that move along DNA in a directional manner, movement of the helicase along ssDNA requires the input of chemical energy. For helicases, this energy is provided by ATP hydrolysis.
Single-Stranded Binding Proteins Stabilize Single-Stranded DNA Prior to Replication
After the DNA helicase has passed, the newly generated single-stranded DNA must remain free of base-pairing until it can be used as a template for DNA synthesis. To stabilize the separated strands, single-stranded DNA binding proteins (designated SSBs) rapidly bind to the separated strands. Binding of one SSB promotes the binding of another SSB to the immediately adjacent ssDNA (Figure 8-14). This is called cooperative binding and occurs because SSB molecules bound to immediately adjacent regions of ssDNA can also bind to each other. This strongly stabilizes the interaction of the SSB with ssDNA, making sites already occupied by one or more SSB molecules preferred over other sites. Cooperative binding ensures that ssDNA is rapidly coated by SSB as it emerges from the DNA helicase. (Cooperative binding is a property of many DNA-binding proteins; see Box 16-4, Concentration, Affinity, and Cooperative Binding.)
Once covered with SSB, ssDNA is held in an elongated state that facilitates its use as a template for DNA or RNA primer synthesis. SSB interacts with ssDNA in a sequence-independent manner. SSBs primarily contact ssDNA through electrostatic interactions with the phosphate backbone and stacking interactions with the DNA bases. In contrast to sequence-specific DNA-binding proteins, SSBs make few, if any, hydrogen bonds to the ssDNA bases.
FIGURE 8-14 Binding of single-stranded binding protein (SSB) to DNA. (a) A limiting amount of SSB is bound to four of the nine ssDNA molecules shown. (b) As more SSB binds to DNA, it preferentially binds adjacent to previously bound SSB molecules. Only after SSB has completely coated the initially bound ssDNA molecules does binding occur on other molecules. Note that when ssDNA is coated with SSB, it assumes a more extended conformation that inhibits the formation of intramolecular base pairs.
Box 8-1 Determining the Polarity of a DNA Helicase
The activity of a DNA helicase can be detected by its ability to displace one strand of a DNA duplex from another. In a typical DNA helicase assay, the substrate is composed of one short, labeled ssDNA annealed to one long, unlabeled ssDNA (typically the label is radioactive 32P incorporated into the short ssDNA). Consider a large circular ssDNA (for example, 5,000 bases) hybridized to a short (200 bases), labeled linear ssDNA molecule (Box 8-1 Figure 1). A DNA helicase will displace the short linear ssDNA from the large ssDNA circle. Separation of the strands can be detected by a change in electrophoretic mobility of the short, labeled ssDNA in a nondenaturing agarose gel (see Chapter 20). After the gel is exposed to X-ray film to detect only the radiolabeled DNA, the position in the gel that the short DNA occupies can be determined. When it is hybridized to the ssDNA circle, the short ssDNA will co-migrate with the large ssDNA circle. In contrast, once the short ssDNA has been displaced from the ssDNA circle by DNA helicase, it will migrate according to its actual size, 200 bases.
A modification of this simple experiment can be used to determine the polarity of a DNA helicase. Suppose there is a restriction enzyme cleavage site located asymmetrically within the base-paired region (Box 8-1 Figure 2). When this site is cleaved it will generate a largely single-stranded, linear DNA with two regions of dsDNA of different lengths, one at each end. Remember that DNA helicases bind to ssDNA, not dsDNA. Thus, the only place that a DNA helicase can bind this new linear substrate is between the two dsDNA regions. Because of the polarity of DNA helicases, any given DNA helicase can displace only one of the two short ssDNAs. Because the two short ssDNA regions are of different lengths, the size of the released fragment will reveal which direction the DNA helicase moved along the ssDNA region of the linear substrate.
BOX 8-1 FIGURE 1 A biochemical assay for DNA helicase activity. (a) DNA substrate to detect helicase activity. A 5,000-base unlabeled ssDNA circular DNA is annealed to a 200-base radiolabeled DNA. For convenience the two molecules are not drawn to scale. (b) To detect DNA helicase activity, the DNA substrate is exposed to the DNA helicase (in this case with and without ATP). After the reaction, the resulting DNA molecules are separated by agarose gel electrophoresis (nondenaturing). When the short radiolabeled DNA is base-paired with the large ssDNA circle, both mole…
BOX 8-1 FIGURE 2 A biochemical assay for DNA helicase polarity. (a) The DNA substrate. The same DNA substrate illustrated in Figure 1 is cleaved with a restriction enzyme that leaves blunt ends. The restriction enzyme is chosen to cleave asymmetrically, leaving 125-base and 75-base radiolabeled ssDNA fragments annealed to the ends of a 5,000-base unlabeled ssDNA. The 5' and 3' ends of the resulting DNA molecules are indicated. (b) An illustration of an X-ray film exposed to an agarose gel used to separate the DNA products after DNA helicase treatment is shown. The substrate generated in part (a) can be incubated with a DNA helicase to determine its polarity. Results for a 5'→3' and a 3'→5' DNA helicase are shown. Boiling of the substrate indicates the consequences of complete denaturation of all base-pairing.
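The reasoning behind the polarity assay in Box 8-1 can be written as a small decision rule. This is a sketch, not a description of the published figure: it assumes, purely for illustration, that the 125-base fragment is annealed at the 5' end of the long unlabeled strand and the 75-base fragment at its 3' end (the extracted text does not say which fragment sits at which end), and the function name is hypothetical.

```python
def released_fragment(polarity, fragment_at_5_end, fragment_at_3_end):
    """Return the length of the labeled fragment a helicase displaces.

    The helicase loads on the single-stranded middle of the long strand.
    Polarity is defined on that bound strand, so a 5'->3' helicase travels
    toward the strand's 3' end and a 3'->5' helicase toward its 5' end.
    """
    if polarity == "5'->3'":
        return fragment_at_3_end
    if polarity == "3'->5'":
        return fragment_at_5_end
    raise ValueError(f"unknown polarity: {polarity!r}")

# Hypothetical layout: 125 bases annealed at the 5' end of the long strand,
# 75 bases annealed at its 3' end.
for polarity in ("5'->3'", "3'->5'"):
    size = released_fragment(polarity, 125, 75)
    print(f"{polarity} helicase releases the {size}-base fragment")
```

Because the two fragments differ in length, reading the size of the released band on the gel is enough to infer the direction of travel.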
Topoisomerases Remove Supercoils Produced by DNA Unwinding at the Replication Fork
As the strands of DNA are separated at the replication fork, the double-stranded DNA in front of the fork becomes increasingly positively supercoiled (Figure 8-15). This accumulation of supercoils is the result of DNA helicase eliminating the base pairs between the two strands. If the DNA strands remain unbroken, there can be no reduction in linking number (the number of times the two DNA strands are intertwined) to accommodate this unwinding of the DNA duplex (see Chapter 6). Thus, as the DNA helicase proceeds, the DNA must accommodate the same linking number within a smaller and smaller number of base pairs. Indeed, for the superhelicity to remain the same, one DNA link must be removed approximately every ten base pairs of DNA unwound. If there were no mechanism to relieve the accumulation of these supercoils, the replication machinery would grind to a halt in the face of mounting pressure.
The problem is most clear for the circular chromosomes of bacteria (see Figure 8-15), but it also applies to eukaryotic chromosomes. Because eukaryotic chromosomes are not closed circles, they could, in principle, rotate along their length to dissipate the introduced supercoils. This is not the case, however: it is simply not possible to rotate a DNA molecule that is millions of base pairs long each time one turn of the helix is unwound.
The supercoils introduced by the action of the DNA helicase are removed by topoisomerases that act on the unreplicated double-stranded DNA in front of the replication fork (Figure 8-15). These enzymes do this by breaking either one or both strands of the DNA without letting go of the DNA and passing the same number of DNA strands through the break (as we discussed in Chapter 6). This action relieves the accumulation of supercoils. In this way, topoisomerases act as a "swivelase" that rapidly dissipates the accumulation of supercoils induced by DNA unwinding.
FIGURE 8-15 Action of topoisomerase at the replication fork. As positive supercoils accumulate in front of the replication fork, topoisomerases rapidly remove them. In this diagram, the action of Topo II removes the positive supercoil induced by a replication fork. By passing one part of the unreplicated dsDNA through a double-stranded break in a nearby unreplicated region, the positive supercoils can be removed. It is worth noting that this change would reduce the linking number by two and thus would only have to occur once every 20 bp replicated. Although the action of a type II topoisomerase is illustrated here, type I topoisomerases can also remove the positive supercoils generated by the replication fork.
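The linking-number bookkeeping in this section reduces to simple arithmetic. The sketch below assumes roughly one link per ten base pairs unwound, a change of two links per type II topoisomerase cycle, and a change of one link per type I cycle (the last figure is the standard value and is not stated in this passage); the function name is hypothetical.

```python
BP_PER_LINK = 10  # about one DNA link per ten base pairs unwound

# Links removed per catalytic cycle. The type II value follows the Figure
# 8-15 caption; the type I value is the standard step size, assumed here.
LINKS_PER_CYCLE = {"type I": 1, "type II": 2}

def topoisomerase_cycles(bp_unwound, topo_type):
    """Cycles needed to keep superhelicity constant while unwinding DNA."""
    links_to_remove = bp_unwound / BP_PER_LINK
    return links_to_remove / LINKS_PER_CYCLE[topo_type]

print(topoisomerase_cycles(20, "type II"))         # one cycle per 20 bp
print(topoisomerase_cycles(4_600_000, "type II"))  # a 4.6-Mb chromosome
```

For a 4.6-Mb circular chromosome this comes to a few hundred thousand strand-passage events if type II enzymes did all the work, which illustrates why the text describes topoisomerases as acting rapidly.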
Replication Fork Enzymes Extend the Range of DNA Polymerase Substrates
On its own, DNA polymerase can only efficiently extend 3'OH primers annealed to ssDNA templates. The addition of primase, DNA helicase, and topoisomerase dramatically extends the possible substrates for DNA polymerase. Primase provides the ability to initiate new DNA strands on any piece of ssDNA. Of course, the use of primase also imposes a requirement for the removal of the RNA primers to complete replication. Similarly, strand separation by DNA helicase and dissipation of positive supercoils by topoisomerase allow DNA polymerase to replicate dsDNA. Although the names of the proteins change from organism to organism (Table 8-1), the same set of enzymatic activities is used by organisms as diverse as bacteria, yeast, and humans to accomplish chromosomal DNA replication.
It is noteworthy that both DNA helicase and topoisomerase perform their functions without permanently altering the chemical structure of DNA or synthesizing any new molecule. DNA helicase breaks only the hydrogen bonds that hold the two strands of DNA together without breaking any covalent bonds. Although topoisomerases break one or more of DNA's covalent bonds, each bond broken is precisely reformed before the release of the DNA (see Figure 6-25). Instead of altering the chemical structure of DNA, the action of these enzymes results in a DNA molecule with an altered conformation. Importantly, these conformational alterations are essential for the duplication of the large dsDNA molecules that are the foundation of both bacterial and eukaryotic chromosomes.
The proteins that act at the replication fork interact tightly but in a sequence-independent manner with the DNA. These interactions exploit the features of DNA that are the same regardless of the particular base pair: the negative charge and structure of the phosphate backbone (for example, the thumb domain of DNA polymerase); the hydrogen bonding residues in the minor groove (for example, the palm domain of the DNA polymerase); and the hydrophobic stacking interactions between the bases (for example, SSB). In addition, many of these proteins have structures that allow them to encircle (for example, DNA helicase) or encompass (for example, DNA polymerase) the DNA to remain associated with the DNA.
TABLE 8-1 Enzymes That Function at the Replication Fork
                  E. coli           S. cerevisiae          Human
Primase           DnaG              Primase (PRI1/PRI2)    Primase
DNA helicase      DnaB              Mcm complex            Mcm complex
SSB               SSB               RPA                    RPA
Topoisomerases    Gyrase, Topo I    Topo I, II             Topo I, II
THE SPECIALIZATION OF DNA POLYMERASES
DNA Polymerases Are Specialized for Different Roles in the Cell
The central role of DNA polymerases in the efficient and accurate replication of the genome requires that cells have multiple specialized DNA polymerases. For example, E. coli has at least five DNA polymerases that are distinguished by their enzymatic properties, subunit composition, and abundance (Table 8-2). DNA polymerase III (DNA Pol III) is the primary enzyme involved in the replication of the chromosome. Because the entire 4.6-Mb E. coli genome is replicated by two replication forks, DNA Pol III must be highly processive. Consistent with these requirements, DNA Pol III is generally found to be part of a larger complex that confers very high processivity, a complex known as the DNA Pol III holoenzyme.
In contrast, DNA polymerase I (DNA Pol I) is specialized for the removal of the RNA primers that are used to initiate DNA synthesis. For this reason, this DNA polymerase has a 5' exonuclease that allows DNA Pol I to remove RNA or DNA immediately upstream of the site of DNA synthesis. Unlike DNA Pol III, DNA Pol I is not highly processive, adding only 20-100 nucleotides per binding event. These properties are ideal for RNA primer removal and DNA synthesis across the resulting ssDNA gap. The 5' exonuclease of DNA Pol I can remove the RNA-DNA linkage that is resistant to RNAse H (see Figure 8-12). The short extent of synthesis by DNA Pol I is ideal for replacing the short region previously occupied by the RNA primers (<10 nucleotides).
Because both DNA Pol I and DNA Pol III are involved in DNA replication, both of these enzymes must be highly accurate. Thus, both proteins carry an associated proofreading exonuclease. The remaining three DNA polymerases in E. coli are specialized for DNA repair and lack proofreading activities. These enzymes are discussed in Chapter 9.
Eukaryotic cells also have multiple DNA polymerases, with a typical cell having more than 15. Of these, three are essential to duplicate the genome: DNA Pol δ, DNA Pol ε, and DNA Pol α/primase. Each of these eukaryotic DNA polymerases is composed of multiple subunits (see Table 8-2). DNA Pol α/primase is specifically involved in initiating new DNA strands. This four-subunit protein complex consists of a two-subunit DNA Pol α and a two-subunit primase. After the primase synthesizes an RNA primer, the resulting RNA primer:template junction is immediately handed off to the associated DNA Pol α to initiate DNA synthesis.
Due to its relatively low processivity, DNA Pol α/primase is rapidly replaced by the highly processive DNA polymerases δ and ε. The process of replacing DNA Pol α/primase with DNA Pol δ or ε is called polymerase switching (Figure 8-16) and results in three different DNA polymerases functioning at the eukaryotic replication fork. As in bacterial cells, the majority of the remaining eukaryotic DNA polymerases are involved in DNA repair.
TABLE 8-2 Activities and Functions of DNA Polymerases
Prokaryotic (E. coli)    Number of subunits   Function
Pol I                    1                    RNA primer removal, DNA repair
Pol II (Din A)           1                    DNA repair
Pol III core             3                    Chromosome replication
Pol III holoenzyme       9                    Chromosome replication
Pol IV (Din B)                                DNA repair, translesion synthesis (TLS)
Pol V (UmuC, UmuD'2C)    3                    TLS
Eukaryotic               Number of subunits   Function
Pol α                    4                    Primer synthesis during DNA replication
Pol β                    1                    Base excision repair
Pol γ                    3                    Mitochondrial DNA replication and repair
Pol δ                    2-3                  DNA replication; nucleotide and base excision repair
Pol ε                    4                    DNA replication; nucleotide and base excision repair
Pol θ                                         DNA repair of crosslinks
Pol ζ                                         Translesion synthesis (TLS)
Pol λ                                         Meiosis-associated DNA repair
Pol μ                                         Somatic hypermutation
Pol κ                                         TLS
Pol η                                         Relatively accurate TLS past cis-syn cyclobutane dimers
Pol ι                                         TLS, somatic hypermutation
Rev1                                          TLS
Source: Data from Sutton and Walker, 2001 and references therein.
Sliding Clamps Dramatically Increase DNA Polymerase Processivity
High processivity at the replication fork ensures rapid chromosome duplication. As we have discussed, DNA polymerases at the replication fork synthesize thousands to millions of base pairs without releasing from the template. Despite this, when looked at in the absence of other proteins, the DNA polymerases that act at the replication fork are only able to synthesize 20-100 base pairs before releasing from the template. How is the processivity of these enzymes increased so dramatically at the replication fork?
The key to the high processivity of the DNA polymerases that …
… template DNA on average once every 20-100 base pairs synthesized. In the presence of the sliding clamp, the DNA polymerase still disengages its active site from the 3'OH end of the DNA frequently, but the association with the sliding clamp prevents the polymerase from diffusing away from the DNA (Figure 8-18). By keeping the DNA polymerase in close proximity to the DNA, the sliding clamp ensures that the DNA polymerase rapidly rebinds the same primer:template junction, vastly increasing the processivity of the DNA polymerase.
Once an ssDNA template is completely copied, the DNA polymerase must be released from this DNA and the sliding clamp to act at a new primer:template junction. This release is accomplished by a change in the affinity between the DNA polymerase and the sliding clamp that depends on the bound DNA. DNA polymerase bound to a primer:template junction has a high affinity for the clamp. In contrast, when the DNA polymerase reaches the end of an ssDNA template (for example, at the end of an Okazaki fragment), a change in the conformation of the DNA polymerase reduces its affinity for the sliding clamp and the DNA (see Figure 8-18). Thus, when a polymerase completes the replication of a stretch of DNA, it is released by the sliding clamp so it can act at a new primer:template junction. The clamp, on the other hand, remains bound to the DNA and can bind other enzymes that act on the newly synthesized DNA (as we describe below).
FIGURE 8-16 DNA polymerase switching during eukaryotic DNA replication. The order of DNA polymerase function is illustrated: RNA primer synthesis by primase, DNA synthesis by Pol α, then DNA Pol δ or ε with the sliding clamp. The length of the DNA synthesized is shorter than in re… between 100 and 10,000 nucleotides. Although both DNA Pol δ and ε can substitute for DNA Pol α/prim…
FIGURE 8-17 Structure of a sliding DNA clamp. (a) Three-dimensional structure of a sliding DNA clamp associated with DNA. The opening through the center of the sliding clamp is about 35 angstroms and the width of the DNA helix is approximately 20 angstroms. This provides enough space to allow a thin layer of one or two water molecules between the sliding clamp and the DNA. This is thought to allow the clamp to slide along the DNA easily. (Krishna T.S., Kong X.-P., Gary S., Burgers P.M., and Kuriyan J. 1994. Cell 79: 1233.) Image prepared with BobScript, MolScript, and Raster 3D. (b) Sliding DNA clamps encircle the newly replicated DNA produced by an associated DNA polymerase. The sliding clamp interacts with the part of the DNA polymerase that is closest to the newly synthesized DNA as it emerges from the DNA polymerase.
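To see why 20-100 base pairs per binding event is inadequate on its own, it helps to count how many times an unclamped polymerase would have to find and rebind the template. This is a back-of-the-envelope sketch: it assumes a fixed number of nucleotides per binding event and assumes the two forks of the 4.6-Mb E. coli chromosome each copy half of it; the function name is hypothetical.

```python
import math

def binding_events(template_length, nucleotides_per_event):
    """Number of separate binding events needed to copy a template if the
    polymerase adds a fixed number of nucleotides before falling off."""
    return math.ceil(template_length / nucleotides_per_event)

# Assumption: each of the two replication forks copies half of the genome.
FORK_SHARE = 4_600_000 // 2

print(binding_events(FORK_SHARE, 20))    # least processive case in the text
print(binding_events(FORK_SHARE, 100))   # most processive case in the text
```

Tens of thousands of independent rebinding events per fork would each require the polymerase to diffuse back to the correct primer:template junction; keeping the enzyme tethered by the clamp removes that search.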
FIGURE 8-18 Sliding DNA clamps increase the processivity of associated DNA polymerases. Panel labels: DNA polymerase rebinds to the same primer:template and continues DNA synthesis; in the absence of a primer:template, DNA polymerase is released from the sliding clamp.
Once released from a DNA polymerase, sliding clamps are not immediately removed from the replicated DNA. Instead, other proteins that must function at the site of recent DNA synthesis interact with the clamp proteins. As described in Chapter 7, enzymes that assemble chromatin in eukaryotic cells are recruited to the sites of DNA replication by an interaction with the eukaryotic sliding DNA clamp (called PCNA). Similarly, eukaryotic proteins involved in Okazaki fragment repair also interact with sliding clamp proteins. In each case, by interacting with sliding clamps, these proteins accumulate at sites of new DNA synthesis where they are needed the most.
Sliding clamp proteins are a conserved part of the DNA replication apparatus derived from organisms as diverse as viruses, bacteria, yeast, and humans. Consistent with their conserved function, the structure of sliding clamps derived from these different organisms is also conserved (Figure 8-19). In each case, the clamp has the same sixfold symmetry and the same diameter. Despite the similarity in overall structure, however, the number of subunits that come together to form the clamp differs.
Sliding Clamps Are Opened and Placed on DNA by Clamp Loaders
The sliding clamp is a closed ring in solution and must open to encircle the DNA double helix. A special class of protein complexes, called sliding clamp loaders, catalyzes the opening and placement of sliding clamps on the DNA. These enzymes couple ATP binding and hydrolysis to the placement of the sliding clamp around primer:template junctions on the DNA (see Box 8-2, ATP Control of Protein Function). The clamp loader also removes sliding clamps from the DNA when they are no longer in use. Like DNA helicases and topoisomerases, these enzymes alter the conformation of their target (the sliding clamp) but not its chemical composition.
What controls when sliding clamps are loaded and removed from the DNA? Loading of a sliding clamp occurs anytime a primer:template junction is present in the cell. These DNA structures are formed not only during DNA replication but also during several DNA repair events (see Chapter 9). A sliding clamp can only be removed from the DNA if it is not being used by another enzyme. Sliding clamp loaders and DNA polymerases cannot interact with a sliding clamp at the same time because they have overlapping binding sites on the same face of the sliding clamp. Thus, a sliding clamp that is bound to a DNA polymerase is not subject to removal from the DNA. Similarly, nucleosome assembly factors, Okazaki fragment repair proteins, and other DNA repair proteins all interact with the same region of the sliding clamp as the clamp loader. Thus, sliding clamps are only removed from the DNA once all the enzymes that interact with them have completed their function.
FIGURE 8-19 The three-dimensional structure of sliding DNA clamps isolated from different organisms. Sliding DNA clamps are found across all organisms and share a similar structure. (a) The sliding DNA clamp from E. coli is composed of two copies of the β protein. (Kong X.-P., Onrust R., O'Donnell M., and Kuriyan J. 1992. Cell 69: 425.) (b) The T4 phage sliding DNA clamp is a trimer of the gp45 protein. (Moarefi I., Jeruzalmi D., Turner J., O'Donnell M., an… is a trimer of the PCNA protein. (Krishna T.S., Kong X.-P., Gary S., Burgers P.M., and Kuriyan J. 1994. Cell 79: 1233.) Images prepared with BobScript, MolScript, and Raster 3D.
DNA SYNTHESIS AT THE REPLICATION FORK
At the replication fork the leading and lagging strands are synthesized simultaneously. This has the important benefit of limiting the amount of ssDNA present in the cell during DNA replication. When an ssDNA region of DNA is broken, there is a complete break in the chromosome that is much more difficult to repair than an ssDNA break in a dsDNA region. Moreover, repair of this type of lesion frequently leads to mutation of the DNA (see Chapter 9). Thus, limiting the time the DNA is in this state is crucial.
To coordinate the replication of both DNA strands, multiple DNA polymerases function at the replication fork. In E. coli the coordinate action of these polymerases is facilitated by physically linking them together in a large multiprotein complex called the DNA Pol III holoenzyme (Figure 8-20). Holoenzyme is a general name for a multiprotein complex in which a core enzyme activity is associated with additional components that enhance function. The DNA Pol III holoenzyme includes two copies of the "core" DNA Pol III enzyme and one copy of the five-protein γ-complex (the E. coli sliding clamp loader). Although present in only one copy, the γ-complex binds to both copies of the core DNA Pol III and is essential to the formation of the holoenzyme (see Figure 8-20).
Box 8-2 ATP Control of Protein Function: Loading a Sliding Clamp
How is ATP binding and hydrolysis coupled to sliding clamp loading? When bound to ATP, the clamp loader can bind and open the sliding clamp ring by causing one of the subunit:subunit interfaces to come apart (Box 8-2 Figure 1). The now open sliding clamp is brought to the DNA through a high-affinity DNA-binding site on the clamp loader. Consistent with the need for sliding clamps at the sites of DNA synthesis, this DNA-binding site specifically recognizes primer:template junctions, but only when the clamp loader is bound to ATP. As the clamp loader binds the primer:template junction, the open sliding clamp is placed around the DNA.
The final steps in sliding clamp loading are stimulated by ATP hydrolysis. Binding of the clamp loader to the primer:template junction activates ATP hydrolysis (by the clamp loader). Because the clamp loader can only bind the sliding clamp and DNA when it is bound to ATP (but not ADP), hydrolysis causes the clamp loader to release the sliding clamp and dissociate from the DNA. Once released from the clamp loader, the sliding clamp spontaneously closes around the DNA. The net result of this process is the loading of the sliding clamp at the site of DNA polymerase action: the primer:template junction. Release of ADP and Pi and binding to a new ATP molecule allows the clamp loader to initiate a new cycle of loading.
The function of the clamp loader illustrates several general features of the coupling of ATP binding and hydrolysis to a molecular event. ATP binding to a protein typically is involved in the assembly stage of the event: the association of the factor with the target molecule. For example, the clamp loader has two target molecules: the sliding clamp and the primer:template junction. ATP is required for the clamp loader to bind to either target. Similarly, ATP binding stimulates the ability of DNA helicases to bind to ssDNA. In each case, the events coupled to ATP binding could be considered the action part of the cycle. For the clamp loader, ATP binding but not ATP hydrolysis is required to open the sliding clamp ring. For the DNA helicase, binding ssDNA is likely to be the key event in unwinding DNA. In these cases, binding to ATP stabilizes a conformation of the enzyme that favors interaction with the substrate in a particular conformation.
What is the role of ATP hydrolysis? ATP hydrolysis typically is involved in the disassembly stage of the event: releasing the bound targets from the enzyme. Once the ATP-stabilized complex is formed, it must be disassembled. This could occur by simple dissociation; however, more often than not this process would return the components to their starting situation (for example, the sliding clamp free in solution), and this process would be slow if the ATP-stabilized complex is tightly associated. To ensure that disassembly occurs at the appropriate time, place, and rate, ATP hydrolysis is used to initiate disassembly. For example, ATP hydrolysis causes the clamp loader to revert to a state in which it cannot bind either the sliding clamp or DNA. Reversion to this ground state may occur while the enzyme is still bound to the products of ATP hydrolysis (ADP and Pi) or may require their release.
The final key mechanism to couple ATP hydrolysis to a reaction pertains to the trigger for ATP hydrolysis. It is critical that the factor not hydrolyze ATP until a desired complex is assembled. Typically, formation of a particular complex triggers ATP hydrolysis. In the case of the clamp loader, this complex is the ternary complex of the sliding clamp, the clamp loader, and the primer:template junction.
Thus, ATP control of these molecular events is most directly related to controlling the timing of conformational changes by the enzyme. By requiring the enzyme to alternate between two conformational states in order and requiring the formation of a key intermediate to trigger ATP hydrolysis, the enzyme can accomplish work. In contrast, if the enzyme merely bound and released ATP (without hydrolysis), the reaction would return to the initial state as often as it would proceed forward and little, if any, work would be accomplished.
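The ordering rules described in this box (ATP binding gates assembly, the complete complex triggers hydrolysis, hydrolysis drives disassembly) can be sketched as a small state machine. This is only an illustrative toy model: the class, method, and attribute names are invented for the example and do not correspond to any real software or to measured kinetics.

```python
from enum import Enum


class Nucleotide(Enum):
    EMPTY = "empty"
    ATP = "ATP"
    ADP = "ADP+Pi"


class ClampLoader:
    """Toy state machine for one clamp loader (all names are hypothetical)."""

    def __init__(self):
        self.nucleotide = Nucleotide.EMPTY
        self.holds_open_clamp = False
        self.on_junction = False
        self.clamps_loaded = 0

    def bind_atp(self):
        # Any bound ADP + Pi is assumed to be released before fresh ATP binds.
        self.nucleotide = Nucleotide.ATP

    def open_clamp(self):
        # Assembly: only the ATP-bound loader can bind and open the ring.
        if self.nucleotide is not Nucleotide.ATP:
            raise RuntimeError("clamp binding requires bound ATP")
        self.holds_open_clamp = True

    def bind_junction(self):
        # Assembly: the DNA-binding site is active only with ATP bound.
        if self.nucleotide is not Nucleotide.ATP:
            raise RuntimeError("DNA binding requires bound ATP")
        self.on_junction = True

    def hydrolyze_atp(self):
        # Trigger: hydrolysis fires only once loader, clamp, and junction
        # are all together in one complex.
        complete = (
            self.nucleotide is Nucleotide.ATP
            and self.holds_open_clamp
            and self.on_junction
        )
        if not complete:
            raise RuntimeError("hydrolysis requires the complete complex")
        self.nucleotide = Nucleotide.ADP
        # Disassembly: the ADP form releases both targets, and the
        # released clamp closes around the DNA.
        self.holds_open_clamp = False
        self.on_junction = False
        self.clamps_loaded += 1


def loading_cycle(loader):
    """Run one full cycle and return the running count of loaded clamps."""
    loader.bind_atp()
    loader.open_clamp()
    loader.bind_junction()
    loader.hydrolyze_atp()
    return loader.clamps_loaded
```

Calling `loading_cycle` repeatedly on the same loader mimics successive rounds of loading, while calling `open_clamp` on a loader that has not bound ATP raises an error, which is the point of the gating rule.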
BOX 8-2 FIGURE 1 ATP control of sliding DNA clamp loading. (a) Sliding clamp loaders are five-subunit protein complexes whose activity is controlled by ATP binding and hydrolysis. In E. coli the clamp loader is called the γ-complex, and in eukaryotic cells it is called replication factor C (RF-C). (b) To catalyze sliding clamp opening, the clamp loader must be bound to ATP. (c) Once bound to ATP, the clamp loader binds the clamp and opens the ring at one of the subunit:subunit interfaces. (d) The resulting complex can now bind to DNA. DNA binding is mediated by the clamp loader, which preferentially binds to primer:template junctions. Correct binding to the DNA has two consequences. First, the opened sliding clamp is positioned so that dsDNA is in what will be the "hole" of the clamp. Second, DNA binding stimulates ATP hydrolysis by the clamp loader. (e) Because only an ATP-bound clamp loader can bind to the clamp and to DNA, the ADP form of the clamp loader rapidly dissociates from the clamp and the DNA, leaving behind a closed clamp positioned around the dsDNA portion of the primer:template junction. (Source: Based on O'Donnell M. et al. 2001. Clamp loader structure predicts the architecture of DNA polymerase III holoenzyme and RFC. Current Biology 11: R942, fig 5. Copyright © 2001 with permission from Elsevier.)
FIGURE 8-20 The composition of the DNA Pol III holoenzyme. There are three enzymes in each copy of the DNA Pol III holoenzyme: two copies of the DNA Pol III core enzyme and one copy of the γ-complex. The γ-complex includes two copies of the τ-protein, each of which includes a domain that interacts with one DNA Pol III core. Analysis of the amino acid sequence of the τ-protein indicates that the DNA Pol III binding region of the protein is separated from the part of the protein involved in clamp loading by an extended flexible linker. This linker is proposed to allow the two polymerases to move in a relatively independent manner that would be necessary for one polymerase to replicate the leading strand and the other to replicate the lagging strand. (Source: Based on O'Donnell M. et al. 2001. Clamp loader structure predicts the architecture of DNA polymerase III holoenzyme and RFC. Current Biology 11: R943, fig 6. Copyright © 2001 with permission from Elsevier.)
How do two DNA polymerases remain linked at the replication fork while synthesizing DNA on both the leading and lagging template strands? A model that explains this proposes that the replication machinery exploits the flexibility of DNA (Figure 8-21). As the helicase unwinds the DNA at the replication fork, the leading strand is rapidly copied while the lagging strand is spooled out as ssDNA that is rapidly bound by SSB. Intermittently, a new RNA primer is synthesized on the lagging strand template. When the lagging strand DNA polymerase completes the previous Okazaki fragment, this polymerase is released from the template. Because this polymerase remains tethered to the leading strand DNA polymerase, it will bind to the primer:template junction nearest the replication fork, the one formed by the newly synthesized RNA primer on the lagging strand. By binding to this RNA primer, the lagging strand polymerase forms a new loop and initiates the next round of Okazaki fragment synthesis. This model is called the "trombone model" in reference to the changing size of the DNA loop formed by the lagging strand template.
DNA replication in eukaryotic cells also requires multiple DNA polymerases. Three different DNA polymerases are present at each replication fork: DNA Pol α/primase, DNA Pol δ, and DNA Pol ε (see Figure 8-16). DNA Pol α/primase initiates new strands and DNA Pol δ and ε extend these strands. Although there is evidence that DNA Pol δ and ε synthesize opposite DNA strands, it remains unclear which polymerase is responsible for leading and which is responsible for lagging strand synthesis. Similarly, the proteins that recruit, maintain, and coordinate the action of these three polymerases at the eukaryotic DNA replication fork remain unknown (the eukaryotic sliding clamp loader, RF-C, does not perform this function).
FIGURE 8-21 The "trombone" model for coordinating replication by two DNA polymerases at the E. coli replication fork. (a) The DNA helicase at the E. coli DNA replication fork travels on the lagging strand template in a 5'→3' direction. The DNA Pol III holoenzyme interacts with the DNA helicase through the τ-subunit, which also binds to both polymerases. One DNA Pol III core is replicating the leading strand and the other DNA Pol III core replicates the lagging strand. SSB coats the ssDNA regions of the DNA (for simplicity, SSB on the lagging strand is only shown in part (a)). (b) Periodically, DNA primase will associate with the DNA helicase and synthesize a new primer on the lagging strand template. (c) When the lagging strand DNA polymerase completes an Okazaki fragment, it is released from the sliding clamp and the DNA. (d) The recently primed lagging strand DNA is then a target of the clamp loader, which assembles a new sliding clamp at the primer:template junction created by synthesizing a new RNA primer. (e) The primer:template junction with its associated sliding clamp binds to the lagging strand DNA polymerase, which initiates DNA synthesis on the next Okazaki fragment. Although this description has concentrated on the more complex action occurring during the synthesis of the lagging strand, during this entire process, new ssDNA template for the leading strand has been generated and rapidly replicated by the leading strand DNA Pol III.
Interactions between Replication Fork Proteins Form the E. coli Replisome
The connections between the components of the DNA Pol III holoenzyme are not the only interactions that occur between the components of the bacterial replication fork. Several protein-protein interactions, beyond those between the components of the Pol III holoenzyme, facilitate rapid replication fork progression. The most important of these is an interaction between the DNA helicase (the hexameric dnaB protein; see Table 8-1) and the DNA Pol III holoenzyme (Figure 8-22). This interaction, which is mediated by the clamp loader component of the holoenzyme, holds the helicase and the DNA Pol III holoenzyme together. In addition, this association stimulates the activity of the helicase by increasing the rate of helicase movement tenfold. Thus, the DNA helicase slows down if it becomes separated from the DNA polymerase (see Figure 8-22). The coupling of helicase activity to the presence of DNA Pol III prevents the helicase from "running away" from the DNA Pol III holoenzyme and thus serves to coordinate these two key replication fork enzymes.
A second important protein-protein interaction occurs between the DNA helicase and primase. Unlike most proteins that act at the E. coli replication fork, primase is not tightly associated with the fork. Instead, at an interval of about once per second, primase
FIGURE 8-22 Binding of the DNA helicase to DNA Pol III holoenzyme stimulates the rate of DNA strand separation. The τ-subunit of the clamp loader interacts with both the DNA helicase and the DNA polymerase at the replication fork. (a) When this interaction is made, the DNA helicase unwinds the DNA at approximately the same rate as the DNA polymerases replicate the DNA. (b) If the DNA helicase is not associated with DNA Pol III holoenzyme, DNA unwinding slows by tenfold. Under these conditions, the DNA polymerases can replicate faster than the DNA helicase can separate the strands of unreplicated DNA. This allows the DNA Pol III holoenzyme to "catch up" to the DNA helicase.
associates with the helic
To fully appreciate the amazing capabilities of the enzymes that replicate DNA, imagine a situation in which a DNA base is the size of your textbook. Under these conditions double-stranded DNA would be approximately one meter in diameter and the E. coli genome would be a large circle about 500 miles (800 km) in circumference. More importantly, the replisome would be the size of a FedEx delivery truck and would be moving at over 600 km/hr (375 mph)! Replicating the E. coli genome would be a 40 minute, 250 mile (400 km) trip for two such machines, each leaving two 1 meter DNA cables in their wake. Impressively, during this trip the replication machinery would, on average, make only a single error.
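The scale-up in this analogy can be checked with a few lines of arithmetic. The sketch below assumes standard approximate values that are not stated in the passage (a duplex diameter of about 2 nm, a rise of about 0.34 nm per base pair, and an E. coli genome of about 4.6 million base pairs); the 40 minute replication time and the two replisomes are taken from the text. The constant and function names are invented for the example.

```python
# Assumed textbook approximations (not given in the passage itself).
DUPLEX_DIAMETER_M = 2.0e-9   # diameter of B-form double-stranded DNA
RISE_PER_BP_M = 0.34e-9      # length of duplex per base pair
GENOME_BP = 4.6e6            # approximate size of the E. coli genome

# Values stated in the passage.
REPLICATION_MIN = 40         # time to replicate the genome
REPLISOMES = 2               # two forks, each copying half the circle


def scaled_trip(target_diameter_m=1.0):
    """Scale DNA so the duplex is target_diameter_m wide.

    Returns (circumference_km, trip_km_per_replisome, speed_km_per_hr).
    """
    scale = target_diameter_m / DUPLEX_DIAMETER_M
    circumference_km = GENOME_BP * RISE_PER_BP_M * scale / 1000.0
    trip_km = circumference_km / REPLISOMES
    speed_km_per_hr = trip_km / (REPLICATION_MIN / 60.0)
    return circumference_km, trip_km, speed_km_per_hr


if __name__ == "__main__":
    circumference, trip, speed = scaled_trip()
    print(f"circumference: {circumference:.0f} km")
    print(f"trip per replisome: {trip:.0f} km")
    print(f"speed: {speed:.0f} km/hr")
```

With these assumed inputs the results land in the same range as the rounded figures quoted in the analogy (a circle of several hundred kilometers and a speed of several hundred kilometers per hour); small differences reflect rounding in the text and the approximate input values.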
INITIATION OF DNA REPLICATION
Specific Genomic DNA Sequences Direct the Initiation of DNA Replication
The initial formation of a replication fork requires the separation of the two strands of the DNA duplex to provide a template for the synthesis of both the RNA primer and new DNA. Although strand separation (also called DNA unwinding) is most easily accomplished at chromosome ends, DNA synthesis generally initiates at internal regions. Indeed, for circular chromosomes, the lack of chromosome ends makes internal DNA unwinding essential to replication initiation. The specific sites at which DNA unwinding and initiation of replication occur are called origins of replication. Depending on the organism, there may be as few as one or as many as thousands of origins per chromosome.
The Replicon Model of Replication Initiation
FIGURE 8-23 The replicon model. Binding of the initiator to the replicator stimulates initiation of replication and the duplication of the associated DNA.
In 1963 François Jacob, Sydney Brenner, and Jacques Cuzin proposed a model to explain the events controlling the initiation of replication in bacteria. They defined all the DNA replicated from a particular origin as a replicon. For example, because the single chromosome found in E. coli cells has only one origin of replication, the entire chromosome is a single replicon. In contrast, the presence of multiple origins of replication divides each eukaryotic chromosome into multiple replicons, one for each origin of replication.
The replicon model proposed two components that controlled the initiation of replication: the replicator and the initiator (Figure 8-23). The replicator is defined as the entire set of cis-acting DNA sequences that is sufficient to direct the initiation of DNA replication. This is in contrast to the origin of replication, which is the site on the DNA where the DNA is unwound and DNA synthesis initiates. Although the origin of replication is always part of the replicator, sometimes (particularly in eukaryotic cells) the origin of replication is only a fraction of the DNA sequences required to direct the initiation of replication (the replicator). The same distinction can be made between a transcriptional promoter and the start site of transcription, as we will see in Chapter 12.
The second component of the replicon model is the initiator protein. This protein specifically recognizes a DNA element in the replicator and activates the initiation of replication (see Figure 8-23). Initiator proteins have been identified in many different organisms, including bacteria, viruses, and eukaryotic cells. Although these proteins are not closely related, they all select the sites that will become origins of replication. As we will see below, the initiator protein is the only sequence-specific DNA-binding protein involved in the initiation of replication. The remaining proteins required for replication initiation do not bind to DNA sequence specifically. Instead, these proteins are recruited to the replicator through a combination of protein-protein interactions and affinity for specific DNA structures (for example, ssDNA or a primer:template junction).
Replicator Sequences Include Initiator Binding Sites and Easily Unwound DNA
The DNA sequences of replicators share two common features (Figure 8-24). First, they include a binding site for the initiator protein that nucleates the assembly of the replication initiation machinery. Second, they include a stretch of AT-rich DNA that unwinds readily but not spontaneously. Unwinding of DNA at replicators is controlled by the replication initiation proteins, and the action of these proteins is tightly regulated in most organisms.
The single replicator required for E. coli chromosomal replication is called oriC. There are two repeated motifs that are critical for oriC function (Figure 8-24a). The 9-mer motif is the binding site for the E. coli initiator, DnaA, and is repeated five times at oriC. The 13-mer motif, repeated three times, is the initial site of ssDNA formation during initiation.
Although the specific sequences are different, the overall structures of replicators derived from many eukaryotic viruses and the single-cell eukaryote S. cerevisiae are similar (Figure 8-24b-c). The methods
FIGURE 8-24 Structure of replicators. The DNA elements that make up three well-characterized replicators are shown. The initiator DNA-binding sites are shown in green, elements that facilitate DNA unwinding in blue, and the site of the first DNA synthesis in red (the site for oriC is outside the sequence shown). Panels: (a) oriC (E. coli), 245 bp; (b) SV40, 65 bp; (c) S. cerevisiae, 100 bp. (a) oriC is composed of four "9-mer" DnaA binding sites and three "13-mer" repeats
BINDING AND UNWINDING: ORIGIN SELECTION AND ACTIVATION BY THE INITIATOR PROTEIN
FIGURE 8-25 Functions of the initiator proteins during the initiation of DNA replication. The three common functions of initiator proteins are illustrated: DNA binding, DNA strand separation, and replication protein recruitment. (Here the recruited protein is illustrated as a DNA helicase; however, the recruited proteins differ for each initiator protein.)
used to define origins of replication are described in Box 8-3, The Identification of Origins of Replication and Replicators.
Replicators found in multicellular eukaryotes are not well understood. Their identification and characterization has been hampered by the lack of genetic assays for stable propagation of small circular DNA comparable to those used to identify origins in single-cell eukaryotes and bacteria (see Box 8-3). In the few instances in which replicators have been identified, they are found to be much larger than the replicators identified in S. cerevisiae and bacterial chromosomes, generally encompassing more than 1,000 bp of DNA. Unlike their smaller counterparts, mutations that eliminate the function of these replicators are not readily isolated, perhaps because important elements within these sequences are redundant.
Initiator proteins typically perform three different functions during the initiation of replication (Figure 8-25). First, these proteins bind a specific DNA sequence within the replicator. Second, once bound to the DNA, they frequently distort or unwind a region of DNA adjacent to their binding site. Third, initiator proteins interact with additional factors required for replication initiation, thus recruiting them to the replicator. Consider, for example, the E. coli initiator protein, DnaA. DnaA binds the repeated 9-mer elements in oriC (see Figure 8-24) and is regulated by ATP. When bound to ATP (but not ADP), DnaA also interacts with DNA in the region of the repeated 13-mer repeats of oriC. These additional
Box 8-3 The Identification of Origins of Replication and Replicators
Replicator sequences are typically identified using genetic assays. For example, the first yeast replicators were identified using a DNA transformation assay (Box 8-3 Figure 1). In these studies, investigators randomly cloned genomic DNA fragments into plasmids lacking a replicator but containing a selectable marker. For the plasmid to be maintained in a cell after transformation, the cloned DNA fragment had to contain a yeast replicator. The identified DNA fragments were called autonomously replicating sequences (ARSs). Although these sequences acted as replicators in the artificial context of a circular plasmid, further evidence was required to demonstrate that these sequences were also replicators in their native chromosomal location. To demonstrate th
identify the location of origins of replication in the cell. One approach to identify origins takes advantage of the unusual structure of the DNA replication intermediates formed during replication initiation. Unlike either fully replicated or fully unreplicated DNA, DNA that is in the process of being replicated is not linear. For example, a DNA fragment (generated by cleavage of the DNA with a restriction enzyme) that does not contain an origin of replication will take on a variety of "Y-shaped" conformations as it is replicated (Box 8-3 Figure 2, blue DNA fragments). Similarly, immediately after the initiation of replication, a DNA fragment containing an origin of replication will take on a "bubble" shape. Finally, if the origin of replication is located asymmetrically within the DNA fragment, the DNA will start out as a bubble shape then convert to a Y-shape (Box 8-3 Figure 2, red DNA fragments). These unusually shaped DNAs can be distinguished from the majority of linear DNA using two-dimensional agarose gel electrophoresis and, when they are seen, can provide clear evidence of an origin of replication (Box 8-3 Figure 3).
To identify DNA that is in the process of replicating, DNA derived from dividing cells is first cut with a restriction enzyme and separated on a two-dimensional agarose gel. In the first dimension, the DNA is separated by size and shape and in the second dimension, the DNA is separated primarily by size. This is accomplished by using different gel density and electrophoresis rates for each dimension. To separate by size and shape, the agarose gel pores are small and the rate of electrophoresis is fast. In contrast, to separate primarily by size, the agarose gel pores are larger and the rate of electrophoresis is slower. Once electrophoresis is complete, the DNA molecules are transferred to nitrocellulose and detected by Southern blotting (see Chapter 20). The choice of the restriction enzyme and DNA probe used can dramatically affect the outcome of the analysis. In general this method requires that the investigator already have significant information about the location of a potential origin of replication.
How can the two-dimensional gels identify the DNA intermediates associated with a replication origin? The particular pattern of DNA migration can lead to unequivocal evidence of an origin of replication. The most unusual structures migr
(a small circular DNA molecule) containing a selectable marker is cut with a restriction enzyme that results in the excision of the plasmid's normal replicator. This leaves a DNA fragment that lacks a replicator. To isolate a replicator from a particular organism, the DNA from that organism is cut with the same restriction enzyme and ligated into the cut plasmid to recreate circular plasmids, each including a single fragment derived from the test organism. This DNA is then transformed into the host organism and the recombinant plasmids are selected using a selectable marker on the plasmid (for example, if the marker conferred antibiotic resistance, the cells would be grown in the presence of the antibiotic). Cells that grow are able to maintain the plasmid and its selectable marker, indicating that the plasmid can replicate in the cell and must contain a replicator. Isolation of the plasmid from the host cell and sequencing of the inserted DNA allows the identification of the sequence of the fragment that contains the replicator. Further mutagenesis of the inserted DNA (such as deletion of specific regions of the inserted DNA), followed by a repetition of the assay
ment in an arc that eventually reaches a location that a linear molecule twice the size of the unreplicated DNA would be expected to migr
[Box 8-3 Figure 1 panel labels: DNA lacking a replicator; cut with restriction enzyme; ligate; transform DNA into cells; plate cells on selectable media; cells that contain plasmid DNA with replicator grow; isolate DNA from cells that grow; insert DNA includes replicator.]
BOX 8-3 FIGURE 2 DNA that is in the process of replication has an unusual structure. Results of restriction enzyme cleavage of DNA in the process of replication are shown. The illustration shows the growth of a "replication bubble" (created by two replication forks progressing away from an origin of replication). The consequences of cutting these replication intermediates are followed by detection by hybridization with the indicated labeled DNA probe. If the red restriction enzyme is used and only the fragments that hybridize to the red DNA probe are examined, the pattern on the left side will be generated. If the blue restriction enzyme and the blue DNA probe are used to detect the resulting DNA fragments, the pattern on the right will be observed. Note that the left-hand pattern starts with a DNA fragment containing a "bubble" and eventually ends with "Y-shaped" molecules. The right-hand pattern never has a "bubble" but does assume a full variety of "Y-shaped" intermediates. Only a DNA fragment containing an origin of replication can produce the pattern on the left.
BOX 8-3 FIGURE 3 Molecular identification of an origin of replication. (a) By electrophoretically separating DNA in two dimensions, DNA in the process of replication can be separated from fully replicated or unreplicated DNA. Total DNA is isolated from dividing (and therefore, replicating) cells. The DNA is separated first by size and shape (using high voltage electrophoresis through relatively small pores). Then the electric field is rotated by 90° and the DNA is separated predominantly by size (electrophoresed with low voltage in large pore agarose). Southern analysis is used to detect the DNA of interest. The three different patterns that can be observed are illustrated. The largest replication bubbles migrate the slowest in the first dimension (c) and Y-shaped molecules with nearly equal length arms migrate the next slowest (b). Because the "Y-arc" and "bubble-arc" patterns are difficult to distinguish, the "bubble- to Y-arc" pattern (d) is considered the most indicative of an origin
interactions result in the separation of the DNA strands over more than 20 bp within the 13-mer repeat region. This unwound DNA provides an ssDNA template for additional replication proteins to begin the RNA and DNA synthesis steps of replication (see below).
The formation of ssDNA at a site in the chromosome is not sufficient for the DNA helicase and other replication proteins to assemble. Rather, DnaA recruits additional replication proteins to the ssDNA formed at the replicator, including the DNA helicase (see below). The regulation of E. coli replication is linked to the control of DnaA activity and is discussed in Box 8-4, E. coli DNA Replication Is Regulated by DnaA·ATP Levels and SeqA.
In eukaryotic cells, the initiator is a six-protein complex called the origin recognition complex (ORC). The function of ORC is best understood in yeast cells. ORC recognizes a conserved sequence found in yeast replicators, called the A-element, as well as a second less conserved B1-element (see Figure 8-24). Like DnaA, ORC binds and hydrolyzes ATP. ATP binding is required for sequence-specific DNA binding at the origin. Unlike DnaA, binding of ORC to yeast replicators does not itself direct strand separation of the adjacent DNA. ORC is, however, required to recruit all the remaining replication proteins to the replicator (see below). Thus, ORC performs two of the three functions common to initiators: binding to the replicator and recruiting other replication proteins to the replicator.
Protein-Protein and Protein-DNA Interactions Direct the Initiation Process
Once the initiator binds to the replicator, the remaining steps in the initiation of replication are largely driven by protein-protein interactions and protein-DNA interactions that are sequence independent. The end result is the assembly of two replication fork machines that we described earlier. To explore the events that produce these protein machines, we first turn to E. coli, in which they are understood in the most detail.
After the initiator (DnaA) has bound to oriC and unwound the 13-mer DNA, the combination of ssDNA and DnaA recruits a complex of two proteins: the DNA helicase, DnaB, and helicase loader
Box 8-4 E. coli DNA Replication Is Regulated by DnaA·ATP Levels and SeqA
In all organisms it is critical that replication initiation is tightly controlled to ensure that chromosome number and cell number remain appropriately balanced. Although this balance is most tightly regulated in eukaryotic cells (see below), E. coli also prevents runaway chromosome duplication by inhibiting recently initiated origins from re-initiating. Several different mechanisms act to prevent rapid replication re-initiation from oriC.
One method exploits changes in the methylated state of the DNA before and after DNA replication (Box 8-4 Figure 1). In E. coli cells an enzyme called Dam methyl transferase adds a methyl group to the A within every GATC sequence (note that the sequence is a palindrome). Typically the genome is fully methylated at GATC sequences. This situation is changed after each GATC sequence is replicated. Because the A residues in the newly synthesized DNA strands are unmethylated, those sites that have been recently replicated will be methylated on only one strand (referred to as hemimethylated).
The hemimethylated state of the newly replicated oriC is detected by a protein called SeqA. SeqA binds tightly to the GATC sequence, but only when it is hemimethylated. There is an abundance of GATC sequences immediately adjacent to oriC. Once replication has initiated, SeqA binds to these sites before they can become fully methylated by the Dam methyl transferase. Binding of SeqA has two consequences. First, it dramatically reduces the rate at which the bound GATC sites are
BOX 8-4 FIGURE 1 SeqA bound to hemimethylated DNA inhibits re-initiation from recently replicated daughter origins. (a) Prior to DNA replication, GATC sequences throughout the E. coli genome are methylated on both strands ("fully" methylated). Note that throughout the figure, the methyl groups are represented by red hexagons. (b) DNA replication converts these sites to the hemimethylated state (only one strand of the DNA is methylated). (c) Hemimethylated GATC sequences are rapidly bound by SeqA. (d) Bound SeqA protein inhibits the full methylation of these sequences and the binding of oriC by DnaA protein (for simplicity, only one of the two daughter molecules is illustrated in parts d, e, and f). (e) When SeqA infrequently dissociates from the GATC sites, the sequences can become fully methylated by Dam DNA methyl transferase, preventing rebinding by SeqA. (f) When the GATC sites become fully methylated, DnaA can bind and direct a new round of replication from the daughter oriC replicators.
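The sequence of states in this figure (fully methylated, then hemimethylated after replication, then SeqA-bound, then fully methylated again) can be sketched as a small data model. This is an illustrative simplification: the class and function names are invented for the example, and bound SeqA is treated as blocking Dam completely, whereas the text says only that it greatly slows methylation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatcSite:
    """Methylation status of the A on each strand of one GATC site."""
    top_methylated: bool
    bottom_methylated: bool

    @property
    def state(self):
        count = int(self.top_methylated) + int(self.bottom_methylated)
        return ("unmethylated", "hemimethylated", "fully methylated")[count]


def replicate(site):
    """Return the two daughter sites produced by replication.

    Each daughter keeps one parental strand (with its methyl group, if
    any) and gains one newly synthesized, unmethylated strand.
    """
    daughter_1 = GatcSite(site.top_methylated, False)
    daughter_2 = GatcSite(False, site.bottom_methylated)
    return daughter_1, daughter_2


def seqa_binds(site):
    # SeqA binds GATC tightly only when the site is hemimethylated.
    return site.state == "hemimethylated"


def dam_methylate(site, seqa_bound):
    # Simplifying assumption: bound SeqA fully blocks Dam methyl transferase.
    if seqa_bound:
        return site
    return GatcSite(True, True)


def origin_is_blocked(sites):
    # Re-initiation is inhibited while any origin-proximal site can hold SeqA.
    return any(seqa_binds(site) for site in sites)
```

For example, replicating a fully methylated site yields two hemimethylated daughters for which `origin_is_blocked` is true; only after SeqA is absent and `dam_methylate` restores full methylation does the block lift.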
methylaled. Secand, when bound lo Ihese criC proximal sites, SeqA prevents DnaA from associating with oriC and initiating a new roune! of replication. lhus, the conversion of Ihe criC· proximal CATC sites from melhylated 10 hemimethylated (an event tila! is a dired oonsequence of initiatíon of replicatíon from oriC) leads to the inhibition of DnaA binding and, theretore, prevents rapid re-initiatíon of replication from the two neMy synthesized daughter ropies of criC DnaA is targeted by other mechanisms thal inhibit rapid re-initiation al Ihe ne.\'1y synthesized dau~ter oopies of orje As described abole, only DnaA boone! to ATP can dired initia600 of rep1ication; hONeVef, this bound ATP is converted lo AOP during the initiation process. lhus. the process of directing a round of replication initiation inactivates OnaA preventing 'its reuse. lhe process of exchanging the bound ADP fOI an ATP is a slCJVII ene. further delaying the accumutation of replication
9-mer bine!ing sites ootside of oriC (DnaA .lIso acts as a transcriptimal regulator at a number of prcrnoters), ane! as they are replicated, this number doubles. lhe inoease in OnaA bine!ing sites acts ro reduce the levels of available DnaA. Together these methods rapidly and dramaticaUy reduce the ability of E. coIi to initiate replication frcm new copies of oriC A1though these mechanisms preven! rapid re-initiatioo, this
inhibition does no!: necessarily last until cell division is oomplete. lndeed, for E coIi cells to divide at the maximum rate, the daul#1ter cq:>ies of oriC mus! initiate replicatim prior to the canpletirn of the previous round of replication. lhis is because E coN ceUs can divide every 20 minules bu! it takes lTIO'e than 40 minules lo replicate the E ooIi ~nome. lhus, under rapie! grONth conditioos, E coIi cells re-initiate replication once and scrnetimes twice prior to me cornpletion oi previous rounds of replication (Box &4 Figure 2). Even under sudl rapid growth cooditions, initiatJon does rol OCQJr more than once per round of cell division. Thus. fer each round d ceU division, mere is only one round of replication inmation from oriC.
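The arithmetic behind "once and sometimes twice" can be made explicit with a small bookkeeping sketch. It assumes, purely for illustration, that initiation occurs exactly once per division interval at evenly spaced times; the function name and this timing model are not from the text.

```python
import math


def reinitiations_before_completion(replication_min: float, division_min: float) -> int:
    """Count the initiations that must start while an earlier round is still running.

    Simplifying assumption (not from the text): one initiation per division
    interval, at evenly spaced times.
    """
    if replication_min <= 0 or division_min <= 0:
        raise ValueError("times must be positive")
    # ceil(C / tau) rounds are in progress at once; all but the oldest of
    # them began before that oldest round had finished.
    overlapping_rounds = math.ceil(replication_min / division_min)
    return max(overlapping_rounds - 1, 0)


if __name__ == "__main__":
    for replication_time in (40, 45):
        n = reinitiations_before_completion(replication_time, 20)
        print(f"{replication_time} min to replicate, 20 min to divide: {n} re-initiation(s)")
```

With a 20-minute division time, a replication time of exactly 40 minutes gives one overlapping re-initiation, and anything slightly longer gives two, which matches the range quoted in the box.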
BOX 8-4 FIGURE 2 Origins of replication re-initiate replication prior to cell division in rapidly growing cells. To allow the genome to be fully replicated prior to each round of cell division, bacterial cells frequently have to initiate DNA replication from their single origin prior to the completion of cell division. This means that the chromosomes that are segregated ...
DnaC (Figure 8-26). Both proteins are present in six copies within the complex. The DNA helicase is maintained in an inactive state in the helicase/helicase loader complex. Once bound to the ssDNA at the origin, the helicase loader directs the assembly of its associated DNA helicase around the ssDNA (recall that ssDNA passes through the middle of the helicase's hexameric protein ring). This process is analogous to the assembly of sliding DNA clamps around a primer:template junction. Upon completion of this task, the helicase loader is released, activating the helicase. One helicase is loaded onto each of the two separated ssDNA strands at the origin, and the orientation of these two helicases is such that they will proceed toward each other as they move with a 5'→3' polarity along their associated ssDNAs.
The Replication of DNA
FIGURE 8-26 A model for E. coli initiation of DNA replication. The major events in the E. coli initiation of replication are illustrated. (a) Multiple DnaA·ATP proteins bind to the repeated 9-mer sequences within oriC. (b) Binding of DnaA·ATP to these sequences leads to strand separation within the 13-mer repeats. This is mediated by an ssDNA binding domain in DnaA·ATP. (c) DNA helicase (DnaB) and the DNA helicase loader (DnaC) associate with the DnaA-bound origin. An ssDNA binding domain in the helicase loader as well as protein-protein interactions with DnaA are required to form this complex. (d) DNA helicase loaders catalyze the opening of the DNA helicase protein ring and placement of the ring around the ssDNA at the origin. Loading of the DNA helicase leads to the dissociation of the helicase loader from the replicator and activates the DNA helicases. (e) The DNA helicases each recruit a DNA primase, which synthesizes an RNA primer on each template. The movement of the DNA helicases also removes any remaining DnaA bound to the replicator. (f) The newly synthesized primers are recognized by the clamp loader components of two DNA Pol III holoenzymes. Sliding clamps are assembled on each RNA primer, and leading strand synthesis is initiated by one of the two core DNA Pol III enzymes of each holoenzyme. (g) After each DNA helicase has moved approximately 1,000 bases, a second RNA primer is synthesized on each lagging strand template and a sliding clamp is loaded. The resulting primer:template junction is recognized by the second DNA Pol III core enzyme in each holoenzyme, resulting in the initiation of lagging strand synthesis. (h) Leading and lagging strand synthesis is now initiated at each replication fork and continues to the end of the template or until another replication fork from an adjacent origin of replication is reached.
The protein-protein interactions between the helicase and other components of the replication fork described above direct the assembly of the rest of the replication machinery (see Figure 8-26). Helicase recruits DNA primase to the origin DNA, resulting in the synthesis of an RNA primer on each strand of the origin. The DNA Pol III holoenzyme is brought to the origins through interactions with the primer:template junction and the helicase. Once the holoenzyme is present, sliding clamps are assembled on the RNA primers, and the leading strand polymerases are engaged. As new ssDNA is exposed by the action of the helicase, it is bound by SSB and DNA primase synthesizes the first lagging strand primers. These new primer:template junctions are targeted by the clamp loaders, which place two additional sliding clamps on the lagging strands. These clamps are recognized by the remaining unengaged core DNA Pol III enzymes, resulting in the initiation of lagging strand DNA synthesis. At this point, two replication forks have been assembled and initiation of replication is complete (exactly how the two replication forks are assembled is a matter of debate; see Box 8-5, The Replication Factory Hypothesis).
Box 8-5 The Replication Factory Hypothesis
There are two ways to think of the relative motion of the DNA and the replication machinery (Box 8-5 Figure 1). One simple view is that the replication machinery moves along the DNA in a manner analogous to a train moving along its tracks, replicating both strands of the approaching DNA. In this traditional view, the DNA helicases pass by one another immediately after loading and subsequently act independently from one another at the two new replication forks. An alternative view suggests that the DNA moves while the replication machinery is static, similar to film moving into a movie projector. Mechanistically, it has been proposed that the two DNA helicases do not pass by each other but instead "run into each other" and remain associated for the remainder of the replication process.

The view of replication occurring at static sites has become increasingly favored. Studies of bacterial DNA replication clearly indicate that the replication machinery remains in a single location within the cell during DNA synthesis. Instead of the replication machinery moving, the DNA moves in and out of this "replication factory" and in the process is duplicated. Similarly, replication in eukaryotic cells is observed to occur at discrete sites within the cell nucleus. Studies of the helicases that function at replication forks also support a static replication machinery. Several hexameric DNA helicases form double hexamers. This suggests that rather than the two hexameric helicases rapidly separating from each other after initiation (as suggested by the "railroad" model), they remain together throughout the replication process.

These two views of the assembly of the replication fork also have interesting consequences concerning the DNA that is replicated by each DNA Pol III holoenzyme. If the DNA helicases pass by one another immediately after they are loaded, then the closest strands that can be replicated simultaneously by the two polymerases of the DNA Pol III holoenzyme will be the Watson and Crick strands of the most recently unwound DNA (Box 8-5 Figure 1, left panel). In contrast, if the two helicases remain associated after initiation, then it is possible that the lagging strand DNA polymerases of the DNA Pol III holoenzyme could associate with either of two primed templates, since they are now both nearby. By most estimations, in this scenario, the choice will be for each DNA Pol III holoenzyme to have the same template strand for the leading and lagging strand synthesis. That is, one core enzyme will replicate the "Crick" strand of the DNA and the other will replicate the "Watson" strand of the DNA (Box 8-5 Figure 1, right panel).
BOX 8-5 FIGURE 1 In the left panel, the two DNA helicases function independently. In the right panel, the two DNA helicases remain associated with one another. Note that in the right panel one DNA Pol III holoenzyme uses only the Watson strand as a template and the other uses only the Crick strand as a template. For simplicity, the DNA Pol III is not shown associated with the DNA helicases. Panel labels: helicases associate with each other; DNA primase makes leading strand primer; DNA polymerase III holoenzyme binds RNA primer; DNA primase makes first lagging strand primer; second DNA polymerase III core enzyme binds nearest primer.
Eukaryotic Chromosomes Are Replicated Exactly Once per Cell Cycle

As discussed in Chapter 7, the events required for eukaryotic cell division occur at distinct times during the cell cycle. Chromosomal DNA replication occurs only during the S phase of the cell cycle. During this time, all the DNA in the cell must be duplicated exactly once. Incomplete replication of any part of a chromosome causes inappropriate links between daughter chromosomes. Segregation of linked chromosomes causes chromosome breakage or loss (Figure 8-27). Re-replication of DNA can also have severe consequences, increasing the number of copies of particular regions of the genome. Addition of even one or two more copies of critical regulatory genes can lead to catastrophic defects in gene expression, cell division, or the response to environmental signals. Thus, it is critical that every base pair in each chromosome is replicated once and only once each time a eukaryotic cell divides.

The need to replicate the DNA once and only once is a particular challenge for eukaryotic chromosomes because they each have many origins of replication. First, enough origins must be activated to ensure that each chromosome is fully replicated during each S phase. Typically, not all potential origins need to be activated to complete replication but, if too few are activated, regions of the genome will escape replication (see Figure 8-27). Second, although some potential origins may not be used in any given round of cell division, no origin of replication can initiate after it has been replicated. Thus, whether an origin is activated to cause its own replication or replicated by a replication fork derived from an adjacent origin, it must be inactivated until the next round of cell division (Figure 8-28). If these conditions were not true, the DNA associated with an origin could be replicated twice in the same cell cycle.
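The first requirement above, that enough origins fire to cover the whole chromosome within S phase, can be illustrated with a toy interval calculation. This is only a sketch: the coordinates, the `fork_reach` parameter (how far a fork can travel before S phase ends), and the function name are hypothetical and chosen for illustration.

```python
def unreplicated_regions(length, fired_origins, fork_reach):
    """Return the (start, end) gaps left unreplicated on a linear chromosome.

    Hypothetical model: every fired origin launches two forks that each
    travel at most `fork_reach` units before S phase ends.
    """
    # Region copied by the two forks from each fired origin, clipped to the ends.
    covered = sorted(
        (max(0, pos - fork_reach), min(length, pos + fork_reach))
        for pos in fired_origins
    )
    gaps = []
    cursor = 0  # everything to the left of cursor is already replicated
    for start, end in covered:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < length:
        gaps.append((cursor, length))
    return gaps


if __name__ == "__main__":
    print(unreplicated_regions(100, [20, 80], 25))      # too few origins fired
    print(unreplicated_regions(100, [20, 50, 80], 25))  # one more origin closes the gap
```

In this sketch, firing origins at positions 20 and 80 leaves the stretch between 45 and 55 uncopied, while also firing the origin at 50 replicates the whole molecule. An origin lying inside a region covered by a neighbor's forks is replicated passively and does not need to fire.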
Pre-Replicative Complex Formation Directs the Initiation of Replication in Eukaryotes

The initiation of replication in eukaryotic cells requires two steps to occur at distinct times in the cell cycle (see Chapter 7): replicator selection and origin activation. Replicator selection is the process of identifying sequences that will direct the initiation of replication
FIGURE 8-27 Chromosome breakage as a result of incomplete DNA replication. This illustration shows the consequences of incomplete replication followed by chromosome segregation. The top of each illustration shows the entire chromosome. The bottom shows the details of the chromosome breakage at the DNA level. (For the details of chromosome segregation, see Chapter 7.) As the chromosomes are pulled apart, stress is placed ...
FIGURE 8-28 Replicators are inactivated by DNA replication. A chromosome with five replicators is shown. The replicators labeled 3 and 5 are the first to be activated, leading to the formation of two pairs of bidirectional replication forks. Activation of the parental replicator results in the inactivation of the copies of each replicator on both daughter DNA molecules until the next cell cycle (indicated by a red X). Further extension of the resulting replication forks replicates the DNA overlapping with the number 2 ... Panel labels: origins 3 and 5 initiate; origin 1 initiates; origin 2 is passively replicated.
and occurs in G1 (prior to S phase). This process leads to the assembly of a multiprotein complex at each replicator in the genome. Origin activation only occurs after cells enter S phase and triggers the replicator-associated protein complex to initiate DNA unwinding and DNA polymerase recruitment. The separation of replicator selection and origin activation is different from the situation in prokaryotic cells, where the recognition of replicator DNA is intrinsically coupled to DNA unwinding and polymerase recruitment. As we will see below, the temporal separation of these two events in eukaryotic cells ensures that each chromosome is replicated only once during each cell cycle (bacterial cells solve this problem differently; see Box 8-4, E. coli DNA Replication Is Regulated by DnaA·ATP Levels and SeqA).

Replicator selection is mediated by the formation of pre-replicative complexes (pre-RCs) (Figure 8-29). The pre-RC is composed of four separate proteins that assemble in an ordered fashion at each replicator. The first step in the formation of the pre-RC is the recognition of the replicator by the eukaryotic initiator, ORC. Once ORC is bound, it recruits two helicase loading proteins (Cdc6 and Cdt1). Together, ORC and the loading proteins recruit a protein that is thought to be the eukaryotic replication fork helicase (the Mcm2-7 complex). Formation of the pre-RC does not lead to the immediate unwinding of origin DNA or the recruitment of DNA polymerases. Instead, the pre-RCs that are formed during G1 are only activated to initiate replication after cells pass from the G1 to the S phase of the cell cycle.

Pre-RCs are activated to initiate replication by two protein kinases (Cdk and Ddk; Figure 8-30). Kinases are proteins that covalently attach phosphate groups to target proteins (see Chapter 5). Each of these kinases is inactive in G1 and is activated only when cells enter S phase. Once activated, these kinases target the pre-RC and other replication proteins. Phosphorylation of these proteins results in the assembly of additional replication proteins at the
origin and the initiation of replication (see Figure 8-30). These new proteins include the three eukaryotic DNA polymerases and a number of other proteins required for their recruitment. Interestingly, the polymerases assemble at the origin in a particular order. DNA Pol δ and ε associate first, followed by DNA Pol α/primase. This order ensures that all three DNA polymerases are present at the origin prior to the synthesis of the first RNA primer (by DNA Pol α/primase).

Only a subset of the proteins that assemble at the origin go on to function as part of the eukaryotic replisome. In addition to the three DNA polymerases, the Mcm complex and many of the factors required for DNA polymerase recruitment become part of the replication fork machinery. Similar to the E. coli DNA helicase loader (DnaC), the other factors (such as Cdc6 and Cdt1) are released or destroyed after their role is complete (see Figure 8-30).

FIGURE 8-29 The steps in the formation of the pre-replicative complex (pre-RC). The assembly of the pre-RC is an ordered process that is initiated by the association of the origin recognition complex with the replicator. Once bound to the replicator, ORC recruits at least two additional proteins, Cdc6 and Cdt1. These three proteins function together to recruit the putative eukaryotic DNA helicase, the Mcm2-7 complex, to complete the formation of the pre-RC.

Pre-RC Formation and Activation Is Regulated to Allow Only a Single Round of Replication during Each Cell Cycle

How do eukaryotic cells control the activity of hundreds or even thousands of origins of replication such that not even one is activated more than once during a cell cycle? The answer lies in the tight regulation of the formation and activation of pre-RCs by cyclin-dependent kinases (Cdks). Cdks play two seemingly contradictory roles in regulating pre-RC function (Figure 8-31). First, as we described above, they are required to activate pre-RCs to initiate DNA replication. Second, Cdk activity inhibits the formation of new pre-RCs.
FIGURE 8-30 Activation of the pre-RC leads to the assembly of the eukaryotic replication fork. As cells enter into the S phase of the cell cycle, Cdk and Ddk phosphorylate replication proteins to trigger the initiation of replication. The events that lead to DNA unwinding at the origin are poorly understood but are likely to require the activity of the Mcm complex and result in the recruitment of a number of auxiliary replication factors and DNA Pol δ and ε. DNA Pol α/primase is only recruited after DNA Pol δ and ε. Once present at the origin, DNA Pol α/primase synthesizes an RNA primer and briefly extends it. The resulting primer:template junction is recognized by the eukaryotic sliding clamp loader (RF-C), which assembles a sliding clamp (PCNA) at these sites. Either DNA Pol δ or ε recognizes this primer and begins leading strand synthesis. After a period of DNA unwinding, DNA Pol α/primase synthesizes additional primers, which allow the initiation of lagging strand DNA synthesis by either DNA Pol δ or ε. Here we illustrate Pol δ on the leading strand and Pol ε on the lagging strand.
FIGURE 8-31 Effect of Cdk activity on pre-RC formation and activation. High Cdk activity is required for existing pre-RC complexes to initiate DNA replication. These same elevated levels of Cdk activity completely inhibit the formation of new pre-RC complexes. In contrast, low Cdk activity is conducive to new pre-RC formation but is inadequate to trigger DNA replication initiation by the newly formed pre-RC complexes.
The tight connection between pre-RC function, Cdk levels, and the cell cycle ensures that the eukaryotic genome is replicated only once per cell cycle (Figure 8-32). Active Cdk is absent during G1, whereas elevated levels of Cdk are present during the remainder of the cell cycle
FIGURE 8-32 Cell cycle regulation of Cdk activity and pre-RC formation. In G1, Cdk levels are low and new pre-RC complexes can form but cannot be activated. During S phase, the elevated levels of Cdk activity trigger the initiation of DNA replication and prevent any new pre-RC complex formation on newly replicated DNA. Once a pre-RC is used for the initiation of replication, it is necessarily dismantled (recall that at least one key component of the pre-RC, the Mcm complex, becomes part of the replication fork). Similarly, replication of pre-RC associated DNA also causes destruction of the complex (not shown). Because Cdk levels remain high until the end of mitosis, no new pre-RC complexes can be formed until chromosome segregation is complete. Without new pre-RC complexes, re-initiation is impossible.
(S, G2, and M phases). Thus, during each cell cycle there is only one opportunity for pre-RCs to form (during G1) and only one opportunity for those pre-RCs to be activated (during S, G2, and M, although in practice all pre-RCs are activated or disrupted by replication forks in S phase). Pre-RCs are disassembled after they are activated or after the DNA to which they are bound is replicated. These exposed replicators are then available for new pre-RC formation and rapidly bind to ORC. Despite the presence of the initiator at these sites, the elevated levels of Cdk activity in S, G2, and M phase cells prevent the association of the other members of the pre-RC complex with ORC. It is only when cells segregate their chromosomes and complete cell division that Cdk activity is eliminated and new pre-RC complexes can form.
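The once-per-cycle logic described above can be sketched as a small state machine. This is an illustrative simplification, not a model from the text: Cdk activity is treated as a simple low/high switch, every licensed replicator is assumed to fire in S phase, passive replication is not modeled, and the class and function names are invented for the example.

```python
# Cdk activity by phase, treated as a simple on/off switch for illustration.
CDK_HIGH = {"G1": False, "S": True, "G2": True, "M": True}


class Replicator:
    def __init__(self, name):
        self.name = name
        self.has_pre_rc = False
        self.times_fired = 0

    def try_form_pre_rc(self, cdk_high):
        # New pre-RCs can only assemble while Cdk activity is low (G1).
        if not cdk_high:
            self.has_pre_rc = True

    def try_fire(self, cdk_high):
        # Activation needs both an existing pre-RC and high Cdk activity.
        # Firing dismantles the pre-RC, so it cannot be reused.
        if cdk_high and self.has_pre_rc:
            self.has_pre_rc = False
            self.times_fired += 1


def run_cell_cycle(replicators):
    """Walk every replicator through G1, S, G2, and M once."""
    for phase in ("G1", "S", "G2", "M"):
        cdk_high = CDK_HIGH[phase]
        for replicator in replicators:
            replicator.try_form_pre_rc(cdk_high)
            replicator.try_fire(cdk_high)


if __name__ == "__main__":
    origins = [Replicator(f"ori{i}") for i in range(1, 6)]
    run_cell_cycle(origins)
    print([(o.name, o.times_fired) for o in origins])
```

Because formation and activation are gated by opposite Cdk states, a replicator that has fired cannot acquire a new pre-RC until the next G1, so each one fires at most once per pass through the cycle.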
Similarities between Eukaryotic and Prokaryotic DNA Replication Initiation

Now that we have described initiation in eukaryotes and prokaryotes, it is clear that the general principles of replication initiation are the same in both cases. The first step is the recognition of the replicator by the initiator protein. The initiator protein, in combination with one or more helicase loading proteins, recruits the DNA helicase to the replicator. The helicase (and potentially other proteins at the origin in eukaryotes) generates a region of ssDNA that can act as a template for RNA primer synthesis. Once primers are synthesized, the remaining components of the replisome assemble through interactions with the resulting primer:template junction.
FINISHING REPLICATION

Completion of DNA replication requires a set of specific events. These events are different for circular versus linear chromosomes. For a circular chromosome, the conventional replication fork machinery can replicate the entire molecule, but the resulting daughter molecules are topologically linked to one another. In contrast, replication of the very ends of linear chromosomes cannot be completed by the replication fork machinery we have discussed so far. Therefore, organisms containing linear chromosomes have developed novel strategies to overcome this end replication problem.
Type II Topoisomerases Are Required to Separate Daughter DNA Molecules
FIGURE 8-33 Topoisomerase II catalyzes the decatenation of replication products. After a circular DNA molecule is replicated, the resulting complete daughter DNA molecules remain linked to one another.
After replication of a circular chromosome is complete, the resulting daughter DNA molecules remain linked together as catenanes (Figure 8-33). Catenane is the general term for two circles that are linked (similar to links in a chain). To segregate these chromosomes into separate daughter cells, the two circular DNA molecules must be disengaged from one another. This separation is accomplished by the action of type II topoisomerases. As we saw in Chapter 6, these enzymes have the ability to break a double-stranded DNA molecule and pass a second double-stranded DNA molecule through this break. Thus, type II topoisomerases catalyze a break in one of the two daughter molecules and allow the second daughter molecule to pass through the break. This reaction decatenates the two daughter chromosomes, allowing their segregation into separate cells.
Although the importance of this activity for the separation of circular chromosomes is most clear, the activity of type II topoisomerases is also critical to the segregation of large linear molecules. Although there is no inherent topological linkage after the replication of a linear molecule, the large size of eukaryotic chromosomes necessitates the intricate folding of the DNA into loops attached to a protein scaffold. These attachments lead to many of the same problems that circular chromosomes have when the two daughter chromosomes must be separated.
Lagging Strand Synthesis Is Unable to Copy the Extreme Ends of Linear Chromosomes

The requirement for an RNA primer to initiate all new DNA synthesis creates a dilemma for the replication of the ends of linear chromosomes. This is called the end replication problem (Figure 8-34). This difficulty is not observed during the duplication of the leading strand template. In that case, a single internal RNA primer can direct the initiation of a DNA strand that can be extended to the extreme 5' terminus of its template. In contrast, the requirement for multiple primers to complete lagging strand synthesis means that a complete copy of its template cannot be made. Even if the end of the last RNA primer for Okazaki fragment synthesis anneals to the final base pairs of the lagging strand template, once this RNA molecule is removed, there will remain a short region of unreplicated ssDNA at the end of the chromosome. This means that each round of DNA replication would result in the shortening of one of the two daughter DNA molecules. Obviously,
FIGURE 8-34 The end replication problem. As the lagging strand replication machinery reaches the end of the chromosome, at some point primase no longer has sufficient space to synthesize a new RNA primer. This results in incomplete replication and a short ssDNA region at the 3' end of the lagging strand DNA product. When this DNA product is replicated in the next round, one of the two products will be shortened and will lack the region that was not fully copied in the previous round of replication.
FIGURE 8-35 Protein priming as a solution to the end replication problem. By binding to the DNA polymerase and to the 3' end of the template, a protein provides the priming hydroxyl group to initiate DNA synthesis. In the example shown, the protein primes all DNA synthesis, as is seen for many viruses. For longer DNA molecules, this method combines with conventional origin function to replicate the chromosomes.
this scenario would disrupt the complete propagation of the genetic material from generation to generation. Eventually, genes at the end of the chromosomes would be lost.

Cells solve the end replication problem in a variety of ways. One solution is to use a protein instead of an RNA as the primer for the last Okazaki fragment at each end of the chromosome (Figure 8-35). In this situation, the "priming protein" binds to the lagging strand template and uses an amino acid to provide an OH that replaces the 3'OH normally provided by an RNA primer. By priming the last lagging strand, the priming protein becomes covalently linked to the 5' end of the chromosome. Terminally attached replication proteins of this kind are found at the end of the linear chromosomes of certain species of bacteria (most bacteria have circular chromosomes) and at the ends of the linear chromosomes of certain bacterial and animal viruses.

Most eukaryotic cells use an entirely different solution to replicate their chromosome ends. As we learned in Chapter 7, the ends of eukaryotic chromosomes are called telomeres and they are generally composed of head-to-tail repeats of a TG-rich DNA sequence. For example, human telomeres consist of many head-to-tail repeats of the sequence 5'-TTAGGG-3'. Although many of these repeats are double-stranded, the 3' end of each chromosome extends beyond the 5' end as ssDNA. This unique structure acts as a novel origin of replication that compensates for the end replication problem. This origin does not interact with the same proteins as the remainder of eukaryotic origins, but it instead recruits a specialized DNA polymerase called telomerase.
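To see why unchecked shortening would eventually remove genes, the end replication problem can be reduced to simple arithmetic. The sketch below is illustrative only: it follows the lineage that inherits the shortened molecule in every round, and the chromosome length, loss per round, and function names are hypothetical values and labels chosen for the example.

```python
def eroded_lengths(initial_length_nt, loss_per_round_nt, rounds):
    """Length of the shortened daughter molecule after each round of replication.

    Hypothetical model: the same fixed number of nucleotides is lost from one
    end in every round, with no compensating mechanism.
    """
    lengths = [initial_length_nt]
    for _ in range(rounds):
        lengths.append(max(lengths[-1] - loss_per_round_nt, 0))
    return lengths


def rounds_tolerated(distance_to_gene_nt, loss_per_round_nt):
    """Rounds that can complete before erosion cuts into the terminal-most gene."""
    return distance_to_gene_nt // loss_per_round_nt


if __name__ == "__main__":
    print(eroded_lengths(1000, 10, 3))
    print(rounds_tolerated(95, 10))
```

Any fixed loss per round gives a finite number of tolerated rounds, which is why cells with linear chromosomes need one of the compensating mechanisms described above.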
Telomerase Is a Novel DNA Polymerase that Does Not Require an Exogenous Template

Telomerase is a remarkable enzyme that includes both protein and RNA components (and this is, therefore, an example of a ribonucleoprotein; see Chapter 5). Like all other DNA polymerases, telomerase acts to extend the 3' end of its DNA substrate. But unlike most DNA polymerases, telomerase does not need an exogenous DNA template to direct the addition of new dNTPs. Instead, the RNA component of telomerase serves as the template for adding the telomeric sequence to the 3' terminus at the end of the chromosome. Telomerase specifically elongates the 3'OH of particular ssDNA sequences using its own RNA as a template. The newly synthesized DNA is single-stranded.
The key to telomerase function is revealed by the RNA component of the enzyme. The sequence of the RNA includes 1.5 copies of the complement of the telomere sequence (for humans, this sequence is 5'-TAACCCTAA-3'). This region of the RNA can anneal to the single-stranded DNA at the 3' end of the telomere (Figure 8-36). Annealing occurs in such a way that a part of the RNA template remains single-stranded, creating a primer:template junction that can be acted on by telomerase. The protein component of telomerase is related to a class of DNA polymerases that use RNA templates called reverse transcriptases. (As we shall see in
FIGURE 8-36 Replication of telomeres by telomerase. Telomerase uses its RNA component to anneal to the 3' end of the ssDNA region of the telomere. Telomerase then uses its reverse transcription activity to synthesize DNA to the end of the RNA template. Telomerase then displaces the RNA from the DNA product and rebinds at the end of the telomere and repeats the process.

FIGURE 8-37 Extension of the 3' end of the telomere by telomerase solves the end replication problem. Although telomerase only directly extends the 3' end of the telomere, by providing an additional template for lagging strand DNA synthesis, both ends of the chromosome are extended. Panel labels: telomerase extends 3' end of telomere; additional 3' end DNA can act as template for new Okazaki fragment; telomere extension (still has 3' overhang).
Chapter 11, these enzymes "reverse transcribe" RNA into DNA instead of the more conventional transcription of DNA into RNA.) The telomerase synthesizes DNA to the end of the RNA template but cannot continue to copy the RNA beyond that point. The RNA template disengages from the DNA product, re-anneals to the last three nucleotides of the telomere, and then repeats this process.

The characteristics of telomerase are in some ways distinct and in other ways similar to those of other DNA polymerases. The inclusion of an RNA component, the lack of a requirement for an exogenous template, and the ability to use an entirely ssDNA substrate set telomerase apart from other DNA polymerases. In addition, telomerase must have the ability to displace its RNA template from the DNA product to allow repeated rounds of template-directed synthesis. Formally, this means that telomerase includes an RNA·DNA helicase activity. On the other hand, like all other DNA polymerases, telomerase requires a template to direct nucleotide addition, can only extend a 3' end of DNA, uses the same nucleotide precursors, and acts in a processive manner, adding many sequence repeats each time it binds to a DNA substrate.
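The anneal, extend, and release cycle can be traced at the sequence level with a short sketch. It uses the human template region 5'-TAACCCTAA-3' written in DNA letters (a real RNA carries U in place of T) and assumes, following the description above, that exactly the last three telomere nucleotides pair with the template in each round. The function names are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Template region of the telomerase RNA, 5'->3', in DNA letters for simplicity.
TEMPLATE_5_TO_3 = "TAACCCTAA"


def reverse_complement(seq):
    """Return the strand that pairs with seq, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def telomerase_round(telomere_3_end, anneal_len=3):
    """One cycle: anneal the last few nucleotides, then copy the rest of the template."""
    # DNA that would pair with the entire template region, written 5'->3'.
    full_product = reverse_complement(TEMPLATE_5_TO_3)
    if telomere_3_end[-anneal_len:] != full_product[:anneal_len]:
        raise ValueError("3' end of the telomere cannot anneal to the template")
    # Only the still single-stranded part of the template is copied.
    return telomere_3_end + full_product[anneal_len:]


def extend(telomere_3_end, rounds):
    """Repeat the cycle; each release leaves a 3' end that can anneal again."""
    for _ in range(rounds):
        telomere_3_end = telomerase_round(telomere_3_end)
    return telomere_3_end


if __name__ == "__main__":
    print(extend("TTAGGGTTA", 2))
```

Each round appends GGGTTA, so the product again ends in TTA and can re-anneal. This is why the template carries 1.5 copies of the repeat complement rather than one.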
Telomerase Solves the End Replication Problem by Extending the 3' End of the Chromosome

When telomerase acts on the 3' end of the telomere, it only extends this end of the chromosome. How is the 5' end extended? This is accomplished by the lagging strand DNA replication machinery (Figure 8-37). By providing an extended 3' end, telomerase provides additional template for the action of the lagging strand replication machinery, which can then extend the 5' end of the DNA. It is important to note that there will still be an ssDNA region at the end of the chromosome. The action of telomerase and the lagging strand replication machinery, however, can ensure that the telomere is maintained at sufficient length to protect the end of the chromosome from becoming too short (and potentially deleting important genes).

Although extension of telomeres by telomerase could theoretically go on indefinitely, proteins bound to the double-stranded regions of the telomere carefully regulate telomere length. These proteins act as weak inhibitors of telomerase activity. When there are only a few copies of the telomere sequence repeat, few of these proteins will be bound to the telomere and telomerase will be active. As the telomere gets longer, these proteins will accumulate and inhibit the telomerase. The repetitive nature of the telomeric DNA sequence means that variations in the length of the telomere are readily tolerated by the cell. Whether a chromosome has 200 or 400 repeats of the telomeric repeat, it will be protected from recombination and degradation.
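The length regulation described above is a negative feedback loop, and a toy simulation shows how such a loop settles on a stable length whatever the starting point. Everything quantitative here is an assumption made for illustration: the inhibition curve, the parameter values, and the resulting set point of 200 repeats are not taken from the text.

```python
def simulate_telomere(repeats, cycles, loss_per_cycle=1.0,
                      max_gain=3.0, half_inhibition=100.0):
    """Toy feedback model of telomere length, measured in repeats.

    Hypothetical assumptions: a fixed number of repeats is lost per cycle,
    and telomerase adds fewer repeats as more inhibitory proteins are bound,
    with bound inhibitor taken to be proportional to the current length.
    """
    for _ in range(cycles):
        # Gain falls toward zero as the telomere, and its bound inhibitor, grows.
        gain = max_gain * half_inhibition / (half_inhibition + repeats)
        repeats = max(repeats - loss_per_cycle + gain, 0.0)
    return repeats


if __name__ == "__main__":
    for start in (50, 400):
        print(start, round(simulate_telomere(start, 5000), 1))
```

With these made-up parameters, loss and gain balance at 200 repeats, so a short telomere grows toward that length and a long one shrinks toward it. The qualitative point is the self-correcting behavior, not the particular numbers.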
SUMMARY
DNA synthesis is dependent upon the presence of two types of substrates: the four deoxynucleoside triphosphates, dATP, dGTP, dCTP, and dTTP; and the template DNA structure, a primer:template junction. The template DNA determines the sequence of incorporated nucleotides. The primer serves as the substrate for deoxynucleotide addition, each being added successively to the OH at its 3' end.
DNA synthesis is catalyzed by an enzyme called DNA polymerase that uses a single active site to add any of the four dNTP precursors. Structural studies of DNA polymerases reveal that they resemble a hand that grips the catalytic site. This structure contributes to the extremely accurate nature of the DNA synthesis reaction. DNA polymerases are processive: each time they bind a substrate, they add many nucleotides.
Proofreading exonucleases further enhance the accuracy of DNA synthesis by acting like a "delete key" that removes incorrectly added nucleotides. In the cell, both strands of a DNA template are duplicated simultaneously at a structure called the replication fork. Because the two strands of the DNA are antiparallel, only one of the template DNA strands can be replicated in a continuous fashion (called the leading strand). The other DNA strand (called the lagging strand) must be synthesized first as a series of short DNA fragments, called Okazaki fragments. Each DNA strand is initiated with an RNA primer that is synthesized by an enzyme called primase. These primers must be removed to complete the replication process. After the replacement of the RNA primers with DNA, all of the separately primed lagging strand DNA fragments are joined together to form one continuous DNA strand. An array of proteins, in addition to the DNA polymerases, helps to coordinate and facilitate the DNA replication reaction. These additional factors facilitate the unwinding of the dsDNA template (DNA helicase), stabilize the ssDNA template (SSB), and remove supercoils generated in front of the replication fork (topoisomerase). DNA polymerases are specialized to perform different events during DNA replication. Some are designed to be highly processive and others only weakly processive. DNA sliding clamps enhance the processivity of the DNA polymerases that replicate large regions of DNA (such as whole chromosomes). These clamp proteins are topologically linked to DNA, but are able to slide along the recently synthesized DNA while bound to the DNA polymerase. This effectively prevents the attached DNA polymerase from dissociating from the primer:template junction. Special protein complexes called sliding DNA clamp loaders use the energy of ATP hydrolysis to place sliding clamps on the DNA near primer:template junctions.
Interactions between the proteins at the replication fork play an important role in DNA synthesis. In E. coli, the two DNA polymerases are part of a large complex called the DNA Pol III holoenzyme. Binding of DNA polymerase III holoenzyme to the DNA helicase stimulates the rate of DNA unwinding. Similarly, binding of primase to the DNA helicase increases its ability to synthesize RNA primers.
Thus, the replication reaction works best when the entire array of replication proteins is present at the replication fork. Together this set of proteins forms a complex called the replisome. The initiation of DNA replication is directed by specific DNA sequences called replicators. The physical site of replication initiation is called an origin of replication. The replicator is specifically bound by a protein called the initiator, which stimulates the unwinding of the origin DNA and the recruitment of other proteins required for the initiation of replication (such as DNA helicase). The subsequent events in the initiation of DNA replication are largely driven by either protein-protein or nonspecific protein-DNA interactions. In eukaryotic cells the initiation of DNA replication is tightly regulated to ensure that every nucleotide of every chromosome is replicated once and only once per round of cell division. This tight regulation is accomplished by controlling the formation and activation of a multiprotein assembly called the pre-replicative complex (pre-RC). Formation of these complexes at replicators is required to recruit the proteins necessary to initiate DNA replication. The ability to form and activate pre-RCs is controlled by a cell cycle regulated kinase called cyclin-dependent kinase. During the G1 phase of the cell cycle pre-RCs can be formed but cannot direct the initiation of replication. During the remainder of the cell cycle (the S, G2, and M phases), any existing pre-RCs can initiate DNA replication but no new pre-RCs can be formed. Thus, any particular pre-RC can only direct one round of initiation per cell cycle, ensuring that the DNA is replicated exactly once. Finishing DNA replication requires the action of specific enzymes. For circular chromosomes, type II DNA topoisomerases separate the topologically linked circular products from one another.
Linear chromosomes also require special proteins to ensure their complete replication. In eukaryotic cells, a specialized DNA polymerase called telomerase allows the ends of the chromosome (called telomeres) to act as a unique origin of replication. By extending the 3' ends of the telomere, telomerase eliminates the progressive loss of chromosome ends that conventional synthesis by the replication fork machinery would cause.
CHAPTER
The Mutability and Repair of DNA
The perpetuation of the genetic material from generation to generation depends on maint
OUTLINE
• Replication Errors and Their Repair
• DNA Damage
• Repair of DNA Damage
the cell in which the sequence alteration has occurred, but lesions that impede replication or transcription can have immediate effects on cell function and survival. The challenge for the cell is twofold. First, it must scan the genome to detect errors in synthesis and damage to the DNA. Second, it must mend the lesions and do so in a way that, if possible, restores the original DNA sequence. Here we will discuss errors that are generated during replication, lesions that arise from spontaneous damage to DNA, and damage that is wrought by chemical agents and radiation. In each case we shall consider how the alteration to the genetic material is detected and how it is properly repaired. Among the questions we shall address are the following: how is the DNA mended rapidly enough to prevent errors from becoming set in the genetic material as mutations? How does the cell distinguish the parental strand from the daughter strand in repairing replication errors? How does the cell restore the proper DNA sequence when, due to a break or severe lesion, the original sequence can no longer be read? How does the cell cope with lesions that block replication? The answers to these questions depend on the kind of error or lesion that needs to be repaired. We begin by considering errors that occur during replication and how they are repaired. We then consider various kinds of lesions that arise spontaneously or from environmental assaults before turning to the multiple repair mechanisms that allow the cell to mend this damage. We will see that multiple overlapping systems enable the cell to cope with a wide range of insults to DNA, underscoring the investment that living organisms make in the preservation of the genetic material.
REPLICATION ERRORS AND THEIR REPAIR
The Nature of Mutations
Mutations include almost every conceivable change in DNA sequence. The simplest mutations are switches of one base for another. There are two kinds: transitions, which are pyrimidine-to-pyrimidine and purine-to-purine substitutions, such as T to C and A to G; and transversions, which are pyrimidine-to-purine and purine-to-pyrimidine substitutions, such as T to G or A and A to C or T (Figure 9-1). Other simple mutations are insertions or deletions of a nucleotide or a small number of nucleotides. Mutations that alter a single nucleotide are called point mutations.
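The distinction between transitions and transversions reduces to asking whether the original and replacement bases belong to the same chemical class. A minimal sketch of that rule follows; the function name `classify_substitution` is an illustrative choice, not terminology from the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, mutant):
    """Classify a single-base substitution in DNA.

    Returns "transition" when both bases are purines or both are
    pyrimidines, and "transversion" when the class changes.
    """
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutant:
        raise ValueError("no substitution: the bases are identical")
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for pair in (("T", "C"), ("A", "G"), ("T", "G"), ("A", "C")):
        print(pair[0], "to", pair[1], "is a", classify_substitution(*pair))
```

Each base has one possible transition and two possible transversions, which is why the examples in the text list a single target for transitions and two for transversions.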
Other kinds of mutations cause more drastic changes in DNA, such as extensive insertions and deletions and gross rearrangements of chromosome structure. Such changes might be caused, for example, by the insertion of a transposon, which typically places many thousands of nucleotides of foreign DNA in the coding or regulatory sequences of a gene (see Chapter 11), or by the aberrant actions of cellular recombination processes. The overall rate at which new mutations arise spontaneously at any given site on the chromosome ranges from about 10⁻⁶ to 10⁻¹¹ per round of DNA replication, with some sites on the chromosome being "hotspots" where mutations arise at high frequency and other sites undergoing alterations at a comparatively low frequency.
FIGURE 9-1 Base change substitutions. (a) Transitions. (b) Transversions.
One kind of sequence that is particularly prone to mutation merits special comment because of its importance in human genetics and disease. These mutation-prone sequences are repeats of simple di-, tri-, or tetranucleotide sequences, which are known as DNA microsatellites. One well-known example involves repeats of the dinucleotide sequence CA. Stretches of CA repeats are found at many widely scattered sites in the chromosomes of humans and some other eukaryotes. The replication machinery has difficulty copying such repeats accurately, frequently undergoing "slippage." This slippage increases or reduces the number of copies of the repeated sequence. As a result, the CA repeat length at a particular site on the chromosome is often highly polymorphic in the population. This polymorphism provides a convenient physical marker for mapping inherited mutations, such as mutations that increase the propensity to certain diseases in humans (see Box 9-1, Expansion of Triple Repeats Causes Disease).
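Because slippage changes only the number of repeat copies, alleles at a microsatellite can be told apart simply by counting tandem units. The sketch below counts CA copies in a sequence; the flanking sequences and the helper names (`make_allele`, `longest_repeat_run`) are hypothetical and serve only to build small test alleles.

```python
import re


def longest_repeat_run(sequence, unit="CA"):
    """Return the largest number of tandem copies of `unit` in `sequence`."""
    pattern = "(?:" + re.escape(unit) + ")+"
    runs = re.findall(pattern, sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def make_allele(copies, unit="CA", left="GGT", right="TTG"):
    """Build a toy allele: hypothetical flanks around `copies` repeat units."""
    return left + unit * copies + right


if __name__ == "__main__":
    parental = make_allele(12)
    expanded = make_allele(13)    # one copy gained by slippage
    contracted = make_allele(11)  # one copy lost by slippage
    for name, allele in (("parental", parental),
                         ("expanded", expanded),
                         ("contracted", contracted)):
        print(name, longest_repeat_run(allele), "CA repeats")
```

In practice such length differences are what make a microsatellite usable as a physical marker: individuals carrying different repeat counts at the same site can be distinguished by the length of that stretch alone.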
Some Replication Errors Escape Proofreading
As we have seen, the replication machinery achieves a remarkably high degree of accuracy using a proofreading mechanism, the 3'-5' exonuclease component of the replisome, which removes wrongly incorporated nucleotides (as we discussed in Chapter 8). Proofreading improves the fidelity of DNA replic
Box 9-1 Expansion of Triple Repeats Causes Disease
Another well-known example of error-prone sequences is repeats of the triplet nucleotide sequences CGG and CAG in certain genes. In humans such triplet repeats are often found to undergo expansion from one generation to the next, resulting in diseases that are progressively more severe in the children and grandchildren of afflicted individuals. Examples of diseases that are caused by triplet expansion are adult muscular (myotonic) dystrophy; fragile X syndrome, which causes mental retardation; and Huntington's disease, which causes neurodegeneration. CAG is the codon for glutamine, and its expansion in the coding sequence for the Huntingtin protein results in an extended stretch of glutamine residues in the mutant protein in patients with Huntington's disease. Recent research indicates that this polyglutamine stretch interferes with the normal interaction between a glutamine-rich patch in a transcription factor called Sp1 and a corresponding glutamine-rich patch in "TAFII130," a subunit of a component of the transcription machinery called TFIID (see Chapter 12). This interference impairs transcription in neurons of the brain, including the transcription of the gene for the receptor of a neurotransmitter. Similar polyglutamine stretches from CAG expansions in other genes may also exert their effects by disrupting interactions between transcription factors and TAFII130.
FIGURE 9-2 A mutation can be permanently incorporated by replication. In the second round of replication, the mutation becomes permanently incorporated in the DNA sequence. (Panel labels: first round of replication (misincorporation); second round of replication.)
and replaced, the sequence change will become permanent in the genome: during a second round of replication, the misincorporated nucleotide, now part of the template strand, will direct the incorporation of its complementary nucleotide into the newly synthesized strand (Figure 9-2). At this point, the mismatch will no longer exist; instead it will have resulted in a permanent change (a mutation) in the DNA sequence.
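The two rounds of replication described above can be traced on a four-base toy duplex. In this sketch the strands are written base by base in register with each other (strand polarity is ignored for simplicity), and the sequence, the error position, and the function names are all illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def replicate(template, errors=None):
    """Copy a template strand, base by base, in register with the template.

    `errors` maps a position to the base wrongly inserted there, which
    stands in for a misincorporation that escaped proofreading.
    """
    errors = errors or {}
    return "".join(errors.get(i, COMPLEMENT[base])
                   for i, base in enumerate(template))


def mismatches(strand, partner):
    """Positions at which two aligned strands are not complementary."""
    return [i for i, (a, b) in enumerate(zip(strand, partner))
            if COMPLEMENT[a] != b]


if __name__ == "__main__":
    parental = "ATGC"
    # First round: T is wrongly inserted opposite G at position 2.
    daughter = replicate(parental, errors={2: "T"})
    print("round 1:", parental, "/", daughter,
          "mismatch at", mismatches(parental, daughter))
    # Second round: the erroneous strand now serves as the template.
    granddaughter = replicate(daughter)
    print("round 2:", granddaughter, "/", daughter,
          "mismatch at", mismatches(granddaughter, daughter))
```

After the second round the duplex is perfectly base-paired, but it reads A:T where the parental duplex read G:C, so nothing remains to signal that an error occurred.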
Mismatch Repair Removes Errors that Escape Proofreading
Fortunately, a mechanism exists for detecting mismatches and repairing them. Final responsibility for the fidelity of DNA replication rests with this mismatch repair system, which increases the accuracy of DNA synthesis by an additional two to three orders of magnitude. The mismatch repair system faces two challenges. First, it must scan the genome for mismatches. Because mismatches are transient (they are eliminated following a second round of replication, when they result in mutations), the mismatch repair system must rapidly find and repair mismatches. Second, the system must correct the mismatch accurately; that is, it must replace the misincorporated nucleotide in the newly synthesized strand and not the correct nucleotide in the parental strand. In E. coli, mismatches are detected by a dimer of the mismatch repair protein MutS (Figure 9-3). MutS scans the DNA, recognizing mismatches from the distortion they cause in the DNA backbone. MutS embraces the mismatch-containing DNA, inducing a pronounced kink in the DNA and a conformational change in MutS itself (Figure 9-4). A key to the specificity of MutS is that DNA containing a mismatch is much more readily distorted than properly base-paired DNA. This complex of MutS and the mismatch-containing DNA recruits MutL, a second protein component of the repair system. MutL, in turn, activates MutH, an enzyme that causes an incision or nick on one strand near the site of the mismatch. Nicking is followed by the action of a specific helicase (UvrD) and one of three exonucleases (see below). The helicase unwinds the DNA, starting from the incision and moving in the direction of the site of the mismatch, and the exonuclease progressively digests the displaced single strand, extending to and beyond the site of the mismatched nucleotide. This action produces a single-stranded gap, which is then filled in by DNA polymerase III (Pol III) and sealed with DNA ligase. The overall effect is to remove the mismatch and replace it with the correctly base-paired nucleotide.
FIGURE 9-3 Mismatch repair pathway for the repair of replication errors. (Source: Adapted from Junop M.S., Obmolova G., R
(Diagram labels: MutL, MutH, nick, exonuclease, DNA polymerase, mismatch repaired.)
FIGURE 9-4 Crystal structure of the MutS-DNA complex. Notice the kink in the DNA, present near the bottom of the structure. Also, near the top of the structure of the enzyme, is ATP, shown in green and red. (Junop M.S., Obmolova G., Rausch K., Hsieh P., and Yang W. 2001. Composite active site of an ABC ATPase. Mol. Cell 7: 1-12.) Image prepared with BobScript, MolScript, and Raster3D.
But how does the E. coli mismatch repair system know which of the two mismatched nucleotides to replace? If repair occurred randomly, then half of the time the error would become permanently established in the DNA. The answer is that E. coli tags the parental strand by transient hemimethylation, as we now describe. The E. coli enzyme Dam methylase methylates A residues on both strands of the sequence 5'-GATC-3'. The GATC sequence is widely distributed along the entire genome (occurring about once every 256 base pairs [4⁴]), and all of these sites are methylated by the Dam methylase. When a replication fork passes through DNA that is methylated at GATC sites on both strands (fully methylated DNA), the resulting daughter DNA duplexes will be hemimethylated (that is, methylated on only the parental strand). Thus for a few minutes, until the Dam methylase catches up and methylates the newly synthesized strand, daughter DNA duplexes will be methylated only on the strand that served as a template (Figure 9-5a). Thus, the newly synthesized strand is marked (it lacks a methyl group) and hence can be recognized as the strand for repair. The MutH protein binds at such hemimethylated sites, but its endonuclease activity is normally latent. Only when it is contacted by MutL and MutS located at a nearby mismatch (which is likely to be within a distance of a few hundred base pairs) does MutH become activated, as we described above. Once activated, MutH selectively nicks the unmethylated strand, so only newly synthesized DNA in the vicinity of the mismatch is removed and replaced (Figure 9-5b). Methylation is therefore a "memory" device that enables the E. coli repair system to retrieve the correct sequence from the parental strand if an error has been made during replication.
FIGURE 9-5 Dam methylation at replication fork. (a) Replication generates hemimethylated DNA in E. coli. (b) MutH makes incision in unmethylated daughter strand.
Different exonucleases are used to remove single-stranded DNA between the nick created by MutH and the mismatch, depending on whether MutH cuts the DNA on the 5' or the 3' side of the misincorporated nucleotide. If the DNA is cleaved on the 5' side of the mismatch, then exonuclease VII or RecJ, which degrade DNA in a 5'-3' direction, remove the stretch of DNA from the MutH-induced cut through the misincorporated nucleotide. Conversely, if the nick is on the 3' side of the mismatch, then the DNA is removed by exonuclease I, which degrades DNA in a 3'-5' direction. As we have seen, after removal of the mismatched base, DNA Pol III fills in the missing sequence (Figure 9-6). Eukaryotic cells also repair mismatches and do so using homologs to MutS (called MSH proteins for MutS homologs) and MutL (called MLH and PMS). Indeed, eukaryotes have multiple MutS-like proteins with different specificities. For example, one is specific for simple mismatches, whereas another recognizes small insertions or deletions resulting from "slippage" during DNA replication. Dramatic evidence that mismatch repair plays a critical role in higher organisms came from the discovery that a genetic predisposition to colon cancer (hereditary nonpolyposis colorectal cancer) is due to a mutation in the genes for human homologs of MutS (specifically the MSH2 homolog) and MutL.
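Two pieces of the E. coli pathway above lend themselves to a short calculation: the expected spacing of GATC sites, and the choice of exonuclease according to which side of the mismatch carries the nick. The sketch below assumes a random sequence with equal base frequencies for the spacing estimate, and numbers positions along the newly synthesized strand in the 5'-to-3' direction; the function names are illustrative.

```python
def expected_site_spacing(site="GATC"):
    """Average spacing of a short site in random DNA.

    Assumes the four bases are equally frequent and independent, so a
    site of length n is expected about once every 4**n base pairs.
    """
    return 4 ** len(site)


def choose_exonucleases(nick_position, mismatch_position):
    """Pick the exonucleases that can reach the mismatch from the nick.

    Positions are counted 5'-to-3' along the newly synthesized strand.
    A nick on the 5' side must be digested 5'-3'; a nick on the 3' side
    must be digested 3'-5'.
    """
    if nick_position < mismatch_position:
        return ("exonuclease VII", "RecJ")   # 5'-3' exonucleases
    if nick_position > mismatch_position:
        return ("exonuclease I",)            # 3'-5' exonuclease
    raise ValueError("nick and mismatch cannot share a position")


if __name__ == "__main__":
    print("expected GATC spacing:", expected_site_spacing(), "bp")
    print("nick 5' of mismatch:", choose_exonucleases(100, 350))
    print("nick 3' of mismatch:", choose_exonucleases(600, 350))
```

The spacing estimate explains why a hemimethylated GATC site is usually available within a few hundred base pairs of any mismatch.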
FIGURE 9-6 Directionality in mismatch repair: exonuclease removal of mismatched DNA. (a) Unmethylated GATC is 5' of mutation. (b) Unmethylated GATC is 3' of mutation. (Diagram labels: MutS, MutL, MutH, nick; helicase II, ATP; Exo VII or RecJ (5'-3' exo); Exo I (3'-5' exo); DNA polymerase III holoenzyme, SSB.)
Even though eukaryotic cells have mismatch repair systems, they lack MutH and E. coli's clever trick of using hemimethylation to tag the parental strand. (Indeed, most bacteria lack Dam methylase and are also unable to use hemimethylation to mark the newly synthesized strand.) How then does the mismatch repair system know which of the two strands to correct? Lagging strand synthesis, as we saw in Chapter 8, takes place discontinuously with the formation of Okazaki fragments that are joined to previously synthesized DNA by DNA ligase. Prior to the ligation step, the Okazaki fragment is separated from previously synthesized DNA by a nick, which can be thought of as being equivalent to the nick created in E. coli by MutH on the newly synthesized strand. Indeed, extracts of eukaryotic cells will repair mismatches in artificial templates that contain a nick and do so selectively on the strand that carries the nick. Recent results indicate that human homologs of MutS (MSH) interact with the sliding clamp component of the replisome (PCNA, which we discussed in Chapter 8), and would thereby be recruited to the site of discontinuous DNA synthesis on the lagging strand. Interaction with the sliding clamp could also recruit mismatch repair proteins to the 3' (growing) end of the leading strand.
DNA DAMAGE
DNA Undergoes Damage Spontaneously from Hydrolysis and Deamination
Mutations arise not only from errors in replication but also from damage to the DNA. Some damage is caused, as we shall see, by environmental factors, such as radiation and so-called mutagens, which are chemical agents that increase the rate of mutation (see Box 9-2, The Ames Test). But DNA also undergoes spontaneous damage from the action of water. (This is ironic since the proper structure of the double helix depends on an aqueous environment.)
Box 9-2 The Ames Test
Determining the potential carcinogenic effects of chemicals in animals is time-consuming and expensive. However, because most tumor-causing agents are mutagens, the potential carcinogenic effects of chemicals can be conveniently assessed from their capacity to cause mutations. Bruce Ames of the University of California at Berkeley devised a simple test for the potential carcinogenic effects of chemicals based on their capacity to cause mutations in the bacterium Salmonella typhimurium. The Ames test uses a strain of S. typhimurium that is mutant for the operon responsible for the biosynthesis of the amino acid histidine. For example, the mutant operon might contain a missense or a frameshift mutation in one of the genes for histidine biosynthesis. As a consequence, cells of the mutant fail to grow and form colonies on solid medium lacking histidine (Box 9-2 Figure 1). However, if the mutant cells are treated with a chemical that is mutagenic (and hence potentially carcinogenic), the chemical will cause the missense or frameshift mutation (depending on the nature of the mutagen) to revert in a small number of the mutant cells. This reversal restores the capacity of the cells to grow and form colonies on solid medium lacking histidine. The more potent the mutagen, the greater the number of colonies. Some chemicals that cause cancers are not mutagenic to begin with, but rather are converted into mutagens by the liver, which metabolizes foreign substances. To identify chemicals that are converted into mutagens in the liver, the Ames test treats potential mutagens with a mixture of liver enzymes. Chemicals that are found to be mutagenic in the Ames test can then be tested for their potential carcinogenic effects in animals.
BOX 9-2 FIGURE 1 The Ames test. (Diagram labels: Salmonella bacteria culture requiring histidine to grow; cells added to agar with nutrients but no histidine; no additions, or suspected mutagen added to medium in agar; incubate 12 hours; colonies arising from spontaneous revertants; colonies of revertants induced by the mutagen.)
FIGURE 9-7 Mutation due to hydrolytic damage. (a) Deamination of cytosine creates uracil. (b) Depurination of guanine by hydrolysis creates apurinic deoxyribose. (c) Deamination of 5-methyl cytosine generates a natural base in DNA, thymine.
The most frequent and important kind of hydrolytic damage is deamination of the base cytosine (Figure 9-7a). Under normal physiological conditions, cytosine undergoes spontaneous deamination, thereby generating the unnatural (in DNA) base uracil. Uracil preferentially pairs with adenine and so introduces that base in the opposite strand upon replication, rather than the G that would have been directed by C. Adenine and guanine are also subject to spontaneous deamination. Deamination converts adenine to hypoxanthine, which hydrogen bonds to cytosine rather than to thymine; guanine is converted to xanthine, which continues to pair with cytosine, though with only two hydrogen bonds. DNA also undergoes depurination by spontaneous hydrolysis of the N-glycosyl linkage, and this produces an abasic site (that is, deoxyribose lacking a base) in the DNA (Figure 9-7b). Notice that, in contrast to the replication errors discussed above, all of these hydrolytic reactions result in alterations to the DNA that are unnatural. Apurinic sites are, of course, unnatural, and each of the deamination reactions generates an unnatural base. This situation allows changes to be recognized by the repair systems described below. This situation also suggests an explanation for why DNA has thymine instead of uracil. If DNA naturally contained uracil instead of thymine, then deamination of cytosine would generate a natural base, which the repair systems could not easily recognize. The hazard of having deamination generate a naturally occurring base is illustrated by the problem caused by the presence of 5-methyl cytosine. Vertebrate DNA frequently contains 5-methyl cytosine in place of cytosine as a result of the action of methyl transferases. This modified base plays a role in transcriptional silencing (see Chapter 17).
Deamination of 5-methyl cytosine generates thymine (Figure 9-7c), which obviously will not be recognized as an abnormal base and, following a round of DNA replication, can become fixed as a C to T transition. Indeed, methylated Cs are hotspots for spontaneous mutations in vertebrate DNA.
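The deamination reactions just described can be collected into a small lookup that records, for each starting base, the product, the base it pairs with at the next replication, and whether the product is foreign to DNA and therefore recognizable as damage. The table restates only what the text says; the dictionary and function names are illustrative, and "5mC" is shorthand chosen here for 5-methyl cytosine.

```python
# Deamination product of each base, as described in the text.
DEAMINATION_PRODUCT = {
    "C": "uracil",
    "A": "hypoxanthine",
    "G": "xanthine",
    "5mC": "thymine",
}

# Base that each product pairs with during replication.
PAIRS_WITH = {
    "uracil": "A",
    "hypoxanthine": "C",
    "xanthine": "C",
    "thymine": "A",
}

# Bases that occur naturally in DNA.
NATURAL_DNA_BASES = {"adenine", "cytosine", "guanine", "thymine"}


def deamination_outcome(base):
    """Summarize the consequence of deaminating `base`."""
    product = DEAMINATION_PRODUCT[base]
    return {
        "product": product,
        "pairs_with": PAIRS_WITH[product],
        # An unnatural product can be recognized as damage by repair systems.
        "recognizable_as_damage": product not in NATURAL_DNA_BASES,
    }


if __name__ == "__main__":
    for base in ("C", "A", "G", "5mC"):
        print(base, deamination_outcome(base))
```

Only the 5-methyl cytosine row yields a product that is a normal DNA base, which is the reason the text singles out methylated Cs as mutational hotspots.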
DNA Is Damaged by Alkylation, Oxidation, and Radiation
FIGURE 9-8 G modification. The figure shows specific sites on guanine that are vulnerable to damage by chemical treatment, such as alkylation or oxidation, and by radiation. The products of these modifications are often highly mutagenic.
DNA is vulnerable to damage from alkylation, oxidation, and radiation. In alkylation, methyl or ethyl groups are transferred to reactive sites on the bases and to phosphates in the DNA backbone. Alkylating chemicals include nitrosamines and the very potent laboratory mutagen N-methyl-N'-nitro-N-nitrosoguanidine. One of the most vulnerable sites of alkylation is the oxygen of carbon atom 6 of guanine (Figure 9-8). The product of this methylation, O⁶-methylguanine, often mispairs with thymine, resulting in the change of a G:C base pair into an A:T base pair when the damaged DNA is replicated. DNA is also subject to attack from reactive oxygen species (for example, O₂⁻, H₂O₂, and OH•). These potent oxidizing agents are generated by ionizing radiation and by chemical agents that generate free radicals. Oxidation of guanine, for example, generates 7,8-dihydro-8-oxoguanine, or oxoG. The oxoG adduct is highly mutagenic because it can base-pair with adenine as well as with cytosine. If it base-pairs with adenine during replication, it gives rise to a G:C to T:A transversion, which is one of the most common mutations found in human cancers. Thus, perhaps the carcinogenic effects of ionizing radiation and oxidizing agents are partly caused by free radicals that convert guanine to oxoG.
Another type of damage to bases is caused by ultraviolet light. Radiation with a wavelength of about 260 nm is strongly absorbed by the bases, one consequence of which is the photochemical fusion of two pyrimidines that occupy adjacent positions on the same polynucleotide chain. In the case of two thymines, the fusion is called a thymine dimer (Figure 9-9), which comprises a cyclobutane ring generated by links between carbon atoms 5 and 6 of adjacent thymines. In the case of a thymine adjacent to a cytosine, the resulting fusion is a thymine-cytosine adduct in which the thymine is linked via its carbon atom 6 to the carbon atom 4 of cytosine. These linked bases are incapable of base-pairing and cause the DNA polymerase to stop during replication. Finally, gamma radiation and X-rays (ionizing radiation) are particularly hazardous because they cause double-strand breaks in the DNA, which are difficult to repair. Ionizing radiation can directly attack (ionize) the deoxyribose in the DNA backbone. Alternatively, this radiation can attack indirectly by generating reactive oxygen species (described above), which in turn react with the deoxyribose subunits. Because cells require intact chromosomes to replicate their DNA, ionizing radiation is used therapeutically to kill rapidly proliferating cells in cancer treatment. Certain anticancer drugs, such as bleomycin, also cause breaks in DNA. Ionizing radiation and agents like bleomycin that cause DNA to break are said to be clastogenic (from the Greek klastos, which means "broken").
Mutations Are also Caused by Base Analogs and lntercalating Agents MulaUons are also caused by eompounds thal substitule for normal bases (base anaJogs) or s lip between the bases Ontercalating agents) lo cause errors in replication (Figure 9-10). Base ana logs are slructurally s imilar to proper bases bul differ in ways Ihat make Ihem lreacherous lo Ihe cell. Thus, base analogs are similar enough lo Ihe proper bases lo gel. laken up by ce lis. converted into nucleoside triphosphates. an d incorporated into DNA during replicalion. But , because of the struclural differences belween these analogues and the proper bases. the ana logues base-pair inaccuralely, leadiog to frequenl mistakes during the replica ti un process. Dne of the mosl mutagenic base ana logs lS 5-bcomouracil, an analog of Ihymine. The prcsence of th e bromo subs tituent allows the base to mispair with guan ine via the enol lautomer (see Figure 9-10a). As we saw in
FIGURE 9-9 Thymine dimer. UV induces the formation of a cyclobutane ring between adjacent thymines.
The Mutability and Repair of DNA
FIGURE 9-10 Base analogs and intercalating agents that cause mutations in DNA. (a) Base analog of thymine, 5-bromouracil, can mispair with guanine. Panel labels: 5-bromouracil (keto tautomer); 5-bromouracil (enol tautomer) paired with guanine. (b) Intercalating agents. Panel labels: ethidium, proflavin, acridine orange.
Chapter 6, the keto tautomer is strongly favored over the enol tautomer, but more so for thymine than for 5-bromouracil.

As we discussed for ethidium in Chapter 6, intercalating agents are flat molecules containing several polycyclic rings that bind to the equally flat purine or pyrimidine bases of DNA, just as the bases bind or stack with each other in the double helix. Intercalating agents, such as proflavin, acridine, and ethidium, cause the deletion or addition of a base pair or even a few base pairs. When such deletions or additions arise in a gene, they can have profound consequences on the translation of its messenger RNA because they shift the coding sequence out of its proper reading frame, as we shall see when we consider the genetic code in Chapter 15.

How do intercalating agents cause short insertions and deletions? One possibility in the case of insertions is that, by slipping between the bases in the template strand, these mutagens cause the DNA polymerase to insert an extra nucleotide opposite the intercalated molecule. (The intercalation of one of these structures approximately doubles the typical distance between two base pairs.) Conversely, in the case of deletions, the distortion to the template caused by the presence of an intercalated molecule might cause the polymerase to skip a nucleotide.
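The effect of a single-base insertion or deletion on the reading frame can be made concrete with a short sketch. The messenger RNA sequence below is invented for illustration, and the function name `translate` is hypothetical; only the standard genetic code is taken as given.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated with the
# bases in the order U, C, A, G at each position; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented example message: AUG GCU GAA UUC CUG AAA UAA
wild_type = "AUGGCUGAAUUCCUGAAAUAA"

# Deleting one base (index 4) shifts every downstream codon.
deletion = wild_type[:4] + wild_type[5:]

# Inserting one base at the same position also shifts the frame.
insertion = wild_type[:4] + "A" + wild_type[4:]

print(translate(wild_type))   # MAEFLK
print(translate(deletion))    # MVNS
print(translate(insertion))   # MD
```

In both mutant messages every codon downstream of the change is read differently, and a premature stop codon appears, which is why frameshifts are usually far more disruptive than single-base substitutions.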
REPAIR OF DNA DAMAGE

As we have seen, damage to DNA can have two consequences. Some kinds of damage, such as thymine dimers or nicks and breaks in the DNA backbone, create impediments to replication or transcription. Other kinds of damage create altered bases that have no immediate
structural consequence on replication but cause mispairing; these can result in a permanent alteration to the DNA sequence after replication. For example, the conversion of cytosine to uracil by deamination creates a U:G mismatch, which, after a round of replication, becomes a C:G to T:A transition mutation on one daughter chromosome. These considerations explain why cells have evolved elaborate mechanisms to identify and repair damage before it blocks replication or causes a mutation. Cells would not endure long without such mechanisms.

In this section, we consider the systems that repair damage to DNA (Table 9-1). In the most direct of these systems (representing true repair), a repair enzyme simply reverses (undoes) the damage. One more elaborate step involves excision repair systems, in which the damaged nucleotide is not repaired but removed from the DNA. In excision repair systems, the other, undamaged, strand serves as a template for reincorporation of the correct nucleotide by DNA polymerase. As we shall see, two kinds of excision repair exist, one involving the removal of only the damaged nucleotide and the other, the removal of a short stretch of single-stranded DNA that contains the lesion. Yet more elaborate is recombinational repair, which is employed when both strands are damaged, as when the DNA is broken. In such situations, one strand cannot serve as a template for the repair of the other. Hence in recombinational repair (known as double-strand break repair), sequence information is retrieved from a second undamaged copy of the chromosome. Finally, when progression of a replicating DNA polymerase is blocked by damaged bases, a special translesion polymerase copies across the site of the damage in a manner that does not depend on base pairing between the template and newly synthesized DNA strands. This mechanism is a system of last resort because translesion synthesis is inevitably highly error-prone (mutagenic).
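The deamination example at the start of this section can be traced step by step. The sketch below is a toy model, not a description of the replication machinery: strands are written position by position so that index i of one strand pairs with index i of the other (strand polarity is ignored), the sequence is invented, and the function names are hypothetical.

```python
# Pairing rules used when a strand is copied. Uracil templates adenine,
# just as thymine does, which is what makes an unrepaired U mutagenic.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def replicate(template):
    """Return the newly synthesized strand; position i pairs with template[i]."""
    return "".join(PAIRS_WITH[base] for base in template)


def deaminate(strand, pos):
    """Convert the cytosine at `pos` to uracil."""
    if strand[pos] != "C":
        raise ValueError("only cytosine is deaminated to uracil in this model")
    return strand[:pos] + "U" + strand[pos + 1:]


top = "ATCGA"               # invented sequence
bottom = replicate(top)     # "TAGCT": position 2 is a C:G pair

damaged_top = deaminate(top, 2)   # "ATUGA": position 2 is now U:G

# One round of replication without repair: each parental strand is a template.
new_on_damaged = replicate(damaged_top)   # "TAACT": A inserted opposite U
new_on_bottom = replicate(bottom)         # "ATCGA": C:G preserved

# A further round fixes the change as T:A on the affected lineage.
next_round = replicate(new_on_damaged)    # "ATTGA"

print(new_on_damaged[2], new_on_bottom[2], next_round[2])   # A C T
```

Only the daughter that inherited the deaminated strand carries the C:G to T:A transition; the daughter copied from the undamaged strand is unchanged.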
Direct Reversal of DNA Damage

An example of repair by simple reversal of damage is photoreactivation. Photoreactivation directly reverses the formation of pyrimidine dimers that result from ultraviolet irradiation. In photoreactivation, the enzyme DNA photolyase captures energy from light and uses it to break the covalent bonds linking adjacent pyrimidines (Figure 9-11). In other words, the damaged bases are mended directly. Another example of direct reversal is the removal of the methyl group from the methylated base O6-methylguanine (see above). In this case,
TABLE 9-1 DNA Repair Systems

Type | Damage | Enzyme
Mismatch repair | Replication errors | MutS, MutL, and MutH in E. coli; MSH, MLH, and PMS in humans
Photoreactivation | Pyrimidine dimers | DNA photolyase
Base excision repair | Damaged base | DNA glycosylase
Nucleotide excision repair | Pyrimidine dimer; bulky adduct on base | UvrA, UvrB, UvrC, and UvrD in E. coli; XPC, XPA, XPD, ERCC1-XPF, and XPG in humans
Double-strand break repair | Double-strand breaks | RecA and RecBCD in E. coli
Translesion DNA synthesis | Pyrimidine dimer or apurinic site | Y-family DNA polymerases, such as UmuC in E. coli
FIGURE 9-11 Photoreactivation. UV irradiation causes formation of thymine dimers. Upon exposure to light, DNA photolyase breaks the ring formed between the dimers to restore the two thymine residues.

a methyltransferase removes the methyl group from the guanine residue by transferring it to one of its own cysteine residues (Figure 9-12). This is very costly to the cell because the methyltransferase is not catalytic; having once accepted a methyl group, it cannot be used again.
FIGURE 9-12 Methyl group removal. Methyltransferase catalyzes the transfer of the methyl group on O6-methylguanine to a cysteine residue on the enzyme, thereby restoring the normal G in DNA.

Base Excision Repair Enzymes Remove Damaged Bases by a Base-Flipping Mechanism

The most prevalent way in which DNA is cleansed of damaged bases is by repair systems that remove and replace the altered bases. The two principal repair systems are base excision repair and nucleotide excision repair. In base excision repair, an enzyme called a glycosylase recognizes and removes the damaged base by hydrolyzing the glycosidic bond (Figure 9-13). The resulting abasic sugar is removed from the DNA backbone in a further endonucleolytic step. Endonucleolytic cleavage also removes apurinic and apyrimidinic sugars that arise by spontaneous hydrolysis. After the damaged nucleotide has been entirely removed from the backbone, a repair DNA polymerase and DNA ligase restore an intact strand using the undamaged strand as a template.

DNA glycosylases are lesion-specific and cells have multiple DNA glycosylases with different specificities. Thus, a specific glycosylase recognizes uracil (generated as a consequence of deamination of cytosine), and another is responsible for removing oxoG (generated as a consequence of oxidation of guanine). A total of eight different DNA glycosylases have been identified in the nuclei of human cells.

Cleansing the genome of damaged bases is a formidable problem because each base is buried in the DNA helix. How do DNA glycosylases detect damaged bases while scanning the genome? Evidence indicates that these enzymes diffuse laterally along the minor groove of the DNA until a specific kind of lesion is detected. But how is the
FIGURE 9-13 Base excision pathway: the uracil glycosylase reaction. Uracil glycosylase hydrolyzes the glycosidic bond to release uracil from the DNA backbone to leave an AP site (apurinic or, in this case, apyrimidinic site). AP endonuclease cuts the DNA backbone at the 5' position of the AP site, leaving a 3'OH; exonuclease cuts at the 3' position of the AP site, leaving a 5' phosphate. The resulting gap is filled in by DNA polymerase I.
enzyme able to act on the base if it is buried in the helix? The answer to this riddle highlights the remarkable flexibility of DNA. X-ray crystallographic studies reveal that the damaged base is flipped out so that it projects away from the double helix, where it sits in the specificity pocket of the glycosylase (Figure 9-14). Interestingly, the double helix
FIGURE 9-14 Structure of a DNA-glycosylase complex. The enzyme is shown in gray and the DNA in purple. The damaged base, in this case oxoG, which is shown in red, is flipped out of the helix and into the catalytic center of the enzyme. (Bruner S.D., Norman D.P., and Verdine G.L. 2000. Nature 403: 859-866. Image prepare
FIGURE 9-15 oxoG:A repair. Oxidation of guanine produces oxoG. The modified base can be repaired prior to replication by DNA glycosylase via the base excision pathway. If replication occurs before the oxoG is removed, resulting in the misincorporation of an A, then a fail-safe glycosylase can remove the A, allowing it to be replaced by a C. This provides a second opportunity for the DNA glycosylase to remove the modified base.
is able to allow base flipping with only modest distortion to its structure, and hence the energetic cost of base flipping may not be great (see Chapter 6 and Figure 6-8). Nevertheless, it is unlikely that glycosylases flip out every base to check for abnormalities as they diffuse along DNA. Thus, the mechanism by which these enzymes scan for damaged bases remains mysterious.

What if a damaged base is not removed by base excision before DNA replication? Does this inevitably mean that the lesion will cause a mutation? In the case of oxoG, which has the tendency to mispair with A, a fail-safe system exists (Figure 9-15). A dedicated glycosylase recognizes oxoG:A base pairs generated by misincorporation of an A opposite an oxoG on the template strand. In this case, however, the glycosylase removes the A. Thus, the repair enzyme recognizes an A opposite an oxoG as a mutation and removes the undamaged but incorrect base.

Another example of a fail-safe system is a glycosylase that removes T opposite a G. Such a T:G mismatch can arise, as we have seen, by spontaneous deamination of 5-methylcytosine, which occurs frequently in the DNA of vertebrates. Because both T and G are normal bases, how can the cell recognize which is the incorrect base? The glycosylase system assumes, so to speak, that the T in a T:G mismatch arose from deamination of 5-methylcytosine and selectively removes the T so that it can be replaced with a C.
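The choices made by the glycosylases described above can be summarized as a small decision rule. This is only a bookkeeping sketch of the cases named in the text (uracil, oxoG, the oxoG:A fail-safe, and the T:G fail-safe); the function name `base_to_excise` and the string labels for bases are hypothetical, and real cells use several distinct enzymes rather than one rule.

```python
DAMAGED_BASES = {"U", "oxoG"}


def base_to_excise(pair):
    """Given the two bases of a pair, return the base a glycosylase would
    remove according to the cases described in the text, or None."""
    bases = set(pair)

    # Fail-safe 1: after replication has put an A opposite oxoG, the A is
    # removed even though it is an undamaged base.
    if bases == {"oxoG", "A"}:
        return "A"

    # Fail-safe 2: a T opposite a G is assumed to come from deamination of
    # 5-methylcytosine, so the T is removed.
    if bases == {"T", "G"}:
        return "T"

    # Ordinary base excision: remove the damaged base itself.
    for base in pair:
        if base in DAMAGED_BASES:
            return base

    return None


for pair in [("U", "G"), ("oxoG", "C"), ("oxoG", "A"), ("G", "T"), ("A", "T")]:
    print(pair, "->", base_to_excise(pair))
```

The two fail-safe branches are checked first because in those cases the base that is removed is a normal base, and the rule for ordinary damage would otherwise pick the wrong target in the oxoG:A case.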
Nucleotide Excision Repair Enzymes Cleave Damaged DNA on Either Side of the Lesion

Unlike base excision repair, the nucleotide excision repair enzymes do not recognize any particular lesion. Rather, this system works by recognizing distortions to the shape of the double helix, such as those caused by a thymine dimer or by the presence of a bulky chemical adduct on a base. Such distortions trigger a chain of events that lead to the removal of a short single-stranded segment (or patch) that includes the lesion. This removal creates a single-stranded gap in the DNA, which is filled in by DNA polymerase using the undamaged strand as a template and thereby restoring the original nucleotide sequence.

Nucleotide excision repair in E. coli is largely accomplished by four proteins: UvrA, UvrB, UvrC, and UvrD (Figure 9-16). A complex of UvrA and UvrB scans the DNA, with UvrA being responsible for detecting distortions to the helix. Upon encountering a distortion, UvrA exits the complex and UvrB melts the DNA to create a single-stranded bubble around the lesion. Next, UvrB recruits UvrC, and UvrC creates two incisions: one located eight nucleotides away on the 5' side of the lesion and the other four or five nucleotides away on the 3' side of the lesion. These cleavages create a 12 to 13 residue-long, single-stranded DNA segment, which is made accessible by the action of the DNA helicase UvrD. Finally, DNA polymerase I (Pol I) and DNA ligase fill in the resulting gap.

The principle of nucleotide excision repair in higher cells is much the same as in E. coli, but the machinery for detecting, excising, and repairing the damage is more complicated, involving 25 or more polypeptides. Among these is XPC, which is responsible for detecting distortions to the helix, a function attributed to UvrA in E. coli. As in E. coli, the DNA is opened to create a bubble around the lesion. Formation of the bubble involves the helicase activities of the proteins XPA and XPD (the equivalent to UvrB in E. coli) and the single-strand binding protein RPA. The bubble creates cleavage sites on the 5' side of the lesion for a nuclease known as ERCC1-XPF and on the 3' side for the nuclease XPG (representing the function of UvrC). In higher cells, the resulting single-stranded DNA segment is 24 to 32 nucleotides long. As in bacteria, the DNA segment is released to create a gap that is filled in by the action of DNA polymerase and ligase.

As their names imply, the Uvr proteins are needed to mend damage from ultraviolet light; mutants of the uvr genes are sensitive to ultraviolet light and lack the capacity to remove thymine-thymine and thymine-cytosine adducts. In fact, these proteins broadly recognize and repair bulky adducts of many kinds.

Nucleotide excision repair is important in humans, too. Humans can exhibit a genetic disease called xeroderma pigmentosum, which renders afflicted individuals highly sensitive to sunlight and results in skin lesions, including skin cancer. Seven genes (referred to as XP genes) have been identified in which mutations give rise to xeroderma pigmentosum. These genes correspond to proteins (such as XPA, XPC, XPD, XPF, and XPG, referred to above) in the human pathway for nucleotide excision repair, underscoring the importance of nucleotide excision repair in mending damage from ultraviolet light.

Not only is nucleotide excision repair capable of mending damage throughout the genome, but it is also capable of rescuing RNA polymerase, the progression of which has been arrested by the presence of a lesion in the transcribed (template) strand of a gene. This phenomenon, known as transcription-coupled repair, involves recruitment to the stalled RNA polymerase of nucleotide excision repair proteins (Figure 9-17). The significance of transcription-coupled repair is that it focuses repair enzymes on DNA (genes) being actively transcribed. In effect, RNA polymerase serves as another damage-sensing protein in the cell.

Central to transcription-coupled repair in eukaryotes is the general transcription factor TFIIH. As we will see in Chapter 12, TFIIH unwinds the DNA template during the initiation of transcription. Subunits of TFIIH include the DNA helix-opening proteins XPA and XPD discussed above. Thus, TFIIH is responsible for two separate
FIGURE 9-16 Nucleotide excision repair pathway. (a) UvrA and UvrB scan DNA to identify a distortion. (b) UvrA leaves the complex, and UvrB melts DNA locally around the distortion. (c) UvrC forms a complex with UvrB and creates nicks to the 5' side of the lesion and to the 3' side of the lesion. (d) DNA helicase UvrD releases the single-stranded fragment from the duplex, and DNA Pol I and ligase repair and seal the gap. (Source: (parts a-d) Adapted from Zou Y. and Van Houten B. 1999. Strand opening by the UvrA2B complex allows dynamic recognition of DNA damage. EMBO Journal 18: 4898, fig 1. Copyright © 1999 Oxford University Press. Used with permission.)
FIGURE 9-17 Transcription-coupled DNA repair. (a) RNA polymerase transcribes DNA normally upstream of the lesion. (b) Upon encountering the lesion in DNA, RNA polymerase stalls and transcription stops. (c) RNA polymerase recruits the nucleotide excision repair proteins to the site of the lesion, then either backs up or dissociates from the DNA to allow the repair proteins
(© 1999 Oxford University Press. Used with permission.)
functions: its strand-separating helicases melt the DNA around a lesion during nucleotide excision repair (including transcription-coupled repair) and also help to open the DNA template during the process of gene transcription. Systems for coupling repair to transcription also exist in prokaryotes.
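The E. coli incision geometry described earlier in this section (one cut eight nucleotides to the 5' side of the lesion, the other four or five nucleotides to the 3' side, giving a 12 to 13 residue fragment) can be sketched as string slicing. The counting convention is an assumption: the text does not say whether the lesion itself is included in either count, so the slice below was chosen simply so that the fragment length matches the stated 12 to 13 residues. The sequence, the lesion marker "X", and the function names are all invented for illustration, and strands are written position by position, ignoring polarity.

```python
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Position-by-position complement (strand polarity is ignored here)."""
    return "".join(PAIRS_WITH[base] for base in strand)


def excise(strand, lesion, offset_3prime=4):
    """Return the excised fragment and the (start, end) bounds of the gap.

    Assumed convention: the 5' cut lies 8 positions before the lesion and
    the 3' cut lies `offset_3prime` (4 or 5) positions after its start.
    """
    if offset_3prime not in (4, 5):
        raise ValueError("the text describes a 3' offset of four or five")
    start, end = lesion - 8, lesion + offset_3prime
    if start < 0 or end > len(strand):
        raise ValueError("lesion is too close to the end of this toy strand")
    return strand[start:end], (start, end)


def fill_gap(strand, gap, undamaged_strand):
    """Resynthesize the gap using the undamaged strand as the template."""
    start, end = gap
    patch = complement(undamaged_strand[start:end])
    return strand[:start] + patch + strand[end:]


original = "ATGCCGTAAGCTTGACCTAGGATCCA"      # invented sequence
other_strand = complement(original)          # undamaged partner strand
damaged = original[:12] + "X" + original[13:]   # "X" marks the lesion

fragment, gap = excise(damaged, lesion=12, offset_3prime=4)
repaired = fill_gap(damaged, gap, other_strand)

print(len(fragment), "X" in fragment, repaired == original)
```

Because the patch is copied from the undamaged strand, the original sequence is restored exactly, which is the feature that distinguishes excision repair from the error-prone bypass mechanisms.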
Recombination Repairs DNA Breaks by Retrieving Sequence Information from Undamaged DNA

Excision repair uses the undamaged DNA strand as a template to replace a damaged segment of DNA on the other strand. How do cells repair double-strand breaks in DNA in which both strands of the duplex are broken? This is accomplished by the double-strand break (DSB) repair pathway, which retrieves sequence information from the sister chromosome. Because of its central role in general homologous recombination as well as in repair, the DSB-repair pathway is an important topic in its own right, which we shall consider in detail in Chapter 10.

DNA recombination also helps to repair errors in DNA replication. Consider a replication fork that encounters a lesion in DNA (such as a thymine dimer) that has not been corrected by nucleotide excision repair. The DNA polymerase will sometimes stall attempting to replicate
over the lesion. Although the template strand cannot be used, the sequence information can be retrieved from the other daughter molecule of the replication fork by recombination (see Chapter 10). Once this recombinational repair is complete, the nucleotide excision system has another opportunity to repair the thymine dimer. Indeed, mutants defective in recombination are known to be sensitive to ultraviolet light. Consider also the situation in which the replication fork encounters a nick in the DNA template. Passage of the fork over the nick will create a DNA break, repair of which can only be accomplished by the double-strand break repair pathway. Although we generally consider recombination as an evolutionary device to explore new combinations of sequences, it may be that its original function was to repair damage in DNA.

The DSB-repair pathway can only operate when the sister of the broken chromosome is present in the cell. What happens when a chromosome breaks early in the cell cycle, before a sister has been generated by DNA replication? Under these circumstances, a fail-safe system comes into play known as nonhomologous end joining (NHEJ). As its name implies, NHEJ does not involve homologous recombination. Instead, the two ends of the broken DNA are directly joined to each other by misalignment between single strands protruding from the broken ends. This misalignment is believed to occur by pairing between tiny stretches (as short as one base pair) of complementary bases (serendipitous microhomologies). Single-stranded tails are removed by nucleases and gaps are filled in by DNA polymerase. NHEJ is mediated by Ku, a member of a widely conserved family of proteins found in bacteria, yeast, and humans. Ku proteins align the ends of broken chromosomes, protect them from nucleases, and recruit other repair proteins. Ku-mediated NHEJ is an inefficient process (allowing survival of only one in a thousand yeast cells in which a chromosome break has been introduced) and leads to the formation of deletions ranging in size from a few base pairs to several kilobases.
Translesion DNA Synthesis Enables Replication to Proceed across DNA Damage

In the examples we have considered so far, damage to the DNA is mended by excision followed by resynthesis using an undamaged template. But such repair systems do not operate with complete efficiency and sometimes a replicating DNA polymerase encounters a lesion, such as a pyrimidine dimer or an apurinic site, that has not been repaired. Because such lesions are obstacles to progression of the DNA polymerase, the replication machinery must attempt to copy across the lesion or be forced to cease replicating. Even if cells cannot repair these lesions, there is a fail-safe mechanism that allows the replication machinery to bypass these sites of damage. This mechanism is known as translesion synthesis. Although this mechanism is,
FIGURE 9-18 Translesion DNA synthesis. Upon encountering a lesion in the template during replication, DNA polymerase III with its sliding clamp dissociates from the DNA and is replaced by the translesion DNA polymerase, which extends DNA synthesis across the thymine dimer on the template (upper) strand. The translesion polymerase is then replaced by DNA polymerase III. Figure labels: DNA polymerase III; Pol IV (DinB) or Pol V (UmuD'2C); clamp. (Source: R. Woodgate.)

FIGURE 9-19 Crystal structure of a translesion polymerase. Shown here is the structure of a translesion (Y-family DNA) polymerase, in gray, in complex with template DNA, in purple, and an incoming nucleotide, in red. (Ling H., Boudsocq F., Woodgate R., and Yang W. 2001. Cell 107: 91-102. Image prepared with BobScript, MolScript, and Raster3D.)
An important feature of these polymerases
Box 9-3 The Y-Family of DNA Polymerases
DNA polymerases can be grouped into families, shown in various colors in the figure, based on their amino acid sequence similarities to each other. Recently, UmuC and certain other translesion DNA polymerases have been discovered to be founding members of a large and distinct family of DNA polymerases known as the Y-family, which are found in all three domains of life: Bacteria, Archaea, and Eukaryota. Members of the Y-family of DNA polymerases characteristically carry out DNA synthesis with low fidelity on undamaged DNA templates but have the capacity to bypass lesions in DNA that block replication by members of the other families of DNA polymerases. Box 9-3 Figure 1 shows a phylogenetic tree for the Y-family of translesion DNA polymerases.
BOX 9-3 FIGURE 1 The phylogenetic tree of the Y-family of DNA polymerases. (Source: Adapted from Ohmori H. et al. Letter to the editor: The Y-family of DNA polymerases. Mol. Cell 8: 7, fig 1.)
Because of its high error rate, translesion synthesis can be considered a system of last resort. It enables the cell to survive what might otherwise be a catastrophic block to replication, but the price that is paid is a higher level of mutagenesis. For this reason, in E. coli the translesion polymerase is not present under normal circumstances. Rather, its synthesis is induced only in response to DNA damage. Thus, the genes encoding the translesion polymerase are expressed as part of a pathway known as the SOS response. Damage leads to the proteolytic destruction of a transcriptional repressor (the LexA repressor), which controls expression of genes involved in the SOS response, including those for UmuC and UmuD, the inactive precursor for UmuD'. Interestingly, the same pathway is also responsible for the proteolytic conversion of UmuD to UmuD'. The cleavages of LexA and UmuD are both stimulated by a protein called RecA, which is activated by single-stranded DNA resulting from DNA damage. RecA is a dual-function protein that is also involved in DNA recombination, as we shall see in Chapter 10.

Finally, translesion synthesis poses several fascinating and as yet unanswered questions. How does the translesion polymerase recognize a stalled replication fork? How does the translesion enzyme replace the normal replicative polymerase in the DNA replication complex? Once DNA synthesis is extended across the lesion, how does the normal replicative polymerase switch back to and replace the translesion enzyme at the replication fork? Translesion polymerases have low processivity, so perhaps they simply dissociate from the template shortly after copying across a lesion. Nonetheless, this explanation still leaves us with the challenge of understanding how the normal processive enzyme is able to reenter the replication machinery.
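The regulatory logic of the SOS response described above can be written out as a chain of yes/no dependencies. This is a deliberately crude Boolean sketch, assuming each step is all-or-nothing; it ignores kinetics, protein levels, and every SOS gene other than those named in the text. The function name and dictionary keys are hypothetical, and the final line (that the active translesion polymerase requires both UmuC and UmuD') is an assumption based on the Pol V (UmuD'2C) label in Figure 9-18.

```python
def sos_response(dna_damaged):
    """Trace the SOS pathway for a cell with or without DNA damage."""
    # DNA damage gives rise to single-stranded DNA, which activates RecA.
    single_stranded_dna = dna_damaged
    reca_active = single_stranded_dna

    # Active RecA stimulates cleavage of the LexA repressor.
    lexa_intact = not reca_active

    # Without intact LexA, the SOS genes (including umuC and umuD) are expressed.
    sos_genes_expressed = not lexa_intact
    umuc = sos_genes_expressed
    umud = sos_genes_expressed

    # Active RecA also stimulates conversion of UmuD to UmuD'.
    umud_prime = umud and reca_active

    # Assumption: the functional translesion polymerase needs UmuC and UmuD'.
    translesion_polymerase = umuc and umud_prime

    return {
        "RecA active": reca_active,
        "LexA intact": lexa_intact,
        "UmuC": umuc,
        "UmuD'": umud_prime,
        "translesion polymerase": translesion_polymerase,
    }


print(sos_response(dna_damaged=False))
print(sos_response(dna_damaged=True))
```

Written this way, it is easy to see that RecA acts at two points in the pathway, once to relieve repression and once to activate the precursor protein, so the error-prone polymerase appears only when damage is actually present.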
SUMMARY

Organisms can survive only if their DNA is replicated faithfully and is protected from chemical and physical damage that would change its coding properties. The limits of accurate replication and repair of damage are revealed by the natural mutation rate. Thus,
a template. In base excision repair, DNA glycosylases and endonucleases remove only the damaged nucleotide, whereas in nucleotide excision repair a short patch of single-stranded DNA containing the lesion is removed. In E. coli, excision repair is initiated by the UvrABC endonuclease, which creates a bubble over the site of the damage and cuts out a 12-nucleotide segment of the DNA strand that includes the lesion. Higher cells carry out nucleotide excision repair in a similar manner, but a much larger number of proteins is involved and the excised, single-stranded DNA is 24 to 32 residues long. An alternative repair method, which is particularly important if no template for repair synthesis is available (as in the case of a double-strand break), is recombinational or double-strand break repair, in which an intact DNA strand is copied from a different but homologous duplex. Finally, translesion synthesis enables replication to continue across damage that blocks the progression of a replicating DNA polymerase. Translesion synthesis is mediated by a distinct and widespread family of DNA polymerases that are able to carry out DNA synthesis in an error-prone manner that does not depend on base pairing. Mutagenesis and its repair are of concern to us because they permanently affect the genes that organisms inherit and because cancer is often caused by mutations in somatic cells.
CHAPTER
Homologous Recombination at the Molecular Level

All DNA is recombinant DNA. Genetic exchange works constantly to blend and rearrange chromosomes, most obviously during meiosis, when homologous chromosomes pair prior to the first nuclear division. During this pairing, genetic exchange between the chromosomes occurs. This exchange, classically termed crossing over, is one of the results of homologous recombination. This recombination involves the physical exchange of DNA sequences between the chromosomes. The frequency of crossing over between two genes on the same chromosome depends on the physical distance between these genes, with long distances giving the highest frequencies of exchange. In fact, genetic maps derived from early measurements of crossing over frequencies gave the first real information about chromosome structure by revealing that genes are arranged in a fixed, linear order.

Sometimes, however, gene order does change: for example, movable DNA segments called transposons occasionally "jump" around chromosomes and promote DNA rearrangements, thus altering chromosomal organization. The recombination mechanisms responsible for transposition and other genome rearrangements are distinct from those of homologous recombination. These mechanisms are discussed in detail in Chapter 11.

Homologous recombination is an essential
OUTLINE

• Models for Homologous Recombination
• Homologous Recombination Protein Machines
• Homologous Recombination in Eukaryotes
• Mating-Type Switching
• Genetic Consequences of the Mechanism of Homologous Recombination
MODELS FOR HOMOLOGOUS RECOMBINATION

Elegant early experiments using heavy isotopes of atoms incorporated into DNA provided the first molecular view of the process of homologous recombination. This is the same approach used by Matthew Meselson and Franklin W. Stahl to show that DNA replicates in a semiconservative manner (see Chapter 2). In their experiments, Meselson and Stahl demonstrated that the products of replication contain one old and one newly synthesized DNA strand. In contrast, this same experimental approach revealed that recombination is conservative, involving the direct breakage and rejoining of DNA molecules. As we will see in the following sections, we now understand that breakage and joining of DNA is a central aspect of homologous recombination. But recombination also often involves both the limited destruction and resynthesis of DNA strands.

In the years since these initial experiments, numerous models to explain the molecular mechanism of genetic exchange have been proposed. Key steps of homologous recombination shared by these models include:

1. Alignment of two homologous DNA molecules. By homologous we mean that the DNA sequences are identical or nearly identical for a region of at least a hundred base pairs or so. Despite this high degree of similarity, DNA molecules can have small regions of sequence difference and may, for example, carry different sequence variants, known as alleles, of the same gene.

2. Introduction of breaks in the DNA. The breaks may occur in one DNA strand or involve both DNA strands. The nature of these breaks is the feature that largely distinguishes the two models described below.

3. Formation of initial short regions of base pairing between the two recombining DNA molecules. This pairing occurs when a single-stranded region of DNA originating from one parental molecule
The Holliday Model Illustrates Key Steps in Homologous Recombination

A simple and historically important model for homologous recombination is the Holliday model (Figure 10-1). Although it is now clear that most recombination events involve some new DNA synthesis (a feature absent from this model), the Holliday model very well illustrates the DNA strand invasion, branch migration, and Holliday junction resolution processes central to homologous recombination.
FIGURE 10-1 Holliday model through the steps of branch migration. The small arrowheads at the DNA single strands point in the 5' to 3' direction. Note that A and a, B and b, C and c specify different alleles, and have slightly different DNA sequences. Therefore, heteroduplex DNA containing those genes (shown in the expanded section in panel d) will have some mismatches. Panel labels: (a) aligned top and bottom duplexes; (b) single-stranded breaks in each duplex; (c) strand invasion, heteroduplex; (d) branch migration, hybrid duplexes.
When illustrating the Holliday model, it is useful to picture the two homologous, double-stranded DNA molecules, aligned, as shown in Figure 10-1a. These molecules, although nearly identical, carry different alleles of the same gene (as is denoted by the A/a, B/b, and C/c symbols in Figure 10-1), which are helpful for following the outcome of recombination. Recombination is initiated by the introduction of a nick in each DNA molecule at an identical location (Figure 10-1b). DNA strands near the nick site can then be "peeled" away from their complementary strands, freeing these strands to invade, and ultimately base-pair with, the homologous duplex (Figure 10-1c). In the structure shown in the figure, this invasion is symmetrical: that is, the same region of DNA sequence is "swapped" between the two molecules. Strand invasion generates the Holliday junction, the key recombination intermediate. The Holliday junction generated by strand invasion can then move along the DNA by branch migration. This migration…
FIGURE 10-2 Holliday junction cleavage. Two alternative pairs of DNA sites can be cut during resolution. Cleavage at one pair of sites generates the "splice" or crossover products. Cleavage at the second pair of sites yields the "patch" or non-crossover products. The inset shows a Holliday junction DNA structure. Notice that the DNA is completely base-paired in this structure.
(see fate of the A/a and C/c allele markers in the figure). These molecules are, therefore, also known as the non-crossover products. Factors that influence the site and polarity of resolution will be discussed below.
The Double-Strand Break Repair Model More Accurately Describes Many Recombination Events

Homologous recombination is often initiated by double-stranded breaks in DNA. A common model describing this type of genetic exchange reaction is the double-stranded break-repair pathway (Figure 10-3). As with the Holliday model, this pathway starts with aligned homologous chromosomes. But in this case, the initiating event is the introduction of a double-stranded break (DSB) in one of the two DNA molecules (Figure 10-3a). The other DNA duplex remains intact. Because double-stranded DNA breaks occur relatively frequently (as we shall see below), this type of initiating event is attractive compared to the pair of aligned nicks that are proposed to initiate recombination by the Holliday model. However, the asymmetric initial breakage of the two DNA molecules in the DSB-repair model necessitates that later stages in the recombination process are also asymmetric, as we will see.

After introduction of the DSB, a DNA-cleaving enzyme sequentially degrades the broken DNA molecule to generate regions of single-stranded DNA (Figure 10-3b). This processing creates single-strand extensions, known as ssDNA tails, on the broken DNA molecules; these ssDNA tails terminate with 3' ends. In some cases, both strands at a DSB are processed, whereas in other cases, only the 5'-terminating strand is degraded.

The ssDNA tails generated by this process then invade the unbroken homologous DNA duplex (Figure 10-3c). This panel of the figure shows one strand invasion, as likely occurs initially, whereas the next panel shows the two invading strands. In each case, the invading strand base-pairs with its complementary strand in the other DNA molecule. Because the invading strands end with 3' termini, they can serve as primers for new DNA synthesis. Elongation from these DNA ends, using the complementary strand in the homologous duplex as a template, serves to regenerate the regions of DNA that were destroyed during the processing of the strands at the break site (Figure 10-3d,e).

If the two original DNA duplexes were not identical in sequence near the site of the break (for example, having single base-pair changes as described above), sequence information could be lost during recombination by the DSB-repair pathway. In the recombination event shown in Figure 10-3, sequence information lost from the gray DNA molecule as a result of DNA processing is replaced by the sequence present on the blue duplex as a result of DNA synthesis. This nonreciprocal step in DSB repair sometimes leaves a genetic trace, giving rise to a gene conversion event, a point we will return to at the end of the chapter.

The two Holliday junctions found in the recombination intermediates generated by this model move by branch migration and ultimately are resolved to finish recombination. Once again, the strands that are cleaved during resolution of these Holliday junctions determine whether the product DNA molecules will contain reassorted genes in the regions flanking the site of recombination (that is, result in crossing over) or not. The different ways to resolve a recombination intermediate containing two Holliday junctions are explained in Box 10-1, How to Resolve a Recombination Intermediate with Two Holliday Junctions.
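The nonreciprocal transfer of sequence in the DSB-repair pathway can be made concrete with a toy calculation. The following Python sketch is only an illustration under simplifying assumptions: each duplex is represented by a single strand written as a string, the two molecules are taken to be perfectly aligned and of equal length, and the function name `repair_dsb` and the example sequences are hypothetical. It shows that a sequence difference lying within the degraded region is replaced by the template's version, whereas a difference outside that region is left untouched.

```python
def repair_dsb(broken: str, template: str, gap_start: int, gap_end: int) -> str:
    """Toy model of repair synthesis in the DSB-repair pathway.

    Positions [gap_start, gap_end) of the broken molecule are assumed to have
    been destroyed by processing at the break and are refilled by copying the
    aligned template sequence. Everything else is kept from the broken molecule.
    """
    if len(broken) != len(template):
        raise ValueError("toy model assumes equal-length, pre-aligned sequences")
    if not 0 <= gap_start <= gap_end <= len(broken):
        raise ValueError("gap must lie within the sequence")
    return broken[:gap_start] + template[gap_start:gap_end] + broken[gap_end:]


# Hypothetical alleles that differ at index 5 (G in "gray", A in "blue").
gray = "ATGCCGTAAGCT"  # molecule that receives the double-stranded break
blue = "ATGCCATAAGCT"  # intact homologous duplex used as the template

# The degraded region covers the difference: the gray allele is converted.
converted = repair_dsb(gray, blue, gap_start=3, gap_end=8)

# The degraded region misses the difference: no genetic trace is left.
unchanged = repair_dsb(gray, blue, gap_start=6, gap_end=8)

print(converted)
print(unchanged)
```

In this sketch the first call returns the blue sequence and the second returns the original gray sequence, which mirrors the statement that repair only sometimes leaves a genetic trace.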
FIGURE 10-3 DSB-repair model for homologous recombination. The figure shows the steps leading to generation of a recombination intermediate with two Holliday junctions.
[Figure 10-3 panels: (a) double-stranded break (DSB); (b) processing to generate gapped DNA with 3' ss tails; (c) strand invasion of 3' end; (d) second strand invasion and DNA repair synthesis from 3' ends; (e) branch migration and formation of an intermediate with two Holliday junctions.]
Box 10-1 How to Resolve a Recombination Intermediate with Two Holliday Junctions

How the Holliday junctions present in a recombination intermediate are cleaved has a huge impact on the structure of the product DNA molecules. Products will either have the DNA flanking the site of recombination reassorted (in the splice/crossover products) or not (in the patch/non-crossover products), depending on how resolution is achieved. Because the intermediates generated by the DSB-repair pathway contain two Holliday junctions, it can be difficult to see which products are generated by the different possible combinations of Holliday junction cleavage events. In fact, there is a simple pattern that determines whether crossover or non-crossover products are generated.

To explain the different possible ways these intermediates can be resolved, consider the two junctions (labeled x and y) in Box 10-1 Figure 1. For each junction, there are two possible cleavage sites (labeled site 1 and site 2). The simple rule that determines whether resolution will result in crossover or non-crossover products is as follows. If both junctions are cleaved in the same way, that is, either both at site 1 or both at site 2, then non-crossover products will be generated. An example of this type of product is shown in panel b of the figure; these are the molecules generated when both Holliday junctions are cleaved at site 2. Notice that the allele markers A/B and a/b are still on the same DNA molecules as they were in the parental chromosomes. Cleavage of both junctions at site 1 also generates non-crossover products.

In contrast, when the two Holliday junctions are cleaved using different sites, then the crossover products are generated. An example of this type of resolution is shown in panel c of Box 10-1 Figure 1. Here junction x was cleaved at site 1 whereas junction y was cleaved at site 2. Notice that now gene A is linked to gene b, whereas gene a is linked to gene B; thus reassortment of the flanking genes has occurred. Cleavage of junction x at site 2 and junction y at site 1 also generates crossover products.

Why is the simple rule true? To understand this, compare the junctions shown here to the single Holliday junction shown in Figure 10-2. You should see that, at a single junction, cleavage at site 1 would give the splice products, whereas cleavage at site 2 would generate patch products. So when you combine the results of cleavage at the two junctions, this is what happens:

• Cleavage of both junctions at site 2 will give a patch product (patch + patch = patch, non-crossover products).
• Cleavage of both junctions at site 1 also gives a patch product (splice + splice = patch, because the second splice-type resolution essentially "undoes" the rearrangement caused by the first cleavage).
• Cleavage of one junction at site 1, but the other at site 2, therefore generates crossover products (splice + patch = splice), because the rearrangement caused by the site 1 cleavage is retained in the final product.
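The rule summarized in the three points above amounts to counting splice-type cleavages: an odd number leaves the flanking genes reassorted, and an even number does not. A minimal Python sketch of that bookkeeping follows; the function and constant names are hypothetical and serve only to restate the rule, not to model the enzymology.

```python
# Outcome of cleaving a single Holliday junction at each site (from Figure 10-2).
SINGLE_JUNCTION_OUTCOME = {1: "splice", 2: "patch"}


def resolve_two_junctions(site_x: int, site_y: int) -> str:
    """Return the product type when junction x is cut at site_x and y at site_y."""
    for site in (site_x, site_y):
        if site not in SINGLE_JUNCTION_OUTCOME:
            raise ValueError("cleavage site must be 1 or 2")
    outcomes = [SINGLE_JUNCTION_OUTCOME[site_x], SINGLE_JUNCTION_OUTCOME[site_y]]
    # A second splice-type cut undoes the first, so only the parity matters.
    splice_count = outcomes.count("splice")
    return "crossover" if splice_count % 2 == 1 else "non-crossover"


for x in (1, 2):
    for y in (1, 2):
        print(f"x at site {x}, y at site {y}: {resolve_two_junctions(x, y)}")
```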
BOX 10-1 FIGURE 1 Two possible ways of resolving an intermediate from the DSB-repair pathway. The parental DNA molecules were like those in Figure 10-3. The regions of red DNA are those that were resynthesized during recombination.
[Box 10-1 Figure 1 panels: (a) intermediate with junction x and junction y, each with cleavage sites 1 and 2; (b) resolution at site 2 in both x and y, giving non-crossover products; (c) resolution of x at site 1 and y at site 2, giving crossover products.]
Double-Stranded DNA Breaks Arise by Numerous Means and Initiate Homologous Recombination

Double-stranded breaks in DNA arise quite frequently. If these breaks are not repaired, the consequence to the cell is disastrous. For example, a single DSB in the E. coli chromosome is lethal to a cell that lacks the ability to repair it. The major mechanism used to repair DSBs in most cells is homologous recombination via the DSB-repair pathway described above. Some cells also use a simpler mechanism, called nonhomologous end joining (NHEJ). This process is described in Chapter 9.

In bacteria, the major biological role of homologous recombination is to repair DSBs. These broken DNA ends arise from several causes (see Chapter 9). Ionizing radiation and other damaging agents sometimes directly break both strands of the DNA backbone. Many types of DNA damage also indirectly give rise to DSBs by interfering with the progress of a replication fork. For example, an unrepaired nick in one DNA strand will lead to collapse of a passing replication fork (Figure 10-4).
FIGURE 10-4 Damage in the DNA template can lead to DSB formation during DNA replication. This is easiest to see when the template contains a nick (left panel), but also can occur when the template carries a fork-stopping lesion (right panel). In this case, the two newly synthesized strands (shown in red) can base-pair and the fork can regress. This structure can be further processed by a number of means. The broken end can serve to initiate recombination.
Similarly, a lesion in DNA that makes a strand unable to serve as a template will stop a replication fork. This type of stalled fork can be processed by several different means (for example, fork regression or nuclease digestion; see Figure 10-4) that give rise to a DNA end with a DSB. These broken DNA ends then initiate recombination with a homologous DNA molecule, a process which will, in turn, heal the break.

In addition to repairing DSBs in chromosomal DNA, homologous recombination promotes genetic exchange in bacteria. This exchange occurs between the chromosome of one cell and DNA that enters that cell via phage-mediated transduction or cell-cell conjugation (see Chapter 21). In these cases, the entering DNA comes into the cell as a linear molecule, and thus provides the critical "broken" DNA end needed to initiate recombination.

In eukaryotic cells, homologous recombination is critical for repairing DNA breaks and collapsed replication forks. However, there are other times when recombination is also needed. As we will describe below, recombination is essential to the process of chromosome pairing during meiosis. In this case, as cells enter meiosis they produce a specific protein to introduce DSBs into the DNA and therefore initiate this recombination pathway (see below). Thus, although DSBs arise from many different sources, the appearance of a DSB in DNA is a key early event in homologous recombination.
HOMOLOGOUS RECOMBINATION PROTEIN MACHINES

Organisms from all branches of life encode enzymes that catalyze the biochemical steps of recombination. In some cases, members of homologous protein families provide the same function in all organisms. In contrast, other recombination steps are catalyzed by different classes of proteins in different organisms but with the same general outcome. Our most detailed understanding of the mechanism of recombination comes from studies of E. coli and its phage. Thus, in the following sections, we first focus on the proteins that promote recombination in E. coli via a major DSB-repair pathway, known as the RecBCD pathway. Homologous recombination in eukaryotic cells, and the proteins involved in these events, are considered in later sections.

Table 10-1 lists the proteins that catalyze critical recombination steps in bacteria as well as those that serve these same functions in eukaryotes (the budding yeast S. cerevisiae is the best-understood example). These proteins provide activities needed to complete important steps in the DSB-repair pathway. In addition to these dedicated recombination proteins, DNA polymerases, single-stranded DNA-binding proteins, topoisomerases, and ligases also have critical roles in the process of genetic exchange.

Notice that absent from the list in Table 10-1 is an E. coli protein that introduces DSBs in DNA, despite the fact that recombination via the RecBCD pathway requires a DSB on one of the two recombining DNA molecules. As discussed above, in bacteria, no specific protein has been found that carries out this task. Rather, breaks generated as a result of DNA damage or failure of a replication fork are the major source of these initiating events in chromosomal DNA.

The following sections describe the E. coli recombination proteins and how they perform their functions during recombination by the
TABLE 10-1 Prokaryotic and Eukaryotic Factors that Catalyze Recombination Steps

Recombination Step | E. coli Protein Catalyst | Eukaryotic Protein Catalyst
Pairing homologous DNAs and strand invasion | RecA protein | Rad51; Dmc1 (in meiosis)
Introduction of DSB | None | Spo11 (in meiosis); HO (for mating-type switching)
Processing DNA breaks to generate single strands for invasion | RecBCD helicase/nuclease | MRX protein (also called Rad50/58/60 nuclease)
Assembly of strand-exchange proteins | RecBCD and RecFOR | Rad52 and Rad59
Holliday junction recognition and branch migration | RuvAB complex | Unknown
Resolution of Holliday junctions | RuvC | perhaps Mus81 and others
DSB-repair pathway. These proteins are discussed in the order in which they appear during the reaction pathway. First, we will see how the RecBCD enzyme processes DNA at the site of the DSB to generate single-stranded regions. Next, the structure and mechanism of RecA, the strand-exchange protein, is described. RecA, after assembling on the single-stranded DNA, finds regions of sequence homology in the DNA molecules and generates new base-pairing partners between these regions. The RuvA and RuvB proteins that drive DNA branch migration are then described. Finally, the Holliday junction-resolving enzyme, RuvC, will be considered.
The RecBCD Helicase/Nuclease Processes Broken DNA Molecules for Recombination

DNA molecules with single-stranded DNA extensions or tails are the preferred substrate for initiating strand exchange between regions of homologous sequence. The RecBCD enzyme processes broken DNA molecules to generate these regions of ssDNA. RecBCD also helps load the RecA strand-exchange protein onto these ssDNA ends. In addition, as we will see, the multiple enzymatic activities of RecBCD provide a means for cells to "choose" whether to recombine with, or destroy, DNA molecules that enter a cell.

RecBCD is composed of three subunits (the products of the recB, recC, and recD genes) and has both DNA helicase and nuclease activities. It binds to DNA molecules at the site of a double-stranded break and tracks along DNA using the energy of ATP hydrolysis. As a result of its action, the DNA is unwound, with or without the accompanying nucleolytic destruction of one or both of the DNA strands.

The activities of RecBCD are controlled by specific DNA sequence elements known as chi sites (for crossover hotspot instigator). Chi sites were discovered because they stimulate the frequency of homologous recombination. Figure 10-5 shows a schematic of RecBCD processing a DNA molecule containing a single chi site to activate this DNA for recombination. RecBCD enters the DNA at the site of the double-strand break
FIGURE 10-5 Steps of DNA processing by RecBCD. Note that RecBCD protein could have entered this DNA molecule from either or both broken ends. However, chi sites function only in one orientation. On the DNA molecule shown, the chi site is oriented so that it will only modify a RecBCD enzyme that is moving from right to left. The RecBCD enzyme has two DNA helicases: RecD, which moves rapidly on the 5'-ending strand (bottom strand), and RecB, which moves slowly on the 3'-ending strand (top strand). Because these two subunits travel at different speeds, the DNA molecules accumulate a single-strand DNA loop on the top strand during unwinding. A red X is shown on the RecD subunit, after the enzyme has encountered the chi site, to denote the inactivation or loss of this subunit.
and moves along the DNA, unwinding the strands. The RecB and RecD subunits are both DNA helicases, that is, enzymes that use ATP hydrolysis to melt DNA base pairs (see Chapter 8). The nuclease activities of RecBCD frequently cleave each strand during unwinding and thereby destroy the DNA.

Upon encountering the chi sequence, the nuclease activity of the RecBCD enzyme is altered. As RecBCD moves into the sequence distal to the chi site (with respect to the broken DNA site at which the enzyme entered), it no longer cleaves the DNA strand with 3'→5' polarity. Furthermore, after the encounter with the chi site, the other DNA strand (the one with the 5'→3' polarity) is cleaved even more frequently than it was prior to the chi site. As a result of this change in activity, a duplex DNA molecule is converted into one with a 3' single-stranded extension terminating with the chi sequence at the 3' end. This structure is ideal for assembly of RecA and initiation of strand exchange (see below). The molecular basis of the change in RecBCD's enzyme activity after the encounter with a chi site is unclear, but appears to be associated with either the inactivation or loss of the RecD subunit.

The ssDNA tail generated by RecBCD must be coated by the RecA protein for recombination to occur. However, cells also contain single-stranded DNA-binding protein (SSB) that can bind to this DNA. To ensure that RecA, rather than SSB, binds these ssDNA tails, RecBCD interacts directly with RecA and promotes its assembly.

Chi sites increase the frequency of recombination about tenfold. This stimulation is most pronounced directly adjacent to the chi site. Although elevated recombination frequencies are observed for about 10 kb distal to the chi site, they drop off gradually over this distance (Figure 10-6). The observation that recombination is stimulated specifically only on one "side" of the chi site was initially puzzling. It is now clear, however, why this pattern is observed: the DNA between the DSB (where RecBCD enters) and the chi site is cut into small pieces by the enzyme and is therefore not available for recombination. In contrast, DNA sequences met by RecBCD after its encounter with chi are preserved in a recombinogenic, single-stranded form and are specifically loaded with RecA.

The ability of chi sites to control the nuclease activity of RecBCD also helps bacterial cells protect themselves from foreign DNA that may enter via phage infection or conjugation. The eight-nucleotide chi site (GCTGGTGG) is highly overrepresented in the E. coli genome: whereas it is predicted to occur only once every 65 kb, or about 80 times, the chromosomal sequence reveals the presence of 1,009 chi sites. Because of this overrepresentation, E. coli DNA that enters an E. coli cell is likely to be processed by RecBCD in a manner that generates the 3' ssDNA tails, and thus activated for recombination. In contrast, DNA from another species (in which E. coli chi sites are not overrepresented) will lack frequent chi sites. RecBCD action on this DNA will lead to its extensive degradation, rather than activation for recombination.

In summary, the DNA-degradation activity of RecBCD has multiple consequences. This degradation is needed to process DNA at a break site for the subsequent steps of RecA assembly and strand invasion. In this manner, RecBCD promotes recombination. However, because RecBCD degrades DNA to activate it, the overall process of homologous recombination must also involve DNA synthesis to regenerate the degraded strands. In addition, RecBCD sometimes functions simply to destroy DNA, as it does when foreign DNA lacking frequent chi sites enters cells. In this way, RecBCD can protect cells from the potentially deleterious consequences of taking up foreign sequences, which, for example, may carry a bacteriophage or other harmful agent.
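The "once every 65 kb, or about 80 times" prediction can be checked with a short calculation. The Python sketch below assumes, as a simplification, that the four bases are equally frequent and independent, that only one strand is scanned, and that the chromosome is approximately 5 Mb (the round figure quoted in the legend to Figure 10-6); the function names are hypothetical.

```python
def expected_spacing_bp(site_length: int, alphabet_size: int = 4) -> int:
    """Average distance between occurrences of one specific site of the given
    length, assuming equal and independent base frequencies."""
    return alphabet_size ** site_length


def expected_occurrences(genome_length_bp: int, site_length: int) -> float:
    """Expected number of occurrences on one strand of the genome."""
    return genome_length_bp / expected_spacing_bp(site_length)


CHI_LENGTH = 8               # chi is an eight-nucleotide site
GENOME_LENGTH_BP = 5_000_000  # approximate E. coli chromosome size (assumption)
OBSERVED_CHI_SITES = 1009     # count quoted in the text

spacing = expected_spacing_bp(CHI_LENGTH)
expected = expected_occurrences(GENOME_LENGTH_BP, CHI_LENGTH)
enrichment = OBSERVED_CHI_SITES / expected

print(f"expected spacing: {spacing} bp")
print(f"expected occurrences: {expected:.1f}")
print(f"observed/expected: {enrichment:.1f}")
```

Under these assumptions the expected spacing is 4^8 = 65,536 bp, in line with the text's 65 kb, and the observed count exceeds the expectation by more than an order of magnitude.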
FIGURE 10-6 Polar action of chi. This schematic shows that a chi site specifically elevates recombination frequencies directly at the site, as well as in the distal sequences. The recombination event shown represents exchange between a transferred linear DNA segment, introduced into a cell by transduction or conjugation, and the bacterial chromosome. The actual DNA segments participating may be much longer. For example, phage transduction often delivers an approximately 80 kb segment of DNA. The E. coli chromosome is approximately 5 Mb.
[Figure 10-6 labels: donor DNA (linear); recipient chromosome (circular); plotted quantity: relative recombination frequency.]
RecA Protein Assembles on Single-Stranded DNA and Promotes Strand Invasion
RecA is the central protein in homologous recombination. It is the founding member of a family of enzymes called strand-exchange proteins. These proteins catalyze the pairing of homologous DNA molecules. Pairing involves both the search for sequence matches between two molecules and the generation of regions of base pairing between these molecules.

The DNA pairing and strand-exchange activities of RecA can be observed using simple DNA substrates in vitro; examples of DNA pairing and strand-exchange reactions useful for demonstrating the biochemical activities of RecA are shown in Figure 10-7. The important features of these DNA molecules are: (1) DNA sequence complementarity between the two partner molecules; (2) a region of single-stranded DNA on at least one molecule to allow RecA assembly; and (3) the presence of a DNA end within the region of complementarity, enabling the DNA strands in the newly formed duplex to intertwine.

The active form of RecA is a protein-DNA filament (Figure 10-8). Unlike most proteins involved in molecular biology, which function as smaller discrete units such as monomers, dimers, or hexamers, the RecA filament is huge and variable in size; filaments that contain approximately 100 subunits of RecA and 300 nucleotides of DNA are common. The filament can accommodate one, two, three, or even four strands of DNA. As described below, filaments with either one or three bound strands are most common in recombination intermediates.

The structure of DNA within the filament is highly extended compared to either uncoated ssDNA or a standard B-form helix. On average, the distance between adjacent bases is 5 Å rather than the 3.4 Å spacing normally observed (Chapter 6). Thus, upon RecA binding, the length of a DNA molecule is extended approximately 1.5-fold (Figure 10-8a). It is within this RecA filament that the search for homologous DNA sequences is conducted and the exchange of DNA strands executed.
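The figures quoted for the filament can be tied together with simple arithmetic. The Python sketch below uses only the numbers given in the text (5 Å versus 3.4 Å per base, and about 100 subunits per 300 nucleotides); the function names are hypothetical, and the lengths are idealized values that ignore any irregularity in a real filament.

```python
RISE_B_FORM_ANGSTROM = 3.4   # spacing between adjacent bases in B-form DNA
RISE_IN_FILAMENT_ANGSTROM = 5.0  # average spacing within the RecA filament


def extension_factor() -> float:
    """Fold increase in DNA length upon RecA binding."""
    return RISE_IN_FILAMENT_ANGSTROM / RISE_B_FORM_ANGSTROM


def contour_length_nm(nucleotides: int, rise_angstrom: float) -> float:
    """Idealized length of a stretch of DNA, converted from angstroms to nm."""
    return nucleotides * rise_angstrom / 10.0


nucleotides = 300
subunits = 100

print(f"extension: {extension_factor():.2f}-fold")
print(f"nucleotides per RecA subunit: {nucleotides / subunits:.0f}")
print(f"length as B-form: {contour_length_nm(nucleotides, RISE_B_FORM_ANGSTROM):.0f} nm")
print(f"length in filament: {contour_length_nm(nucleotides, RISE_IN_FILAMENT_ANGSTROM):.0f} nm")
```

The ratio 5/3.4 is about 1.47, which is the "approximately 1.5-fold" extension cited above.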
To form a filament, subunits of RecA bind cooperatively to DNA. RecA binding and assembly are much more rapid on single-stranded than
FIGURE 10-7 Substrates for RecA strand exchange.
FIGURE 10-8 Three views of the RecA filament. (a) Electron micrograph of circul… RecA, … the one on the right is the related strand-exchange protein Rad51 from yeast. (Source: Image provided by Edward Egelman, University of Virginia.) (c) A higher-resolution view generated by X-ray crystallography. Here one turn of the helical filament is shown from a top-down view. Individual subunits are colored; the red subunit is closest to the viewer. (Story R.M. and Steitz T.A. 1992. Nature 355: 3 16.) Image prepared with … and Raster 3D.
FIGURE 10-9 Polarity of RecA assembly. Note that new subunits of RecA join the filament on the DNA 3' side of an existing subunit much faster than these subunits join on the 5' side. Because of this polarity of assembly, DNA molecules with 3' ssDNA extensions will be efficiently coated with RecA. In contrast, molecules with 5' ssDNA extensions would not serve as substrates for filament assembly.
[Figure 10-9 labels: no filament formation; active filament formation to coat 3' end of ssDNA.]
on double-stranded DNA, thus explaining the need for regions of ssDNA in strand-exchange substrates. The filament grows by the addition of RecA subunits in the 5' to 3' direction, such that a DNA strand that terminates in 3' ends is most likely to be coated by RecA (Figure 10-9). Note that in the DSB-repair model for recombination, it is DNA molecules with just this structure that participate in strand invasion.
Newly Base-Paired Partners Are Established within the RecA Filament

RecA-catalyzed strand exchange can be divided into distinct reaction stages. First, the RecA filament must assemble on one of the participating DNA molecules. Assembly occurs on a molecule containing a region of single-stranded DNA, such as an ssDNA tail. This RecA-ssDNA complex is the active form that participates in the search for homology. During this search, RecA must "look" for base-pair complementarity between the DNA within the filament and a new DNA molecule.

This homology search is promoted by RecA because the filament structure has two distinct DNA-binding sites: a primary site (bound by the first DNA molecule), and a secondary site (Figure 10-10). This secondary DNA-binding site can be occupied by double-stranded DNA. Binding to this site is rapid, weak, transient, and, importantly, independent of DNA sequence. In this way, the RecA filament can bind and rapidly "sample" huge stretches of DNA for sequence homology.

How does the RecA filament sense sequence homology? Details of this mechanism are still not clear. The DNA in the secondary binding site is transiently opened and tested for complementarity with the ssDNA in the primary site. This "testing" is presumably via base-pairing interactions, although it occurs initially without disrupting the global base-pairing between the two strands of the DNA in the secondary site. In support of this idea, experiments suggest that the initial alignment may involve base-flipping of some of the bases in the DNA duplex (see Chapter 9 for a discussion of base-flipping during DNA repair). In vitro experiments indicate that a sequence match of just 15 base pairs provides a sufficient signal to the RecA filament that a match has been found, and thereby triggers strand exchange.
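The outcome of the homology search, though not its physical mechanism, can be pictured as a string-matching problem. The Python sketch below is a toy illustration only: it assumes exact matching, treats the 15-base-pair figure from the in vitro experiments as a hard threshold, and uses a hypothetical function `find_homology` and made-up sequences. An ssDNA that can pair with one strand of a duplex has the same sequence as the other strand, so the sketch looks for a 15-nucleotide window of the ssDNA in either strand of the duplex.

```python
from typing import Optional, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_homology(ssdna: str, duplex_top: str,
                  min_match: int = 15) -> Optional[Tuple[int, str, int]]:
    """Find the first window of min_match nucleotides of ssdna that is
    identical to a stretch of either duplex strand.

    Returns (start in ssdna, strand label, start in that strand), where the
    strand label is "+" for duplex_top and "-" for its reverse complement,
    or None if no window of the required length matches.
    """
    strands = (("+", duplex_top), ("-", reverse_complement(duplex_top)))
    for i in range(len(ssdna) - min_match + 1):
        window = ssdna[i:i + min_match]
        for label, strand in strands:
            j = strand.find(window)
            if j != -1:
                return (i, label, j)
    return None


duplex = "GATTACAGGCTAGCTAGGATCCGATCGTACGATCGATTTAGC"  # hypothetical duplex, top strand
tail = "TTTT" + duplex[10:30]  # hypothetical ssDNA tail: 4 unmatched nt, then 20 matching nt

print(find_homology(tail, duplex))
print(find_homology(tail[:18], duplex))  # only 14 matching nt: below the threshold
```

A real filament samples duplex DNA transiently and tolerates some mismatch, so this exact-match search should be read as a picture of the threshold idea rather than of the enzyme's behavior.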
FIGURE 10-10 Model of two steps in the search for homology and DNA strand exchange within the RecA filament. Here the RecA filament is represented from a top-down view as in Figure 10-8c. The incoming DNA duplex is shown in blue. (Source: Adapted … Publishing Group. Used with permission.)
[Figure 10-10 labels: RecA filament with primary binding site and secondary binding site; DNA strand bound to RecA protein; DNA in secondary site is tested for complementarity; base-pairing between strands is switched.]
Once a region of base-pair complementarity is located, RecA promotes the formation of a stable complex between these two DNA molecules. This RecA-bound three-stranded structure is called a joint molecule and usually contains several hundred base pairs of hybrid DNA. It is within this joint molecule that the actual exchange of DNA strands occurs. The DNA strand in the primary binding site becomes base-paired with its complement in the DNA duplex bound in the secondary site. Strand exchange thus requires the breaking of one set of base pairs and the formation of a new set of identical base pairs. Completion of strand exchange also requires that the two newly paired strands be intertwined to form a proper double helix. RecA binds preferentially to the DNA products after strand exchange has occurred, and it is this binding energy that actually drives the exchange reaction toward the new DNA configuration.
RecA Homologs Are Present in All Organisms

Strand-exchange proteins of the RecA family are present in all forms of life. The best-characterized members are RecA from Eubacteria, RadA from Archaea, Rad51 and Dmc1 from Eukaryota, and the bacteriophage T4 UvsX protein. These proteins form filaments similar to that made by RecA (Figure 10-11) and likely function in an analogous manner (although some features of the proteins are specifically tailored for their specific cellular roles and interaction partners). We will discuss the roles of Rad51 and Dmc1 in recombination in eukaryotic cells below.
RuvAB Complex Specifically Recognizes Holliday Junctions and Promotes Branch Migration

After the strand invasion step of recombination is complete, the two recombining DNA molecules are connected by a DNA branch known as a Holliday junction (see above). Movement of the site of this branch requires exchange of DNA base pairs between the two homologous DNA duplexes. Cells encode proteins that greatly stimulate the rate of branch migration.

RuvA protein is a Holliday junction-specific DNA-binding protein that recognizes the structure of the DNA junction, regardless of its specific DNA sequence. RuvA recognizes and binds to Holliday junctions and recruits the RuvB protein to this site. RuvB is a hexameric ATPase, similar to the hexameric helicases involved in DNA replication (see Chapter 8). The RuvB ATPase provides the energy to drive the exchange of base pairs that moves the DNA branch. Structural models for RuvAB complexes at a Holliday junction show how a tetramer of RuvA, together with two hexamers of RuvB, work together to power this DNA exchange process (Figure 10-12).

RuvC Cleaves Specific DNA Strands at the Holliday Junction to Finish Recombination

Completion of recombination requires that the Holliday junction (or junctions) between the two recombining DNA molecules be resolved. In bacteria, the major Holliday junction-resolving endonuclease is RuvC. RuvC was discovered and purified based on its ability to cut
F I e u R E 10·11 RecA-Jike protelns in three branches of life. Nudeoprolein filaments are sh
HU/IIulogOU$ RecombillolJon P rol,,'¡1l Mochi"es
a
b S'
3'
S' 10-12 High resoIution stTUcture of RuvA and sehematic model DI the RwAB (omptex bound lo HoIliday junction DNA. (a) The aystal5b'uCtUre oIlhe RuvA tetramer shoos the foorfokl syrnrn1Jy of !he prote!f1. (Myoshi M~ Nishino T, lwasaki H. Shinagawa H.. and MOO
ONA junctions made by RecA in vitro. Genetic evidence inoicales that it runctions in concert with RuvA and RuvB . Resolution by RuvC occurs when RuvC recognizes the Holli day juncHan (likely in a complex with RuvA anrl RuvB) and specifi ca Lly nicks two of the homologous DN A slrands that have Ihe samo polanty. This cleavage t esults in DNA ends thal tem1inate with 5' phosphales and 3'OH groups that ca n be rlireclly joinad by DNA li gase. Depending OH which pair of slrands is cleaved by RuvC, the result in g ligated recombination products will be of either the "splice" (crossover) or "palch" (non-crossover) type. The strueture of RuvC and a model schematic proposing how it may inlerne! wilh junr.tion DNA are shown in Figure 10-13. Despite recognizing a structurc ratllar !han a specifi c sequance, RuvC cleaves ONA with morles! sequence specificity. Clcavage ta kcs place only al sites confonning to Ihc eonscnsus 5'AIT-T-T-G/C. Clc
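The sequence preference of RuvC described above can be illustrated with a short sketch. It only scans a single 5'-to-3' strand for the four-base consensus (A or T, then T, T, then G or C). The function name and the example sequence are hypothetical, and the sketch makes no attempt to model the junction structure that RuvC must also recognize, or the exact bond that is cut.

```python
import re

# Consensus quoted in the text: 5'-(A/T)-T-T-(G/C)-3'.
# A lookahead is used so that overlapping matches are all reported.
RUVC_CONSENSUS = re.compile(r"(?=([AT]TT[GC]))")


def find_ruvc_consensus_sites(strand):
    """Return 0-based start positions of consensus matches on a 5'->3' strand."""
    strand = strand.upper()
    return [match.start() for match in RUVC_CONSENSUS.finditer(strand)]


if __name__ == "__main__":
    example = "GCATTGCCTTTCAATTG"  # hypothetical sequence, for illustration only
    for position in find_ruvc_consensus_sites(example):
        print(position, example[position:position + 4])
```

Because the pattern is only four bases long and partly degenerate, matches are frequent in any long sequence, which is consistent with the statement later in the chapter that the preferred sequences are very common.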
FIGURE 10-13 High-resolution structure of the RuvC resolvase and schematic model of the RuvC dimer bound to Holliday junction DNA. (a) The crystal structure of the RuvC protein. (Ariyoshi M., Vassylyev D.G., Iwasaki H., Nakamura H., Shinagawa H., and Morikawa K. 1994. Cell 78: 1063-1072.) Image prepared with BobScript, MolScript, and Raster3D. (b) Model for binding of a RuvC dimer to a Holliday junction.
HOMOLOGOUS RECOMBINATION IN EUKARYOTES
Homologous Recombination Has Additional Functions in Eukaryotes
As we have just described, homologous recombination in bacteria is required to repair double-stranded breaks in DNA, to restart collapsed replication forks, and to allow a cell's chromosomal DNA to recombine with DNA that enters via phage infection or conjugation. Homologous recombination is also required for DNA repair and the restarting of collapsed replication forks in eukaryotic cells. This requirement is illustrated by the fact that cells with defects in the proteins that promote recombination are hypersensitive to DNA-damaging agents, especially those that break DNA strands. Furthermore, animals carrying mutations that interfere with homologous recombination are predisposed to certain types of cancer. However, as we will discuss below, homologous recombination plays important additional roles in eukaryotic organisms. Most importantly, homologous recombination is critical for meiosis. During meiosis, homologous recombination is required for proper chromosome pairing and, thus, for maintaining the integrity of the genome. This recombination also reshuffles genes between the parental chromosomes, ensuring variation in the sets of genes passed to the next generation.
Homologous Recombination Is Required for Chromosome Segregation during Meiosis
As we saw in Chapter 7, meiosis involves two rounds of nuclear division, resulting in a reduction of the DNA content from the normal content of diploid cells (2N) to the content present in gametes (1N). Figure 10-14 shows schematically how the chromosomes are configured during these two division cycles. Before division, the cell has two copies of each chromosome (the homologs), one inherited from each of its two parents. During S phase, these chromosomes are replicated to give a total DNA content of 4N. The products of replication (that is, the sister chromatids) stay together. Then, in preparation for the first nuclear division, these duplicated homologous chromosomes must pair and align at the center of the cell. It is this pairing of homologs that requires homologous recombination (Figure 10-14). These events are carefully timed. Recombination must be complete before the first nuclear division to allow the homologs to properly align and then separate. During this process, sister chromatids remain paired (see Chapter 7, Figure 7-16). Then, in the second nuclear division, it is the sister chromatids that separate. The products of this division are the four gametes, each with one copy of each chromosome (that is, the 1N DNA content). Without recombination, chromosomes often fail to align properly for the first meiotic division, and, as a result, there is a high incidence of chromosome loss.
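The bookkeeping of DNA content described above (2N, then 4N after S phase, then a halving at each nuclear division) can be traced with a few lines of code. This is a minimal sketch of the arithmetic only; the function name and stage labels are illustrative and do not model the chromosomes themselves.

```python
def meiotic_dna_content():
    """Trace DNA content per cell, in multiples of N, through meiosis."""
    stages = []
    content = 2                      # diploid cell before S phase: 2N
    stages.append(("before S phase", content))
    content *= 2                     # replication doubles the DNA: 4N
    stages.append(("after S phase", content))
    content //= 2                    # first division: homologs separate
    stages.append(("after first nuclear division", content))
    content //= 2                    # second division: sister chromatids separate
    stages.append(("after second nuclear division (gamete)", content))
    return stages


if __name__ == "__main__":
    for stage, content in meiotic_dna_content():
        print(f"{stage}: {content}N")
```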
Programmed Generation of Double-Stranded DNA Breaks Occurs during Meiosis
The developmental program needed for cells to successfully complete meiosis involves turning on the expression of many genes that are not needed during normal growth. One of these is SPO11. This gene encodes a protein that introduces double-strand breaks in chromosomal DNA to initiate meiotic recombination. The Spo11 protein cuts the DNA at many chromosomal locations, with little sequence selectivity, but at a very specific time during meiosis. Spo11-mediated DNA cleavage occurs right around the time when the replicated homologous chromosomes start to pair. Spo11 cut sites, although frequent, are not randomly distributed along the DNA. Rather, the cut sites are located most commonly in chromosomal regions that are not tightly packed with nucleosomes, such as promoters controlling gene transcription (see Chapters 7 and 17). Regions of DNA that experience a high frequency of DSBs also show a high frequency of recombination. Thus, the most commonly used Spo11 DNA cleavage sites, like chi sites, are hot spots for recombination.
FIGURE 10-14 DNA dynamics during meiosis. Here, only one type of chromosome is shown for clarity. The two homologs are shown, in red and blue, after they have been duplicated by a round of DNA replication. Homologous recombination is required to pair these homologous chromosomes in preparation for the first nuclear division. This recombination can also lead to crossing over, as is shown here between the A and B genes.
FIGURE 10-15 Cytological view of crossing over. Reciprocal crossing over directly visualized in hamster cells in tissue culture. Chromosomes whose DNA contains bromodeoxyuridine in place of thymidine in both strands appear light after treatment with Giemsa stain, whereas those containing DNA substituted in only one strand appear dark. After two generations of growth in bromodeoxyuridine, one newly replicated chromatid has only one of its strands substituted, whereas its sister has both substituted. Thus, sister chromatids can be distinguished by staining. Then crossovers are easily detected as alternating lengths of light and dark (top). Similar recombinant chromosomes are also seen when mitotically growing cells are treated with a DNA-damaging agent (bottom). (Source: Courtesy of Sheldon Wolff and Judy Bodycote.)
The mechanism of DNA cleavage is as follows. A specific tyrosine side chain in the Spo11 protein attacks the phosphodiester backbone to cut the DNA and generate a covalent complex between the protein and the severed DNA strand (Figure 10-16). Two subunits of Spo11 cleave the DNA two nucleotides apart on the two DNA strands to make a staggered double-strand break. Spo11 shares this DNA cleavage mechanism with the DNA topoisomerases and the site-specific recombinases (see Chapter 6 and Chapter 11). In fact, Spo11 appears to be a distant cousin of these enzymes. The fact that Spo11 cleavage involves a covalent protein-DNA complex has two consequences. First, the 5' ends of the DNA at the site of Spo11 cleavage are covalently bound to the enzyme. It is these Spo11-linked 5' DNA ends that are the initial sites of DNA processing to create the ssDNA tails required for assembly of RecA-like proteins and initiation of DNA strand invasion (see below). Second, the energy of the cleaved DNA phosphodiester bond is stored in the bound protein-DNA linkage, and so the DNA strands can be resealed by a simple reversal of the cleavage reaction (see Chapter 11, Figure 11-7). This resealing can occur when cells receive a signal to stop proceeding with meiosis.
FIGURE 10-16 Mechanism of cleavage by Spo11. The OH group of a tyrosine in the Spo11 protein attacks the DNA to form a covalent protein-DNA linkage. Two subunits of Spo11 are required to generate a double-stranded DNA break, one to attack each of the two DNA strands. Note that, because of this cleavage mechanism, the DSB can be resealed by the simple reversal of the cleavage reaction.
MRX Protein Processes the Cleaved DNA Ends for Assembly of the RecA-like Strand-Exchange Proteins
The DNA at the site of the Spo11-catalyzed double-strand break is processed to generate single-stranded regions needed for assembly of the RecA-like strand-exchange proteins. As was observed in the RecBCD pathway from bacteria, this processing generates long segments of single-stranded DNA that terminate in 3' ends (Figure 10-17). During meiotic recombination, the MRX enzyme complex is responsible for this DNA processing event. This complex, although not homologous to RecBCD, is also a multi-subunit DNA nuclease. MRX is composed of protein subunits called Mre11, Rad50, and Xrs2; the first letters of these subunits give the complex its name. Processing of the DNA at the break site occurs exclusively on the DNA strand that terminates with a 5' end, that is, the strands covalently attached to the Spo11 protein (as described above). The strands terminating with 3' ends are not degraded. This DNA-processing reaction is therefore called 5' to 3' resection. The MRX-dependent 5' to 3' resection generates long ssDNA tails with 3' ends that are often 1 kb or longer. The MRX complex is also thought to remove the DNA-linked Spo11.
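The strand polarity of 5' to 3' resection is easy to lose track of, so a small sketch may help. It assumes a simplified blunt break (the real Spo11 break is staggered by two nucleotides) and represents degraded nucleotides with "-". The function name, the representation, and the example sequence are all hypothetical; the point is only that the strand ending 5' at the break is removed on each side, leaving a single-stranded tail that ends 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def resect_5_to_3(top_strand, break_pos, resection_len):
    """Resect the 5'-ending strand on each side of a blunt break.

    top_strand is written 5'->3', left to right. The bottom strand is kept
    aligned underneath it, so it reads 3'->5' from left to right.
    Returns {"left": (top, bottom), "right": (top, bottom)}.
    """
    bottom = top_strand.translate(COMPLEMENT)
    left_top, right_top = top_strand[:break_pos], top_strand[break_pos:]
    left_bottom, right_bottom = bottom[:break_pos], bottom[break_pos:]

    n_left = min(resection_len, len(left_bottom))
    n_right = min(resection_len, len(right_top))

    # Left fragment: the bottom strand has its 5' end at the break, so it is degraded.
    left_bottom = left_bottom[:len(left_bottom) - n_left] + "-" * n_left
    # Right fragment: the top strand has its 5' end at the break, so it is degraded.
    right_top = "-" * n_right + right_top[n_right:]

    return {"left": (left_top, left_bottom), "right": (right_top, right_bottom)}


if __name__ == "__main__":
    ends = resect_5_to_3("ATGCCGTAAGCT", break_pos=6, resection_len=3)
    for side, (top, bottom) in ends.items():
        print(side)
        print("  5'", top, "3'")
        print("  3'", bottom, "5'")
```

In the output, each side of the break keeps an unpaired stretch on the strand that terminates with a 3' end, which is the substrate on which the strand-exchange proteins assemble.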
Dmc1 Is a RecA-like Protein that Specifically Functions in Meiotic Recombination
Eukaryotes encode two well-characterized RecA-like strand-exchange proteins, Rad51 and Dmc1, both of which participate in meiotic recombination. Whereas Rad51 is widely expressed in cells dividing mitotically and meiotically, Dmc1 is expressed only as cells enter meiosis. Strand exchange during meiosis occurs between a particular type of homologous DNA partner. Recall that meiotic recombination occurs at a time when there are four complete, double-stranded DNA molecules representing each chromosome: the two homologs, each of which has been copied to generate two sister chromatids (Figure 10-18). Although the two homologs likely contain small sequence differences and carry distinct alleles for various genes, the majority of the DNA sequence among these four copies of the chromosome will be identical. Interestingly, Dmc1-dependent recombination is preferentially between the nonsister homologous chromatids, rather than between the sisters (Figure 10-18). Although the mechanistic basis of this selectivity is unknown, there is a clear biological rationale: meiotic recombination promotes interhomolog connections to assist alignment of the chromosomes for division.
FIGURE 10-17 Overview of meiotic recombination pathway. Formation of the double-stranded breaks during meiosis requires the presence of both Spo11 and the MRX complex. This observation suggests that DSB formation and subsequent strand processing are normally coupled by the coordinated action of several proteins. MRX protein is responsible for resection of the 5'-ending strands at the break site. The strand-exchange proteins Dmc1 and Rad51 then assemble on the ssDNA tails. Both proteins participate in recombination, but how they work together is not known. They are shown forming separate filaments for clarity. (Source: Lichten M. 2001. Breaking the genome to save it. Current Biology 11: fig 2, p. R255. Copyright © 2001, with permission from Elsevier.)
FIGURE 10-18 Dmc1-dependent recombination occurs preferentially between nonsister homologous chromatids.
Many Proteins Function Together to Promote Meiotic Recombination
As we have described, proteins involved in the critical stages of DSB formation, DNA processing to generate 3' ssDNA tails, and strand exchange during meiotic recombination have been identified and characterized. Genetic experiments indicate that many additional proteins also participate in this process. Furthermore, many proteins appear to interact with the known recombination enzymes, and it seems likely that these proteins function in the context of a large multicomponent complex. These large protein-DNA complexes, known as recombination factories, can be visualized in cells. For example, the co-localization of Rad51 and Dmc1 to these factories during meiosis is shown in Figure 10-19. Rad52 is another essential recombination protein that interacts with Rad51. Rad52 functions to promote assembly of Rad51 DNA filaments, the active form of Rad51. It does this by antagonizing the action of RPA, the major single-stranded DNA-binding protein present in eukaryotic cells. In this respect, Rad52 shares an activity with the E. coli RecBCD protein, which, as we learned, helps RecA load onto ssDNA that would otherwise have been bound by SSB. By analogy with bacteria, we expect that eukaryotic cells encode proteins that promote the branch migration and Holliday junction resolution steps of recombination. In fact, enzymes capable of promoting these reactions are being identified. For example, the Mus81 protein, which is highly conserved in eukaryotes, is required for meiosis and may function as a Holliday junction resolvase.
As we have seen, meiotic recombination aligns homologous chromosomes and promotes genetic exchange between them. These recombination reactions often lead to crossing over between the parental chromosomes. Recall, however, that depending on how the Holliday junctions in the recombination intermediates are resolved, recombination via the DSB-repair pathway can also give rise to non-crossover products (see above). These events may provide the essential chromosome-pairing function needed for a successful meiotic division, yet leave no detectable change in the genetic makeup of the chromosomes. But even non-crossover recombination can have genetic consequences, such as giving rise to a gene conversion event. Gene conversion happens when an allele of a gene is lost and replaced by an alternative allele. Examples of how gene conversion occurs both in mitotically growing cells and during meiosis are described in the following sections.
FIGURE 10-19 Co-localization of the Rad51 and Dmc1 proteins to "recombination factories."
MATING-TYPE SWITCHING
In addition to promoting DNA pairing, DNA repair, and genetic exchange, homologous recombination can also serve to change the DNA sequence at a specific chromosomal location. This type of recombination is sometimes used to regulate gene expression. For example, recombination controls the mating type of the budding yeast S. cerevisiae by switching which mating-type genes are present at a specific, expressed location in that organism's genome. S. cerevisiae is a single-cell eukaryote that can exist as any of three different cell types (see Chapter 21). Haploid S. cerevisiae cells can be either of two mating types, a or α. When an a cell and an α cell come in close proximity they can fuse (that is, "mate") to form an a/α diploid cell. The a/α cell may then go through meiosis to form two haploid a cells and two haploid α cells.
The mating-type genes encode transcriptional regulators. These regulators control expression of target genes whose products define each cell type. The mating-type genes expressed in a given cell are those found at the mating-type locus (MAT locus) in that cell (Figure 10-20). Thus, in a cells the a1 gene is present at the MAT locus, whereas in α cells, the α1 and α2 genes are present at the MAT locus. In the diploid cell, both sets of mating-type control genes are expressed. The regulators encoded by the mating-type genes, together with others found in all three cell types, act in various combinations to ensure that the correct pattern of genes is expressed in each cell type (see Chapter 17).
Cells can switch their mating type by recombination, as we now describe. In addition to the a or α genes present at the MAT locus in each cell, there is an additional copy of both the a and α genes present (but not expressed) elsewhere in the genome. These additional silent copies are found at loci called HMR and HML (Figure 10-20).
FIGURE 10-20 Genetic loci encoding mating-type information. Although chromosome III carries three mating-type loci, only the genes at the MAT locus are expressed. HML encodes a silent copy of the α genes, whereas HMR encodes a silent copy of the a genes. When recombination occurs between MAT and HML, a cells switch to α cells. When recombination occurs between MAT and HMR, α cells switch to a cells. (Source: Adapted from Haber J.E. 1998. Mating-type gene switching in Saccharomyces cerevisiae. Annual Review of Genetics 32: fig 3, p. 566. Copyright © 1998 by Annual Reviews. www.annualreviews.org)
These HMR and HML loci are therefore known as silent cassettes. Their function is to provide a "storehouse" of genetic information that can be used to switch a cell's mating type. This switch requires the transfer of genetic information from the HM sites to the MAT locus via homologous recombination.
Mating-Type Switching Is Initiated by a Site-Specific Double-Strand Break
Mating-type switching is initiated by the introduction of a DSB at the MAT locus. This reaction is performed by a specialized DNA-cleaving enzyme, called the HO endonuclease. Expression of the HO gene is tightly regulated to ensure that switching occurs only when it should. The mechanisms responsible for this regulation are discussed in Chapters 17 and 18. HO is a sequence-specific endonuclease; the only sites in the yeast chromosome that carry HO recognition sequences are the mating-type loci. HO cutting introduces a staggered break in the chromosome. In contrast to Spo11 cleavage, HO simply hydrolyzes the DNA and does not remain covalently linked to the cut strands.
5' to 3' resection of the DNA at the site of the HO-induced break occurs by the same mechanism used during meiotic recombination. Thus, resection depends on the MRX protein complex and is specific for the strands that terminate with 5' ends. In contrast, the strands terminating with 3' ends are very stable. Once the long 3' ssDNA tails have been generated, they associate with the Rad51 and Rad52 proteins (as well as other proteins that help the assembly of the recombinogenic protein-DNA complex). These Rad51 protein-coated strands then search for homologous chromosomal regions to initiate strand invasion and genetic exchange.
Mating-type switching is unidirectional. That is, sequence information (although not the actual DNA segment) is "moved" to the MAT locus from HMR and HML, but information never "goes" in the other direction. Thus, the cut MAT locus is always the "recipient" partner during recombination, and the HMR and HML sites remain unchanged by the recombination process. This directionality stems from the fact that HO endonuclease cannot cleave its recognition sequence at either HML or HMR because the chromatin structure renders these sites inaccessible to this enzyme.
The Rad51-coated 3' ssDNA tails from the MAT locus "choose" the DNA at either the HMR or HML locus for strand invasion. If the DNA sequence at MAT is a, then invasion will occur with HML, which carries the "storage" copy of the α sequences. In contrast, if the α genes are present at MAT, then invasion occurs with HMR, the locus that carries the stored a sequences. After recombination, the genetic information that was at the chosen HM locus is present at the MAT locus as well. This genetic change occurs without a reciprocal swap of information from MAT to the HM loci. This type of nonreciprocal recombination event is a specialized example of gene conversion.
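The donor choice and the one-way flow of information described above can be summarized in a small sketch. The dictionary and function names are hypothetical, and the model captures only the logic in the text: MAT receives a copy of the information at the chosen silent cassette, while the cassette itself is left unchanged.

```python
# Silent cassettes and the mating-type information each one stores.
SILENT_CASSETTES = {"HML": "alpha", "HMR": "a"}


def choose_donor(mat_allele):
    """Pick the silent cassette used as donor for a given MAT allele."""
    if mat_allele == "a":
        return "HML"      # a cells copy the alpha information from HML
    if mat_allele == "alpha":
        return "HMR"      # alpha cells copy the a information from HMR
    raise ValueError(f"unknown mating type: {mat_allele!r}")


def switch_mating_type(mat_allele):
    """Return (donor locus, new MAT allele) after one switching event.

    The donor's information is copied to MAT (gene conversion);
    nothing is written back to the donor.
    """
    donor = choose_donor(mat_allele)
    return donor, SILENT_CASSETTES[donor]


if __name__ == "__main__":
    for start in ("a", "alpha"):
        donor, new_allele = switch_mating_type(start)
        print(f"MAT{start} + {donor} -> MAT{new_allele}")
```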
Mating-Type Switching Is a Gene Conversion Event, Not Associated with Crossing Over
Although the DSB-repair pathway could explain the mechanism of mating-type switch recombination, current evidence indicates that, after the strand invasion step, this recombination pathway diverges from the DSB-repair mechanism. One hint that the mechanism is different is that the crossover class of recombination products is never observed during mating-type switching. Recall that in the DSB-repair pathway, resolution of the Holliday junction intermediates gives two classes of products: the splice, or crossover, class and the patch, or non-crossover, class (see Figure 10-2). According to the DSB-repair model, these two types of products are predicted to occur at a similar frequency, yet in mating-type switching, crossover products are never observed. Therefore, models for recombination that do not involve Holliday junction intermediates better explain mating-type switching.
To explain gene conversion without crossing over, a new recombination model termed synthesis-dependent strand annealing (SDSA) has been proposed. Figure 10-21 shows how mating-type switching can occur using this mechanism. The initiating event is, as described above, the introduction of a DSB at the recombination site (Figure 10-21a). After strand invasion, the invading 3' end serves as the primer to initiate new DNA synthesis (Figure 10-21c and d). Remarkably, in contrast to what occurs during the DSB-repair pathway, a complete replication fork is assembled at this site. Both leading and lagging strand DNA synthesis occurs. In contrast to normal DNA replication, however, the newly synthesized strands are displaced from the template. As a result, a new double-stranded DNA segment is synthesized and joined to the DNA site that was originally cut by HO and resected by MRX. This new segment has the sequence of the DNA segment used as the template (HMRa in Figure 10-21).
Completing recombination requires that the other "old" DNA strand present at MAT (the 3'-ending strand not cleaved by MRX) be removed (the bottom strand in Figure 10-21d). Then, the newly synthesized DNA, an exact copy of the information in the partner DNA molecule, replaces the information that was originally present. This mechanism nicely explains how gene conversion occurs without formation of a Holliday junction. Thus, by this model, the absence of crossover products during mating-type recombination is no longer mysterious.
FIGURE 10-21 Recombination model for mating-type switching: synthesis-dependent strand annealing (SDSA). The figure shows the steps leading to gene conversion at the MAT locus. The HMR and MAT regions are shown in green.
GENETIC CONSEQUENCES OF THE MECHANISM OF HOMOLOGOUS RECOMBINATION
As discussed in the beginning of this chapter, initial models for the mechanism of homologous recombination were formulated largely to explain the genetic consequences of the process. Now that the basic steps involved in recombination are understood, it is useful to review how the process of homologous recombination alters DNA molecules and thereby generates specific genetic changes. A central feature of homologous recombination is that it can occur between any two regions of DNA, regardless of the sequence, provided that these regions are sufficiently similar. We now understand why this is true: none of the steps in homologous recombination requires recognition of a specific DNA sequence. For steps that have some sequence preference (such as the transformation of RecBCD by chi sites and DNA cleavage by RuvC protein), the preferred sequences are very common. The committed step during recombination between two DNA molecules occurs when a strand-exchange protein of the RecA family successfully pairs the molecules, a process dictated only by the normal capacity of DNA strands to form proper base pairs.
A corollary of the fact that recombination is generally independent of sequence is that the frequency of recombination between any two genes is generally proportional to the distance between those genes. This proportionality is observed because regions of DNA are, in general, equally likely to be used to initiate a successful recombination event. This fundamental aspect of homologous recombination is what makes it possible to use recombination frequencies to generate useful genetic maps that display the order and spacing of genes along a chromosome. Distortions in genetic maps compared to physical maps occur when a region of DNA does not have the "average" probability of participating in recombination (Figure 10-22). Regions with a higher-than-average probability are "hot spots," whereas regions that participate less commonly than an average segment are "cold." Therefore, two genes that have a hot spot between them appear in a genetic map to be farther apart than is true in a physical map of the same region. In contrast, genes separated by a "cold" interval appear by genetic mapping to be closer together than is true from their physical distance. We have encountered two examples for the molecular explanation of hot and cold spots in chromosomes. Regions near chi sites and Spo11 cleavage sites have a higher-than-average probability of initiating recombination and are "hot," whereas regions having few such sites are correspondingly "cold."
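The comparison between genetic and physical maps described above amounts to comparing each interval's recombination rate (genetic distance per unit of physical distance) with the average over the region. The sketch below does that for a set of intervals. The gene names and all of the numbers are invented for illustration and are not taken from Figure 10-22.

```python
def classify_intervals(intervals):
    """Label each interval "hot", "cold", or "average".

    intervals: list of (name, genetic_distance_cM, physical_distance_kb).
    An interval is "hot" if its cM-per-kb rate exceeds the average over
    all intervals taken together, and "cold" if it falls below it.
    """
    total_cm = sum(cm for _, cm, _ in intervals)
    total_kb = sum(kb for _, _, kb in intervals)
    average_rate = total_cm / total_kb

    labels = {}
    for name, cm, kb in intervals:
        rate = cm / kb
        if rate > average_rate:
            labels[name] = "hot"
        elif rate < average_rate:
            labels[name] = "cold"
        else:
            labels[name] = "average"
    return labels


if __name__ == "__main__":
    # Hypothetical intervals: equal physical length, unequal genetic length.
    example = [
        ("gene1-gene2", 2.0, 40.0),
        ("gene2-gene3", 18.0, 40.0),
        ("gene3-gene4", 10.0, 40.0),
    ]
    print(classify_intervals(example))
```

In this made-up example the three intervals are physically the same size, yet the second would appear nine times longer than the first on a genetic map, which is the kind of distortion the text attributes to hot and cold spots.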
Gene Conversion Occurs because DNA Is Repaired during Recombination
Another genetic consequence of homologous recombination is gene conversion. We have introduced the concept of gene conversion during the specialized recombination events responsible for mating-type switching in yeast. However, gene conversion is also commonly observed during normal homologous recombination events, such as those responsible for genetic exchange in bacteria and for pairing chromosomes during meiosis. To illustrate gene conversion during meiotic recombination, consider a cell undergoing meiosis that has the A allele on one homolog and the a allele on the other. After DNA replication, four copies of this gene are present and the genotype would be: A A a a. In the absence of gene conversion, two gametes carrying the A allele and two gametes carrying the a allele would be generated. If instead the gametes with genotypes A, a, a, a (or A, A, A, a) are formed, then a gene conversion event has occurred, in which one copy of the A gene has been converted into a (or vice versa).
How might this arise? There are two ways that gene conversion can occur during the DSB-repair pathway. First, consider what would happen if the A gene was very close to the site of the double-strand break. In this case, when the 3' ssDNA tails invade the homologous duplexes and are elongated, they may copy the a information, which could replace the A information in the product chromosome upon completion of recombination (see Figure 10-3d).
The second mechanism of gene conversion involves the repair of base-pair mismatches that occur in the recombination intermediates. For example, if either strand invasion or branch migration includes the A/a gene, a segment of heteroduplex DNA carrying the A sequence on one strand and the a sequence on the other strand would be formed (Figure 10-23; see also Figure 10-1d inset). This region of DNA carrying base-pair mismatches could be recognized and acted upon by the cellular mismatch repair enzymes (which we discussed in Chapter 9). These enzymes are specialized for fixing base-pair mismatches in DNA. When they detect a mismatched base pair, these enzymes excise a short stretch of DNA from one of the two strands. A repair DNA polymerase then fills in the gap, now with the properly base-paired sequence. When working on recombination intermediates, the mismatch repair enzymes will choose randomly which strand to repair. Therefore, after their action, both strands will carry the sequence encoding either the A information or the a information (depending on which strand was "fixed" by the repair enzymes), and gene conversion will be observed.
FIGURE 10-22 Comparison of the genetic and physical maps of a typical region of a yeast chromosome. Markers show the location of various genes. Notice in the region between Spo7 and Cdc15 that the genetic map is contracted due to a low frequency of crossing over. In contrast, in the region between Cdc15 and FLO1 the genetic map is expanded due to a high frequency of crossing over. (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 1138, fig 20-14. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)
FIGURE 10-23 Mismatch repair of heteroduplex DNA within recombination intermediates can give rise to gene conversion.
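The two ideas in the gene conversion discussion above, random strand choice during mismatch repair and the resulting departure from a 2:2 ratio among the four gametes, can be sketched as follows. The function names are hypothetical, alleles are reduced to single labels, and the random choice stands in for the repair enzymes picking a strand to correct.

```python
import random
from collections import Counter


def repair_heteroduplex(strand_1, strand_2, rng=random):
    """Resolve a heteroduplex by keeping one strand's allele, chosen at random.

    Returns the alleles on both strands after repair; they are always equal.
    """
    kept = rng.choice([strand_1, strand_2])
    return kept, kept


def classify_tetrad(gametes):
    """Classify the four meiotic products by their allele ratio."""
    counts = sorted(Counter(gametes).values())
    if counts == [2, 2]:
        return "2:2, no gene conversion"
    if counts == [1, 3]:
        return "3:1, gene conversion"
    return "other"


if __name__ == "__main__":
    print(classify_tetrad(["A", "A", "a", "a"]))
    print(classify_tetrad(["A", "a", "a", "a"]))
    print(repair_heteroduplex("A", "a"))
```

If repair of an A/a heteroduplex happens to keep the allele of the invading strand, one chromatid that started as A ends up as a (or the reverse), which is what turns an expected 2:2 tetrad into a 3:1 tetrad.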
SUMMARY
Homologous recombination occurs in all organisms, allowing for genetic exchange, the reassortment of genes along chromosomes, and the repair of broken DNA strands and collapsed replication forks. The recombination process involves the breaking and rejoining of DNA molecules. The double-strand break repair pathway of homologous recombination well describes many recombination events. By this model, initiation of exchange requires that one of the two homologous DNA molecules have a double-stranded break. The broken DNA ends are processed by DNA-degrading enzymes to generate single-stranded DNA segments. These single-stranded regions participate in DNA pairing with the homologous partner DNA. Once pairing occurs, the two DNA molecules are joined by a branched structure in the DNA called a Holliday junction. Cutting the DNA at the Holliday junction resolves the junction and terminates recombination. Holliday junctions can be cut in two alternative ways. One way generates crossover products, in which regions from two parental DNA molecules are now covalently joined. The alternative way of cleaving the junction generates a "patch" of recombined DNA but does not result in crossing over.
Cells encode enzymes that catalyze all the steps in homologous recombination. Key enzymes are the strand-exchange proteins. Of these, E. coli RecA is the premier example; RecA-like proteins are found in all organisms. RecA-like strand-exchange proteins promote the search for homologous sequences between two DNA molecules and the exchange of DNA strands within the recombination intermediate. RecA functions as a large protein-DNA complex, known as the RecA filament. Eukaryotic cells encode two strand-exchange proteins, called Rad51 and Dmc1. Other important recombination enzymes are the DNA-cleaving enzymes that generate double-stranded breaks in DNA to initiate recombination; these proteins appear to be found only in eukaryotes and include Spo11 and HO. Nucleases that process the DNA at the break site to generate the required single-stranded regions include the RecBCD enzyme in prokaryotes and the MRX enzyme complex in eukaryotes. Additional enzymes promote the movement (branch migration) and cleavage (resolution) of Holliday junctions.
During meiosis, recombination is essential for the proper homologous pairing of chromosomes prior to the first nuclear division. Therefore, recombination is highly regulated to ensure it occurs on all chromosomes. The Spo11 DNA-cutting enzyme and the Dmc1 strand-exchange protein are both specifically involved in these recombination reactions.
Homologous recombination is also sometimes used to control gene expression. The mating-type switching of yeast is an excellent example of this type of regulation; it is also an example of gene conversion. Analysis of the mechanism of mating-type switching has led to a new class of models to describe some homologous recombination events, called synthesis-dependent strand annealing.
CHAPTER 11

Site-Specific Recombination and Transposition of DNA

DNA is a very stable molecule. DNA replication, repair, and homologous recombination, as we have learned in the previous chapters, all occur with high fidelity. These processes serve to ensure that the genomes of an organism are nearly identical from one generation to the next. Importantly, however, there are also genetic processes that rearrange DNA sequences and thus lead to a more dynamic genome structure. These processes are the subject of this chapter.

Two classes of genetic recombination, conservative site-specific recombination (CSSR) and transpositional recombination (generally called transposition), are responsible for many important DNA rearrangements. CSSR is recombination between two defined sequence elements (Figure 11-1). Transposition, in contrast, is recombination between specific sequences and nonspecific DNA sites. The biological processes promoted by these recombination reactions include the insertion of viral genomes into the DNA of the host cell during infection, the inversion of DNA segments to alter gene structure, and the movement of transposable elements (often called "jumping" genes) from one chromosomal site to another.
OUTLINE
• Conservative Site-Specific Recombination
• Biological Roles of Site-Specific Recombination
• Transposition
• Examples of Transposable Elements and Their Regulation
• V(D)J Recombination

FIGURE 11-1 Two classes of genetic recombination. The top panel shows an example of site-specific recombination. Here recombination between the red and blue recombination sites inverts the DNA segment carrying the A and B genes. The bottom panel shows an example of transposition in which the red transposable element excises from the gray DNA and inserts into an unrelated site in the blue DNA.
The impact of these DNA rearrangements on chromosome structure and function is profound. In many organisms, transposition is the major source of spontaneous mutation, and nearly half the human genome consists of sequences derived from transposable elements. Furthermore, as we will see, both viral infection and development of the vertebrate immune system depend critically on these specialized DNA rearrangements.

Conservative site-specific recombination and transposition share key mechanistic features. Proteins known as recombinases recognize specific sequences where recombination will occur within a DNA molecule. The recombinases bring these specific sites together to form a protein-DNA complex bridging the DNA sites, known as the synaptic complex. Within the synaptic complex, the recombinase catalyzes the cleavage and rejoining of the DNA molecules either to invert a DNA segment or to move a segment to a new site. One recombinase protein is usually responsible for all these steps. Both types of recombination are also carefully controlled such that the danger to the cell of introducing breaks in the DNA, and rearranging DNA segments, is minimized. As we shall see, however, the two types of recombination also have key mechanistic differences.

In the following sections the simpler site-specific recombination reactions are introduced first, followed by the discussion of transposition. Each of these sections is organized to describe general features of the mechanism first and then to provide some specific examples.
CONSERVATIVE SITE-SPECIFIC RECOMBINATION

Site-Specific Recombination Occurs at Specific DNA Sequences in the Target DNA
Conservative site-specific recombination (CSSR) is responsible for many reactions in which a defined segment of DNA is rearranged. A key feature of these reactions is that the segment of DNA that will be moved carries specific short sequence elements, called recombination sites, where DNA exchange occurs. An example of this type of recombination is the integration of the phage λ genome into the bacterial chromosome (Figure 11-2 and Chapter 21).

During λ integration, recombination always occurs at exactly the same nucleotide sequence within two recombination sites, one on the phage DNA, and the other on the bacterial DNA. Recombination sites carry two classes of sequence elements: sequences specifically bound by the recombinases, and sequences where DNA cleavage and rejoining occur. Recombination sites are often quite short, 20 bp or so, although they may be much longer and carry additional sequences bound by proteins. Examples of the more complex recombination sites are discussed when we consider specific recombination reactions.

FIGURE 11-2 Integration of the λ genome into the chromosome of the host cell. DNA exchange occurs specifically between the recombination sites on the two DNA molecules. The relative lengths of the λ and cellular chromosomes are not shown to scale.

CSSR can generate three different types of DNA rearrangements (Figure 11-3): (1) insertion of a segment of DNA into a specific site (as occurs during phage λ DNA integration); (2) deletion of a DNA segment; or (3) inversion of a DNA segment. Whether recombination results in DNA insertion, deletion, or inversion depends on the organization of the recombination recognition sites on the DNA molecule or molecules that participate in recombination.

To understand how the organization of recombination sites determines the type of DNA rearrangement, we must look at the sequence elements within the recombination sites in more detail (Figure 11-4). Each recombination site is organized as a pair of recombinase recognition sequences, positioned symmetrically. These recognition sequences flank a central short asymmetric sequence, known as the crossover region, where DNA cleavage and rejoining occurs. Because the crossover region is asymmetric, a given recombination site always has a defined polarity. The orientation of two sites present on a single DNA molecule will be related to each other either in an inverted repeat or a direct repeat manner. Recombination between a pair of inverted sites will invert the DNA segment between the two sites (Figure 11-3, right panel). In contrast, recombination using the identical mechanism but occurring between sites organized
FIGURE 11-3 Three types of CSSR recombination. Panels: insertion, deletion, inversion.
FIGURE 11-4 Structures involved in CSSR. The pair of symmetric recombinase recognition sequences flank the crossover region where recombination occurs. The subunits of the recombinase bind these recognition sites. Notice that the sequence of the crossover region is not palindromic, resulting in an intrinsic asymmetry to the recombination sites. Panel labels: recognition sequences; recombinase binds recognition sequences; recombination.
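The dependence of the outcome on site orientation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real software: the token notation (">" and "<" for the two polarities of a recombination site, "A+" or "A-" for a gene and its orientation) and the function names `flip` and `recombine` are invented for the example. The inverted-repeat case follows the rule given in the text; the direct-repeat case (loss of the intervening segment as a circle that carries one site) is the standard outcome for two sites in the same orientation on one molecule.

```python
def flip(token):
    """Reverse the orientation of a gene token, e.g. 'A+' -> 'A-'."""
    name, strand = token[:-1], token[-1]
    return name + ("-" if strand == "+" else "+")


def recombine(dna):
    """Recombine the two sites found on one linear DNA molecule.

    dna is a list of tokens: '>' and '<' are recombination sites of
    opposite polarity (hypothetical notation); every other token is a
    gene such as 'A+'.
    Returns (kind, product, excised_circle_or_None).
    """
    sites = [i for i, tok in enumerate(dna) if tok in (">", "<")]
    if len(sites) != 2:
        raise ValueError("expected exactly two recombination sites")
    i, j = sites
    left, middle, right = dna[:i], dna[i + 1:j], dna[j + 1:]

    if dna[i] == dna[j]:
        # Direct repeats: the segment between the sites is deleted.
        # One site stays on the chromosome, one leaves with the circle.
        return "deletion", left + [dna[i]] + right, [dna[i]] + middle

    # Inverted repeats: the segment between the sites is turned around,
    # so gene order is reversed and each gene changes strand.
    inverted = [flip(tok) for tok in reversed(middle)]
    return "inversion", left + [dna[i]] + inverted + [dna[j]] + right, None
```

For example, `recombine(["X+", ">", "A+", "B+", "<", "Y+"])` reports an inversion with product `["X+", ">", "B-", "A-", "<", "Y+"]`, whereas `recombine(["X+", ">", "A+", ">", "Y+"])` reports a deletion that leaves `["X+", ">", "Y+"]` on the chromosome.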
Site-Specific Recombinases Cleave and Rejoin DNA Using a Covalent Protein-DNA Intermediate

There are two families of conservative site-specific recombinases: the serine recombinases and the tyrosine recombinases. Fundamental to the mechanism used by both families is that when they cleave the DNA, a covalent protein-DNA intermediate is generated. For the serine recombinases, the side chain of a serine residue within the protein's active site attacks a specific phosphodiester bond in the recombination site (Figure 11-5). This reaction introduces a single-stranded break in the DNA and simultaneously generates a covalent linkage between the serine and a phosphate at this DNA cleavage site. Likewise, for the tyrosine recombinases, it is the side chain of the active-site tyrosine that attacks and then becomes joined to the DNA. Table 11-1 classifies a number of important recombinases by family and biological function.

The covalent protein-DNA intermediate conserves the energy of the cleaved phosphodiester bond within the protein-DNA linkage. As a result, the DNA strands can be rejoined by reversal of the cleavage process. For reversal, an OH group from the cleaved DNA attacks the covalent bond that links the protein to the DNA. This process covalently seals the DNA break and regenerates the free (non-DNA bound) recombinase (see Figure 11-5). It is this mechanistic feature that contributes the "conservative" to the CSSR name: it is called "conservative" because every DNA bond that is broken during the reaction is resealed by the recombinase. No external energy, such as that released by ATP hydrolysis, is needed for DNA cleavage and joining by these proteins.

This cleavage mechanism, with its covalent intermediate, is not unique to the recombinases. Both DNA topoisomerases (Chapter 6) and Spo11, the protein that introduces double-stranded breaks into DNA to initiate homologous recombination during meiosis (Chapter 10), use this mechanism.

FIGURE 11-5 Covalent-intermediate mechanism used by the serine and tyrosine recombinases. Here an OH group from an active-site serine is shown to attack the phosphate and thereby introduce a single-stranded break at the site of recombination. The liberated OH group on the broken DNA can then reattack the protein-DNA covalent bond to reverse this cleavage reaction, reseal the DNA, and release the protein. The recombinase, labeled Rec, is shown in blue.
TABLE 11-1 Recombinases by Family and by Function

Recombinase: Function

Serine Family
Salmonella Hin invertase: Inverts a chromosomal region to flip a gene promoter by recognizing hix sites. Allows expression of two distinct surface antigens.
Transposon Tn3 and γδ resolvases: Promotes a DNA deletion reaction to resolve the DNA fusion event that results from replicative transposition. Recombination sites are called res sites.

Tyrosine Family
Phage λ integrase: Promotes DNA integration and excision of the phage λ genome into, and out of, a specific sequence on the E. coli chromosome. Recombination sites are called att sites.
Phage P1 Cre: Promotes circularization of the phage DNA during infection by recognizing sites (called lox sites) on the phage DNA.
E. coli XerC and XerD: Promotes several DNA deletion reactions that convert dimeric circular DNA molecules into monomers. Recognizes both plasmid-borne sites (cer) and chromosomal sites (dif).
Yeast FLP: Inverts a region of the yeast 2μ plasmid to allow for a DNA amplification reaction called rolling circle replication. Recombination sites are called frt sites.
Serine Recombinases Introduce Double-Stranded Breaks in DNA and then Swap Strands to Promote Recombination

CSSR always occurs between two recombination sites. As we saw above, these sites may be on the same DNA molecule (for inversion or deletion) or on two different molecules (for integration). Each recombination site is made up of double-stranded DNA. Therefore, during recombination, four single strands of DNA (two from each duplex) must be cleaved and then rejoined, now with a different partner strand, to generate the rearranged DNA.

The serine recombinases cleave all four strands prior to strand exchange (Figure 11-6). One molecule of the recombinase protein promotes each of these cleavage reactions; therefore a minimum of four subunits (that is, a tetramer) of the recombinase is required.

FIGURE 11-6 Recombination by a serine recombinase. Each of the four DNA strands is cleaved within the crossover region by one subunit of the protein. These subunits are labeled R1, R2, R3, and R4. Cleavage of the two individual strands of one duplex is staggered by two bases. This two-base region forms a hybrid duplex in the recombinant products. The recombination sites are similar to those shown in Figure 11-4.

These double-stranded DNA breaks in the parental DNA molecules generate four double-stranded DNA segments (marked by the proteins bound to them as R1, R2, R3, and R4 in Figure 11-6). For recombination to occur, the R2 segment of the top DNA molecule must recombine with the R3 segment of the bottom DNA molecule. Likewise, the R1 segment of the top molecule must recombine with the R4 segment of the bottom DNA molecule. Once this DNA "swap" has occurred, the 3'OH ends of each of the cleaved DNA strands can attack the recombinase-DNA bond in their new partner segment. As discussed above, this reaction liberates the recombinase and covalently seals the DNA strands to generate the rearranged DNA product.
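The "cleave everything, then swap" order of events can be made concrete with a small Python sketch. It is only a bookkeeping model under stated assumptions: each recombination site is reduced to a left and a right duplex segment, the class `Duplex` and the function `serine_recombine` are hypothetical names, and the assignment of R1 and R3 to the left segments and R2 and R4 to the right segments is an assumption chosen to match the pairing rule given above (R1 with R4, R2 with R3). The two-base stagger of the cuts is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duplex:
    """A recombination site split at its crossover region."""
    left: str   # duplex segment on the left of the cleavage site
    right: str  # duplex segment on the right of the cleavage site


def serine_recombine(top, bottom):
    """Model the serine-recombinase pathway: cleave all, swap, rejoin."""
    log = []

    # Step 1: four subunits cleave four strands, giving a double-strand
    # break in each duplex and four protein-bound duplex segments.
    r1, r2 = top.left, top.right
    r3, r4 = bottom.left, bottom.right
    log.append("cleaved 4 strands; 4 covalent protein-DNA links formed")

    # Step 2: the segments swap partners (R1 with R4, R2 with R3).
    new_top, new_bottom = Duplex(r1, r4), Duplex(r3, r2)
    log.append("swapped duplex segments")

    # Step 3: each 3'OH attacks the protein-DNA bond on its new partner,
    # resealing all four strands and releasing the recombinase.
    log.append("rejoined 4 strands; 4 recombinase subunits released")
    return new_top, new_bottom, log
```

Calling `serine_recombine(Duplex("R1", "R2"), Duplex("R3", "R4"))` yields the products `Duplex("R1", "R4")` and `Duplex("R3", "R2")`; the log records that as many bonds are resealed as were broken, which is the sense in which the reaction is conservative.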
Tyrosine Recombinases Break and Rejoin One Pair of DNA Strands at a Time

In contrast to the serine recombinases, the tyrosine recombinases cleave and rejoin two DNA strands first, and only then cleave and rejoin the other two strands (Figure 11-7). Consider two DNA molecules with their recombination sites aligned. Here also, four molecules of the recombinase are needed, one to cleave each of the four individual DNA strands. To start recombination, the subunits of recombinase bound to the left recombinase binding sites (marked as R1 and R3 in Figure 11-7a) each cleave the top strand of the DNA molecule to which they are bound. This cleavage occurs at the first nucleotide of the crossover region. Next the right top strand from the top (gray) DNA molecule and the right top strand from the bottom (red) DNA molecule "swap" partners. These two DNA strands are then joined, now in the recombined configurations. This "first strand" exchange reaction generates a branched DNA intermediate known as a Holliday junction (see Chapter 10) (Figure 11-7b).

FIGURE 11-7 Recombination by a tyrosine recombinase. Here the R1 and R3 subunits cleave the DNA in the first step (a); in the example shown, the protein becomes linked to the cut DNA by a 3' P-tyrosine bond. Exchange of the first pair of strands occurs when the two 5' OH groups at the break sites each attack the protein-DNA bond ... Panel labels: Holliday junction; cleavage of "bottom" strands; bottom strand exchange to finish recombination.

Once the first strand exchange is complete, two more recombinase subunits (those marked R2 and R4) cleave the bottom strands of each DNA molecule (Figure 11-7c). These strands again switch partners, and then are joined by the reversal of the cleavage reaction. This "second strand" exchange reaction "undoes" the Holliday junction, to yield the rearranged DNA products. In the next section we discuss how these chemical steps occur in the context of the recombinase protein-DNA complex.
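The sequential nature of the tyrosine-recombinase pathway can be sketched in Python as two separate exchange steps with an explicit intermediate. This is an illustrative model only: the dictionary layout and the function name `tyrosine_recombine` are assumptions, each strand is reduced to a (left half, right half) pair, and both strands of a duplex are split at the same position even though the real cleavage sites are offset across the crossover region.

```python
def tyrosine_recombine(top, bottom):
    """Exchange strands one pair at a time.

    Each molecule is a dict with a 'top' and a 'bottom' strand; each
    strand is a (left_half, right_half) tuple split at its cleavage site.
    Returns (holliday_intermediate, final_products).
    """
    # First strand exchange: the top strand of each duplex is cleaved
    # and the right halves of the two top strands swap partners.
    inter_top = {
        "top": (top["top"][0], bottom["top"][1]),
        "bottom": top["bottom"],          # still parental
    }
    inter_bottom = {
        "top": (bottom["top"][0], top["top"][1]),
        "bottom": bottom["bottom"],       # still parental
    }
    holliday = (inter_top, inter_bottom)

    # Second strand exchange: the bottom strands are cleaved and swap
    # partners in the same way, which resolves the junction.
    final_top = {
        "top": inter_top["top"],
        "bottom": (top["bottom"][0], bottom["bottom"][1]),
    }
    final_bottom = {
        "top": inter_bottom["top"],
        "bottom": (bottom["bottom"][0], top["bottom"][1]),
    }
    return holliday, (final_top, final_bottom)
```

With a gray molecule `{"top": ("g", "g"), "bottom": ("g", "g")}` and a red molecule `{"top": ("r", "r"), "bottom": ("r", "r")}`, the intermediate has recombinant top strands but parental bottom strands, which is the defining feature of the Holliday junction stage; only after the second step are both strands of each product recombinant.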
Structures of Tyrosine Recombinases Bound to DNA Reveal the Mechanism of DNA Exchange

The mechanism of site-specific recombination is best understood for the tyrosine recombinases. Several structures of members of this protein class have been solved, and these structures reveal the recombinases "caught in the act" of recombination. One beautiful example is the structure of the Cre recombinase bound to two different configurations of the recombining DNA. Insights into the mechanisms derived from these structures are explained below.

Cre is an enzyme encoded by phage P1, which functions to circularize the linear phage genome during infection. The recombination sites on the DNA, where Cre acts, are called lox sites. Cre-lox is a simple example of recombination by the tyrosine recombinase family; only Cre protein and the lox sites are needed for complete recombination. Cre is also widely used as a tool in genetic engineering (see Box 11-1, Application of Site-Specific Recombination to Genetic Engineering).

The Cre-lox structures reveal that recombination requires four subunits of Cre, with each molecule bound to one binding site on the substrate DNA molecules (Figure 11-8). The conformation of the DNA is generally a square planar four-way junction (see the discussion of Holliday junctions in Chapter 10) with each "arm" of this junction bound by one subunit of Cre. Although at first glance the structures appear to have fourfold symmetry, this is not really the case. Cre exists in two distinct conformations, with one pair of subunits in conformation 1, shown in green, and the other pair in conformation 2, shown in purple (Figure 11-8b). Only in one of these conformations (the green subunits in the figure) can Cre cleave and rejoin DNA. Thus, only one pair of subunits is in the active conformation at a time. The pair of subunits in this active conformation switches as the reaction progresses. This switching is critical for controlling the progress of recombination and ensuring the sequential "one strand at a time" exchange mechanism.

FIGURE 11-8 Mechanism of site-specific recombination by the Cre recombinase. (a) The left panel ... only the two subunits colored in green are in the active conformation. Note that after first strand cleavage, the colors of the subunits switch as the second pair of Cre subunits become active for recombination. (Source: From Feng Guo et al. 1997. Structure of Cre recombinase complexed with DNA. Nature 389: 41. Copyright © 1997.) (b) The right panel shows the crystal structure of Cre bound to the Holliday junction intermediate (corresponding to the third ... in purple. The complex, therefore, does not have fourfold symmetry; notice, for example, that two of the pairs of adjacent DNA "arms" in the structure are much closer together than are the other pairs. (Gopaul D.N., Guo F., and Van Duyne G.D. 1998. EMBO J. 17: 4175.) Image prepared with BobScript, MolScript, and Raster3D.
Box 11-1 Application of Site-Specific Recombination to Genetic Engineering

Because some site-specific recombination systems are so simple, they have become widely used as tools in experimental genetics. Cre recombinase, and its close relative FLP recombinase, are both used experimentally to delete genes in eukaryotic organisms (also see example in Chapter 21). The usefulness of this strategy becomes clear when we consider the following hypothetical example. A researcher is interested in the role of a specific gene in the development of lung cancer and she wishes to study this process using the mouse as a model organism ... allowed to develop in the absence of the recombinase, but then after birth, Cre expression can be turned on. The presence of the recombinase causes deletion of the gene of interest. In this case, the propensity of the Cre-treated mice (in which the gene is deleted) for lung cancer can now be compared with their "normal" litter mates, in which the gene of interest is still intact. Thus, recombination using Cre allows the potential functions of the genes to be uncovered in different stages of development.
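The conditional deletion described in the box can be sketched as a string operation in Python. The sketch assumes, as is standard for this approach, that the gene of interest has been flanked by two lox sites in the same orientation. The 34-bp sequence used for the site (two 13-bp inverted repeats around an 8-bp spacer) is the commonly cited loxP sequence and is not taken from the text; the function name `cre_delete` and the toy "gene" are hypothetical.

```python
# Assumed: the commonly cited loxP site, written as arm + spacer + arm.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_delete(genome, cre_expressed):
    """Return the genome after Cre action on two directly repeated sites.

    Without Cre the genome is returned unchanged. With Cre, the DNA
    between the first two loxP sites is removed and a single loxP site
    remains at the junction.
    """
    if not cre_expressed:
        return genome
    first = genome.find(LOXP)
    if first == -1:
        return genome
    second = genome.find(LOXP, first + len(LOXP))
    if second == -1:
        return genome  # a single site has no partner to recombine with
    # Keep everything before the first site, then resume at the second
    # site, so exactly one copy of the site is retained.
    return genome[:first] + genome[second:]
```

For a toy construct such as `"AAAA" + LOXP + "ccggttaaccgg" + LOXP + "TTTT"`, the call with `cre_expressed=False` leaves the gene intact (the "normal" litter mates), while `cre_expressed=True` returns `"AAAA" + LOXP + "TTTT"`.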
BIOLOGICAL ROLES OF SITE-SPECIFIC RECOMBINATION

Cells and viruses use conservative site-specific recombination for a wide variety of biological functions. Some of these functions are discussed in the following sections. Many phage insert their DNA into the host chromosome during infection using this recombination mechanism. In other cases, site-specific recombination is used to alter gene expression. For example, inversion of a DNA segment can allow two alternative genes to be expressed. Site-specific recombination is also widely used to help maintain the structural integrity of circular DNA molecules during cycles of DNA replication, homologous recombination, and cell division.

A comparison of site-specific recombination systems reveals some general themes. All reactions depend critically on the assembly of the recombinase protein on the DNA, and the bringing together of the two recombination sites. For some recombination reactions this assembly is very simple, requiring only the recombinase and its DNA recognition sequences as just described for Cre. In contrast, other reactions require accessory proteins. These accessory proteins include so-called architectural proteins that bind specific DNA sequences and bend the DNA. They organize DNA into a specific shape and thereby stimulate the recombination. Architectural proteins can also control the direction of a recombination reaction, for example, to ensure that integration of a DNA segment occurs while preventing the reverse reaction, DNA excision. Clearly, this type of regulation is essential for a logical biological outcome. Finally, we will also see that recombinases can be regulated by other proteins to control when a particular DNA rearrangement takes place and coordinate it with other cellular events.
λ Integrase Promotes the Integration and Excision of a Viral Genome into the Host Cell Chromosome

When bacteriophage λ infects a host bacterium, a series of regulatory events result either in establishment of the quiescent lysogenic state or in phage multiplication, a process called lytic growth (see Chapters 16 and 21). Establishment of a lysogen requires the integration of the phage DNA into the host chromosome. Likewise, when the phage leaves the lysogenic state to replicate and make new phage particles, it must excise its DNA from the host chromosome. The analysis of this integration/excision reaction provided the first molecular insights into site-specific recombination.

To integrate, the λ integrase protein (λInt) catalyzes recombination between two specific sites, known as the att, or attachment, sites. The attP site is on the phage DNA (P for phage) and the attB site is in the bacterial chromosome (B for bacteria; see Figure 11-2). λInt is a tyrosine recombinase, and the mechanism of strand exchange follows the pathway described above for the Cre protein. Unlike Cre recombination, however, λ integration requires accessory proteins to help the required protein-DNA complex to assemble. These proteins control the reaction to ensure that DNA integration and DNA excision occur at the right time in the phage life cycle. We will first consider the integration pathway and then look at how excision is triggered.

Important to the regulation of λ integration is the highly asymmetric organization of the attP and attB sites (Figure 11-9). Both sites carry a central core segment (approximately 30 bp). These core recombination sites each consist of two λInt binding sites and a crossover region where strand exchange occurs (as described above). Whereas attB consists only of this central core region, attP is much longer (240 bp) and carries numerous additional protein binding sites.

FIGURE 11-9 Recombination sites involved in λ integration and excision showing the important sequence elements. C, C', B, and B' are the core λInt binding sites. The additional protein binding sites are on attP and flank the C and C' sites. These regions are called the "arms"; the sequences on the left are called the P arm and those on the right are called the P' arm. The small purple boxes labeled P1, P2, and P'1 are the arm λInt binding sites. Sites marked H are the IHF binding sites, and sites marked X are the sites which bind Xis. F is the site bound by Fis, another architectural protein, not discussed further here. The gray regions are the crossover regions. For clarity, λInt is not shown bound to the core sites. Note that not all protein binding sites are filled during either integrative or excisive recombination. After recombination, the P arm is part of attL whereas the P' arm becomes part of attR.

Flanking each side of the core region of attP are DNA regions known as the "arms." These arms carry a variety of protein binding sites, including additional sites bound by λInt (labeled as P1, P2, and P'1 in Figure 11-9). λInt is an unusual protein because it has two domains involved in sequence-specific DNA binding: one domain binds to the arm recombinase recognition sites and the other binds to the core recognition sites. In addition, the arms of attP carry sites bound by several architectural proteins. Binding of these proteins governs the directionality and efficiency of recombination.

Integration requires attB, attP, λInt, and an architectural protein called integration host factor (IHF). IHF is a sequence-dependent DNA-binding protein that introduces large bends (>160°) ... (where λInt binds strongly) with the sites present at the central core (where it binds only weakly) but where it must bind to catalyze recombination.

FIGURE 11-10 Model for IHF bending DNA to bring DNA-binding sites together. The λInt and IHF binding sites from the P' arm of attP are shown. IHF binding to the H' site bends the DNA to allow one molecule of λInt to bind both the P'1 and C' sites. The break in the DNA within the H' site reflects a nick that was present in the DNA used for structural analysis of the IHF-DNA complex. (Source: From Rice P. et al. 1996. Crystal structure of an IHF-DNA complex. Cell 87: 1305. Copyright © 1996, with permission from Elsevier.)

When recombination is complete, the circular phage genome is stably integrated into the host chromosome. As a result, two new, hybrid sites are generated at the junctions between the phage and the host DNA. These sites are called attL (left) and attR (right) (see Figure 11-9). Both of these sites contain the core region, but the two arm regions are now separated from one another (see the location of the P and P' regions in Figure 11-9). Thus, neither of the two core regions in this new arrangement is competent to assemble an active λInt recombinase complex via the mechanism that was used to generate the complex for integration; the DNA sites important for assembly are simply not in the right place.
Phage λ Excision Requires a New DNA-Bending Protein

How does λ excise? An additional architectural protein, this one phage-encoded, is essential for excisive recombination. This protein, called Xis (for excise), binds to specific DNA sequences and introduces bends in the DNA. In this manner, Xis is similar in function to IHF. Xis recognizes two sequence motifs present in one arm of attR (and also present in attP, marked X1 and X2 in Figure 11-9). Binding these sites introduces a large bend (>140°), and together, Xis, λInt, and IHF stimulate excision by assembling an active protein-DNA complex at attR. This complex then interacts productively with proteins assembled at attL, and recombination occurs.

In addition to stimulating excision (recombination between attL and attR), DNA binding by Xis also inhibits integration (recombination between attP and attB). The DNA structure created upon Xis binding to attP is incompatible with proper assembly of λInt and IHF at this site. Xis is a phage-encoded protein and is only made when the phage is triggered to enter lytic growth. Xis expression is described in detail in Chapter 16. Its dual action as a stimulatory cofactor for excision and an inhibitor of integration ensures that ...
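The directionality rules for λ recombination described above can be summarized as a small decision function. This Python sketch is a deliberately coarse, all-or-none simplification (real control is quantitative and depends on protein levels and DNA supercoiling); the function name `lambda_reaction` and the returned dictionary layout are assumptions made for illustration.

```python
def lambda_reaction(int_present, ihf_present, xis_present):
    """Return which lambda recombination reaction is favored.

    Integration (attP x attB -> attL + attR) needs lambda Int and IHF
    and is inhibited by Xis. Excision (attL x attR -> attP + attB)
    needs lambda Int, IHF, and Xis.
    """
    if not (int_present and ihf_present):
        return {"reaction": "none", "substrates": (), "products": ()}
    if xis_present:
        return {
            "reaction": "excision",
            "substrates": ("attL", "attR"),
            "products": ("attP", "attB"),
        }
    return {
        "reaction": "integration",
        "substrates": ("attP", "attB"),
        "products": ("attL", "attR"),
    }
```

In this model Xis acts as the switch: with λInt and IHF present, its absence selects integration and its presence selects excision, so the two opposing reactions are never favored at the same time.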
The Hin Recombinase Inverts a Segment of DNA Allowing Expression of Altemative Genes The SalmoneJlo Hin recombinase inverts a scgmenl of the baderial chromosome 10 alJow expression of two altemative sets of genes_ Hin recombinalion is an example oC a class of recombin
FIGURE 11-11 Micrograph of bacteria (Salmonella) showing flagella. The color-enhanced scanning electron micrograph shows Salmonella typhimurium (red) invading cultured human cells. The hair-like protrusions on the bacteria are the flagella. (Source: Courtesy of the Rocky Mountain Laboratories, NIAID, NIH.)
Site-Specific Recombination and Transposition of DNA
FIGURE 11-12 DNA inversion by the Hin recombinase of Salmonella. Inversion of the DNA segment between the hix sites (hixL and hixR) flips a promoter (P) to give two alternative patterns of flagellin gene expression. The invertible segment carries hin and P; the genes fljB and fljA lie downstream of it. In the OFF orientation, neither H2 nor the H1 repressor is synthesized, and the H1-type flagella are present.
Hin Recombination Requires a DNA Enhancer
Hin recombination requires a sequence in addition to the hix sites. This short (~60 bp) sequence is an enhancer that stimulates the rate of recombination ~1,000-fold. Like enhancer sequences that stimulate transcription (see Chapter 17), this sequence can function even when located quite a distance from the recombination sites. Enhancer function requires the bacterial Fis protein (named because it was discovered as a factor for inversion stimulation). Like IHF, Fis is a site-specific DNA-bending protein. In addition, it makes protein-protein contacts with Hin that are important for recombination. The enhancer-Fis complex activates the catalytic steps of recombination. Hin can actually
Figure caption (partially recovered): Hin alone recognizes and pairs the two hix sites. When Fis protein is also present, the three-segment complex can form. This complex, called the invertasome, is the complex for promoting recombination. Panel labels: Hin protein, hix sites, enhancer, Hin-hix synaptic complex, Hin invertasome. (Source: From Craig N. et al. 2002. Mobile DNA II, p. 246, f 9. © 2002 ASM Press.)
supercoiling (see Chapter 6), which stabilizes the association of the distant DNA sites. Another bacterial architectural protein, HU, also facilitates assembly of this invertasome complex. HU is a close structural homologue of IHF, yet in contrast to IHF, it binds DNA in a sequence-independent manner.
What is the biological rationale for control of Hin inversion by the Fis-enhancer complex? The principal function is to ensure that recombination occurs only between hix sites that are present on the same DNA molecule. This selectivity ensures that the invertible segment is flipped frequently while intermolecular DNA rearrangements that could disrupt the integrity of the bacterial chromosome are avoided. In contrast to integration and excision of phage λ, Hin-catalyzed inversion is not highly regulated. Rather, inversion occurs stochastically, such that within a population of cells there will always be some cells that carry the invertible segment in each orientation.
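The outcome of an inversion reaction such as the one catalyzed by Hin can be sketched at the sequence level. The following is a minimal illustration, not a model of the enzyme: it assumes a single-strand (top-strand) representation of the chromosome, and the function names and the toy sequence are hypothetical. Inverting the segment between two sites replaces it with its reverse complement, and a second inversion restores the original orientation.

```python
# Toy model of DNA inversion between two recombination sites.
# Assumption: the chromosome is represented by its top strand only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(chromosome: str, start: int, end: int) -> str:
    """Replace chromosome[start:end] with its reverse complement."""
    if not 0 <= start <= end <= len(chromosome):
        raise ValueError("segment boundaries must lie within the chromosome")
    return (
        chromosome[:start]
        + reverse_complement(chromosome[start:end])
        + chromosome[end:]
    )


if __name__ == "__main__":
    original = "AAGGCTTT"  # hypothetical sequence; segment is positions 2..5
    flipped = invert_segment(original, 2, 5)
    print(flipped)
    print(invert_segment(flipped, 2, 5) == original)
```

Because a second inversion regenerates the starting sequence, a population in which inversion occurs stochastically will contain both orientations, as described above.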
Recombinases Convert Multimeric Circular DNA Molecules into Monomers
Site-specific recombination is critical to the maintenance of circular DNA molecules within cells. The chromosomes of most bacteria are circular, as are most plasmids in both prokaryotic and eukaryotic cells. Some viral genomes are also circular. An intrinsic problem with circular DNA molecules is that they sometimes form dimers and even higher multimeric forms during the process of homologous recombination. Site-specific recombination can be used to convert these DNA multimers back into monomers. Consider what happens when a DNA crossover occurs between two identical circular molecules. This process is shown occurring between two copies of a bacterial chromosome (Figure 11-14) (see Chapter 10 for a discussion of homologous recombination). A single homologous recombination event can generate a single large circular chromosome with two copies of all the genes, that is, a dimeric chromosome. At the time of cell division, this dimer poses a major problem, as there will be only one rather than two DNA molecules to be segregated into the two daughter cells. Because of this multimerization problem, many circular DNA molecules carry sequences recognized by site-specific recombinases. Proteins that function at these sequences are sometimes called resolvases, as they "resolve" dimers (and larger multimers) into monomers. Clearly, it is essential for their function that these proteins specifically catalyze resolution (a DNA deletion reaction) but not the reverse reaction (conversion of monomers to dimers), which would only make the multimerization problem worse! As we will see, specific mechanisms are in place to enforce this directional selectivity on the recombination reaction. The Xer recombinase catalyzes the monomerization of bacterial chromosomes and of many bacterial plasmids.
Xer is a member of the tyrosine recombinase family, and its mechanism for promoting recombination is very similar to that described above for the Cre protein. Xer is a heterotetramer, containing two subunits of a protein called XerC and two subunits of a protein called XerD. Both XerC and XerD are tyrosine recombinases, but they recognize different DNA sequences. Therefore, the recombination sites used by the Xer recombinase must carry recognition sequences for each of these proteins. The recombination sites in bacterial chromosomes, called dif sites,
FIGURE 11-14 Circular DNA molecules can form multimers. Homologous recombination between the two daughter DNA molecules during DNA replication generates a dimeric chromosome (or plasmid). Site-specific recombination by the XerCD recombinase is then needed to generate the monomeric DNA molecules needed for cell division.
have an XerC recognition sequence on one side and an XerD recognition sequence on the other side of the crossover region (Figure 11-15). There is one dif site on the chromosome. It is located within the region where DNA replication terminates (see Chapter 8). When the chromosome forms a dimer, this dimer will of course have two dif sites (see Figure 11-14). How do cells make sure that Xer-mediated recombination at dif sites will convert a chromosome dimer into monomers without ever promoting the reverse reaction? This directional regulation is achieved through the interaction between the Xer recombinase and a cell division protein called FtsK. This regulation is shown in Figures 11-15 and 11-16, and occurs as follows. When FtsK is unavailable for interaction with the XerCD complex at the dif site, the recombinase complex adopts a conformation in which only the two XerC subunits are active. As a result, XerC will promote exchange of one pair of DNA strands to form the Holliday junction intermediate (see the discussion on the general mechanism of tyrosine recombinase recombination, above). Because XerD is never activated, recombination is never completed. Instead, reversal of the XerC cleavage re
FIGURE 11-15 Pathways for Xer-mediated recombination at dif. In the absence of FtsK (FtsK-independent pathway shown in the left panel), only XerC is active to promote strand exchange to form a Holliday junction intermediate. In this case (because XerD is not active), recombination is not completed and the XerC reaction is frequently reversed. In the presence of FtsK (FtsK-dependent pathway shown in the right panel), ... recombination event and generate chromosome monomers. (Source: Adapte
FIGURE 11-16 Regulation of chromosome segregation by FtsK. Just before cell division, the newly replicated origins, shown in green, move to the poles of the cell, whereas the replication terminus that includes dif, shown as a triangle, typically remains localized at the midcell. When the dif site is replicated, the two daughter dif sites can pair. If the replicated chromosome forms monomers (no crossovers), segregation will break the synaptic complex and the dif sites will move away from the midcell location before division. In contrast, if the chromosome forms a dimer (odd number of crossovers; right panel), the synaptic complex remains trapped at midcell and allows access to FtsK, which is localized to the cell division site. FtsK then activates XerD. XerD-mediated recombination, followed by XerC-mediated recombination, then allows resolution of the dimers into monomers for cell division. (Source: Barre F. et al. 2001. Proc. Natl. Acad. Sci. U.S.A. 98: 8189, f 5, p. 8194.)
FtsK is an ATPase that tracks along DNA. It functions as a "DNA-pumping protein" similar to the RuvB protein that promotes DNA branch migration during homologous recombination (discussed in Chapter 10). FtsK is also a membrane-bound protein that is localized in the cell at the site where cell division occurs. It functions to move DNA away from the center of the cell prior to division so that the cell can divide at this site (Figure 11-16). This localization of FtsK to the division site is key to how cells ensure that XerD is activated specifically when a dimeric chromosome is present. In this case, the chromosome will be "stuck" in the middle of the dividing cell as one half of the chromosome dimer is moved into each daughter cell. The two dif sites in this dimer, with bound XerCD proteins, therefore interact with FtsK. In this manner, site-specific recombination is regulated to occur at the right time and place with respect to the cell division cycle.
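The topology of dimer resolution can be illustrated with a small sketch. It assumes a deliberately simplified representation that is not part of the text: a circular chromosome is a list of marker names read around the circle, and recombination between the two directly repeated dif sites of a dimer is modeled as a deletion that splits one circle into two. The function name and marker names are hypothetical.

```python
# Toy model of resolving a chromosome dimer at its two dif sites.
# Assumption: a circular molecule is a list of markers read around the circle.


def resolve_dimer(dimer: list[str], site: str = "dif") -> tuple[list[str], list[str]]:
    """Split a circular dimer into two circles by recombination between its two sites."""
    positions = [i for i, marker in enumerate(dimer) if marker == site]
    if len(positions) != 2:
        raise ValueError("a dimer must carry exactly two copies of the site")
    first, second = positions
    # The segment between the two sites closes on itself to form one circle;
    # the remainder of the original circle forms the other.
    circle_a = dimer[first:second]
    circle_b = dimer[second:] + dimer[:first]
    return circle_a, circle_b


if __name__ == "__main__":
    dimer = ["ori", "geneA", "dif", "geneB", "ori", "geneA", "dif", "geneB"]
    monomer_1, monomer_2 = resolve_dimer(dimer)
    print(monomer_1)
    print(monomer_2)
```

Each product carries one origin and one dif site, which is the arrangement needed for segregation into the two daughter cells. The sketch says nothing about directionality; in the cell, that is imposed by FtsK as described above.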
There Are Other Mechanisms to Direct Recombination to Specific Segments of DNA
Although we have limited our discussion to conservative site-specific recombination, there are other recombination events that occur at specific sequences and serve similar biological functions. Some of these reactions, for example, mating type switching in yeast, occur by a targeted gene-conversion event, as we described in Chapter 10. The gene rearrangements responsible for assembly of gene segments encoding critical proteins for the vertebrate immune system, known as V(D)J recombination, also occur at specific sites. This reaction is mechanistically similar to transposition, however, and therefore is considered later in this chapter.
TRANSPOSITION
Some Genetic Elements Move to New Chromosomal Locations by Transposition
Transposition is a specific form of genetic recombination that moves certain genetic elements from one DNA site to another. These mobile genetic elements are called transposable elements or transposons. Movement occurs through recombination between the DNA sequences at the very ends of the transposable element and a sequence in the DNA of the host cell (Figure 11-17); movement can occur with or without duplication of the element, as we will see. In some cases the recombination reaction involves a transient RNA intermediate. When transposable elements move, they often show little sequence selectivity in their choice of insertion sites. As a result, transposons can insert within genes, often completely disrupting gene function. They can also insert within the regulatory sequences of a gene, where their presence may lead to changes in how that gene is expressed. It was these disruptions in gene function and expression that led to the discovery of transposable elements (see Box 11-3, Maize Elements and the Discovery of Transposons, later in this chapter). Perhaps not surprisingly, therefore, transposable elements are the most common source of new mutations in many organisms. In fact, these elements are an important cause of mutations leading to genetic disease in humans. The ability of transposable elements to insert so promiscuously
Transposition
FIGURE 11-17 Transposition of a mobile genetic element to a new site in the host DNA. Recombination, in some cases, involves excision of the transposon from the old DNA location (left). In other cases, one copy of the transposon stays at the old location and another copy is inserted into the new DNA site (right).
in DNA has also led to their modification and use as mutagens and DNA delivery vectors in experimental biology. Transposable elements are present in the genomes of all life-forms. The comparative analysis of genome sequences reveals two fascinating observations. First, transposon-related sequences can make up huge fractions of the genome of an organism. For example, more than 50% of both the human and maize genomes are composed of transposon-related DNA sequence. This is in sharp contrast to the small percentage (<2% in human) of the sequence that actually encodes cellular proteins. Second, the transposon content in different genomes is highly variable (Figure 11-18). For example, compared to humans or maize, the fly and yeast genomes are very "gene-rich" and "transposon-poor."
There are many different types of transposable elements. These elements can be divided into families that share common aspects of structure and recombination mechanism. In the following sections, we introduce the three major families of transposable elements and the recombination mechanism associated with each family. Some of the best-studied individual elements will then be described. In the description of individual elements, we focus on how transposition is regulated to balance the maintenance and propagation of these elements with their potential to disrupt or misregulate genes within the host organism.
The genetic recombination mechanisms responsible for transposition are also used for functions other than the movement of transposons. For example, many viruses use a recombination pathway nearly identical to transposition to integrate into the genome of the host cell during infection. These viral integration reactions will therefore be considered together with transposition. Likewise, some DNA rearrangements used by cells to alter gene expression patterns occur using a mechanism very similar to DNA transposition.
V(D)J recombination, a reaction required for development of a functional immune system in vertebrates, is a well-understood example. V(D)J recombination is discussed at the end of this chapter.
There Are Three Principal Classes of Transposable Elements
Transposons can be divided into the following three families on the basis of their overall organization and mechanism of transposition:
1. DNA transposons.
2. Viral-like retrotransposons. This class includes the retroviruses. These elements are also called LTR retrotransposons.
3. Poly-A retrotransposons. These elements are also called nonviral retrotransposons.
Figure 11-19 shows a schematic of the general genetic organization of each of these element families. DNA transposons remain as DNA throughout a cycle of recombination. They move using mechanisms that involve the cleavage and rejoining of DNA strands, and in this way they are similar to elements that move by conservative site-specific recombination. Both types of retrotransposons move to a new DNA location using a transient RNA intermediate.
DNA Transposons Carry a Transposase Gene, Flanked by Recombination Sites
DNA transposons carry both DNA sequences that function as recombination sites and genes encoding proteins that participate in recombination (Figure 11-19a). The recombination sites are at the two ends of the element and are organized as inverted repeat sequences. These terminal inverte
FIGURE 11-18 Transposons in genomes: occurrence and distribution. Repeated elements, mostly composed of transposons or transposon-related sequences (such as truncated elements), are shown in green. Cellular genes are shown in blue. (a) Maize. (b) Human. (c) Drosophila. (d) Budding yeast. (e) E. coli. (Source: From Brown T.A. 2002. Genomes, 2nd edition, p. 34, fig. 2.2 and references therein. Copyright © 2002.)
FIGURE 11-19 Genetic organization of the three classes of transposable elements. (a) DNA transposons. The element includes the terminal inverted repeat sequences (green arrows), which are the recombination sites, and a gene encoding transposase. The element is flanked by target site duplications in the flanking DNA. (b) Viral-like retrotransposons and retroviruses. The element includes two long terminal repeat (LTR) sequences that flank a region encoding two enzymes, integrase and reverse transcriptase (RT). (c) Poly-A retrotransposons. The element terminates in the 5' and 3' UTR sequences and encodes two enzymes, an RNA-binding enzyme (ORF1) and an enzyme having both reverse transcriptase and endonuclease activities (ORF2).
DNA transposons carry a gene encoding their own transposase. They may carry a few additional genes, sometimes encoding proteins that regulate transposition or provide a function useful to the element or its host cell. For example, many bacterial DNA transposons carry genes encoding proteins that promote resistance to one or more antibiotics. The presence of the transposon, therefore, causes the host cell to be resistant to that antibiotic. The DNA sequences immediately flanking the transposon have a short (2 to 20 bp) segment of duplicated sequence. These segments are organized as direct repeats, are called target site duplications, and are generated during the process of recombination, as we shall discuss below.
Transposons Exist as Both Autonomous and Nonautonomous Elements
DNA transposons that carry a pair of terminal inverted repeats and a transposase gene have everything they need to promote their own transposition. These elements are called autonomous transposons. However, genomes also contain many even simpler mobile DNA segments known as nonautonomous transposons. These elements carry only the terminal inverted repeats, that is, the cis-acting sequences needed for transposition. In a cell that also carries an autonomous transposon, encoding a transposase that will recognize these terminal inverted repeats, the nonautonomous element will be able to transpose. However, in the absence of this "helper" transposon (to donate the transposase), nonautonomous elements remain frozen, unable to move.
Viral-like Retrotransposons and Retroviruses Carry Terminal Repeat Sequences and Two Genes Important for Recombination
Viral-like retrotransposons and retroviruses also carry inverted terminal repeat sequences that are the sites of recombinase binding and action (Figure 11-19b). The terminal inverted repeats are embedded within longer repeated sequences; these sequences are organized on the two ends of the element as direct repeats and are
called long terminal repeats or LTRs. Viral-like retrotransposons encode two proteins needed for their mobility: integrase (the transposase) and reverse transcriptase. Reverse transcriptase (RT) is a special type of DNA polymerase that can use an RNA template to synthesize DNA. This enzyme is needed for transposition because an RNA intermediate is required for the transposition reaction. Because these elements convert RNA into DNA, the reverse of the normal pathway of biological information flow (DNA to RNA), they are known as "retro" elements. The distinction between viral-like retrotransposons and retroviruses is that the genome of a retrovirus is packaged into a viral particle, escapes its host cell, and infects a new cell. In contrast, the retrotransposons can move only to new DNA sites within a cell but never leave that cell. Like the DNA transposons, these elements are flanked by short target site duplications that are generated during recombination.
Poly-A Retrotransposons Look Like Genes
The poly-A retrotransposons do not have the terminal inverted repeats present in the other transposon classes. Instead, the two ends of the element have distinct sequences (Figure 11-19c). One end is called the 5' UTR (for untranslated region), whereas the other end has a region called the 3' UTR followed by a stretch of A-T base pairs called the poly-A sequence. These elements are also flanked by short target site duplications. Retrotransposons carry two genes, known as ORF1 and ORF2. ORF1 encodes an RNA-binding protein. ORF2 encodes a protein with both reverse transcriptase activity and an endonuclease activity. This protein, although distinct from the transposases and integrases encoded by the other classes of elements, plays essential roles during recombination. Like their DNA and viral-like transposon counterparts, poly-A retrotransposons exist commonly in both autonomous and nonautonomous forms. Furthermore, genome sequence analysis reveals that there are many truncated elements that do not have a complete 5' UTR sequence and have lost their ability to transpose.
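The structural features that distinguish the three classes, and the distinction between autonomous and nonautonomous elements, can be summarized as a simple decision procedure. The sketch below is only an illustration: the function names, the feature flags, and the rule that an element counts as autonomous when it encodes every protein listed for its class are assumptions made here for clarity, not criteria stated in the text.

```python
# Toy classifier for transposable elements based on the features described above.
# Assumption: an element is "autonomous" if it encodes all proteins listed for its class.
REQUIRED_GENES = {
    "DNA transposon": {"transposase"},
    "viral-like retrotransposon": {"integrase", "reverse transcriptase"},
    "poly-A retrotransposon": {"ORF1", "ORF2"},
}


def classify(has_ltrs: bool, has_poly_a: bool, has_tirs: bool) -> str:
    """Assign an element to one of the three classes from its end structures."""
    # LTRs are checked first because viral-like elements also carry
    # inverted repeats embedded within their LTRs.
    if has_ltrs:
        return "viral-like retrotransposon"
    if has_poly_a:
        return "poly-A retrotransposon"
    if has_tirs:
        return "DNA transposon"
    return "unclassified"


def is_autonomous(element_class: str, genes: set[str]) -> bool:
    """Return True if the element encodes every protein required by its class."""
    return REQUIRED_GENES[element_class] <= set(genes)


if __name__ == "__main__":
    cls = classify(has_ltrs=False, has_poly_a=False, has_tirs=True)
    print(cls, is_autonomous(cls, {"transposase"}))
    print(cls, is_autonomous(cls, set()))
```

In this scheme, an element that has only its terminal inverted repeats is a nonautonomous DNA transposon, which matches the description of elements that depend on a "helper" for transposase.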
DNA Transposition by a Cut-and-Paste Mechanism
DNA transposons, viral-like retrotransposons, and retroviruses all use a similar mechanism of recombination to insert their DNA into a new site. First, let us consider the simplest transposition reaction: the movement of a DNA transposon by a nonreplicative mechanism. This recombination pathway involves the excision of the transposon from its initial location in the host DNA followed by integration of this excised transposon into a new DNA site. This mechanism is therefore called cut-and-paste transposition (Figure 11-20). To initiate recombination, the transposase binds to the terminal inverted repeats at the ends of the transposon. Once the transposase recognizes these sequences, it brings the two ends of the transposon DNA together to generate a stable protein-DNA complex. This complex is called the synaptic complex or transpososome. It contains a multimer of transposase (usually two or four subunits) and the two DNA ends (see below). This complex functions to ensure that the DNA cleavage and joining reactions needed to move the transposon occur simultaneously on the two ends of the element's DNA. It also protects the DNA ends from cellular enzymes during recombination.
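Because the transposase binds the same recognition sequence at both ends of the element, the two ends read as reverse complements of each other on a single strand. A minimal check for this arrangement might look like the following; the function names and example sequences are hypothetical, and real terminal inverted repeats are often imperfect, which this sketch does not handle.

```python
# Toy check for perfect terminal inverted repeats on a single-strand sequence.
# Assumption: repeats are perfect; imperfect repeats are not handled.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeats(element: str, repeat_length: int) -> bool:
    """Return True if the two ends of the element are inverted repeats."""
    if repeat_length <= 0 or 2 * repeat_length > len(element):
        raise ValueError("repeat_length must be positive and fit twice in the element")
    left_end = element[:repeat_length]
    right_end = element[-repeat_length:]
    return left_end == reverse_complement(right_end)


if __name__ == "__main__":
    inverted = "GGCCAT" + "TTTTAAAACCCC" + "ATGGCC"  # ends are inverted repeats
    direct = "GGCCAT" + "TTTT" + "GGCCAT"            # ends are direct repeats
    print(has_terminal_inverted_repeats(inverted, 6))
    print(has_terminal_inverted_repeats(direct, 6))
```

The second example shows why direct repeats, such as target site duplications, do not satisfy the same test.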
FIGURE 11-20 The cut-and-paste mechanism of transposition. The figure shows movement of a transposon from a target site in the gray host DNA to a new site in the blue DNA. Note the staggered cleavage sites on the target DNA during the DNA strand transfer reaction that give rise to short repeated sequences at the new target site (the target site duplications). The DNA at the original insertion site (here in gray) will be left with a double-stranded DNA break as a result of transposon excision. This break can be repaired by nonhomologous end joining or homologous recombination (see Chapters 9 and 10).
The next step is the excision of the transposon DNA from its original location in the genome. To achieve this, the transposase subunits within the transpososome first cleave one DNA strand at each end of the transposon, exactly at the junction between the transposon DNA and the host sequence in which it is inserted (a region called the flanking host DNA). The transposase cleaves the DNA such that the transposon sequence terminates with free 3'OH groups at each end of the element's DNA. To finish the excision reaction, the other DNA strand at each end of the element must also be cleaved. Different transposons use different mechanisms to cleave these "second" DNA strands (those strands that terminate with 5' ends at the transposon-host DNA junction). These mechanisms are described in a following section.
After excision of the transposon, the 3'OH ends of the transposon DNA (the ends first liberated by the transposase) attack the DNA phosphodiester bonds at the site of the new insertion. This DNA segment is called the target DNA. Recall that for most transposons, the target DNA can have essentially any sequence. As a result of this attack, the transposon DNA is covalently joined to the DNA at the target site. During each DNA joining reaction, a nick is also introduced into the target DNA (Figure 11-20). This DNA joining reaction occurs by a one-step transesterification reaction that is called DNA strand transfer. A similar mechanism for joining nucleic acid strands is used for RNA splicing (see Chapter 13). The transpososome ensures that the two ends of the transposon DNA attack the two DNA strands at the same target site together. The sites of attack on the two strands are usually separated by a few nucleotides (for example, 2-, 5-, and 9-nucleotide spacings are common). This distance is fixed for each type of transposon and gives rise to the short target-site duplications that flank transposed copies of the element (as is explained in the next section). Once DNA strand transfer is complete, the job of the transpososome is also complete. The remaining recombination steps are carried out by cellular DNA repair proteins.
The Intermediate in Cut-and-Paste Transposition Is Finished by Gap Repair
The structure of the DNA intermediate generated after DNA strand transfer has the 3' ends of the transposon DNA attached to the target DNA. This structure also carries the two nicks in the target DNA that were generated during the process of DNA strand transfer. The fact that the two sites of DNA strand transfer on the two strands are separated by a few nucleotides results in short ssDNA gaps flanking the joined transposon. These gaps are filled by a DNA repair polymerase encoded by the host cell. Note that the target DNA is cleaved during the DNA strand transfer step to generate 3'OH ends that can serve as the primers for this repair synthesis (see Figure 11-20). Filling in the gaps gives rise to the target site duplications that flank transposons (see above). Thus, the length of the target site duplication reveals the distance between the sites attacked on the two strands of the target DNA during DNA strand transfer. After the gap repair synthesis, DNA ligase is needed to seal the DNA strands. Cut-and-paste transposition also leaves a double-stranded break in the DNA at the site of the "old" insertion, which must be repaired to maintain the integrity of the host cell's genome. Repair of double-stranded DNA breaks by homologous recombination is described in Chapter 10. These breaks are also sometimes more directly rejoined, as we will see below in the discussion of the Tc1/mariner family of transposons.
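The way staggered strand transfer and gap repair produce a target site duplication can be followed in a short sketch. It works on top-strand sequences only and makes two simplifying assumptions that are labeled in the code: the donor break is shown as simply rejoined, and the stagger is given as a number of nucleotides. The function name and all sequences are hypothetical; the transposon is written in lowercase only to make it easy to see in the output.

```python
# Toy model of cut-and-paste transposition on top-strand sequences.
# Assumptions: the donor break is shown as directly rejoined, and the two
# strand-transfer sites on the target are `stagger` nucleotides apart.


def cut_and_paste(donor: str, start: int, end: int,
                  target: str, site: int, stagger: int) -> tuple[str, str]:
    """Move donor[start:end] into target, duplicating `stagger` target bases."""
    if not 0 <= start < end <= len(donor):
        raise ValueError("transposon boundaries must lie within the donor")
    if stagger < 0 or not 0 <= site <= len(target) - stagger:
        raise ValueError("staggered cut sites must lie within the target")
    transposon = donor[start:end]
    donor_after = donor[:start] + donor[end:]
    # The bases between the two staggered cuts end up on both sides of the
    # transposon once the single-stranded gaps have been filled in.
    duplicated = target[site:site + stagger]
    target_after = (
        target[:site + stagger] + transposon + duplicated + target[site + stagger:]
    )
    return donor_after, target_after


if __name__ == "__main__":
    donor = "TTTT" + "ggccatacgtatggcc" + "TTTT"
    target = "AAACGTACGGG"
    donor_after, target_after = cut_and_paste(donor, 4, 20, target, 3, 5)
    print(donor_after)
    print(target_after)
```

With a 5-nucleotide stagger, the five target bases between the cut sites appear as a direct repeat on each side of the inserted element, so the length of the duplication equals the spacing of the two attack sites.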
There Are Multiple Mechanisms for Cleaving the Nontransferred Strand during DNA Transposition
As just described, the transposase cleaves the 3' ends of the element DNA and promotes DNA strand transfer to catalyze cut-and-paste transposition. However, transposons that move by this mechanism also need to cleave the 5'-terminating strands at the junctions between the transposon and the flanking host DNA. These DNA strands are called the nontransferred strands, as their 5' ends are not directly linked to the target DNA during the DNA strand transfer reaction. Different transposons use different mechanisms to catalyze this second strand cleavage reaction (Figure 11-21). Three methods are described here. An enzyme other than the transposase can be used to cleave the nontransferred strand. For example, the bacterial transposon Tn7 encodes a specific protein (called TnsA) that does this job (Figure 11-21a). TnsA has a structure very similar to that of a restriction endonuclease. TnsA assembles with the Tn7-encoded transposase
FIGURE 11-21 Three mechanisms for cleaving the nontransferred strand. (a) Tn7: an enzyme other than transposase is used. (b) Tn10 and Tn5: the transposase catalyzes the attack of one DNA strand on the opposite strand to form the DNA hairpin intermediate. The two hairpin ends are subsequently hydrolyzed by the transposase. (c) IS911: the transposase catalyzes the attack of the 3'OH from one end of the element's DNA on the same strand at the opposite end. Subsequent steps (not shown) then result in an excised transposon.
(the TnsB protein). By working together, the transposase and TnsA excise the transposon from its original target site. The other ways of cleaving the nontransferred strand are promoted by the transposase itself, using an unusual DNA transesterification mechanism that is similar to DNA strand transfer. For example, the transposons Tn5 and Tn10 cleave the nontransferred strand by generating a structure known as a "DNA hairpin." To form this hairpin, the transposase uses the initially cleaved 3'OH end of the transposon DNA to attack a phosphodiester bond directly across the DNA duplex on the opposite strand (Figure 11-21b). This reaction both cleaves the attacked DNA strand and covalently joins the 3' end of the transposon DNA to one side of the break. As a result, the two DNA strands are covalently joined by a looped end, reminiscent in shape of a hairpin. This hairpin DNA end is then cleaved (that is, "opened") by the transposase to generate a standard double-strand break in the DNA. This opening reaction occurs on both ends of the transposon DNA. Once
these steps are complete, the 3'OH ends of the element DNA are ready to be joined to a new target DNA by the DNA strand transfer reaction. DNA cleavage via a transesterification reaction can also occur between the two ends of the transposon. This is the third mechanism used by transposons to cleave the nontransferred strands. In this case, one cleaved 3'OH end attacks the same DNA strand at the opposite end of the element's DNA (Figure 11-21c). The resulting DNA intermediate is further processed to generate the excised transposon. The IS3 family of transposons uses this mechanism. Why might transposases use transesterification as a cleavage mechanism? It is probably an economical solution. Transposases have the intrinsic ability to promote (1) site-specific hydrolysis of the 3' ends of the transposon DNA and (2) transesterification of this end into a nonspecific DNA site. These same activities, with the transesterification reaction simply applied to a new DNA site, can allow the transposase to promote transposon excision. This mechanism, therefore, avoids the need for the transposon to encode a second enzyme to cleave the nontransferred strand.
DNA Transposition by a Replicative Mechanism
Some DNA transposons move using a mechanism called replicative transposition, in which the element DNA is duplicated during each round of transposition. Although the products of the transposition reaction are clearly different, as we will now see, the mechanism of recombination is very similar to that used for cut-and-paste transposition (Figure 11-22). The first step of replicative transposition is the assembly of the transposase protein on the two ends of the transposon DNA to generate a transpososome. As we saw for cut-and-paste transposition, transpososome formation is essential to coordinate the DNA cleavage and joining reactions on the two ends of the transposon's DNA. The next step is DNA cleavage at the ends of the transposon DNA. This reaction is catalyzed by the transposase within the transpososome. The transposase introduces a nick into the DNA at each of the two junctions between the transposon sequence and the flanking host DNA (see Figure 11-22). This cleavage liberates two 3'OH DNA ends on the transposon sequence. In contrast to cut-and-paste transposition, the transposon DNA is not excised from the host sequences at this stage. This is the major difference between replicative and cut-and-paste transposition. The 3'OH ends of the transposon DNA are then joined to the target DNA site by the DNA strand transfer reaction. The mechanism is the same as we saw above for cut-and-paste transposition. However, the intermediate generated by DNA strand transfer is in this case a doubly branched DNA molecule (see Figure 11-22). In this intermediate, the 3' ends of the transposon are covalently joined to the new target site, while the 5' ends of the transposon sequence remain joined to the old flanking DNA.
The two DNA branches within this intermediate have the structure of a replication fork (see Chapter 8). After DNA strand transfer, the DNA replication proteins from the host cell can assemble at these forks. In the best understood example of replicative transposition (phage Mu, which we discuss below), this assembly specifically occurs at only one of the two forked structures (see Figure 11-22, bottom panels). The 3'OH end in the cleaved target DNA serves as a primer for DNA synthesis. Replication proceeds through the transposon sequence and stops at the second fork. This replication reaction generates two copies of the transposon DNA. These copies are flanked by the short direct target site duplications.
FIGURE 11-22 Mechanism for replicative transposition. The transpososome introduces a single-strand nick at each of the ends of the transposon DNA. This cleavage generates a 3'OH group at each end. These OH groups then attack the target DNA and become joined to the target by DNA strand transfer. Note that at each end of the transposon, only one strand is transferred into the target at this point, resulting in the formation of a doubly branched DNA structure. The replication apparatus assembles at one of these "forks" (the left one in the figure). Replication continues through the transposon sequence. The resulting product, called a cointegrate, has the two starting circular DNA molecules joined by two copies of the transposon. The ssDNA gaps in the branched intermediate give rise to the target site duplications. These duplications are not shown in the cointegrate for clarity. Panel labels: donor DNA cleavage of transferred strand (3'OH); replication fork assembly (at left gap); leading strand replication through the transposon; continued replication and ligation; cointegrate with 2 copies of transposon.
Replicative transposition frequently causes chromosomal inversions and deletions that can be highly detrimental to the host cell. This propensity to cause rearrangements may put replicative transposons at a selective disadvantage. Perhaps this is why so many elements have developed ways to excise completely from their original DNA location prior to joining to a new DNA site. By excision, transposons avoid generating these major disruptions to the host genome.
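The bookkeeping of the replicative pathway described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a simulation of the chemistry: circular DNA molecules are represented as lists of hypothetical segment labels, the function name `replicative_transposition` is invented for illustration, and the short target site duplications are not modeled.

```python
def replicative_transposition(donor, target, element, site):
    """Toy model of cointegrate formation (illustrative only).

    donor, target: circular DNA molecules as lists of segment labels,
                   read around the circle (hypothetical representation).
    element:       label of the transposon, present once in donor.
    site:          index in target at which the element inserts.
    """
    i = donor.index(element)
    # Donor backbone, read around the circle starting after the element.
    donor_backbone = donor[i + 1:] + donor[:i]
    # Target molecule, opened at the insertion site.
    target_opened = target[site:] + target[:site]
    # Replication through the element leaves one copy at each
    # donor/target junction, fusing the two circles into one.
    return [element] + donor_backbone + [element] + target_opened


donor = ["d1", "d2", "Tn", "d3"]
target = ["t1", "t2", "t3"]
cointegrate = replicative_transposition(donor, target, "Tn", site=1)
print(cointegrate)
```

The point of the sketch is only the outcome: one fused circle that contains every donor and target segment plus two copies of the element, which is the cointegrate structure shown in Figure 11-22.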
Viral-like Retrotransposons and Retroviruses Move Using an RNA Intermediate
Viral-like retrotransposons and retroviruses insert into new sites in the genome of the host cell, using the same steps of DNA cleavage and DNA strand transfer we have described for the DNA transposons. In contrast to the DNA transposons, however, recombination for these retroelements involves an RNA intermediate.
A cycle of transposition starts with transcription of the retrotransposon (or retroviral) DNA sequence into RNA by a cellular RNA polymerase. Transcription initiates at a promoter sequence within one of the LTRs (Figure 11-23) and continues across the element to generate a nearly full-length RNA copy of the element's DNA. The RNA is then reverse transcribed to generate a double-stranded DNA molecule. This DNA molecule is called the cDNA (for copied DNA) and is free from any flanking host DNA sequences.
It is the cDNA that is recognized by the integrase protein (a protein highly related to the transposases of DNA elements, as we shall see below) for recombination with a new target DNA site. Integrase assembles on the ends of this cDNA, and then cleaves a few nucleotides off the 3' end of each strand. This cleavage reaction is identical to the DNA cleavage step of DNA transposition. As the direct precursor DNA for integration is generated from the RNA template by reverse transcription, it is already in the form of an excised transposon. Therefore, a mechanism to cleave the second strand is unnecessary for these elements.
Integrase then catalyzes the insertion of these cleaved 3' ends into a DNA target site in the host cell genome using the DNA strand transfer reaction. As we discussed above, this target site can have essentially any DNA sequence. Host cell DNA repair proteins fill the gaps at the target site generated during DNA strand transfer to complete recombination. This gap-repair reaction generates the target-site duplications.
FIGURE 11-23 Mechanism of retroviral integration and transposition of viral-like retrotransposons. The top panel shows an integrated provirus. For a more detailed view of the LTR sequences, see the figures in Box 11-2. The promoter for transcription of the viral RNA is embedded in the left LTR as shown. cDNA synthesis from this viral RNA is explained in Box 11-2. The integrase-catalyzed DNA cleavage and DNA strand transfer steps are shown. Panel labels: transcription; RNA; reverse transcription; cDNA; integrase-catalyzed 3' end cleavage (dinucleotide released, 3'OH); target DNA; integrase-catalyzed DNA strand transfer; gap repair and ligation; new integrated copy.
Because transcription to generate the RNA intermediate initiates within one of the LTRs, this RNA does not carry the entire LTR sequence; the sequence between the transcription start site and the end of the element is missing. Therefore a special mechanism is needed to regenerate the full-length element sequence during reverse transcription. The pathway of reverse transcription involves two internal priming events and two strand switches (see details of the process in Box 11-2, The Pathway of Retroviral cDNA Formation). These switching events result in the duplication of sequences at the ends of the cDNA. Thus, the cDNA has complete, reconstructed LTR sequences to compensate for regions of sequence lost during transcription. This reconstruction of the LTRs is essential for recognition of the cDNA by integrase and for subsequent recombination.
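The loss and reconstruction of LTR sequence can be followed with a small Python sketch that tracks only the order of sequence elements, using the U3, R, and U5 names from Box 11-2. It is bookkeeping under stated assumptions: the function names are hypothetical, the genome is reduced to a list of labels, and the priming and strand-switching steps themselves are not modeled.

```python
LTR = ["U3", "R", "U5"]  # element order within one complete LTR


def make_provirus(internal):
    """Integrated element: a complete LTR on each side of the genes."""
    return LTR + list(internal) + LTR


def transcribe(provirus):
    """RNA runs from the R of the left LTR to the R of the right LTR."""
    first_r = provirus.index("R")
    last_r = len(provirus) - 1 - provirus[::-1].index("R")
    return provirus[first_r:last_r + 1]


def reverse_transcribe(rna):
    """cDNA regains a full LTR at both ends (outcome only, no mechanism)."""
    r_left, u5, *internal, u3, r_right = rna
    if r_left != "R" or r_right != "R":
        raise ValueError("RNA is expected to carry R at both ends")
    full_ltr = [u3, r_left, u5]
    return full_ltr + internal + full_ltr


provirus = make_provirus(["gag", "pol", "env"])
rna = transcribe(provirus)
cdna = reverse_transcribe(rna)
print(rna)
print(cdna)
```

In this representation the RNA carries only one U3 and one U5, and reverse transcription has to supply the second copy of each before the cDNA again matches the integrated element.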
DNA Transposases and Retroviral Integrases Are Members of a Protein Superfamily
As we have seen, DNA cleavage of the 3' ends of the transposon DNA (or cDNA) and DNA strand transfer are common steps used for DNA transposition and the movement of viral-like retrotransposons and retroviruses. This conserved recombination mechanism is reflected in the structure of the transposase/integrase proteins (Figure 11-24). High-resolution structures reveal that many different transposases and integrases carry a catalytic domain that has a common three-dimensional shape. This catalytic domain contains three evolutionarily invariant acidic amino acids: two aspartates (D) and a glutamate (E). Therefore, recombinases of this class are referred to as DDE-motif transposase/integrase proteins. These acidic amino acids form part of the active site and coordinate divalent metal ions (such as Mg2+ or Mn2+) that are required for activity (as we described for the DNA polymerases, see Chapter 8). An unusual feature of the transposase/integrase proteins is that they use this same active site to catalyze both the DNA cleavage and DNA strand transfer reactions, rather than having two active sites, each specialized for one chemical reaction.
In contrast to the highly conserved structure of the catalytic domains, the remaining regions of proteins in this family are not conserved. These regions encode site-specific DNA-binding domains and regions involved in protein-protein interactions needed to assemble the protein-DNA complex specific for each individual element. Thus, these unique domains ensure that transposases and integrases catalyze recombination specifically only on the element that encoded them or on a very highly related element.
Transposases and integrases are only active when assembled into a synaptic complex, also called a transpososome, on DNA (see above). The co-crystal structure of Tn5 transposase bound to a pair of transposon end DNA fragments provides insight into why this is the case (Figure 11-25). The transposase subunit that is bound to the recombinase recognition sequences on one of these DNA fragments (that is, on one transposon end) donates the catalytic domain that promotes the DNA cleavage and DNA strand transfer reactions on the other end of the transposon. Because of this subunit organization, the transposase will be properly positioned for recombination only when two subunits and a pair of DNA ends are present together in the complex.
FIGURE 11-24 Similarities of catalytic domains of transposases and integrases. (a) Structures of the conserved core domains of Tn5 transposase (Davies D.R., Goryshin I.Y., Reznikoff W.S., and Rayment I. 2000. Science 289: 77-85), of phage Mu transposase (Rice P. and Mizuuchi K. 1995. Cell 82: 209-220), and of RSV integrase (Chook Y.M., Gray J.V., Ke H., and Lipscomb W.N. 1994. J. Mol. Biol. 240: 476-500). Common secondary structure elements are shown in the same colors. The DDE motif active site residues are shown in ball and stick. Images prepared with BobScript, MolScript, and Raster 3D. (b) Schematic of the domain organization of the three proteins shown in part a. The N-terminal domains bind to the element DNA. The middle domains contain the regions shown in (a). The C-terminal domains are involved in protein-protein contacts needed to assemble the transpososome and/or to interact with other regulatory proteins.
Box 11-2 The Pathway of Retroviral cDNA Formation
To understand the process of retroviral reverse transcription (or that of the viral-like retrotransposons), we first need to look in more detail at the structure of the LTR sequences. Each LTR is constructed of three sequence elements. These are called U3 (for unique 3' end), R (for repeat), and U5 (for unique 5' end). Transcription from the integrated copy of the retroviral genome generates the viral RNA with the R sequence at each end (Box 11-2 Figure 1). Therefore, during the process of reverse transcription, one additional U3 and U5 region must be synthesized. As explained below, this duplication happens because priming of DNA synthesis occurs at internal sites within the RNA genome and the R sequence allows two "strand switches" to occur during the replication process.
It is the viral RNA that is packaged into virus particles, and this RNA enters the new cell during infection. The viral RNA is package
Figure 2b). Reverse transcriptase has two enzymatic activities that are important for cDNA formation: a DNA polymerase activity and an RNAse H activity. RNAse H enzymes degrade RNA that is base-paired with DNA (as we discussed in Chapter 8). During reverse transcription, RNAse H removes the template RNA strands. When this step occurs on the first RNA-DNA hybrid intermediate (see Box 11-2 Figures 2b and 2c), the U5-R DNA strand is released in a single-stranded form.
This U5-R DNA strand can then base-pair with the R region on the other end of the viral RNA molecule (Box 11-2 Figure 2d). This step is the first of the two strand switches. Once this annealing occurs, reverse transcriptase continues DNA synthesis to copy the remainder of the RNA template (Box 11-2 Figure 2e). The resulting DNA strand ends with the PBS sequence at its 3' terminus. The RNA template strand is removed as before, by RNAse H (Box 11-2 Figures 2d and 2e).
RNAse H-mediated degradation of the viral RNA also generates an RNA fragment that serves as the primer for synthesis of the second cDNA strand. This region of RNA remains base-paired with a sequence called the polypurine tract (PPT) at the edge of the U3 sequence (Box 11-2 Figures 2e and 2f). Elongation of this primer copies the U3, R, U5, and PBS sequences into DNA. Once the tRNA primer is remove
BOX 11-2 FIGURE 1 Detailed view of the sequence elements near the ends of the retroviral RNA and cDNA. Viral-like retrotransposons have a very similar sequence organization. The pol gene encodes both reverse transcriptase (including the RNAse H activity) and integrase. Panel labels: RNA (5' R U5 gag pol env U3 R 3'); integration; provirus DNA with an LTR (U3 R U5) on each side of gag, pol, and env; transcription.
BOX 11-2 FIGURE 2 Pathway of reverse transcription to generate the cDNA copy of the retroviral or retrotransposon RNA. Panel labels: RNA with PBS and PPT; RNAse H; extension by RT and RNA removal by RNAse H; RNA primer; 2nd DNA strand synthesis by RT.
FIGURE 11-25 Co-crystal of Tn5 bound to substrate DNA. The complex contains a dimer of Tn5 transposase.
Poly-A Retrotransposons Move by a "Reverse Splicing" Mechanism
The poly-A retrotransposons, for example, human LINE elements, move using an RNA intermediate but use a mechanism different from that used by the viral-like elements. This mechanism is called target site primed reverse transcription (Figure 11-26). The first step is transcription of the DNA of an integrated element by a cellular RNA polymerase (Figure 11-26a). Although the promoter is embedded in the 5'UTR, it can in this case direct RNA synthesis to begin at the first nucleotide of the element's sequence.
This newly synthesized RNA is exported to the cytoplasm and translated to generate the ORF1 and ORF2 proteins (see above). These proteins remain associated with the RNA that encoded them (Figure 11-26b). In this way, an element promotes its own transposition and does not donate proteins to competing elements.
The protein-RNA complex then reenters the nucleus and associates with the cellular DNA (Figure 11-26c). Recall that the ORF2 protein has both a DNA endonuclease activity and a reverse transcriptase activity. The endonuclease initiates the integration reaction by introducing a nick in the chromosomal DNA (see Figure 11-26d). T-rich sequences are preferred cleavage sites. The presence of these Ts at the cleavage site permits the DNA to base-pair with the poly-A tail sequence of the element RNA. The 3'OH DNA end generated by the nicking reaction then serves as the primer for reverse transcription of the element RNA (Figure 11-26e). The ORF2 protein also catalyzes this DNA synthesis. The remaining steps of transposition, although not yet well understood, include synthesis of the second cDNA strand, repair of DNA gaps at the insertion site, and ligation to seal the DNA strands.
Many of the poly-A retrotransposons that have been detected by large-scale genomic sequencing are truncated elements. Most of these are missing regions from their 5' ends and do not have complete copies of element-encoded genes or an intact promoter. These truncated elements therefore have lost the ability to transpose.
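The preference for T-rich cleavage sites lends itself to a simple scan of a target sequence. The following Python sketch lists runs of consecutive Ts that could, in principle, base-pair with a poly-A tail. It is an illustration only: the function name `t_rich_sites` and the threshold `min_run` are assumptions, since the text does not say how long a T-rich stretch the endonuclease requires.

```python
import re


def t_rich_sites(target_dna, min_run=4):
    """Return (start, length) for each run of at least `min_run` Ts.

    Positions are 0-based on the strand supplied. `min_run` is an
    illustrative threshold, not a measured property of the ORF2
    endonuclease.
    """
    pattern = re.compile("T{%d,}" % min_run)
    return [(m.start(), len(m.group()))
            for m in pattern.finditer(target_dna.upper())]


sites = t_rich_sites("GCATTTTTAGGCTTTTCA")
print(sites)
```

Each reported run marks a place where a nick would leave a 3'OH end next to Ts that can pair with the poly-A tail, the arrangement that lets the target DNA itself prime reverse transcription.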
FIGURE 11-26 Transposition of a poly-A retrotransposon by target site-primed reverse transcription. The figure outlines a model for the movement of a LINE element. (a) A cellular RNA polymerase initiates transcription of an integrated LINE sequence. (b) The resulting messenger RNA is translated to produce the products of the two encoded ORFs that then bind to the 3' end of their mRNA. (c) The protein-mRNA complex then binds to a T-rich site in the target DNA. (d) The proteins initiate cleavage in the target DNA, leaving a 3'OH at the DNA end and forming an RNA-DNA hybrid. Panel labels: LINE DNA (5' UTR, ORF1, ORF2, 3' UTR); LINE mRNA; translation; ORF1 and ORF2 proteins binding LINE mRNA; target DNA; target site cleavage and RNA-DNA hybrid formation; synthesis of first cDNA strand; RNA degradation and second strand synthesis; DNA joining and repair.
EXAMPLES OF TRANSPOSABLE ELEMENTS AND THEIR REGULATION
Transposons have successfully invaded and colonized the genomes of all life-forms. Clearly they are very robust biological entities. Some of this success can be attributed to the fact that transposition is regulated in ways that help to establish a harmonious coexistence with the host cell. This coexistence is essential for the survival of the element as transposons cannot exist without a host organism. On the other hand, as introduced above, transposons can wreak havoc in a cell, causing insertion mutations, altering gene expression, and promoting large-scale DNA rearrangements. These disruptions are particularly noticeable in plants, a feature that led to the discovery of transposons in maize (Box 11-3, Maize Elements and the Discovery of Transposons).
In the following sections we briefly describe some of the best-understood individual transposons and transposon families. (A larger list of transposons and some of their important features is summarized in Table 11-2.) Each subsection provides a brief overview of a specific element and an example of regulation that is of particular importance to that element. As we will see, two types of regulation appear as recurring themes:
• Transposons control the number of their copies present in a given cell. By regulating copy number, these elements limit their deleterious impact on the genome of the host cell.
• Transposons control target site choice. Two general types of target site regulation are observed. In the first, some elements preferentially insert into regions of the chromosome that tend not to be harmful to the host cell. These regions are called safe havens for transposons. In the second type of regulation, some transposons specifically avoid transposing into their own DNA. This phenomenon is called transposition target immunity.
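The second theme, target immunity, can be pictured as a distance rule around existing copies of an element. The Python sketch below is a deliberately crude illustration: it assumes linear genome coordinates in base pairs, treats immunity as a hard cutoff rather than a graded effect, and uses as its default the approximately 15 kb range that this chapter later quotes for phage Mu. The function names are hypothetical.

```python
def is_immune(candidate_bp, insertion_sites_bp, immunity_range_bp=15_000):
    """True if a candidate target lies within the immune range of any
    existing copy of the element (hard cutoff, an assumption)."""
    return any(abs(candidate_bp - site) <= immunity_range_bp
               for site in insertion_sites_bp)


def permitted_targets(candidates_bp, insertion_sites_bp,
                      immunity_range_bp=15_000):
    """Candidate targets that are not protected by target immunity."""
    return [c for c in candidates_bp
            if not is_immune(c, insertion_sites_bp, immunity_range_bp)]


existing = [100_000]                      # one resident element copy
candidates = [90_000, 110_000, 120_000, 300_000]
allowed = permitted_targets(candidates, existing)
print(allowed)
```

With a larger `immunity_range_bp`, such as the distances above 100 kb mentioned later for Tn3 and Tn7, more of the candidate sites around a resident element would be excluded.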
IS4-Family Transposons Are Compact Elements with Multiple Mechanisms for Copy Number Control
FIGURE 11-27 Genetic organization of bacterial transposon Tn10. The map shows the functional elements in the bacterial transposon Tn10. Tn10, like many bacterial transposons, actually carries two "mini-transposons" at its termini. For Tn10, these elements are called IS10L (left) and IS10R (right). Both types of IS10 elements can transpose, and are found in DNA separately from Tn10. The triangles show the inverted repeat sequences. Map labels: defective transposase gene; transposase gene; tetracycline resistance genes; P_IN; P_OUT; IS10 left; IS10 right.
Box 11-3 Maize Elements and the Discovery of Transposons
Plant genomes are very rich in transposons. Furthermore, the ability of transposable elements to alter gene expression can often be readily observed as dramatic variation in the coloration of the plant (Box 11-3 Figure 1). Thus, it is not surprising that transposable elements, and many of their salient features, were first discovered in plants.
Barbara McClintock discovered "controlling elements" in maize in the late 1940s. It was actually the ability of transposable elements to break chromosomes that first came to McClintock
be in different chromosomal locations in the descendants of an individual plant. This observation provided the first insight that genetic elements could move, that is "transpose," within chromosomes.
Ds, in fact, is a nonautonomous DNA transposon that moves by cut-and-paste transposition. Ds movement requires the Ac (activator) element, also discovered by McClintock, to be present in the same cell and provide the transposase protein. Ac is now recognized to be part of a large family of DNA transposons called the hAT family, named for the hobo elements from flies, the Ac elements from maize, and the Tam elements from snapdragon.
BOX 11-3 FIGURE 1a Example of corn (maize) cob showing color variegation due to transposition. (Source: Photograph taken by Barbara McClintock; image courtesy Cold Spring Harbor Laboratory Archives.)
BOX 11-3 FIGURE 1b Example of color variegation in snapdragon flowers due to Tam3 transposition. The size of white patches is related to the frequency of transposition. (Source: Chatterjee M. and Martin C. 1997. The Plant Journal 11: 759-771, Figure 2a, page 762.)
Tn10 transposes via the cut-and-paste mechanism (described above), using the DNA hairpin strategy to cleave the nontransferred strands (Figures 11-19 and 11-21). The Tn10 sequence also has a site for IHF binding. IHF helps in the assembly of the proper transpososome complex needed for recombination, as it does during phage λ integration (see above).
Tn10 is organized into three functional modules. This organization is relatively common, and elements that have it are called composite transposons. The two outermost modules, called IS10L (left) and IS10R (right), are actually mini-transposons. "IS" stands for insertion sequence. IS10R encodes the gene for the transposase that recognizes the terminal inverted repeat sequences of IS10R, IS10L, and Tn10. IS10L, although very similar in sequence to IS10R, does not encode a functional transposase. Thus, both IS10R and Tn10 are autonomous, whereas IS10L is a nonautonomous transposon. Both types of IS10 elements are found, as expected, unassociated with Tn10.
Tn10 limits its copy number in any given cell by strategies that restrict its transposition frequency. One mechanism is the use of an antisense RNA to control the expression of the transposase gene (Figure 11-28) (see the discussion of antisense RNA regulation in Chapter 17). Near the end of IS10R are two promoters that direct the synthesis of RNA by the host cell's RNA polymerase. The promoter that directs RNA synthesis inward (called P_IN) is responsible for the expression of the transposase gene. The promoter that directs transcription outward (P_OUT), in contrast, serves to regulate transposase expression by making an antisense RNA, as follows. The RNAs synthesized from P_IN and P_OUT overlap (by 36 base pairs) and therefore can pair by hydrogen bonding between these overlapping (complementary) regions. This pairing prevents binding of ribosomes to the P_IN transcript, and thus synthesis of the transposase protein.
By this mechanism, cells that carry more copies of Tn10 will transcribe more of the antisense RNA, which in turn will limit expression of the transposase gene (Figure 11-28, see legend for more details). The transposition frequency will, therefore, be very low in such a strain. In contrast, if there is only one copy of Tn10 in the cell, the level of antisense RNA will be low, synthesis of the transposase protein will be efficient, and transposition will occur at a higher frequency.
FIGURE 11-28 Antisense regulation of Tn10 expression. (a) A map of the overlapping promoter regions is shown. The leftward promoter (P_IN) promotes expression of the transposase gene; the rightward promoter (P_OUT), which lies 36 bases to the left of P_IN, promotes expression of an antisense RNA. The first 36 bases of each transcript are complementary to one another. Note that in cells the antisense transcript initiated at P_OUT is longer lived than is the mRNA initiated at P_IN. (b) In cells having a high copy number of Tn10, the RNA:RNA pairing occurs frequently and blocks translation of the transposase mRNA (thereby eventually reducing the copy number of the element). (c) In cells having a low copy number of the transposon, RNA:RNA pairing is rare; the translation of transposase mRNA is efficient and the copy number in the cell is increased.
TABLE 11-2 Major Types of Transposable Elements
DNA-MEDIATED TRANSPOSITION
Bacterial replicative transposons
  Structural features: Terminal inverted repeats that flank antibiotic-resistance and transposase genes
  Mechanism of movement: Copying of element DNA accompanying each round of insertion into a new target site
  Examples: Tn3, γδ, phage Mu
Bacterial cut-and-paste transposons
  Structural features: Terminal inverted repeats that flank antibiotic-resistance and transposase genes
  Mechanism of movement: Excision of DNA from old target site and insertion into new site
  Examples: Tn5, Tn10, Tn7, IS911, Tn917
Eukaryotic transposons
  Structural features: Inverted repeats that flank coding region with introns
  Mechanism of movement: Excision of DNA from old target site and insertion into new site
  Examples: P elements (Drosophila), hAT family elements, Tc1/Mariner elements
RNA-MEDIATED TRANSPOSITION
Viral-like retrotransposons
  Structural features: ~250 to 600 bp direct terminal repeats (LTRs) flanking genes for reverse transcriptase, integrase, and retroviral-like Gag protein
  Mechanism of movement: Transcription into RNA from promoter in left LTR by RNA polymerase II followed by reverse transcription and insertion at target site
  Examples: Ty elements (yeast), Copia elements (Drosophila)
Poly-A retrotransposons
  Structural features: 3' A-T-rich sequence and 5' UTR flank genes encoding an RNA-binding protein and reverse transcriptase
  Mechanism of movement: Transcription into RNA from internal promoter; target-primed reverse transcription initiated by endonuclease cleavage
  Examples: F and G elements (Drosophila), LINE and SINE elements (mammals), Alu sequences (humans)
Tn10 Transposition Is Coupled to Cellular DNA Replication
Tn10 also couples transposition to cellular DNA replication. Recall that bacteria such as E. coli (a common host for Tn10) methylate their DNA at GATC sites (see Chapter 8, Box 8-4). This methylation occurs after DNA replication, such that GATC sites are hemimethylated for the few minutes between passage of the replication fork and recognition of these sequences by the methylase enzyme. It is during this brief period, when the Tn10 DNA is hemimethylated, that transposition is most likely to occur.
This coupling of transposition to the methylation state is due to the presence of two critical GATC sites in the transposon sequence. One of these sites is in the promoter for the transposase gene; the second is in the binding site for the transposase within one of the inverted terminal repeats. Both RNA polymerase and transposase bind more tightly to the hemimethylated sequences than to their fully methylated versions. As a result, when the DNA is hemimethylated, the transposase gene is most efficiently expressed, and the transposase protein binds most efficiently to the DNA. Therefore, transposition of Tn10 occurs at its highest frequency during this brief phase of the cell cycle just after its DNA has been replicated (Figure 11-29).
Regulation of Tn10 transposition by DNA methylation serves to limit the overall frequency of transposition. It also restricts transposition specifically to actively dividing cells. This timing ensures that there are two copies of the chromosome present to "heal" the double-stranded DNA break left in the old target site as a result of transposon excision. These "empty target sites" are repaired via homologous recombination by the double-strand break repair pathway. This recombination reaction requires that two copies of the chromosomal region be present (see Chapter 10).
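The two controls on Tn10 described above, antisense pairing and the hemimethylation window, can be combined in a toy calculation. The Python sketch below is a qualitative illustration, not a fitted model: the saturation form, the parameter `half_inhibition`, and the multiplier `hemi_boost` are all hypothetical choices made only to reproduce the direction of the effects described in the text.

```python
def free_mrna_fraction(copy_number, half_inhibition=1.0):
    """Fraction of transposase mRNA not paired with antisense RNA.

    Assumes antisense RNA level is proportional to Tn10 copy number and
    that pairing follows a simple saturation curve (both assumptions).
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    return 1.0 / (1.0 + copy_number / half_inhibition)


def relative_transposition(copy_number, hemimethylated,
                           half_inhibition=1.0, hemi_boost=10.0):
    """Relative per-element transposition activity (arbitrary units).

    `hemi_boost` is a hypothetical factor standing in for the tighter
    binding of RNA polymerase and transposase to hemimethylated DNA.
    """
    level = free_mrna_fraction(copy_number, half_inhibition)
    return level * hemi_boost if hemimethylated else level


for n in (1, 5, 20):
    print(n,
          round(relative_transposition(n, hemimethylated=False), 3),
          round(relative_transposition(n, hemimethylated=True), 3))
```

Whatever values are chosen for the two parameters, the sketch shows the same pattern as the text: activity per element falls as copy number rises, and it is highest in the short period when the DNA is hemimethylated.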
Exomples ofTronsposoble Elements cmd T1IP.ir Regulotion
F U¡ u RE 11-29 Tfansposition ofTnFO after passage of a replicatwn foric.. Transposition is activated by me hemime!h}'iated OOA thal exists jusI afta-!)NA repllcation (me~
,--
'- ~<==P
'- ~~::
,O , , = , , "
,.I
double-strand break
w,-aro_pas,e
sites are nal 5hov'ln). During transposition. a double-stranded break is made in the c:hromosomal DNA 'lklere \he elernent e«:ised. lhis break can be repaired by !he DSB-repalf pattway
transposilion
(se€' Chapter 10). a process lhal
l
regenerates a ropy of TnlO al !he site 01 excision. By thlS meméll1lSm. transposition may appear 10 be ·replic.atf...e~
In nature. ahhough the actual
rerombir'oiltion process goes 1hrough me
cul-and-paste
producls afler repliCation
1
amI repair
,-,-
331
Phage Mu ]s an Extremely Robust Transposon Phage Mu, like phage k , is a Iysoge nic bacteriophage (see Chapler 21). Mu is also a larga ONA transposon. This phage uses lransposit ion lo inserl its DNA inlo Ihe genome of Ihe hosl cen during infeclion and in Iros way is similar lo the retroviruses (discussed aboye). Mu also uses rnultiplc TOunds of replicative trnnsposition lo amplify its DNA dUTing Iytic grawth_ Duriog Ihe ¡ylie eycle. Mu comp letes about 100 fOund s of transposition per haur, making il the most efficient transpasan known. furlherrnore, even when presen l as a quiescent Iysogen, the Mu genome transposes quite frequent ly, compared lo traditional transposons such as Tn lO. The name Mu is short ror mulator and stems from this ability to tnmspose pmmiscuously: eells carrying an inserted copy of the Mu DNA frequent ly accumulate new mutations due lo insertion of the phage DNA into cellular genes. The Mu genorne is aboUl 40 kb and cardes more than 35 gen es, but on ly two encode pmteins with dedicated mies in transposition. Thesc are the A and B genes, which encode lhe proteins MuA and MuE. MuA is the transpasase and is a member or the DDE prolein superfamily we discll ssed. MuB is .m ATPase Ihat stimuJates MuA act ivily and contmls the choi ce of the DNA target site (Figure 1 1-301. This process is explained in the next section.
Mu Uses Target Immunity to Avoid Transposing into Its Own DNA Mu, like many transposons, shows very little : >equence preference al jts larget sites. As a result, "good" targe! siles occur very frequen tly in DNA
(nonrep~cative)
pathway.
fl32
Si/e·Sper;i/ic Rr.combinolion ond Tronsposilion 01 DNA
FIGURE 11-JO Overviewofthee.ty
stepS of Mu trAnsposition.. fotK subunits of !he MuA transposase asS@mb/e en the end5
cA Wo ONA. MoB blnds ATP and dlen birCs lo ~ of crry seqoence. A pIO!e~n Interac::· hon between MuA and MuS ~ the MuA ONA--transpososome rorr.p6ex lO 11 oew DNA target site. MuS IS l'lOI shown lf1 lhe final pan8 because, afte, ONA strand transler, d 15 no Iorfler needed and probably Ieaves the rompIec
MuS targa!
oom• • "
Examplcs of1h11lsposable Elcmcnts and Thdr ReguJatioll
313
including the DNA of the Mu genome itself. Given this nearly random sequence preference, how does Mu avoid transposing into its own DNA, a situation that would likely result in serious disruptions of the phage's genes? This problem is solved because Mu transposition is regulated by a process called transposition target immunity (Figure 11-31). DNA sites surrounding a copy of the Mu element, including the element's own DNA, are rendered very poor targets for a new transposition event. Interplay between the MuA transposase and the MuB ATPase is at the center of the mechanism of transposition target immunity. MuA-MuB interactions prevent MuB from binding to the DNA near where MuA is bound. The interactions responsible for this interplay are

• MuA inhibits MuB from binding to nearby DNA sites. This inhibition requires ATP hydrolysis.
• MuB helps MuA find a target site for transposition.

To see how individual protein-protein and protein-DNA interactions function together to generate target immunity, consider transposition into two candidate DNA segments: one is any representative segment of DNA, whereas the second has a copy of Mu already inserted (see Figure 11-31). We will call the first DNA segment the naïve region and the second DNA segment the immune region. What happens at each of these DNA regions as Mu prepares to transpose? First we consider events at the naïve region. MuB, in complex with ATP (MuB-ATP), will bind the DNA using its nonspecific DNA-binding activity. At the same time, MuA transposase will assemble a transpososome on the Mu DNA. This MuA in the transpososome can then make protein-protein contacts with the MuB-DNA complex at the naïve region. As a result of this interaction, MuB delivers this DNA to MuA for use as a target site.

FIGURE 11-31 The interplay between MuA and MuB on DNA leads to the development of an immune target DNA. The MuA-binding sites are in the terminal inverted repeats on the ends of the transposon (shown in dark green). MuA is shown bound to only one of the two repeat regions for clarity. Every time MuB hydrolyzes ATP it dissociates from the DNA. MuB bound to ATP is shown in the darker green; MuA-MuB contact stimulates this hydrolysis reaction. Although shown contacting only two molecules of MuB, MuA will preferentially contact all the MuB bound within close proximity to its DNA-binding site. DNA lengths of 5 to 15 kb can be rendered "immune" by a single MuA-bound terminal inverted repeat sequence. (Panel labels: naïve DNA, good target; immune DNA, very poor target.)

In contrast, both MuA and MuB bind to DNA in the immune region. MuA interacts with its specific binding sites on the Mu genome that is already present; MuB-ATP again binds using its affinity for any DNA sequence. However, when both MuA and MuB are bound to this region, they will interact. As a result, MuA stimulates ATP hydrolysis by MuB and the dissociation of MuB from this DNA. MuB therefore does not accumulate on this immune DNA segment. By this means, the Mu transposition proteins use the energy stored in ATP to protect the Mu genome from becoming the target of transposition. As expected from this mechanism, even a single MuA binding site within a DNA molecule is sufficient to impart target immunity.

Transposition target immunity is observed for a number of different transposable elements and can work over very long distances. For Mu, sequences within approximately 15 kb of an existing Mu insertion are immune to new insertions. For some elements (for example, Tn3 and Tn7) target immunity occurs over distances greater than 100 kb. Target immunity protects an element from transposing into itself, or from having another new copy of the same type of element insert into its genome. Furthermore, this type of regulation of target DNA selection also provides a driving force for elements to move to new locations "far" from where they are initially inserted, a feature that may also be advantageous for their overall propagation and survival.
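The distance rule for Mu target immunity can be made concrete with a small sketch. The following Python is a toy model only, not a description of the biochemistry: it assumes a hard cutoff of 15 kb (the approximate figure quoted above for Mu), and the function names `is_immune` and `classify_targets` are hypothetical names chosen for illustration.

```python
# Toy model of Mu transposition target immunity (illustrative assumption:
# immunity is treated as a hard distance cutoff, which the real process is not).
IMMUNITY_RANGE_BP = 15_000  # approximate range quoted in the text for Mu


def is_immune(site_bp, mu_insertions_bp, immunity_range_bp=IMMUNITY_RANGE_BP):
    """Return True if a candidate site lies within the immunity range of any
    existing Mu insertion on the same DNA molecule."""
    return any(abs(site_bp - mu) <= immunity_range_bp for mu in mu_insertions_bp)


def classify_targets(candidate_sites_bp, mu_insertions_bp):
    """Label each candidate site using the wording of Figure 11-31."""
    return {
        site: "very poor target" if is_immune(site, mu_insertions_bp) else "good target"
        for site in candidate_sites_bp
    }


if __name__ == "__main__":
    # One existing Mu insertion at position 5,000 bp (hypothetical coordinates).
    print(classify_targets([1_000, 40_000], [5_000]))
```

For an element such as Tn3 or Tn7, the same sketch could be called with a larger `immunity_range_bp` (for example 100,000), again only as a rough stand-in for the distances mentioned in the text.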
Tc1/mariner Elements Are Extremely Successful DNA Elements in Eukaryotes

Recognizable members of the Tc1/mariner family of elements are widespread in both invertebrate and vertebrate organisms. Elements in this family are the most common DNA transposons present in eukaryotes. Although these elements are clearly related, members isolated from different organisms have distinguishing features and are named differently. For example, elements from the worm C. elegans are called Tc elements, whereas the original element named Mariner was isolated from a Drosophila species.

Tc1/mariner elements are among the simplest autonomous transposons known. Typically, they are 1.5 to 2.5 kb long and carry only a pair of terminal inverted repeat sequences (the site of transposase binding) and a gene encoding a transposase protein of the DDE transposase superfamily (see above). In contrast to many transposons, no accessory proteins are required for transposition, although the final steps of recombination do require cellular DNA repair proteins. This simplicity in structure and mechanism may be responsible for the huge success of these elements in such a wide range of host organisms.
Tc1/mariner elements move by a cut-and-paste transposition mechanism (Figure 11-20). The transposon DNA is cleaved out of the old flanking host DNA using pairs of cleavages that are staggered by two base pairs. These elements strongly prefer to insert into DNA sites with the sequence TA. Obviously, this is a very common sequence.

What happens to the "empty" site in the host chromosome when a transposon excises? In the case of Tc1/mariner elements, DNA sequence analysis of some sites that once carried a transposon reveals that sometimes the broken DNA ends are filled in (by repair DNA synthesis) and then directly joined (see the discussion on nonhomologous end joining in Chapter 9). These repair reactions result in the incorporation of a few extra base pairs of DNA at the old insertion site. These small DNA insertions are known as "footprints," as they are the traces left by a transposon that has "traveled through" a site in the genome.

In contrast to many transposons, the transposition of Tc1/mariner elements is not well regulated. Perhaps as a result of this lack of control, many elements found by genome sequencing are "dead," that is, unable to transpose. For example, many elements carry mutations in the transposase gene that inactivate it. Using a large number of sequences from both inactive and active elements, researchers constructed an artificial hyperactive Tc1/mariner element. This element, named Sleeping Beauty, transposes at very high frequencies compared to naturally isolated elements. Sleeping Beauty is promising as a tool for mutagenesis and DNA insertion in many eukaryotic organisms. Furthermore, this reconstruction experiment reveals that the frequency of transposition by Tc1/mariner elements is naturally kept at bay due to the suboptimal activity of their transposase proteins.
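To see why a TA target site is "very common," it can help to count candidate sites in a sequence. The sketch below is illustrative only: `find_ta_sites` and `expected_ta_count` are hypothetical helper names, and the expected count assumes a random sequence with equal base frequencies, which real genomes do not strictly have.

```python
def find_ta_sites(sequence):
    """Return the 0-based positions of every TA dinucleotide in a DNA sequence."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def expected_ta_count(length_bp):
    """Expected number of TA dinucleotides in a random sequence, assuming each
    base occurs with probability 1/4 (so TA occurs at 1 in 16 positions)."""
    return max(length_bp - 1, 0) / 16


if __name__ == "__main__":
    print(find_ta_sites("GATTACATA"))
    print(expected_ta_count(1_000_000))
```

Under the equal-frequency assumption, a TA site is expected roughly once every 16 bp, so target-site sequence alone places very little restriction on where these elements can insert.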
Yeast Ty Elements Transpose into Safe Havens in the Genome

The Ty elements (transposons in yeast), prominent transposons in yeast, are viral-like retrotransposons. In fact, their similarity to retroviruses extends beyond their mechanism of transposition: Ty RNA is found in cells packaged into viral-like particles (Figure 11-32). Thus, these elements seem to be viruses that cannot escape one cell and infect new cells. There are many types of well-studied Ty elements; for example, S. cerevisiae carries members of the Ty1, Ty3, Ty4, and Ty5 classes (although the Ty5 elements in this yeast species all appear to be inactive). Each of these classes of Ty elements promotes its own mobility but does not mobilize elements of another class.

Ty elements preferentially integrate into specific chromosomal regions (Figure 11-33). For example, Ty1 elements nearly always transpose into DNA within ~200 bp upstream of a start site for transcription by the host RNA polymerase III enzyme (see Chapter 12). RNA Pol III specifically transcribes tRNA genes, and most Ty1 insertions are near these genes. Ty3 integration is also tightly linked to Pol III promoters. In this case, integration is precisely targeted to the start site of transcription (± 2 bp). In contrast, Ty5 preferentially integrates into regions of the genome that are in a silenced, transcriptionally quiescent state. Silenced regions targeted by Ty5 include the telomeres and the silent copies of the mating-type loci (see Chapter 10).

In all these cases, the mechanism of regional target-site selection involves the formation of specific protein-protein complexes between the element's integrase (bound in a complex to the cDNA) and host-specific proteins bound to these chromosomal sites. For example, Ty5 integrase forms a specific complex with the DNA silencing protein Sir4 (see Chapter 17).

Why do Ty elements exhibit this regional target-site preference? It is proposed that this target specificity enables the transposons to persist in a host organism by focusing most of their insertions away from important regions of the genome that are involved directly in coding for proteins. The use of this type of targeted transposition may be especially important in organisms with small, gene-rich genomes, such as yeast.

FIGURE 11-32 Yeast Ty elements packaged into virus particles. (a) An electron micrograph of S. cerevisiae cells overexpressing Ty1 virus-like particles. The particles are seen as oval, electron-dense structures. (b) Cryoelectron microscopy showing the three-dimensional reconstructions of Ty1 virions. These Ty1 elements ... ASM Press, Washington, D.C. (b) Also courtesy of H. Saibil.)

FIGURE 11-33 Clustered integration sites observed for Ty elements. Each colore... (Map of chromosome III, 0 to 500 kb, showing the positions of tRNA genes and of Ty1, Ty2, Ty3, Ty4, and Ty5 insertions.)
LINEs Promote Their Own Transposition and Even Transpose Cellular RNAs

The autonomous poly-A retrotransposons known as LINEs are abundant in the genomes of vertebrate organisms. In fact, about 20 percent of the human genome is composed of LINE sequences. These elements were first recognized as a family of repeat sequences. Their name is derived from this initial identification: LINE is the acronym for long interspersed nuclear element. L1 is one of the best understood LINEs in humans.

In addition to promoting their own mobility, LINEs also donate the proteins needed to reverse transcribe and integrate another related class of repeat sequences, the nonautonomous poly-A retrotransposons, known as short interspersed nuclear elements (SINEs). Genome sequences reveal, once again, the presence of huge numbers of these elements, which are typically only between 100 and 400 bp in length. The Alu sequence is an example of a widespread SINE in the human genome. A comparison of the structures of typical LINE and SINE elements is shown in Figure 11-34.

The sequences of LINEs and SINEs look like simple genes. In fact, the cis-acting sequences important for transposition simply include a promoter, to direct transcription of the element into RNA, and a poly-A sequence. Recall that these A residues pair with the DNA at the target site to help generate the primer terminus for reverse transcription (see Figure 11-23).
FIGURE 11-34 Genetic organization of a typical LINE and SINE. Note the variable-length poly-A sequence at the right end of the elements. This is a defining feature of the poly-A retrotransposons. These elements are also flanked by target-site duplications that are variable in length (blue arrows). Sequence elements are not shown to scale. Both types of elements also carry promoter sequences; see Figures 11-19 and 11-26. (Diagram labels: LINE with 5' UTR, ORF1, ORF2 (EN, RT-RNase H), 3' UTR, and poly-A; SINE; target site duplication.) (Source: From Bushman F. 2002. Lateral DNA transfer, p. 251, f 8A. © 2002 Cold Spring Harbor Laboratory Press.)

These simple sequence requirements for transposition pose a problem for LINEs: how do they avoid transposing cellular mRNA molecules? All genes have a promoter, and most are transcribed into an mRNA that will carry a poly-A sequence at the 3' end of the molecule (Chapter 12). Thus, any mRNA should be an attractive "substrate" for transposition.

In fact, genome sequences provide clear evidence for transposition of cellular RNA via the target-primed reverse transcription mechanism. For many cellular genes, there are additional copies of a highly related sequence in the genome. These copies appear to have lost their promoter and their introns (regions of sequence present within a gene but removed from the mRNA by RNA splicing; see Chapter 13), and often carry truncations near their 5' ends. These sequences are known as processed pseudogenes and usually are not expressed by the cell. These pseudogenes are often flanked by short repeats in the target DNA. This structure is exactly that expected of LINE-promoted transposition of a cellular mRNA.

Although transposition of cellular RNAs can occur, it is a rare event. The principal mechanism used to avoid this process is that the LINE-encoded proteins bind immediately to their own RNA during translation (see Figure 11-23). Thus, they show a strong bias to catalyzing reverse transcription and integration of the RNA that encoded them.

V(D)J RECOMBINATION

We have seen that transposition is involved in the movement of many different genetic elements. Cells, however, have also harnessed this recombination mechanism for functions that directly help the organism. The best example is V(D)J recombination, which occurs in the cells of the vertebrate immune system.

The immune system of vertebrates has the job of recognizing and fending off invading organisms, including viruses, bacteria, and pathogenic eukaryotes. Vertebrates have two specialized cell types dedicated to recognizing these invaders: B cells and T cells. B cells produce antibodies that circulate in the bloodstream, whereas T cells produce cell surface-bound receptor proteins (called T cell receptors). Recognition of a "foreign" molecule by either of these classes of proteins starts a cascade of events focused on destruction of the invader. To fulfill their functions successfully, antibodies and T cell receptors must be able to recognize an enormously diverse group of molecules. The principal mechanism cells use to generate antibodies and T cell receptors with such diversity relies on a specialized set of DNA rearrangement reactions known as V(D)J recombination.
Antibody and T cell receptor genes are composed of gene segments that are assembled by a series of sequence-specific DNA rearrangements. To understand how this recombination generates the needed diversity, we need to look at the structure of an antibody molecule (Figure 11-35); T cell receptors have a similar modular structure. A genomic region encoding an antibody molecule is shown in Figure 11-36. Antibodies are constructed of two copies each of a light chain and a heavy chain. The part of the protein that interacts with foreign molecules is called the antigen-binding site. This binding region is constructed from VL and VH domains of the antibody molecule, shown in Figure 11-35. The "V" signifies that the protein sequence in this region is highly variable. The remaining domains of the antibody are called "C," or constant, regions and do not differ among different antibody molecules.

Figure 11-36a shows the genomic region encoding an antibody light chain (from a mouse), called the kappa locus. This region carries about 300 gene segments coding for different versions of the light-chain VL protein region. There are also four gene segments encoding a short region of protein sequence called the J region, followed by a single coding region for the CL domain. By the mechanism we shall describe below, V(D)J recombination can fuse the DNA between any pair of V and J segments. Thus, as a result of recombination, 1,200 variants of the antibody light chain (about 300 V segments × 4 J segments) can be produced from this single genomic region. These segments are then brought together with the CL coding region by RNA splicing (Chapter 13).

The situation for assembly of the gene segments encoding the antibody heavy chain is similar. In this case, however, there is an additional type of gene segment, called D (for diversity) (Figure 11-36c). Heavy-chain genes can be very complex. For example, a specific heavy-chain locus in a mouse has more than 100 V regions, 12 D regions, and 4 J regions. V(D)J recombination can assemble this gene to generate more than 4,800 different protein sequences. Because functional antibodies can be constructed from any pair of light and heavy chains, the diversity generated by recombination at the light- and heavy-chain loci has a multiplicative impact on protein structure.

FIGURE 11-35 Structure of an antibody molecule. The two light chains are shown in pink, whereas the heavy chains are in blue. The variable and constant regions are labeled on the left side of the molecule only. Note that the antigen binding region is formed at the interface between the VL and VH domains. (Harris L.J., Skaletsky E., and McPherson A. 1998. J. Mol. Biol. 275: 861-872.) Image prepared with BobScript, MolScript, and Raster 3D.

FIGURE 11-36 Overview of the process of V(D)J recombination. The top panels show the steps involved in producing the light chain. (a) DNA in cells that have not experienced V(D)J recombination (germ-line DNA). (b) Recombination between two specific gene segments (V3 and J3) as occurs during B-cell development. This is only one of the many types of recombination events that can occur in different pre-B cells. The recombined locus is then transcribed and the RNA spliced (Chapter 13) to juxtapose a constant-region gene segment. This mRNA is then translated to generate the light-chain protein. (c) Schematic of the even more complex heavy-chain genetic region, with its additional "D" gene segments and multiple types of constant-region segments (Cμ, Cδ, etc.). (Panel labels: germ-line DNA; DNA rearrangement during B cell differentiation; B cell DNA; transcription; transcribed RNA; RNA splicing; spliced mRNA; translation; light-chain protein.) (Source: From Bushman F. 2002. Lateral DNA transfer, p. 345, f 11.3. © 2002 Cold Spring Harbor Laboratory Press.)
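The multiplicative arithmetic behind these diversity figures can be checked with a short sketch. The segment counts are the approximate values quoted above for the mouse loci ("about 300" V segments for the kappa light chain, "more than 100" V regions for the heavy chain), so the results are rough lower-bound estimates; `count_variants` is a hypothetical helper name.

```python
from math import prod


def count_variants(*segment_counts):
    """Number of distinct joined genes when one segment of each type is
    chosen independently (simple product rule)."""
    return prod(segment_counts)


# Approximate segment counts quoted in the text (mouse loci).
light_chain_variants = count_variants(300, 4)       # V x J
heavy_chain_variants = count_variants(100, 12, 4)   # V x D x J

# Any light chain can pair with any heavy chain, so the totals multiply.
antibody_variants = light_chain_variants * heavy_chain_variants

if __name__ == "__main__":
    print(light_chain_variants)   # 1200
    print(heavy_chain_variants)   # 4800
    print(antibody_variants)      # 5760000
```

This counts only the combinations of gene segments from these two loci; it does not attempt to model any other source of sequence variation.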
The Early Events in V(D)J Recombination Occur by a Mechanism Similar to Transposon Excision

Recombination sequences, called recombination signal sequences, flank the gene segments that are assembled by V(D)J recombination. These signals all have two highly conserved sequence motifs, one 7 bp (the 7-mer) and the second 9 bp (the 9-mer) in length (Figure 11-37). These motifs are bound by the recombinase (see below). The recombination signal sequences come in two classes. One class has the 7-mer and 9-mer motifs spaced by 12 bp of sequence, whereas the second class has these motifs spaced by 23 bp (Figure 11-37a). Recombination always occurs between a pair of recombination signal sequences in which one partner has the 12 bp "spacer" and the other partner has the 23 bp "spacer." These pairs of recombination signal sequences are organized as inverted repeats flanking the DNA segments that are destined to be joined (Figure 11-37b).

The recombinase responsible for recognizing and cleaving the recombination signal sequences is composed of two protein subunits called RAG1 and RAG2 (RAG for Recombination Activating Gene). These proteins function in a manner very similar to a transposase (Figure 11-38). They recognize the recombination signal sequences and pair the two sites to form a protein-DNA synaptic complex. The RAG1 proteins within this complex then introduce single-stranded breaks in the DNA at each of the junctions between the recombination signal sequence ...

FIGURE 11-37 Recombination signal sequences recognized in V(D)J recombination. (a) Close-up of the two types of recombination signal sequences (RSS). The 12 bp spacer is shown in blue, the 23 bp spacer in green, and the conserved 7-mer and 9-mer sequence elements, shared by both types of sequences, are in yellow. The nucleotide sequence in the spacer region is not important. The length, however, is critical. (b) Examples of RSS arrangements in the genetic regions encoding antibodies (Ig genes) and T-cell receptor proteins (TCR genes). (Source: (a) From Bushman F. 2002. Lateral DNA transfer, p. 346, f 11.5. © 2002 Cold Spring Harbor Laboratory Press.)
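The pairing rule for recombination signal sequences described above (one partner with a 12 bp spacer, the other with a 23 bp spacer) can be written as a one-line check. This is only a sketch of the rule as stated in the text; `can_recombine` is a hypothetical function name, and spacer length is the only property it considers.

```python
VALID_SPACER_PAIR = {12, 23}  # spacer lengths (bp) of the two RSS classes


def can_recombine(spacer_a_bp, spacer_b_bp):
    """Return True only when one signal sequence has a 12 bp spacer and the
    other has a 23 bp spacer, in either order."""
    return {spacer_a_bp, spacer_b_bp} == VALID_SPACER_PAIR


if __name__ == "__main__":
    for pair in [(12, 23), (23, 12), (12, 12), (23, 23)]:
        print(pair, can_recombine(*pair))
```

Comparing sets rather than ordered pairs reflects that the rule does not depend on which partner carries which spacer.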
FIGURE 11-38 The V(D)J recombination pathway: cleavages occur by a mechanism similar to transposon excision. The recombinases c... of the signal sequences, leaving a free 3'OH. Each 3'OH then initiates attack on the opposite strands to form a hairpin intermediate (see Figure 11-22b). The hairpin structures are subsequently hydrolyzed and then joined together to form a coding joint between the V and J regions. The two ends carrying recombination signal sequences are also joined to form a signal joint. The former structure undergoes further recombination, whereas the latter is discarded. (Panel labels: recombination signal sequences; hairpin opening; signal joint formation; fusion of coding segments; DNA nonhomologous end joining proteins; further recombination; discarded.) (Source: From Bushman F. 2002. Lateral DNA transfer, p. 348, f 11.6. © 2002 Cold Spring Harbor Laboratory Press.)
... appears to have some similarity to the DDE transposase protein family. These observations, together with many others, have provided overwhelming evidence for the proposal that V(D)J recombination, now a critical feature of the immune system of higher animals, evolved from a DNA transposon. This conclusion speaks to the critical importance of transposable elements in the evolution of cellular genomes.
SUMMARY

Although DNA is normally thought of as a very static molecule that archives the genetic material, it is also subject to numerous types of rearrangements. Two classes of genetic recombination (conservative site-specific recombination and transposition) are responsible for many of these events.

Conservative site-specific recombination occurs at defined sequence elements in the DNA. Recombinase proteins recognize these sequence elements and act to cleave and join DNA strands to rearrange DNA segments containing the recombination sites. Three types of rearrangements are common: DNA insertion, DNA deletion, and DNA inversion. These rearrangements have many functions, including insertion of a viral genome into that of the host cell during infection, resolving DNA multimers, and altering gene expression. The organization of the recombination sites on the DNA as well as the participation of DNA architectural proteins dictate the outcome of a specific recombination reaction. The architectural proteins function to bend DNA segments and can have a large influence on the reactions occurring on a specific region of DNA.

There are two families of conservative site-specific recombinases. Both families cleave DNA using a protein-DNA covalent intermediate. For the serine recombinases, this linkage is via an active-site serine residue; for the tyrosine recombinases, it is via a tyrosine. Structures of the tyrosine recombinases yield many insights into the details of the recombination mechanism.

Transposition is a class of recombination that moves mobile genetic elements, called transposons, to new genomic sites. There are three major classes of transposons: DNA transposons, viral-like retrotransposons, and poly-A retrotransposons. The DNA transposons exist as DNA throughout a cycle of transposition. They move either by a cut-and-paste recombination mechanism, which involves an excised transposon intermediate, or a replicative mechanism. The two classes of retrotransposons move using an RNA intermediate. These "retro" elements require the RNA-dependent DNA polymerase, called reverse transcriptase, as well as a recombinase protein for mobility.

Transposons are present in the genomes of all organisms, where they can constitute a huge fraction of the total DNA sequence. They are a major cause of mutations and genome rearrangements. Transposition is often regulated to help ensure that transposons don't cause too much of a disruption to the genome of the host cell. Control of transposon copy number and regulation of the choice of new insertion sites ... transposition-like mechanism can be used for other types of DNA rearrangement reactions. The prime example of this is the V(D)J recombination reaction, responsible for assembly of gene fragments during development of the vertebrate immune system.
BIBLIOGRAPHY

Books

Bushman F. 2002. Lateral DNA transfer: Mechanisms and consequences. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Craig N.L., Craigie R., Gellert M., and Lambowitz A.M., eds. 2002. Mobile DNA II. American Society for Microbiology, Washington, DC.

Site-Specific Recombination

Baker T.A. 1991. "... and then there were two." Nature 353: 794-795.

Chen Y. and Rice P.A. 2003. New insight into site-specific recombination from FLP recombinase-DNA structures. Annu. Rev. Biophys. Biomol. Struct. 32: 135-159.

Hallet B. ...

Transposition

Haren L., Ton-Hoang B., and Chandler M. 1999. Integrating DNA: Transposases and viral integrases. Annu. Rev. Microbiol. 53: 245-281.

Plasterk R. 1995. The Tc1/mariner transposon family. Current Topics in Microbiol. Immunol. 204: 125-143.

Prak E.T.L. and Kazazian H.H., Jr. 2000. Mobile elements in the human genome. Nat. Rev. Genet. 1: 134-144.

Rice P.A. and Baker T.A. 2001. Comparative architecture of transposase and integrative complexes. Nat. Struct. Biol. 8: 302-307.

Smit A.F.A. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Op. Genet. Dev. 9: 657-663.

Williams T.L. ...

V(D)J Recombination

Fugmann S.D., Lee A.I., Shockett P.E., Villey I.J., and Schatz D.G. 2000. The RAG proteins and V(D)J recombination: Complexes, ends, and transposition. Annu. Rev. Immunol. 18: 495-527.

Gellert M. 2002. V(D)J recombination: RAG proteins, repair factors, and regulation. Annu. Rev. Biochem. 71: 101-132.
PART 3
EXPRESSION OF THE GENOME

PART OUTLINE
Chapter 12 Mechanisms of Transcription
Chapter 13 RNA Splicing
Chapter 14 Translation
Chapter 15 The Genetic Code
Part 3 is concerned with one of the great challenges in understanding the gene: how the gene is expressed. In other words, how is information in the form of the linear sequence of nucleotides in a polynucleotide chain converted into the linear sequence of amino acids in a polypeptide chain? The flow of information from genes to proteins, and the concept that information flow is unidirectional, is known as the central dogma and was enunciated by Francis Crick in 1958:

The central dogma states that once "information" has passed into protein it cannot get out again. The transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein, may be possible, but transfer from protein to protein, or from protein to nucleic acid, is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.

Chapters 12 through 15 trace the flow of information from the copying of the gene into an RNA replica known as the messenger RNA to the decoding of the messenger RNA into a polypeptide chain.

The process by which nucleotide sequence information is transferred from DNA to RNA is known as transcription, and this is the subject of Chapter 12. A multi-subunit molecular machine known as RNA polymerase creates a moving "bubble" in the double helix in which DNA is unwound at the leading edge of the bubble and rewound into a helix at the trailing edge. The RNA polymerase uses one of the two transiently separated DNA strands within the bubble as a template upon which it progressively builds a complementary RNA copy by base-pairing. The messenger RNA is created in a similar manner in all cells. But, while the basic enzyme that makes the RNA is very similar, the rest of the machinery involved in transcription in eukaryotes is more complex than its counterpart in prokaryotes. Sequences in the DNA that determine where transcription starts (promoter) and where it stops (terminator) are also described.

In prokaryotes, once the messenger RNA is synthesized, it is ready for the next stage of information flow in which RNA is used as a template for protein synthesis. But not in eukaryotes: there the RNA product of transcription must undergo a series of maturation events before it is competent to serve as a messenger RNA. Two of these (the addition of the so-called "cap" structure to the 5' end, and of a poly-A tail to the 3') are described in Chapter 12. The most dramatic processing event is called mRNA splicing, and is described in Chapter 13. Genes in eukaryotic cells are frequently interrupted by one, or sometimes many, nonprotein-coding segments known as introns. When the gene is transcribed into an RNA copy, these introns must be removed so that the protein-coding segments, known as exons, can be joined to each other to create a contiguous protein-coding sequence. Chapter 13 describes the elaborate molecular machine responsible for removing introns with great precision.

Part 3 culminates, in Chapters 14 and 15, with the process known as translation. This is the process whereby genetic information, in the form of the sequence of nucleotides in messenger RNA, is used to direct the ordered incorporation of amino acids into the polypeptide chain of a protein. Chapter 14 describes the four principal participants in translation: the coding sequence in messenger RNA; adaptor molecules known as tRNAs; enzymes known as aminoacyl tRNA synthetases that load amino acids onto the tRNA adaptors; and the protein-synthesizing factory itself, the ribosome, which is composed of RNA and protein. The remainder of the chapter describes how these four components, with help from a number of key auxiliary factors, manage the remarkable process of converting the nucleotide code of a given mRNA into the correct order of amino acids in its protein product.

Finally, Chapter 15 describes the classic experiments that led to the elucidation of the genetic code, and lays out the rules by which the code is translated. The nucleotide sequence information is based on a three-letter code, while the protein sequence information is based on twenty different amino acids. The code is degenerate with two or more codons (in most cases) specifying the same amino acid. There are also specific codons that indicate where translation should start and where it should stop.
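The counting argument behind a degenerate three-letter code can be sketched in a few lines. The example below simply enumerates all possible codons over the four RNA bases; `codons` is a hypothetical helper name, and no codon-to-amino-acid assignments are assumed.

```python
from itertools import product

BASES = "ACGU"           # the four RNA nucleotides
NUM_AMINO_ACIDS = 20     # number of amino acids stated in the text


def codons(length=3):
    """Enumerate every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


if __name__ == "__main__":
    print(len(codons(2)))  # 16: too few words to specify 20 amino acids
    print(len(codons(3)))  # 64: more than 20, so some amino acids share codons
```

With 64 possible triplets and only twenty amino acids (plus start and stop signals) to specify, some amino acids must be encoded by more than one codon, which is what "degenerate" means here.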
PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES
Richard Roberts, 1977 Symposium on Chromatin. Much of Roberts' rese...

David Baltimore, François Jacob, and Walter Gilbert, 1985 Symposium on the Molecular Biology of Development. Baltimore co-discovered, with Howard Temin, the enzyme reverse transcriptase, which makes DNA using RNA as a template (Chapter 11). Jacob, with Jacques Monod, proposed the basic model for how gene expression is regulated (Chapter 16) and also proposed a model for how DNA replication is regulated (Chapter 8). Gilbert provided biochemical validation for aspects of the Jacob and Monod model of gene regulation; he also invented a chemical method for sequencing DNA (Chapter 20). They each separately shared in Nobel Prizes, in 1975, 1965, and 1980, respectively.

Sydney Brenner and James Watson, 1975 Symposium on The Synapse. Brenner, shown here with Watson, contributed to the discoveries of mRNA and the nature of the genetic code (Chapter 2); his share of a Nobel Prize, in 2002, however, was for establishing the worm, C. elegans, as a model system for the study of developmental biology (Chapter 21).

Francis Crick, 1963 Symposium on Synthesis and Structure of Macromolecules. In addition to his role in solving the structure of DNA, Crick was an intellectual driving force in the development of molecular biology during the field's critical early years. His "adaptor hypothesis" (published in the RNA Tie Club newsletter) predicted the existence of molecules required to translate the genetic code of RNA into the amino acid sequence of proteins. Only later were tRNAs found to do just that (Chapter 14).

Phillip Sharp, 1974 Symposium on Tumor Viruses. Sharp and Richard Roberts shared the 1993 Nobel Prize in Medicine for showing that many eukaryotic genes are "split," that is, their coding regions are interrupted by stretches of non-coding DNA. The non-coding regions are removed from the RNA copy by "splicing" (Chapter 13). Sharp is shown here with his wife Anne.

Paul Zamecnik, 1969 Symposium on The Mechanism of Protein Synthesis. Zamecnik developed in vitro systems of protein synthesis that proved critical to underst...
CHAPTER 12
Mechanisms of Transcription
Up to this point we have been considering maintenance of the genome, that is, how the genetic material is organized, protected, and replicated. We now turn to the question of how that genetic material is expressed, that is, how the series of bases in the DNA directs the production of the RNAs and proteins that perform cellular functions and define cellular identity. In the next few chapters we will describe the basic processes responsible for gene expression: transcription, RNA processing, and translation.

Transcription is, chemically and enzymatically, very similar to DNA replication (Chapter 8). Both involve enzymes that synthesize a new strand of nucleic acid complementary to a DNA template strand. There are some important differences, of course, most notably that in the case of transcription the new strand is made from ribonucleotides rather than deoxyribonucleotides (see Chapter 6). Other mechanistic features of transcription that differ from replication include the following:
• RNA polymerase (the enzyme that catalyzes RNA synthesis) does not need a primer; rather, it can initiate transcription de novo (though in vivo initiation is permitted only at certain sequences, as we will see).

• The RNA product does not remain base-paired to the template DNA strand; rather, the enzyme displaces the growing chain only a few nucleotides behind where each ribonucleotide is added (Figure 12-1). This displacement is critical for the RNA to be (as is typically the case) translated to produce its protein product. Furthermore, because this release follows so closely behind the site of polymerization, multiple RNA polymerase molecules can transcribe the same gene at the same time, each following closely along behind another. Thus, a cell can synthesize large numbers of transcripts from a single gene (or other DNA sequence) in a short time.

• Transcription, though very accurate, is less accurate than replication (one mistake occurs in 10,000 nucleotides added, compared to one in 10,000,000 for replication). This difference reflects the lack of extensive proofreading mechanisms for transcription, although two forms of proofreading for RNA synthesis do exist.

It makes sense for the cell to worry more about the accuracy of replication than of transcription. DNA is the molecule in which the genetic material is stored, and DNA replication is the process by which that genetic material is passed on. Any mistake that arises during replication can therefore easily be catastrophic: it becomes permanent in the genome of that individual and also gets passed on to subsequent generations. Transcription, in contrast, produces only transient copies and normally several from each transcribed region. Thus, a mistake during transcription will rarely do more harm than render one out of many transient transcripts defective.
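The two error rates quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: the 1,000-nucleotide length is a hypothetical value chosen for the example, and the calculation assumes that mistakes occur independently at each position, which the text does not claim.

```python
import math

# Error rates per nucleotide added, as quoted in the text.
TRANSCRIPTION_ERROR_RATE = 1 / 10_000
REPLICATION_ERROR_RATE = 1 / 10_000_000


def expected_errors(length_nt: int, error_rate: float) -> float:
    """Average number of mistakes in one copy of a stretch of length_nt."""
    return length_nt * error_rate


def prob_error_free(length_nt: int, error_rate: float) -> float:
    """Chance that one copy has no mistakes, assuming independent errors."""
    return (1 - error_rate) ** length_nt


if __name__ == "__main__":
    length = 1_000  # hypothetical gene length, for illustration only
    for name, rate in [("transcription", TRANSCRIPTION_ERROR_RATE),
                       ("replication", REPLICATION_ERROR_RATE)]:
        print(name,
              "expected errors:", expected_errors(length, rate),
              "error-free fraction:", round(prob_error_free(length, rate), 4))
    print("fold difference in error rate:",
          round(TRANSCRIPTION_ERROR_RATE / REPLICATION_ERROR_RATE))
```

Under these assumptions most transcripts of such a gene are still error-free, and the occasional defective one is diluted by the many correct copies made from the same region.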
OUTLINE
• RNA Polymerases and the Transcription Cycle
• The Transcription Cycle in Bacteria
• Transcription in Eukaryotes
FIGURE 12-1 Transcription of DNA into RNA. The figure shows, in the absence of the enzymes involved, how the DNA double helix is unwound and
Beyond these mechanistic differences between DNA replication and transcription, there is one profound difference that reflects the different purposes served by these processes. Transcription selectively copies only certain parts of the genome and makes anything from one to several hundred, or even thousand, copies of any given section. In contrast, replication must copy the entire genome and do so once (and only once) every cell division (as we saw in Chapter 8). The choice of which regions to transcribe is not random: each typically includes one or more genes, and there are specific DNA sequences that direct the initiation of transcription at the start of each region and others at the end that terminate transcription.

Not only are different parts of the genome transcribed to different extents, but the choice of which part to transcribe, and how extensively, can be regulated. Thus, in different cells, or in the same cell at different times, different sets of genes might be transcribed. So, for example, two genetically identical cells in a human will, in many cases, transcribe different sets of genes, leading to differences in the character and function of those two cells (for example, one might be a muscle cell, the other a neuron). Or, a given bacterial cell will transcribe a different set of genes, depending on the medium in which it is growing. These questions of regulation are dealt with in Part 4.
RNA POLYMERASES AND THE TRANSCRIPTION CYCLE

RNA Polymerases Come in Different Forms, but Share Many Features

RNA polymerase performs essentially the same reaction in all cells, from bacteria to humans. It is thus not surprising that the enzymes from these organisms share many features, especially in those parts of the enzyme directly involved with catalyzing the synthesis of RNA. From bacteria to mammals, the cellular RNA polymerases are made up of multiple subunits (although some phage and organelles do encode single-subunit enzymes that perform the same task). Table 12-1 shows the numbers and sizes of subunits found in each case and also shows which subunits are conserved at the sequence level between different enzymes.

As can be seen from the table, bacteria have only a single RNA polymerase, while in eukaryotic cells there are three: RNA Pol I, II, and III. Pol II is the enzyme we will focus on when dealing with eukaryotic transcription in the second half of this chapter. That is because it is the most studied of these enzymes, and it is also the polymerase responsible for transcribing most genes: indeed,
TABLE 12-1 The Subunits of RNA Polymerases

Prokaryotic                  Eukaryotic
Bacterial    Archaeal        RNAP I        RNAP II       RNAP III
Core         Core            (Pol I)       (Pol II)      (Pol III)
β'           A'/A"           RPA1          RPB1          RPC1
β            B               RPA2          RPB2          RPC2
αI           D               RPC5          RPB3          RPC5
αII          L               RPC9          RPB11         RPC9
ω            K               RPB6          RPB6          RPB6
             [+6 others]     [+9 others]   [+7 others]   [+11 others]

Note: The subunits in each column are listed in order of decreasing molecular weight.
Source: Data adapted from Ebright R.H. 2000. J. Mol. Biol. 304: 687-698, Fig. 1, p. 688. © 2000 Academic Press.
essentially all protein-encoding genes. Pol I and Pol III are each involved in transcribing specialized, RNA-encoding genes. Specifically, Pol I transcribes the large ribosomal RNA precursor gene, whereas Pol III transcribes tRNA genes, some small nuclear RNA genes, and the 5S rRNA gene. We return to these enzymes at the end of the chapter.

The bacterial RNA polymerase core enzyme alone is capable of synthesizing RNA and comprises two copies of the α subunit and one each of the β, β', and ω subunits. That enzyme is closely related to the eukaryotic polymerases (see Table 12-1). Specifically, the two large subunits, β and β', are homologous to the two large subunits found in RNA Pol II (RPB1 and RPB2). The α subunits are homologous to RPB3 and RPB11, and ω to RPB6. The structure of a bacterial RNA polymerase core enzyme is similar to that of the yeast Pol II enzyme. These are shown side-by-side in Figure 12-2. Later we will describe some of the structural details that shed light on how these enzymes work. For now we just highlight some of the general features.

The bacterial and yeast enzymes share an overall shape and organization; indeed, they are more alike than the comparison of the subunit sequences would predict. This is particularly true of the internal parts, near the active site, and less so on the peripheries. The distribution of these similarities and differences presumably reflects, in the former case, the fact that the enzymes carry out the same function (synthesis of RNA on a DNA template), and in the latter case, that, to function in the cell, the two enzymes interact with other proteins and those are specific and different in the two cases, as we shall see.

Overall, the shape of each enzyme resembles a crab claw. This is reminiscent of the "hand" structure of DNA polymerases described in Chapter 8 (Figure 8-5). The two pincers of the crab claw are made up predominantly of the two largest subunits of each enzyme (β' and β for the bacterial case, RPB1 and RPB2 for the eukaryotic enzyme). The active site, which is made up of regions from both these subunits, is found at the base of the pincers within a region called the "active center cleft" (see Figure 12-2). The active site can bind two Mg2+ ions, consistent with the two-metal ion catalytic mechanism for nucleotide addition proposed for all types of polymerase (see Chapter 8).
FIGURE 12-2 Comparison of the crystal structures of prokaryotic and eukaryotic RNA polymerases. (a) Structure of RNA polymerase core enzyme from T. aquaticus. The subunits are colored as follows: β is shown in purple, β' in blue, the two α subunits in yellow and green, and ω in red. (Seth Darst, The Rockefeller University, personal communication.) (b) Structure of RNA polymerase II from the yeast S. cerevisiae. The subunits are colored to show their relatedness to those in the bacterial enzyme (see Table 12-1). Thus, RPB1 and 2 are shown in purple and blue respectively; RPB3 and 11 are shown in yellow and green; and RPB6 in red. (Cramer P., Bushnell D.A., and Kornberg R.D. 2001. Science 292: 1863.) Images prepared with MolScript, BobScript, and Raster 3D.
There are various channels that allow DNA, RNA, and ribonucleotides into and out of the enzyme's active center cleft. These we discuss later when considering the mechanisms of transcription.
Transcription by RNA Polymerase Proceeds in a Series of Steps

To transcribe a gene, RNA polymerase proceeds through a series of well-defined steps which are grouped into three phases: initiation, elongation, and termination. Here, and in Figure 12-3, we summarize the basic features of each phase.

Initiation. A promoter is the DNA sequence that initially binds the RNA polymerase (together with initiation factors in many cases). Once formed, the promoter-polymerase complex undergoes structural changes required for initiation to proceed. As in replication initiation, the DNA around the point where transcription will start unwinds, and the base
FIGURE 12-3 The phases of the transcription cycle: initiation, elongation, and termination. The figure shows the general scheme for the transcription cycle. The features shown hold for both bacterial and eukaryotic cases. Other factors required for initiation, elongation, and termination are not shown here, but are described later in the text. The DNA nucleotide encoding the beginning of the RNA chain is called the transcription start site and is designated the "+1" position. Sequences in the direction in which transcription proceeds are referred to as downstream of the start site. Likewise, sequences preceding the start site are referred to as upstream sequences. When referring to a specific position in the upstream sequence, this is given a negative value. Downstream sequences are allotted positive values.
[Panel labels: binding (closed complex); promoter "melting" (open complex); initial transcription; elongation; polymerase terminates and releases RNA.]
pairs are disrupted, producing a "bubble" of single-stranded DNA. Again like DNA replication, transcription always occurs in a 5' to 3' direction. That is, the new ribonucleotide is added to the 3' end of the growing chain. Unlike replication, however, only one of the DNA strands acts as a template on which the RNA strand is built. As RNA polymerase binds promoters in a defined orientation, the same strand is always transcribed from a given promoter.

The choice of promoter determines which stretch of DNA is transcribed and is the main step at which regulation is imposed. That is, the decision of whether or not to initiate transcription of a given gene is chiefly how a cell regulates which proteins it will make at any given time.

Elongation. Once the RNA polymerase has synthesized a short stretch of RNA (approximately ten bases), it shifts into the elongation phase. This transition requires further conformational changes in polymerase that lead it to grip the template more firmly. During elongation, the enzyme performs an impressive range of tasks in addition to the catalysis of RNA synthesis. It unwinds the DNA in front and re-anneals it behind, it dissociates the growing RNA chain from the template as it moves along, and it performs proofreading functions. Recall that during replication, in contrast, several different enzymes are required to catalyze a similar range of functions.

Termination. Once the polymerase has transcribed the length of the gene (or genes), it must stop and release the RNA product. This step is called termination. In some cells there are specific, well-characterized sequences that trigger termination; in others it is less clear what instructs the enzyme to cease transcribing and dissociate from the template.
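The strand rules summarized under Initiation (one template strand, RNA built 5' to 3' and complementary to that template) can be made concrete with a small sketch. The function names and the six-base sequence are hypothetical, chosen only for illustration; the sketch ignores promoters, start sites, and termination entirely.

```python
# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template strand.

    The template is given 3'->5', so reading it left to right follows the
    order in which ribonucleotides are added to the 3' end of the RNA.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)


def template_for(nontemplate_5_to_3: str) -> str:
    """Return the template strand (written 3'->5') paired with a nontemplate strand."""
    return "".join(DNA_COMPLEMENT[base] for base in nontemplate_5_to_3)


if __name__ == "__main__":
    nontemplate = "ATGGCC"                  # hypothetical sequence, 5'->3'
    template = template_for(nontemplate)    # 3'->5'
    rna = transcribe(template)              # 5'->3'
    print("template (3'->5'):", template)
    print("RNA      (5'->3'):", rna)
    # The RNA matches the nontemplate strand, with U in place of T.
    print(rna == nontemplate.replace("T", "U"))
```

Because the RNA is complementary to the template, its sequence is the same as that of the nontemplate strand with U substituted for T, which is why the choice of template strand fixes the transcript.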
Transcription Initiation Involves Three Defined Steps

The first phase in the transcription cycle, initiation, can itself be broken down into a series of defined steps (as indicated in Figure 12-3). The first step is the initial binding of polymerase to a promoter to form what is called a closed complex. In this form the DNA remains double-stranded, and the enzyme is bound to one face of the helix. In the second step of initiation, the closed complex undergoes a transition to the open complex, in which the DNA strands separate over a distance of some 14 bp around the start site to form the transcription bubble.

The opening up of the DNA frees the template strand. The first two ribonucleotides are brought into the active site, aligned on the template strand, and joined together. The enzyme then begins to move along the template strand, opening the DNA helix ahead of the site of polymerization and allowing it to reseal behind. In this way, subsequent ribonucleotides are incorporated into the growing RNA chain. Incorporation of the first ten or so ribonucleotides is a rather inefficient process, and at that stage the enzyme often releases short transcripts (each of less than ten or so nucleotides) and then begins synthesis again. Once an enzyme gets further than the 10 bp, it is said to have escaped the promoter. At this point it has formed a stable ternary complex, containing enzyme, DNA, and RNA. This is the transition to the elongation phase.

In the remainder of this chapter, we will describe the transcription cycle in more detail: first for the bacterial case, and then for eukaryotic systems.
THE TRANSCRIPTION CYCLE IN BACTERIA

Bacterial Promoters Vary in Strength and Sequence, but Have Certain Defining Features

The bacterial core RNA polymerase can, in principle, initiate transcription at any point on a DNA molecule. In cells, polymerase initiates transcription only at promoters. It is the addition of an initiation factor called σ that converts core enzyme into the form that initiates only at promoters. That form of the enzyme is called the RNA polymerase holoenzyme (Figure 12-4).

In the case of E. coli, the predominant σ factor is called σ70 (we will consider other, alternative σ factors in Chapter 16). Promoters recognized by polymerase containing σ70 share the following characteristic structure: two conserved sequences, each of six nucleotides, are separated by a nonspecific stretch of 17-19 nucleotides (Figure 12-5). The two defined sequences are centered, respectively, at about 10 base pairs and at about 35 base pairs upstream of the site where RNA synthesis starts. The sequences are thus called the -35 (minus 35) and -10 (minus 10) regions, or elements, according to the numbering scheme described in Figure 12-3, in which the DNA nucleotide encoding the beginning of the RNA chain is designated +1.

Although the vast majority of σ70 promoters contain recognizable -35 and -10 regions, the sequences are not identical. By comparing many different promoters, a consensus sequence can be derived (see Box 12-1, Consensus Sequences, for a discussion of how these are derived). The consensus sequence reflects preferred -10 and -35 regions, separated by the optimum spacing (17 bp). Very few promoters have this exact sequence, but most differ from it only by a few nucleotides.

Promoters with sequences closer to the consensus are generally "stronger" than those that match less well. By the strength of a promoter, we mean how many transcripts it initiates in a given time. That measure is influenced by how well the promoter binds polymerase
FIGURE 12-4 RNA polymerase holoenzyme from T. aquaticus. The RNA polymerase holoenzyme from Thermus aquaticus. Shown in gray is the core enzyme (the same enzyme shown in part (a) of Figure 12-2). In purple is the σ70 subunit (regions 2, 3, and 4; see Figure 12-6). On the right is region 2, at the top region 3, and at the bottom region 4. As described later in the text, it is σ regions 2 and 4 that recogniz
Image prepared with MolScript, BobScript, and Raster 3D.
FIGURE 12-5 Features of bacterial promoters. Various combinations of bacterial promoter elements are shown. Details of how each element contributes to polymerase binding and function are described in the text.
[Panels: (a) -35 element (6 bp), spacer (17-19 bp), -10 element (6 bp); (b) UP-element, -35, -10; (c) "extended -10" element, -10.]
initially, how efficiently it supports isomerization, and how readily the polymerase can then escape. The correlation between promoter strength and sequence explains why promoters are so heterogeneous: some genes need to be expressed more highly than others and the former are likely to have sequences closer to the consensus.

An additional DNA element that binds RNA polymerase is found in some strong promoters, for example those directing expression of the ribosomal RNA (rRNA) genes. This is called an UP-element (see Figure 12-5b) and increases polymerase binding by providing an additional specific interaction between the enzyme and the DNA.

Another class of σ70 promoters lacks a -35 region and instead has a so-called "extended -10" element. This comprises a standard -10 region with an additional short sequence element at its upstream end. Extra contacts made between polymerase and this additional sequence element compensate for the absence of a -35 region. As we will see in Chapter 16, the gal genes of E. coli use such a promoter.
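The promoter architecture described in this section (two hexamers separated by 17-19 nucleotides, with strength loosely tracking similarity to the consensus) lends itself to a simple scan. The sketch below is an illustration, not a validated promoter finder. It assumes the commonly cited σ70 consensus hexamers TTGACA (-35) and TATAAT (-10), which are not spelled out in this section, and it uses the number of matching positions as a crude, hypothetical stand-in for promoter strength; real strength also depends on isomerization and promoter escape, as noted above.

```python
# Commonly cited sigma-70 consensus hexamers (an assumption of this sketch;
# the section above describes them but does not list the sequences).
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"
HEXAMER = 6


def count_matches(candidate: str, consensus: str) -> int:
    """Number of positions at which candidate agrees with consensus."""
    return sum(a == b for a, b in zip(candidate, consensus))


def find_promoters(seq: str, min_score: int = 9, spacers=(17, 18, 19)):
    """Scan one strand for -35/-10 hexamer pairs with an allowed spacer.

    Returns a list of (start_of_minus_35, spacer_length, score), where score
    is the number of consensus matches out of 12. The min_score default is
    an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - HEXAMER + 1):
        for spacer in spacers:
            j = i + HEXAMER + spacer          # start of candidate -10 hexamer
            if j + HEXAMER > len(seq):
                continue
            score = (count_matches(seq[i:i + HEXAMER], MINUS_35_CONSENSUS)
                     + count_matches(seq[j:j + HEXAMER], MINUS_10_CONSENSUS))
            if score >= min_score:
                hits.append((i, spacer, score))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: perfect hexamers with a 17-nt spacer.
    perfect = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGA"
    print(find_promoters(perfect, min_score=12))
```

A scan like this will report many spurious hits on real genomes when the threshold is lowered, which is one reason promoter strength cannot be read off from hexamer matches alone.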
The σ Factor Mediates Binding of Polymerase to the Promoter
The σ70 factor can be divided into four regions called σ region 1 through σ region 4 (see Figure 12-6). The regions that recognize the -10 and -35 elements of the promoter are regions 2 and 4, respectively.

Two helices within region 4 form a common DNA-binding motif called a helix-turn-helix. One of these helices inserts into the major groove and interacts with bases in the -35 region; the other lies across the top of the groove, making contacts with the DNA backbone. This structural motif is found in many DNA-binding proteins (for example, almost all transcriptional activators and repressors found in bacterial cells, described in Chapter 16) and was discussed in detail in Chapter 5 (Figure 5-20).

The -10 region is also recognized by an α helix. But in this case, the interaction is less well-characterized and is more complicated for the following reason: whereas the -35 region simply provides binding energy to secure polymerase to the promoter, the -10 region has a more elaborate role in transcription initiation, because it is within that element that DNA melting is initiated in the transition from the
Box 12-1 Consensus Sequences

The DNA sequences of binding sites recognized by a given protein may not always be exactly the same. Likewise, a stretch of amino acids that bestows upon a protein a particular function may be slightly different in different proteins. A consensus sequence is, in each case, a version of the sequence having at each position the nucleotide (or amino acid) most commonly found there in different examples. Thus the consensus sequence for promoters in E. coli recognized by RNA polymerase containing σ70 is shown in the figure (Box 12-1 Figure 1). This consensus sequence was derived by aligning 300 sequences known to function as σ70 promoters and ascertaining the most common base found at each position in the -35 and in the -10 hexamers. That nucleotide is then chosen as the nucleotide of choice at that position in the consensus; its relative frequency and the frequencies with which the other three nucleotides occur at each position are portrayed in the graph. Note that there is no significant consensus among the 17 to 19 nucleotides that lie in the region between -35 and -10.
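The procedure just described (align known sites, then take the most common base at each position) can be sketched in a few lines. The five hexamers used here are made-up illustrative sequences, not data from the 300 promoters mentioned above, and the function name is hypothetical.

```python
from collections import Counter


def consensus(aligned_sites):
    """Derive a consensus from equal-length, pre-aligned sequences.

    Returns (consensus_string, frequencies), where frequencies[i] is the
    fraction of sites carrying the chosen base at position i. If two bases
    tie at a position, the one seen first among the inputs is chosen.
    """
    if not aligned_sites:
        raise ValueError("need at least one sequence")
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("sequences must be aligned to the same length")

    chosen, frequencies = [], []
    for column in zip(*aligned_sites):        # one column per position
        base, count = Counter(column).most_common(1)[0]
        chosen.append(base)
        frequencies.append(count / len(aligned_sites))
    return "".join(chosen), frequencies


if __name__ == "__main__":
    # Hypothetical -10 hexamers, for illustration only.
    sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATATT"]
    seq, freqs = consensus(sites)
    print(seq, freqs)
```

The per-position frequencies correspond to what the graph in the box figure portrays: a position where the chosen base is nearly universal is more strongly conserved than one where it is merely the most common of several.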
In the promoter example above, each individual promoter sequence had previously been identified, so aligning the sequences is trivial. But consider a rather different example. In this case, no binding site has been identified for the DNA-binding protein in question. However, several regions of a chromosome are known to contain binding sites somewhere within their lengths. A computer algorithm is employed that scans each of the sequences of these chromosomal regions, searching for a potential binding site common to them all.

A second approach to deriving the consensus sequence for a DNA-binding protein when the binding site is not already known takes advantage of chemical methods for synthesizing vast sets of short DNA fragments of random sequence (Chapter 20). The protein of interest is mixed with the population of DNA molecules, and those DNAs to which it binds are retrieved and sequenced. A comparison of the sequences bound reveals the consensus readily, because each of the fragments is very short.

This last method (often calle
BOX 12-1 FIGURE 1 Promoter consensus sequence and spacing consensus. (Source: Redrawn from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 308, fig 6.12. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)
FIGURE 12-6 Regions of σ. Those regions of σ factor that recognize specific regions of the promoter are indicated by arrows. Region 2.3 is responsible for melting the DNA. For a schematic view of σ recruiting RNA polymerase core enzyme to a standard promoter, see Figure 12-7. (Source: Redrawn from Young B.A., Gruber T.M., and Gross C.A. 2002. Views of transcription initiation. Cell 109: 417-420, fig. 1. Copyright © 2002, with permission from Elsevier.)
closed to open complex. Thus, the region of σ that interacts with the -10 region is doing more than simply binding DNA. In keeping with this expectation, the α helix involved in recognition of the -10 region contains several essential aromatic amino acids that can interact with bases on the nontemplate strand in a manner that stabilizes the melted DNA. In Chapter 8, we described a similar role for the single-strand binding protein (SSB) during DNA replication.

The extended -10 element, where present, is recognized by an α helix in σ region 3. This helix makes contact with the two specific base pairs that constitute that element.

Unlike the other elements within the promoter, the UP-element is not recognized by σ but is instead recognized by a carboxyl terminal domain of the α subunit, called the αCTD (Figure 12-7). The αCTD is connected to the αNTD by a flexible linker. Thus, although the αNTD is embedded in the body of the enzyme, the αCTD can reach the upstream element and can do so even when that element is not located immediately adjacent to the -35 region, but instead is located further upstream.

The σ subunit is positioned within the holoenzyme structure in such a way as to make feasible the recognition of various promoter elements. Thus, the DNA-binding regions point away from the body of the enzyme rather than being embedded. Moreover, the spacing between those regions is consistent with the distance between the DNA elements they recognize. Thus, σ regions 2 and 4 are separated by about 75 Å when σ is bound in the holoenzyme, and this is about the same distance as that between the centers of the -10 and -35 elements of a typical σ70 promoter (see Figure 12-5). This rather large spacing of the protein domains is accommodated by the region between σ regions 2 and 4, that is, by region 3, especially region 3.2 (see Figures 12-4 and 12-6).
Transition to the Open Complex Involves Structural Changes in RNA Polymerase and in the Promoter DNA

The initial binding of RNA polymerase to the promoter DNA in the closed complex leaves the DNA in double-stranded form. The next stage in initiation requires the enzyme to become more intimately engaged with the promoter, in the open complex. The transition from closed to open complex involves structural changes in the enzyme
FIGURE 12-7 σ and α subunits recruit RNA polymerase core enzyme to the promoter. The C-terminal domain of the α subunit (αCTD) recognizes the UP-element (where present), while σ regions 2 and 4 recognize the -10 and -35 regions respectively (see Figure 12-6). In this figure, RNA polymerase is shown in a rather different schematic form than presented in earlier figures. This form is particularly useful for indicating surfaces that touch DNA and regulating proteins, and we use it again in some figures in Chapter 16 when we consider regulation of transcription in bacteria.
and the opening of the DNA double helix to reveal the template and nontemplate strands. This "melting" occurs between positions -11 and +3, in relation to the transcription start site.

In the case of the bacterial enzyme bearing σ70, this transition, often called isomerization, does not require energy derived from ATP hydrolysis, and is instead the result of a spontaneous conformational change in the DNA-enzyme complex to a more energetically favorable form. Isomerization is essentially irreversible and, once complete, typically guarantees that transcription will subsequently initiate (though regulation can still be imposed after this point in some cases). Formation of the closed complex, in contrast, is readily reversible: polymerase can as easily dissociate from the promoter as make the transition to the open complex.

To picture the structural changes that accompany isomerization, we need to examine the structure of holoenzyme in more detail. A channel runs between the pincers of the claw-shaped enzyme, as we described earlier (see Figure 12-2). The active site of the enzyme, which is made up of regions from both the β and β' subunits, is found at the base of the pincers within the "active center cleft." There are five channels into the enzyme, as shown in the picture of the open complex in Figure 12-8. The NTP-uptake channel allows ribonucleotides to enter the active center (see Figure 12-8 caption). The RNA-exit channel allows the growing RNA chain to leave the enzyme as it is synthesized during elongation. The remaining three channels allow DNA entry and exit from the enzyme, as follows.

The downstream DNA (that is, DNA ahead of the enzyme, yet to be transcribed) enters the active center cleft in double-stranded form through the downstream DNA channel (between the pincers). Within the active center cleft, the DNA strands separate from position +3. The nontemplate strand exits the active center cleft through the nontemplate-strand (NT) channel and travels across the surface of the enzyme. The template strand, in contrast, follows a path through the active center cleft and exits through the template-strand (T) channel. The double helix re-forms at -11 in the upstream DNA behind the enzyme.
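The position numbers used above follow the start-site-relative scheme of Figure 12-3. A short sketch shows how such coordinates map onto sequence indices and why a bubble melted from -11 to +3 spans 14 positions. It assumes the usual convention that there is no position 0 (the nucleotide before +1 is -1), which the text implies but does not state, and that both endpoints are counted as melted; the function names are hypothetical.

```python
def index_to_position(index: int, start_site_index: int) -> int:
    """Convert a 0-based sequence index to a start-site-relative position.

    The start site itself is +1 and the nucleotide just upstream is -1;
    position 0 is never returned (assumed convention).
    """
    offset = index - start_site_index
    return offset + 1 if offset >= 0 else offset


def positions_between(first: int, last: int):
    """List all positions from first to last inclusive, skipping 0."""
    if first == 0 or last == 0:
        raise ValueError("position 0 does not exist in this numbering")
    return [p for p in range(first, last + 1) if p != 0]


if __name__ == "__main__":
    bubble = positions_between(-11, 3)
    print(bubble)
    print("melted positions:", len(bubble))
```

Counted this way, the melted region agrees with the "some 14 bp" quoted earlier for the transcription bubble in the open complex.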
FIGURE 12-8 Channels into and out of the open complex. This figure shows the relative positions of the DNA strands (template strand in gray, nontemplate strand in orange); the four regions of σ; the -10 and -35 regions of the promoter; and the start site of transcription (+1). The channels through which DNA and RNA enter or leave the RNA polymerase enzyme are also shown. The only channel not shown here is the nucleotide entry channel, through which nucleotides enter the active site cleft for incorporation into the RNA chain as it is made. As drawn, that channel would enter the active site down into the page at about the position shown as "+1" on the DNA. Where a DNA strand passes underneath a protein, it is drawn as a dotted ribbon. Sigma region 3.2 is the linker region between σ3.1 and σ4.
Two striking structural changes are seen in the enzyme upon isomerization from the closed to open complex. First, the pincers at the front of the enzyme clamp down tightly on the downstream DNA. Second, there is a major shift in the position of the N-terminal region of σ (region 1.1), as we now describe. When not bound to DNA, σ region 1.1 lies within the active center cleft of the holoenzyme, blocking the path that, in the open complex, is followed by the template DNA strand. In the open complex, region 1.1 shifts some 50 Å and is now found on the outside of the enzyme, allowing the DNA access to the cleft (see Figure 12-8). Region 1.1 of σ is highly negatively charged (just like DNA). Thus, in the holoenzyme, region 1.1 acts as a molecular mimic of DNA. The space in the active center cleft, which may be occupied either by region 1.1 or by DNA, is highly positively charged.
Transcription Is Initiated by RNA Polymerase without the Need for a Primer

Recall from Chapter 8 that DNA polymerase does not synthesize new DNA strands de novo; that is, it can only extend an existing polynucleotide chain. For this reason, replication always requires a primer strand. The primer is typically a short piece of RNA that binds to the DNA template strand to form a short hybrid double-stranded region; DNA polymerase then adds nucleotides to the 3' end of the primer.

RNA polymerase can initiate a new RNA chain on a DNA template and thus does not need a primer. This impressive feat requires that the initiating ribonucleotide be brought into the active site and held stably on the template while the next NTP is presented with correct geometry for the chemistry of polymerization to occur. This is particularly difficult because RNA polymerase starts most transcripts with an A, and that ribonucleotide binds the template nucleotide (T) with only two hydrogen bonds (as opposed to the three between C and G).

Thus, the enzyme has to make specific interactions with the initiating ribonucleotide, holding it rigidly in the correct orientation to allow chemical attack on the incoming NTP. The requirement for such specific interactions between the enzyme and the initiating nucleotide probably explains why most transcripts start with the same nucleotide. The interactions are specific for that nucleotide (an A), and thus only chains beginning with A are held in a manner suitable for efficient initiation.

It is believed that the interactions are provided by various parts of polymerase holoenzyme, including part of sigma. Consistent with this, in experiments using an RNA polymerase containing a σ70 derivative lacking this part of sigma, initiation requires much higher than normal concentrations of initiating nucleotide.
RNA Polymerase Synthesizes Several Short RNAs before Entering the Elongation Phase
Once ribonucleotides enter the active center cleft and RNA synthesis begins, there follows a period called abortive initiation. In this phase, the enzyme synthesizes short RNA molecules of less than ten nucleotides in length. Instead of being elongated further, these transcripts are released from the polymerase, and the enzyme, without dissociating from the template, begins RNA synthesis again. Once a polymerase manages to make an RNA longer than 10 nucleotides, a stable ternary complex is formed; that is, a complex containing the enzyme, the DNA template, and a growing RNA chain. This is the start of the elongation phase, which continues until polymerase is instructed to terminate transcription by specific sequences downstream of the gene.
It is not clear why RNA polymerase undergoes this period of abortive initiation, but once again a region of the σ factor appears to be involved, acting as a molecular mimic. In this case it is region 3.2, and it mimics RNA. This region of σ lies in the middle of the RNA exit channel in the open complex (see Figure 12-8), and for an RNA chain to be made longer than about ten nucleotides, this region of σ must be ejected from that location, a process that can take the enzyme several attempts. The ejection of σ region 3.2 probably accounts for σ being more weakly associated with the elongating enzyme than it is with the open complex; indeed it is often lost altogether from the elongating complex. In Box 12-2, The Single-Subunit RNA Polymerases, we see how these simple RNA polymerases, despite lacking a σ subunit, undergo a structurally comparable shift in transition from the initiating to the elongating complex.
The Elongating Polymerase Is a Processive Machine that Synthesizes and Proofreads RNA
DNA passes through the elongating enzyme in a manner very similar to its passage through the open complex. Thus, double-stranded DNA enters the front of the enzyme between the pincers. At the opening of the catalytic cleft, the strands separate to follow different paths through the enzyme before exiting via their respective channels and reforming a double helix behind the elongating polymerase. Ribonucleotides enter the active site through their defined channel and are
Box 12-2 The Single-Subunit RNA Polymerases
In the text we discuss the multi-subunit RNA polymerases found in bacteria and eukaryotic cells. But there are several examples of single-subunit RNA polymerases that are capable of performing the same basic reaction as their more complex multi-subunit counterparts. Thus, many bacteriophage (for example, the E. coli phage T7) encode polymerases of this type with which, upon infection, they transcribe most of their genes. Similarly, the majority of mitochondrial and chloroplast genes are transcribed by polymerases closely related to the single-subunit phage enzymes. It is remarkable that evolution has produced these relatively simple enzymes capable of carrying out transcription, a task that we, in the text, emphasize as an impressive achievement even for the much larger and more complicated multi-subunit enzymes. The T7 polymerase is the most widely studied of the single-subunit enzymes. It has a molecular weight of 100 kD, compared to 400 kD for the bacterial core enzyme (without σ factor), and a structure shown in Box 12-2 Figure 1. Overall it looks like the Pol I family of DNA polymerases that we considered
BOX 12-2 FIGURE 1 Bacteriophage T7 RNA polymerase.
RNA polymerases as well, features that have become more apparent since the structures of the T7 and bacterial enzymes have been compared in complex with their templates. As we saw in the text, the bacterial enzyme has various channels into and out of the active center cleft (see Figure 12-8). One of these, for example, allows the NTPs access to the active site and template, where they are polymerized, under the influence of the template, into the growing RNA chain. Another channel provides the growing RNA chain an exit from the enzyme. Comparable channels are seen in the structure of the phage polymerase as well. The initiation and elongation complexes of the bacterial and T7 polymerases have been compared. These comparisons highlight one striking example of how a comparable functional transition can be achieved through different kinds of structural change in the two cases. We note
(Jeruzalmi D. and Steitz T.A. 1998. EMBO J. 17: 4101.) Image prepared with MolScript, BobScript, and Raster 3D.
Transcription Is Terminated by Signals within the RNA Sequence
Sequences called terminators trigger the elongating polymerase to dissociate from the DNA and release the RNA chain it has made. In bacteria, terminators come in two types: Rho-independent and Rho-dependent. The first kind causes polymerase to terminate without the involvement of other factors. The second kind, as its name suggests, requires an additional protein called Rho to induce termination. We will deal with each kind of terminator in turn.
Rho-independent terminators, also called intrinsic terminators, consist of two sequence elements: a short inverted repeat (of about 20 nucleotides) followed by a stretch of about eight A:T base pairs (Figure 12-9). These elements do not affect the polymerase until after they have been transcribed; that is, they function in the RNA rather than in the DNA. Thus, when polymerase transcribes an inverted repeat sequence, the resulting RNA can form a stem-loop structure (often called a "hairpin") by base-pairing with itself (see Chapter 6). The hairpin is believed to cause termination by disrupting the elongation complex. This is achieved either by forcing open the RNA exit channel in RNA polymerase, or, according to another model, by disrupting RNA-template interactions. The hairpin only works as an efficient terminator when it is followed by a stretch of A:U base pairs, as we have described. This is because, under those circumstances, at the time the hairpin forms, the growing RNA chain will be held on the template at the active site by only A:U base pairs. As A:U base pairs are the weakest of all base pairs (weaker even than A:T base pairs), they are more easily disrupted by the effects of the stem-loop on the transcribing polymerase, and so the RNA will more readily dissociate (Figure 12-10).
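The two sequence elements of an intrinsic terminator lend themselves to a simple computational search. The following is a minimal sketch, not a validated terminator predictor: the function names, the stem and loop lengths, and the U-count threshold are all illustrative assumptions, and the search accepts only perfectly paired stems, whereas real hairpins tolerate mismatches and G:U pairs.

```python
# Hypothetical helper names and thresholds; a toy search, not a real predictor.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would base-pair with `rna`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, min_loop=3, max_loop=8,
                               min_u=6, tail_window=9):
    """Find perfect inverted repeats (hairpin stems) followed by a U-rich tail.

    Returns a list of (stem_start, hairpin_end) index pairs, where
    hairpin_end is the index of the first base after the 3' arm of the stem.
    """
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_loop + 1):
        # The 3' arm must be the reverse complement of the 5' arm.
        target = reverse_complement(rna[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            right_arm = rna[j:j + stem_len]
            if len(right_arm) < stem_len:
                break  # ran off the end of the transcript
            if right_arm != target:
                continue
            end = j + stem_len
            # The hairpin must be followed by a run rich in U residues.
            if rna[end:end + tail_window].count("U") >= min_u:
                hits.append((i, end))
    return hits


# Toy transcript: 6-bp stem, 4-nt loop, then eight U residues.
TOY_TERMINATOR = "AAGG" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "GA"
# Same hairpin, but without the U-rich tail.
TOY_NO_TAIL = "AAGG" + "GCCGCC" + "UUCG" + "GGCGGC" + "GACGACGA"

if __name__ == "__main__":
    print(find_intrinsic_terminators(TOY_TERMINATOR))
    print(find_intrinsic_terminators(TOY_NO_TAIL))
```

The second toy sequence carries the same hairpin but no U-rich tail, so the search reports nothing for it, mirroring the point that the hairpin terminates efficiently only when followed by A:U base pairs.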
Rho-dependent terminators have less well-characterized RNA elements, as we shall discuss below, and for them to work requires the action of the Rho factor as well. Rho, which is a ring-shaped protein with six identical subunits, binds to single-stranded RNA as it exits the polymerase (Figure 12-11). The protein also has an ATPase activity: once attached to the transcript, it uses the energy derived from ATP hydrolysis to wrest the RNA from the template and from polymerase.
FIGURE 12-9 Sequence of a rho-independent terminator. At the top is the sequence, in the DNA, of the terminator. Below is shown the sequence of the RNA, and at the bottom the structure of the terminator hairpin. The terminator in question is from the trp attenuator, discussed in Chapter 16. The boxes show mutations isolated in the sequence that disrupt the terminator. (Source: Adapted from Yanofsky C. 1981. Nature 289: 751-758, fig 1. Copyright © 1981 Nature Publishing Group. Used with permission.)
FIGURE 12-10 Transcription termination. Shown is a model for how the rho-independent terminator might work. (a) The hairpin forms in the RNA (Figure 12-9) as soon as that region has been transcribed by polymerase (the enzyme is not shown here). (b) That RNA structure disrupts polymerase just as the enzyme is tr... the template, terminating further elongation. (Source: Adapted from Platt T. 1981. 24: 10-23. Copyright © 1981, with permission from Elsevier.)
FIGURE 12-11 The ρ transcription termination factor. The crystal structure of the rho termination factor is shown in a top-down view. It consists of a hexamer of rho protein, each monomer here shown in a different color. The six monomers form
(Skordalakes E. and Berger J.M. 2003. Cell 114: 135.) Image prepare
How is Rho directed to a particular RNA molecule? First, there is some specificity in the sites it binds (the so-called rut sites, for Rho Utilization). Optimally these sites consist of stretches of about 40 nucleotides that do not fold into a secondary structure (that is, they remain largely single-stranded); they are also rich in C residues. The second level of specificity is that Rho fails to bind any transcript that is being translated (that is, a transcript bound by ribosomes). In bacteria, transcription and translation are tightly coupled: translation initiates on growing RNA transcripts as soon as they start exiting polymerase, while they are still being synthesized. Thus, Rho typically terminates only those transcripts still being transcribed beyond the end of a gene or operon.
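The first level of specificity can be illustrated with a sliding-window scan for C-rich stretches of about 40 nucleotides. This is only a sketch under stated assumptions: the function name and the 40 percent cutoff are hypothetical choices made for illustration, and the scan does not test whether a window is free of secondary structure, which the text identifies as the other property of a rut site.

```python
def c_rich_windows(rna, window=40, min_c_fraction=0.4):
    """Return the start index of every window whose C content meets the cutoff.

    `window` follows the roughly 40-nucleotide stretch described in the text;
    `min_c_fraction` is an arbitrary illustrative threshold, not a measured one.
    Secondary structure is NOT evaluated here.
    """
    hits = []
    for start in range(len(rna) - window + 1):
        segment = rna[start:start + window]
        if segment.count("C") / window >= min_c_fraction:
            hits.append(start)
    return hits


# Toy transcript: a C-poor leader, a 40-nt stretch that is half C, a G-rich end.
TOY_RNA = "A" * 10 + "CU" * 20 + "G" * 10

if __name__ == "__main__":
    print(c_rich_windows(TOY_RNA))
```

A fuller treatment would combine this composition filter with a structure prediction step, since a C-rich stretch that folds on itself would not be expected to serve as a Rho binding site.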
TRANSCRIPTION IN EUKARYOTES
As we have already discussed, transcription in eukaryotes is undertaken by polymerases closely related to RNA polymerases found in bacteria. This is hardly surprising: the process of transcription itself is identical in the two cases. There are, however, differences in the machinery used in each case. One we have already seen: eukaryotes have three different polymerases (Pol I, II, and III), whereas bacteria have only one. Also, whereas bacteria require only one additional initiation factor (σ), several initiation factors are required for efficient and promoter-specific initiation in eukaryotes. These are called the general transcription factors (GTFs).
In vitro, the general transcription factors are all that is required, together with Pol II, to initiate transcription on a DNA template. In vivo, however, the DNA template in eukaryotic cells is incorporated into nucleosomes, as we saw in Chapter 7. Under these circumstances, the general transcription factors are not sufficient to promote significant expression. Rather, additional factors are required, including the so-called Mediator Complex, DNA-binding regulatory proteins, and, often, chromatin-modifying enzymes.
We will first consider the basic mechanism by which Pol II and the general transcription factors assemble at a promoter to initiate transcription in vitro. We then consider the roles of the additional components required to promote transcription in vivo.
RNA Polymerase II Core Promoters Are Made up of Combinations of Four Different Sequence Elements
The eukaryotic core promoter refers to the minimal set of sequence elements required for accurate transcription initiation by the Pol II machinery, as measured in vitro. A core promoter is typically about 40 nucleotides long, extending either upstream or downstream of the transcription start site. Figure 12-12 shows the location, relative to the transcription start site, of four elements found in Pol II core promoters. These are the TFIIB recognition element (BRE), the TATA element (or box), the initiator (Inr), and the downstream promoter element (DPE). Typically, a promoter includes only two or three of these four elements. The consensus sequence for each element, and the general transcription factor that binds it, are also shown, and we shall describe these features in more detail in coming sections.
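Locating a core promoter element amounts to matching a degenerate consensus against a DNA sequence and reporting the match position relative to the start site. The sketch below shows one way to do this. The consensus string TATAWAWR (W = A or T, R = A or G) is a commonly cited form of the TATA element and is supplied here as an assumption rather than taken from Figure 12-12; the function names are likewise hypothetical.

```python
import re

# IUPAC degenerate-base codes used in the consensus strings below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# Assumed consensus for the TATA element; not read from the figure.
TATA_CONSENSUS = "TATAWAWR"


def find_element(dna, consensus):
    """Return the 0-based start index of every (possibly overlapping) match."""
    body = "".join(IUPAC[symbol] for symbol in consensus)
    # A lookahead matches without consuming, so overlapping hits are reported.
    pattern = re.compile("(?=" + body + ")")
    return [match.start() for match in pattern.finditer(dna.upper())]


def relative_to_tss(index, tss_index):
    """Convert a 0-based index to promoter numbering.

    The start site is +1 and the base immediately upstream is -1;
    there is no position 0 in this convention.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# Toy promoter: a TATA element starting 30 bp upstream of the start site.
TOY_PROMOTER = "GGCC" + "TATAAAAG" + "C" * 22 + "A" + "GGG"
TOY_TSS_INDEX = 34  # index of the "A" designated as the start site

if __name__ == "__main__":
    for hit in find_element(TOY_PROMOTER, TATA_CONSENSUS):
        print(hit, relative_to_tss(hit, TOY_TSS_INDEX))
```

The same two functions could be applied to the other three elements once their consensus sequences are written in IUPAC form.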
FIGURE 12-12 Pol II core promoter. The figure shows the positions of various DNA elements relative to the transcription start site (indicated by the arrow above the DNA). These elements, described in the text, are as follows: BRE (TFIIB recognition element); TATA (TATA box); Inr (initiator element); and DPE (downstream promoter element). Also shown (below) are the consensus sequence for each element (determined in the same way as described for the bacterial promoter elements, see Box 12-1); and (above) the name of the general transcription factor that recognizes each element. (Source: Butler J.E.F. et al. 2002. Genes and Development 16: 2583-2592, Fig 1.)
Beyond the core promoter, and typically upstream of it, there are other sequence elements required for efficient transcription in vivo. Together these elements constitute the regulatory sequences and can be grouped into various categories, reflecting their location, and the organism in question, as much as their function. These elements include: promoter proximal elements; upstream activator sequences (UASs); enhancers; and a series of repressing elements called silencers, boundary elements, and insulators. All these DNA elements bind regulatory proteins (activators and repressors), which help or hinder transcription from the core promoter, the subject of Chapter 17. Some of these regulatory sequences can be located many tens or even hundreds of kb from the core promoters on which they act.
RNA Polymerase II Forms a Pre-Initiation Complex with General Transcription Factors at the Promoter
The general transcription factors collectively perform the functions performed by σ in bacterial transcription, despite showing no significant sequence homology to that protein. Thus, the general transcription factors help polymerase bind to the promoter and melt the DNA (comparable to the transition from closed to open complex in the bacterial case). They also help polymerase escape from the promoter and embark on the elongation phase. The complete set of general transcription factors and polymerase, bound together at the promoter and poised for initiation, is called the pre-initiation complex.
As we described above (and in Figure 12-12), many Pol II promoters contain a so-called TATA element (some 30 base pairs upstream from the transcription start site). This is where pre-initiation complex formation begins. The TATA element is recognized by the general transcription factor called TFIID. (The nomenclature "TFII" denotes a transcription factor for Pol II, with individual factors distinguished as A, B, and so on.) Like many of the general transcription factors, TFIID is in fact a multi-subunit complex. The component of TFIID that binds to the TATA DNA sequence is called TBP (TATA binding protein). The other subunits in this complex are called TAFs, for TBP associated factors. Some TAFs help bind the DNA at certain promoters, and others control the DNA-binding activity of TBP.
Upon binding DNA, TBP extensively distorts the TATA sequence (we shall discuss this event in more detail presently). The resulting TBP-DNA complex provides a platform to recruit other general
transcription factors and polymerase itself to the promoter. In vitro, these proteins assemble at the promoter in the following order (Figure 12-13): TFIIA, TFIIB, TFIIF together with polymerase (in complex with yet more proteins, such as those in the Mediator Complex, which we describe below), and then TFIIE and TFIIH, which bind upstream of Pol II. Formation of the pre-initiation complex containing these components is followed by promoter melting. In contrast to the situation in bacteria, promoter melting in eukaryotes requires hydrolysis of ATP and is mediated by TFIIH. It is the helicase-like activity of that factor which stimulates unwinding of promoter DNA.
FIGURE 12-13 Transcription initiation by RNA polymerase II. The stepwise assembly of the Pol II pre-initiation complex is shown here, and described in detail in the text. Once assembled at the promoter, Pol II leaves the pre-initiation complex upon addition of the nucleotide precursors required for RNA synthesis, and after phosphorylation of Ser residues within the enzyme's "tail." The tail contains multiple repeats of the heptapeptide sequence: Tyr-Ser-Pro-Thr-Ser-Pro-Ser (see Figure 12-18).
Just as we saw in the bacterial case, there now follows a period of abortive initiation before the polymerase escapes the promoter and enters the elongation phase. Recall that, during abortive initiation, the polymerase synthesizes a series of short transcripts. In eukaryotes, promoter escape involves a step not seen in the bacterial case, that of phosphorylation of the polymerase, as we now describe.
The large subunit of Pol II has a C-terminal domain (CTD), which extends as a "tail" (see Figure 12-13). The CTD contains a series of repeats of the heptapeptide sequence: Tyr-Ser-Pro-Thr-Ser-Pro-Ser. There are 27 of these repeats in the yeast Pol II CTD and 52 in the human case. Each repeat contains sites for phosphorylation by specific kinases, including one that is a subunit of TFIIH. The form of Pol II recruited to the promoter initially contains a largely unphosphorylated tail, but the species found in the elongation complex bears multiple phosphoryl groups on its tail. Addition of these phosphates helps polymerase shed most of the general transcription factors used for initiation, and which the enzyme leaves behind as it escapes the promoter. As we will see, regulating the phosphorylation state of the CTD of Pol II controls later steps, those involving processing of the RNA, as well. Indeed, in addition to TFIIH, a number of other kinases have been identified that act on the CTD, as well as a phosphatase that removes the phosphates added by those kinases.
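The arithmetic behind the repeat structure of the tail can be made concrete with a short sketch. It models an idealized CTD built from perfect copies of the heptapeptide, using the repeat numbers quoted above; this is an assumption made for illustration, since real CTDs also contain repeats that deviate from the consensus. The names used are hypothetical.

```python
# Consensus heptapeptide of the Pol II CTD, positions 1 through 7.
HEPTAD = ("Tyr", "Ser", "Pro", "Thr", "Ser", "Pro", "Ser")

# Repeat numbers as quoted in the text.
REPEATS = {"yeast": 27, "human": 52}


def ctd_residues(species):
    """Return an idealized CTD as a list of residues (perfect repeats only)."""
    return list(HEPTAD) * REPEATS[species]


def serine_positions(species):
    """Return the 1-based positions of every serine in the idealized CTD."""
    return [index + 1
            for index, residue in enumerate(ctd_residues(species))
            if residue == "Ser"]


if __name__ == "__main__":
    for species in REPEATS:
        residues = ctd_residues(species)
        serines = serine_positions(species)
        print(species, len(residues), "residues,", len(serines), "serines")
```

Each repeat contributes three serines, so the idealized yeast tail offers 81 candidate serines and the human tail 156, which gives a sense of how many phosphoryl groups the elongating enzyme could in principle carry.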
TBP Binds to and Distorts DNA Using a β Sheet Inserted into the Minor Groove
TBP uses an extensive region of β sheet to recognize the minor groove of the TATA element (Figure 12-14). This is unusual: more typically, proteins recognize DNA using α helices inserted into the major groove
FIGURE 12-14 TBP-DNA complex. The TATA binding protein (TBP) is shown here in purple complexed with the DNA TATA sequence (shown in gray) found at the start of many Pol II genes. The details of this interaction are described in the text. (Nikolov D.B., Chen H., Halay E.D., Usheva A.A., Hisatake K., Lee D.K., Roeder R.G., and Burley S.K. 1995. Nature 377: 119.) Image prepared with MolScript, BobScript, and Raster 3D. Extended DNA on either side of image modeled by Leemor Joshua-Tor.
of DNA, as we saw in Chapters 5 and 6, and also for σ factor earlier in this chapter. The reason for TBP's unorthodox recognition mechanism is linked to the need for that protein to distort the local DNA structure. But this mode of recognition raises a problem: how is specificity achieved? We have seen in Chapter 6 that, compared to the major groove, the minor groove of DNA is less rich in the chemical information that would enable base pairs to be distinguished. Instead, to select the TATA sequence, TBP relies on the ability of that sequence to undergo a specific structural distortion, as we now describe.
When it binds DNA, TBP causes the minor groove to be widened to an almost flat conformation; it also bends the DNA by an angle of approximately 80°. The interaction between TBP and DNA involves only a limited number of hydrogen bonds between the protein and the edges of the base pairs in the minor groove. Instead, much of the specificity is imposed by two pairs of phenylalanine side chains that intercalate between the base pairs at either end of the recognition sequence and drive the strong bend in the DNA. Thus, A:T base pairs are favored because they are more readily distorted to allow the initial opening of the minor groove. There are also extensive interactions between the phosphate backbone and basic residues in the β sheet, adding to the overall binding energy of the interaction.
The Other General Transcription Factors also Have Specific Roles in Initiation
We do not know in detail the functions of all the other general transcription factors. As we have noted, some of these factors are in fact complexes made up of two or more subunits (shown in Table 12-2). Below we comment on a few structural and functional characteristics.
TAFs. TBP is associated with about ten TAFs. Two of the TAFs bind DNA elements at the promoter; for example, the initiator element (Inr) and the downstream promoter element (DPE) (see Figure 12-12). Several of the TAFs have structural homology to histone proteins, and it has been proposed that they might bind DNA in a similar manner, although evidence for such a form of DNA binding has not been obtained. For example, TAF42 and TAF62 from Drosophila have been shown to form a structure similar to that of the H3·H4 tetramer (see Chapter 7). These histone-like TAFs are found not only in the TFIID complex but are also associated with some histone modification enzymes, such as the yeast SAGA complex (see Table 7-7). Another TAF appears to regulate the binding of TBP to DNA. It does this using an inhibitory flap that binds to the DNA-binding surface of TBP, another example of molecular mimicry. This flap must be displaced for TBP to bind TATA.
TFIIB. This protein, a single polypeptide chain, enters the pre-initiation complex after TBP (Figure 12-13). The crystal structure of the
ternary complex of TFIIB-TBP-DNA shows specific TFIIB-TBP and TFIIB-DNA contacts (Figure 12-15). These include base-specific interactions with the major groove upstream (to the BRE; see Figure 12-12) and the minor groove downstream of the TATA element. The asymmetric binding of TFIIB to the TBP-TATA complex accounts for the
TABLE 12-2 The General Transcription Factors of RNA Polymerase II
GTF      Number of Subunits
TBP      1
TFIIA    2
TFIIB    1
TFIIE    2
TFIIF    3
TFIIH    9
TAFs     11
The numbers shown are for yeast but are similar for other eukaryotes, including humans.
FIGURE 12-15 TFIIB-TBP-promoter complex. This structure shows the TBP protein bound to the TATA sequence, just as we saw in the previous figure. Here, the general transcription factor TFIIB (shown in turquoise) has been added. This tripartite complex forms the platform to which other general transcription factors, and Pol II itself, are recruited during pre-initiation complex assembly. (Nikolov D.B., Chen H., Halay E.D., Usheva A.A., Hisatake K., Lee D.K., Roeder R.G., and Burley S.K. 1995. Nature 377: 119.) Image prepared with MolScript, BobScript, and Raster 3D. Extended DNA on either side of image modeled by Leemor Joshua-Tor.
asymmetry in the rest of the assembly of the pre-initiation complex and the unidirectional transcription that results. TFIIB also contacts Pol II in the pre-initiation complex. Thus, this protein appears to bridge the TATA-bound TBP and polymerase. Recent structural studies suggest that the N-terminal domain of TFIIB inserts into the RNA exit channel of Pol II in a manner analogous to σ region 3.2 in the bacterial case.
TFIIF. This two-subunit factor associates with Pol II and is recruited to the promoter together with that enzyme (and other factors). Binding of Pol II-TFIIF stabilizes the DNA-TBP-TFIIB complex and is required before TFIIE and TFIIH are recruited to the pre-initiation complex (Figure 12-13).
TFIIE and TFIIH. TFIIE, which, like TFIIF, consists of two subunits, binds next, and has roles in the recruitment and regulation of TFIIH. TFIIH controls the ATP-dependent transition of the pre-initiation complex to the open complex. It is also the largest and most complex of the general transcription factors: it has nine subunits and a molecular mass comparable to that of the polymerase itself. Within TFIIH are two subunits that function as ATPases, and another that is a protein kinase, with roles in promoter melting and escape, as described above. Together with other factors, the ATPase subunits are also involved in nucleotide excision repair (see Chapter 9).
In Vivo, Transcription Initiation Requires Additional Proteins, Including the Mediator Complex
Thus far we have described what is needed for Pol II to initiate transcription from a naked DNA template in vitro. But we have already noted that high, regulated levels of transcription in vivo require, additionally, the Mediator Complex, transcriptional regulatory proteins, and, in many cases, nucleosome-modifying enzymes (which are themselves often parts of large protein complexes) (Figure 12-16). The characteristics of various modifying complexes are given in Table 7-7.
One reason for these additional requirements is that the DNA template in vivo is packaged into nucleosomes and chromatin, as we discussed in Chapter 7. This condition complicates binding to the promoter of polymerase and its associated factors. Transcriptional regulatory
FIGURE 12-16 Assembly of the pre-initiation complex in presence of Mediator, nucleosome modifiers and remodelers, and transcriptional activators. In addition to the general transcription factors shown in Figure 12-13, transcriptional activators bound to sites near the gene recruit nucleosome modifying and remodeling complexes, and the Mediator Complex, which together help form the pre-initiation complex
proteins called activators help recruit polymerase to the promoter, stabilizing its binding there. This recruitment is mediated through interactions between DNA-bound activators and parts of the transcription machinery. Often the interaction is with the Mediator Complex (hence its name). Mediator is associated with the CTD "tail" of the large polymerase subunit through one surface, while presenting other surfaces for interaction with DNA-bound activators. This explains the need for Mediator to achieve significant transcription in vivo.
Despite this central role in transcriptional activation, deletion of individual subunits of Mediator often leads to loss of expression of only a small subset of genes, different for each subunit (it is made up of many subunits). This result likely reflects the fact that different activators are believed to interact with different Mediator subunits to bring polymerase to different genes. In addition, Mediator aids initiation by regulating the CTD kinase in TFIIH.
The need for nucleosome modifiers and remodelers also differs at different promoters or even at the same promoter under different circumstances. When and where required, these complexes are also recruited by the DNA-bound activators. We will discuss the role of Mediator and modifiers in stimulating transcription in Chapter 17. We now consider some of the structural and functional properties of Mediator.
Mediator Consists of Many Subunits, Some Conserved from Yeast to Human
As shown in Figure 12-17, the yeast and human Mediator each include more than 20 subunits, of which 7 show significant sequence homology between the two organisms. (The names of the subunits are different in each case, reflecting the experimental approaches that led to their identification.) Very few of these subunits have any identified function. Only one (Srb4) is essential for transcription of essentially all Pol II genes in vivo. Low-resolution structural comparisons suggest both Mediators have a similar shape, and both are very large, even bigger than RNA polymerase itself.
The Mediator from both yeast and humans is organized in modules. These modules can be dissociated from one another under certain conditions in vitro. This observation, together with the fact that human Mediator varies in its composition (and size) depending on how it is isolated, has led to the idea that there are various forms of Mediator (particularly in metazoans), each containing subsets of Mediator subunits. Furthermore, it has been argued that the different forms are involved in regulating different subsets of genes, or responding to
FIGURE 12-17 Comparison of the Yeast and Human Mediators. The homologous proteins are shown in dark blue. (Source: Modified with permission from Malik S.
different groups of regulators (activators and repressors). It is equally possible, however, that the variations seen in subunit composition are artifacts, simply reflecting different methods of isolation.
In some studies it has been shown that a complex consisting of Pol II, Mediator, and some of the general transcription factors can be isolated from cells as a single complex in the absence of DNA. This led to the speculation that the bulk of the proteins required to initiate transcription might arrive at the promoter in a single preformed complex, rather than in a stepwise manner. The putative preformed complex was named the RNA Pol II holoenzyme, after the bacterial enzyme containing the σ factor, and thus able to initiate. Despite this parallel in naming, there are essential factors (such as TFIID) that do not associate with the eukaryotic RNA polymerase. It is unclear whether the holoenzyme exists in significant amounts in vivo, compared to separate polymerase and Mediator Complex.
A New Set of Factors Stimulate Pol II Elongation and RNA Proofreading
Once polymerase has initiated transcription, it shifts into the elongation phase, as we have discussed. This transition involves the Pol II enzyme shedding most of its initiation factors; for example, the general transcription factors and Mediator. In their place another set of factors is recruited. Some of these (such as TFIIS and hSPT5) are elongation factors; that is, factors that stimulate elongation. Others are required for RNA processing. The enzymes involved in all these processes are, like several of the initiation factors we have discussed, recruited to the C-terminal tail of the large subunit of Pol II, the CTD (Figure 12-18). In
FIGURE 12-18 RNA processing enzymes are recruited by the tail of polymerase. The top part of the figure shows various enzymes involved in RNA processing recruited by the "tail" of polymerase. Different enzymes are recruited depending on the phosphorylation state of the tail. Those enzymes are then transferred to the RNA as they are needed (see next section in text). The bottom part of the figure illustrates a schematic of the tail with the sequence of one copy of the heptapeptide repeat shown (N to C: Y1 S2 P3 T4 S5 P6 S7). The positions of serine residues that get phosphorylated are indicated. Phosphorylation of serine at position 5 is associated with recruitment of capping factors, whereas phosphorylation of serine at position 2 is associated with recruitment of splicing factors.
this case, however, the factors favor the phosphorylated form of the CTD. Thus phosphorylation of the CTD leads to an exchange of initiation factors for those factors required for elongation and RNA processing.
As is evident from the crystal structure of yeast Pol II, the polymerase CTD lies directly adjacent to the channel through which the newly synthesized RNA exits the enzyme. This, together with its length (it can extend some 800 Å from the body of the enzyme), allows the tail to bind several components of the elongation and processing machinery and to deliver them to the emerging RNA.
Various proteins are thought to stimulate elongation by Pol II. One of these, the kinase P-TEFb, is recruited to polymerase by transcriptional activators. Once bound to Pol II, this protein phosphorylates the serine residue at position 2 of the CTD repeats as described earlier. That phosphorylation event correlates with elongation. In addition, P-TEFb phosphorylates and thereby activates another protein, called hSPT5, itself an elongation factor. Lastly, TAT-SF1, yet another elongation factor, is recruited by P-TEFb. Thus, P-TEFb stimulates elongation in three separate ways.
Another factor that does not affect initiation, but stimulates elongation, is TFIIS. This factor stimulates the overall rate of elongation by limiting the length of time polymerase pauses when it encounters sequences that would otherwise tend to slow the enzyme's progress. It is a feature of polymerase that it does not transcribe through all sequences at a constant rate. Rather, it pauses periodically, sometimes for rather long periods, before resuming transcription. In the presence of TFIIS, the length of time polymerase pauses at any given site is reduced. TFIIS has another function: it contributes to proofreading by polymerase.
We saw al the staft of Ihe chapter how polymernses are obJe, Inefficiently, to remove misinoorporated bases using lhe active site of lhe enzyme to perfonu Ihe reverse reacUon to nucJeotide incorporation. In add:ilion, TFns slimulates en inheren1 RNA"e ac.tivity in polymemse (not pur1 of the active site), allowing an alternati ve approach lo remove ntisincorpomted bases through local limitro RNA degradation . Tltis feature is comparable to the hydrolytic editing we described in the bacterial case Slimulated by the Gre factors we discussed there.
Elongating Polymerase Is Associated with a New Set of Protein Factors Required for Various Types of RNA Processing

Once transcribed, eukaryotic RNA has to be processed in various ways before being exported from the nucleus to be translated. These processing events include the following: capping of the 5' end of the RNA; splicing; and polyadenylation of the 3' end of the RNA. The most complicated of these is splicing, the process whereby noncoding introns are removed from RNA to generate the mature mRNA. The mechanisms and regulation of that process and others, such as RNA editing, are the subject of Chapter 13. We consider the other two processes here.

Strikingly, there is an overlap between the proteins involved in elongation and those required for RNA processing. In one case, for example, one elongation factor mentioned above (hSPT5) also recruits and stimulates the 5' capping enzyme. In another case, elongation factor TAT-SF1 recruits components of the splicing machinery. Thus it seems that elongation, termination of transcription, and RNA processing are interconnected, presumably to ensure their proper coordination.
The first RNA processing event is capping. This involves the addition of a modified guanine base to the 5' end of the RNA. Specifically, it is a methylated guanine, and it is joined to the RNA transcript by an unusual 5'-5' linkage involving three phosphates (see bottom of Figure 12-19). The 5' cap is created in three enzymatic steps, as detailed in the figure and legend. In the first step, a phosphate group is removed from the 5' end of the transcript. Then, a guanine nucleotide, donated by GTP, is added. And in the final step, that nucleotide is modified by the addition of a methyl group. The RNA is capped when it is still only some 20-40 nucleotides long, when the transcription cycle has progressed only to the transition between the initiation and elongation phases. After capping, dephosphorylation of Ser5 within the tail repeats leads to dissociation of the capping machinery, and further phosphorylation (this time of Ser2 within the tail repeats) causes recruitment of the machinery needed for RNA splicing (see Figure 12-18).

The final RNA processing event, polyadenylation of the 3' end of the mRNA, is intimately linked with the termination of transcription (Figure 12-20). Just as with capping and splicing, the polymerase CTD tail is involved in recruiting the enzymes necessary for polyadenylation (Figure 12-18). Once polymerase has reached the end of a gene, it encounters specific sequences that, after being transcribed into RNA, trigger the transfer of the polyadenylation enzymes to that RNA, leading to three events: cleavage of the message; addition of many adenine residues to its 3' end; and, subsequently, termination of transcription by polymerase.

This process works as follows. Two protein complexes are carried by the CTD of polymerase as it approaches the end of the gene: CPSF (cleavage and polyadenylation specificity factor) and CstF (cleavage stimulation factor). The sequences which, once transcribed into RNA, trigger transfer of these factors to the RNA are called poly-A signals and are shown in Figure 12-20. Once CPSF and CstF are bound to the RNA, other proteins are recruited as well, leading initially to RNA cleavage and then polyadenylation.

FIGURE 12-19 The structure and formation of the 5' RNA cap. In the first step, the γ-phosphate at the 5' end of the RNA is removed by an enzyme called RNA triphosphatase (the initiating nucleotide of a transcript initially retains its α-, β-, and γ-phosphates). In the next step, the enzyme guanylyl transferase catalyzes the nucleophilic attack of the resulting terminal β-phosphate on the α-phosphoryl group of a molecule of GTP, with the β- and γ-phosphates of the GTP serving as a pyrophosphate leaving group. Once this linkage is made, the newly added guanine and the purine at the original 5' end of the mRNA are further modified by the addition of methyl groups by methyl transferase. The resulting 5' cap structure later recruits the ribosome to the mRNA for translation to begin (see Chapter 14).

FIGURE 12-20 Polyadenylation and termination. The various steps in this process are described in the text.
Polyadenylation is mediated by an enzyme called poly-A polymerase, which adds about 200 adenines to the RNA's 3' end produced by the cleavage. This enzyme uses ATP as a precursor and adds the nucleotides using the same chemistry as RNA polymerase. But it does so without a template. Thus, the long tail of As is found in the RNA but not the DNA. It is not clear what determines the length of the poly-A tail, but that process involves other proteins that bind specifically to the poly-A sequence. The mature mRNA is then transported from the nucleus, as we shall discuss in Chapter 13. It is noteworthy that the long tail of As is unique to transcripts made by Pol II, a feature that allows experimental isolation of protein coding mRNAs by affinity chromatography.

Thus, we see how a mature mRNA is released from polymerase once the gene has been transcribed. But what terminates transcription by polymerase? In fact, the enzyme does not terminate immediately when the RNA is cleaved and polyadenylated. Rather, it continues to move along the template, generating a second RNA molecule that can become as long as several hundred nucleotides before terminating. The polymerase then dissociates from the template, releasing the new RNA, which is degraded without ever leaving the nucleus.

It is not understood what links polyadenylation to termination, but it is clear that the polyadenylation signal is required for termination (interestingly, RNA cleavage is not). Two basic models have been proposed to explain the link between polyadenylation and termination. The first is that the transfer of 3' processing enzymes from the polymerase CTD tail to the RNA triggers a conformational change in the polymerase that reduces processivity of the enzyme, leading to spontaneous termination soon afterward. The second model proposes that the absence of a 5' cap on the second RNA molecule is sensed by the polymerase, which, as a result, recognizes the transcript as improper and terminates. The absence of the cap, of course, reflects the absence of the capping enzymes on the CTD at this stage of the transcription cycle: recall that those enzymes are loaded onto the CTD at the point where initiation turns to elongation and are then displaced in favor of the splicing machinery.
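The cleavage and tailing steps described above can be pictured with a small string-manipulation sketch in Python. This is only a toy model under stated assumptions: the function name, the `cleavage_index` parameter, and the example transcript are hypothetical, and the default tail length of 200 simply echoes the approximate figure given in the text.

```python
def cleave_and_polyadenylate(transcript, cleavage_index, tail_length=200):
    """Toy model of 3' end processing (all names here are illustrative).

    Returns a tuple: (mRNA carrying a poly-A tail, downstream RNA fragment).
    """
    upstream = transcript[:cleavage_index]     # becomes the body of the mRNA
    downstream = transcript[cleavage_index:]   # part of the second RNA, later degraded
    # The tail is added without a template, so these As have no DNA counterpart.
    return upstream + "A" * tail_length, downstream


# Hypothetical transcript; the cleavage position is chosen arbitrarily.
mrna, leftover = cleave_and_polyadenylate("GGCUAGCUUCAGGCAUCG", cleavage_index=12)
print(len(mrna), leftover)
```

The sketch makes one point from the text concrete: the As appear only in the RNA product, because they are appended after cleavage rather than copied from the DNA.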
RNA Polymerases I and III Recognize Distinct Promoters, Using Distinct Sets of Transcription Factors, but Still Require TBP

We have already mentioned that eukaryotes have two other polymerases, Pol I and Pol III, in addition to Pol II. These enzymes are related to Pol II and even share several subunits (Table 12-2), but they initiate transcription from distinct promoters and transcribe distinct genes. These genes encode specialized RNAs, rather than proteins, as we discussed earlier in the chapter. Each of these enzymes also works with its own unique set of general transcription factors. TBP, however, is universal, because it is involved in initiating transcription by Pol I and Pol III, as well as Pol II.

Pol I is required for the expression of only one gene, that encoding the rRNA precursor. There are many copies of that gene in each cell, and indeed it is expressed at far higher levels than any other gene, perhaps explaining why it has its own dedicated polymerase. The promoter for the rRNA gene comprises two parts: the core element and the UCE (upstream control element), as shown in Figure 12-21. The former is located around the start site of transcription, the latter between 100 and 150 bp upstream (in humans). In addition to Pol I, initiation requires two other factors, called SL1 and UBF. SL1 comprises TBP and three TAFs specific for Pol I transcription. This complex binds to the downstream half of UCE (called site A). SL1 binds DNA only in the presence of UBF. That factor binds to the upstream half of UCE (called site B), bringing in SL1 and stimulating transcription from the core promoter by recruiting Pol I.

Pol III promoters come in various forms, and the vast majority have the unusual feature of being located downstream of the transcription start site. Some Pol III promoters (for example, those for the tRNA genes) consist of two regions, called Box A and Box B, separated by a short element (Figure 12-22); others contain Box A and Box C (for example, the 5S rRNA gene); and still others contain a TATA element like those of Pol II.

Just as with Pol II and Pol I, transcription by Pol III requires transcription factors in addition to polymerase. In this case, the factors are called TFIIIB and TFIIIC (for the tRNA genes), and those plus TFIIIA for the 5S rRNA gene. Figure 12-22 shows the tRNA promoter. Here, the TFIIIC complex binds to the promoter region. This complex recruits TFIIIB to the DNA just upstream of the start site, where it in turn recruits Pol III to the start site of transcription. The enzyme then initiates, presumably displacing TFIIIC from the DNA template as it goes. As with the other two classes of polymerase, Pol III uses TBP. In this case, that ubiquitous factor is found within the TFIIIB complex.

FIGURE 12-21 Pol I promoter region. (a) Structure of the Pol I promoter. (b) Pol I transcription factors. The case shown here is the vertebrate system. The set of proteins involved in helping Pol I transcription in yeast is rather different.

FIGURE 12-22 Pol III core promoter. Shown here is the promoter for a yeast tRNA gene. The order of events leading to transcription initiation is described in the text.
SUMMARY

Gene expression is the process by which the information in the DNA double helix is converted into the RNAs and proteins whose activities bestow upon a cell its morphology and functions. Transcription is the first step in gene expression and involves copying DNA into RNA. This process, catalyzed by the enzyme RNA polymerase, is in many ways similar to the process of DNA replication discussed in Chapter 8. In both cases, a new chain of nucleotides is synthesized upon a DNA template; and both DNA and RNA synthesis proceed in a 5' to 3' direction (that is, the enzyme adds each successive nucleotide to the 3' end of the growing chain). But there are several critical differences between these two processes, some mechanistic, others reflecting the different roles they serve. For example, in DNA replication the entire genome is duplicated once and only once each cell division. In transcription, only some regions of the genome are transcribed, and the regions chosen vary in different cells or in the same cell at different times. Different regions can be transcribed to different extents; that is, anything from one to several thousand transcripts can be made of a given region in a single cell.

Mechanistic differences between transcription and replication include the following: the nucleotides used to build a new DNA chain are deoxyribonucleotides, whereas in transcription they are ribonucleotides. Also, whereas DNA polymerase can only elongate existing polynucleotide chains, and thus requires a primer, RNA polymerase can initiate RNA synthesis de novo.

RNA polymerases from bacteria to humans are highly conserved. Eukaryotes have three different polymerases each; bacteria have just one. The three eukaryotic enzymes are called RNA Pol I, II, and III. In this chapter we focused primarily on Pol II, as this is the enzyme that transcribes the vast majority of genes in the cell and all the protein coding genes. The basic enzyme from E. coli, called the core enzyme, has one copy of each of three subunits (β, β', and ω) and two copies of α. All these subunits have homologues in the eukaryotic enzymes. The structures of the bacterial and yeast Pol II enzymes are also similar. Both resemble a crab claw in shape, the pincers being made up of the largest subunits, β and β' in the case of the bacterial enzyme. The active site is at the base of the pincers, and access to and from the active site is afforded through five channels: one allows double-stranded DNA to enter between the pincers at the front of the enzyme; two others allow the two single strands (the template and non-template strands) to leave the enzyme behind the active site; another channel provides the route by which NTPs enter the active site; and the RNA product, which peels off the DNA template a short distance behind the site of polymerization, exits the enzyme through the fifth channel. Pol II differs from the bacterial enzyme in one important way. The former has a so-called "tail" at the C-terminal end of the large subunit, and this is absent from the bacterial enzyme. This tail is made up of multiple repeats of a heptapeptide sequence.

A round of transcription proceeds through three phases called initiation, elongation, and termination. Though RNA polymerases can synthesize RNA unaided, other proteins, called initiation factors, are required for accurate and efficient initiation. These factors ensure that the enzyme initiates transcription only from appropriate sites on the DNA, called promoters. In bacteria there is only one initiation factor, σ, whereas in eukaryotes there are several, collectively called the general transcription factors. In eukaryotes, the DNA is wrapped within nucleosomes and, in vivo, efficient initiation very often requires additional proteins, including the Mediator Complex and nucleosome modifying enzymes. Transcriptional activator proteins are also needed (see Chapter 17).

During initiation, RNA polymerase (together with the initiation factors) binds to the promoter in a closed complex. In that state the DNA remains in a double-stranded form. This closed complex then undergoes isomerization to the open complex. In that form, the DNA around the transcription start site is unwound, disrupting the base pairs and forming a bubble of single-stranded DNA. This transition allows access to the template strand, which determines the order of bases in the new RNA strand. This phase of initiation is followed by promoter escape: once the enzyme has synthesized a series of short RNAs, in a process called abortive initiation, it manages to make a transcript that grows beyond 10 nucleotides. At this point the enzyme leaves the promoter and enters the elongation phase. During this phase, polymerase moves along the gene while the enzyme performs several functions: it opens the DNA downstream and reseals it upstream of (behind) the active site; it adds ribonucleotides to the 3' end of the growing transcript; it peels the newly formed RNA off the template some 8 or 9 base pairs behind the point of polymerization; and it also proofreads the transcript, checking for (and replacing) incorrectly inserted nucleotides.

Transcription in both bacteria and eukaryotes follows these same steps. There are differences in the two cases, however. For example, in bacteria, isomerization to the open complex occurs spontaneously and does not require ATP hydrolysis. In eukaryotes this step does require ATP hydrolysis. More strikingly, in eukaryotes, promoter escape is regulated by the phosphorylation state of the CTD tail. Thus, the form of Pol II that binds the promoter in the pre-initiation complex has an unphosphorylated CTD. This domain becomes phosphorylated by one or more kinases, including one that is part of one of the general transcription factors, TFIIH.

Termination also works differently in bacteria and eukaryotes. Thus, in bacteria there are two kinds of terminators: intrinsic (Rho-independent) and Rho-dependent. Intrinsic terminators consist of two sequence elements that operate once transcribed into RNA. One element is an inverted repeat that forms a stem loop in the RNA, disrupting the elongating polymerase. In combination with a string of U nucleotides (which bond only weakly with the template strand), this leads to release of the transcript.
Rho-dependent terminators require the ATPase Rho, a protein that hops on elongating transcripts and "pulls" them from the enzyme. In eukaryotes, termination is closely linked to an RNA processing event called 3' polyadenylation.

Once phosphorylated, the CTD tail of Pol II frees itself from the other proteins at the promoter, releasing polymerase into the elongation phase. The CTD then binds factors involved in transcriptional elongation and RNA processing. Thus, there is an exchange of initiation for elongation and processing factors as the polymerase moves away from the promoter and starts transcribing the gene. There are also interactions between the elongation factors and those involved in processing, ensuring proper coordination of these events. In this chapter we considered capping of the 5' end of the RNA transcripts, polyadenylation of the 3' end, and the link between the last of these and transcriptional termination. Splicing is described in the next chapter.
BIBLIOGRAPHY Books
Transcription lnitiation
Cold Spring Harbar Symposio 00 Qllantitotive Biology. 1998. Volume 63 : Mechanisms of Transcription. Cold Spring Harbor Laboratory Press. Cold Spring Harbor, Ncw York. Ptashne M. and Gann A. 200 2. Cenes Olld sisoals. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York . White RJ . 2001. Gene Il'onscription : Mechanisms ond control. BlackweJl Science. Malden. Connecticut.
Malik S. and Roeder RG. 2000. Transcriplional regulation through medialor-Iike coactivators in yeas! and meta· zoan cells. Tronds Biochem. Sci. 25: 277-263. Myers L.e. and Kornberg R.n. 2000. Mediator oftrans(.Tiptional regulatíon. Annu. Rev. Biochem. 69: 729-749. Woychik N.A. and Hampsey M. 2002. The RNA poly· mCTiJSC 11 machincry: Struclure iIIuminates functian .
RNA Polyrnerase Borukhov S. and Nucller E. 2003. RNA polymerase holoenzyme: Structure. funetion and biological implications. Cllrr. Opin oMicrobio/. 6: 93- 100. Darsl S.A. 2001. Ba(;lerial RNA polymerase. Cw·r. Opill. Struct. Biol.U: 155 - 162. Ebright R.H. 2000. RNA polymerase: Structurnl similarities hctwccn bacterial RNA polymerasc Bnd eukaryotic RNA polymcrasc U.f. Mol. 8io/. 304: 687-698. Murakami K.S. and Darsl S.A. 2003. Bacterial RNA poly* merases: The whole story. Curro Opin o Struct. Bio/. 13: 31-39.
Paget M.S. and Helmann }.O. 2003. Tho sigma 70 family of sigma. (adors. Cenome BioL 4: 203.
Promoters Butk'l" I.E. and Kadonaga J.1'. 2002. The RNA pol}rmerase 11 coro promoter: A key componenl in the rogulation of gene cxpression. ~lles Dev. 16: 2563-2592.
CelJ 108: 453-463.
Young B.A. , Gruber T.M., and Gross C.A. 2002. Views of transcription initilltion. CelJ 109: 417 -420.
Elongation and RNA Processing Howe K.J. 2002. RNA polymerase 1I conducfs a symphon)' of pre-mRNA processing activities. Biochim. Biophys. ACla 1577: 308-324. Maniatis T. aad Reed R 2002. An extensive nehvork. of c.:oupling among geno expression machines. Na tl1ro 416: 499-506.
Termination Richardson J.P. 2002 . Rho-depeodent termination al;ld ATPases in transcript tenninalion Biochim. Biophys. Acta 1577: 25 1 -260. - -- 2003. Loading Rho lo terminate trans(.Tip!ion . CeJ/ 114: 157 - 159.
CHAPTER 13
RNA Splicing
The coding sequence of a gene is a series of three-nucleotide codons that specify the linear sequence of amino acids in its polypeptide product. Thus far we have tacitly assumed that the coding sequence is contiguous: the codon for one amino acid is immediately adjacent to the codon for the next amino acid in the polypeptide chain. This is true in the vast majority of cases in bacteria and their phage. But it is not always so for eukaryotic genes. In those cases, the coding sequence is periodically interrupted by stretches of noncoding sequence.

Thus many eukaryotic genes are mosaics, consisting of blocks of coding sequences separated from each other by blocks of noncoding sequences. The coding sequences are called exons and the intervening sequences are called introns. As a consequence of this alternating pattern of exons and introns, genes bearing noncoding interruptions are often said to be "in pieces" or "split."

Figure 13-1 shows a typical eukaryotic gene in which the coding region is interrupted by three introns, splitting it into four exons. The number of introns found within a gene varies enormously: from one in the case of most intron-containing yeast genes (and a few human genes), to 50 in the case of the chicken proα2 collagen gene, to as many as 363 in the case of the Titin gene of humans. Also, the sizes of the exons and introns vary. Indeed, introns are very often much longer than the exons they separate. Thus, for example, exons are typically on the order of 150 nucleotides, whereas introns, though they too can be short, can be as long as 800,000 nucleotides (800 kb). As another example, the mammalian gene for the enzyme dihydrofolate reductase is more than 31 kb long, and within it are dispersed six exons that correspond to 2 kb of mRNA. Thus, in this case, the coding portion of the gene is less than 10% of its total length.

OUTLINE
• The Chemistry of RNA Splicing
• The Spliceosome Machinery
• Splicing Pathways
• Alternative Splicing
• Exon Shuffling
• RNA Editing
• mRNA Transport

FIGURE 13-1 Typical eukaryotic gene. The depicted gene contains four exons separated by three introns. Transcription from the promoter generates a pre-mRNA, shown in the middle line, that contains all the exons and introns. Splicing removes the introns and fuses the exons to generate the mature mRNA that, once processed further (see polyadenylation, Chapter 12) and exported from the nucleus, can be translated to give a protein product.
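As a quick check on the dihydrofolate reductase figures quoted above, the fraction of the gene represented in the mRNA can be worked out directly. The helper name `coding_fraction` is only an illustrative choice, and the gene length is taken as exactly 31 kb even though the text says "more than 31 kb", so the result is an upper bound.

```python
def coding_fraction(mrna_kb, gene_kb):
    """Fraction of a gene's length that is represented in the mature mRNA."""
    return mrna_kb / gene_kb


# Figures quoted in the text for the mammalian dihydrofolate reductase gene.
dhfr_fraction = coding_fraction(mrna_kb=2.0, gene_kb=31.0)
print(f"At most {dhfr_fraction:.1%} of the gene appears in the mRNA")
```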
Like the uninterrupted genes of prokaryotes, the split genes of eukaryotes are transcribed into a single RNA copy of the entire gene. Thus, the primary transcript for a typical eukaryotic gene contains introns as well as exons. This is shown in the middle part of Figure 13-1. Because of the length and number of introns, the primary transcript (or pre-mRNA) can be very long indeed. In the extreme case of the human dystrophin gene, RNA polymerase must traverse 2,400 kb of DNA to copy the entire gene into RNA. (Given that transcription proceeds at a rate of 40 nucleotides per second, it can readily be seen that it takes a staggering 17 hours to make a single transcript of this gene!)

Despite this seemingly odd gene organization, the protein-synthesizing machinery of the cell (Chapter 14) is equipped only to translate messenger RNAs containing a contiguous stretch of codons; it has no way of identifying and skipping over a block of noncoding sequence. And so the primary transcripts of split genes must have their introns removed before they can be translated into protein.

Introns are removed from the pre-mRNA by a process called RNA splicing. This process converts the pre-mRNA into mature messenger RNA and must occur with great precision to avoid the loss, or addition, of even a single nucleotide at the sites at which the exons are joined. As we shall see in Chapters 14 and 15, the triplet-nucleotide codons of mRNA are translated in a fixed reading frame that is set by the first codon in the protein-coding sequence. Lack of precision in splicing (if, for example, a base were lost or gained at the boundary between two exons) would throw the reading frames of exons out of register, and downstream codons would be incorrectly selected and the wrong amino acids incorporated into proteins.

Some pre-mRNAs can be spliced in more than one way, generating alternative mRNAs. So, for example, different combinations of introns might be removed. This is called alternative splicing, and, by this strategy, a gene can give rise to more than one polypeptide product. It is estimated that 60% of the genes in the human genome are spliced in alternative ways to generate more than one protein per gene. The number of different variants a given gene can encode in this way varies from two to hundreds or even thousands. For example, the Slo gene from rat, which encodes a potassium channel expressed in neurons, has the potential to encode 500 alternative versions of that product. And, as we shall see, there is a Drosophila gene that can encode as many as 38,000 possible products as a result of alternative splicing!

In this chapter we discuss not only the mechanisms and regulation of RNA splicing, but also ideas about why eukaryotic genes have interrupted coding regions. We also describe RNA editing, another way initial transcripts can be altered to change what they encode.
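Two quantitative points made in this introduction, the time needed to transcribe the dystrophin gene and the effect of imprecise splicing on the reading frame, can be checked with a short Python sketch. The function names and the two short exon sequences are hypothetical and chosen only for illustration; the gene length and transcription rate are the figures given in the text.

```python
def transcription_hours(gene_length_nt, rate_nt_per_s):
    """Time to transcribe a gene once, in hours."""
    return gene_length_nt / rate_nt_per_s / 3600


def codons(seq):
    """Split a sequence into complete triplets, starting at its first nucleotide."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Dystrophin: 2,400 kb of DNA transcribed at 40 nucleotides per second.
hours = transcription_hours(2_400_000, 40)
print(f"about {hours:.1f} hours")

# Hypothetical exons, joined precisely and then with one base lost at the junction.
exon1, exon2 = "AUGGCU", "GAAUUCAAG"
precise = codons(exon1 + exon2)
imprecise = codons(exon1 + exon2[1:])
print(precise)
print(imprecise)
```

In the imprecise case every codon downstream of the junction changes, which is the sense in which the exons are thrown "out of register".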
THE CHEMISTRY OF RNA SPLICING

Sequences within the RNA Determine Where Splicing Occurs

We now consider the molecular mechanisms of the splicing reaction. How are the introns and exons distinguished from each other? How are introns removed? How are exons joined with high precision? The borders between introns and exons are marked by specific nucleotide sequences within the pre-mRNAs. These sequences delineate where splicing will occur. Thus, as shown in Figure 13-2, the exon-intron boundary (that is, the boundary at the 5' end of the intron) is marked by a sequence called the 5' splice site. The intron-exon boundary at the 3' end of the intron is marked by the 3' splice site. (The 5' and 3' splice sites were sometimes referred to as the donor and acceptor sites, respectively, but this nomenclature is rarely used today.)

The figure shows a third sequence necessary for splicing. This is called the branch point site (or branch point sequence). It is found entirely within the intron, usually close to its 3' end, and is followed by a polypyrimidine tract (Py tract), as shown.

The consensus sequence for each of these elements is shown in Figure 13-2. The most highly conserved sequences are the GU in the 5' splice site, the AG in the 3' splice site, and the A at the branch site. These highly conserved nucleotides are all found within the intron itself, perhaps not surprisingly, as the sequence of the exons, in contrast to the introns, is constrained by the need to encode the specific amino acids of the protein product.
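The conserved features just listed (GU at the 5' end of the intron, AG at its 3' end, and a branch-site A followed by a pyrimidine-rich tract) can be illustrated with a small checker. This is a deliberately crude sketch: the function name, the example intron, and the heuristic of treating the last A before the terminal AG as the branch point are assumptions made for illustration, not a description of how splice sites are actually recognized.

```python
PYRIMIDINES = set("CU")


def inspect_intron(intron):
    """Report the conserved intron features named in the text (toy heuristic)."""
    intron = intron.upper()
    body_end = len(intron) - 2                   # index where the terminal dinucleotide starts
    # Assumption: take the last A upstream of the terminal AG as the branch candidate.
    branch_a = intron.rfind("A", 2, body_end)
    tract = intron[branch_a + 1:body_end] if branch_a != -1 else ""
    py_fraction = sum(b in PYRIMIDINES for b in tract) / len(tract) if tract else 0.0
    return {
        "five_prime_GU": intron.startswith("GU"),
        "three_prime_AG": intron.endswith("AG"),
        "branch_A_index": branch_a,
        "py_fraction": py_fraction,
    }


# Hypothetical 30-nucleotide intron written for this example.
report = inspect_intron("GUAAGUCCUUACUAACUUUUUCCCUUUCAG")
print(report)
```

Real introns are far longer, and the branch site is chosen by the splicing machinery rather than by position alone, so the sketch only shows where the conserved nucleotides sit relative to one another.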
The Intron Is Removed in a Form Called a Lariat as the Flanking Exons Are Joined

Let us begin by considering the chemistry of splicing, which is achieved by two successive transesterification reactions in which phosphodiester linkages within the pre-mRNA are broken and new ones are formed (Figure 13-3). The first reaction is triggered by the 2'OH of the conserved A at the branch site. This group acts as a nucleophile to attack the phosphoryl group of the conserved G in the 5' splice site. (This is an SN2 reaction that proceeds through a pentavalent phosphorus intermediate.) As a consequence, the phosphodiester bond between the sugar and the phosphate at the junction between the intron and the exon is cleaved, and the freed 5' end of the intron is joined to the A within the branch site. Thus, in addition to the 5' and 3' backbone linkages, a third phosphodiester extends from the 2'OH of that A to create a three-way junction (hence its description as a branch point). The structure of the three-way junction is shown in Figure 13-4.

Notice that the 5' exon is a leaving group in the first transesterification reaction. In the second reaction, the 5' exon (more precisely, the newly liberated 3'OH of the 5' exon) reverses its role and becomes a nucleophile that attacks the phosphoryl group at the 3' splice site (Figure 13-3). This second reaction has two consequences. First, and most importantly, it joins the 5' and 3' exons; thus, this is the step in which the two coding sequences are actually "spliced" together. Second, this same reaction liberates the intron, which serves as a leaving group. Because the 5' end of the intron had been joined to the branch point A in the first transesterification reaction, the newly liberated intron has the shape of a lariat.

In the two reaction steps, there is no net gain in the number of chemical bonds: two phosphodiester bonds are broken, and two new ones made. As it is just a question of shuffling bonds, no energy input is demanded by the chemistry of this process. But, as we shall see below, a large amount of ATP is consumed during the splicing reaction. This energy is required, not for the chemistry, but to properly assemble and operate the splicing machinery.

Another point about the splicing reaction is direction: what ensures that splicing only goes forward, that is, toward the products shown in Figure 13-3? Two features that could contribute to this are as follows. First, the forward reaction involves an increase in entropy: a single pre-mRNA molecule is split into two molecules, the mRNA and the liberated lariat. Second, the excised intron is rapidly degraded after its removal and so is not available to partake in the reverse reaction.

FIGURE 13-2 Sequences at the intron-exon boundary. Shown in the figure are the consensus sequences for both the 5' and 3' splice sites, and also the conserved A at the branch site. As in other cases of consensus sequences, where two alternative bases are similarly favored, those bases are both indicated at that position. In this figure, the consensus sequences shown are for humans. This is true for all other figures, unless otherwise stated.

FIGURE 13-3 The splicing reaction. Shown are the two steps of the splicing reaction described in the text. In the first step, the RNA forms a loop structure, which is shown in detail in the next figure.

FIGURE 13-4 The structure of the three-way junction formed during the splicing reaction.
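The bond bookkeeping in the two transesterification steps can be followed in a short Python sketch. This is only an accounting model under stated assumptions: the class and function names, the example sequences, and the choice of branch position are hypothetical, and nothing about the real chemistry or the spliceosome is simulated.

```python
from dataclasses import dataclass


@dataclass
class SpliceProducts:
    mrna: str            # 5' exon joined to 3' exon
    lariat: str          # sequence of the excised intron
    branch_link: tuple   # (branch A index, index of the intron's 5' end it is linked to)
    bonds_broken: int
    bonds_formed: int


def splice(exon5, intron, exon3, branch_index):
    """Bookkeeping model of the two transesterification steps (illustrative only)."""
    if intron[branch_index] != "A":
        raise ValueError("branch point must be an A")
    broken = formed = 0

    # Step 1: the 2'OH of the branch A attacks the 5' splice site.
    broken += 1                      # bond between the 5' exon and the intron
    formed += 1                      # 2'-5' bond from the branch A to the intron's first G
    branch_link = (branch_index, 0)

    # Step 2: the freed 3'OH of the 5' exon attacks the 3' splice site.
    broken += 1                      # bond between the intron and the 3' exon
    formed += 1                      # bond joining the two exons
    return SpliceProducts(exon5 + exon3, intron, branch_link, broken, formed)


products = splice("AUGGCU", "GUAAGUCCUUACUAACUUUUUCCCUUUCAG", "GAAUUC", branch_index=14)
print(products.mrna, products.bonds_broken, products.bonds_formed)
```

The equal counts of bonds broken and formed reflect the point made in the text: the chemistry itself needs no energy input, so the ATP consumed must serve the assembly and operation of the machinery.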
Exons from Different RNA Molecules Can Be Fused by Trans-Splicing

In our description of splicing above, we assumed that the 5' splice site of one exon is joined to the 3' splice site of the exon that immediately follows it. This is not always the case. In alternative splicing, exons can be skipped, and a given exon is joined to one further downstream (as we see later in the text). In some cases, two exons carried on different RNA molecules can be spliced together in a process called trans-splicing. Although generally rare, trans-splicing occurs in almost all the mRNAs of trypanosomes. In the nematode worm (C. elegans), all mRNAs undergo trans-splicing (to attach a 5' leader sequence), and many of them undergo cis-splicing as well. Figure 13-5 shows how the basic splicing reaction just described is adapted to carry out trans-splicing.
THE SPLICEOSOME MACHINERY

RNA Splicing Is Carried Out by a Large Complex Called the Spliceosome

The transesterification reactions just described are mediated by a huge molecular "machine" called the spliceosome. This complex comprises about 150 proteins and 5 RNAs and is similar in size to a ribosome (Chapter 14). In carrying out even a single splicing reaction, the spliceosome hydrolyzes several molecules of ATP. Strikingly, it is believed that many of the functions of the spliceosome are carried out by its RNA components rather than the proteins, again reminiscent of the ribosome. Thus, RNAs locate the sequence elements at the intron-exon borders and likely participate in catalysis of the splicing reaction itself.

The five RNAs (U1, U2, U4, U5, and U6) are collectively called small nuclear RNAs (snRNAs). Each of these RNAs is between 100 and 300 nucleotides long and is complexed with several proteins. These RNA-protein complexes are called small nuclear ribonuclear proteins (snRNPs, pronounced "snurps"). The spliceosome is the large complex made up of these snRNPs, but the exact makeup differs at different stages of the splicing reaction: different snRNPs come and go at different times, each carrying out particular functions in the reaction. There are also many proteins within the spliceosome that are not part of the snRNPs, and others besides that are only loosely bound to the spliceosome.
FIGURE 13-5 Trans-splicing. In trans-splicing, two exons, initially found in two separate RNA molecules, are spliced together into a single mRNA. The chemistry of this reaction is the same as that of the standard splicing reaction described previously, and the spliced product is indistinguishable. The only difference is that the other product, the lariat in the standard reaction, is, in trans-splicing, a Y-shaped branch structure instead. This is because the initial reaction brings together two RNA molecules rather than forming a loop within a single molecule.
The snRNPs have three roles in splicing. They recognize the 5' splice site and the branch site; they bring those sites together as required; and they catalyze (or help to catalyze) the RNA cleavage and joining reactions. To perform these functions, RNA-RNA, RNA-protein, and protein-protein interactions are all important. We start by considering some of the RNA-RNA interactions. These operate within individual snRNPs, between different snRNPs, and between snRNPs and the pre-mRNA.

Thus, for example, Figure 13-6a shows the interaction, through complementary base-pairing, of the U1 snRNA and the 5' splice site in the pre-mRNA. Later in the reaction, that splice site is recognized by the U6 snRNA. In another example, shown in Figure 13-6b, the branch site is recognized by the U2 snRNA. A third example, in Figure 13-6c, shows an interaction between U2 and U6 snRNAs. This brings the 5' splice site and the branch site together. It is these and other similar interactions, and the rearrangements they lead to, that drive the splicing reaction and contribute to its precision, as we will see a little later.

Some RNA-free proteins are involved in splicing, as mentioned above. One example, U2AF (U2 auxiliary factor), recognizes the polypyrimidine (Py) tract/3' splice site and, in the initial step of the splicing reaction, helps another protein, branch-point binding protein (BBP), bind to the branch site. BBP is then displaced by the U2 snRNP, as shown in Figure 13-6d. Other proteins involved in the splicing reaction include RNA-annealing factors, which help load snRNPs onto the mRNA, and DEAD-box helicase proteins. The latter use their ATPase activity to dissociate given RNA-RNA interactions, allowing alternative pairs to form and thereby driving the rearrangements that occur through the splicing reaction.

Finally, before turning to the spliceosome-mediated splicing pathway itself, we look at one further interaction. Figure 13-7 shows the crystal structure of a section of the U1 snRNA bound to one of the proteins of the U1 snRNP.
FIGURE 13-6 Some RNA-RNA hybrids formed during the splicing reaction. In some cases, (a) different snRNPs recognize the same (or overlapping) sequences in the pre-mRNA at different stages of the splicing reaction, as shown here for U1 and U6 recognizing the 5' splice site. In (b) snRNP U2 is shown recognizing the branch site. In (c) the RNA:RNA pairing between the snRNPs U2 and U6 is shown. Finally, in (d), the same sequence within the pre-mRNA is recognized by a protein (not part of an snRNP) at one stage and displaced by an snRNP at another. Each of these changes accompanies the arrival or departure of components of the spliceosome and a structural rearrangement that is required for the splicing reaction to proceed.
FIGURE 13-7 Structure of spliceosomal protein-RNA complex: U1A binds hairpin II of U1 snRNA. (Oubridge C., Ito N., Evans P.R., Teo C.H., and Nagai K. 1994. Nature 372: 432.) Image prepared with MolScript, BobScript, and Raster3D.
SPLICING PATHWAYS

Assembly, Rearrangements, and Catalysis Within the Spliceosome: the Splicing Pathway

The steps of the splicing pathway are shown in Figure 13-8. Initially, the 5' splice site is recognized by the U1 snRNP (using base pairing between its snRNA and the pre-mRNA, shown in Figure 13-6). One subunit of U2AF binds to the Py tract and the other to the 3' splice site. The former subunit interacts with BBP and helps that protein bind to the branch site. This arrangement of proteins and RNA is called the Early (E) complex.

U2 snRNP then binds to the branch site, aided by U2AF and displacing BBP. This arrangement is called the A complex. The base-pairing between the U2 snRNA and the branch site is such that the branch site A residue is extruded from the resulting stretch of double-helical RNA as a single nucleotide bulge, as shown in Figure 13-6b. This A residue is thus unpaired and available to react with the 5' splice site.

The next step is a rearrangement of the A complex to bring together all three splice sites. This is achieved as follows: the U4 and U6 snRNPs, along with the U5 snRNP, join the complex. Together these three snRNPs are called the tri-snRNP particle, within which the U4 and U6 snRNPs are held together by complementary base-pairing between their RNA components, and the U5 snRNP is more loosely associated through protein:protein interactions. With the entry of the tri-snRNP, the A complex is converted into the B complex.
FIGURE 13-8 Steps of the spliceosome-mediated splicing reaction. The assembly and action of the spliceosome are shown, and the details of each step are described in the text.
In the next step, U1 leaves the complex, and U6 replaces it at the 5' splice site. This requires that the base-pairing between the U1 snRNA and the pre-mRNA be broken, allowing the U6 RNA to anneal with the same region (in fact, to an overlapping sequence, as shown in Figure 13-6a).

Those steps complete the assembly pathway. The next rearrangement triggers catalysis, and occurs as follows: U4 is released from the complex, allowing U6 to interact with U2 (through the RNA:RNA base-pairing shown in Figure 13-6c). This arrangement, called the C complex, produces the active site. That is, the rearrangement brings together within the spliceosome those components, believed to be solely regions of the U2 and U6 RNAs, that together form the active site. The same rearrangement also ensures the substrate RNA is properly positioned to be acted upon.

It is striking that not only is the active site primarily formed of RNA, but also that it is only formed at this stage of spliceosome assembly. Presumably this strategy lessens the chance of aberrant splicing; linking the formation of the active site to the successful completion of earlier steps in spliceosome assembly makes it highly likely that the active site is available only at legitimate splice sites.

Formation of the active site juxtaposes the 5' splice site of the pre-mRNA and the branch site, facilitating the first transesterification reaction. The second reaction, between the 5' and 3' splice sites, is aided by the U5 snRNP, which helps to bring the two exons together. The final step involves release of the mRNA product and the snRNPs. The snRNPs are initially still bound to the lariat, but get recycled after rapid degradation of that piece of RNA.

It might seem odd that the machinery and mechanism of splicing is so complicated. How did it evolve that way? Would it not have been simpler to fuse the exons in a single reaction, rather than undergo the two reactions just described? To consider this question, we turn to a group of introns that, unlike those we have considered thus far, can splice themselves out of pre-mRNA without the need for the spliceosome. They are called self-splicing introns.
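The comings and goings of the snRNPs during assembly can be summarized in a short sketch. This is only a bookkeeping aid built from the steps named in the text (E, A, B, and C complexes); the step labels, the data layout, and the function name `snrnps_by_stage` are assumptions made for illustration, and non-snRNP proteins such as U2AF and BBP are deliberately left out.

```python
# Each step: (label, snRNPs that join, snRNPs that leave).
ASSEMBLY_STEPS = [
    ("E complex", {"U1"}, set()),              # U1 recognizes the 5' splice site
    ("A complex", {"U2"}, set()),              # U2 binds the branch site
    ("B complex", {"U4", "U5", "U6"}, set()),  # the tri-snRNP joins
    ("U1 released", set(), {"U1"}),            # U6 takes over the 5' splice site
    ("C complex", set(), {"U4"}),              # U6 is freed to pair with U2
]


def snrnps_by_stage(steps=ASSEMBLY_STEPS):
    """Return the set of snRNPs present after each step (hypothetical helper)."""
    present = set()
    stages = {}
    for label, joins, leaves in steps:
        present = (present | joins) - leaves
        stages[label] = frozenset(present)
    return stages


stages = snrnps_by_stage()
for label, snrnps in stages.items():
    print(f"{label}: {', '.join(sorted(snrnps))}")
```

Read this way, the active site (U2 paired with U6) can only exist once U1 and U4 have both left, which is the point made above about catalysis being tied to the completion of earlier assembly steps.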
Self-Splicing Introns Reveal that RNA Can Catalyze RNA Splicing

The three classes of splicing found in cells (not including tRNA processing, which we discuss in Chapter 14) are shown in Table 13-1.

TABLE 13-1 Three Classes of RNA Splicing

| Class | Abundance | Mechanism | Catalytic Machinery |
|---|---|---|---|
| Nuclear pre-mRNA | Very common; used for most eukaryotic genes | Two transesterification reactions; branch site A | Major spliceosome |
| Group II introns | Rare; some eukaryotic genes from organelles and prokaryotes | Same as pre-mRNA | RNA enzyme encoded by intron (ribozyme) |
| Group I introns | Rare; nuclear rRNA in some eukaryotes, organelle genes, and a few prokaryotic genes | Two transesterification reactions; branch site G | Same as group II introns |

Thus far we have dealt only with nuclear pre-mRNA splicing, that mediated by the spliceosome found in all eukaryotes. Also shown in Table 13-1 are
the so-called group I and group II self-splicing introns. By self-splicing, we mean that the intron itself folds into a specific conformation within the precursor RNA and catalyzes the chemistry of its own release (recall that we discussed the general features of RNA enzymes in Chapter 6). In terms of a practical definition, self-splicing means that these introns can remove themselves from RNAs in the test tube in the absence of any proteins or other RNA molecules. The self-splicing introns are grouped into two classes on the basis of their structure and splicing mechanism. Strictly speaking, self-splicing introns are not enzymes (catalysts) because they mediate only one round of RNA processing (as we shall consider later in Box 13-1).

In the case of group II introns, the chemistry of splicing, and the RNA intermediates produced, are the same as for nuclear pre-mRNAs. That is, as shown in Figure 13-9, the intron uses an A residue within the branch site to attack the phosphodiester bond at the boundary between its 5' end and the end of the 5' exon, that is, at the 5' splice site. This reaction produces the branched lariat, as we saw before, and is followed by a second reaction in which the newly freed 3'OH of the exon attacks the 3' splice site, releasing the intron as a lariat and fusing the 3' and 5' exons.
Group I Introns Release a Linear Intron Rather than a Lariat

Group I introns splice by a different pathway (Figure 13-9c). Instead of a branch point A residue, they use a free G nucleotide or nucleoside. This G species is bound by the RNA and its 3'OH group is presented to the 5' splice site. The same type of transesterification reaction that leads to the lariat formation in the earlier examples here fuses the "G" to the 5' end of the intron. The second reaction now proceeds just as it does in the earlier examples: the freed 3' end of the exon attacks the 3' splice site. This fuses the two exons and releases the intron, though in this case the intron is linear rather than a lariat structure.

Group I introns, which are smaller than group II introns, share a conserved secondary structure (RNA folding is discussed in Chapter 6). The structure of group I introns includes a binding pocket that will accommodate any guanine nucleotide or nucleoside as long as it is a ribose form. In addition to the nucleotide-binding pocket, group I introns contain an "internal guide sequence" that base-pairs with the 5' splice site sequence and thereby determines the precise site at which nucleophilic attack by the G nucleotide takes place (see Box 13-1, Converting Group I Introns into Ribozymes).

A typical self-splicing intron is between 400 and 1,000 nucleotides long, and, in contrast to introns removed by spliceosomes, much of the sequence of a self-splicing intron is critical for the splicing reaction. This sequence requirement holds because the intron must fold into a precise structure to perform the reaction chemistry. In addition, in vivo, the intron is complexed with a number of proteins that help stabilize the correct structure, partly by shielding regions of the backbone from each other. Thus, the folding requires certain sections of the RNA backbone to be in close proximity to other sections, and the negative charges provided by the phosphates in those backbone regions would repel each other if not shielded. In vitro, high salt concentrations (and thus positive ions) compensate for the absence of these proteins. This is how we know that the proteins are not needed for the splicing reaction itself.

The similar chemistry seen in self- and spliceosome-mediated splicing is believed to reflect an evolutionary relationship. Perhaps ancestral
Box 13-1 Converting Group I Introns into Ribozymes

Once a group I

As explained earlier in the text, group I (and II) introns are not enzymes because they have a turnover number of only one. But they can be readily converted into enzymes (ribozymes) in the following way (Box 13-1 Figure 1): the relinearized intron described above retains its active site. If we provide it with free G and a substrate that includes a sequence complementary to the internal guide sequence, it will repeatedly catalyze cleavage of substrate molecules. We will have converted a group I intron into a ribozyme, similar to the way that the self-cleaving hammerhead could be converted to a ribozyme by separating the active site from the substrate (Chapter 6). We can go a step further by changing the sequence of the internal guide sequence and thereby generate tailor-made ribonucleases that cleave RNA molecules of our choice.
Box 13-1 Figure 1 Group I introns can be converted into true ribozymes.
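The idea of retargeting a ribozyme by changing its internal guide sequence comes down to base-pair complementarity, which can be sketched in a few lines. The example is a simplification under stated assumptions: it allows only Watson-Crick pairs (no G-U wobble), the guide and substrate sequences are invented, and `find_target_sites` is a hypothetical helper rather than a model of any real intron.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs antiparallel with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(substrate: str, guide: str) -> list[int]:
    """List start positions in `substrate` that can pair with `guide`."""
    target = reverse_complement(guide)
    return [
        i
        for i in range(len(substrate) - len(target) + 1)
        if substrate[i:i + len(target)] == target
    ]


substrate = "AACGGAUCUU"   # invented substrate RNA
guide = "GAUCCG"           # invented internal guide sequence
sites = find_target_sites(substrate, guide)
print(sites)

# Changing the guide redirects the search to a different sequence.
new_guide = reverse_complement("GAUCUU")
new_sites = find_target_sites(substrate, new_guide)
print(new_sites)
```

Swapping in a different guide changes which stretch of the substrate is recognized, which is the sense in which the ribonuclease becomes tailor-made.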
FIGURE 13-9 Group I and group II introns. Panels: (a) pre-mRNA splicing by the spliceosome; (b) group II self-splicing; (c) group I self-splicing.
group II-Iike self-splicing ¡ntrons were the starting point for the evolution of modem pro-mRNA splicing, Tho catalytic function s providcd by lhe RNA were rotained. bul the requirement for extensive sequence specificity wi thin Ihe intron its(11f was rolieved by having lhe soRNAs and theie associaled proleins provid(l masl of those functi ons in transo In tbis way, introns had only to eetai n the minimum of sequence ejemenls required to target splicing lo thc correet placéS. Thus, many múr-c and varied sizes and sequences of ¡nlrons were pernliUed. 11 is intcresting tha! lhe sITucture of th e catalytic region thal perfonns rhe first transesterificationrear.tion is vcry similar in lhe grotlp n ¡nlron and the pre-mRNA/snRNP complex: (FIgure 13 ~10) . Tru s observation fuels rhe broadlle speculutiun (discussed in Chaplee 6) that ead y in the cvolution of modern organisms, many cata lytic functions in Ihe cell were carried out by RNAs and that these func tions have. 00 Ihe ""hole, since becn replaced by proteins. In the case of lhe spliceosome and the ribosome, bowever, Ihese activities have nol been enlieely replaced by proteins. Rather, lhe vestigial RNA-catalyzed rnech anisms remain al the hea.rt of the present complex: machinery~
FIGURE 13-10 Proposed folding of the RNA catalytic regions for splicing of group II introns and pre-mRNAs. The dotted regions of the RNA in the group II case replace an additional four folded domains not shown in this depiction.
How Does the Spliceosome Find the Splice Sites Reliably?

We have already seen one mechanism that guards against inappropriate splicing: the active site of the spliceosome is only formed on RNA sequences that pass the test of being recognized by multiple elements during spliceosome assembly. Thus, for example, the 5' splice site must be recognized initially by the U1 snRNP and then by the U6 snRNP. It is unlikely both would recognize an incorrect sequence, and so selection is stringent. Yet the problem of appropriate splice-site recognition in the pre-mRNA remains formidable.

Consider the following. The average human gene has eight or nine exons and can be spliced in three alternative forms. But there is one human gene with 363 exons and one Drosophila gene that can be spliced in 38,000 alternative ways (Figure 13-11). If the snRNPs had to find the correct 5' and 3' splice sites on a complete RNA molecule and bring them together in the correct pairs, unaided, it seems inevitable that many errors would occur. Remember, also, that the average exon is only some 150 nucleotides long, whereas the average intron is approximately 3,000 nucleotides long (as we have seen, some introns can be as long as 800,000 nucleotides). Thus, the exons must be identified within a vast ocean of intronic sequences.
FIGURE 13-11 The multiple exons of the Drosophila DSCAM gene. This gene was cloned as an axon guidance receptor responsible for directing growth cones to their proper target. The DSCAM gene (shown at the top) is 61.2 kb long; once transcribed and spliced, it produces one or more versions of a 7.8 kb, 24 exon mRNA (the figure shows the generic structure of those mRNAs). As shown, there are several mutually exclusive alternatives for exons 4, 6, 9, and 17. Thus, each mRNA will contain one of 12 possible alternatives for exon 4 (in orange), one of 48 for exon 6 (purple), one of 33 for exon 9 (blue), and one of 2 for exon 17 (red). If all possible combinations of these exons are used, the DSCAM gene produces 38,016 different mRNAs and proteins. (Source: Adapted from Black D. 2000. Protein diversity from alternative splicing. Cell 103: 368. Copyright © 2000. Used with permission from Elsevier.)
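The figure of 38,016 follows directly from multiplying the number of mutually exclusive choices at each variable exon. A minimal check, using the counts given in the caption (the dictionary layout and the function name `isoform_count` are illustrative choices):

```python
from math import prod

# Number of mutually exclusive alternatives at each variable DSCAM exon,
# keyed by exon number, as given in the Figure 13-11 caption.
DSCAM_ALTERNATIVES = {4: 12, 6: 48, 9: 33, 17: 2}


def isoform_count(alternatives: dict[int, int]) -> int:
    """One choice per variable exon, chosen independently: multiply the counts."""
    return prod(alternatives.values())


total = isoform_count(DSCAM_ALTERNATIVES)
print(total)  # 12 * 48 * 33 * 2 = 38016
```

The calculation assumes, as the caption does, that every combination of alternatives is actually used.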
Splice-site recognition is prone to two kinds of errors (Figure 13-12). First, splice sites can be skipped, with components bound at, for example, a given 5' splice site pairing with those at a 3' site beyond the correct one. Second, other sites, close in sequence but not legitimate splice sites, could be mistakenly recognized. This is easy to appreciate when one recalls that the splice site consensus sequences are rather loose. And so, for example, components at a given 5' splice site might pair with components bound incorrectly at such a "pseudo" 3' splice site (see Figure 13-12b).

Two ways in which the accuracy of splice-site selection can be enhanced are as follows. First, as we saw in Chapter 12, while transcribing a gene to produce the RNA, RNA polymerase II carries with it various proteins with roles in RNA processing (see Figure 12-1B).
FIGURE 13-12 Errors produced by mistakes in splice-site selection. (a) Shows the consequence of skipping an exon. This happens if the spliceosome components bound at the 5' splice site of one exon interact with spliceosome components bound at the 3' splice site of not the next exon, but one beyond. (b) Illustrates the effect of spliceosome components recognizing "pseudo" splice sites: sequences that resemble (but are not) legitimate splice sites. In the case shown, the pseudo site is within an exon and leads to regions near the 5' end of that exon being mistakenly spliced out along with the intron.
These include proteins involved in splicing. When a 5' splice site is encountered in the newly synthesized RNA, those components are transferred from the polymerase C-terminal "tail" (that part of the enzyme where they hitch a ride) onto the RNA. Once in place, the 5' splice site components are poised to interact with those that bind to the next 3' splice site to be synthesized. Thus, the correct 3' splice site can be recognized before any competing sites further downstream have been transcribed. This co-transcriptional loading process greatly diminishes the likelihood of exon skipping.

It is worth noting that even though much of the splicing machinery assembles while the gene is being transcribed, and on individual introns in the order they are transcribed, this does not mean the introns are themselves spliced out in that order. Thus, in contrast to many other activities we have heard about (transcription, replication, and so on), there appears to be no "tracking" mechanism involved, whereby the machinery assembles at one end of the gene or message and acts as it tracks to the other end.

A second mechanism guards against the use of incorrect sites by ensuring that splice sites close to exons (and thus likely to be authentic) are recognized preferentially. So-called SR (serine-arginine rich) proteins bind to sequences called exonic splicing enhancers (ESEs) within the exons. SR proteins bound to these sites interact with components of the splicing machinery, recruiting them to the nearby splice sites. In this way, the machinery binds more efficiently to those splice sites than to incorrect sites not close to exons. Specifically, the SR proteins recruit the U2AF proteins to the 3' splice site and U1 snRNP to the 5' site (Figure 13-13). As we saw earlier, these factors demarcate the splice sites for the rest of the machinery to assemble correctly.

SR proteins are essential for splicing. They not only ensure the accuracy and efficiency of constitutive splicing (as we have just seen) but also regulate alternative splicing (as we will see presently). They come in many varieties, some controlled by physiological signals, others constitutively active. Some are expressed preferentially in certain cell types and control splicing in cell-type specific patterns. We will discuss some specific examples of the roles of SR proteins in the next section.
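To get a feel for why loose consensus sequences make pseudo splice sites a real problem, consider how often the minimal dinucleotides occur by chance. The sketch below relies on background not stated in this passage, namely that nuclear introns nearly always begin with GU and end with AG; the sequence is invented, and a real splice-site search would weigh much more context than two bases.

```python
def dinucleotide_positions(rna: str, dinucleotide: str) -> list[int]:
    """Start positions of every occurrence of a two-base motif."""
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == dinucleotide]


def candidate_introns(rna: str) -> list[tuple[int, int]]:
    """Every (GU start, AG start) pair with the AG downstream of the GU.

    Each pair is a possible intron if only the terminal dinucleotides
    are considered, which is a deliberately naive criterion.
    """
    donors = dinucleotide_positions(rna, "GU")
    acceptors = dinucleotide_positions(rna, "AG")
    return [(d, a) for d in donors for a in acceptors if a > d]


rna = "AUGGUAAGUCAGGAGUAG"  # invented 18-nucleotide sequence
donors = dinucleotide_positions(rna, "GU")
acceptors = dinucleotide_positions(rna, "AG")
pairs = candidate_introns(rna)
print(len(donors), len(acceptors), len(pairs))
```

Even this short sequence offers several candidate pairings, so something beyond the dinucleotides, such as co-transcriptional loading and SR proteins bound to nearby exons, has to narrow the choice.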
FIGURE 13-13 SR proteins recruit spliceosome components to the 5' and 3' splice sites. Legitimate splice sites are recognized by the splicing machinery by virtue of being close to exons. Thus, SR proteins bind to sequences within the exons (exonic splicing enhancers), and from there recruit U2AF and U1 snRNP to the upstream 3' and downstream 5' splice sites, respectively. This initiates the assembly of the splicing machinery on the correct sites, and splicing can proceed as outlined earlier. In looking at this figure, note that an exon is drawn in the center, bounded on each side by an intron. This is in contrast to many of the earlier mechanistic figures, in which a single central intron is depicted lying between two exons. (Source: From Maniatis T. and Tasic B. 2002. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418: 236-243. Copyright © 2002 Nature Publishing Group. Used with permission.)
ALTERNATIVE SPLICING

Single Genes Can Produce Multiple Products by Alternative Splicing

As we described in the introduction to this chapter, many genes in higher eukaryotes encode RNAs that can be spliced in alternative ways to generate two or more different mRNAs and, thus, different protein products. In some cases, the number of potential alternatives that can be generated from a single gene is breathtaking: hundreds (in the rat Slo gene, for example) or even thousands (for the Drosophila DSCAM gene [Figure 13-11]).

For a simple case, consider the gene for the mammalian muscle protein troponin T. Shown in Figure 13-14 is a region of the pre-mRNA made from this gene and containing five exons. This RNA is spliced to form two alternative mature mRNAs, each containing four exons. A different exon is eliminated from each of the two mRNAs, so the two messages have three exons in common, as well as each carrying one unique exon.

But, as shown in Figure 13-15, alternative splicing can arise by a number of means. Thus, as well as alternative exons being chosen, exons can be extended, or (deliberately) skipped. Also, introns can be retained in some messages, rather than being deleted, again generating diversity in the proteins produced.

In the previous section, we described mechanisms that ensure variations of this sort do not take place: that exons are not skipped and splice sites not ignored. So how does alternative splicing occur so often? The basic answer is that some splice sites are used only some of the time, leading to the production of different versions of the RNA from different transcripts of the same gene.

Alternative splicing can be either constitutive or regulated. In the former case, more than one product is always made from the transcribed gene. In the case of regulated splicing, different forms are generated at different times, under different conditions, or in different cell or tissue types.
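The kinds of alternative splicing just listed can be laid out side by side for a three-exon gene, following the scheme of Figure 13-15. This is a labeling exercise rather than a model: the segment names are placeholders, and the choice of which intron is retained is an assumption, since the text does not specify it.

```python
# A three-exon pre-mRNA written as an ordered list of named segments.
PRE_MRNA = ["exon1", "intron1", "exon2", "intron2", "exon3"]


def mature_mrnas(pre_mrna=PRE_MRNA) -> dict[str, list[list[str]]]:
    """Map each splicing mode to the mature mRNA(s) it yields."""
    e1, i1, e2, i2, e3 = pre_mrna
    return {
        "normal": [[e1, e2, e3]],
        "exon skipped": [[e1, e3]],
        "exon extended": [[e1, "part of " + i1, e2, e3]],
        # Assumption: intron 2 is the one retained in this illustration.
        "intron retained": [[e1, e2, i2, e3]],
        # Exons 2 and 3 used as alternatives gives a mixture of two mRNAs.
        "alternative exons": [[e1, e2], [e1, e3]],
    }


variants = mature_mrnas()
for mode, mrnas in variants.items():
    for mrna in mrnas:
        print(f"{mode}: {' + '.join(mrna)}")
```

Every outcome starts from the same primary transcript; only the choice of splice sites differs.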
Another example of constitutive alternative splicing is seen with the T antigen of the monkey virus SV40 (Figure 13-16). The T antigen gene encodes two protein products: the large T antigen (T-ag) and the small t antigen (t-ag). The two proteins result from alternative splicing of the pre-mRNAs from the same gene. Thus, as shown in the figure, the gene has two exons, and different mature mRNAs result from the use of two different 5' splice sites. In the mRNA encoding large T, exon 1 is spliced directly to exon 2, deleting the intron that lies between. The mRNA for t-ag, on the other hand, is formed using the alternative 5' splice site. Thus, in this case, the mRNA includes some of the intron as
FIGURE 13-14 Alternative splicing in the troponin T gene. Shown here is a region of this gene encoding five exons, which generates two alternatively spliced forms as indicated. One contains exons 1, 2, 4, and 5; the other contains exons 1, 2, 3, and 5.
FIGURE 13-15 Five ways to splice an RNA. At the top is shown a gene encoding three exons. This is transcribed into a pre-mRNA, shown in the middle, and then spliced by five different alternative pathways. Thus, by including all exons, an mRNA containing all three exons is generated. Exon skipping gives an mRNA containing just exons 1 and 3. By exon extension, part of intron 1 is included together with the three exons. In another case, a complete intron is retained in the mature mRNA. Finally, exons 2 and 3 might be used as alternatives, generating a mixture of mRNAs, each including exon 1 and either exon 2 or 3.
well. (It is, therefore, an example of the "extended exon" shown in Figure 13-15.) The reason this larger message encodes the smaller protein is that there is an in-frame stop codon within the region of the intron retained in this mRNA.

Both forms of T antigen are made in a cell infected by SV40. But the ratio of the two forms produced does differ depending on the level of the splicing protein SF2/ASF. When present at high levels, this protein directs the machinery to favor use of the closest 5' splice site and
FIGURE 13-16 Constitutive alternative splicing. Splicing of the SV40 T antigen RNA is shown. Both forms are typically produced.
thus produces more of the t-ag mRNA. SF2/ASF is an SR protein and, when abundant, presumably binds sites within exon 2 and helps the spliceosome assemble there.
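That a longer message can encode a shorter protein is easy to verify with a toy reading-frame scan. The sequences below are invented and far shorter than the real SV40 messages; only the logic (translation stops at the first in-frame stop codon) reflects the explanation given for t-ag, and `protein_length` is a hypothetical helper.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def protein_length(mrna: str) -> int:
    """Count codons from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return 0
    length = 0
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            break
        length += 1
    return length


# Invented segments: the retained stretch of intron carries an in-frame UAA.
exon1 = "AUGGCUGCU"
retained_intron_part = "GCUUAAGGU"
exon2 = "GCUGCUGCUUGA"

large_T_mrna = exon1 + exon2                          # intron fully removed
small_t_mrna = exon1 + retained_intron_part + exon2   # part of intron kept

print(len(large_T_mrna), protein_length(large_T_mrna))
print(len(small_t_mrna), protein_length(small_t_mrna))
```

The message that keeps part of the intron is longer in nucleotides but is translated into fewer amino acids, because the ribosome reaches the stop codon inside the retained sequence first.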
Alternative Splicing Is Regulated
FIGURE 13-17 Regulated alternative splicing. Some alternatively spliced exons appear in mRNAs unless prevented from doing so by a repressor protein (shown in part a). Others appear only if a specific activator promotes their inclusion (part b). Either mechanism can be used to regulate splicing such that in one cell type a particular exon is included in an mRNA, whereas in another it is not.
has another domain, rich in arginine and serine, called an RS domain. The RS domain, found at the C-terminal end of the protein, mediates interactions between the SR protein and proteins within the splicing machinery, recruiting that machinery to a nearby splice site.

An example of an activator that promotes a particular alternative splicing event in a specific tissue type is the Drosophila Half-pint protein. This activator regulates the alternative splicing of a set of pre-mRNAs in the fly ovary. It works by binding to sites near the 3' splice site of specific exons in those pre-mRNAs and recruiting the U2AF splicing factor.

Most silencers are recognized by members of the heterogeneous nuclear ribonucleoprotein (hnRNP) family. These bind RNA but lack the RS domains and so cannot recruit the splicing machinery. Instead, by blocking specific splice sites, they repress the use of those sites. One example is hnRNPA1, which binds to an exonic silencer element within an exon of the HIV tat pre-mRNA and represses the inclusion of that exon in the final mRNA. By binding to its site, the repressor blocks binding of the activator SC35 (an SR protein) to a nearby enhancer element. This blocking is not direct (the two binding sites do not overlap), but hnRNPA1 promotes cooperative binding of additional molecules of hnRNPA1 to adjacent sequences, spreading over the enhancer site. When present, another SR protein (SF2/ASF) can overcome this repression, because it has a higher affinity for the enhancer sequence than does SC35 and therefore displaces the repressors bound there. We will see similar themes of cooperative and competitive binding in examples of transcriptional regulation in Chapters 16 and 17.

Another mammalian splicing repressor is the hnRNPI protein. In some cases this protein blocks the binding of the basic splicing machinery by binding directly to the Py tract (explaining why hnRNPI is also called the polypyrimidine tract-binding protein).
In other cases it excludes a given exon from the mature mRNA by binding to sequences that flank that exon. This exclusion occurs either because molecules of hnRNPI at each end of the exon interact and loop out the exon, which is then passed over by the spliceosome; or because the molecules of hnRNPI at each end bind cooperatively with other molecules of hnRNPI, coating the RNA across the whole exon. This too would render the exon invisible to the splicing machinery (Figure 13-18).

In Chapter 17 (Figure 17-28) we consider a particularly elaborate example of regulated alternative splicing, that involving the doublesex gene of Drosophila. The sex of a given fly depends on which of two alternative splicing variants of this mRNA it produces.

We have emphasized alternative splicing as a way in which multiple protein products can be produced from a single gene. These different proteins are called isoforms. They can have similar functions, distinct functions, or even antagonistic functions (thus, one form might act as a dominant negative of another). But even some genes that encode only a single functional protein show alternative splicing. In those cases, alternative splicing is used simply as a way of switching expression of the gene on and off. This is achieved in two ways. Most straightforwardly, an exon contains a stop codon, and, when incorporated into mRNA, this prematurely terminates translation, generating a truncated polypeptide. Typically, such an incomplete protein is nonfunctional and rapidly degraded. Alternative splicing determines whether or not the exon with the stop
398
RNA Splidns
( f hnRNPI
FIGURE 13-18 tnhibitkmofsplidngby hnRNPI. Tv.o model!> are pre$ented. In ene the prolein coats lhe entire exon. In lhe other il binds al each ene! of the exoo é1nd conceals j¡ within a Iocp.
5'
O
3'
1
1 3'
5'
~
,
codoo is ¡nduded in a given mRNA, and thus, in erCecl. whether or nol the gene 15 expressed. Tha second way alternative splicing can be used as an D%ff swilch is by regulaling Ihe use of an ¡otron. which, when retained jo the mRNA. eusures that species is nol transportad out of the nucleus and so i5 never tfans latad. SpJicing was discovered in studies oC gene expression in the mammalian adenovirus. wh ere mRNAs are alternatively spliccd, as described in Box 13-2, Adenovirus and the Discovery oC Splicing.
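The first kind of on/off switch described above, in which an exon carrying a stop codon is either included or skipped, can be illustrated with a small sketch. The exon names and sequences below are hypothetical and chosen only to keep the example short; the codon table is deliberately partial, and the model ignores everything about how the splicing choice is actually made.

```python
# Toy model: including an exon that carries a stop codon truncates the protein.
# All sequences and exon names here are hypothetical.

# Only the codons needed for this example; unknown codons map to "X".
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "AAA": "K", "GGU": "G", "CUG": "L", "UCU": "S",
    "UAA": "*", "UAG": "*", "UGA": "*",  # the three stop codons
}


def splice(exons, included):
    """Join the named exons, in order, into one mRNA string."""
    return "".join(exons[name] for name in included)


def translate(mrna):
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


EXONS = {
    "E1": "AUGGCUAAA",     # Met-Ala-Lys
    "E2": "GGUUAACUG",     # Gly, then an in-frame UAA stop codon
    "E3": "UCUGCUAAAUGA",  # Ser-Ala-Lys, then the normal stop codon
}

skipped = translate(splice(EXONS, ["E1", "E3"]))         # exon E2 skipped
included = translate(splice(EXONS, ["E1", "E2", "E3"]))  # exon E2 included

print(skipped)   # MAKSAK
print(included)  # MAKG
```

When E2 is skipped the full (toy) protein is made; when E2 is included, translation stops at the UAA inside it, which is the truncation the text describes.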
Box 13-2 Adenovirus and the Discovery of Splicing
Studies with bacteria and their phage led to the view that the mRNA is an exact replica, in terms of nucleotide sequence, of the gene from which it is transcribed (see Chapter 15). It therefore came as a shock when, in 1977, it was discovered that certain (and, as we now know, many) eukaryotic mRNAs are spliced together in patchwork fashion from much longer primary transcripts. How was this startling discovery made?

In an effort to understand gene transcription in eukaryotes, scientists focused on the human DNA virus called adenovirus. This virus was intended to serve as a model for understanding the molecular biology of the eukaryotic gene just as phage T4 and λ had done for the prokaryotic gene (see Chapter 21). The virion of adenovirus is composed of several different viral-encoded proteins, and the mRNAs for these proteins were purified with the hope that their 5' termini would pinpoint the transcription initiation sites for each gene on the viral genome. Instead, all of the mRNAs, even though they encoded different proteins, were found to have identical 5' sequences.

We now know that all of the mRNAs for the virion proteins of adenovirus arise from a single promoter known as the major late promoter. Initiation from this promoter generates long transcripts that span the coding sequences for multiple proteins (Box 13-2 Figure 1). This transcript then undergoes alternative splicing to generate separate mRNAs for individual virion components such as the hexon and fiber proteins. All of the mRNAs share the same 5' sequence, which is stitched together from three short non-protein-coding sequences known as the tripartite leader. The leader is then alternatively spliced to the coding sequences for the hexon, fiber, and other virion proteins to generate each of the late viral mRNAs.

That these messengers are spliced together from RNAs arising from several regions of the genome emerged from a variety of experiments, one of which is known as R-loop mapping (Box 13-2 Figure 2). When RNA is incubated, under the appropriate conditions, with a double-stranded DNA containing a stretch of sequence identical to that of the RNA, the RNA anneals to its complement, displacing a stretch of the noncomplementary strand in the form of a loop (Box 13-2 Figure 2a). Following the staining procedure used to visualize nucleic acids, this R-loop can be observed in the electron microscope, as RNA-DNA and DNA-DNA duplexes appear thicker than single-stranded nucleic acids. When such an experiment was performed with adenovirus messengers, the resulting R-loops were found not to be fully contiguous with a single region of DNA. Instead, and depending on which fragment of viral DNA was used, one or both ends of the RNA were found to protrude from the RNA loops as single-stranded tails (Box 13-2 Figure 2b). In other cases, one of the tails is seen to anneal with a DNA fragment from a different region of the viral genome (Box 13-2 Figure 2c). Clearly, these mRNAs were composite molecules that had been joined together from sequences complementary to noncontiguous regions of the genome. These and other kinds of DNA-RNA annealing experiments were used to deduce the pattern of alternative splicing shown in Box 13-2 Figure 1.
[Figure: map of the genome on a scale of 0 to 100 map units, with the primary transcript, the tripartite leader, and the fiber mRNA labeled.]

Box 13-2 Figure 1 Map of the human adenovirus-2 genome. The map shows the transcription patterns of the late mRNAs, including the primary transcript (shown as a long dark green arrow at the top); the tripartite leader sequences found at positions 16.6, 19.6, and 26.6 (shown as green bars); and the map positions of the DNA sequences that encode the various late mRNAs (the late mRNAs are shown as short dark green
Box 13-2 Figure 2 R-loop mapping of the adenovirus-2 late messenger RNAs. (a) The schematic shows the formation of an R-loop structure. A double-stranded DNA fragment generated by digestion with a restriction endonuclease is incubated with mRNA and heated to just above the Tm of the DNA in 80% formamide. The hybrid formed between the messenger and its complementary DNA sequence results in displacement of the second DNA strand. The poly-A tail of the mRNA (not encoded by DNA; see Chapter 12) is seen projecting from the end of the hybrid duplex. (b) Electron micrograph and schematic diagram of an R-loop observed after incubating hexon mRNA with a complementary DNA sequence from the late region of the adenovirus-2 genome. Note the extensions of both the 5' and 3' ends of the messenger. The DNA is represented by black lines; the RNA is represented by green lines in the diagram. (c) Electron micrograph and schematic diagram of an R-loop observed after incubating fiber mRNA with two DNAs, the complete adenovirus genome and a restriction endonuclease fragment derived from the early region of the genome. (Source: EMs courtesy of (b) Chow L.T., Gelinas R.E., Broker T.R., and Roberts R.J. 1977. An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA. Cell 12: 1-8, page 2. Copyright © 1977. Used with permission from Elsevier. (c) Berget S.M., Moore C., and Sharp P.A. 1977. Spliced segments at the 5' terminus of adenovirus 2 late mRNA. Proc. Natl. Acad. Sci. 74: 3171-3175.)
A Small Group of Introns Are Spliced by an Alternative Spliceosome Composed of a Different Set of snRNPs

Higher eukaryotes (including mammals, plants, and so on) use the major splicing machinery we have discussed thus far to direct splicing of the majority of their pre-mRNAs. But in these organisms (unlike in yeast) some pre-mRNAs are spliced by a low-abundance form of spliceosome. This rare form contains some components common to the major spliceosome but other unique components as well. Thus, U11 and U12 components of the alternative spliceosome have the same roles in the splicing reaction as U1 and U2 of the major form, but they recognize distinct sequences. U4 and U6 have equivalent counterparts in both spliceosome forms: although these snRNPs are distinct, they share the same names. Finally, the U5 component is identical in both the major and the alternative spliceosome.

The minor spliceosome recognizes rarely occurring introns having consensus sequences distinct from the sequences of most pre-mRNA introns. This recently discovered form is known as the AT-AC spliceosome, because the termini of the originally identified rare introns contain AU at the 5' splice site and AC at the 3' site (in RNA, or AT and AC in DNA). Later it transpired that many introns spliced by this pathway have GT-AG termini (like mainstream introns), but otherwise their consensus sequences are distinct from those of the major pathway.

Despite the different splice site and branch site sequences recognized by the two systems, these major and minor forms of spliceosomes both remove introns using the same chemical pathway (Figure 13-19). Consistent with this conserved mechanism, the differences in splice-site sequences recognized by these snRNPs are mirrored by complementary differences in the sequences of their snRNAs. Thus, it is the ability of the snRNAs and splice site sequences to base-pair that is conserved, not any particular sequence within either.

It is also worth noting that AT-AC introns might fit into the evolutionary scheme discussed earlier. Thus, as we mentioned, it has been proposed that the group II introns represent the oldest form of introns. Further to this, it is suggested that the AT-AC introns evolved from the group II introns and, eventually, gave rise to the major pre-mRNA introns (Figure 13-20).
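The naming of the two intron types by their terminal dinucleotides can be made concrete with a short sketch. It only reads off the first and last two bases of an intron written as DNA; the example sequences are hypothetical. As the text notes, many introns handled by the minor spliceosome have GT-AG termini, so this label should not be read as a prediction of which spliceosome removes a given intron.

```python
# Report the terminal dinucleotides of an intron (given as DNA, 5' to 3').
# Illustration only: termini alone do not decide which spliceosome is used.


def intron_termini(intron_dna):
    """Return the first two and the last two bases of the intron."""
    seq = intron_dna.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have distinct termini")
    return seq[:2], seq[-2:]


def termini_class(intron_dna):
    """Label the intron 'GT-AG', 'AT-AC', or 'other' from its termini."""
    five_prime, three_prime = intron_termini(intron_dna)
    label = f"{five_prime}-{three_prime}"
    return label if label in ("GT-AG", "AT-AC") else "other"


def to_rna(dna):
    """Write a DNA sequence as RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


# Hypothetical introns, written as DNA.
major_like = "GTAAGTCTTTCCAG"
minor_like = "ATATCCTTTTCCTTAC"

print(termini_class(major_like))  # GT-AG
print(termini_class(minor_like))  # AT-AC
print(to_rna(minor_like)[:2])     # AU, the 5' terminus as it appears in RNA
```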
EXON SHUFFLING
Exons Are Shuffled by Recombination to Produce Genes Encoding New Proteins
As we have noted, all eukaryotes have introns, and yet these elements are rare (almost nonexistent) in bacteria. There are two likely explanations for this situation. First, in the so-called introns early model, introns existed in all organisms but have been lost from bacteria. If introns originally did exist in bacteria, why might they subsequently have been lost? The argument is that these "gene rich" organisms (see Chapters 7 and 11) have streamlined their genomes in response to selective pressure to increase the rate of chromosome replication and cell division. (Recall also that among eukaryotes, yeast, which are unicellular and rapidly growing, have fewer introns than do complex multicellular organisms.)

FIGURE 13-19 The AT-AC spliceosome catalyzed splicing. This minor spliceosome works on a minority of exons (perhaps one in a thousand in humans, for example), and those have distinct splice-site sequences. Regardless, the chemistry is the same, and so are some of the spliceosome components, and others are closely related.

FIGURE 13-20 Sequences conserved in different kinds of introns. Shown are conserved sequences found in the 5' splice site, 3' splice site, and branch site of nuclear pre-mRNA introns (major, AT-AC, and trans-splicing) and group II introns. Shaded regions show nucleotides that are identical in major, AT-AC, and trans-splicing introns. (Source: Adapted from Yu Y.-T., Scharl E.C., Smith C.M., and Steitz J.A. 1999. The growing world of small nuclear ribonucleoproteins. In The RNA world, 2nd edition (ed. Gesteland R.F., Cech T.R., and Atkins J.F.), pp. 487-524, p. 497, Fig. 4. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.)

In the alternative view, introns never existed in bacteria but rather arose later in evolution. According to this so-called introns late model, introns were inserted into genes that previously had no introns, perhaps by a transposon-like mechanism (see Chapter 11).

Irrespective of which explanation is true (and at this stage it is impossible to decide the matter unambiguously), there is the second, perhaps more interesting, question: why have the introns been retained in eukaryotes, and, in particular, in the extensive form seen in multicellular eukaryotes? One clear advantage is that the presence of introns, and the need to remove them, allows for alternative splicing, which can generate multiple protein products from a single gene. But, on an even grander scale, another advantage afforded these organisms is believed to be the following: having the coding sequence of genes divided into several exons allows new genes to be created by reshuffling exons. Three observations strongly suggest that this process actually occurs:
• First, the borders between exons and introns within a given gene often coincide with the boundaries between domains (see Chapter 5) within the protein encoded by that gene. That is, it seems that each exon very often encodes an independently folding unit of protein (often corresponding to an independent function as well). For example, consider the DNA-binding protein depicted in Figure 13-21. Like most DNA-binding proteins, this one has two domains: the DNA recognition domain and the dimerization domain. As shown in the figure, these domains (D1 and D2) are encoded by separate exons (E1 and E2) within the gene.
• Second, many genes, and the proteins they encode, have apparently arisen during evolution in part via exon duplication and divergence. Proteins made up of repeating units (such as immunoglobulins) have probably arisen this way (see Chapter 11, Figure 11-35). The presence of introns between each exon makes the duplication more likely.
• Third, related exons are sometimes found in otherwise unrelated genes. That is, there is evidence that exons really have been reused in genes encoding different proteins. As an example, consider the LDL receptor gene (Figure 13-22). This gene contains some exons that are clearly evolutionarily related to exons found in the gene encoding the EGF precursor. At the same time, it has other exons that are clearly related to exons from the C9 complement gene (Figure 13-22).

FIGURE 13-21 Exons encode protein domains. In this example, the DNA-binding domain of a protein is encoded by one exon, while the dimerization domain of that same protein is encoded by a separate exon. Protein domains fold independently of the rest of the protein in which they are found, and often carry out a single function (as we discussed in Chapter 5). Thus, exons can often be exchanged between proteins productively.

FIGURE 13-22 Genes made up of parts of other genes. The LDL receptor (the plasma low density lipoprotein receptor) gene contains a stretch of six exons closely related

More extensive examples of exon accretion are apparent from the complete sequences of genomes, for example, the human genome. As shown in Figure 13-23, there are numerous examples of proteins made up of highly related domains used in various combinations, encoded by genes made up of shuffled exons.

As we have seen, exons tend to be rather short (some 150 nucleotides or so) while introns vary in length and can be very long indeed (up to several hundred kb). The size ratio ensures that, for the average gene in a higher eukaryote, recombination is more likely to occur within the introns than within the exons. Thus, exons are more likely to be reshuffled than disrupted. The mechanism of splicing (the use of the 5' and 3' splice sites) guarantees that almost all recombinant genes will be expressed, because the splice sites in different genes are largely interchangeable. In addition, alternative splicing can allow new exons to be tried without discarding the original gene product.
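The size-ratio argument above can be put in numbers with a rough sketch. It assumes, purely for illustration, that a recombination breakpoint is equally likely to fall at any nucleotide of the gene; the gene layout (eight exons of 150 nucleotides separated by seven introns of 3 kb each) is hypothetical, with only the exon size taken from the text.

```python
# Fraction of a gene's length that lies in introns. Under the simplifying
# assumption of a uniformly placed breakpoint, this is also the chance that
# a recombination event falls within an intron rather than an exon.


def intron_breakpoint_fraction(exon_lengths, intron_lengths):
    """Return the intronic length divided by the total gene length."""
    exon_total = sum(exon_lengths)
    intron_total = sum(intron_lengths)
    total = exon_total + intron_total
    if total == 0:
        raise ValueError("gene has zero length")
    return intron_total / total


# Hypothetical gene: 8 exons of 150 nt, 7 introns of 3,000 nt.
exons = [150] * 8
introns = [3000] * 7

fraction = intron_breakpoint_fraction(exons, introns)
print(f"{fraction:.3f}")  # 0.946
```

With these assumed sizes about 95 percent of breakpoints would land in introns, leaving the exons on either side intact. Longer introns push the figure higher still.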
FIGURE 13-23 Accumulation, loss, and reshuffling of domains during the evolution of a family of proteins. The figure shows proposed routes whereby different related proteins might have evolved by gain and loss of specific domains. Three examples are given; in each case the proteins in question are chromatin modifying enzymes (Chapter 7) from yeast (Y), worms (W), flies (F), and humans (H). Each protein is depicted by a series of differently colored
RNA EDITING
RNA Editing Is Another Way of Altering the Sequence of an mRNA

RNA editing, like RNA splicing, can change the sequence of an RNA after it has been transcribed. Thus the protein produced upon translation is different from that predicted from the gene sequence. There are two mechanisms that mediate editing: site-specific deamination and guide RNA-directed uridine insertion or deletion. We consider each in turn.

In one form of site-specific deamination, a specifically targeted cytosine residue within mRNA is converted into uridine by deamination. Typically, for a given mRNA species, the process occurs only in certain tissues or cell types and in a regulated manner. Figure 13-24 shows the mammalian apolipoprotein-B gene. This gene has several exons, within one of which is a particular CAA codon that is targeted for editing; it is the C within this codon that gets deaminated. That deamination, carried out by the enzyme cytidine deaminase, converts the C to a U (Figure 13-25). In this example, the deamination occurs in a tissue-specific manner: messages are edited in intestinal cells but not in liver cells. These two forms of apolipoprotein B are both involved in lipid metabolism. The longer form, found in the liver, is involved in the transport of endogenously synthesized cholesterol and triglycerides. The smaller version, found in the intestines, is involved in the transport of dietary lipids to various tissues. Thus the CAA codon, which is translated as glutamine in the unedited message in the liver, is converted, in the intestine, to UAA, a stop codon. The result is that the full-length protein (of some 4,500 amino acids) is produced in the liver, but a truncated polypeptide of only about 2,100 amino acids is made in the intestine (see Figure 13-24).

[Figure labels: codon 2,153 within exon 26; translation in liver (no editing) reads CAA as glutamine and gives a 4,563 aa protein; translation in intestine (site-specific deamination, CAA to UAA) gives a protein labeled 2,153.]

FIGURE 13-24 RNA editing by deamination. The RNA made from the human apolipoprotein gene is edited

Other examples of mRNA editing by enzymatic deamination include adenosine deamination. This reaction, carried out by the enzyme ADAR (adenosine deaminase acting on RNA), of which there are three in humans, produces inosine. Inosine can base-pair with cytosine, and so this change can readily alter the sequence of the protein encoded by the mRNA. An ion channel expressed in mammalian brains is the target of this type of editing. A single edit in its mRNA elicits a single amino acid change in the protein, which in turn alters the Ca2+ permeability of the channel. In the absence of this editing, brain development is seriously impaired.

A very different form of RNA editing is found in the RNA transcripts that encode proteins in the mitochondria of trypanosomes. In this case, multiple Us are inserted into specific regions of mRNAs after transcription (or, in other cases, Us may be deleted). These insertions can be so extensive that in an extreme case they amount to as many as half the nucleotides of the mature mRNA. The addition of Us to the message changes codons and reading frames, completely altering the "meaning" of the message.

As an example, consider the trypanosome coxII gene. In a specific region of the mRNA of this gene, four Us are inserted between adjacent bases at three sites (two Us at one site and one U at each of two additional sites). These additions alter some codons and cause a "-1" change in the reading frame, a shift that is required to generate the correct open-reading frame, as shown in Figure 13-26a.

How are these additional bases inserted? Us are inserted into the message by so-called guide RNAs (gRNAs), as shown in Figure 13-26. These gRNAs range from 40 to 80 nucleotides in length and are encoded by genes distinct from those that encode the mRNAs they act on. Each gRNA is divided into three regions. The first, at the 5' end, is called the "anchor" and directs the gRNA to the region of the mRNA it will edit; the second determines exactly where the Us will be inserted within the edited sequence; and the third, at the 3' end, is a poly-U stretch.

We now look more closely at how the gRNAs direct editing. The anchor region of the gRNA contains a sequence that can base-pair with a region of the message immediately beside (3' to) the region that will be edited (Figure 13-26b). This is followed by the editing "instructions": a stretch of gRNA complementary to the region in the message to be edited, but containing additional As. The As are at positions in the gRNA opposite where Us will be inserted into the mRNA. At the 3' end of the gRNA is the poly-U region. The role of the nucleotides in this region is unclear, though it is proposed that they tether the gRNA to purine-rich sequences in the mRNA upstream (5' to) the edited region.

As shown in Figure 13-26c, the gRNA and mRNA form an RNA-RNA duplex with looped out single-stranded regions opposite where Us will be inserted. An endonuclease recognizes and cuts the mRNA opposite these loops. Editing involves the transfer of Us into the gap in the message. This process is catalyzed by the enzyme 3' terminal uridylyl transferase (TUTase). After the addition of Us, the two halves of the mRNA are joined by an RNA ligase, and the "editing" region of the gRNA continues its action along the mRNA in a 3' to 5' direction. A single gRNA can be
responsible for inserting several Us at different sites (as is the case for the one shown in Figure 13-26). Furthermore, in some cases, several different gRNAs work on different regions of the same message.

FIGURE 13-25 The deamination of the base cytosine to produce uracil.

FIGURE 13-26 RNA editing by guide RNA-mediated U insertion. Editing of the trypanosome coxII gene RNA. (a) Shows the positions of the four U nucleotides inserted into the pre-mRNA of the coxII gene. These generate the correct reading frame and coding information in the mRNA. (b) Shows the sequence of the guide RNA that determines the U insertion pattern, and the sequence of the unedited stretch of mRNA. (c) Shows the editing reaction itself.
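Both editing mechanisms described in this section amount to simple operations on the RNA sequence, and a minimal sketch may help fix the logic. The sequences below are hypothetical toy strings, not the real apolipoprotein-B or coxII sequences. The guide is written 3' to 5' so that it lines up with the message read 5' to 3', only standard Watson-Crick pairs are allowed, and the cutting, U-transfer, and ligation steps are collapsed into a single pass. All of these are simplifying assumptions.

```python
# Two toy models of RNA editing. Sequences are hypothetical.

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick partners


def deaminate_c_to_u(mrna, index):
    """Site-specific deamination: convert the C at `index` into a U."""
    if mrna[index] != "C":
        raise ValueError("target base is not a cytosine")
    return mrna[:index] + "U" + mrna[index + 1:]


def edit_with_guide(pre_mrna, guide_3to5):
    """Insert Us into pre_mrna wherever a guide A has no partner.

    pre_mrna is read 5' to 3'; guide_3to5 is the editing region of the
    guide written 3' to 5', so position k of one faces position k of
    the other in the duplex.
    """
    edited = []
    i = 0  # next unpaired base of the pre-mRNA
    for g in guide_3to5:
        if i < len(pre_mrna) and PAIR[g] == pre_mrna[i]:
            edited.append(pre_mrna[i])  # guide base pairs with the message
            i += 1
        elif g == "A":
            edited.append("U")          # unpaired guide A: insert a U
        else:
            raise ValueError("guide and message cannot be aligned")
    if i != len(pre_mrna):
        raise ValueError("guide does not cover the whole message segment")
    return "".join(edited)


# Deamination: a CAA codon becomes the stop codon UAA.
unedited = "AUGCAAGGU"
edited_stop = deaminate_c_to_u(unedited, 3)
print(edited_stop)  # AUGUAAGGU

# Guide-directed insertion: four Us at three sites (two, one, and one).
pre_edited = "GAGAACC"
guide = "CAAUCAUUAGG"  # written 3' to 5'
print(edit_with_guide(pre_edited, guide))  # GUUAGUAAUCC
```

The second example inserts two Us at one site and one U at each of two further sites, the same pattern of four insertions described for coxII, although the sequence itself is invented.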
mRNA TRANSPORT
Once Processed, mRNA Is Packaged and Exported from the Nucleus into the Cytoplasm for Translation

Once fully processed (capped, intron-free, and polyadenylated), mRNA is transported out of the nucleus and into the cytoplasm (Figure 13-27), where it is translated to give its protein product (Chapter 14). Movement from the nucleus to the cytoplasm is not a passive process. Indeed, it must be carefully regulated: the fully processed mRNAs represent only a small proportion of the RNA found in the nucleus, and many of the other RNAs would be detrimental to the cell if exported. These include, for example, damaged or misprocessed RNAs, and liberated introns (which, being, as they tend to be, so much larger than the exons, represent a larger population of RNA than do the mature mRNAs).

FIGURE 13-27 Transport of mRNAs out of the nucleus. RNA export from the nucleus is an active process, and only certain (appropriate) RNAs are selected for transport. To be selected for transport, the RNA must have the correct collection of proteins bound to it. These will distinguish it from other RNAs, which must be retained in the nucleus or destroyed. Proteins that recognize exon:exon boundaries, for example, indicate an mRNA that has been appropriately spliced, whereas proteins that bind introns indicate an RNA that should be retained in the nucleus. Once in the cytoplasm, some proteins are shed and others are taken on in readiness for translation (Chapter 14).

How are RNA selection and transport achieved? As we have emphasized in this and the previous chapter, from the moment an RNA molecule starts to be transcribed, it becomes associated with proteins of various sorts: initially proteins involved in capping, then splicing factors, and finally the proteins that mediate polyadenylation. Some of these proteins are replaced at various steps along the processing path, but others (including some SR proteins, for example) are not; and, moreover, addition
The mechanisms of nuclear transport are beyond the scope of this book; suffice it to say that some of the proteins associated with the RNA carry nuclear export signals that are recognized by export receptors that guide the RNA out through the pore. Once in the cytoplasm, the proteins are discarded, and are then recognized for import back into the nucleus, where they associate with another mRNA and repeat the cycle (Figure 13-27). Export requires energy, and this is supplied by hydrolysis of GTP by a GTPase protein called Ran. Like other GTPases, Ran exists in two conformations depending on whether it is complexed with GTP or GDP, and the transition from one state to the other drives movement into or out of the nucleus.
SUMMARY

Most genes encode proteins, and the sequence of amino acids within any given protein is determined by the sequence of "codons" in its gene. Each codon is made up of a group of three adjacent nucleotides. In almost all bacterial and phage genes, the open-reading frame is a single stretch of codons with no break. But the coding sequence of many eukaryotic genes is split into stretches of codons interrupted by stretches of noncoding sequence. The coding stretches in these split genes are called exons (for "expressed sequences") and the noncoding stretches are called introns (for "intervening sequences").

The numbers and sizes of the introns and exons vary enormously from gene to gene. Thus, in yeast, only a relatively small proportion of genes have introns, and where they occur they tend to be short and few in number (one or occasionally two per gene). In multicellular organisms such as humans, the number of genes containing introns is much larger, as is the number of introns per gene (up to 362 in an extreme case). The sizes of exons do vary but are often around 150 nucleotides; introns, on the other hand, vary from 61 bp to as much as a staggering 800 kb.

When a gene containing introns is transcribed, the RNA initially retains those introns. These are then removed to produce the mature mRNA. The process of intron removal is called splicing.

Many intron-containing genes give rise to a unique mRNA species. That is, in each case, all the introns are removed from the original RNA, leaving an mRNA composed of all the exons. But in other cases, splicing can produce a number of different mRNAs from the same gene by splicing the original RNA in different patterns. Thus, for example, some genes contain alternative exons, only one of which ends up in a given mRNA. In other cases, a given exon might be removed (along with the introns) from some copies of the RNA, again producing alternative versions of mRNA from the same gene.

Sequences found at the boundary between introns and exons allow the cell to identify introns for removal. These splicing sequences are almost exclusively within the introns (where there are no restrictions imposed by the need to encode amino acids, as there are in exons). These sequences are called the 3' and 5' splice sites, denoting their relative locations at one or the other end of the intron. To splice out an intron also requires a sequence element, called the branch site, near the 3' end of the intron.

Intron removal proceeds via two transesterification reactions. In the first, an A in the branch site attacks a G in the 5' splice site. In the second, the liberated 5' exon attacks the 3' splice site. These reactions have two consequences. First and foremost, they fuse the two exons. Second, they release the intron in the form of a branched structure called a lariat.

Splicing of nuclear pre-mRNAs requires a large complex of proteins and RNAs called the spliceosome. This is made up of so-called snRNPs, of which there are five: U1, U2, U4, U5, and U6 snRNPs. Each of these comprises an RNA molecule, called the U1 to U6 snRNA, respectively, and a number of proteins, the majority of which are different in each case.

The action of the spliceosome is particularly interesting in two regards. First, the RNA components have a central role in recognizing introns and catalyzing their removal. Second, the complex is very dynamic. That is, at different steps during the process of splicing, the spliceosome constitution alters: different subunits of the machine join and leave the complex, each performing a particular function. Thus, early on, U1 snRNP recognizes the 5' splice site, while the U2 snRNP recognizes the branch site. U4 and U6 then join, together with U5, bringing the branch site and 5' splice site together and stimulating the first reaction concomitant with U1 and U4 leaving. Finally, the 3' and 5' splice sites are brought together and exons are fused.

There are a few rare introns that can remove themselves from within RNA molecules by a process known as self-splicing. Though not strictly an enzymatic reaction, the RNA of the intron nevertheless mediates the chemistry of removal. These self-splicing introns come in two classes, one of which (group II) splice by the same chemical pathway as that mediated by the spliceosome. These introns probably represent the evolutionary origin of modern spliceosomal introns, and the two-step chemical pathway used by both reflects that evolutionary relationship (and perhaps explains why introns are not removed by a more direct single-step mechanism).
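The two transesterification steps summarized above can be followed on a sequence string in a brief sketch. The pre-mRNA below is hypothetical, and the three positions (5' splice site, branch-point A, 3' splice site) are supplied by hand rather than recognized; the point is only to show which pieces are produced: the fused exons, and a lariat made of a closed loop plus a short tail.

```python
# Bookkeeping for the two-step splicing pathway on a toy pre-mRNA.
# Sequences and positions are hypothetical; no site recognition is modeled.


def splice_intron(pre_mrna, five_ss, branch, three_ss):
    """Return (mRNA, lariat) after removing one intron.

    five_ss  : index of the first intron nucleotide (the G that is attacked)
    branch   : index of the branch-point A
    three_ss : index of the first nucleotide of the 3' exon
    """
    if not 0 < five_ss < branch < three_ss <= len(pre_mrna):
        raise ValueError("positions are out of order")
    if pre_mrna[five_ss] != "G" or pre_mrna[branch] != "A":
        raise ValueError("expected G at the 5' splice site and A at the branch")

    # Step 1: the branch A attacks the 5' splice site. The 5' exon is freed
    # and the intron's 5' end is joined to the branch A, closing a loop.
    exon_5 = pre_mrna[:five_ss]
    loop = pre_mrna[five_ss:branch + 1]
    tail = pre_mrna[branch + 1:three_ss]

    # Step 2: the freed 5' exon attacks the 3' splice site, fusing the exons
    # and releasing the intron as a lariat (loop plus tail).
    exon_3 = pre_mrna[three_ss:]
    return exon_5 + exon_3, {"loop": loop, "tail": tail}


exon1 = "AUGGCU"
intron = "GUAAGUACUAACUUUCAG"
exon2 = "GAAUAA"
pre = exon1 + intron + exon2

five_ss = len(exon1)
branch = five_ss + intron.index("UACUAAC") + 5  # second A of ...UAAC
three_ss = five_ss + len(intron)

mrna, lariat = splice_intron(pre, five_ss, branch, three_ss)
print(mrna)            # AUGGCUGAAUAA
print(lariat["loop"])  # GUAAGUACUAA
print(lariat["tail"])  # CUUUCAG
```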
The splice sites described above are defined by rather short sequences with low levels of conservation. It thus represents a significant challenge for the splicing machinery to recognize and splice only at correct sites. There are various mechanisms by which the spliceosome enhances accuracy. First, it assembles on the sites soon after they have been synthesized. This ensures they are selected before other downstream sites are available to compete. Second, there are other proteins (SR proteins) that bind near legitimate splice sites and help recruit the splicing machinery to those sites. In this way, authentic sites effectively have a higher affinity for the machinery than do so-called pseudo sites of similar sequence.

There are a large variety of SR proteins. Each binds RNA with one surface and with another interacts with components of the splicing machinery. Some SR proteins regulate splicing. That is, a given SR protein may be found only in one cell type and mediate a particular splicing event only in that cell type. Other SR proteins are only active in the presence of specific physiological signals, and so a given splicing event only occurs in response to that signal. In this way, SR proteins resemble transcriptional activators, as we will see in later chapters. Also analogously with transcriptional regulation, there are repressors of splicing that exclude splicing of specific introns under certain circumstances.

Together with the other modifications dealt with in Chapter 12, splicing is required before mRNAs can be transported out of the nucleus through nuclear pores. This too can be regulated.

It is believed that a given exon typically encodes an independently folding (and functional) protein domain. Thus, such an exon can readily function in combination with different exons. This suggests it has been relatively easy, through evolution, to generate new proteins by shuffling existing exons between genes.

RNA editing is another mechanism that allows an RNA to be changed after transcription so as to encode a different protein from that encoded by the gene. Two mechanisms for editing are: enzymatic modification of bases (generating forms that alter how they are read by tRNAs) and the insertion or deletion of multiple U nucleotides within the message.
CHAPTER
Translation
OUTLINE
• Messenger RNA
• Transfer RNA
• Attachment of Amino Acids to tRNA
• The Ribosome
• Initiation of Translation
• Translation Elongation
• Termination of Translation
• Translation-Dependent Regulation of mRNA and Protein Stability

The central question addressed in this chapter and the next is how genetic information contained within the order of nucleotides in messenger RNA (mRNA) is used to generate the linear sequences of amino acids in proteins. This process is known as translation. Of the events we have discussed, translation is among the most highly conserved across all organisms and among the most energetically costly for the cell. In rapidly growing bacterial cells, up to 80% of the cell's energy and 50% of the cell's dry weight are dedicated to protein synthesis. Indeed, the synthesis of a single protein requires the coordinated action of well over 100 proteins and RNAs. Consistent with the more complex nature of the translation process, we have divided our discussion into two chapters. In this first chapter we describe the events that allow decoding of the mRNA, and in Chapter 15 we describe the nature of the genetic code and its recognition by transfer RNAs.
Translation is a much more formidable challenge in information transfer than the transcription of DNA into RNA. Unlike the complementarity between the DNA template and the ribonucleotides of the messenger RNA, the side chains of amino acids have little or no specific affinity for the purine and pyrimidine bases found in RNA. For example, the hydrophobic side chains of the amino acids alanine, valine, leucine, and isoleucine cannot form hydrogen bonds with the amino and keto groups of the nucleotide bases. Likewise, it is hard to imagine that several different combinations of three bases of RNA could form surfaces with unique affinities for the aromatic amino acids phenylalanine, tyrosine, and tryptophan. Thus, it seemed unlikely that direct interactions between the mRNA template and the amino acids could be responsible for the specific and accurate ordering of amino acids in a polypeptide.
With these considerations in mind, in 1955 Francis H. Crick proposed that prior to their incorporation into polypeptides, amino acids must attach to a special adaptor molecule that is capable of directly interacting with and recognizing the three-nucleotide-long coding units of the messenger RNA. Crick imagined that the adaptor would be an RNA molecule because it would need to recognize the code by Watson-Crick base-pairing rules. Just two years later, Paul C. Zamecnik and Mahlon B. Hoagland demonstrated that prior to their incorporation into proteins, amino acids are attached to a class of RNA molecules (representing 15% of all cellular RNA). These RNAs are called transfer RNAs (or tRNAs) because the amino acid is subsequently transferred to the growing polypeptide chain.
The machinery responsible for translating the language of messenger RNAs into the language of proteins is composed of four primary components: mRNAs, tRNAs, aminoacyl tRNA synthetases, and the ribosome. Together, these components accomplish the extraordinary task of translating a code written in a four-base alphabet into a second code written in the language of the 20 amino acids. The mRNA provides the information that must be interpreted by the translation machinery, and is the template for translation. The protein-coding region of the mRNA consists of an ordered series of three-nucleotide-long units called codons that specify the order of amino acids. The tRNAs provide the physical interface between the amino acids being added to the growing polypeptide chain and the codons in the mRNA. Enzymes called aminoacyl tRNA synthetases couple amino acids to specific tRNAs that recognize the appropriate codon. The final central player in translation is the ribosome, a remarkable, multi-megadalton machine composed of both RNA and protein. The ribosome coordinates the correct recognition of the mRNA by each tRNA and catalyzes peptide bond formation between the growing polypeptide chain and the amino acids attached to the selected tRNA.
We will first consider the key attributes of each of these four components. We then describe how these components work together to accomplish translation. Recent progress in elucidating the structure of the components of the translational machinery makes this an exciting area, one that is rich in mechanistic insights. Among the questions we will ask are the following: What is the organization of nucleotide sequence information in mRNA? What is the structure of tRNAs, and how do aminoacyl tRNA synthetases recognize and attach the correct amino acids to each tRNA? Finally, how does the ribosome orchestrate the decoding of nucleotide sequence information and the addition of amino acids to the growing polypeptide chain?
MESSENGER RNA

Polypeptide Chains Are Specified by Open-Reading Frames
The translation machinery decodes only a portion of each mRNA. As we saw in Chapter 2, and will consider in detail in Chapter 15, the information for protein synthesis is in the form of three-nucleotide codons, which each specify one amino acid. The protein coding region(s) of each mRNA is composed of a contiguous, non-overlapping string of codons called an open-reading frame (commonly known as an ORF). Each ORF specifies a single protein and starts and ends at internal sites within the mRNA. That is, the ends of an ORF are distinct from the ends of the mRNA.
Translation starts at the 5' end of the open-reading frame and proceeds one codon at a time to the 3' end. The first and last codons of an ORF are known as the start and stop codons. In bacteria, the start codon is usually 5'-AUG-3', but 5'-GUG-3' and sometimes even 5'-UUG-3' are also used. Eukaryotic cells always use 5'-AUG-3' as the start codon. This codon has two important functions. First, it specifies the first amino acid to be incorporated into the growing polypeptide chain. Second, it defines the reading frame for all subsequent codons. Because codons are immediately adjacent to each other and because codons are three nucleotides long, any stretch of mRNA could be translated in three different reading frames (Figure 14-1). However, once translation starts, each subsequent codon is always immediately adjacent to (but not overlapping) the previous three-base codon. Thus, by setting the location of the first codon, the start codon determines the location of all following codons.
FIGURE 14-1 Three possible reading frames of the E. coli trp leader sequence. Start codons are shaded in green and stop codons are shaded in red. The amino acid sequence of the encoded sequence is indicated in the single-letter code below each codon.
Stop codons, of which there are three (5'-UAG-3', 5'-UGA-3', and 5'-UAA-3'), define the end of the open-reading frame and signal termination of polypeptide synthesis. We can now fully appreciate the origin of the term open-reading frame. It is a contiguous stretch of codons "read" in a particular frame (as set by the first codon) that is "open" to translation because it lacks a stop codon (that is, until the last codon in the ORF).
Messenger RNAs contain at least one open-reading frame. The number of ORFs per mRNA is different between eukaryotes and prokaryotes. Eukaryotic mRNAs almost always contain a single ORF. In contrast, prokaryotic mRNAs frequently contain two or more ORFs and hence can encode multiple polypeptide chains. Messenger RNAs containing multiple ORFs are known as polycistronic mRNAs, and those encoding a single ORF are known as monocistronic mRNAs. As you learned in Chapter 12, polycistronic mRNAs often encode proteins that perform related functions, such as different steps in the biosynthesis of an amino acid or nucleotide. The structures of a typical prokaryotic and eukaryotic mRNA are shown in Figure 14-2.
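The idea that a start codon fixes the reading frame, and that an ORF stays "open" until the first in-frame stop codon, can be sketched in a few lines of Python. This is only an illustration: the function name `find_orfs` and the short test sequence are made up for the example, only 5'-AUG-3' is treated as a start codon, and the returned indices are 0-based string positions.

```python
START = "AUG"                      # only AUG is treated as a start codon here
STOPS = {"UAG", "UGA", "UAA"}      # the three stop codons named in the text


def find_orfs(mrna):
    """Return a list of (frame, start, end, codons) for each ORF in mrna.

    frame is 0, 1, or 2; start is the index of the first base of the start
    codon; end is the index just past the stop codon (hypothetical helper).
    """
    orfs = []
    for frame in range(3):
        # Split the sequence into whole, non-overlapping codons for this frame.
        codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
        open_at = None             # codon index of the start codon, if any
        for n, codon in enumerate(codons):
            if open_at is None and codon == START:
                open_at = n        # the start codon sets the frame
            elif open_at is not None and codon in STOPS:
                start = frame + 3 * open_at
                end = frame + 3 * (n + 1)
                orfs.append((frame, start, end, codons[open_at:n + 1]))
                open_at = None     # the ORF is closed by the stop codon
    return orfs


if __name__ == "__main__":
    # Made-up sequence: the only complete ORF lies in the third frame.
    print(find_orfs("CCAUGGCUUAAGG"))
```

Reading the same string in the other two frames yields no start codon, which is the point made by Figure 14-1: the same nucleotides mean different things depending on where reading begins.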
Prokaryotic mRNAs Have a Ribosome Binding Site that Recruits the Translational Machinery
For translation to occur, the ribosome must be recruited to the mRNA. To facilitate binding by a ribosome, many prokaryotic open-reading frames contain a short sequence upstream (on the 5' side) of the start codon called the ribosome binding site (RBS). This element is also referred to as a Shine-Dalgarno sequence after the scientists who discovered it on the basis of comparing the sequences of multiple mRNAs. The ribosome binding site, typically located three to nine base pairs on the 5' side of the start codon, is complementary to a sequence located near the 3' end of one of the RNA components, the 16S ribosomal RNA (rRNA) (see Figure 14-2a). The ribosome binding site base-pairs with this RNA component, thereby aligning the ribosome with the beginning of the open-reading frame. The core of this region of the 16S rRNA has the sequence 5'-CCUCCU-3'. Not surprisingly, prokaryotic ribosome binding sites are most often a subset of the sequence 5'-AGGAGG-3'. The extent of complementarity and the spacing between the ribosome binding site and the start codon has a strong influence on how actively a particular open-reading frame is translated: high complementarity and proper spacing promotes active translation, whereas limited complementarity and/or poor spacing generally supports lower levels of translation.

FIGURE 14-2 Structure of messenger RNA. (a) A polycistronic prokaryotic message. The ribosome binding site is indicated by RBS. (b) A monocistronic eukaryotic message. The 5' cap is indicated by a "ball" at the end of the mRNA.

Some prokaryotic ORFs internal to a polycistronic message lack a strong ribosome binding site but are nonetheless actively translated. In these cases the start codon often overlaps the 3' end of the adjacent open-reading frame (most often as the sequence 5'-AUGA-3', which contains a start and a stop codon). Thus, a ribosome that has just completed translating the upstream open-reading frame is appropriately positioned to begin translating from the start codon for the downstream open-reading frame, circumventing the need for a ribosome binding site to recruit the ribosome. This phenomenon of linked translation between overlapping open-reading frames is known as translational coupling.
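The relationship between the 16S rRNA core and the ribosome binding site described in this section can be sketched as follows. The sketch derives 5'-AGGAGG-3' as the reverse complement of 5'-CCUCCU-3' and then looks upstream of a start codon for the longest stretch of that consensus. The function names, the minimum match length of four, and the definition of "spacing" as the number of nucleotides between the end of the match and the start codon are assumptions made for the example, not rules stated in the text.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


RRNA_CORE = "CCUCCU"                            # core of the 16S rRNA region
CONSENSUS_RBS = reverse_complement(RRNA_CORE)   # the complementary mRNA motif


def best_rbs_match(mrna, start_index, min_len=4, spacings=range(3, 10)):
    """Return (motif, spacing) for the longest consensus fragment upstream.

    start_index is the position of the first base of the start codon.
    Returns None if no fragment of at least min_len bases is found at an
    allowed spacing. All thresholds are illustrative assumptions.
    """
    full = len(CONSENSUS_RBS)
    for length in range(full, min_len - 1, -1):          # longest first
        for offset in range(full - length + 1):
            motif = CONSENSUS_RBS[offset:offset + length]
            for spacing in spacings:
                pos = start_index - spacing - length
                if pos >= 0 and mrna[pos:pos + length] == motif:
                    return motif, spacing
    return None


if __name__ == "__main__":
    message = "UUAGGAGGUUUACCAUGGCU"     # made-up leader plus start codon
    print(CONSENSUS_RBS)
    print(best_rbs_match(message, message.index("AUG")))
```

A longer match and a spacing inside the allowed window would, by the argument in the text, correspond to a more actively translated open-reading frame; the sketch only reports the match and does not attempt to predict expression levels.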
Eukaryotic mRNAs Are Modified at Their 5' and 3' Ends to Facilitate Translation
Unlike their prokaryotic counterparts, eukaryotic mRNAs recruit ribosomes using a specific chemical modification called the 5' cap, which is located at the extreme 5' end of the message (see Chapter 12 and Figure 14-2b). The 5' cap is a methylated guanine nucleotide that is joined to the 5' end of the mRNA via an unusual 5' to 5' linkage. Created in three steps (see Chapter 12), the guanine nucleotide of the 5' cap is connected to the 5' end of the mRNA through three phosphate groups. The resulting structure recruits the ribosome to the mRNA. Once bound to the mRNA, the ribosome moves in a 5' → 3' direction until it encounters a 5'-AUG-3' start codon, a process called scanning.
Two other features of eukaryotic mammalian mRNAs stimulate translation. One feature is the presence, in some mRNAs, of a purine three bases upstream of the start codon and a guanine immediately downstream (5'-G/ANNAUGG-3'). This sequence was originally identified by Marilyn Kozak and is referred to as the Kozak sequence. Many eukaryotic mRNAs lack these bases, but their presence increases the efficiency of translation. In contrast to the situation in prokaryotes, these bases are thought to interact with initiator tRNA, not with the small rRNA.
A second feature that contributes to efficient translation is the presence of a poly-A tail at the extreme 3' end of the mRNA. As we saw in Chapter 12, this tail is added enzymatically by the enzyme poly-A polymerase. Despite its location at the 3' end of the mRNA, the poly-A tail enhances the level of translation of the mRNA by promoting efficient recycling of ribosomes (as we shall discuss later).
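As a small illustration of scanning and of the Kozak context 5'-G/ANNAUGG-3', the sketch below finds the first AUG from the 5' end and reports whether the two flanking positions match. The helper names and the example sequences are hypothetical, and the check is purely textual; it says nothing about how strongly a given context affects translation.

```python
def scan_for_start(mrna):
    """Return the index of the first AUG reading 5'->3', or -1 if none."""
    return mrna.find("AUG")


def kozak_context(mrna, aug_index):
    """Return (purine_three_upstream, guanine_just_downstream) as booleans.

    The purine is checked three bases upstream of the A of AUG, and the
    guanine at the base immediately after the G of AUG.
    """
    upstream = mrna[aug_index - 3] if aug_index >= 3 else None
    downstream = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else None
    return upstream in ("A", "G"), downstream == "G"


if __name__ == "__main__":
    for message in ("GCCACCAUGGCG", "UUUUUUAUGCUU"):   # made-up examples
        start = scan_for_start(message)
        print(message, start, kozak_context(message, start))
```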
TRANSFER RNA

tRNAs Are Adaptors between Codons and Amino Acids
At the heart of protein synthesis is the "translation" of nucleotide sequence information (in the form of codons) into amino acids. This is accomplished by tRNA molecules, which act as adaptors between codons and the amino acids they specify. There are many types of tRNA molecules, but each is attached to a specific amino acid and each recognizes a particular codon, or codons, in the mRNA (most tRNAs recognize more than one codon). tRNA molecules are between 75 and 95 ribonucleotides in length. Although the exact sequence varies, all tRNAs have certain features in common. First, all tRNAs end at the 3' terminus with the sequence 5'-CCA-3'. This is the site that is attached to the cognate amino acid by the enzyme aminoacyl tRNA synthetase, as we will consider below.
A second striking aspect of tRNAs is the presence of several unusual bases in their primary structure. These unusual features are created post-transcriptionally by enzymatic modification of normal bases in the polynucleotide chain. For example, pseudouridine (ΨU) is derived from uridine by an isomerization in which the site of attachment of the uracil base to the ribose is switched from the nitrogen at ring position 1 to the carbon at ring position 5 (Figure 14-3). Likewise, dihydrouridine (D) is derived from uridine by enzymatic reduction of the double bond between the carbons at positions 5 and 6. Other unusual bases found in tRNA include hypoxanthine, thymine, and methylguanine. These modified bases are not essential for tRNA function, but cells lacking these modified bases show reduced rates of growth. This suggests that the modified bases lead to improved tRNA function. For example, as we will see in Chapter 15, hypoxanthine plays an important role in the process of codon recognition by certain tRNAs.

FIGURE 14-3 A subset of modified nucleosides found in tRNA (uridine, pseudouridine, and dihydrouridine).
tRNAs Share a Common Secondary Structure that Resembles a Cloverleaf
As we saw in Chapter 6, RNA molecules typically contain regions of self-complementarity that enable them to form limited stretches of double helix that are held together by base pairing. Other regions of RNA molecules have no complement and hence are single-stranded. tRNA molecules exhibit a characteristic pattern of single-stranded and double-stranded regions (secondary structure) that can be illustrated as a cloverleaf (Figure 14-4). The principal features of the tRNA cloverleaf are an acceptor stem; three stem-loops, which are referred to as the ΨU loop, the D loop, and the anticodon loop; and a fourth variable loop. Descriptions of each of these features follow:
• The acceptor stem, so named because it is the site of attachment of the amino acid, is formed by pairing between the 5' and 3' ends of the tRNA molecule. The 5'-CCA-3' sequence at the extreme 3' end of the molecule protrudes from this double-stranded stem.
• The ΨU loop is so named because of the characteristic presence of the unusual base ΨU in the loop. The modified base is often found within the sequence 5'-TΨUCG-3'.
• The D loop takes its name from the characteristic presence of dihydrouridines in the loop.
• The anticodon loop, as its name implies, contains the anticodon, a three-nucleotide-long decoding element that is responsible for recognizing the codon by base-pairing with the mRNA. The anticodon is bracketed on the 3' end by a purine and on its 5' end by uracil.
• The variable loop sits between the anticodon loop and the ΨU loop and, as its name implies, varies in size from 3 to 21 bases.
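The base-pairing between anticodon and codon mentioned in the list above is antiparallel, so an anticodon written 5' to 3' is the reverse complement of its codon. The sketch below illustrates only strict Watson-Crick pairing; it deliberately ignores modified bases such as hypoxanthine and the fact that most tRNAs recognize more than one codon (topics deferred to Chapter 15). The function names are made up for the example.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the strictly Watson-Crick anticodon (5'->3') for a codon (5'->3')."""
    return "".join(PAIR[base] for base in reversed(codon))


def pairs_with(codon, anticodon):
    """True if every position pairs when the two strands are antiparallel."""
    return len(codon) == len(anticodon) and all(
        PAIR[c] == a for c, a in zip(codon, reversed(anticodon))
    )


if __name__ == "__main__":
    print(anticodon_for("AUG"))
    print(pairs_with("AUG", anticodon_for("AUG")))
```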
FIGURE 14-4 Cloverleaf representation of the secondary structure of tRNA. In this representation of a tRNA, the base-pairing between different parts of the tRNA is indicated by dotted red lines.

FIGURE 14-5 Conversion between the cloverleaf and the actual three-dimensional structure of a tRNA. (a) Cloverleaf representation. (b) L-shaped representation showing the location of the base-paired regions of the final folded tRNA. (c) Ribbon representation of the actual folded structure of a tRNA. Note that although this diagram illustrates how the actual tRNA structure is related to the cloverleaf representation, a tRNA does not attain its final structure by first base-pairing and then folding into an L-shape.
tRNAs Have an L-Shaped Three-Dimensional Structure
The cloverleaf reveals regions of self-complementarity within tRNAs. What is the actual three-dimensional configuration of this adaptor molecule? X-ray crystallography reveals an L-shaped tertiary structure in which the terminus of the acceptor stem is at one end of the molecule and the anticodon loop is about 70 Å away at the other end. To understand the relationship of this L-shaped structure (depicted as an upside-down L in Figure 14-5) to the cloverleaf, consider the following: the acceptor stem and the stem of the ΨU loop form an extended helix in the final tRNA structure. Similarly, the anticodon stem and the stem of the D loop form a second extended helix. These two extended helices align at a right angle to each other, with the D loop and the ΨU loop coming together. In the final image, the two extended helices adopt their proper helical configuration.
Three kinds of interactions stabilize this L-shaped structure. The first is hydrogen bonds between bases in different helical regions that are brought near each other in three-dimensional space by the tertiary structure. These are generally unconventional (non-Watson-Crick) bonds. The second are interactions between the bases and the sugar-phosphate backbone. The third kind of stabilizing interaction is the additional base stacking gained from formation of the two extended regions of base pairing.
ATTACHMENT OF AMINO ACIDS TO tRNA

tRNAs Are Charged by the Attachment of an Amino Acid to the 3' Terminal Adenosine Nucleotide via a High-Energy Acyl Linkage
tRNA molecules to which an amino acid is attached are said to be charged, and tRNAs that lack an amino acid are said to be uncharged. Charging requires an acyl linkage between the carboxyl group of the amino acid and the 2'- or 3'-hydroxyl group (see below) of the adenosine nucleotide that protrudes from the acceptor stem. This acyl linkage is considered to be a high-energy bond in that its hydrolysis results in a large change in free energy. This is significant for protein synthesis: the energy released when the bond is broken helps drive the formation of the peptide bonds that link amino acids to each other in polypeptide chains, as we will see below.
Aminoacyl tRNA Synthetases Charge tRNAs in Two Steps
All aminoacyl tRNA synthetases attach an amino acid to a tRNA in two enzymatic steps (Figure 14-6). Step one is adenylylation, in which the amino acid reacts with ATP to become adenylylated with the concomitant release of pyrophosphate. Adenylylation refers to transfer of AMP, as opposed to adenylation, which would indicate the transfer of adenine. As we have seen in the case of polynucleotide synthesis (see Chapter 8), the principal driving force for the adenylylation reaction is the subsequent hydrolysis of pyrophosphate by pyrophosphatase. As a result of adenylylation, the amino acid is attached to adenylic acid via a high-energy ester bond in which the carbonyl group of the amino acid is joined to the phosphoryl group of AMP. Step two is tRNA charging, in which the adenylylated amino acid, which remains tightly bound to the synthetase, reacts with tRNA. This reaction results in the transfer of the amino acid to the 3' end of the tRNA via the 2'- or 3'-hydroxyl and the concomitant release of AMP.
There are two classes of tRNA synthetases (Table 14-1). Class I enzymes attach the amino acid to the 2'OH of the tRNA and are generally monomeric. Class II enzymes attach the amino acid to the 3'OH of the tRNA and are typically dimeric or tetrameric. Although the initial coupling between the tRNA and the amino acid is different, once released from the synthetase, the amino acid rapidly equilibrates between attachment at the 3'OH and the 2'OH.

FIGURE 14-6 The two steps of aminoacyl-tRNA charging. (a) Adenylylation of amino acid. (b) Transfer of the adenylylated amino acid to tRNA. The process shown is for a class II tRNA synthetase.

TABLE 14-1 Classes of Aminoacyl tRNA Synthetases*

Class II   Quaternary Structure      Class I   Quaternary Structure
Gly        (α2β2)                    Glu       (α)
Ala        (α4)                      Gln       (α)
Pro        (α2)                      Arg       (α)
Ser        (α2)                      Cys       (α2)
Thr        (α2)                      Met       (α2)
His        (α2)                      Val       (α)
Asp        (α2)                      Ile       (α)
Asn        (α2)                      Leu       (α)
Lys        (α2)                      Tyr       (α)
Phe        (α2β2)                    Trp       (α)

Source: Data from Delarue M. 1995. Aminoacyl-tRNA synthetases. Current Opinion in Structural Biology 5: 48-55, adapted from Table 1.
*Class I enzymes are generally monomeric, whereas class II enzymes are dimeric or tetrameric, with residues from two subunits contributing to the binding site for a single tRNA. α and β refer to subunits of the tRNA synthetases and the subscripts indicate their stoichiometry.
Each Aminoacyl tRNA Synthetase Attaches a Single Amino Acid to One or More tRNAs
Each of the 20 amino acids is attached to the appropriate tRNA by a single, dedicated tRNA synthetase. Because most amino acids are specified by more than one codon (see Chapter 15), it is not uncommon for one synthetase to recognize and charge more than one tRNA (known as isoaccepting tRNAs). Nevertheless, the same tRNA synthetase is responsible for charging all tRNAs for a particular amino acid. Thus, one and only one tRNA synthetase attaches each amino acid to all of the appropriate tRNAs.
Most organisms have 20 different tRNA synthetases, but this is not always the case. For example, some bacteria lack a synthetase for charging the tRNA for glutamine (tRNA^Gln) with its cognate amino acid. Instead, a single species of aminoacyl tRNA synthetase charges tRNA^Gln as well as tRNA^Glu with glutamate. A second enzyme then converts (by amination) the glutamate moiety of the charged tRNA^Gln molecules to glutamine. That is, Glu-tRNA^Gln is aminated to Gln-tRNA^Gln (the prefix identifies the amino acid and the superscript identifies the nature of the tRNA). The presence of this second enzyme removes the need for a glutamine tRNA synthetase. Nevertheless, an aminoacyl tRNA synthetase can never attach more than one kind of amino acid to a given tRNA.
tRNA Synthetases Recognize Unique Structural Features of Cognate tRNAs
As we can see from the above considerations, aminoacyl tRNA synthetases face two important challenges: they must recognize the correct set of tRNAs for a particular amino acid, and they must charge all of these isoaccepting tRNAs with the correct amino acid. Both processes must be carried out with high fidelity.
Let us first consider the specificity of tRNA recognition: what features of the tRNA molecule enable a synthetase to discriminate cognate, isoaccepting tRNAs from the tRNAs for the other 19 amino acids? Genetic, biochemical, and X-ray crystallographic evidence indicates that the specificity determinants are clustered at two distant sites on the molecule: the acceptor stem and the anticodon loop (Figure 14-7). The acceptor stem is an especially important determinant for the specificity of tRNA synthetase recognition. In some cases changing a single base pair in the acceptor stem (a particular base pair known as the discriminator) is sufficient to convert the recognition specificity of a tRNA from one synthetase to another. Nonetheless, the anticodon loop frequently contributes to discrimination as well. The synthetase for glutamine, for example, makes numerous contacts in both the acceptor stem and across the anticodon loop, including the anticodon itself (Figure 14-8).
FIGURE 14-7 Structure of tRNA: elements required for aminoacyl synthetase recognition.

FIGURE 14-8 Co-crystal structure of glutaminyl aminoacyl tRNA synthetase with tRNA^Gln. The enzyme is shown in gray and tRNA^Gln is shown in purple. The yellow, red, and green molecule is glutaminyl-AMP. Note the proximity of this molecule to the 3' end of the tRNA and the points of contact between the tRNA and the synthetase. (Rath V.L., Silvian L.F., Beijer B., Sproat B.S., and Steitz T.A. 1998. Structure 6: 439-449.) Image prepared with BobScript, MolScript, an
You might expect that the anticodon would almost always be used for recognition by tRNA synthetases because it is the ultimate defining feature of a tRNA: the anticodon dictates the amino acid that the tRNA is responsible for incorporating into the growing polypeptide chain. However, because each amino acid is usually specified by more than one codon, recognition of the anticodon cannot be used in all cases. For example, the amino acid serine is specified by six codons, including 5'-AGC-3' and 5'-UCA-3', which are completely different from one another. Hence, the tRNAs for serine necessarily have a variety of different anticodons, which could not be easily recognized by a single tRNA synthetase. So, to recognize its tRNAs, the synthetase for serine must rely on determinants that lie outside of the anticodon.
The set of tRNA determinants that enable synthetases to discriminate among tRNAs is sometimes referred to as the "second genetic code" because of its central importance in information flow. As we discussed above, this code is significantly more complex than the "first genetic code" and cannot be readily tabulated. Without such a code, synthetases could not distinguish one tRNA from another, and the translation machinery would not produce polypeptides with a reproducible sequence.
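The serine example can be checked directly. The sketch below lists the six serine codons of the standard genetic code, derives a strictly Watson-Crick anticodon for each, and asks whether any of the three positions is the same across all six. The helper names are hypothetical, and pairing each codon with its own exact anticodon is a simplification: real cells use fewer serine tRNAs because most tRNAs read more than one codon.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# The six serine codons in the standard genetic code.
SERINE_CODONS = ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"]


def anticodon_for(codon):
    """Strict Watson-Crick anticodon, written 5'->3' (simplifying assumption)."""
    return "".join(PAIR[base] for base in reversed(codon))


def conserved_positions(triplets):
    """Return the positions (0, 1, 2) at which all triplets share one base."""
    return [i for i in range(3) if len({t[i] for t in triplets}) == 1]


if __name__ == "__main__":
    anticodons = [anticodon_for(c) for c in SERINE_CODONS]
    print(anticodons)
    print(conserved_positions(SERINE_CODONS))
    print(conserved_positions(anticodons))
```

No position is shared by all six codons, so no single base of the anticodon could identify every serine tRNA, which is consistent with the synthetase relying on determinants elsewhere in the molecule.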
Aminoacyl-tRNA Formation Is Very Accurate
The challenge faced by aminoacyl tRNA synthetases in selecting the correct amino acid is perhaps even more daunting than the challenge the enzyme faces in recognizing the appropriate tRNA (Figure 14-9). The reason for this is the relatively small size of amino acids and, in some cases, their similarity. Despite this challenge, the frequency of mischarging is very low; typically, less than 1 in 1,000 tRNAs is charged with the incorrect amino acid. In certain cases it is easy to understand how this high accuracy is achieved. For example, the amino acids cysteine and tryptophan differ substantially in size, shape, and chemical groups. Even in the case of the similar-looking amino acids tyrosine and phenylalanine (see Figure 14-9a), the opportunity for forming a strong and energetically favorable hydrogen bond with the hydroxyl moiety of the former but not the latter allows the synthetase for tyrosine (tyrosyl tRNA synthetase) to discriminate effectively against phenylalanine.

FIGURE 14-9 Distinguishing features of similar amino acids. (a) Tyrosine and phenylalanine. (b) Isoleucine and valine.

It is more challenging to understand the case of isoleucine and valine, which differ by only a single methylene group (see Figure 14-9b). Valyl tRNA synthetase can sterically exclude isoleucine from its catalytic pocket because isoleucine is larger than valine. In contrast, valine should slip easily into the catalytic pocket of the isoleucyl tRNA synthetase. Although both amino acids will fit into the synthetase amino acid binding site, interactions with the extra methylene group on isoleucine will provide an extra -2 to -3 kcal/mol of free energy (see Table 3-1). As we described in Chapter 3, even this relatively small difference in free energy will make binding to isoleucine approximately 100-fold more likely than binding to valine, if the two amino acids are present at equal concentrations. Thus, valine would be attached to isoleucine tRNAs approximately 1% of the time; however, this is an unacceptably high rate of error. As we have seen, the actual frequency of misincorporation is < 0.1%. How is such high fidelity achieved?
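The link between a free-energy difference of 2 to 3 kcal/mol and a preference of roughly 100-fold follows from the standard Boltzmann relation, in which the ratio of binding preferences is exp(ΔΔG/RT). The sketch below evaluates that expression; the temperature of 310 K (about 37 °C) and the function name are assumptions made for the illustration, not values given in the text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)


def discrimination_factor(ddg_kcal_per_mol, temperature_k=310.0):
    """Ratio of binding preferences for a free-energy advantage ddg (kcal/mol).

    ddg is given as a positive magnitude; temperature_k defaults to an
    assumed 310 K.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    for ddg in (2.0, 2.5, 3.0):
        print(f"{ddg:.1f} kcal/mol -> about {discrimination_factor(ddg):.0f}-fold")
```

Under these assumptions the factor runs from a few tens at 2 kcal/mol to somewhat over one hundred at 3 kcal/mol, so the "approximately 100-fold" quoted in the text sits toward the upper end of that range.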
Some Aminoacyl tRNA Synthetases Use an Editing Pocket to Charge tRNAs with High Accuracy

One common mechanism to increase the fidelity of an aminoacyl tRNA synthetase is to proofread the products of the charging reaction, as we have seen for DNA polymerases in Chapter 8. For example, in addition to its catalytic pocket (for adenylylation), the isoleucyl tRNA synthetase has a nearby editing pocket (a deep cleft in the enzyme) that allows it to proofread the product of the adenylylation reaction. AMP-valine (as well as adenylylates of other small amino acids, such as alanine) can fit into this editing pocket, where it is hydrolyzed and released as free valine and AMP. In contrast, AMP-isoleucine is too large to enter the editing pocket and hence is not subject to hydrolysis. Therefore, the editing pocket is a molecular sieve that excludes AMP-isoleucine but not AMP-valine. As a consequence, isoleucyl tRNA synthetase is able to discriminate against valine twice: in the initial binding and adenylylation of the amino acid (discriminating by a factor of approximately 100), and then in the editing of the adenylylated amino acid (again discriminating by a factor of approximately 100), for an overall selectivity of approximately 10,000-fold (that is, an error rate of approximately 0.01%).
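The arithmetic of two independent discrimination steps can be written out explicitly. This is a minimal sketch that assumes the two steps act independently, so that their selectivities multiply; the function name is illustrative.

```python
def overall_selectivity(*stage_selectivities):
    """Multiply per-stage discrimination factors, assuming the stages are independent."""
    total = 1.0
    for factor in stage_selectivities:
        total *= factor
    return total


if __name__ == "__main__":
    # Initial binding/adenylylation (~100-fold) followed by editing (~100-fold).
    selectivity = overall_selectivity(100, 100)
    error_rate_percent = 100.0 / selectivity
    print(f"overall selectivity ~{selectivity:.0f}-fold, "
          f"error rate ~{error_rate_percent:.2f}%")
```

With the approximate factors quoted above, the product is 10,000-fold, which corresponds to an error rate of about 0.01%.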
The Ribosome Is Unable to Discriminate between Correctly and Incorrectly Charged tRNAs

The reason that so much responsibility falls on aminoacyl tRNA synthetases to ensure that the proper amino acid has been attached to the proper tRNA is that no further discrimination takes place after the charged tRNA is released from that enzyme. In other words, the ribosome "blindly" accepts any charged tRNA that exhibits a proper codon-anticodon interaction, whether or not the tRNA carries its cognate amino acid. This conclusion is supported by two kinds of experiments: one genetic and the other biochemical. The genetic experiment involves
Box 14-1 Selenocysteine
Certain proteins, such as the enzymes glutathione peroxidase and formate dehydrogenase, contain an unusual amino acid called selenocysteine, which is part of the catalytic center of the enzymes. Selenocysteine contains the trace element selenium in place of the sulfur atom of cysteine (Box 14-1 Figure 1). Interestingly, selenocysteine is not incorporated into proteins by chemical modification after translation (as is true for certain other unusual amino acids, such as hydroxyproline). Instead, it is inserted during translation in response to a UGA codon. Incorporation of selenocysteine at UGA codons requires the presence of a special sequence element elsewhere in the mRNA. Thus, selenocysteine can be thought of as a 21st amino acid that is incorporated into proteins by a modification of the standard translation machinery of the cell.
BOX 14-1 FIGURE 1 The structures of cysteine and selenocysteine.
the isolation of a mutant tRNA that carries a nucleotide substitution in the anticodon. Recall that tRNA synthetases frequently do not rely on interaction with the anticodon to recognize cognate tRNAs. Hence, a subset of tRNAs can be mutated in their anticodons but still be charged with their usual cognate amino acids. As a consequence of the anticodon mutation, however, the mutant tRNA delivers its amino acid to the wrong codon. In other words, the ribosome and the auxiliary proteins that work in conjunction with the ribosome (which we will discuss shortly) primarily check that the charged tRNA makes a proper codon-anticodon interaction with the mRNA. The ribosome and these proteins do little to prevent an incorrectly charged tRNA from adding an inappropriate amino acid to the growing polypeptide.

A classic biochemical experiment nicely illustrates the point that the ribosome recognizes the tRNA and not the amino acid that it is carrying. Consider the charged tRNA cysteinyl-tRNA^Cys (remember that the prefix identifies the amino acid and the superscript identifies the nature of the tRNA). The cysteine attached to cysteinyl-tRNA^Cys can be converted to an alanine by chemical reduction to give alanine-tRNA^Cys (Figure 14-10). When added to a cell-free protein-synthesizing system, alanine-tRNA^Cys introduces alanines at codons that specify insertion of cysteine. Thus, the translation machinery relies on the high fidelity of the aminoacyl tRNA synthetases to ensure the accurate decoding of each mRNA (see Box 14-1, Selenocysteine).
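The logic of the biochemical experiment can be illustrated with a toy decoder in which the "ribosome" selects a charged tRNA purely by its anticodon and never inspects the attached amino acid. This is only a sketch: the three-codon message, the anticodons (perfect Watson-Crick matches, ignoring wobble), and all names are illustrative assumptions.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (written 5'->3') that pairs perfectly with a codon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def translate(mrna, charged_trnas):
    """Decode an mRNA using a pool of charged tRNAs.

    charged_trnas maps anticodon -> attached amino acid. Selection uses only
    the codon-anticodon match, mimicking a ribosome that is blind to the
    amino acid actually carried by the tRNA.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        peptide.append(charged_trnas[anticodon_for(codon)])
    return peptide


# Hypothetical pools: the second mimics chemical reduction of Cys to Ala
# on tRNA^Cys (anticodon GCA, which reads the cysteine codon UGC).
normal_pool = {"CAU": "Met", "GCA": "Cys", "GAA": "Phe"}
reduced_pool = dict(normal_pool, GCA="Ala")

if __name__ == "__main__":
    mrna = "AUGUGCUUC"  # Met-Cys-Phe in the standard code
    print(translate(mrna, normal_pool))
    print(translate(mrna, reduced_pool))
```

In this toy model the second pool places alanine wherever the message specifies cysteine, because nothing downstream of the charging step checks the amino acid.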
THE RIBOSOME

The ribosome is the macromolecular machine that directs the synthesis of proteins. Consistent with the additional challenges of translating a nucleic acid code into an amino acid code, the ribosome is larger and more complex than the minimal machinery required for DNA or RNA synthesis. Indeed, single polypeptides can perform DNA or RNA
FIGURE 14-10 Cysteinyl-tRNA charged with C or A. Chemical reduction of cysteine attached to cysteinyl-tRNA.
synthesis (although DNA replication and transcription are also often mediated by larger multisubunit complexes). In contrast, the machinery for polymerizing amino acids is composed of at least three RNA molecules up to about three kilobases in size and more than 50 different proteins, with an overall molecular mass of greater than 2.5 megadaltons. Compared to the speed of DNA replication (200 to 1,000 nucleotides per second), translation takes place at a rate of only 2 to 20 amino acids per second.

In prokaryotes, the transcription machinery and the translation machinery are located in the same compartment. Thus, the ribosome can commence translation of the mRNA as it emerges from the RNA polymerase. This situation allows the ribosome to proceed in tandem with the RNA polymerase as it elongates the transcript (Figure 14-11). Recall that the 5' end of an RNA is synthesized first, and thus translation, which also starts at the 5' end of the mRNA, can commence on nascent transcripts as soon as they emerge from the RNA polymerase. Interestingly, there are several instances in which the coupling of transcription and translation is exploited during the regulation of gene expression, as we shall see in Chapter 16.

Although slow relative to DNA synthesis, in prokaryotes the ribosome is capable of keeping up with the transcription machinery. The typical prokaryotic rate of translation of 20 amino acids per second corresponds to the translation of 60 nucleotides (20 codons) of mRNA per second. This is similar to the rate of 50 to 100 nucleotides per second synthesized by RNA polymerase.

In contrast to the situation in prokaryotes, translation in eukaryotes is completely separate from transcription. Indeed, these events occur in separate compartments of the cell: transcription occurs in the nucleus, whereas translation occurs in the cytoplasm. Perhaps due to the lack of coupling to transcription, eukaryotic translation proceeds at the more leisurely speed of 2 to 4 amino acids per second.
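The rate comparison is a unit conversion: each amino acid corresponds to one codon of three nucleotides. The sketch below uses the rates quoted in the text; the function names and the default RNA polymerase range are illustrative.

```python
NUCLEOTIDES_PER_CODON = 3


def mrna_read_rate(amino_acids_per_second):
    """Nucleotides of mRNA traversed per second at a given translation rate."""
    return amino_acids_per_second * NUCLEOTIDES_PER_CODON


def keeps_pace_with_rnap(amino_acids_per_second, rnap_min=50, rnap_max=100):
    """True if the ribosome's read rate falls within the RNA polymerase range (nt/s)."""
    return rnap_min <= mrna_read_rate(amino_acids_per_second) <= rnap_max


if __name__ == "__main__":
    for label, rate in (("prokaryote", 20), ("eukaryote, low", 2), ("eukaryote, high", 4)):
        print(f"{label}: {rate} aa/s = {mrna_read_rate(rate)} nt/s, "
              f"within RNAP range: {keeps_pace_with_rnap(rate)}")
```

At 20 amino acids per second the ribosome covers 60 nucleotides per second, inside the 50 to 100 nucleotides per second range given for RNA polymerase, whereas the eukaryotic rates correspond to only 6 to 12 nucleotides per second.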
FIGURE 14-11 Prokaryotic RNA polymerase and the ribosome at work on the same mRNA.

FIGURE 14-12 Sedimentation by ultracentrifugation to separate individual ribosome subunits and the full ribosome.
The Ribosome Is Composed of a Large and a Small Subunit

The ribosome is composed of two subassemblies of RNA and protein known as the large and small subunits. The large subunit contains the peptidyl transferase center, which is responsible for the formation of peptide bonds. The small subunit contains the decoding center in which charged tRNAs read or "decode" the codon units of the mRNA.

By convention, the large and small subunits are named according to the velocity of their sedimentation when subjected to a centrifugal force (Figure 14-12). The unit used to measure sedimentation velocity is the Svedberg (S; the larger the S value the faster the sedimentation velocity), which is named after the inventor of the ultracentrifuge, Theodor Svedberg. In bacteria the large subunit has a sedimentation velocity of 50 Svedberg units and is accordingly known as the 50S subunit, whereas the small subunit is called the 30S subunit. The intact prokaryotic ribosome is referred to as the 70S ribosome. Notice that 70S is less than the sum of 50S and 30S! The explanation for this apparent discrepancy is that sedimentation velocity is determined by both shape and size and hence is not a measure of mass. The eukaryotic ribosome is somewhat larger, composed of 60S and 40S subunits, which together form an 80S ribosome.

The large and small subunits are each composed of RNA, known as ribosomal RNAs, and many ribosomal proteins (Figure 14-13). Svedberg units are once again used to distinguish among the ribosomal RNAs. Thus, in bacteria the 50S subunit contains a 5S rRNA and a 23S rRNA, whereas the 30S subunit contains a single, 16S rRNA. Although there are far more ribosomal proteins than ribosomal RNAs in each subunit, the mass of the ribosome is approximately half protein and half RNA. This is true because the ribosomal proteins are small (the average molecular weight of a ribosomal protein in the bacterial small subunit is ~15 kDa). In contrast, the 16S and 23S rRNAs are large. Recall that, on average, a single nucleotide has a molecular weight of 330 daltons; therefore, the 2,900-nucleotide-long 23S rRNA has a molecular weight of almost 1,000 kDa.
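The rRNA mass estimate follows directly from the average nucleotide weight. The sketch below uses the 330 Da figure and the rRNA lengths given in the text and in Figure 14-13; the comparison to 15 kDa proteins is only a rough illustration, since that average is quoted for small-subunit proteins.

```python
AVG_NUCLEOTIDE_DA = 330    # average molecular weight of one nucleotide (from the text)
AVG_SMALL_PROTEIN_KDA = 15  # average small-subunit ribosomal protein (from the text)


def rrna_mass_kda(length_nt):
    """Approximate molecular weight of an rRNA in kDa from its length in nucleotides."""
    return length_nt * AVG_NUCLEOTIDE_DA / 1000


if __name__ == "__main__":
    for name, length in (("23S", 2900), ("16S", 1540)):
        mass = rrna_mass_kda(length)
        print(f"{name} rRNA: {length} nt ~ {mass:.0f} kDa "
              f"(~{mass / AVG_SMALL_PROTEIN_KDA:.0f} proteins of 15 kDa)")
```

For the 23S rRNA this gives 957 kDa, the "almost 1,000 kDa" quoted above, which is why a few rRNAs can account for a mass comparable to that of dozens of small proteins.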
The Large and Small Subunits Undergo Association and Dissociation during Each Cycle of Translation

Central to the mechanism of translation is a cycle in which the small and large subunits of the ribosome associate with each other and the mRNA, translate the target mRNA, then dissociate after each round of protein synthesis. This sequence of association and dissociation is
FIGURE 14-13 Composition of the prokaryotic and eukaryotic ribosomes. The rRNA and protein composition of the different subunits are indicated. The sizes of the rRNA and the number of proteins are indicated.

Prokaryotic ribosome: 70S (MW = 2,500,000)
  50S subunit (MW = 1,600,000): 5S rRNA (120 nucleotides); 23S rRNA (2,900 nucleotides); ~34 proteins
  30S subunit: 16S rRNA (1,540 nucleotides); 21 proteins

Eukaryotic ribosome: 80S (MW = 4,200,000)
  60S subunit: 5.8S rRNA (160 nucleotides); 5S rRNA; 28S rRNA (4,700 nucleotides); ~49 proteins
  40S subunit (MW = 1,400,000): 18S rRNA (1,900 nucleotides); ~33 proteins
known as the ribosome cycle (Figure 14-14). Briefly, translation begins with the binding of the mRNA and an initiating tRNA to a free, small subunit of the ribosome. The small subunit-mRNA complex then recruits a large subunit to create an intact ribosome with the mRNA sandwiched between the two subunits. Protein synthesis is initiated in the next step, commencing at the start codon at the 5' end of the message and progressing downstream toward the 3' end of the mRNA. As the ribosome translocates from codon to codon, one charged tRNA after another is slotted into the decoding and peptidyl transferase centers of the ribosome. When the elongating ribosome encounters a stop codon, the now completed polypeptide chain is released, and the ribosome dissociates from the mRNA as separate large and small subunits. The separated subunits are now available to bind to a fresh mRNA molecule and repeat the cycle of protein synthesis.

Although a ribosome can synthesize only one polypeptide at a time, each mRNA can be translated simultaneously by multiple ribosomes (for simplicity let us assume that the message we are considering is monocistronic). An mRNA bearing multiple ribosomes is known as a polyribosome or a polysome (Figure 14-15). A single ribosome is in contact with approximately 30 nucleotides of mRNA, but the large size of the ribosome only allows a density of one ribosome for every 80 nucleotides of mRNA. Thus, a single mRNA molecule is able to direct the simultaneous synthesis of multiple polypeptides using an array of ribosomes.

The ability of multiple ribosomes to function on a single mRNA explains the relatively limited abundance of mRNA in the cell (typically 1 to 5% of total RNA). If an mRNA could be translated by only one ribosome at a time, as few as 10% of the ribosomes would be engaged in protein synthesis at any time. Instead, the association of multiple ribosomes with each mRNA indicates that the majority of the ribosomes are engaged in translation.

FIGURE 14-14 Overview of the events of translation.
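The packing limit of one ribosome per 80 nucleotides sets an upper bound on how many ribosomes a message can carry. A minimal sketch follows; the 1,200-nucleotide message length is a hypothetical example, not a value from the text.

```python
NT_PER_RIBOSOME = 80  # maximum packing density quoted in the text


def max_ribosomes(mrna_length_nt, spacing_nt=NT_PER_RIBOSOME):
    """Upper bound on ribosomes simultaneously translating one mRNA."""
    return mrna_length_nt // spacing_nt


if __name__ == "__main__":
    length = 1200  # hypothetical mRNA length in nucleotides
    print(f"A {length}-nt mRNA can carry at most {max_ribosomes(length)} ribosomes")
```

Under this assumption, a 1,200-nucleotide message could direct the synthesis of up to 15 polypeptides at once.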
New Amino Acids Are Attached to the C-Terminus of the Growing Polypeptide Chain

As we know, both polynucleotide and polypeptide chains have intrinsic polarities. Thus, for each of these molecules we can ask which end of the chain is synthesized first. We learned in Chapters 8 and 12 that
FIGURE 14-15 A polyribosome.

FIGURE 14-16 The peptidyl transferase reaction.
DNA and RNA are synthesized by adding each new nucleotide triphosphate to the 3' end of the growing polynucleotide chain (often referred to as synthesis in the 5' to 3' direction). What is the order of synthesis of a growing polypeptide chain? This was first determined in a classic experiment performed by Dintzis that was described in Chapter 2. This experiment found that each new amino acid must be added to the C-terminus of the growing polypeptide chain (often referred to as synthesis in the N- to C-terminal direction). As described in the next section, this directionality is a direct result of the chemistry of protein synthesis.
Peptide Bonds Are Formed by Transfer of the Growing Polypeptide Chain from One tRNA to Another

The ribosome catalyzes a single chemical reaction: the formation of a peptide bond. This reaction occurs between the amino acid residue at the carboxy-terminal end of the growing polypeptide and the incoming amino acid to be added to the chain. Both the growing chain and the incoming amino acid are attached to tRNAs; as a result, during peptide bond formation, the growing polypeptide is continuously attached to a tRNA.

The actual substrates for each round of amino acid addition are two charged species of tRNAs: an aminoacyl-tRNA and a peptidyl-tRNA. As we discussed earlier in this chapter (see the section, Attachment of Amino Acids to tRNAs), the aminoacyl-tRNA is attached at its 3' end to the carboxyl group of the amino acid. The peptidyl-tRNA is attached in exactly the same manner (at its 3' end) to the carboxyl-terminus of the growing polypeptide chain. The bond between the aminoacyl-tRNA and the amino acid is not broken during the formation of the next peptide bond. Instead, the 3' ends of these two tRNAs are brought into close proximity to each other on the ribosome. This positioning allows the amino group of the aminoacyl-tRNA to attack the carbonyl group of the most carboxyl-terminal amino acid attached to the peptidyl-tRNA to form a new peptide bond (Figure 14-16).

There are two consequences of this method of polypeptide synthesis. First, this mechanism of peptide bond formation requires that the N-terminus of the protein be synthesized before the C-terminus. Second, the growing polypeptide chain is transferred from the peptidyl-tRNA to the aminoacyl-tRNA. For this reason, the reaction to form a new peptide bond is called the peptidyl transferase reaction.

Interestingly, peptide bond formation takes place without the simultaneous hydrolysis of a nucleoside triphosphate. This is because peptide bond formation is driven by breaking the high-energy acyl bond that joins the growing polypeptide chain to the tRNA. You will recall that this bond was created during the tRNA synthetase-catalyzed reaction that is responsible for charging the tRNA. The charging reaction involves the hydrolysis of a molecule of ATP. Thus, the energy for peptide bond formation originates from a molecule of ATP that was hydrolyzed during the tRNA charging reaction (Figure 14-6).
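The two consequences of the peptidyl transferase reaction (N- to C-terminal growth, and transfer of the chain onto the incoming tRNA) can be made concrete with a small model. This is a sketch only: chains are plain lists written N-terminus first, and the hand-off back to the P site is reduced to a single assignment.

```python
def peptidyl_transfer(p_site_chain, a_site_amino_acid):
    """Transfer the growing chain from the P-site tRNA onto the A-site amino acid.

    The chain is listed N-terminus first. The incoming amino acid, still
    attached to its own tRNA, becomes the new C-terminal residue, so the
    product is carried by the tRNA in the A site.
    """
    return p_site_chain + [a_site_amino_acid]


def synthesize(amino_acids):
    """Build a polypeptide by repeated peptidyl transfer, one residue per cycle."""
    p_site_chain = [amino_acids[0]]  # the initiator occupies the P site first
    for incoming in amino_acids[1:]:
        a_site_chain = peptidyl_transfer(p_site_chain, incoming)
        # The tRNA now carrying the chain moves to the P site for the next
        # cycle (modeled here as a simple reassignment).
        p_site_chain = a_site_chain
    return p_site_chain


if __name__ == "__main__":
    print(synthesize(["Met", "Ala", "Cys", "Phe"]))
```

Because each cycle appends to the carboxyl end, the first residue delivered always remains the N-terminus of the finished chain.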
Ribosomal RNAs Are Both Structural and Catalytic Determinants of the Ribosome

Although the ribosome and its basic functions were discovered more than 40 years ago, the recent determination of the high-resolution,
FIGURE 14-17 Two views of the ribosome. The 50S subunit is above the 30S subunit in both views. The cavity between the 50S and 30S subunits in the right-hand image represents the site of tRNA association (see Figure 14-19). The RNA component of the 50S subunit is shown in gray and the protein component is shown in purple. The RNA component of the 30S subunit is shown in light blue and the protein component in dark blue. (Yusupov M.M., Yusupova G.Z., Baucom A., Lieberman K., Earnest T.N., Cate J.H., and Noller H.F. 2001. Science 292: 883.) Images prepared with MolScript, BobScript, and Raster3D.
three-dimensional structure of the prokaryotic ribosome has vastly increased our understanding of the workings of this molecular machine (Figure 14-17). Perhaps the most important outcome of these studies is the definitive demonstration that ribosomal RNAs are not simply structural components of the ribosome. Rather, they are directly responsible for the key functions of the ribosome. The most obvious example of this is the demonstration that the peptidyl transferase center is composed entirely of RNA, as we will discuss in detail below. RNA also plays a central role in the function of the small subunit of the ribosome. The anticodon loops of the charged tRNAs and the codons of the mRNA contact the 16S rRNA, not the ribosomal proteins of the small subunit.

A further indication of the importance of RNA in the structure and function of the ribosome is that most ribosomal proteins are on the periphery of the ribosome, not in its interior (see Figure 14-19). The core functional domains of the ribosome (the peptidyl transferase center and the decoding center) are composed either entirely or mostly of RNA. Portions of some ribosomal proteins do reach into the core of the subunits, where their function seems to be to stabilize the tightly packed rRNAs by shielding the negative charges of their sugar-phosphate backbones. Indeed, it is likely that the contemporary ribosome evolved from a primitive protein-synthesizing machine that was composed entirely of RNA.
The Ribosome Has Three Binding Sites for tRNA

To carry out the peptidyl transferase reaction, the ribosome must be able to bind at least two tRNAs simultaneously. In fact, the ribosome contains
FIGURE 14-18 The ribosome has three tRNA binding sites. The schematic illustration of the ribosome shows the three binding sites (E, P, and A) that span the two subunits.
three tRNA binding sites, called the A, P, and E sites (Figures 14-18 and 14-19). The A site is the binding site for the aminoacylated tRNA, the P site is the binding site for the peptidyl-tRNA, and the E site is the binding site for the tRNA that is released after the growing polypeptide chain has been transferred to the aminoacyl-tRNA (E is for exit).

Each tRNA binding site is formed at the interface between the large and the small subunits of the ribosome (Figure 14-19a and b). In this way, the bound tRNAs can span the distance between the peptidyl transferase center in the large subunit (Figure 14-19c) and the decoding center in the small subunit (Figure 14-19d). The 3' ends of the tRNAs that are coupled to the amino acid or to the growing peptide chain are adjacent to the large subunit. The anticodon loops of the bound tRNAs are located adjacent to the small subunit.
Channels through the Ribosome Allow the mRNA and Growing Polypeptide to Enter and/or Exit the Ribosome

Both the decoding center and the peptidyl transferase center are buried within the intact ribosome. Yet mRNA must be threaded through the decoding center during translation, and the nascent polypeptide chain must escape from the peptidyl transferase center. How do these polymers enter (in the case of mRNA) and exit the ribosome? The answer is provided by the structure of the ribosome, which reveals "tunnels" in and out of the ribosome.

The mRNA enters and exits the decoding center through two narrow channels in the small subunit. The entry channel is only wide enough for unpaired RNA to pass through. This feature ensures that the mRNA is in an extended form as it enters the decoding center by removing any intramolecular base-pairing interactions that may have formed in the mRNA. In between the two channels is a region that is accessible to tRNAs and where adjacent codons can bind to the aminoacyl-tRNA and peptidyl-tRNA in the A and P sites, respectively. Interestingly, there is a pronounced kink in the mRNA between the two codons that facilitates maintenance of the correct reading frame (Figure 14-20). This kink places the vacant A site codon created after a cycle of ribosome translocation in a distinctive position that prevents the incoming aminoacyl-tRNA from accessing bases immediately adjacent to the codon.

A second channel through the large subunit provides an exit path for the newly synthesized polypeptide chain (Figure 14-21). As with the mRNA channels, the size of the channel limits the folding of the growing polypeptide chain. In this case, the polypeptide can form an α helix within the channel, but other secondary structures (such as β sheets) and tertiary interactions can only form after the polypeptide exits the large ribosomal subunit. For this reason, the final three-dimensional structure of a newly synthesized protein is not attained until after it is released from the ribosome.

Now that we have described the four primary components of the translation process, the remainder of the chapter will focus on the individual stages of translation in more detail. Our description will proceed in order through the three stages of translation: initiation of the synthesis of a new polypeptide chain, elongation of the growing polypeptide, and termination of polypeptide synthesis. As we will see, there are important similarities and differences between prokaryotes and eukaryotes in the strategies they employ to carry out protein synthesis. We shall consider the nature of the translation machinery from both kinds of cells in each of the following sections. As we have seen for DNA and RNA
FIGURE 14-19 Views of the three-dimensional structure of the ribosome including three bound tRNAs. The E, P, and A site tRNAs are shown in yellow, red, and green, respectively. The colors representing the RNA and protein components of the small and large subunits are the same as those in Figure 14-17. (a) and (b) Two views of the ribosome bound to the three tRNAs in the E, P, and A sites. Note that the left (a) and right (b) views shown here correspond to those views of the ribosome shown in Figure 14-17. (c) The isolated 50S subunit bound to tRNAs. The peptidyl transferase center is circled. (d) The isolated 30S subunit bound to tRNAs. The decoding center is circled. (Yusupov M.M., Yusupova G.Z., Baucom A., Lieberman K., Earnest T.N., Cate J.H., and Noller H.F. 2001. Science 292: 883.) Images prepared with MolScript, BobScript, and Raster3D.

FIGURE 14-20 The interaction between the A site and P site tRNAs and the mRNA within the ribosome. Two views of the structure of the mRNA and tRNAs are shown as they are found in the ribosome. For clarity, the ribosome is not shown. The E, P, and A site tRNAs are shown in yellow, red, and green, respectively, and the mRNA is shown in blue. Only the bases involved in the codon-anticodon interaction are shown. The strong kink in the mRNA clearly distinguishes between the A site and P site codons. The close proximity of the 3' ends of the A site and P site tRNAs can be seen in the lower image.
synthesis, although the ribosome is the center of activity, auxiliary factors play critical roles in each of the steps of translation and are required for protein synthesis to occur in a rapid and accurate fashion.
INITIATION OF TRANSLATION

For translation to be successfully initiated, three events must occur (Figure 14-22). First, the ribosome must be recruited to the mRNA. Second, a charged tRNA must be placed into the P site of the ribosome. Third, the ribosome must be precisely positioned over the start codon. The correct positioning of the ribosome over the start codon is critical, because this establishes the reading frame for the translation of the mRNA. Even a one-base shift in the location of the ribosome would result in the synthesis of a completely unrelated polypeptide (see the discussion of messenger RNA above and in Chapter 15). The dissimilar structures of prokaryotic and eukaryotic mRNAs result in distinctly different means of accomplishing these events. We will start by addressing the initiation events in prokaryotes and then discuss the differences observed in eukaryotic cells.

FIGURE 14-21 The polypeptide exit center. In this image the 50S subunit is cut in half to reveal the polypeptide exit tunnel. The rRNA is shown in white and the ribosomal proteins are shown in yellow. The three bound tRNAs are colored as follows: E-site (brown), P-site (purple), and A-site (green). The red and gold parts of the rRNA adjacent to the A-site tRNA are components of the peptidyl transferase center. (Source: Courtesy of T. Martin Schmeing and Thomas Steitz; from Schmeing T.M. et al. 2002. A pre-translocational intermediate in protein synthesis observed in crystals of enzymatically active 50S subunits. Nature Struct. Biol. 9: 225-230.)
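The effect of a one-base shift in the reading frame is easy to demonstrate by translating the same message from two adjacent starting positions. The sketch below builds the standard genetic code table and applies it to a short message; the example sequence and function names are illustrative assumptions.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position; '*' marks stop codons.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna, start=0):
    """Translate from position `start` in one-letter code, stopping at a stop codon."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    mrna = "AUGGCCUGUAAAGGUUGA"  # hypothetical message
    print("correct frame:   ", translate(mrna, start=0))
    print("shifted one base:", translate(mrna, start=1))
```

For this example the correct frame reads MACKG, whereas starting one nucleotide later yields WPVKV, a sequence with no residue in common at any position.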
Prokaryotic mRNAs Are Initially Recruited to the Small Subunit by Base-Pairing to rRNA

The assembly of the ribosome on an mRNA occurs one subunit at a time. The small subunit associates with the mRNA first. As we discussed earlier, for prokaryotes the association of the small subunit with the mRNA is mediated by base-pairing interactions between the ribosome binding site and the 16S rRNA (Figure 14-23). For ideally positioned ribosome binding sites, the small subunit is positioned on the mRNA such that the start codon will be in the P site when the large subunit joins the complex. The large subunit joins its partner only at the very end of the initiation process, just prior to the formation of the first peptide bond. Thus, many of the key events of translation initiation occur in the absence of the full ribosome.
A Specialized tRNA Charged with a Modified Methionine Binds Directly to the Prokaryotic Small Subunit

Typically, charged tRNAs enter the ribosome in the A site and only reach the P site after a round of peptide bond synthesis. During initiation, however, a charged tRNA enters the P site directly. This event requires a special tRNA known as the initiator tRNA, which base-pairs with the start codon, usually AUG or GUG. AUG and GUG have a different meaning when they occur within an open reading frame, where they are read by tRNAs for methionine (tRNA^Met) and valine (tRNA^Val), respectively (see Chapter 15). Neither methionine nor valine is attached to the initiator tRNA. Instead, it is charged with a modified form of methionine (N-formyl methionine) that has a formyl group attached to its amino group (Figure 14-24). The charged initiator tRNA is referred to as fMet-tRNA_i^fMet.

Because N-formyl methionine is the first amino acid to be incorporated into a polypeptide chain, you might think that all prokaryotic proteins have a formyl group at their amino terminus. This is not the case, however, as an enzyme known as a deformylase removes the formyl group from the amino terminus during or after the synthesis of the polypeptide chain. In fact, many prokaryotic proteins do not even start with a methionine; aminopeptidases often remove the amino-terminal methionine as well as one or two additional amino acids.
Three Initiation Factors Direct the Assembly of an Initiation Complex that Contains mRNA and the Initiator tRNA

The initiation of prokaryotic translation commences with the small subunit and is catalyzed by three translation initiation factors called
FIGURE 14-22 An overview of the events of translation initiation.

FIGURE 14-23 The 16S rRNA interacts with the ribosome binding site to position the AUG in the P site. This illustration shows an mRNA with the ideal separation ...

FIGURE 14-24 Methionine and N-formyl methionine (fMet).

FIGURE 14-25 A model of initiation factor binding to the 30S ribosomal subunit. The estimated locations of IF1, IF2, and IF3 binding are shown along with the regions of the 30S ribosomal subunit that will become part of the A, P, and E sites. (Source: Adapted from Ramakrishnan V. 2002. Ribosome structure and the mechanism of translation. Cell 108: 560, fig 2. Copyright © 2002 with permission from Elsevier.)
IF1, IF2, and IF3. Each factor facilitates a key step in the initiation process (Figure 14-25):

• IF1 prevents tRNAs from binding to the portion of the small subunit that will become part of the A site.
• IF2 is a GTPase (a protein that binds and hydrolyzes GTP) that interacts with three key components of the initiation machinery: the small subunit, IF1, and the charged initiator tRNA (fMet-tRNA_i^fMet). By interacting with these components, IF2 facilitates the subsequent association of fMet-tRNA_i^fMet with the small subunit and prevents other charged tRNAs from associating with the small subunit.
• IF3 binds to the small subunit and blocks it from reassociating with a large subunit, or from binding charged tRNAs. Because initiation requires a free small subunit, the binding of IF3 is critical for a new cycle of translation. IF3 becomes associated with the small subunit at the end of a previous round of translation, when it helps to dissociate the 70S ribosome into its large and small subunits.

Each of the initiation factors binds at, or near, one of the three tRNA binding sites on the small subunit (Figure 14-25). Consistent with its role in blocking the binding of charged tRNAs to the A site, IF1 binds directly to the portion of the small subunit that will become the A site. IF2 binds to IF1 and reaches over the A site into the P site to contact the fMet-tRNA_i^fMet. Finally, IF3 occupies the part of the small subunit that will become the E site. Thus, of the three potential tRNA binding sites on the small subunit, only the P site is capable of binding a tRNA in the presence of the initiation factors.

With all three initiation factors bound, the small subunit is prepared to bind to the mRNA and the initiator tRNA (Figure 14-26). These two RNAs can bind in either order and independently of each other. As discussed above, binding to the mRNA typically involves base-pairing between the ribosome binding site and the 16S rRNA in the small subunit. Meanwhile, binding of fMet-tRNA_i^fMet to the small subunit is facilitated by its interactions with IF2 bound to GTP and (once the mRNA is bound) base-pairing between the anticodon and the start codon of the mRNA.

The last step of initiation involves the association of the large subunit to create the 70S initiation complex. When the start codon and fMet-tRNA_i^fMet base-pair, the small subunit undergoes a change in conformation. This altered conformation results in the release of IF3.
In the absence of IF3, the large subunit is free to bind to the small subunit with its cargo of IF1, IF2, mRNA, and fMet-tRNA_i^fMet. The binding of the large subunit stimulates the GTPase activity of IF2-GTP, causing it to hydrolyze GTP. The resulting IF2-GDP has reduced affinity for the ribosome and the initiator tRNA, leading to the release of IF2-GDP as well as IF1 from the ribosome. Thus, the net result of initiation is the formation of an intact (70S) ribosome assembled at the start site of the mRNA with fMet-tRNA_i^fMet in the P site and an empty A site. The ribosome-mRNA complex is now poised to accept a charged tRNA into the A site and commence polypeptide synthesis.
Eukaryotic Ribosomes Are Recruited to the mRNA by the 5' Cap
Initiation of translation in eukaryotes is similar to prokaryotic initiation in many ways. Both use a start codon and a dedicated initiator tRNA, and both use initiation factors to form a complex with the small ribosomal subunit that assembles on the mRNA prior to addition of the large subunit. Nevertheless, eukaryotes use a fundamentally distinct method to recognize the mRNA and the start codon, which has important consequences for eukaryotic translation. In eukaryotes, the small subunit is already associated with an initiator tRNA when it is recruited to the capped 5' end of the mRNA. It then "scans" along the mRNA in a 5'-to-3' direction until it reaches the first 5'-AUG-3' in the correct context (see the discussion of the Kozak sequence in the preceding section on mRNA), which it recognizes as the start codon. Thus, in most instances (see Box 14-2, uORFs and IRESs: Exceptions that Prove the Rule), only the first AUG can be used as the start site of translation in eukaryotic cells. Note that this method of initiation is consistent with the fact that the vast majority of eukaryotic mRNAs are monocistronic and encode a single polypeptide; recognition of an internal start codon is generally not possible or required.
As we have seen for other molecular processes (such as promoter recognition during transcription), eukaryotic cells require many more auxiliary proteins to drive the initiation process than do prokaryotes (although eukaryotes have initiation factors that correspond to the prokaryotic IF1, IF2, and IF3). Remarkably, more than 30 different polypeptides are involved in initiation of translation in eukaryotes. In contrast to the prokaryotic situation, in eukaryotic cells binding of the initiator tRNA to the small subunit always precedes association with the mRNA (Figure 14-27a).
As the eukaryotic ribosome completes a cycle of translation, it dissociates into free large and small subunits through the action of factors (called eIF3 and eIF1A, respectively) analogous to the prokaryotic initiation factors IF3 and IF1. Two GTP-binding proteins, eIF2 and eIF5B, mediate the recruitment of the charged initiator tRNA. For eukaryotes this tRNA is charged with methionine, not N-formyl methionine, and is referred to as Met-tRNAi^Met. In a case of unfortunate nomenclature, the eukaryotic analog of IF2-GTP is eIF5B-GTP. This factor associates with the small subunit in an eIF1A-dependent manner. In turn, eIF5B-GTP helps to recruit a complex of eIF2-GTP and Met-tRNAi^Met to the small subunit. Together these two GTP-binding proteins position the Met-tRNAi^Met in the future P site of the small subunit, resulting in the formation of the 43S pre-initiation complex.
FIGURE 14-26 A summary of translation initiation in prokaryotes.
FIGURE 14-27 Assembly of the eukaryotic small ribosomal subunit and initiator tRNA onto the mRNA. Note that eIF4F is composed of three proteins: eIF4A, eIF4E, and eIF4G. eIF4E directly binds the 5' cap, tethering the other two proteins to the end of the mRNA.
Recognition of eukaryotic mRNAs by the 43S pre-initiation complex begins with the recognition of the 5' cap found at the end of most eukaryotic mRNAs. Recognition is mediated by a three-subunit protein called eIF4F (see Figure 14-27b). One of the three subunits binds directly to the 5' cap and the other two subunits bind nonspecifically to the associated RNA. This complex is joined by eIF4B, which activates an RNA helicase in one of the eIF4F subunits. The helicase unwinds any secondary structures (such as hairpins) that may have formed at the end of the mRNA. Removal of secondary structures is critical, as the 5' end of the mRNA must be unstructured to bind to the small subunit. The eIF4F/B-bound, unstructured mRNA recruits the 43S pre-initiation complex to the mRNA through interactions between eIF4F and eIF3.
The Start Codon Is Found by Scanning Downstream from the 5' End of the mRNA
Once assembled at the 5' end of the mRNA, the small subunit and its associated factors move along the mRNA in a 5'-to-3' direction in an ATP-dependent process that is driven by the eIF4F-associated RNA helicase (Figure 14-28). During this movement, the small subunit "scans" the mRNA for the first start codon. The start codon is recognized through base-pairing between the anticodon of the initiator tRNA and the start codon (this is why it is critical that the initiator tRNA bind to the small subunit before it binds to the mRNA). Correct base-pairing triggers the release of eIF2 and eIF3. Loss of eIF3 (which had prevented binding of the large subunit) and eIF2 (which was bound to the initiator tRNA) allows the large subunit to bind to the small subunit. As in the prokaryotic situation, binding of the large subunit leads to the release of the remaining initiation factors by stimulating GTP hydrolysis by the IF2 analog, eIF5B. As a result of these events, the Met-tRNAi^Met is placed in the P site of the resulting 80S initiation complex. With the start codon and Met-tRNAi^Met placed in the P site, the eukaryotic ribosome is now poised to accept a charged tRNA into its A site and carry out the formation of the first peptide bond.
FIGURE 14-28 Identification of the initiating AUG by the eukaryotic small ribosomal subunit.
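The scanning rule described above can be sketched in a few lines of Python. This is an illustrative toy, not a model of the real machinery: the function name `find_start_codon` and the example sequence are hypothetical, and the "correct context" test is reduced to a simplified Kozak-style check (a purine three nucleotides before the AUG and a G immediately after it), which is an assumption made here for illustration.

```python
def find_start_codon(mrna, require_context=True):
    """Scan 5'-to-3' and return the 0-based index of the first usable AUG.

    Returns -1 if no AUG qualifies. The context rule is a simplified,
    assumed stand-in for the Kozak sequence: purine at -3 and G at +4
    (counting the A of AUG as +1).
    """
    mrna = mrna.upper()
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        if not require_context:
            return i
        has_purine = i >= 3 and mrna[i - 3] in "AG"
        has_g = i + 3 < len(mrna) and mrna[i + 3] == "G"
        if has_purine and has_g:
            return i
    return -1


# Hypothetical mRNA: the first AUG (index 5) lacks the assumed context,
# so scanning continues to the second AUG (index 14).
example = "GGCUUAUGCCCACCAUGGCUUAA"
print(find_start_codon(example))                         # 14
print(find_start_codon(example, require_context=False))  # 5
```

The `require_context=False` call shows the strict "first AUG" behavior; with the context check enabled, the sketch bypasses an AUG in a poor context, which is the first exception discussed in Box 14-2.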
Translation Initiation Factors Hold Eukaryotic mRNAs in Circles
In addition to binding to the 5' end of eukaryotic mRNAs, the initiation factors are closely associated with the 3' end of the mRNA through its poly-A tail (Figure 14-29). This is mediated by an interaction between eIF4F and the poly-A binding protein that coats the poly-A tail. A consistent interaction between the two ends occurs because both eIF4F and the poly-A binding protein remain bound to the mRNA through multiple rounds of translation. The interaction between these proteins results in the mRNA being held in a circular configuration via a protein bridge between the 5' and 3' ends of the molecule.
It has long been known that the poly-A tail contributes to efficient translation of mRNA. The finding that translation initiation factors "circularize" mRNA in a poly-A-dependent manner provides a simple rationale for this observation: once a ribosome finishes translating an mRNA that is circularized via its poly-A tail, the newly released ribosome is ideally positioned to re-initiate translation on the same mRNA.
FIGURE 14-29 A model for the circularization of eukaryotic mRNA. Circularization is proposed to be mediated by an interaction between the eIF4G subunit of eIF4F and the poly-A binding protein.
Box 14-2 uORFs and IRESs: Exceptions that Prove the Rule
Not all eukaryotic polypeptides are encoded by an open-reading frame that starts with the AUG that is most proximal to the 5' terminus. In some cases, the first AUG is not in a proper sequence context, resulting in its bypass. In other cases, short upstream open-reading frames (uORFs, encoding peptides less than ten amino acids long) are found upstream of the principal open-reading frame that encodes a large polypeptide (Box 14-2 Figure 1a). In these cases, the uORFs act to regulate the extent of translation of a larger, downstream open-reading frame. At least some of these uORFs are followed by RNA sequences that cause a proportion (30-50%) of the small subunits that translate them to be retained on the mRNA after termination. The retained small subunits continue scanning for the next AUG but can only locate it after a newly charge
a downstream AUG, albeit at a greatly reduced rate. A more extreme example of initiating translation at sites downstream of the AUG that is closest to the 5' terminus is represented by internal ribosome entry sites (IRESs). IRESs are RNA sequences that function like the prof
Box 14-2 Figure 1 Two methods for eukaryotic translation to initiate at internal AUGs. (a) uORFs can allow the small subunit to continue scanning after completing translation. (b) IRESs can recruit the 43S pre-initiation complex directly to the mRNA.
TRANSLATION ELONGATION
Once the ribosome is assembled with the charged initiator tRNA in the P site, polypeptide synthesis can begin. There are three key events that must occur for the correct addition of each amino acid (Figure 14-30). First, the correct aminoacyl-tRNA is loaded into the A site of the ribosome as dictated by the A-site codon. Second, a peptide bond is formed between the aminoacyl-tRNA in the A site and the peptide chain that is attached to the peptidyl-tRNA in the P site. This peptidyl transferase reaction, as we have seen, results in the transfer of the growing polypeptide from the tRNA in the P site to the amino acid moiety of the charged tRNA in the A site. Third, the resulting peptidyl-tRNA in the A site and its associated codon must be translocated to the P site so that the ribosome is poised for another cycle of codon recognition and peptide bond formation. As with the original positioning of the mRNA, this shift must occur precisely to maintain the correct reading frame of the message. Two auxiliary proteins known as elongation factors control these events. Both of these factors use the energy of GTP binding and hydrolysis to enhance the rate and accuracy of ribosome function.
Unlike the initiation of translation, the mechanism of elongation is highly conserved between prokaryotic and eukaryotic cells. We will limit our discussion to translation elongation in prokaryotes, which is understood in the greatest detail, but the events that occur in eukaryotic cells are similar to those in prokaryotes, both in the factors involved and in their mechanism of action.
Aminoacyl-tRNAs Are Delivered to the A Site by Elongation Factor EF-Tu
Aminoacyl-tRNAs do not bind to the ribosome on their own. Instead, they are "escorted" to the ribosome by the elongation factor EF-Tu (Figure 14-31). Once a tRNA is aminoacylated, EF-Tu binds to the tRNA's 3' end, masking the coupled amino acid. This interaction prevents the bound aminoacyl-tRNA from participating in peptide bond formation until it is released from EF-Tu.
Like the initiation factor IF2, the elongation factor EF-Tu binds and hydrolyzes GTP, and the type of guanine nucleotide bound governs its function. EF-Tu can only bind to an aminoacyl-tRNA when it is associated with GTP. EF-Tu bound to GDP, or lacking any bound nucleotide, shows little affinity for aminoacyl-tRNAs. Thus, when EF-Tu hydrolyzes its bound GTP, any associated aminoacyl-tRNA is released. EF-Tu bound to an aminoacyl-tRNA cannot hydrolyze GTP at a significant rate. The trigger that activates the EF-Tu GTPase is the same domain on the large subunit of the ribosome that activates the IF2 GTPase when the large subunit joins the initiation complex. This domain is known as the factor binding center. EF-Tu only interacts with the factor binding center after the tRNA is loaded into the A site and a correct codon-anticodon match is made. At this point, EF-Tu hydrolyzes its bound GTP and is released from the ribosome (Figure 14-31). As we discuss below, control of GTP hydrolysis by EF-Tu is critical to the specificity of translation.
The Ribosome Uses Multiple Mechanisms to Select Against Incorrect Aminoacyl-tRNAs
The error rate of translation is between 10^-3 and 10^-4. That is, no more than 1 in every 1,000 amino acids incorporated into protein is incorrect. The ultimate basis for the selection of the correct aminoacyl-tRNA is the base-pairing between the charged tRNA and the codon displayed in the A site of the ribosome. Despite this, the energy difference between a correctly formed codon-anticodon pair and that of a near match cannot account for this level of accuracy. In many instances only one of the three possible base pairs in the anticodon-codon interaction is mismatched, yet the ribosome rarely allows such mismatched aminoacyl-tRNAs to continue in the translation process. At least three different mechanisms contribute to this specificity (Figure 14-32). In each case, these mechanisms select against incorrect codon-anticodon pairings.
One mechanism that contributes to the fidelity of codon recognition involves two adjacent adenine residues in the 16S rRNA component of the small subunit. These bases form a tight interaction with the minor groove of each correct base pair formed between the anticodon and the first two bases of the codon (Figure 14-32a). As you will recall (see Figure 6-10), the edges of a G:C and an A:U base pair are very similar in the minor groove. The adjacent A residues in the 16S rRNA do not discriminate between G:C and A:U base pairs and recognize either as correct. In contrast, non-Watson-Crick base pairs form a minor groove that cannot be recognized by these bases, resulting in significantly reduced affinity for incorrect tRNAs. The net result of these interactions is that correctly paired tRNAs exhibit a much lower rate of dissociation from the ribosome than do incorrectly paired tRNAs.
FIGURE 14-30 Summary of the steps of translation.
A second mechanism that helps to ensure correct codon-anticodon pairing involves the GTPase activity of EF-Tu (Figure 14-32b). As described above, release of EF-Tu from the tRNA requires GTP hydrolysis, which is highly sensitive to correct codon-anticodon base-pairing. Even a single mismatch in the codon-anticodon base-pairing leads to a dramatic reduction in EF-Tu GTPase activity. This mechanism is an example of kinetic selectivity and is related to the mechanisms used to ensure correct base-pairing during DNA synthesis (see Chapter 8). In both cases, formation of correct base-pairing interactions dramatically enhances the rate of a critical biochemical step. For the DNA polymerase, this step was the formation of the phosphodiester bond. In this case, it is the hydrolysis of GTP by EF-Tu.
A third mechanism that ensures pairing accuracy is a form of proofreading that occurs after EF-Tu is released. When the charged tRNA is first introduced into the A site in a complex with EF-Tu-GTP, its 3' end is distant from the site of peptide bond formation. To participate successfully in the peptidyl transferase reaction, the tRNA must rotate into the peptidyl transferase center of the large subunit in a process called accommodation (Figure 14-32c). Incorrectly paired tRNAs frequently dissociate from the ribosome during accommodation.
It is hypothesized that the rotation of the tRNA places a strain on the codon-anticodon interaction and that only a correctly paired anticodon can sustain this strain. Thus, mispaired tRNAs are more likely to dissociate from the ribosome prior to participating in the peptidyl transferase reaction. In summary, in addition to the codon-anticodon interactions, the ribosome exploits minor groove interactions and two phases of proofreading to ensure that a correct aminoacyl-tRNA binds in the A site. Each of these three additional selectivity mechanisms enhances the rate of peptide bond formation with correct codon-anticodon interactions and selects against incorrect interactions.
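To give a feel for what an error rate of 10^-3 to 10^-4 per amino acid means for a whole protein, the following sketch computes the fraction of polypeptide chains expected to contain no misincorporated residue. It assumes, purely for illustration, that errors at each position are independent and equally likely; the function name and the 300-residue example length are hypothetical.

```python
def fraction_error_free(length, error_rate):
    """Probability that a chain of `length` residues has no errors,
    assuming independent errors at a uniform per-residue rate."""
    return (1.0 - error_rate) ** length


# A hypothetical 300-residue protein at the two ends of the quoted range.
for rate in (1e-3, 1e-4):
    print(rate, round(fraction_error_free(300, rate), 3))
```

Under these assumptions, roughly three quarters of 300-residue chains would be error-free at the higher error rate and about 97% at the lower one, which illustrates why the additional selectivity mechanisms matter for long proteins.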
FIGURE 14-31 EF-Tu escorts aminoacyl-tRNA to the A site of the ribosome. Charged tRNAs are bound to EF-Tu-GTP as they first interact with the A site of the ribosome. When the correct codon-anticodon interaction occurs, EF-Tu interacts with the factor binding center, hydrolyzes its bound GTP, and is released from the tRNA and the ribosome.
The Ribosome Is a Ribozyme
Once the correctly charged tRNA has been placed in the A site and has rotated into the peptidyl transferase center, peptide bond formation takes place. This reaction is catalyzed by RNA, specifically the 23S rRNA component of the large subunit. Early evidence for this came from experiments in which it was shown that a large subunit that had been largely stripped of its proteins was still able to carry out peptide bond formation. Proof that the peptidyl transferase is entirely composed of RNA has come from the high-resolution, three-dimensional structure of the ribosome, which reveals that no amino acid is located closer than 18 Å from the active site (Figure 14-33). Because catalysis requires distances in the 1-3 Å range, it is clear that the peptidyl transferase center is a ribozyme, that is, an enzyme composed of RNA (see Chapter 5).
FIGURE 14-32 Three mechanisms to ensure correct pairing between the tRNA and the mRNA. (a) Additional hydrogen bonds are formed between two adenine residues of the 16S rRNA and the minor groove of the anticodon-codon pair only when they are correctly base-paired. (b) Correct base-pairing
FIGURE 14-33 RNA surrounds the peptidyl transferase center of the large ribosomal subunit. The three-dimensional structure of the bacterial 50S subunit is shown. The rRNAs are shown in gray and the ribosomal proteins are shown in purple. The 3' ends of the A- and P-site tRNAs that are immediately adjacent to the peptidyl transferase center are shown in green and red, respectively. (Yusupov M.M., Yusupova G.Z., Baucom A., Lieberman K., Earnest T.N., Cate J.H., and Noller H.F. 2001. Science 292: 883.) Image prepared with MolScript, BobScript, and Raster 3D.
How does the 23S rRNA catalyze peptide bond formation? The exact mechanism remains to be determined, but some answers to this question are beginning to emerge. First, base-pairing between the 23S rRNA and the CCA ends of the tRNAs in the A and the P sites helps to position the alpha-amino group of the aminoacyl-tRNA to attack the carbonyl group of the growing polypeptide attached to the peptidyl-tRNA. These interactions are also likely to stabilize the aminoacyl-tRNA after accommodation. Because close proximity of substrates is rarely sufficient to generate high levels of catalysis, it is hypothesized that other elements of the ribosomal RNA change the chemical environment of the peptidyl transferase active site. For example, it has been proposed that nucleotides in the peptidyl transferase center accept a hydrogen from the alpha-amino group of the aminoacyl-tRNA, making the associated nitrogen a stronger nucleophile. This is a common mechanism used by many proteins to stimulate nucleophilic attack of carbonyl groups.
Peptide Bond Formation and the Elongation Factor EF-G Drive Translocation of the tRNAs and the mRNA
Once the peptidyl transferase reaction has occurred, the tRNA in the P site is deacylated (no longer attached to an amino acid) and the growing polypeptide chain is linked to the tRNA in the A site. For a new round of peptide chain elongation to occur, the P-site tRNA must move to the E site and the A-site tRNA must move to the P site. At the same time, the mRNA must move by three nucleotides to expose the next codon. These movements are coordinated within the ribosome and are collectively referred to as translocation.
The initial steps of translocation are coupled to the peptidyl transferase reaction (Figure 14-34). Once the growing peptide chain has been transferred to the A-site tRNA, the 3' end of this tRNA moves into the P-site portion of the large subunit (Figure 14-34 panel 2). In contrast, the anticodon end of the A-site tRNA remains in the A site. Similarly, the now deacylated P-site tRNA is located in the E site of the large subunit and the P site of the small subunit. Thus, translocation in the large subunit precedes translocation in the small subunit and the tRNAs are said to be in "hybrid states." Their 3' ends shift into a new location but their anticodon ends are still in their pre-peptidyl transferase position.
The completion of translocation requires the action of a second elongation factor called EF-G. EF-G can only bind to the ribosome when associated with GTP. After the peptidyl transferase reaction, the shift in the location of the A-site tRNA uncovers a binding site for EF-G in the large subunit portion of the A site. When EF-G-GTP binds, it contacts the factor binding center of the large subunit, which stimulates GTP hydrolysis. GTP hydrolysis changes the conformation of EF-G-GDP, allowing it to reach into the small subunit and trigger translocation of the A-site tRNA (Figure 14-34 panel 3). When translocation is complete, the resulting ribosome structure has dramatically reduced affinity for EF-G-GDP, allowing the elongation factor to release from the ribosome. Together these events result in the translocation of the A-site tRNA into the P site, the P-site tRNA into the E site, and the movement of the mRNA by exactly three nucleotides (Figure 14-34 panel 4).
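The bookkeeping of the elongation cycle (A-site loading, peptidyl transfer, and translocation) can be sketched as a small simulation. This is a deliberately simplified illustration: tRNAs are identified only by the codon they read, hybrid states and the elongation factors themselves are not modeled, and the function name `simulate_elongation` and the snapshot format are assumptions made here.

```python
def simulate_elongation(codons):
    """Track which codon's tRNA occupies the E, P, and A sites.

    `codons` is a list of sense codons beginning with the start codon.
    Returns a list of (step, sites, mrna_offset) snapshots, where
    mrna_offset counts nucleotides the mRNA has moved since initiation.
    """
    if not codons:
        raise ValueError("need at least a start codon")
    sites = {"E": None, "P": codons[0], "A": None}
    offset = 0
    snapshots = [("initiation", dict(sites), offset)]
    for codon in codons[1:]:
        # 1. An aminoacyl-tRNA matching the A-site codon is delivered.
        sites["A"] = codon
        snapshots.append(("A-site loading", dict(sites), offset))
        # 2. Peptidyl transfer moves the chain onto the A-site tRNA;
        #    site occupancy is unchanged, so no snapshot is recorded.
        # 3. Translocation: P -> E, A -> P, and the mRNA moves 3 nt.
        sites = {"E": sites["P"], "P": sites["A"], "A": None}
        offset += 3
        snapshots.append(("translocation", dict(sites), offset))
    return snapshots


for step in simulate_elongation(["AUG", "GCU", "UUC"]):
    print(step)
```

Each translocation leaves the A site empty and advances the mRNA by exactly three nucleotides, which is the invariant the text emphasizes for maintaining the reading frame.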
EF-G Drives Translocation by Displacing the tRNA Bound to the A Site
The exact means by which EF-G induces translocation is not clear, but part of the mechanism involves the ability of EF-G-GDP to occupy the A-site portion of the decoding center. By interacting with the decoding center, EF-G-GDP displaces the A-site tRNA into the P site. Like dominoes, the displacement of the A-site tRNA into the P site means that the P-site tRNA must move into the E site. During the movement of the tRNAs, the mRNA is shifted by three nucleotides. Movement of the mRNA is mediated by base-pairing between the moving A-site tRNA and the mRNA, which is maintained during translocation. Essentially, the mRNA is pulled along with the moving A-site tRNA. Indeed, rare "frame-shifting" tRNAs that have four-nucleotide-long anticodons (and can therefore compensate for certain frame-shift mutations) move the mRNA by four nucleotides instead of three. In contrast to A-site tRNA movement, movement of the P-site tRNA into the E site disrupts base-pairing of the tRNA with the mRNA. Hence, the now uncharged tRNA in the E site is free to dissociate from the ribosome and to become recharged with a fresh amino acid by aminoacyl tRNA synthetase.
Changes in the small subunit of the ribosome also contribute to translocation. For example, changes in the structure of the small subunit must occur to allow the release of EF-G-GDP after translocation is complete. In addition, prior to translocation, portions of the small subunit separate the A, P, and E sites. Thus, for the tRNAs to translocate to their new positions, these regions must move out of the way. The irreversible nature of GTP hydrolysis and the occupancy of the A-site decoding center by EF-G-GDP ensure the forward movement of the translation process.
How does EF-G-GDP interact with the A site of the decoding center so effectively? Crystal structures of EF-Tu bound to tRNA and of EF-G reveal a clear answer to this question. EF-G-GDP and EF-Tu-GTP-tRNA have a very similar structure (Figure 14-35). Recall that EF-Tu-GTP-tRNA also binds to the A-site decoding center. What is most remarkable about this similarity is that, although EF-G is composed of a single polypeptide, its structure mimics that of a tRNA bound to a protein. This is an example of "molecular mimicry" in which a protein takes on the appearance of a tRNA to facilitate association with the same binding site.
FIGURE 14-34 EF-G stimulation of translocation requires GTP hydrolysis.
FIGURE 14-35 Structural comparison of elongation factors. EF-Tu-GDPNP-Phe-tRNA is shown on the left and EF-G-GDP is shown on the right. GDPNP is an analog of GTP that cannot be hydrolyzed and is used to lock the molecule in the GTP-bound conformation during the determination of the three-dimensional structure. Note the similarity between the structure of the green domain in EF-G and the tRNA bound to EF-Tu (also shown in green). (Left structure: Nissen P., Kjeldgaard M., Thirup S., Polekhina G., Reshetnikova L., Clark B.F.C., and Nyborg J. 1995. Science 270: 1464-1472. Right structure: al-Karadaghi S., Aevarsson A., Garber M., Zheltonosova J., and Liljas A. 1996. Structure 4: 555-565.) Images prepared with MolScript, BobScript, and Raster 3D.
EF-Tu-GDP and EF-G-GDP Must Exchange GDP for GTP Prior to Participating in a New Round of Elongation
EF-Tu and EF-G are catalytic proteins that are used once for each round of tRNA loading onto the ribosome, peptide bond formation, and translocation. After GTP hydrolysis, both proteins must release their bound GDP and bind a new molecule of GTP. For EF-G this is a simple process, as GDP has a lower affinity for EF-G than does GTP and is rapidly released after hydrolysis of GTP. The unbound EF-G rapidly binds a new GTP molecule. In the case of EF-Tu, a second protein is required to exchange GDP for GTP. The elongation factor EF-Ts acts as a GTP exchange factor for EF-Tu. After EF-Tu-GDP is released from the ribosome, a molecule of EF-Ts binds to EF-Tu, causing the displacement of GDP. Next, GTP binds to the resulting EF-Tu-EF-Ts complex, causing its dissociation into free EF-Ts and EF-Tu-GTP. Finally, EF-Tu-GTP binds a molecule of charged tRNA, regenerating the EF-Tu-GTP-aminoacyl-tRNA complex, which is once again ready to deliver a charged tRNA to the ribosome.
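The EF-Tu nucleotide cycle just described can be written down as a small transition table. The state labels and the helper `run_cycle` are hypothetical names chosen for this sketch; the table only restates the order of events given in the text and says nothing about rates or affinities.

```python
# Each entry maps a state of EF-Tu to (event, next state),
# following the order of events described in the text.
EF_TU_CYCLE = {
    "EF-Tu-GTP": ("binds charged tRNA", "EF-Tu-GTP-aminoacyl-tRNA"),
    "EF-Tu-GTP-aminoacyl-tRNA": (
        "delivers tRNA to A site; GTP hydrolyzed", "EF-Tu-GDP"),
    "EF-Tu-GDP": ("EF-Ts binds; GDP displaced", "EF-Tu-EF-Ts"),
    "EF-Tu-EF-Ts": ("GTP binds; EF-Ts released", "EF-Tu-GTP"),
}


def run_cycle(start, steps):
    """Follow the transition table for `steps` events from `start`.

    Returns the list of states visited, including the starting state.
    """
    visited = [start]
    for _ in range(steps):
        _event, nxt = EF_TU_CYCLE[visited[-1]]
        visited.append(nxt)
    return visited


# Four events bring EF-Tu back to its GTP-bound starting state.
print(run_cycle("EF-Tu-GTP", 4))
```

Writing the cycle as a closed loop makes the role of EF-Ts explicit: without the exchange step, EF-Tu would remain in the GDP-bound state, which cannot bind aminoacyl-tRNA.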
A Cycle of Peptide Bond Formation Consumes Two Molecules of GTP and One Molecule of ATP
Let us conclude our discussion of elongation with a simple cost accounting. How many molecules of nucleoside triphosphate does it cost per round of peptide bond formation (leaving aside the energetics of amino acid biosynthesis and the energetics of initiation and termination)? As you will recall, one molecule of nucleoside triphosphate (ATP) is consumed by the aminoacyl-tRNA synthetase in creating the high-energy acyl bond that links the amino acid to the tRNA. The breakage of this high-energy bond drives the peptidyl transferase reaction that creates the peptide bond. A second molecule of nucleoside triphosphate (GTP) is consumed in the delivery of a charged tRNA to the A site of the ribosome by EF-Tu and in ensuring that correct codon-anticodon recognition has taken place. Finally, a third nucleoside triphosphate (GTP) is consumed in the EF-G-mediated process of translocation. Thus, making a peptide bond costs the cell two molecules of GTP and one of ATP, with one nucleoside triphosphate being consumed for each step in the translation elongation process. Interestingly, of the three molecules, only one (ATP) is energetically connected to peptide bond formation. The energy of the other two molecules (GTP) is spent to ensure the accuracy and order of events during translation (see Box 14-3, GTP-Binding Proteins, Conformational Switching, and the Fidelity and Ordering of the Events of Translation).
Throughout the discussion of translation elongation we have not distinguished between prokaryotes and eukaryotes. Although the eukaryotic factors analogous to EF-Tu (eEF1) and EF-G (eEF2) are named differently, their functions are remarkably similar to those of their prokaryotic counterparts.
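The cost accounting above amounts to simple multiplication, sketched below. The per-cycle figures come directly from the text (one ATP for tRNA charging, one GTP for EF-Tu delivery, one GTP for EF-G translocation); the function name and the 100-cycle example are hypothetical, and the costs of initiation, termination, and amino acid biosynthesis are left out, as in the text.

```python
# Nucleoside triphosphates consumed per elongation cycle, as stated above.
COST_PER_CYCLE = {
    "ATP": 1,  # aminoacyl-tRNA synthetase charging the tRNA
    "GTP": 2,  # one for EF-Tu delivery, one for EF-G translocation
}


def elongation_cost(n_cycles):
    """Total NTPs consumed over `n_cycles` rounds of elongation."""
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    return {ntp: count * n_cycles for ntp, count in COST_PER_CYCLE.items()}


print(elongation_cost(100))  # {'ATP': 100, 'GTP': 200}
```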
Box 14-3 GTP-Binding Proteins, Conformational Switching, and the Fidelity and Ordering of the Events of Translation
GTP is used throughout translation to control key events. The energy of GTP hydrolysis is not couple
accomplished? A key feature of the GTP-binding proteins involved in translation is that their conformation changes depending on the guanine nucleotide (such as GDP vs. GTP) to which they are bound. This can be seen for EF-Tu in Box 14-3 Figure 1, which shows the three-dimensional structure of EF-Tu bound to GTP or GDP. EF-Tu undergoes a major conformational change when it binds to GTP that results in the formation of its tRNA binding site. In particular, one domain of EF-Tu (shown in magenta in Box 14-3 Figure 1) shifts its location relative to the other domains of the protein depending on the nucleotide that is bound. This change in domain location, as well as changes in the conformation of the other two domains (shown in turquoise and dark blue), results in the formation of a new surface on EF-Tu that binds tightly to charged tRNAs (you can see EF-Tu bound to a tRNA in Figure 14-35). Thus, depending on the form of guanine nucleotide bound, these factors can have different functions or bind to different proteins/RNAs. For example, EF-Tu-GTP can bind to an aminoacyl-tRNA but EF-Tu-GDP cannot. By coupling GTP hydrolysis to the completion of key events in translation, the order of these events can be tightly controlled. For EF-Tu, the GTP-dependent association of EF-Tu with aminoacyl-tRNAs ensures that peptide bond formation does not occur prior to correct codon-anticodon pairing. Formation of the correct base pairs triggers GTP hydrolysis. Once bound to GDP, EF-Tu is released from the aminoacyl-tRNA, allowing peptide bond formation to ensue.
The mechanism that activates GTP hydrolysis by each of the GTP-regulated auxiliary proteins is the same. In each case, GTPase activity is stimulated through an interaction with a specific region of the large subunit called the factor binding center. This interaction is not of sufficient affinity to occur in isolation. Instead, each GTP-controlled translation factor must make several other critical interactions with the ribosome to stabilize the precise association with the factor binding center that leads to GTPase activation. Indeed, as we have seen for EF-Tu, this interaction is highly sensitive to the exact nature of the interactions between EF-Tu, the aminoacyl-tRNA, the mRNA, and the ribosome. Thus, the interaction with the factor binding center monitors all the other interactions of these proteins and RNAs with the ribosome. Only when an appropriate set of interactions is achieved (such as correct codon-anticodon pairing) is the GTP-binding site able to interact productively with the factor binding center, leading to GTP hydrolysis and the associated change in protein conformation.
The use of GTP during translation is analogous to the use of ATP by the sliding clamp loaders (see Chapter 8, Box 8-2). Recall that in that case, ATP binding was required to assemble an initial complex with the sliding clamp, but ATP hydrolysis and release of the sliding clamp could only occur when the clamp loader encircled the primer:template junction. In translation, GTP is required for the initial association with the ribosome (and in some inst
Box 14-3 Figure 1 Comparison of EF-Tu bound to GDP and GTP. (a) EF-Tu bound to GDP. (b) EF-Tu bound to GTP. The GTP-binding domain is shown in red. The rotation of the magenta domain and the changes in the structure of the green and blue domains lead to the formation of a strong tRNA-binding site when GTP is bound (see Figure 14-35). (Structure (a): Polekhina G., Thirup S., Kjeldgaard M., Nissen P., Lippmann C., and Nyborg J. 1996. Structure 4: 1141. Structure (b): Kjeldgaard M., Nissen P., Thirup S., and Nyborg J. 1993. Structure 1: 35.) Images prepared with MolScript, BobScript, and Raster 3D.
TERMINATION OF TRANSLATION

Release Factors Terminate Translation in Response to Stop Codons

The ribosome's cycle of aminoacyl-tRNA binding, peptide bond formation, and translocation continues until one of the three stop codons enters the A site. It was initially postulated that there would be one or more chain-terminating tRNAs that would recognize these codons. However, this is not the case. Instead, stop codons are recognized by proteins called release factors (RFs) that activate the hydrolysis of the polypeptide from the peptidyl-tRNA. There are two classes of release factors. Class I release factors recognize the stop codons and trigger hydrolysis of the peptide chain from the tRNA in the P site. Prokaryotes have two class I release factors called RF1 and RF2. RF1 recognizes the stop codon UAG, and RF2 recognizes the stop codon UGA. The third stop codon, UAA, is recognized by both RF1 and RF2. In eukaryotic cells there is a single class I release factor called eRF1 that recognizes all three stop codons. Class II release factors stimulate the dissociation of the class I factors from the ribosome after release of the polypeptide chain.
Prokaryotes and eukaryotes have only one class II factor, called RF3 and eRF3, respectively. Like EF-G, EF-Tu, and other translation factors, class II release factors are regulated by GTP.

Short Regions of Class I Release Factors Recognize Stop Codons and Trigger Release of the Peptidyl Chain

How do release factors recognize stop codons? Because release factors are entirely composed of protein, recognition of stop codons must be mediated by a protein-RNA interaction. Experiments in which short coding regions were genetically swapped between RF1 and RF2 (which have different stop-codon specificity) pinpointed the region of this recognition to a stretch of three amino acids. Exchange of these three amino acids between RF1 and RF2 results in hybrid release factors that acquire the stop codon recognition specificity of their counterpart but are otherwise identical in function. Evidently, just three amino acids are responsible for the specificity of stop codon recognition. The region defined by these three amino acids represents a peptide anticodon that interacts with and recognizes stop codons. In keeping with this view, the three-dimensional structure of RF2 bound to a ribosome reveals that the peptide anticodon is located close to the stop codon in the decoding center (Figure 14-36).

A region of class I release factors that contributes to polypeptide release has also been identified. All class I factors share a conserved, three-amino acid sequence (glycine glycine glutamine, GGQ) that is essential for polypeptide release. Moreover, the structure of RF2 bound to the ribosome confirms that the GGQ motif is located in close proximity to the peptidyl transferase center (Figure 14-36). It remains unclear whether the GGQ motif is directly involved in the hydrolysis of the polypeptide from the peptidyl-tRNA or if it induces a change in the peptidyl transferase center that allows the center itself to catalyze hydrolysis. Together, these studies have led to the hypothesis that class I release factors functionally, but not structurally, mimic a tRNA: having a peptide anticodon that interacts with the stop codon and a GGQ motif that reaches into the peptidyl transferase center.

FIGURE 14-36 Model of a type I release factor bound to the A site of the ribosome. This model illustrates the location of a class I release factor. The GGQ motif involved in polypeptide hydrolysis is located adjacent to the 3' end of the P site tRNA. The SPF peptide anticodon is located adjacent to the anticodon loop of the P site tRNA in a position that would allow easy access to the stop codon. (Source: Adapted from Brodersen, D. E. and Ramakrishnan, V. 2003. Shape can be seductive. Nat. Struct. Biol. 10: 79, fig 2, part a.)

Comparing the structure of the release factor that is bound to the ribosome with that of a free release factor provides an additional insight into the role of stop codon recognition in polypeptide release. As we have seen, the peptide anticodon and the GGQ motif of a release factor extend from the decoding center to the peptidyl transferase center of the ribosome. In the absence of a ribosome, however, the peptide anticodon and the GGQ motif are quite close to each other (approximately 20 Å), too close to reach both the decoding center and the peptidyl transferase center. (For comparison, the amino acid-accepting stem at the 3' end of a tRNA molecule is about 70 Å from the anticodon loop at the other end of the molecule.) Thus, release factors must undergo a change in conformation upon binding to the ribosome. This finding has led to a model in which release factors can only assume the extended, chain-terminating conformation (in which they can reach into the peptidyl transferase center) when a stop codon is present in the decoding center.
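The stop-codon specificities of the class I release factors described in this section can be summarized as a small lookup. The Python sketch below is only an illustration: the dictionary layout and the function name `factors_recognizing` are assumptions introduced here, while the codon assignments are the ones given in the text (RF1 reads UAG and UAA, RF2 reads UGA and UAA, and eRF1 reads all three).

```python
# Stop codons recognized by each class I release factor, as stated in the text.
# The data layout and the names used here are illustrative assumptions.
CLASS_I_RELEASE_FACTORS = {
    "prokaryote": {"RF1": {"UAG", "UAA"}, "RF2": {"UGA", "UAA"}},
    "eukaryote": {"eRF1": {"UAA", "UAG", "UGA"}},
}


def factors_recognizing(codon, cell_type):
    """Return the class I release factors (sorted by name) that recognize `codon`."""
    factors = CLASS_I_RELEASE_FACTORS[cell_type]
    return sorted(name for name, codons in factors.items() if codon in codons)


# A sense codon such as AUG is recognized by no release factor.
for codon in ("UAA", "UAG", "UGA", "AUG"):
    print(codon,
          factors_recognizing(codon, "prokaryote"),
          factors_recognizing(codon, "eukaryote"))
```

Representing each factor by the set of codons it reads makes the overlap at UAA explicit: it is the only stop codon for which the prokaryotic lookup returns both factors.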
GDP/GTP Exchange and GTP Hydrolysis Control the Function of the Class II Release Factor

Once the class I release factor has triggered the hydrolysis of the peptidyl-tRNA linkage, it must be removed from the ribosome (Figure 14-37). This is accomplished by the class II release factor, RF3 (or eRF3). RF3 is a GTP-binding protein but, unlike the other GTP-binding proteins involved in translation, this factor has a higher affinity for GDP than GTP. Thus, free RF3 is predominantly in the GDP-bound form. RF3-GDP binds to the ribosome in a manner that depends on the presence of a class I release factor. After the class I RF stimulates polypeptide release, a change in the conformation of the ribosome and the class I factor stimulates RF3 to exchange its bound GDP for a GTP. The binding of GTP to RF3 leads to the formation of a high-affinity interaction with the ribosome that displaces the class I factor from the ribosome. This change also allows RF3 to associate with the factor binding center of the large subunit. As with other GTP-binding proteins involved in translation, this interaction stimulates the hydrolysis of GTP. In the absence of a bound class I factor, RF3-GDP has a low affinity for the ribosome and is released.
FIGURE 14-37 Polypeptide release is catalyzed by two release factors. The class I release factor (shown here as RF1) recognizes the stop codon and stimulates polypeptide release through a GGQ motif that is localized to the peptidyl transferase center. The class II release f

The Ribosome Recycling Factor Mimics a tRNA
After the release of the polypeptide chain and the release factors, the ribosome is still bound to the mRNA and is left with two deacylated tRNAs (in the P and E sites). To participate in a new round of polypeptide synthesis, the tRNAs and the mRNA must be removed from the ribosome and the ribosome must dissociate into its large and small subunits. Collectively, these events are referred to as ribosome recycling. In prokaryotic cells a factor known as the ribosome recycling factor (RRF) cooperates with EF-G and IF3 to recycle ribosomes after polypeptide release (Figure 14-38). RRF binds to the empty A site of the ribosome, where it mimics a tRNA. RRF also recruits EF-G to the ribosome and, in events that mimic EF-G function during elongation, the EF-G stimulates the release of the uncharged tRNAs bound in the P and E sites. Although exactly how this release occurs is unclear, it is thought that RRF is displaced from the A site by EF-G in a manner similar to the displacement of a tRNA from the A site during elongation. Once the tRNAs are removed, EF-G and RRF are released from the ribosome along with the mRNA. IF3 (the initiation factor) may also participate in the release of the mRNA and is required to separate the two ribosomal subunits from each other. The final outcome of these events is a small subunit bound to IF3 (but not tRNA or mRNA) and a free large subunit. The released ribosome can now participate in a new round of translation.

Reinforcing the view that the RRF is a mimic of tRNA, it resembles a tRNA in its three-dimensional structure. Nevertheless, it interacts with the ribosome in a very different manner than does a tRNA. RRF is closely associated only with the large subunit portion of the A site. We can rationalize this difference between the recycling factor and tRNAs in the following way. If the ribosome recycling factor precisely mimicked an A-site tRNA, then the P-site tRNA would be moved into the E site by EF-G. Instead, EF-G and the recycling factor lead to the release of the P-site tRNA from the ribosome directly from the P site. It is likely that EF-G and the ribosome recycling factor cause a more dramatic change in the structure of the ribosome than normally occurs during translocation, allowing both the mRNA and the tRNAs to be released.

FIGURE 14-38 RRF and EF-G combine to stimulate the release of tRNA and mRNA from a terminated ribosome.

Like initiation and elongation, the termination of translation is mediated by an ordered series of interdependent factor binding and release events. This ordered nature of translation ensures that no one step occurs before the previous step is complete. For example, EF-Tu cannot escort a new tRNA into the A site until EF-G completes translocation. Similarly, RF3 cannot bind to the ribosome unless a class I release factor has already recognized a stop codon.
There is a weakness to this orderly approach to translation: if any step cannot be completed, then the entire process stops. It is just this Achilles heel that antibiotics exploit when they target the translation process (see Box 14-4, Antibiotics Arrest Cell Division by Blocking Specific Steps in Translation).
TRANSLATION-DEPENDENT REGULATION OF mRNA AND PROTEIN STABILITY

At some frequency, mRNAs will be made that are mutant or damaged. Such defective mRNAs can arise from mistakes in transcription or from damage that occurs after they are synthesized. For example, because they are single-stranded, mRNAs are more susceptible to breakage. Such damaged mRNAs have the possibility of making incomplete or incorrect proteins that could have negative effects on the cell. In some cases, such as point mutations that change only a single amino acid, there is little that can be done to eliminate the mutant mRNA or its protein product. However, in other cases described below, the process of translation is used to detect defective mRNAs and eliminate either them or their protein products.
The SsrA RNA Rescues Ribosomes that Translate Broken mRNAs

Normally a stop codon is required to release the ribosome from an mRNA. What happens to a ribosome that initiates translation of an mRNA fragment that lacks a termination codon in the appropriate
Box 14-4 Antibiotics Arrest Cell Division by Blocking Specific Steps in Translation
Antibiotics represent a powerful tool to fight disease. Many of the most widely used antibiotics in medicine kill bacteria but have little or no effect on eukaryotic cells, and hence are not toxic to the patient. Since their discovery in the first half of the last century, antibiotics have helped make previously untreatable infections such as tuberculosis, bacterial pneumonia, syphilis, and gonorrhea largely curable (although the emergence of antibiotic-resistant bacteria is becoming an increasing obstacle to effective treatment). Antibiotics have many different kinds of targets in the bacterial cell, but approximately 40% of the known antibiotics are inhibitors of the translation machinery (Box 14-4 Table 1). In general, these antibiotics bind a component of the translation apparatus and inhibit its function. Because different antibiotics arrest translation at different steps and do so in a precise manner (for example, just prior to EF-Tu release), these agents have become useful tools in studies of the mechanism of protein synthesis. Thus, in addition to their obvious medical benefits, antibiotics have come to play an important role in helping us understand the working of the translation machinery.

Puromycin is one antibiotic commonly used in studies of translation. It binds to the large subunit region of the A site. Once bound, puromycin can substitute for an aminoacyl-tRNA in the peptidyl transferase reaction (Box 14-4 Figure 1). Because puromycin is very small compared to a tRNA, its binding to the A site is not sufficient to retain the polypeptide chain on the ribosome. Thus, peptidyl chains that are transferred to puromycin dissociate from the ribosome as an incomplete, puromycin-bound polypeptide. In other words, puromycin causes polypeptide synthesis to terminate prematurely.

Other antibiotics target other features of the ribosome, such as the peptide exit tunnel, the peptidyl transferase center, the factor binding center, the decoding center, and regions critical for translocation. Yet other antibiotics are inhibitors of translation factors. For example, kirromycin and fusidic acid are inhibitors of the elongation factors EF-Tu and EF-G, respectively (Box 14-4 Table 1). In both cases, the antibiotic interacts with the GTP-bound form of the translation factor and prevents changes in conformation that would normally occur after GTP hydrolysis. Thus, kirromycin arrests ribosomes with bound EF-Tu·GDP
BOX 14-4 TABLE 1 Antibiotics: Targets and Consequences

Antibiotic/Toxin | Target Cells | Molecular Target | Consequence
Tetracycline | Prokaryotic cells | A site of 30S subunit | Inhibits aminoacyl-tRNA binding to the A site
Hygromycin B | Prokaryotic and eukaryotic cells | Near the A site of 30S subunit | Prevents translocation of A-site tRNA to P site
Paromomycin | Prokaryotic cells | Adjacent to the A site codon-anticodon interaction site in 30S subunit | Increases error rate during translation by decreasing selectivity of codon-anticodon pairing
Chloramphenicol | Prokaryotic cells | Peptidyl transferase center of 50S subunit | Blocks correct positioning of the A site aminoacyl-tRNA for peptidyl transfer reaction
Puromycin | Prokaryotic and eukaryotic cells | Peptidyl transferase center of large ribosomal subunit | Chain terminator; mimics the 3' end of aminoacyl-tRNA in A site and acts as acceptor for the nascent polypeptide chain
Erythromycin | Prokaryotic cells | Peptide exit tunnel of 50S subunit | Blocks exit of the growing polypeptide chain from the ribosome; arrests translation
Fusidic acid | Prokaryotic cells | EF-G | Prevents release of EF-G·GDP from the ribosome
Thiostrepton | Prokaryotic cells | Factor binding center of the 50S subunit | Interferes with the association of IF2 and EF-G with factor binding center
Kirromycin | Prokaryotic cells | EF-Tu | Prevents the conformational changes associated with GTP hydrolysis and, therefore, EF-Tu release
Ricin and α-Sarcin (protein toxins) | Prokaryotic and eukaryotic cells | Chemically modifies the RNA in the factor binding center of large ribosomal subunit | Prevents activation of translation factor GTPases
Diphtheria toxin | Eukaryotic cells | Chemically modifies EF-Tu | Inhibits EF-Tu function
Cycloheximide | Eukaryotic cells | Peptidyl transferase center of the 60S subunit | Inhibits peptidyl transferase activity
BOX 14-4 FIGURE 1 Puromycin terminates translation by mimicking a tRNA in the A site. Puromycin binds in the A site and participates in peptide bond formation. Once completed, puromycin and any associated polypeptide diffuse out of the ribosome.
reading frame? Such an mRNA can be generated by incomplete transcription or nuclease action. Translation of this type of mRNA can initiate normally and continue until the 3' end of the mRNA is reached. At this point, the ribosome cannot proceed. There is no codon either to bind an aminoacyl-tRNA or to bind a release factor. Without some mechanism to release them from these defective mRNAs, many ribosomes would be permanently trapped, removing them from polypeptide synthesis.

In prokaryotic cells, such stalled ribosomes are rescued by the action of a chimeric RNA molecule that is part tRNA and part mRNA, called a tmRNA. SsrA is a 457-nucleotide tmRNA that includes a region at its 3' end that strongly resembles tRNA(Ala) (Figure 14-39). This similarity allows the SsrA RNA to be charged with alanine and to bind EF-Tu-GTP. When a ribosome is stalled at the 3' end of
FIGURE 14-39 The tmRNA and SsrA rescue ribosomes stalled on prematurely terminated mRNAs. The SsrA RNA mimics a tRNA but can only bind when a ribosome is stalled at the 3' end of an mRNA. Once bound, the SsrA RNA substitutes part of its sequence to act as a new "mRNA." Steps labeled in the figure: stalled ribosome; recognition by SsrA RNA; transpeptidation; translocation and replacement of mRNA; continued translocation of mRNA reading frame; degradation by cellular proteases.
Thus, translation products arising from broken mRNAs are rapidly cleared to prevent these defective proteins from harming the cell. How does the SsrA RNA bind to only stalled ribosomes? Because of the large size of SsrA (it is more than four times bigger than a standard tRNA), it cannot bind to the A site during normal elongation. In contrast, when the 3' end of the mRNA is missing, additional room is created in the A site to accommodate the larger RNA. Thus, only ribosomes stalled at the 3' end of an mRNA represent a potential binding site for the SsrA RNA.
Eukaryotic Cells Degrade mRNAs that Are Incomplete or that Have Premature Stop Codons

Translation is tightly linked to the process of mRNA decay in eukaryotic cells (Figure 14-40a). This linkage is exploited by two mechanisms that monitor the integrity of mRNAs that are being translated. For example, when an mRNA contains a premature stop codon (known as a nonsense codon; see Chapter 15), the mRNA is rapidly degraded by a process called nonsense mediated mRNA decay (Figure 14-40b). In mammals, recognition of mRNAs with premature stop codons relies on the assembly of protein complexes within the open-reading frame of the mRNA. These exon-junction complexes are assembled on the mRNA as a consequence of splicing and are located just upstream of each exon-exon boundary (see Chapter 13). Ordinarily, when the first ribosome translates an mRNA, these complexes are displaced as the mRNA enters the decoding center of the ribosome. However, if a premature stop codon is present in the mRNA (due to mutation of the gene or mistakes in transcription or splicing), then the ribosome is released prior to the displacement of the complexes. Under these conditions, the complexes interact with the prematurely terminating ribosome, which activates an enzyme that removes the cap at the 5' end of the mRNA. Because the mRNA is ordinarily protected from degradation by the 5' cap, removal of the cap causes rapid degradation of the mRNA by a 5'-3' exonuclease.

A different process called nonstop mediated decay rescues ribosomes that translate mRNAs that lack a stop codon (Figure 14-40c). Unlike their prokaryotic counterparts, eukaryotic mRNAs terminate with a poly-A tail. When an mRNA lacking a stop codon is translated, the ribosome translates through the poly-A tail (because there is no stop codon to cause it to terminate before reaching the tail). This results in the addition of multiple lysines to the end of the protein (AAA is the codon for lysine) and stalling of the ribosome at the end of the mRNA. The stalled ribosome is bound by a protein (Ski7) (related to the class II release factor eRF3) that stimulates ribosome dissociation and recruits a 3'-5' exonuclease that degrades the "nonstop" mRNA. In addition, proteins that contain poly-lysine at their carboxy-terminus are unstable, leading to the rapid degradation of proteins derived from nonstop mRNAs. Thus, like the situation in prokaryotes, proteins synthesized from mRNAs lacking stop codons are rapidly removed from the cell.

A fascinating feature of nonsense mediated mRNA decay and nonstop mediated decay is that both processes of mRNA degradation require translation of the damaged mRNA. In the absence of translation, the damaged mRNAs are not rapidly degraded and have normal stability. Thus, although indirect, eukaryotic cells rely on translation as a mechanism to proofread their mRNAs.
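The two surveillance pathways described above can be caricatured in a few lines of Python. This is a deliberately simplified sketch, not a model of the real machinery: the function name `translate_and_classify`, the `ejc_positions` argument (nucleotide indices standing in for exon-junction complexes), and the rule that translation starts at the first AUG are all assumptions made for illustration. The only biology taken from the text is that a stop codon reached while a complex still lies downstream leads to nonsense mediated decay, and that reading through the poly-A tail without meeting a stop codon adds lysines and leads to nonstop mediated decay.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_and_classify(mrna, ejc_positions=()):
    """Translate from the first AUG and report the (simplified) fate of the mRNA.

    ejc_positions: hypothetical nucleotide indices of exon-junction complexes.
    """
    start = mrna.find("AUG")
    if start == -1:
        return "", "not translated"
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            # A complex still sitting downstream of the stop codon was never
            # displaced by the ribosome: the stop codon is premature.
            if any(position > i for position in ejc_positions):
                return "".join(protein), "nonsense mediated decay"
            return "".join(protein), "normal termination"
        protein.append(amino_acid)
    # The ribosome ran off the 3' end (through the poly-A tail) without a stop.
    return "".join(protein), "nonstop mediated decay"


poly_a = "A" * 9
print(translate_and_classify("AUGGCUUAA" + poly_a, ejc_positions=(4,)))
print(translate_and_classify("AUGUAAGCUUAA" + poly_a, ejc_positions=(7,)))
print(translate_and_classify("AUGGCU" + poly_a))
```

In the third call the poly-A tail is read as consecutive AAA codons, so the toy protein ends in a run of lysines (K), mirroring the poly-lysine tag described in the text.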
FIGURE 14-40 Eukaryotic mRNAs with premature or no stop codons are targeted for degradation. (a) Translation of a normal mRNA displaces all of the exon junction complexes. (b) Nonsense mediated decay. Translation of an mRNA with a premature stop codon does not displace one or more of the exon junction complexes. This results in the recruitment of the Upf1, Upf2, and Upf3 proteins to the ribosome. Once bound to the ribosome, these proteins activate a decapping enzyme that removes the 5' cap of the mRNA. The uncapped mRNA is then rapidly degraded by 5' to 3' exonucleases that are normally unable to degrade the mRNA due to the presence of the 5' cap. (c) Nonstop mediated decay. In the absence of a stop codon, the poly-A tail of the mRNA is translated. A complex that includes the Ski7 protein and a 3' to 5' exonuclease called the exosome binds any ribosome stalled at the 3' end of the poly-A tail. This results in the release of the ribosome from the mRNA and its degradation. Similar to SsrA mediated nonstop decay, the poly-lysine found at the end of proteins derived from such mRNAs targets the protein for degradation.
SUMMARY

Proteins are synthesized on RNA templates known as messenger RNAs (mRNAs) in a process known as translation. Tr

tRNA binding sites that reach between the two subunits: an A site where the charged tRNA enters the ribosome, a P site that contains the peptidyl-tRNA, and an E site, where deacylated tRNAs exit the ribosome. Translation of one protein involves a cycle of association and dissociation of the small and large subunits. In this ribosome cycle, the small and large subunits assemble at the beginning of an open-reading frame and then dissociate into free subunits when translation of the ORF is complete. The mRNA is translated starting at the 5' end of the ORF and the polypeptide chain is synthesized in an amino-terminal to carboxyl-terminal direction. Translation takes place in three principal steps: initiation, elongation, and termination. Initiation in prokaryotes involves the recruitment of the small ribosomal subunit to the mRNA through the interaction of the ribosome binding site with the 16S rRNA. This interaction is facilitated by three auxiliary proteins (called initiation factors IF1, IF2, and IF3) that help to keep the two subunits apart and recruit a special initiator tRNA to the start codon. Pairing between the anticodon of the charged initiator tRNA and the start codon triggers the recruitment of the large subunit, the release of the initiation factors, and the placement of the charged initiator tRNA in the P site. This is the prokaryotic initiation complex, and it is poised to accept a charged tRNA into the A site and carry out the formation of the first peptide bond. Eukaryotic mRNAs recruit the small subunit through recognition of the 5' cap and the action of numerous auxiliary initiation factors. The small subunit then scans downstream until it encounters an AUG, which it recognizes as the start codon. As in prokaryotes, only when the starting AUG is recognized does the large ribosom
Translation terminates when the ribosome encounters a stop codon, which is recognized by one of two class I release factors in prokaryotes and a single class I release factor in eukaryotes. The release factor triggers the hydrolysis of the polypeptide from the peptidyl-tRNA and hence the release of the completed polypeptide. Finally, a class II release factor, a ribosome recycling factor, and an initiation factor (IF3 in prokaryotes) complete termination by causing the release of the mRNA and the deacylated tRNAs and the dissociation of the ribosome into its large and small subunits. The ribosome cycle is now complete and the small subunit is ready to commence a new cycle of polypeptide synthesis.
BIBLIOGRAPHY

Books

Alberts B., Johnson A., Lewis J., Raff M., Roberts K., and Walter P. 2002. Molecular biology of the cell, 4th edition. Garland Science, New York.
Brown T.A. 2002. Genomes, 2nd edition. BIOS Scientific Publishers Ltd., Oxford, United Kingdom.
Sonenberg N., Hershey J.W.B., and Mathews M.B., eds. 2000. Translational control of gene expression. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

tRNA

Arnez J.G. and Moras D. 1997. Structural and functional considerations of the aminoacylation reaction. Trends Biochem. Sci. 2: 189-232.
Cusack S. 1997. Aminoacyl-tRNA synthetases. Curr. Op. Struct. Biol. 7: 881-889.
Delarue M. 1995. Aminoacyl-tRNA synthetases. Curr. Op. Struct. Biol. 5: 48-55.

The Ribosome

Dahlberg A.E. 2001. The ribosome in action. Science 292: 868-869.
Frank J. 2000. The ribosome: A macromolecular machine par excellence. Chem. Biol. 7: R133-R141.
Moore P.B. and Steitz T.A. 2002. The involvement of RNA in ribosome function. Nature 418: 229-235.
Ramakrishnan V. 2002. Ribosome structure and the mechanism of translation. Cell 108: 557-572.
Roll-Mecak A., Shin B., Dever T.E., and Burley S.K. 2001. Engaging the ribosome: Universal IFs of translation. Trends in Bio. Sciences 26: 705-709.

Translation

Brodersen D.E. and Ramakrishnan V. 2003. Shape can be seductive. Nat. Struct. Biol. 10: 78-80.
Green R. 2000. Ribosomal translocation: EF-G turns the crank. Curr. Biol. 10: R369-R373.
Nissen P., Hansen J., Ban N., Moore P.B., and Steitz T.A. 2000a. The structural basis of ribosome activity in peptide bond synthesis. Science 289: 920-930.
Nissen P., Kjeldgaard M., and Nyborg J. 2000b. Macromolecular mimicry. EMBO J. 19: 489-495.
CHAPTER

The Genetic Code

At the very heart of the Central Dogma is the concept of information transfer from the linear sequence of the four-letter alphabet of the polynucleotide chain into the 20-amino acid language of the polypeptide chain. As we have seen, the translation of genetic information into amino acid sequences takes place on ribosomes and is mediated by special adaptor molecules known as transfer RNAs (tRNAs). These tRNAs recognize groups of three consecutive nucleotides known as codons. With four possible nucleotides at each position, the total number of permutations of these triplets is 64 (4 X 4 X 4), a value well in excess of the number of amino acids. Which of these triplet codons are responsible for specifying which amino acids, and what are the rules that govern their use? In this chapter, we discuss the nature and underlying logic of the genetic code, how the code was "cracked," and the effect of mutations on the coding capacity of messenger RNA.
OUTLINE
• The Code Is Degenerate
• Three Rules Govern the Genetic Code
• Suppressor Mutations Can Reside in the Same or a Different Gene
• The Code Is Nearly Universal
THE CODE IS DEGENERATE

Table 15-1 lists all 64 permutations, with the left-hand column indicating the base at the 5' end of the triplet, the row across the top specifying the middle base, and the right-hand column identifying the base in the 3' position. One of the most striking features of the code is that 61 of the 64 possible triplets specify an amino acid, with the remaining three triplets being chain-terminating signals (see below). This means that many amino acids are specified by more than one codon, a phenomenon called degeneracy. Codons specifying the same amino acid are synonyms. For example, UUU and UUC are synonyms for phenylalanine, whereas serine is encoded by the synonyms UCU, UCC, UCA, UCG, AGU, and AGC. In fact, when the first two nucleotides are identical, the third nucleotide can be either cytosine or uracil and the codon will still code for the same amino acid. Often, adenine and guanine are similarly interchangeable. However, not all degeneracy is based on equivalence of the first two nucleotides. Leucine, for example, is coded by UUA and UUG, as well as by CUU, CUC, CUA, and CUG (Figure 15-1).

Codon degeneracy, especially the frequent third-place equivalence of cytosine and uracil or guanine and adenine, explains how there can be great variation in the AT/GC ratios in the DNA of various organisms without correspondingly large changes in the relative proportion of amino acids in their proteins. (For example, the genomes of certain bacteria display vastly different AT/GC ratios, and yet are closely related enough to encode proteins of highly similar amino acid sequences.)
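The counts quoted above are easy to check by enumeration. The following Python sketch builds the standard genetic code from the conventional 64-letter string of one-letter amino acid symbols (with `*` marking a stop) and groups the codons into sets of synonyms. The variable names are illustrative choices made here, and the standard code is assumed throughout.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks a chain-terminating codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group synonymous codons by the amino acid (or stop signal) they specify.
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonyms[amino_acid].append(codon)

stop_codons = sorted(synonyms.pop("*"))
sense_codon_count = sum(len(codons) for codons in synonyms.values())

# Third-position equivalence: C/U is always synonymous, A/G only often.
prefixes = ["".join(p) for p in product(BASES, repeat=2)]
cu_equivalent = sum(CODON_TABLE[p + "C"] == CODON_TABLE[p + "U"] for p in prefixes)
ag_equivalent = sum(CODON_TABLE[p + "A"] == CODON_TABLE[p + "G"] for p in prefixes)

print(len(CODON_TABLE), "codons,", sense_codon_count, "sense,", stop_codons)
print("Ser:", sorted(synonyms["S"]))
print("Leu:", sorted(synonyms["L"]))
print("C/U equivalent in", cu_equivalent, "of 16 boxes; A/G in", ag_equivalent)
```

The two boxes in which A and G are not interchangeable are AUA/AUG (isoleucine versus methionine) and UGA/UGG (stop versus tryptophan).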
TABLE 15-1 The Genetic Code

Rows give the first (5') base and columns the second base; the four codons in each cell differ in the third (3') base.

First position | Second position U | Second position C | Second position A | Second position G
U | UUU Phe, UUC Phe, UUA Leu, UUG Leu | UCU Ser, UCC Ser, UCA Ser, UCG Ser | UAU Tyr, UAC Tyr, UAA stop*, UAG stop* | UGU Cys, UGC Cys, UGA stop*, UGG Trp
C | CUU Leu, CUC Leu, CUA Leu, CUG Leu | CCU Pro, CCC Pro, CCA Pro, CCG Pro | CAU His, CAC His, CAA Gln, CAG Gln | CGU Arg, CGC Arg, CGA Arg, CGG Arg
A | AUU Ile, AUC Ile, AUA Ile, AUG† Met | ACU Thr, ACC Thr, ACA Thr, ACG Thr | AAU Asn, AAC Asn, AAA Lys, AAG Lys | AGU Ser, AGC Ser, AGA Arg, AGG Arg
G | GUU Val, GUC Val, GUA Val, GUG Val | GCU Ala, GCC Ala, GCA Ala, GCG Ala | GAU Asp, GAC Asp, GAA Glu, GAG Glu | GGU Gly, GGC Gly, GGA Gly, GGG Gly

* Chain-terminating or "nonsense" codons.
† Also used in bacteria to specify the initiator formyl-Met-tRNA(fMet).
FIGURE 15-1 Codon-anticodon pairing of two tRNA(Leu) molecules. Critical stem and loop regions of the tRNA structure are labeled (see Chapter 14). The red hexagons (linked to the G in the anticodon) denote methylation at the N1 position of the base. Note that the codon is shown in a 3' to 5' orientation.
Perceiving Order in the Makeup of the Code

Inspection of the distribution of codons in the genetic code suggests that the code evolved in such a way as to minimize the deleterious effects of mutations. For instance, mutations in the first position of a codon will often give a similar (if not the same) amino acid. Furthermore, codons with pyrimidines in the second position specify mostly hydrophobic amino acids, whereas those with purines in the second position correspond mostly to polar amino acids (see Table 15-1 and Chapter 5, Figure 5-4). Hence, because transitions (A:T to G:C or G:C to A:T substitutions) are the most common type of point mutations, a change in the second position of a codon will usually replace one amino acid with a very similar one. Finally, if a codon suffers a transition mutation in the third position, rarely will a different amino acid be specified. Even a transversion mutation in this position will have no consequence about half the time.

Another consistency noticeable in the code is that whenever the first two positions of a codon are both occupied by G or C, each of the four nucleotides in the third position specifies the same amino acid (such as proline, alanine, arginine, or glycine). On the other hand, whenever the first two positions of the codon are both occupied by A or U, the identity of the third nucleotide does make a difference. Since G:C base pairs are stronger than A:U base pairs, mismatches in pairing the third codon base are often tolerated if the first two positions make strong G:C base pairs. Thus, having all four nucleotides in the third position specify the same amino acid may have evolved as a safety mechanism to minimize errors in the reading of such codons.
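The two regularities described above can be checked directly against the standard code. The following is a minimal sketch, not part of the original text: the compact 64-letter table (one-letter amino acid symbols, `*` for stop, bases ordered U, C, A, G) and the names `CODE`, `families`, and `TRANSITION` are assumptions introduced for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... (assumed layout)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# For every two-base prefix, collect the amino acids specified by its four codons
families = {a + b: {CODE[a + b + c] for c in BASES} for a in BASES for b in BASES}

# Prefixes made only of G/C versus prefixes made only of A/U
strong = {p: aas for p, aas in families.items() if set(p) <= set("GC")}
weak = {p: aas for p, aas in families.items() if set(p) <= set("AU")}

# Third-position transitions (U<->C, A<->G) that change the meaning of the codon
TRANSITION = {"U": "C", "C": "U", "A": "G", "G": "A"}
changed = [c for c in CODE if CODE[c] != CODE[c[:2] + TRANSITION[c[2]]]]

print(strong)   # each G/C prefix maps to exactly one amino acid
print(weak)     # each A/U prefix maps to more than one meaning
print(changed)  # the few codons affected by a third-position transition
```

Under these assumptions, the four G/C prefixes give proline, alanine, arginine, and glycine, and only two codon pairs (UGA/UGG and AUA/AUG) change meaning after a third-position transition.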
Wobble in the Anticodon

It was first proposed that a specific tRNA anticodon would exist for every codon. If that were the case, at least 61 different tRNAs, possibly with an additional three for the chain-terminating codons, would be present. Evidence began to appear, however, that highly purified tRNA species of known sequence could recognize several different codons. Cases were also discovered in which an anticodon base was not one of the four regular ones, but a fifth base, inosine. Like all the other minor tRNA bases, inosine arises through enzymatic modification of a base present in an otherwise completed tRNA chain. The base from which it is derived is adenine, whose carbon 6 is deaminated to give the 6-keto group of inosine. (Inosine is actually a nucleoside composed of ribose and the base hypoxanthine, but it has come to be referred to as a base in common usage and we do so here.)

In 1966, Francis Crick devised the wobble concept to explain these observations. It states that the base at the 5' end of the anticodon is not as spatially confined as the other two, allowing it to form hydrogen bonds with any of several bases located at the 3' end of a codon. Not all combinations are possible, with pairing restricted to those shown in Table 15-2. For example, U at the wobble position can pair with either adenine or guanine, while I can pair with U, C, or A (Figure 15-2). The pairings permitted by the wobble rules are those that give ribose-ribose distances close to that of the standard A:U or G:C base pairs. Purine-purine (with the exception of I:A pairs) or pyrimidine-pyrimidine pairs would give ribose-ribose distances that are too long or too short, respectively.

The wobble rules do not permit any single tRNA molecule to recognize four different codons. Three codons can be recognized only when inosine occupies the first (5') position of the anticodon. Almost all the evidence gathered since 1966 supports the wobble concept. For example, the concept correctly predicted that at least three tRNAs exist for the six serine codons (UCU, UCC, UCA, UCG, AGU, and AGC). The other two amino acids (leucine and arginine) that are encoded by six codons also have different tRNAs for the sets of codons that differ in the first or second position.

In the three-dimensional structure of tRNA, the three anticodon bases, as well as the two following (3') bases in the anticodon loop, all point in roughly the same direction, with their exact conformations largely determined by stacking interactions between the flat surfaces of the bases (Figure 15-3). Thus, the first (5') anticodon base is at the end of the stack and is perhaps less restricted in its movements than the other two anticodon bases; hence, wobble in the third (3') position of the codon. By contrast, not only does the third (3') anticodon base appear in the middle of the stack, but the adjacent base is always a bulky modified purine residue. Thus, restriction of its movements may explain why wobble is not seen in the first (5') position of the codon.

TABLE 15-2 Pairing Combinations with the Wobble Concept

Base in Anticodon    Base in Codon
G                    U or C
C                    G
A                    U
U                    A or G
I                    A, U, or C
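The wobble rules of Table 15-2 can be turned into a small lookup that lists the codons a given anticodon is able to read. This is an illustrative sketch only: the function name `codons_read` and the three example serine anticodons (IGA, CGA, GCU) are hypothetical choices made here to show that three tRNAs suffice for the six serine codons, not a statement about which tRNAs any organism actually uses.

```python
# Wobble pairing: base at the 5' end of the anticodon -> bases allowed at the 3' end of the codon
WOBBLE = {"G": "UC", "C": "G", "A": "U", "U": "AG", "I": "UCA"}
# Standard pairing for the other two positions
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon):
    """Return the codons (5'->3') readable by an anticodon written 5'->3'."""
    first, middle, last = anticodon
    # Pairing is antiparallel: the 3' anticodon base pairs with the first codon base,
    # and the 5' (wobble) anticodon base pairs with the third codon base.
    return [WATSON_CRICK[last] + WATSON_CRICK[middle] + third for third in WOBBLE[first]]

# Hypothetical set of three anticodons covering all six serine codons
serine = set()
for anticodon in ("IGA", "CGA", "GCU"):
    serine.update(codons_read(anticodon))

print(codons_read("IGA"))  # inosine reads three codons
print(sorted(serine))

# No anticodon base pairs with more than three codon bases
max_codons_per_trna = max(len(partners) for partners in WOBBLE.values())
```

Because no entry in the table allows four partners, the four UCN serine codons cannot be read by one tRNA under these rules, which is why at least three tRNAs are needed for serine.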
Three Codons Direct Chain Termination

As we have seen, three codons do not correspond to any amino acid. Instead, they signify chain termination. As we discussed in Chapter 14, these chain-terminating codons, UAA, UAG, and UGA, are read not by special tRNAs but by specific proteins known as release factors (RF1 and RF2 in bacteria and eRF1 in eukaryotes). Release factors enter the A site of the ribosome and trigger hydrolysis of the peptidyl-tRNA occupying the P site, resulting in the release of the newly synthesized protein.

FIGURE 15-2 Wobble base pairing. Panels show the guanine-uracil and inosine-adenine pairs: U in the first (5') anticodon position can pair with A or G, and I in the first (5') anticodon position can pair with U, C, or A. Note that the ribose-ribose distances for all the wobble pairs are close to those of the standard A:U or G:C base pairs.
How the Code Was Cracked

The assignment of amino acids to specific codons is one of the great achievements in the history of molecular biology (see Chapter 2 for an historical account). How were these assignments made? By 1960, the general outline of how messenger RNA (mRNA) participates in protein synthesis had been established. Nevertheless, there was little optimism that we would soon have a detailed understanding of the genetic code itself. It was believed that identification of the codons for a given amino acid would require exact knowledge of both the nucleotide sequences of a gene and the corresponding amino acid order in its protein product. At that time, the elucidation of the amino acid sequence of a protein, although a laborious process, was already a very practical one. On the other hand, the then-current methods for determining DNA sequences were very primitive. Fortunately, this apparent roadblock did not hold up progress. In 1961, just one year after the discovery of mRNA, the use of artificial messenger RNAs and the availability of cell-free systems for carrying out protein synthesis began to make it possible to crack the code (see Chapter 2).

FIGURE 15-3 Structure of yeast tRNAPhe. (a) The left panel ... (labeled regions: T stem, T loop, 3' acceptor end, D loop, anticodon loop, anticodon).
Stimulation of Amino Acid Incorporation by Synthetic mRNAs

Biochemists found that extracts prepared from cells of E. coli that were actively engaged in protein synthesis were capable of incorporating radioactively labeled amino acids into proteins. Protein synthesis in these extracts proceeded rapidly for several minutes and then gradually came to a stop. During this interval, there was a corresponding loss of mRNA owing to the action of degradative enzymes present in the extract. However, the addition of fresh mRNA to extracts that had stopped making protein caused an immediate resumption of synthesis. The dependence of cell extracts on externally added mRNA provided an opportunity to elucidate the nature of the code using synthetic polyribonucleotides.

FIGURE 15-4 Polynucleotide phosph... (reaction shown: poly-A of length n plus ribonucleoside diphosphate (ppA) gives poly-A of length n+1 plus phosphate, catalyzed by polynucleotide phosphorylase).

These synthetic templates were created using the enzyme polynucleotide phosphorylase, which catalyzes the reaction:
$$[\mathrm{XMP}]_n + \mathrm{XDP} \rightleftharpoons [\mathrm{XMP}]_{n+1} + \mathrm{P_i} \qquad \text{[Equation 15-1]}$$
where X represents the base and [XMP]_n represents RNA of length n nucleotides. Polynucleotide phosphorylase is normally responsible for breaking down RNA and under physiological conditions favors the degradation of RNA into nucleoside diphosphates. By use of high nucleoside diphosphate concentrations, however, this enzyme can be made to catalyze the formation of internucleotide 3'-5' phosphodiester bonds and thus make RNA molecules (Figure 15-4). No template DNA or RNA is required for RNA synthesis with this enzyme; the base composition of the synthetic product depends entirely on the ratio of the various ribonucleoside diphosphates added to the reaction mixture. For example, when only adenosine diphosphate is used, the resulting RNA contains only adenylic acid and is thus called polyadenylic acid or poly-A. It is likewise possible to make poly-U, poly-C, and poly-G. Addition of two or more different diphosphates produces mixed copolymers such as poly-AU, poly-AC, poly-CU, and poly-AGCU. In all these mixed polymers, the base sequences are approximately random, with the nearest-neighbor frequencies determined solely by the relative concentrations of the reactants. For example, poly-AU molecules with two times as much A as U have sequences like UAAUAUAAAUAAUAAAAUAUU...
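The statement that composition and nearest-neighbor frequencies follow only from the reactant ratio can be illustrated with a simple simulation. This is a sketch under a strong simplifying assumption, namely that each position is drawn independently with probability proportional to the diphosphate ratio; the function name `random_copolymer` and the fixed seed are illustrative choices, not a model of the enzyme's kinetics.

```python
import random

def random_copolymer(ratio, length, seed=0):
    """Build a random polyribonucleotide whose bases are drawn in the given ratio."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bases = list(ratio)
    weights = [ratio[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))

# Poly-AU made with twice as much A as U
poly_au = random_copolymer({"A": 2, "U": 1}, 30000)

# Base composition and one nearest-neighbor (dinucleotide) frequency
frac_a = poly_au.count("A") / len(poly_au)
pairs = [poly_au[i:i + 2] for i in range(len(poly_au) - 1)]
frac_aa = pairs.count("AA") / len(pairs)

print(poly_au[:21])
print(round(frac_a, 3), round(frac_aa, 3))  # expected near 2/3 and 4/9
```

In this model the AA neighbor frequency is simply the product of the two single-base frequencies, which is what "determined solely by the relative concentrations" amounts to.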
Poly-U Codes for Polyphenylalanine

Under the right conditions in vitro, almost all synthetic polymers will attach to ribosomes and function as templates. Luckily, high concentrations of magnesium were used in the early experiments. A high magnesium concentration circumvents the need for initiation factors and the special initiator fMet-tRNA, allowing chain initiation to take place without the proper signals in the mRNA. Poly-U was the first synthetic polyribonucleotide discovered to have mRNA activity. It selects phenylalanyl-tRNA molecules exclusively, thereby forming a polypeptide chain containing only phenylalanine (polyphenylalanine). Thus, we know that a codon for phenylalanine is composed of a group of three uridylic acid residues, UUU. (That a codon has three nucleotides was known from genetic experiments, as indicated in Chapters 2 and 21, and below.) On the basis of analogous experiments with poly-C and poly-A, CCC was assigned as a proline codon and AAA as a lysine codon. Unfortunately, this type of experiment did not tell us what amino acid GGG specifies. The guanine residues in poly-G firmly hydrogen bond to each other and form multistranded triple helices that do not bind to ribosomes.
Mixed Copolymers Allowed Additional Codon Assignments

Poly-AC molecules can contain eight different codons, CCC, CCA, CAC, ACC, CAA, ACA, AAC, and AAA, whose proportions vary with the copolymer A/C ratio. When AC copolymers attach to ribosomes, they cause the incorporation of asparagine, glutamine, histidine, and threonine, in addition to the proline previously assigned to CCC codons and the lysine previously assigned to AAA codons. The proportions of these amino acids incorporated into polypeptide products depend on the A/C ratio. Thus, since an AC copolymer containing much more A than C promotes the incorporation of many more asparagine than histidine residues, we conclude that asparagine is coded by two As and one C and that histidine is coded by two Cs and one A (Table 15-3). Similar experiments with other copolymers allowed several additional assignments. Such experiments, however, did not reveal the order of the different nucleotides within a codon. There is no way of knowing from random copolymers whether the histidine codon containing two Cs and one A is ordered CCA, CAC, or ACC.
TABLE 15-3 Amino Acid Incorporation into Proteins*

                Observed      Tentative            Calculated Triplet Frequency       Sum of Calculated
                Amino Acid    Codon                                                   Triplet
Amino Acid      Incorporation Assignments          3A      2A1C    1A2C    3C         Frequencies

Poly-AC (5:1)
Asparagine      24            2A1C                         20                         20
Glutamine       24            2A1C                         20                         20
Histidine       6             1A2C                                 4.0                4
Lysine          100           3A                   100                                100
Proline         7             1A2C, 3C                             4.0     0.8        4.8
Threonine       26            2A1C, 1A2C                   20      4.0                24

Poly-AC (1:5)
Asparagine      5             2A1C                         3.3                        3.3
Glutamine       5             2A1C                         3.3                        3.3
Histidine       23            1A2C                                 16.7               16.7
Lysine          1             3A                   0.7                                0.7
Proline         100           1A2C, 3C                             16.7    83.3       100
Threonine       21            2A1C, 1A2C                   3.3     16.7               20

* The amino acid incorporation into proteins was observed after adding random copolymers of A and C to a cell-free extract. The incorporation is given as a percentage of the maximal incorporation of a single amino acid. The copolymer ratio was then used to calculate the frequency with which a given codon would appear in the polynucleotide product. The relative frequencies of the codons are a function of the probability that a particular nucleotide will occur in a given position of a codon. For example, when the A/C ratio is 5:1, the ratio of AAA/AAC = 5 x 5 x 5 : 5 x 5 x 1 = 125:25. If we thus assign to the 3A codon a frequency of 100, then the 2A and 1C codon is assigned a frequency of 20. By correlating the relative frequencies of amino acid incorporation with the calculated frequencies with which given codons appear, tentative codon assignments can be made.
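The arithmetic in the footnote to Table 15-3 can be reproduced in a few lines. The sketch below assumes independent base incorporation at each position and scales the most frequent triplet to 100, as the footnote does for the 5:1 copolymer; the function name `triplet_frequencies` is introduced here for illustration.

```python
from itertools import product

def triplet_frequencies(ratio):
    """Relative frequency of each triplet in a random copolymer, scaled so the
    most frequent triplet has a frequency of 100."""
    total = sum(ratio.values())
    freq = {}
    for triplet in product(ratio, repeat=3):
        p = 1.0
        for base in triplet:
            p *= ratio[base] / total  # probability of this base at this position
        freq["".join(triplet)] = p
    top = max(freq.values())
    return {t: 100 * p / top for t, p in freq.items()}

# Poly-AC with an A/C ratio of 5:1
freq = triplet_frequencies({"A": 5, "C": 1})
for triplet in ("AAA", "AAC", "ACC", "CCC"):
    print(triplet, round(freq[triplet], 1))

# Threonine was tentatively assigned one 2A1C codon and one 1A2C codon
threonine_sum = freq["ACA"] + freq["ACC"]
```

With these assumptions the values 100, 20, 4.0, and 0.8 for the 3A, 2A1C, 1A2C, and 3C classes are recovered, and the threonine sum comes to 24, matching the first half of the table.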
TABLE 15-4 Binding of Aminoacyl-tRNA Molecules to Trinucleotide-Ribosome Complexes

Trinucleotide                          Aminoacyl-tRNA Bound
5'-UUU-3'  UUC                         Phenylalanine
UUA  UUG  CUU  CUC  CUA  CUG           Leucine
AUU  AUC  AUA                          Isoleucine
AUG                                    Methionine
GUU  GUC  GUA  GUG  UCU*               Valine
UCU  UCC  UCA  UCG                     Serine
CCU  CCC  CCA  CCG                     Proline
AAA  AAG                               Lysine
UGU  UGC                               Cysteine
GAA  GAG                               Glutamic acid

* Note that this codon was misassigned by this method.
Transfer RNA Binding to Defined Trinucleotide Codons

A direct way of ordering the nucleotides within some of the codons was developed in 1964. This method utilized the fact that even in the absence of all the factors required for protein synthesis, specific aminoacyl-tRNA molecules can bind to ribosome-mRNA complexes. For example, when poly-U is mixed with ribosomes, only phenylalanyl-tRNA will attach. Correspondingly, poly-C promotes the binding of prolyl-tRNA. Most importantly, this specific binding does not demand the presence of long mRNA molecules. In fact, the binding of a trinucleotide to a ribosome is sufficient. The addition of the trinucleotide UUU results in phenylalanyl-tRNA attachment, whereas if AAA is added, lysyl-tRNA specifically binds to ribosomes. The discovery of this trinucleotide effect provided a relatively easy way of determining the order of nucleotides within many codons. For example, the trinucleotide 5'-GUU-3' promotes valyl-tRNA binding, 5'-UGU-3' stimulates cysteinyl-tRNA binding, and 5'-UUG-3' causes leucyl-tRNA binding (Table 15-4). Although all 64 possible trinucleotides were synthesized with the hope of definitely assigning the order of every codon, not all codons were determined in this way. Some trinucleotides bind to ribosomes much less efficiently than UUU or GUU, making it impossible to know whether they code for specific amino acids.
Codon Assignments from Repeating Copolymers

At the same time that the trinucleotide binding technique became available, organic chemical and enzymatic techniques were being used to prepare synthetic polyribonucleotides with known repeating sequences (Figure 15-5). Ribosomes start protein synthesis at random points along these regular copolymers; yet they incorporate specific amino acids into polypeptides. For example, the repeating sequence CUCUCUCU... is the messenger for a regular polypeptide in which leucine and serine alternate. Similarly, UGUGUG... promotes the synthesis of a polypeptide containing two amino acids, cysteine and valine. And ACACAC... directs the synthesis of a polypeptide alternating threonine and histidine. The copolymer built up from repetition of the three-nucleotide sequence AAG (AAGAAGAAG) directs the synthesis of three types of polypeptides: polylysine, polyarginine, and polyglutamic acid. Poly-AUC behaves in the same way, acting as a template for polyisoleucine, polyserine, and polyhistidine (Table 15-5). Further codon assignments were obtained from repeating tetranucleotide sequences.

The sum of all these observations permitted the assignments of specific amino acids to 61 out of the possible 64 codons (see Table 15-1), with the remaining three chain-terminating codons, UAG, UAA, and UGA, not specifying any amino acid. (Note, as discussed in the previous chapter, that in the special context of translation initiation in E. coli, AUG is used as a start codon to specify N-formyl methionine rather than its usual codon assignment of methionine.)

FIGURE 15-5 Preparing oligo-ribonucleotides. Using a combination of organic synthesis and copying by DNA polymerase I, double-stranded DNA with simple repeating sequences can be generate... (the panel shows a DNA template copied by RNA polymerase in the presence of UTP to give a repeating RNA).

TABLE 15-5 Assignment of Codons Using Repeating Copolymers Built from Two or Three Nucleotides

              Codons                   Amino Acids Incorporated      Codon
Copolymer     Recognized               or Polypeptide Made           Assignment
(CU)n         CUC|UCU|CUC...           Leucine                       5'-CUC-3'
                                       Serine                        UCU
(UG)n         UGU|GUG|UGU...           Cysteine                      UGU
                                       Valine                        GUG
(AC)n         ACA|CAC|ACA...           Threonine                     ACA
                                       Histidine                     CAC
(AG)n         AGA|GAG|AGA...           Arginine                      AGA
                                       Glutamic acid                 GAG
(AUC)n        AUC|AUC|AUC...           Polyisoleucine                5'-AUC-3'
              UCA|UCA|UCA...           Polyserine                    UCA
              CAU|CAU|CAU...           Polyhistidine                 CAU

THREE RULES GOVERN THE GENETIC CODE

The genetic code is subject to three rules that govern the arrangement and use of codons in messenger RNA. The first rule holds that codons are read in a 5' to 3' direction. Thus, in principle and as an example, the coding sequence for the dipeptide NH2-Thr-Arg-COOH could be written as 5'-ACGCGA-3' (where 5'-ACG-3' is a threonine codon and 5'-CGA-3' an arginine codon) or as 3'-GCAAGC-5', wherein the codons are written in the same order as before but oppositely to their original orientations. Because messenger RNA is translated in a 5' to 3' direction, however, only the former is the correct coding sequence; if the latter were translated in a 5' to 3' direction, then the resulting peptide would be NH2-Arg-Thr-COOH, rather than NH2-Thr-Arg-COOH.

The second rule is that codons are nonoverlapping and the message contains no gaps. This means that successive codons are represented by adjacent trinucleotides in register. Thus, the coding sequence for the tripeptide NH2-Thr-Arg-Ser-COOH is represented by three contiguous and nonoverlapping triplets in the sequence 5'-ACGCGAUCU-3'.

The final rule is that the message is translated in a fixed reading frame, which is set by the initiation codon. As you will recall from Chapter 14, translation starts at an initiation codon, which is located at the 5' end of the protein-coding sequence. Because codons are nonoverlapping and consist of three consecutive nucleotides, a stretch of nucleotides could be translated in principle in any of three reading frames. It is the initiation codon that dictates which of the three possible reading frames is used. Thus, for example, the sequence 5'...ACGACGACGACGACGACGACG...3' could be translated as a series of threonine codons (5'-ACG-3'), a series of arginine codons (5'-CGA-3'), or a series of aspartate codons (5'-GAC-3'), depending on the frame of the upstream start codon.
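The behavior of repeating copolymers and the three rules above both come down to reading nonoverlapping triplets 5' to 3' from a chosen starting offset. The following is a minimal sketch of that idea; the compact code table (one-letter amino acid symbols, `*` for stop) and the function name `translate` are assumptions introduced for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... (assumed layout)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna, frame=0):
    """Translate complete, nonoverlapping triplets of an RNA written 5'->3',
    starting at offset `frame` (0, 1, or 2)."""
    return "".join(CODE[rna[i:i + 3]] for i in range(frame, len(rna) - 2, 3))

# Rule 3: one sequence, three possible reading frames (T = Thr, R = Arg, D = Asp)
acg_repeat = "ACG" * 7
frames = [translate(acg_repeat, f) for f in range(3)]

# Rule 1: the same letters read in the opposite direction give a different peptide
forward = translate("ACGCGA")         # 5'-ACGCGA-3'
backward = translate("GCAAGC"[::-1])  # 3'-GCAAGC-5' rewritten 5'->3' before reading

# Repeating copolymers: (AAG)n gives a different homopolymer in each frame,
# while (CU)n gives alternating Leu/Ser whatever the starting point
aag = [set(translate("AAG" * 6, f)) for f in range(3)]
cu = translate("CU" * 9)

print(frames, forward, backward, aag, cu)
```

Since ribosomes start at random points on these synthetic messages, all three frames are sampled, which is why a trinucleotide repeat yields three homopolymers whereas a dinucleotide repeat yields one alternating product.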
Three Kinds of Point Mutations Alter the Genetic Code

Now that we have considered the nature of the genetic code, it is instructive to revisit the issue of how the coding sequence of a gene is altered by point mutations (see Chapter 9). An alteration that changes a codon specific for one amino acid to a codon specific for another amino acid is called a missense mutation. As a consequence, a gene bearing a missense mutation produces a protein product in which a single amino acid has been substituted for another, as in the classic example of the human genetic disease sickle cell anemia, in which glutamate 6 in the β-globin subunit of hemoglobin has been replaced with a valine.

A more drastic effect results from an alteration causing a change to a chain-termination codon, which is known as a nonsense or stop mutation. When a nonsense mutation arises in the middle of a genetic message, an incomplete polypeptide is released from the ribosome owing to premature chain termination. The size of the incomplete polypeptide chain depends on the location of the nonsense mutation. Mutations occurring near the beginning of a gene result in very short polypeptides, whereas mutations near the end produce polypeptide chains of almost normal length. As we saw in Chapter 14, mRNAs that contain a premature stop codon are rapidly degraded in eukaryotic cells by a process known as nonsense-mediated mRNA decay.

The third kind of point mutation is a frameshift mutation. Frameshift mutations are insertions or deletions of one or a small number of base pairs that alter the reading frame. Consider a tandem repeat of the sequence GCU in a frame that would be read as a series of alanine codons (the codons are artificially set apart from each other by a gap for clarity but are, of course, contiguous in a real messenger RNA):

    Ala Ala Ala Ala Ala Ala Ala Ala
 5'-GCU GCU GCU GCU GCU GCU GCU GCU-3'
Now imagine the insertion of an A in the message, thereby generating a serine codon (AGC) at the site of the insertion. The resulting frameshift causes triplets downstream of the insertion to be read as cysteines:

    Ala Ala Ser Cys Cys Cys Cys Cys
 5'-GCU GCU AGC UGC UGC UGC UGC UGC-3'

Thus, the insertion (or for that matter the deletion) of a single base drastically alters the coding capacity of the message not only at the site of the insertion but for the remainder of the messenger as well. Likewise, the insertion (or deletion) of two bases would have the effect of throwing the entire coding sequence, at and downstream of the insertions, into a different reading frame.

Finally, consider the instructive case of an insertion of three extra bases at nearby positions in a message. It is obvious that the stretch of message at and between the three insertions will be drastically altered. But because the code is read in units of three, messenger RNA downstream of the three inserted bases will be in its proper reading frame and hence, completely unaltered:

    Ala Ala Ser Cys Met Leu His Ala Ala Ala
 5'-GCU GCU AGC UGC AUG CUG CAU GCU GCU GCU-3'
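The effect of one, two, or three inserted bases on a run of GCU codons can be reproduced computationally. This is an illustrative sketch: the code table layout, the helper names `translate` and `insert_bases`, and the choice of insertion positions (chosen here to regenerate the sequences printed above) are assumptions, not part of the original text.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... (assumed layout)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Translate complete triplets of an RNA written 5'->3' (one-letter amino acids)."""
    return "".join(CODE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

def insert_bases(rna, insertions):
    """Insert each base before the given 0-based position of the ORIGINAL sequence."""
    # Work from the 3' end so earlier positions are not shifted by later insertions
    for pos, base in sorted(insertions, reverse=True):
        rna = rna[:pos] + base + rna[pos:]
    return rna

wild_type = "GCU" * 9
one = insert_bases(wild_type, [(6, "A")])
two = insert_bases(wild_type, [(6, "A"), (11, "A")])
three = insert_bases(wild_type, [(6, "A"), (11, "A"), (17, "A")])

print(translate(wild_type))  # all alanine (A)
print(translate(one))        # frame shifted downstream of the insertion
print(translate(two))        # still shifted, into the other wrong frame
print(translate(three))      # original frame restored after the third insertion
```

Only the triple insertion returns to alanine codons downstream, which is the logic behind the genetic argument that the code is read in units of three.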
Genetic Proof that the Code Is Read in Units of Three

The preceding example is the logic of a classic experiment by Francis Crick, Sydney Brenner, and their coworkers, involving bacteriophage T4, that established that the code is read in units of three and did so purely on the basis of a genetic argument (that is, without any biochemical or molecular evidence). Genetic crosses were carried out to create a mutant phage harboring three inferred single base pair insertion mutations at nearby positions in a single gene. Of course, the three insertions would have scrambled a short stretch of codons, but the protein encoded by the gene in question (called rII) was able to tolerate the local alteration to its amino acid sequence. This finding indicated that the overall coding capacity of the gene had been chiefly left unaltered despite the presence of three mutations, each of which alone, or any two of which alone, would have drastically altered the reading frame of the gene's message (and rendered its protein product inactive). Because the gene could tolerate three insertions but not one or two (or, for that matter, four), the genetic code must be read in units of three. See Chapters 2 and 21 for a discussion of the historic figures who showed that the code is read in units of three, and for a description of the role of bacteriophage T4 as a model system for elucidating the nature of the code.
SUPPRESSOR MUTATIONS CAN RESIDE IN THE SAME OR A DIFFERENT GENE

Often, the effects of harmful mutations can be reversed by a second genetic change. Some of these subsequent mutations are easy to understand, being simple reverse (back) mutations, which change an altered nucleotide sequence back to its original arrangement. More difficult to understand are the mutations occurring at different locations on the chromosome that suppress the change due to a mutation at site A by producing an additional genetic change at site B. Such suppressor mutations fall into two main categories: those occurring within the same gene as the original mutation, but at a different site in this gene (intragenic suppression), and those occurring in another gene (intergenic suppression). Genes that cause suppression of mutations in other genes are called suppressor genes. Both of the types of suppression that we are considering here work by causing the production of good (or partially good) copies of the protein made inactive by the original harmful mutation. For example, if the first mutation caused the production of inactive copies of one of the enzymes involved in making arginine, then the suppressor mutation allows arginine to be made by restoring the synthesis of some good copies of this same enzyme. However, the mechanisms by which intergenic and intragenic suppressor mutations cause the resumption of the synthesis of good proteins are completely different.

As an example of intragenic suppression, consider the case of a missense mutation. Its effect can sometimes be reversed through an additional missense mutation in the same gene. In such cases, the original loss of enzymatic activity is due to an altered three-dimensional configuration resulting from the presence of an incorrect amino acid in the encoded protein sequence. A second missense mutation in the same gene can bring back biological activity if it somehow restores the original configuration around the functional part of the molecule. Figure 15-6 shows another example of intragenic suppression, this time for the case of a frameshift mutation.
Intergenic Suppression Involves Mutant tRNAs

Suppressor genes do not act by changing the nucleotide sequence of a mutant gene. Instead, they change the way the mRNA template is read. Among the best-known examples of suppressor mutations are mutant tRNA genes that suppress the effects of nonsense mutations in protein-coding genes (but mutant tRNAs that suppress missense mutations and even frameshift mutations are also known). In E. coli, suppressor genes are known for each of the three stop codons. They ...

FIGURE 15-6 Suppression of frameshift mutations. (a) A deletion in the nucleotide coding sequence can result in an incomplete, inactive polypeptide chain. (b) The effect of the deletion, shown in panel a, can be overcome by a second mutation, an insertion in the coding sequence. This insertion results in the production of a complete polypeptide chain having two amino acid replacements. Depending on the change in sequence, the protein may have partial or full activity.
FIGURE 15-7 Nonsense suppression. The figure shows how a minor tyrosine tRNA species acts to suppress the nonsense codon in mRNA. (a) A mutated gene containing a nonsense codon is transcribed into a mutant mRNA whose nonsense codon is read by a release factor to form a nonfunctional, incomplete protein product; the gene coding for a minor tyrosine tRNA gives a tRNA that recognizes the tyrosine codons 5'-UAC-3' and 5'-UAU-3'. (b) The mutated gene for the minor tyrosine tRNA gives a tRNA that recognizes the nonsense codon 5'-UAG-3'; the nonsense codon is suppressed, and tyrosine is inserted at the position of the nonsense codon to allow the formation of a complete polypeptide chain.
species, whereas the other two are duplicate genes coding for a species present in smaller amounts. One or the other of the two duplicate genes is always the site of the suppressor mutation. No such dilemma exists for UGA suppression, which is mediated by a mutant form of tRNATrp; the suppressing tRNATrp retains its capacity to read UGG (tryptophan) codons while also recognizing UGA stop codons. This is possible because the anticodon was changed from CCA (3'-ACC-5') in the wild type to UCA (3'-ACU-5') in the mutant tRNATrp, and wobble rules, as we have seen, allow recognition of A or G in the 3' position of the codon by U in the 5' position of an anticodon.
Nonsense Suppressors also Read Normal Termination Signals

The act of nonsense suppression can be viewed as a competition between the suppressor tRNA and the release factor. When a stop codon comes into the ribosomal A site, either read-through or polypeptide chain termination will occur, depending on which arrives first. Suppression of UAG codons is efficient. In the presence of the suppressor tRNA, more than half of the chain-terminating signals are read as specific amino acid codons. E. coli can tolerate this misreading of the UAG stop codon because UAG is used infrequently as a chain-terminating codon at the end of open reading frames. In contrast, suppression of the UAA codon usually averages between 1% and 5%, and mutant cells producing UAA-suppressing tRNAs grow poorly. This is expected from the fact that UAA is frequently used as a chain-terminating codon and its recognition by a suppressor tRNA would be expected to result in the production of many more aberrantly long polypeptides.
Proving the Validity of the Genetic Code

The code was cracked, as we have seen, by means of biochemical methods involving the use of cell-free systems for carrying out protein synthesis. But molecular biologists are generally suspicious of a method that relies on in vitro analysis alone. So how do we know definitively that the code as depicted in Table 15-1 is true in living cells? Of course, in the modern era of large-scale DNA sequencing, in which the entire nucleotide sequences of the genomes of diverse organisms ranging from microbes to man have been determined, the genetic code has not only been validated but shown to be universal or nearly so (see below). Nonetheless, a classic and instructive experiment in 1966 helped to validate the genetic code well before DNA sequencing was possible.

The experiment was based on the construction by genetic recombination of a mutant gene of phage T4 that harbored a mutually suppressing pair of insertion and deletion mutations (similar to the example given in Figure 15-6). The gene in question encoded a cell-wall degrading enzyme called lysozyme, chosen because it is small, is easy to purify, and has a known complete amino acid sequence. The experimental strategy was to compare the amino acid sequence of the doubly mutant protein with that of wild-type lysozyme. When the amino acid sequences of the mutant (...NH2-Thr Lys Val His His Leu Met Ala Ala Lys-COOH...) and wild type (...NH2-Thr Lys Ser Pro Ser Leu Asn Ala Ala Lys-COOH...) were compared, they were found to differ by a stretch of five amino acids (Val His His Leu Met in the mutant in place of Ser Pro Ser Leu Asn in the wild type). This observation suggested that the insertion and deletion mutations had scrambled a short stretch of codons in the message of the mutant. Knowing the consequent effect of the scrambled codons on the amino acid sequence of the protein imposed important constraints on the nature of the genetic code. Specifically, if the genetic code as elucidated in biochemical experiments is valid, then it should be possible to identify a set of codons for the wild-type sequence Ser Pro Ser Leu Asn that, when properly aligned and bracketed with an insertion at one end and a deletion at the other, would specify the mutant amino acid sequence. Indeed, such a solution exists, which requires a deletion of a nucleotide at the 5' end of the coding sequence and the insertion of a nucleotide at the 3' end:

 NH2-Lys Ser Pro Ser Leu Asn Ala-COOH
  5'-AAA AGU CCA UCA CUU AAU GC-3'
  5'-AAA GUC CAU CAC UUA AUG GC-3'
 NH2-Lys Val His His Leu Met Ala-COOH

As you can see, the solution verifies several codon assignments and demonstrates that more than one synonymous codon is used to specify the same amino acid in vivo (for example, 5'-CAU-3' and 5'-CAC-3' for histidine). Lastly, and importantly, you should be able to convince yourself from the solution that translation proceeds in a 5' to 3' direction. (Hint: see if you can account for the two amino acid sequences in their proper NH2 to COOH order when you align each of the codons in your solution in a 3' to 5' orientation.)
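The proposed solution can be checked by applying the deletion and the insertion to the wild-type message and translating both versions. This is a sketch under stated assumptions: the code table layout and the `translate` helper are introduced here, and the exact positions chosen for the deleted A and the inserted G are one of several equivalent ways of producing the mutant sequence shown above.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... (assumed layout)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Translate complete triplets of an RNA written 5'->3' (one-letter amino acids)."""
    return "".join(CODE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

wild_type = "AAAAGUCCAUCACUUAAUGC"      # Lys Ser Pro Ser Leu Asn + first two bases of Ala

# Delete one A near the 5' end (here the A of the AGU serine codon) ...
deleted = wild_type[:3] + wild_type[4:]
# ... and insert one G near the 3' end (here just after the AAU of the old frame)
mutant = deleted[:17] + "G" + deleted[17:]

wt_peptide = translate(wild_type)       # K S P S L N  (Lys Ser Pro Ser Leu Asn)
mut_peptide = translate(mutant)         # K V H H L M  (Lys Val His His Leu Met)

# The trailing GC specifies alanine whatever the third base is
trailing = {CODE["GC" + b] for b in BASES}

print(mutant, wt_peptide, mut_peptide, trailing)
```

Both messages are the same length, so the frame is restored after the inserted G and the downstream alanine codons are read as in the wild type.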
THE CODE IS NEARLY UNIVERSAL The results of large-scale sequencing of genomes have largely conñnned the expecled universality of the genetic codeo The universality of lhe code has had a huge impacf on our understanding of evolution as il made it possible lo directly compare proteio coding sequen ces amoog all organisms for whieh a geoome sequence is availilble. As we shall see io Chapter 20, powcrful compuler programs are available that can search for and identify similaritics among prcdicted coding sequences from a wüie range of organisms. The lmiversalily or Ihe code also helped lo create the fleld or genetic engineering by making it possible to express cloned copies of genes cncoding useful protein produels in surrogate host organisms, sueh as the production oC human ¡nsuIin in bacteria (see Chapter 20). To \mdcrstand thc conscrvativc nalure of the code, considcr what might happen if a mutation changed the genelic eode. Sueh a mutBlion ndght, for examplc, alter the sequence of the serine tRNA molecule of the c1ass that eorresponds to DeD, causing them to reeognize UUU sequences inslead. Tbis would be a lethal mutatioo io haploid eells containing ooly one gene directing Ihe produetioo of tRNASer, for serioe would nol be inserled into many of ils oonnal positions io proteins. Even if there were more than one gene far tRNAScr (as io B diploid celJ), this Iype ofmutalion would sUJI be lelhal since il would cause lhe simultaneous replacemenl of many phenylalanine residues by serine in eell proteins. In view of what we have jusI said. ir was completely unexpected lO find that in certain subcellular Of-ganelles. lhe genetk code is in fael slightly dlfferent from Ihe standard codeoThis realization came during the elucidation of the entire DNA sequeoce of lhe 16,569·base pair human mitochondrial genome bul is observed ror mitrochondriB io yeast, the fruit fiy, and higher plants. Sequences of the regions known
to specify proteins have revealed the following differences between the standard and mitochondrial genetic codes (Table 15-6):
• UGA is not a stop signal but codes for tryptophan. Hence, the anticodon of mitochondrial tRNATrp recognizes both UGG and UGA, as if obeying the traditional wobble rules.
• Internal methionine is encoded by both AUG and AUA.
• In mammalian mitochondria, AGA and AGG are not arginine codons (of which there are six in the "universal" code) but specify chain termination. Thus, there are four stop codons (UAA, UAG, AGA, and AGG) in the mammalian mitochondrial code.
• In fruit fly mitochondria, AGA and AGG are also not arginine codons but specify serine.
Perhaps not surprisingly, mitochondrial tRNAs are likewise unusual with respect to the rules by which they decode mitochondrial messages. Only 22 tRNAs are present in mammalian mitochondria, whereas a minimum of 32 tRNA molecules are required to decode the "universal" code according to the wobble rules. Consequently, when an amino acid is specified by four codons (with the same first and second positions), only a single mitochondrial tRNA is involved. (Recall that a minimum of two tRNAs would be required by nonmitochondrial systems.) Such mitochondrial tRNAs all have in the 5' (wobble) position of their anticodons a U residue, which is able to engage in pairing with any of the four nucleotides in the third codon position. In cases where purines in the third position of the codon correspond to different amino acids from pyrimidines in that position, a modified U in the first position of the anticodon of the mitochondrial tRNA restricts wobble to pairing with the two purines only.

TABLE 15-6 Genetic Code of Mammalian Mitochondria*

Codons                  Amino acid   Anticodon (5' to 3')†
UUU, UUC                Phe          GAA
UUA, UUG                Leu          UAA
CUU, CUC, CUA, CUG      Leu          UAG
AUU, AUC                Ile          GAU
AUA§, AUG               Met          CAU‡
GUU, GUC, GUA, GUG      Val          UAC
UCU, UCC, UCA, UCG      Ser          UGA
CCU, CCC, CCA, CCG      Pro          UGG
ACU, ACC, ACA, ACG      Thr          UGU
GCU, GCC, GCA, GCG      Ala          UGC
UAU, UAC                Tyr          GUA
UAA, UAG                stop
CAU, CAC                His          GUG
CAA, CAG                Gln          UUG
AAU, AAC                Asn          GUU
AAA, AAG                Lys          UUU
GAU, GAC                Asp          GUC
GAA, GAG                Glu          UUC
UGU, UGC                Cys          GCA
UGA§, UGG               Trp          UCA
CGU, CGC, CGA, CGG      Arg          UCG
AGU, AGC                Ser          GCU
AGA§, AGG§              stop
GGU, GGC, GGA, GGG      Gly          UCC

* Differences between the mitochondrial and "universal" genetic code (Table 15-1) are marked §.
† Each group of codons (one row) is read by a single tRNA whose anticodon is written 5' to 3'. Each four-codon group is read by a tRNA having a U in the first (5') position of the anticodon. Two-codon groups with codons ending in either U/C or A/G are read with G-U wobble by tRNAs with G or U, respectively, in the first position of the anticodon. The anticodons often contain modified bases.
‡ Note that the C in the first anticodon position engages in unusual pairing.

Exceptions to the "universal" code are not limited to mitochondria but are also found in several prokaryotic genomes and in the nuclear genomes of certain eukaryotes. The bacterium Mycoplasma capricolum uses UGA as a tryptophan codon rather than a chain-termination codon. Likewise, some unicellular protozoa use UAA and UAG, which are stop codons in the "universal" code, as glutamine codons. Finally, a codon (CUG) for one amino acid (leucine) in the "universal" code has become a codon for another amino acid (serine) in the yeast Candida.
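The practical effect of the mammalian mitochondrial reassignments listed above can be illustrated with a short Python sketch. The function and variable names, the one-letter amino acid symbols, and the test message are illustrative choices, not part of the text; the sketch simply builds the standard codon table, overrides the four reassigned codons, and translates the same RNA under both codes.

```python
from itertools import product

BASES = "UCAG"
# Standard ("universal") code in one-letter symbols, listed for codons in
# U, C, A, G order at the first, second, and third positions; "*" is a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Mammalian mitochondrial code: only the four reassigned codons change.
MAMMALIAN_MITO = dict(STANDARD)
MAMMALIAN_MITO.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(mrna, code):
    """Translate an RNA string codon by codon, stopping at the first stop."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


message = "AUGAUAUGAAGAUUU"  # hypothetical test message
print(translate(message, STANDARD))        # read with the standard code
print(translate(message, MAMMALIAN_MITO))  # read with the mitochondrial code
```

Under the standard code the message terminates at UGA; under the mitochondrial code UGA is read as tryptophan and translation continues until AGA.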
SUMMARY
In the "universal" genetic code used by every organism from bacteria to humans, 61 codons signify specific amino acids; the remaining three are chain-termination codons. The code is highly degenerate, with several codons (synonyms) usually corresponding to a single amino acid. A given tRNA can sometimes specifically recognize several codons. This ability arises from wobble in the base at the 5' end of the anticodon. The stop codons UAA, UAG, and UGA are read by specific proteins, not specialized tRNA molecules.
The genetic code is subject to three principal rules. Codons are read in a 5' to 3' direction, codons are nonoverlapping and the message contains no gaps, and the message is translated in a fixed reading frame, which is set by the initiation codon.
The genetic code was cracked through the study of protein synthesis in cell-free extracts. Addition of new mRNA to an extract depleted of its original messenger component results in the production of new proteins whose amino acid sequences are determined by the externally added mRNA. The first (and probably most important) step in cracking the genetic code occurred when the synthetic polyribonucleotide poly-U was found to code specifically for polyphenylalanine. Use of other synthetic polyribonucleotides, both homogeneous (poly-C, and so on) and mixed (poly-AU, and so on), then allowed assignment of codons for the various amino acids. Determination of the exact order of nucleotides in codons subsequently came from a study of specific trinucleotide-tRNA-ribosome interactions and the use of regular copolymers as messengers.
Point mutations that alter the code are missense mutations, which change the codon for one amino acid into the codon for another amino acid; nonsense mutations, which cause protein synthesis to terminate prematurely; and frameshift mutations, which alter the reading frame of the message. In some cases the effects of missense, nonsense, and frameshift mutations can be partially suppressed by extragenic suppressors. For example, mutant tRNAs read stop codons generated by nonsense mutations as if they were codons for a specific amino acid.
A slightly different genetic code is utilized in mitochondria and in the principal genomes of certain prokaryotes and protozoa, such as the use of UGA, a stop codon in the "universal" code, as a tryptophan codon.
PART 4
REGULATION

PART OUTLINE
• Chapter 16 Gene Regulation in Prokaryotes
• Chapter 17 Gene Regulation in Eukaryotes
• Chapter 18 Gene Regulation during Development
• Chapter 19 Comparative Genomics and the Evolution of Animal Diversity
In the preceding part, we considered how the genetic information encoded in the DNA is expressed. This involves the transcription of DNA sequences into an RNA form which is then used as a template for translation into protein. But not all genes are expressed in all cells all the time. Indeed, much of life depends on the ability of cells to express their genes in different combinations at different times and in different places. Even a lowly bacterium expresses only some of its genes at any given time, ensuring it can, for example, make the enzymes needed to metabolize the nutrients it encounters while not making enzymes for other nutrients at the same time. Development of multicellular organisms offers a striking example of this so-called "differential gene expression." Essentially all the cells in a human contain the same genes, but the set of genes expressed in forming one cell type is different from that expressed in forming another. Thus, a muscle cell expresses a set of genes different (at least in part) from that expressed by a neuron, a skin cell, and so on. By and large these differences occur at the level of transcription, most commonly at the initiation of transcription.
In the following chapters, we look at how genes are regulated, starting in Chapter 16 with how this is done in bacteria. It is here that the basic mechanisms can most readily be appreciated. First, we deal with simple cases that illustrate different mechanisms of transcriptional regulation. These include the case of the lac operon. These genes encode proteins needed for metabolizing the sugar lactose, and are expressed only when that sugar is available in the growth medium. Then we look at examples of gene regulation that operate at later steps in gene expression (RNA elongation and translation, for example). Finally, in this chapter we describe how phage λ chooses between alternative developmental pathways by expressing different sets of genes upon infection of a bacterial cell.
In Chapter 17, we consider basic mechanisms of gene expression in eukaryotes, from yeast to some of the simpler cases found in higher eukaryotes. Mechanisms of transcriptional activation and repression are compared to those in bacteria, and we see where mechanisms are conserved, and where there are additional features, most notably the effects of chromatin modifications of the type discussed in Chapter 7. We also see how small RNA molecules can regulate gene expression in various ways. As we saw in Chapter 13, eukaryotes very often have to splice RNAs before they can be translated. This offers another step at which expression of a given gene can be regulated. In this case, regulation can determine not only when a given gene is expressed, but also which of several alternative proteins is made.
In Chapters 18 and 19, we consider gene regulation in the context of developmental biology. In Chapter 18, we look at examples of how genes are regulated to bestow cell type specificity (differentiation) and pattern formation (morphogenesis) in a group of genetically identical cells (for example, those found in a developing embryo). Chapter 19 looks at diversity among closely related organisms and sees how, in many of these, the differences in morphology or behavior result not from changes in the genes, but from differences in where and when those genes are expressed within each organism during development.
PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES

Edward Lewis, C.C. Lindegren, Alfred Hershey, and Joshua Lederberg, 1951 Symposium on Genes and Mutations. Lewis instigated the genetic analysis of development, using the fruit fly as his model (Chapter 18). He shared the 1995 Nobel Prize in Medicine for his work. Lindegren was a pioneer of yeast genetics (Chapter 21). Hershey was, together with Max Delbrück and Salvador Luria, the leader of the group that used phage as their model system in the early days of molecular biology (Chapter 21); the three of them shared the 1969 Nobel Prize for Medicine. Lederberg discovered that DNA could pass between bacteria by a mating process called conjugation (Chapter 21), for which he shared in the 1958 Nobel Prize for Medicine.

Jeff Roberts and Ann Burgess, 1970 Symposium on Transcription of Genetic Material. Roberts' research has focused on regulators of gene expression in bacteria and phage, particularly antiterminators in phage lambda (Chapter 16). Burgess became a biology educator and is involved in national efforts to improve science education. Roberts was an author of the previous edition of this book, while Burgess has a cousin among the current authors (TB).

Christiane Nüsslein-Volhard, 1996 CSHL Meeting on Zebrafish Development and Genetics. Mutant screens carried out in fruit flies by Nüsslein-Volhard and her colleague Eric Wieschaus identified many genes critical to the early embryonic development of that organism, and probably all animals (Chapter 18). For this the two of them shared in the 1995 Nobel Prize with Edward Lewis.

Mark Ptashne and Joseph Goldstein, 1988 Symposium on Molecular Biology of Signal Transduction. Ptashne was instrumental in taking the early ideas of Jacob and Monod about how gene expression is regulated, and describing how these work at a molecular level (Chapters 16 and 17). Goldstein, with his longtime collaborator Michael Brown, worked out the signal transduction pathways (Chapter 17) that control expression of genes involved in cholesterol metabolism, for which they won the 1985 Nobel Prize in Medicine.

Jacques Monod and Leo Szilard, 1961 CSH Laboratory. Monod, together with François Jacob, formulated the operon model for the regulation of gene expression (Chapter 16). The two of them, together with their colleague André Lwoff, shared the 1965 Nobel Prize in Medicine for this achievement. Leo Szilard was a wartime nuclear physicist who turned to molecular biology after taking the phage course at Cold Spring Harbor in 1947. He ran a lab with Aaron Novick in Chicago. (Source: Courtesy of Esther Bubley.)

Mrs. I.H. Herskowitz with sons, Ira and Joel, 1947 Symposium on Nucleic Acids and Nucleoproteins. Ira Herskowitz pioneered the use of the yeast S. cerevisiae as a model organism for molecular biology (Chapter 21), and made major contributions to ideas about gene regulation in this organism as he had, earlier, in bacteriophage lambda (Chapters 16 and 17). His father, Irwin, later the author of a genetics textbook, was attending the symposium that year.
CHAPTER 16
Gene Regulation in Prokaryotes

In Chapter 12 we saw how DNA is transcribed into RNA by the enzyme RNA polymerase. We also described the sequence elements that constitute a promoter, the region at the start of a gene where the enzyme binds and initiates transcription. In bacteria the most common form of RNA polymerase (that bearing σ70) recognizes promoters formed from three elements (the "-10", "-35", and "UP" elements), and we saw that the strength of any given promoter is determined by which of these elements it possesses and how well they match optimum "consensus" sequences. In the absence of regulatory proteins, these elements determine the efficiency with which polymerase binds to the promoter and, once bound, how readily it initiates transcription.
Now we turn to mechanisms that regulate expression, that is, mechanisms that increase or decrease expression of a given gene as the requirement for its product varies. There are various stages at which expression of a gene can be regulated. The most common is transcription initiation, and the bulk of this chapter focuses on the regulation of that step in bacteria. We start with an overview of general mechanisms and principles and proceed to some well-studied examples that demonstrate how the basic mechanisms are used in various combinations to control genes in specific biological contexts. We also consider mechanisms of gene regulation that operate at steps after transcription initiation, including transcriptional antitermination and the regulation of translation.

OUTLINE
• Principles of Transcriptional Regulation
• Regulation of Transcription Initiation: Examples from Bacteria
• Examples of Gene Regulation at Steps after Transcription Initiation
• The Case of Phage λ: Layers of Regulation
PRINCIPLES OF TRANSCRIPTIONAL REGULATION

Gene Expression Is Controlled by Regulatory Proteins
As we described in the introduction to this section, genes are very often controlled by extracellular signals; in the case of bacteria, this typically means molecules present in the growth medium. These signals are communicated to genes by regulatory proteins, which come in two types: positive regulators, or activators; and negative regulators, or repressors. Typically these regulators are DNA-binding proteins that recognize specific sites at or near the genes they control. An activator increases transcription of the regulated gene; repressors decrease or eliminate that transcription.
How do these regulators work? Recall the steps in transcription initiation described in Chapter 12 (see Figure 12-3). First, RNA polymerase binds to the promoter in a closed complex (in which the DNA strands remain together). The polymerase-promoter complex then undergoes a transition to an open complex in which the DNA at the start site of transcription is unwound and the polymerase is positioned to initiate transcription. This is followed by promoter escape, the step in which polymerase leaves the promoter and starts transcribing. Which steps are stimulated by activators and inhibited by repressors? That depends on the promoter and regulators in question. We consider two general cases, outlined under the next two headings.
Many Promoters Are Regulated by Activators that Help RNA Polymerase Bind DNA and by Repressors that Block that Binding
At many promoters, in the absence of regulatory proteins, RNA polymerase binds only weakly. This is because one or more of the promoter elements discussed above is absent or imperfect. When polymerase does occasionally bind, however, it spontaneously undergoes a transition to the open complex and initiates transcription. This gives a low level of constitutive expression called the basal level. Binding of RNA polymerase is the rate-limiting step in this case (Figure 16-1a).
To control expression from such a promoter, a repressor need only bind to a site overlapping the region bound by polymerase. In that way, the repressor blocks polymerase binding to the promoter, thereby preventing transcription (Figure 16-1b), although it is important to note that repression can work in other ways as well. The site on DNA where a repressor binds is called an operator.
To activate transcription from this promoter, an activator just helps polymerase bind the promoter. Typically this is achieved as follows: the activator uses one surface to bind to a site on the DNA near the promoter; with another surface, the activator simultaneously interacts with RNA polymerase, bringing the enzyme to the promoter (Figure 16-1c). This mechanism, often called recruitment, is an example of cooperative binding of proteins to DNA (see Chapter 5). The inter…

FIGURE 16-1 Activation by recruitment of RNA polymerase. (a) In the absence of both activator and repressor, RNA polymerase occasionally binds the promoter spontaneously… Panel labels: basal level of transcription; no transcription; spontaneous isomerization leading to … of transcription.
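The logic of recruitment and of repression by exclusion can be made concrete with a simple equilibrium sketch in Python. This is an illustrative thermodynamic model, not one given in the text: the function name, the dimensionless binding weights `p`, `a`, and `r`, and the cooperativity factor `omega` are all assumptions, and the numbers used are arbitrary.

```python
def promoter_occupancy(p, a=0.0, omega=1.0, r=0.0):
    """Equilibrium probability that RNA polymerase occupies the promoter.

    p, a, and r are dimensionless binding weights (concentration divided by
    dissociation constant) for polymerase, activator, and repressor. omega is
    the cooperativity factor contributed by activator-polymerase contact.
    All four are illustrative parameters, not measured values.
    """
    # States with polymerase bound: polymerase alone, or polymerase together
    # with activator (stabilized by the factor omega).
    bound = p + a * p * omega
    # States without polymerase: bare DNA, activator alone, repressor alone,
    # and repressor plus activator. The repressor site overlaps the promoter,
    # so no state contains both repressor and polymerase.
    unbound = 1.0 + a + r + r * a
    return bound / (bound + unbound)


basal = promoter_occupancy(p=0.01)                       # weak promoter alone
activated = promoter_occupancy(p=0.01, a=10, omega=100)  # activator recruits
repressed = promoter_occupancy(p=0.01, r=100)            # repressor excludes
print(f"basal {basal:.4f}, activated {activated:.4f}, repressed {repressed:.6f}")
```

In this sketch an activator that binds DNA but makes no contact with polymerase (omega = 1) leaves promoter occupancy unchanged, which is the sense in which recruitment depends on the activator touching both DNA and enzyme at once.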
Some Activators Work by Allostery and Regulate Steps after RNA Polymerase Binding
Not all promoters are limited in the same way. Thus, consider a promoter at the other extreme from that described above. In this case, RNA polymerase binds efficiently unaided and forms a stable closed complex. But that closed complex does not spontaneously undergo transition to the open complex (Figure 16-2a). At this promoter, an activator must stimulate the transition from closed to open complex, since that transition is the rate-limiting step.
Activators that stimulate this kind of promoter work by triggering a conformational change in either RNA polymerase or DNA. That is, they interact with the stable closed complex and induce a conformational change that causes transition to the open complex (Figure 16-2b). This mechanism is an example of allostery.
In Chapter 5 we encountered allostery as a general mechanism for controlling the activities of proteins. One of the examples we considered there was a protein (a cyclin) binding to, and activating, a kinase (Cdk) involved in cell cycle regulation. The cyclin does this by inducing a conformational change in the kinase, switching it from an inactive to an active state (Figure 5-27). In this chapter, we will see two examples of transcriptional activators working by allostery. In one case (at the glnA promoter), the activator (NtrC) interacts with the RNA polymerase bound in a closed complex at the promoter, stimulating a transition to the open complex. In the other example (at the merT promoter), the activator (MerR) achieves the same effect but does so by inducing a conformational change in the promoter DNA.
There are variations on these themes: some promoters are inefficient at more than one step and can be activated by more than one mechanism. Also, repressors can work in ways other than just blocking the binding of RNA polymerase. For example, some repressors inhibit transition to the open complex, or promoter escape. We will consider examples of these later in the chapter.

FIGURE 16-2 Allosteric activation of RNA polymerase. (a) Binding of RNA polymerase to the promoter in a stable closed complex (no spontaneous isomerization and thus no transcription). (b) The activator interacts with polymerase to trigger transition to the open complex and high levels of transcription (activated level of transcription). The representations of the closed and open comple…
Action at a Distance and DNA Looping
Thus far we have tacitly assumed that DNA-binding proteins that interact with each other bind to adjacent sites (for example, RNA polymerase and activator in Figures 16-1 and 16-2). Often this is the case. But some proteins interact with each other even when bound to sites well separated on the DNA. To accommodate this interaction, the DNA between the sites loops out, bringing the sites into proximity with one another (Figure 16-3).
We will encounter examples of this kind of interaction in bacteria. Indeed, one of the activators we have already mentioned (NtrC) activates "from a distance": its binding sites are normally located about 150 bp upstream of the promoter, and the activator works even when those sites are placed further away (a kb or more). We will also consider repressors that interact to form loops of up to 3 kb. In the next chapter, on eukaryotic gene regulation, we will be faced with more numerous and more dramatic examples of this "action at a distance."
One way to help bring distant DNA sites closer together (and so help looping) is the binding of other proteins to sequences between those sites. In bacteria there are cases in which a protein binds between an activator binding site and the promoter and helps the activator interact with polymerase by bending the DNA (Figure 16-4). Such "architectural" proteins facilitate interactions between proteins in other processes as well (for example, site-specific recombination; see Chapter 11).
FIGURE 16-3 Interactions between proteins bound to DNA. (a) Cooperative binding of proteins to adjacent sites. (b) Cooperative binding of proteins to separated sites.

FIGURE 16-4 DNA-bending protein can facilitate interaction between DNA-binding proteins. A protein that bends DNA binds to a site between the activator binding site and the promoter. This brings the sites closer together in space and thereby helps the interaction between the DNA-bound activator and polymerase.
Cooperative Binding and Allostery Have Many Roles in Gene Regulation
We have already pointed out that gene activation can be mediated by simple cooperative binding: the activator interacts simultaneously with DNA and with polymerase and so recruits the enzyme to the promoter. And we have described how activation can, in other cases, be mediated by allosteric events: an activator interacts with polymerase already bound to the promoter and, by inducing a conformational change in the enzyme or the promoter, stimulates transcription initiation. Both cooperative binding and allostery have additional roles in gene regulation as well.
For example, groups of regulators often bind DNA cooperatively. That is, two or more activators and/or repressors interact with each other and with DNA, and thereby help each other bind near a gene they all regulate. As we will see, this kind of interaction can produce sensitive switches that allow a gene to go from completely off to fully on in response to only small changes in conditions. Cooperative binding of activators can also serve to integrate signals; that is, some genes are activated only when multiple signals (and thus multiple regulators) are simultaneously present. A particularly striking and well-understood example of cooperativity in gene regulation is provided by bacteriophage λ. We consider the basic mechanism and consequences of cooperative binding in more detail when we discuss that example later in the chapter, and also in Box 16-5.
Allostery, for its part, is not only a mechanism of gene activation; it is also often the way regulators are controlled by their specific signals. Thus, a typical bacterial regulator can adopt two conformations: in one it can bind DNA; in the other it cannot. Binding of a signal molecule locks the regulatory protein in one or another conformation, thereby determining whether or not it can act. We saw an example of this in Chapter 5 (Figure 5-25), where we also considered the basic mechanism of allostery in some detail; in this and the next chapter we will see several examples of allosteric control of regulators by their signals.
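Why cooperative binding yields a sensitive switch can be seen in a small numerical sketch. It uses the Hill equation, which is a standard approximation for cooperative binding but is not introduced in the text; the function names and the choice of 10% and 90% occupancy as "off" and "on" thresholds are assumptions made for illustration.

```python
def hill_occupancy(conc, k_half, n):
    """Fractional occupancy of a site under the Hill approximation.

    conc is the regulator concentration, k_half the concentration giving half
    occupancy, and n the Hill coefficient (n = 1 means no cooperativity).
    """
    return conc**n / (k_half**n + conc**n)


def fold_change_10_to_90(n):
    """Fold increase in regulator concentration needed to raise occupancy
    from 10% to 90%. Solving the Hill equation for conc at each occupancy
    and taking the ratio gives 81 ** (1 / n)."""
    return 81.0 ** (1.0 / n)


for n in (1, 2, 4):
    print(f"Hill coefficient {n}: {fold_change_10_to_90(n):.1f}-fold change")
```

Without cooperativity the regulator concentration must change 81-fold to switch the site from mostly empty to mostly full; with a Hill coefficient of 2 a 9-fold change suffices, and higher cooperativity narrows the range further.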
Antitermination and Beyond: Not All of Gene Regulation Targets Transcription Initiation
As stated at the beginning of this chapter, the bulk of gene regulation takes place at the initiation of transcription. This is true in eukaryotes just as it is in bacteria. But regulation is certainly not restricted to that step in either class of organism. In this chapter we will see examples, in bacteria, of gene regulation that involve transcriptional elongation, RNA processing, and translation of the mRNA into protein.
REGULATION OF TRANSCRIPTION INITIATION: EXAMPLES FROM BACTERIA
Having outlined basic principles of transcriptional regulation, we turn to some examples that show how these principles work in real cases. First, we consider the genes involved in lactose metabolism in E. coli, those of the lac operon. Here we will see how an activator and a repressor regulate expression in response to two signals. We also describe some of the experimental approaches that reveal how these regulators work.
An Activator and a Repressor Together Control the lac Genes

The three lac genes (lacZ, lacY, and lacA) are arranged adjacently on the E. coli genome and are called the lac operon (Figure 16-5). The lac promoter, located at the 5' end of lacZ, directs transcription of all three genes as a single mRNA (called a polycistronic message because it includes more than one gene); this mRNA is translated to give the three protein products. The lacZ gene encodes the enzyme β-galactosidase, which cleaves the sugar lactose into galactose and glucose, both of which are used by the cell as energy sources. The lacY gene encodes the lactose permease, a protein that inserts into the cell membrane and transports lactose into the cell. The lacA gene encodes thiogalactoside transacetylase, which rids the cell of toxic thiogalactosides that also get transported in by lacY.

These genes are expressed at high levels only when lactose is available and glucose (the preferred energy source) is not. Two regulatory proteins are involved: one is an activator called CAP, the other a repressor called the Lac repressor. Lac repressor is encoded by the lacI gene, which is located near the other lac genes but transcribed from its own (constitutively expressed) promoter. The name CAP stands for Catabolite Activator Protein, but this activator is also known as CRP (for cAMP Receptor Protein, for reasons we will explain later). The gene encoding CAP is located elsewhere on the bacterial chromosome, not linked to the lac genes. Both CAP and Lac repressor are DNA-binding proteins, and each binds to a specific site on DNA at or near the lac promoter (see Figure 16-5).

Each of these regulatory proteins responds to one environmental signal and communicates it to the lac genes. Thus, CAP mediates the effect of glucose, whereas Lac repressor mediates the lactose signal. This regulatory system works in the following way. Lac repressor can bind DNA and repress transcription only in the absence of lactose. In the presence of that sugar, the repressor is inactive and the genes are de-repressed (expressed). CAP can bind DNA and activate the lac genes only in the absence of glucose. Thus, the combined effect of these two regulators ensures that the genes are expressed at significant levels only when lactose is present and glucose absent (Figure 16-6).

FIGURE 16-5 The lac operon. The three genes (lacZ, Y, and A) are transcribed as a single mRNA from the promoter (as indicated by the arrow). The CAP site and the operator are each about 20 bp. The operator lies within the region bound by RNA polymerase at the promoter, and the CAP site lies just upstream of the promoter (see Figure 16-8 for more details of the relative arrangements of these binding sites, and the text for a description of the proteins that bind to them). The picture is simplified in that there are two additional, weaker, lac operators located nearby (see Figure 16-13), but we do not need to consider those at present.

FIGURE 16-6 Expression of the lac genes. The presence or absence of the sugars lactose and glucose controls the level of expression of the lac genes. High levels of expression require the presence of lactose (and hence the absence of functional Lac repressor) and absence of the preferred energy source, glucose (and hence presence of the activator CAP). When bound to the operator, Lac repressor excludes polymerase whether or not active CAP is present. CAP and Lac repressor are shown as single units, but CAP actually binds DNA as a dimer, and Lac repressor binds as a tetramer (see Figure 16-13). CAP recruits polymerase to the lac promoter, where it spontaneously undergoes isomerization to the open complex (the state shown in the bottom line).
[Figure panels: basal level of transcription; no transcription; activated level of transcription.]
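The two-signal logic described above can be written out as a small truth table. The sketch below is only an illustration of the qualitative rules stated in the text and in Figure 16-6; the function name and the three level labels are assumptions made for the example, and the model ignores the low-level leakiness of repression discussed later in the chapter.

```python
def lac_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative transcription level of the lac genes (hypothetical helper)."""
    # Lac repressor binds the operator only when lactose is absent.
    repressor_bound = not lactose_present
    # CAP binds its site (and can activate) only when glucose is absent.
    cap_bound = not glucose_present

    if repressor_bound:
        # Repressor excludes polymerase whether or not CAP is active.
        return "none"
    if cap_bound:
        return "activated"
    return "basal"


if __name__ == "__main__":
    for lactose in (False, True):
        for glucose in (False, True):
            level = lac_expression(lactose, glucose)
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only one of the four combinations (lactose present, glucose absent) gives the activated level, which is the point of integrating the two signals.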
CAP and Lac Repressor Have Opposing Effects on RNA Polymerase Binding to the lac Promoter
As we have seen, the site bound by Lac repressor is called the lac operator. This 21 bp sequence is twofold symmetric and is recognized by two subunits of Lac repressor, one binding to each half-site (see Figure 16-7). We will look at that binding in more detail later in this chapter, in the section "CAP and Lac Repressor Bind DNA Using a Common Structural Motif." How does repressor, when bound to the operator, repress transcription?
FIGURE 16-7 The symmetric half-sites of the lac operator.
[Figure labels: "half-site"; "half-site"; lac operator.]
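Twofold symmetry of a binding site means that the sequence reads (nearly) the same as its own reverse complement. The following sketch counts how many positions of a site match its reverse complement. The 21 bp sequence used here is the commonly cited sequence of the primary lac operator and should be treated as an assumption of this example, as should the helper names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def symmetry_matches(seq: str) -> int:
    """Count positions at which seq equals its own reverse complement."""
    return sum(a == b for a, b in zip(seq, reverse_complement(seq)))


# Assumed sequence of the primary lac operator (21 bp), for illustration only.
LAC_O1 = "AATTGTGAGCGGATAACAATT"
# A perfectly symmetric site for comparison (the EcoRI recognition sequence).
PERFECT = "GAATTC"

if __name__ == "__main__":
    for name, site in (("perfect palindrome", PERFECT), ("lac operator", LAC_O1)):
        print(f"{name}: {symmetry_matches(site)} of {len(site)} positions symmetric")
```

In a site with an odd number of base pairs the central position can never match its own complement, so a natural operator of this kind is at best a near repeat.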
FIGURE 16-8 The control region of the lac operon. The nucleotide sequence and organization of the lac operon control region are shown. The colored bars above and below the DNA show regions covered by RNA polymerase and the regulatory proteins. Note that Lac repressor covers more DNA than that sequence defined as the minimal operator binding site, and RNA polymer...
[Figure: double-stranded nucleotide sequence of the control region, labeled with the CAP-binding site, the DNA covered by RNA polymerase, the -35 and -10 regions, the +1 start site of the mRNA, and the DNA covered by repressor.]
The lac operator overlaps the promoter, and so repressor bound to the operator physically prevents RNA polymerase from binding to the promoter and thus initiating RNA synthesis (see Figure 16-8). Protein binding sites in DNA can be identified, and their location mapped, using the DNA footprinting and gel mobility assays described in Box 16-1, Detecting DNA-Binding Sites.

Box 16-1 Detecting DNA-Binding Sites

DNA Footprinting

How can a protein binding site in DNA, such as an operator, be identified? A series of powerful approaches allows identification of the sites where proteins act and the chemical groups in DNA (methyl, amino, or phosphate) a protein contacts. The basic principle that underlies these methods is as follows. If a DNA fragment is labeled with a radioactive atom only at one end ...
isolated fragments will not be broken at this site by subsequent chemical treatment. By using all three methods, we can learn where a protein makes specific contacts both with bases and with the phosphates in the sugar-phosphate backbone of DNA.

Gel Mobility Shift Assay
As just noted, how far a DNA molecule migrates during gel electrophoresis varies with size: the smaller the molecule, the more easily it moves through the gel, and so the further it gets in a given time. In addition, if a given DNA molecule has a protein bound to it, migration of that DNA-protein complex through the gel is retarded compared to migration of the unbound DNA molecule. This forms the basis of an assay to detect specific DNA-binding activities.

The general approach is as follows. A short DNA fragment containing the binding site of interest is radioactively labeled so it can be detected in small quantities by polyacrylamide gel electrophoresis and autoradiography. This DNA "probe" is then mixed with the protein of interest and the mixture is run on a gel. If the protein binds to the probe, a band appears higher up the gel than bands formed from free DNA (see Box 16-1 Figure 2).

This method can be used to identify multiple proteins in a crude extract. Thus, if the probe has sites for a number of proteins found in a given cell type, and that probe is mixed with an extract of that cell type, multiple bands can often be resolved. This is because proteins of different size will affect migration of the DNA fragment to different extents: the larger the protein, the slower the migration. In this way, for example, the various transcriptional regulators that bind to the regulatory region of a given gene can be identified.
BOX 16-1 FIGURE 1 Footprinting method. The stars represent the radioactive labels at the ends of the DNA fragments, arrows indicate sites where DNase acts, and red circles represent Lac repressor bound to operator. On the left, DNA molecules cut at random by DNase are separated ...
[Figure labels: DNA fragment; fragments.]

BOX 16-1 FIGURE 2 Gel mobility shift assay. The principle of the mobility shift assay is shown schematically. A protein is mixed ...
[Figure labels: DNA fragment + DNA-binding protein; free DNA; bound DNA.]
As we have seen, RNA polymerase binds the lac promoter poorly in the absence of CAP, even when there is no active repressor present. This is because the sequence of the -35 region of the lac promoter is not optimal for its binding, and the promoter lacks an UP-element (see Chapter 12 and Figure 16-8). This is typical of promoters that are controlled by activators. CAP binds as a dimer to a site similar in length to the lac operator, but different in sequence. This site is located some 60 bp upstream of the start site of transcription (see Figure 16-8). When CAP binds to that site, the activator helps polymerase bind to the promoter by interacting with the enzyme and recruiting it to the promoter (see Figure 16-6). This cooperative binding stabilizes the binding of polymerase to the promoter. We now look at CAP-mediated activation in more detail.
CAP Has Separate Activating and DNA-Binding Surfaces

Various experiments support the view that CAP activates the lac genes by simple recruitment of RNA polymerase. Mutant versions of CAP have been isolated that bind DNA but do not activate transcription. The existence of these so-called positive control mutants demonstrates that, to activate transcription, the activator must do more than simply bind DNA near the promoter. Thus, activation is not caused by, for example, the activator changing local DNA structure. The amino acid substitutions in the positive control mutants identify the region of CAP that touches polymerase, called the activating region.

Where does the activating region of CAP touch RNA polymerase when activating the lac genes? This site is revealed by mutant forms of polymerase that can transcribe most genes normally, but cannot be activated by CAP at the lac genes. These mutants have amino acid substitutions in the C-terminal domain (CTD) of the α subunit of RNA polymerase. As we saw in Chapter 12, this domain is attached to the N-terminal domain (NTD) of α by a flexible linker. The αNTD is embedded in the body of the enzyme, but the αCTD extends out from it and binds the UP-element of the promoter (when that element is present) (see Figure 12-7). At the lac promoter, where there is no UP-element, αCTD binds to CAP and adjacent DNA instead (Figure 16-9). This picture is supported by a crystal structure of a complex containing CAP, αCTD, and a DNA oligonucleotide duplex containing a CAP site and an adjacent UP-element (Figure 16-10). In Box 16-2, Activator Bypass Experiments, we describe an experiment showing that activation of the lac promoter requires no more than polymerase recruitment.

Having seen how CAP activates transcription at the lac operon, and how Lac repressor counters that effect, we now look more closely at how those regulators recognize their DNA-binding sites.

FIGURE 16-9 Activation of the lac promoter by CAP. RNA polymerase binding at the lac promoter with the help of CAP. CAP is recognized by the CTDs of the α subunits. The αCTDs also contact DNA, adjacent to the CAP site, when interacting with CAP. As in Chapter 12, we use this representation of RNA polymerase when indicating specific points of contact between an activator and its target site on polymerase, or between regions of polymerase and the promoter.
[Figure labels: CAP site; -35; -10.]
FIGURE 16-10 Structure of CAP-αCTD-DNA complex. CAP is shown bound as a dimer to its site, just as we saw in Figure 5-18. In addition, in this case, the αCTD of RNA polymerase is shown bound to an adjacent stretch of DNA, and interacting with CAP. The site of interaction on each protein involves the residues identified genetically. In this figure, CAP is shown in turquoise and the αCTD of polymerase in purple. One molecule of cAMP is shown bound to each monomer of CAP. (Benoff B. et al. 2002. Science 297: 1562.) Image prepared with MolScript and Raster 3D.
CAP and Lac Repressor Bind DNA Using a Common Structural Motif

X-ray crystallography has been used to determine the structural basis of DNA binding for a number of bacterial activators and repressors, including CAP and the Lac repressor. Although the details differ, the basic mechanism of DNA recognition is similar for most bacterial regulators, as we now describe.

In the typical case, the protein binds as a homodimer to a site that is an inverted repeat (or near repeat). One monomer binds each half-site, with the axis of symmetry of the dimer lying over that of the binding site (as we saw for Lac repressor, Figure 16-7). Recognition of specific DNA sequences is achieved using a conserved region of secondary structure called a helix-turn-helix (Figure 16-11). This domain is composed of two α helices, one of which, the recognition helix, fits into the major groove of the DNA. As we discussed in Chapter 5, an α helix is just the right size to fit into the major groove, allowing amino acid
Box 16-2 Activator Bypass Experiments

If an activator has only to recruit polymerase to the gene, then other ways of bringing the polymerase to the gene should work just as well. This turns out to be true of the lac genes, as shown by the following experiments (Box 16-2 Figure 1).

In one experiment, another protein:protein interaction is used in place of that between CAP and polymerase. This is done by taking two proteins known to interact with each other, attaching one to a DNA-binding domain, and, with the other, replacing the C-terminal domain of the polymerase α subunit (αCTD). The modified polymerase can be activated by the makeshift "activator" as long as the appropriate DNA-binding site is introduced near the promoter.

In another experiment, the αCTD of polymerase is replaced with a DNA-binding domain (for example, that of CAP). This modified polymerase efficiently initiates transcription from the lac promoter in the absence of any activator, as long as the appropriate DNA-binding site is placed nearby.

A third experiment is even simpler: polymerase can transcribe the lac genes at high levels in vitro in the absence of any activator if the enzyme is present at high concentration. So we see that either recruiting polymerase artificially or supplying it at a high concentration is sufficient to produce activated levels of expression of the lac genes. These experiments are consistent with the activator having only to help polymerase bind to the promoter. For an explanation of why simply increasing the concentration of a protein (for example, RNA polymerase) helps it bind to a site on DNA (in this case the promoter), see Box 16-5. The results discussed in the box would not be expected if the activator had to induce a specific allosteric change in polymerase to activate transcription.
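The equivalence between recruitment and high polymerase concentration can be illustrated with a simple equilibrium-binding sketch. This is not a model taken from the text: it assumes that promoter occupancy follows a single-site binding curve, and that recruitment by an activator acts only by lowering the effective dissociation constant by a fixed factor. All numerical values and names are hypothetical.

```python
def promoter_occupancy(polymerase_conc: float, kd: float) -> float:
    """Fraction of promoters occupied for simple single-site equilibrium binding."""
    return polymerase_conc / (polymerase_conc + kd)


# Hypothetical values in arbitrary concentration units.
KD_UNAIDED = 100.0          # weak promoter: polymerase binds poorly on its own
RECRUITMENT_FACTOR = 100.0  # assumed stabilization contributed by the activator
NORMAL_CONC = 1.0
HIGH_CONC = 100.0

unaided = promoter_occupancy(NORMAL_CONC, KD_UNAIDED)
recruited = promoter_occupancy(NORMAL_CONC, KD_UNAIDED / RECRUITMENT_FACTOR)
high_conc = promoter_occupancy(HIGH_CONC, KD_UNAIDED)

if __name__ == "__main__":
    print(f"no activator, normal concentration: {unaided:.3f}")
    print(f"with activator (recruitment):       {recruited:.3f}")
    print(f"no activator, high concentration:   {high_conc:.3f}")
```

Under these assumptions, raising the polymerase concentration by the same factor that the activator contributes gives the same occupancy, which is the behavior expected if the activator does nothing more than help polymerase bind.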
BOX 16-2 FIGURE 1 Two activator bypass experiments. (a) The αCTD is replaced by a protein X, which interacts with protein Y. Protein Y is fused to a DNA-binding domain, and the site recognized by that domain is shown placed near the lac genes. (b) The αCTD is replaced by the DNA-binding portion of CAP.
[Figure labels: (a) DNA-binding site; -35; -10; activated transcription. (b) CAP site; -35; -10; activated transcription.]
residues on its outer face to interact with chemical groups on the edges of base pairs. Recall that in Chapter 6 we saw how each base pair presents a characteristic pattern of hydrogen bonding acceptors and donors (Figure 6-10). Thus, a protein can distinguish different DNA sequences in this way without unwinding the DNA duplex (see Figure 16-11).

The contacts made between the amino acid side chains protruding from the recognition helix and the edges of the bases can be mediated by direct H-bonds, indirect H-bonds (bridged by water molecules), or Van der Waals forces. The nature of these bonds is discussed in Chapter 3, and their roles in DNA recognition in Chapters 5 and 6. Figure 16-12 illustrates an example of the interactions made by a given recognition helix and its DNA-binding site.
FIGURE 16-11 Binding of a protein with a helix-turn-helix domain to DNA. The protein, as is typically the case, binds as a dimer, and the two subunits are indicated by the shaded circles. The helix-turn-helix motif on each monomer is indicated; the "recognition helix" is labeled R.
[Figure label: DNA-binding site.]
FIGURE 16-12 Hydrogen bonds between λ repressor and base pairs in the major groove of its operator. Diagram of the repressor-operator complex, showing hydrogen bonds (in dotted lines) between amino acid side chains and bases in the consensus half-site. Only the relevant amino acid side chains are shown. In addition to Gln44 and Ser45 in the recognition helix, Asn55 in the loop following the recognition helix also makes contact with a specific base. Furthermore (and unusual to this case, see later in the text), Lys4 in the N-terminal arm of the protein makes a contact in the major groove on the opposite face of the DNA helix. Gln33 contacts the backbone. (Source: Redrawn from Jordan S. and Pabo C. Science 242: 896, Fig. 3B.)
The second helix of the helix-turn-helix domain sits across the major groove and makes contact with the DNA backbone, ensuring proper presentation of the recognition helix, and at the same time adding binding energy to the overall protein-DNA interaction. This description is essentially true not only for CAP (Figures 5-18 and 16-10) and Lac repressor, but for many other bacterial regulators as well, including the phage λ repressor and Cro proteins we will encounter in a later section; there are differences in detail, as the following examples illustrate.

• Lac repressor binds as a tetramer, not a dimer. Nevertheless, each operator is contacted by only two of these subunits. Thus, the different oligomeric form does not alter the mechanism of DNA recognition. The other two monomers within the tetramer can bind one of two other lac operators, located 400 bp downstream and 90 bp upstream of the primary operator. In such cases, the intervening DNA loops out to accommodate the reaction (Figure 16-13).

• In some cases, other regions of the protein, outside the helix-turn-helix domain, also interact with the DNA. λ repressor, for example, makes additional contacts using N-terminal arms. These reach around the DNA and interact with the minor groove on the back face of the helix (see Figure 16-12).

• In many cases, binding of the protein does not alter the structure of the DNA. In some cases, however, various distortions are seen in the protein-DNA complex. For example, CAP induces a dramatic bend in the DNA, partially wrapping it around the protein. This is caused by other regions of the protein, outside the helix-turn-helix domain, interacting with sequences outside the operator. In other cases, binding results in twisting of the operator DNA.

Not all prokaryotic repressors bind using a helix-turn-helix. A few have been described that employ quite different approaches. A striking example is the Arc repressor from phage P22 (a phage related to λ, but one that infects Salmonella). The Arc repressor binds as a dimer to an inverted repeat operator, but instead of an α helix, it recognizes its binding site using two antiparallel β-strands inserted into the major groove.

FIGURE 16-13 Lac repressor binds as a tetramer to two operators. The loop shown is between the Lac repressor bound at the primary operator and the upstream auxiliary one. A similar loop can alternatively form with the downstream operator. The primary operator (the one shown against the promoter) is the operator referred to in discussion of regulation of lac gene expression. In this figure, each repressor dimer is shown as two circles, rather than as a single oval (as used in earlier figures).
[Figure labels: lac operator; promoter; operator.]
The Activities of Lac Repressor and CAP Are Controlled Allosterically by their Signals

When lactose enters the cell, it is converted to allolactose. It is allolactose (rather than lactose itself) that controls Lac repressor. Paradoxically, the conversion of lactose to allolactose is catalyzed by β-galactosidase, itself encoded by one of the lac genes. How is this possible? The answer is that expression of the lac genes is leaky: even when they are repressed, an occasional transcript gets made. That happens because every so often RNA polymerase will manage to bind the promoter in place of Lac repressor. This leakiness ensures there is a low level of β-galactosidase in the cell even in the absence of lactose, and so there is enzyme poised to catalyze the conversion of lactose to allolactose.

Allolactose binds to Lac repressor and triggers a change in the shape (conformation) of that protein. In the absence of allolactose, repressor is present in a form that binds its site on DNA (and so keeps the lac genes switched off). Once allolactose has altered the shape of repressor, the protein can no longer bind DNA, and so the lac genes are no longer repressed. In Chapter 5 we described the structural basis of this allosteric change in Lac repressor (Figure 5-25). An important point to emphasize is that allolactose binds to a part of Lac repressor distinct from its DNA-binding domain.

CAP activity is regulated in a similar manner. Glucose lowers the intracellular concentration of a small molecule, cAMP. This molecule is the allosteric effector for CAP; only when CAP is complexed with cAMP does the protein adopt a conformation that binds DNA. Thus, only when glucose levels are low (and cAMP levels high) does CAP bind DNA and activate the lac genes. The part of CAP that binds the effector, cAMP, is separate from the part of the protein that binds DNA.

The lac operon of E. coli is one of the two systems used by French biologists François Jacob and Jacques Monod in formulating the early ideas about gene regulation. In Box 16-3, Jacob, Monod, and the Ideas Behind Gene Regulation, we give a brief description of those early studies and why the ideas they generated have proved so influential.
Box 16-3 Jacob, Monod, and the Ideas Behind Gene Regulation

The idea that the expression of a gene can be controlled by the product of another gene (that there exist regulatory genes, the sole function of which is regulating the expression of other genes) was one of the great insights from the early years of molecular biology. It was proposed by a group of scientists working in Paris in the 1950s and early 1960s, in particular François Jacob and Jacques Monod. They sought to explain two apparently unrelated phenomena: the appearance of β-galactosidase in E. coli grown in lactose, and the behavior of the bacterial virus (bacteriophage) λ upon infection of E. coli. Their work culminated in publication of their operon model in 1961 (and the 1965 Nobel Prize for medicine, which they shared with their colleague, André Lwoff).

It is difficult to appreciate the magnitude of their achievement now that we are so familiar with their ideas and have such direct ways of testing their models. To put it in perspective, consider what was ...
mutants in which the enzyme was produced constitutively. These mutants came in two classes: in one, the gene encoding the Lac repressor was inactivated; in the other, the operator site was defective. These two classes could be distinguished using a cis-trans test, as we now describe.

Jacob and Monod constructed partially diploid cells (see Chapter 21) in which a section of the chromosome from a wild-type cell carrying the lac genes (that is, the Lac repressor gene, lacI, the genes of the lac operon, and their regulatory elements) was introduced (on a plasmid called an F') into a cell carrying a mutant version of the lac genes on its chromosome. This transfer resulted in the presence of two copies of the lac genes in the cell, making it possible to test whether the wild-type copy could complement any given mutant copy.

When the chromosomal genes were expressed constitutively because of a mutation in the lacI gene (encoding repressor), the wild-type copy on the plasmid restored repression (and inducibility): for example, β-galactosidase was once again only made when lactose was present (Box 16-3 Figure 1). This is because the repressor made from the wild-type lacI gene on the plasmid could diffuse to the chromosome; that is, it could act in trans. When the mutation causing constitutive expression of the chromosomal genes was in the lac operator, it could not be complemented in trans by the wild-type genes (Box 16-3 Figure 2). The operator functions only in cis (that is, it only acts on the genes directly linked to it on the same DNA molecule).

These and other results led Jacob and Monod to propose that genes were expressed from specific sites called promoters found at the start of the gene, and that this expression was regulated by repressors that act through operator sites located on the DNA beside the promoter.
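The reasoning behind the cis-trans test can be sketched as a small model. The class and function names below are hypothetical, and the model is deliberately reduced to the two rules stated in the text: repressor made from any functional lacI copy acts on every copy (trans), whereas an operator affects only the genes on its own DNA molecule (cis). The effect of glucose and CAP is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LacCopy:
    """One copy of the lac region (chromosome or F' plasmid); hypothetical model."""
    repressor_gene_ok: bool  # True if this copy's lacI makes functional repressor
    operator_ok: bool        # True if this copy's operator can bind repressor


def lacz_expressed(copies: List[LacCopy], lactose_present: bool) -> List[bool]:
    """Return, for each copy, whether its lacZ is expressed."""
    # Trans: repressor from any copy diffuses through the cell,
    # and it is active only in the absence of lactose.
    active_repressor = (
        any(c.repressor_gene_ok for c in copies) and not lactose_present
    )
    # Cis: a copy is repressed only if its own operator can bind the repressor.
    return [not (active_repressor and c.operator_ok) for c in copies]


if __name__ == "__main__":
    wild_type = LacCopy(repressor_gene_ok=True, operator_ok=True)
    laci_mutant = LacCopy(repressor_gene_ok=False, operator_ok=True)
    operator_mutant = LacCopy(repressor_gene_ok=True, operator_ok=False)

    print("lacI mutant alone, no lactose:",
          lacz_expressed([laci_mutant], False))
    print("lacI mutant + wild type, no lactose:",
          lacz_expressed([laci_mutant, wild_type], False))
    print("operator mutant + wild type, no lactose:",
          lacz_expressed([operator_mutant, wild_type], False))
```

In this sketch the wild-type copy rescues the lacI mutant but not the operator mutant, mirroring the two outcomes that distinguished the mutant classes.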
BOX 16-3 FIGURE 1 Partial diploid cells show that functional repressors work in trans. In the absence of lactose, the lac genes are not e...
[Figure labels: wild-type chromosome; active repressors; mutant chromosome; no transcription; inactive.]
BOX 16-3 FIGURE 2 Partial diploid cells show that operators work only in cis. (a) Haploid cell containing mutant operator (Oc). (b) Partially diploid cell containing a normal operator (O) and a mutant operator (Oc). The lac genes (Z, Y, and A) attached to the mutant operator continue to be expressed constitutively even in the presence of a wild-type operator on another chromosome in the same cell. Thus, the operator only works in cis.
[Figure labels: (a) mutant chromosome; Oc; Z, Y, A. (b) wild-type chromosome; mutant chromosome; Z, Y, A.]
But these experiments with the lac system were not carried out in isolation; in parallel, Jacob and Monod did similar experiments on phage λ (a system we consider in detail later in this chapter). Phage λ can propagate through either of two life cycles. Which one is chosen depends on which of the relevant phage genes are expressed. The French scientists found they could isolate mutants defective in controlling gene expression in this system just as they had in the lac case. These mutations again defined a repressor that acted in trans through cis-acting operator sites.

The similarity of these two regulatory systems (despite the very different biology) convinced Jacob and Monod that they had identified a fundamental mechanism of gene regulation and that their model would apply throughout nature. As we will see, although their description was not complete (most noticeably, they did not include activators, such as CAP, in their scheme), the basic model they proposed of cis regulatory sites recognized by trans regulatory factors has dominated the majority of subsequent thinking about gene regulation.
BOX 16-3 FIGURE 3 This drawing, showing the lac operon and its regulation, was rendered by François Jacob, 1001. (Source: Courtesy of Jan Witkowski.)
Combinatorial Control: CAP Controls Other Genes As Well

The lac genes provide an example of signal integration: their expression is controlled by two signals, each of which is communicated to the genes via a single regulator (the Lac repressor and CAP, respectively). Consider another set of E. coli genes, the gal genes. These encode enzymes involved in galactose metabolism. As with the lac genes, the gal genes are only expressed when their substrate sugar (in this case galactose) is present, and the preferred energy source, glucose, is absent. Again, analogous to lac, the two signals are communicated to the genes via two regulators: an activator and a repressor. The repressor, encoded by the gene galR, mediates the effects of the inducer galactose, but the activator of the gal genes is again CAP. Thus, a regulator (CAP) works together with different repressors at different genes. This is an example of combinatorial control. In fact, CAP acts at more than 100 genes in E. coli, working with an array of partners.

Combinatorial control is a characteristic feature of gene regulation. Thus, when the same signal controls multiple genes, it is typically communicated to each of those genes by the same regulatory protein. That regulator will be communicating just one of perhaps several signals involved in regulating each gene; the other signals, different in most cases, will each be mediated by a separate regulator. More complex organisms (higher eukaryotes in particular) tend to have more signal integration, and there we will see greater and more elaborate examples of combinatorial control (Chapter 17).
Alternative σ Factors Direct RNA Polymerase to Alternative Sets of Promoters

Recall from Chapter 12 that it is the σ subunit of RNA polymerase that recognizes the promoter sequences (Figure 12-6). The lac promoter we have been discussing, along with the bulk of other E. coli promoters, is recognized by RNA polymerase bearing the σ70 subunit. E. coli encodes several other σ subunits that can replace σ70 under certain circumstances and direct the polymerase to alternative promoters. One of these alternatives is the heat shock σ factor, σ32. Thus, when E. coli is subject to heat shock, the amount of this new σ factor increases in the cell, it displaces σ70 from a proportion of RNA polymerases, and directs those enzymes to transcribe genes whose products protect the cell from the effects of heat shock. The level of σ32 is increased by two mechanisms: first, its translation is stimulated (that is, its mRNA is translated with greater efficiency after heat shock than it was before); and second, the protein is transiently stabilized. Another example of an alternative σ factor, σ54, is considered in the next section. σ54 is associated with a small fraction of the polymerase molecules in the cell and directs that enzyme to genes involved in nitrogen metabolism.

Sometimes a series of alternative sigmas directs a particular program of gene expression. Two examples are found in the bacterium B. subtilis. We consider the most elaborate of these, which controls sporulation in that organism, in Chapter 18. The other we describe briefly here. Bacteriophage SPO1 infects B. subtilis, where it grows lytically to produce progeny phage. This process requires that the phage express its genes in a carefully controlled order. That control is imposed on polymerase by a series of alternative σ factors. Thus, upon infection, the bacterial RNA polymerase (bearing the B. subtilis version of σ70) recognizes so-called "early" phage promoters, which direct transcription of genes that encode proteins needed early in infection. One of these genes (called gene 28) encodes an alternative σ. This displaces the bacterial σ factor and directs the polymerase to a second set of promoters in the phage genome, those associated with the so-called "middle" genes. One of these genes, in turn, encodes the σ factor for the phage "late" genes (Figure 16-14).
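The ordered program described above is a cascade in which each class of genes supplies the σ factor needed to read the next class. A minimal sketch of that logic follows. The dictionary layout and the labels "host sigma" and "late sigma" are hypothetical names chosen for the example; only the gene 28 product is named in the text.

```python
from typing import Dict, List, Optional

# Each gene class is read by one sigma factor and may encode the next one.
GENE_CLASSES: Dict[str, Dict[str, Optional[str]]] = {
    "early": {"read_by": "host sigma", "encodes": "gene 28 sigma"},
    "middle": {"read_by": "gene 28 sigma", "encodes": "late sigma"},
    "late": {"read_by": "late sigma", "encodes": None},
}


def expression_order(initial_sigma: str) -> List[str]:
    """Follow the cascade from the sigma factor present at infection."""
    order: List[str] = []
    sigma: Optional[str] = initial_sigma
    while sigma is not None:
        next_sigma = None
        for name, info in GENE_CLASSES.items():
            if info["read_by"] == sigma and name not in order:
                order.append(name)
                next_sigma = info["encodes"]
        sigma = next_sigma
    return order


if __name__ == "__main__":
    print(expression_order("host sigma"))
```

Starting from a σ factor that reads none of the phage promoters yields an empty program in this sketch, which reflects the dependence of each step on the one before it.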
NtrC and MerR: Transcriptional Activators that Work by Allostery Rather than by Recruitment

Although the majority of activators work by recruitment, there are exceptions. Two examples of activators that work not by recruitment but by allosteric mechanisms are NtrC and MerR. Recall what we mean by an allosteric mechanism. Activators that work by recruitment simply bring an active form of RNA polymerase to the promoter. In the case of activators that work by allosteric mechanisms, polymerase initially binds the promoter in an inactive complex. To activate transcription, the activator triggers an allosteric change in that complex.

NtrC controls expression of genes involved in nitrogen metabolism, such as the glnA gene. At the glnA gene, RNA polymerase is prebound to the promoter in a stable closed complex. The activator NtrC induces a conformational change in the enzyme, triggering transition to the open complex. Thus the activating event is an allosteric change in RNA polymerase (see Figure 16-2).

MerR controls a gene called merT, which encodes an enzyme that makes cells resistant to the toxic effects of mercury. MerR also acts on an inactive RNA polymerase-promoter complex. Like NtrC, MerR induces a conformational change that triggers open complex formation. In this case, however, the allosteric effect of the activator is on the DNA rather than the polymerase.
NtrC Has ATPase Activity and Works from DNA Sites Far from the Gene

As with CAP, NtrC has separate activating and DNA-binding domains and binds DNA only in the presence of a specific signal. In the case of NtrC, that signal is low nitrogen levels. Under those conditions, NtrC is phosphorylated by a kinase, NtrB, and as a result undergoes a conformational change that reveals the activator's DNA-binding domain. Once
FIGURE 16-14 Alternative σ factors control the ordered expression of genes in a bacterial virus. The b... (Source: Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 415, fig 7-63. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)
[Figure labels: early genes; middle genes; late genes.]
active, NtrC binds four sites located some 150 base pairs upstream of the promoter. NtrC binds to each of its sites as a dimer and, through protein:protein interactions between the dimers, binds to the four sites in a highly cooperative manner.

The form of RNA polymerase that transcribes the glnA gene contains the σ54 subunit. This enzyme binds to the glnA promoter in a stable, closed complex in the absence of NtrC. Once active, NtrC (bound to its sites upstream) interacts directly with σ54. This requires that the DNA between the activator binding sites and the promoter form a loop to accommodate the interaction (Figure 16-15). If the NtrC binding sites are moved further upstream (as much as 1 to 2 kb), the activator can still work.

NtrC itself has an enzymatic activity: it is an ATPase. This activity provides the energy needed to induce a conformational change in polymerase. That conformational change triggers polymerase to initiate transcription. Specifically, it stimulates conversion of the stable, inactive, closed complex to an active, open complex.

At some genes controlled by NtrC, there is a binding site for another protein, called IHF, located between the NtrC binding sites and the promoter. Upon binding, IHF bends DNA; when the IHF binding site (and hence the DNA bend) is in the correct register, this event increases activation by NtrC. The explanation is that, by bending the DNA, IHF brings the DNA-bound activator closer to the promoter, helping the activator interact with the polymerase bound there (see Figure 16-4 and, for a closer look at how IHF bends DNA, see Figure 11-10).
MerR Activates Transcription by Twisting Promoter DNA
When bound to a single DNA-binding site, in the presence of mercury, MerR activates the merT gene. As shown in Figure 16-16, MerR binds to a sequence located between the -10 and -35 regions of the merT promoter (this gene is transcribed by σ70-containing polymerase). MerR binds on the opposite face of the DNA helix from that bound by RNA polymerase, and so polymerase can (and does) bind to the promoter at the same time as MerR. The merT promoter is unusual. The distance between the -10 and -35 elements is 19 bp instead of the 15 to 17 bp typically found in a σ70 promoter (see Chapter 12, Box 12-1). As a result, these two sequence elements recognized by σ are neither optimally separated nor aligned; they are somewhat rotated around the face of the helix in relation to each other. Furthermore, the binding of MerR (in the
FIGURE 16-15 Activation by NtrC. The promoter sequence recognized by σ54-containing holoenzyme is different from that recognized by σ70-containing holoenzyme. Although not specified in the figure, NtrC contacts the σ54 subunit of polymerase. NtrC is shown as a dimer, but in fact forms a higher-order complex on DNA. Figure label: activated level of transcription.
FIGURE 16-16 Activation by MerR. The -10 and -35 elements of the merT promoter lie on nearly opposite sides of the helix. (a) In the absence of mercury, MerR binds and stabilizes the inactive form of the promoter. (b) In the presence of mercury, MerR twists the DNA so as to properly align the promoter elements.
absence of Hg2+) locks the promoter in this unpropitious conformation: polymerase can bind, but not in a manner that allows it to initiate transcription. Therefore, there is no basal transcription. When MerR binds Hg2+, however, the protein undergoes a conformational change that causes the DNA in the center of the promoter to twist. This structural distortion restores the disposition of the -10 and -35 regions to something close to that found at a strong σ70 promoter. In this new configuration, RNA polymerase can efficiently initiate transcription. The structures of promoter DNA in the "active" and "inactive" states have been determined (for another promoter regulated in this manner) and are shown in Figure 16-17. It is important to note that in this example the activator does not interact with RNA polymerase to activate transcription, but instead alters the conformation of the DNA in the vicinity of the prebound enzyme. Thus, unlike the earlier cases, there is no separation of DNA binding and activating regions: for MerR, DNA binding is intimately linked to the activation process.
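The geometric consequence of a longer spacer can be estimated with a little arithmetic. The sketch below is only an illustration: it assumes idealized B-form DNA with about 10.5 base pairs per helical turn (a value not stated in this passage), and the function name `angular_offset` is hypothetical. It reports how far the -35 element is rotated around the helix axis, relative to its position in a promoter with a 17 bp spacer.

```python
# Assumption (not from the text): idealized B-form DNA, ~10.5 bp per helical turn.
BP_PER_TURN = 10.5
DEGREES_PER_BP = 360.0 / BP_PER_TURN  # roughly 34.3 degrees of rotation per base pair


def angular_offset(spacer_bp: int, reference_bp: int = 17) -> float:
    """Rotation in degrees of the -35 element around the helix axis,
    relative to a promoter whose spacer is reference_bp long."""
    return (spacer_bp - reference_bp) * DEGREES_PER_BP


if __name__ == "__main__":
    for spacer in (15, 17, 19):
        print(f"{spacer} bp spacer: {angular_offset(spacer):+.1f} degrees relative to 17 bp")
```

Under these assumptions the two extra base pairs of the merT spacer correspond to a rotation of roughly 69 degrees, which gives a sense of the scale of misalignment that twisting by MerR would have to correct.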
Some Repressors Hold RNA Polymerase at the Promoter Rather than Excluding It
Lac repressor works in the simplest possible way: by binding to a site overlapping the promoter, it blocks RNA polymerase binding. Many repressors work in that same way. In the MerR case, we saw a different form of repression; in that case the protein holds the promoter in a conformation incompatible with transcription initiation. There are other ways repressors can work, one of which we now consider.
FIGURE 16-17 Structure of a merT-like promoter. (a) Promoter with a 19 bp spacer. (b) Promoter with a 19 bp spacer when in complex with active activator. (c) Promoter with a 17 bp spacer. The promoter shown in parts (a) and (b) is from the bmr gene of Bacillus subtilis, which is controlled by the regulator BmrR. BmrR works as an activator when complexed with the drug tetraphenylphosphonium (TPP). The -35 (TTGACT) and -10 (TACAGT) elements of one strand are shown in pink and green, respectively. (Source: Adapted, with permission, from Zheleznova Heldwein E.E. and Brennan R.G. 2001. Nature 409: 378; Figure 3 b, c, d. Copyright © 2001 Nature Publishing Group.)
Some repressors work from binding sites that do not overlap the promoter. Those repressors do not block polymerase binding; rather, they bind to sites beside a promoter, interact with polymerase bound at that promoter, and inhibit initiation. One is the E. coli Gal repressor, which we mentioned earlier. The Gal repressor controls genes that encode enzymes involved in galactose metabolism; in the absence of galactose the repressor keeps the genes off. In this case, the repressor interacts with the polymerase in a manner that inhibits transition from the closed to open complex. Another example is provided by the P4 protein from a bacteriophage (φ29) that grows on the bacterium B. subtilis. This regulator binds to a site adjacent to one promoter (a weak promoter called PA3) and, by interacting with polymerase, serves as an activator. The interaction is with the αCTD, just as we saw with CAP. But this activator also binds at another promoter, a strong promoter called PA2c. Here it makes the same contact with polymerase as at the weak promoter, but the result is repression. It seems that whereas in the former case the extra binding energy helps recruit polymerase, and hence activates the gene, in the latter case the overall binding energy (provided by the strong interactions between the polymerase and the promoter and the additional interaction provided by the activator) is so strong that the polymerase is unable to escape the promoter.
AraC and Control of the araBAD Operon by Antiactivation
The promoter of the araBAD operon from E. coli is activated in the presence of arabinose and the absence of glucose and directs expression of genes encoding enzymes required for arabinose metabolism. Two activators work together here: AraC and CAP. When arabinose is present, AraC binds that sugar and adopts a configuration that allows it to bind DNA as a dimer to the adjacent half-sites, araI1 and araI2 (Figure 16-18a).
FIGURE 16-18 Control of the araBAD operon. (a) Arabinose binds to AraC, changing the shape of that activator so it binds as a dimer to araI1 and araI2. This places one monomer of AraC close to the promoter from which it can activate transcription. (b) In the absence of arabinose, the AraC dimer adopts a different conformation
Just upstream of these (but not shown in the figure) is a CAP site: in the absence of glucose, CAP binds here and helps activation. In the absence of arabinose, the araBAD genes are not expressed. This is because, when not bound to arabinose, AraC adopts a different conformation and binds DNA in a different way: one monomer still binds the araI1 site, but the other monomer binds a distant half-site called araO2, as shown in Figure 16-18b. As these two half-sites are 194 bp apart, when AraC binds in this fashion the DNA between the two sites forms a loop. Also, when bound in this way, there is no monomer of AraC at araI2, and as that is the position from which activation of the araBAD promoter is mediated, there is no activation in this configuration. The magnitude of induction of the araBAD promoter by arabinose is very large, and for this reason the promoter is often used in expression vectors. Expression vectors are DNA constructs in which efficient synthesis of any protein can be ensured by fusing its gene to a strong promoter (see Chapter 20). In this case, fusing a gene to the araBAD promoter allows expression of the gene to be controlled by arabinose: the gene can be kept off until expression is desirable, and then "induced" when its product is wanted simply by addition of arabinose. This allows expression even of genes with products that are toxic to the bacterial cells.
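The qualitative logic of araBAD control described above can be summarized as a small truth table. The following is a minimal sketch, not a quantitative model: the function name `arabad_state` and the dictionary keys are hypothetical, and the all-or-none outcome simply restates the text (activation requires arabinose present and glucose absent).

```python
def arabad_state(arabinose: bool, glucose: bool) -> dict:
    """Qualitative state of the araBAD promoter, following the description in the text."""
    if arabinose:
        # AraC dimer occupies the adjacent half-sites and can activate.
        arac_sites = ("araI1", "araI2")
    else:
        # One monomer stays at araI1, the other binds the distant araO2 half-site.
        arac_sites = ("araI1", "araO2")
    cap_bound = not glucose                  # CAP binds its site only when glucose is absent
    arac_at_i2 = "araI2" in arac_sites       # activation is mediated from araI2
    return {
        "arac_sites": arac_sites,
        "dna_loop": "araO2" in arac_sites,   # the 194 bp between araI1 and araO2 loops out
        "cap_bound": cap_bound,
        "activated": arac_at_i2 and cap_bound,
    }


if __name__ == "__main__":
    for arabinose in (True, False):
        for glucose in (True, False):
            state = arabad_state(arabinose, glucose)
            print(f"arabinose={arabinose!s:5} glucose={glucose!s:5} -> {state}")
```

Treating the output as strictly on or off is a simplification made for the sketch; the text only states the conditions under which the promoter is activated.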
EXAMPLES OF GENE REGULATION AT STEPS AFTER TRANSCRIPTION INITIATION
Amino Acid Biosynthetic Operons Are Controlled by Premature Transcription Termination
In E. coli the five contiguous trp genes encode enzymes that synthesize the amino acid tryptophan. These genes are expressed efficiently only when tryptophan is limiting (Figure 16-19). The genes are controlled by a repressor, just as the lac genes are, but in this case the ligand that controls the activity of that repressor (tryptophan) acts not as an inducer but as a corepressor. That is, when tryptophan is present, it binds the Trp repressor and induces a conformational change in that protein, enabling it to bind the trp operator and prevent transcription. When the tryptophan concentration is low, the Trp repressor is free of its corepressor and vacates its operator, allowing the synthesis of trp mRNA to commence from the adjacent promoter.
FIGURE 16-19 The trp operon. The tryptophan operon of E. coli, showing the relation of the leader (see text) to the structural genes that code for the Trp enzymes. The gene products are anthranilate synthetase (product of trpE), phosphoribosyl anthranilate transferase (trpD), phosphoribosyl anthranilate isomerase-indole glycerol phosphate synthetase (trpC), tryptophan synthetase β (trpB), and tryptophan synthetase α (trpA).
Surprisingly, however, once polymerase has initiated a trp mRNA molecule it does not always complete the full transcript. Indeed, most messages are terminated prematurely before they include even the first trp gene (trpE), unless a second and novel device confirms that little tryptophan is available to the cell. This second mechanism overcomes the premature transcription termination, called attenuation. When tryptophan levels are high, RNA polymerase that has initiated transcription pauses at a specific site, and then terminates before getting to trpE, as we just described. But when tryptophan is limiting, polymerase does not terminate, and instead reads through the trp genes. Attenuation, and the way it is overcome, rely on the close link between transcription and translation in bacteria, and on the ability of RNA to form alternative structures through intramolecular base pairing, as we now describe.
The key to understanding attenuation came from examining the sequence of the 5' end of trp operon mRNA. This analysis revealed that 161 nucleotides of RNA are made from the tryptophan promoter before RNA polymerase encounters the first codon of trpE (Figures 16-19 and 16-20). Near the end of this leader sequence, and before trpE, is a transcription terminator, composed of a characteristic hairpin loop in the RNA (made from sequences in regions 3 and 4 of Figure 16-20), followed by eight uridine residues (see Figure 12-9). At this so-called attenuator, transcription usually stops (and, we might have thought, should always stop), yielding a leader RNA 139 nucleotides long (Figure 16-20). This is the RNA product seen in the presence of high levels of tryptophan. How, then, can mRNA for the whole operon ever be made, as is seen in the absence of tryptophan? Three features of the leader sequence allow the attenuator to be passed by RNA polymerase when the cellular concentration of tryptophan is low.
FIGURE 16-20 Trp operator leader RNA. Features of the nucleotide sequence of the trp operon leader RNA. Figure labels: leader peptide (beginning Met-Lys-Ala-Ile-Phe-Val-Leu-Lys-Gly); nucleotide 139, end of leader (site of attenuation); nucleotide 161, start of the trpE polypeptide.
• First, there is a second hairpin (besides the terminator hairpin) that can form between regions 1 and 2 of the leader (see Figure 16-20).
• Second, region 2 also is complementary to region 3; thus, yet another hairpin consisting of regions 2 and 3 can form, and when it does it prevents the terminator hairpin (3, 4) from forming.
• Third, the leader RNA contains an open-reading frame encoding a short leader peptide of 14 amino acids, and this open-reading frame is preceded by a strong ribosome binding site (see Figure 16-20).
The sequence encoding the leader peptide has a striking feature: two tryptophan codons in a row. Their importance is underscored by corresponding sequences found in similar leader peptides of other operons encoding enzymes that make amino acids (see Table 16-1). Thus, the leucine operon leader peptide has four adjacent leucine codons, and the histidine operon leader peptide has seven histidine codons in a row. In each case these operons are controlled by attenuation. The function of these codons is to stop a ribosome attempting to translate the leader peptide: thus, when tryptophan is scarce, there is very little charged tryptophan tRNA available, and the ribosome stalls when it reaches the tryptophan codons. Under those circumstances, RNA around the tryptophan codons is within the ribosome and cannot be part of a hairpin loop. (Recall that transcription and translation proceed simultaneously in bacteria.) The consequence of this is shown in Figure 16-21 and described below.
A ribosome caught at the tryptophan codons (part b) masks region 1, leaving region 2 free to pair with region 3; thus the terminator hairpin (formed by regions 3 and 4) cannot be made, and RNA polymerase passes the attenuator and moves on into the operon, allowing Trp enzyme expression. If, on the other hand, there is enough tryptophan (and therefore enough charged Trp tRNA) for the ribosome to proceed through the tryptophan codons, the ribosome blocks sequence 2 by the time RNA containing regions 3 and 4 has been made. A ribosome blocking region 2 allows formation of the terminator hairpin (from regions 3 and 4), aborting transcription at the end of the leader RNA. The leader peptide itself has no function and is in fact immediately destroyed by cellular proteases.
The use of both repression and attenuation to control expression allows a finer tuning of the level of intracellular tryptophan. It provides a two-stage response to progressively more stringent tryptophan starvation: the initial response is the cessation of repressor binding, with greater starvation leading to relaxation of attenuation. But attenuation alone can provide robust regulation: other amino acid operons like his and leu have no repressors; instead, they rely entirely on attenuation for their control. This example of attenuation shows that transcription of a gene can be regulated without the use of a regulatory protein. In Box 16-4, Riboswitches, we see other examples of regulation without regulatory proteins.
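The competition among the three possible leader hairpins can be sketched as a simple rule: regions pair in the order they are transcribed, and a region that is covered by the ribosome or already paired cannot pair again. The code below is a deliberately coarse illustration of that rule; the names `fold_leader`, `RIBOSOME_MASKS`, and `attenuation_outcome` are hypothetical, and the "no_translation" case follows the situation shown for no protein synthesis in Figure 16-21.

```python
def fold_leader(masked: set) -> list:
    """Return the hairpins that form, given the leader regions covered by the ribosome.

    Regions are considered in the order they emerge from RNA polymerase, so an
    upstream pairing (1:2) pre-empts the next one (2:3), and so on.
    """
    paired = set()
    hairpins = []
    for a, b in ((1, 2), (2, 3), (3, 4)):
        if not ({a, b} & (masked | paired)):
            hairpins.append((a, b))
            paired |= {a, b}
    return hairpins


# Which region the ribosome covers in each situation (as described in the text).
RIBOSOME_MASKS = {
    "stalled_at_trp_codons": {1},  # low tryptophan: ribosome masks region 1
    "translated_through": {2},     # high tryptophan: ribosome blocks region 2
    "no_translation": set(),       # no ribosome on the leader at all
}


def attenuation_outcome(ribosome_state: str) -> tuple:
    hairpins = fold_leader(RIBOSOME_MASKS[ribosome_state])
    if (3, 4) in hairpins:  # the 3:4 hairpin is the terminator
        return hairpins, "terminate at attenuator"
    return hairpins, "read through into trpE"


if __name__ == "__main__":
    for state in RIBOSOME_MASKS:
        print(state, "->", attenuation_outcome(state))
```

The sketch ignores kinetics entirely (for example, polymerase pausing and the timing of ribosome arrival), so it captures only the mutually exclusive pairing logic, not the efficiency of attenuation.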
Ribosomal Proteins Are Translational Repressors of their Own Synthesis
Regulation of translation often works in a manner analogous to transcriptional repression: a "repressor" binds to the translation start site and blocks initiation of that process. In some cases, this binding
TABLE 16-1 Leader Peptides of Attenuator-Controlled Operons Containing Genes for Amino Acid Biosynthesis*

Operon: Amino Acid Sequence of Leader Peptide
Tryptophan: Met Lys Ala Ile Phe Val Leu Lys Gly Trp Trp Arg Thr Ser
Threonine: Met Lys Arg Ile Ser Thr Thr Ile Thr Thr Thr Ile Thr Ile Thr Thr Gly Asn Gly Ala Gly
Histidine: Met Thr Arg Val Gln Phe Lys His His His His His His His Pro Asp
Isoleucine-valine GEDA: Met Thr Ala Leu Leu Arg Val Ile Ser Leu Val Val Ile Ser Val Val Val Ile Ile Ile Pro Pro Cys Gly Ala Ala Leu Gly Arg Gly Lys Ala
Leucine: Met Ser His Ile Val Arg Phe Thr Gly Leu Leu Leu Leu Asn Ala Phe Ile Val Arg Gly Arg Pro Val Gly Gly Ile Gln His
Phenylalanine: Met Lys His Ile Pro Phe Phe Phe Ala Phe Phe Phe Thr Phe Pro
Isoleucine-valine B: Met Thr Thr Ser Met Leu Asn Ala Lys Leu Leu Pro Thr Ala Pro Ser Ala Ala Val Val Val Val Arg Val Val Val Val Val Gly Asn Ala Pro

*The biosynthesis of isoleucine and valine is complex. The genes are encoded in several operons, and the pathway to leucine synthesis is a branch of the valine pathway. Thus, isoleucine, valine, and leucine are all involved in attenuation of the isoleucine-valine operons. (Source: Adapted from Bauer C. et al. 1983. Gene function in prokaryotes. Copyright © 1983 Cold Spring Harbor Laboratory Press. Used with permission.)
FIGURE 16-21 Transcription termination at the trp attenuator. How transcription termination at the trp operon attenuator is controlled by the availability of tryptophan. In (a) (conditions of high tryptophan), sequence 3 can pair with sequence 4 to form the transcription termination hairpin. In (b) (conditions of low tryptophan), the ribosome stalls at adjacent tryptophan codons, leaving sequence 2 free to pair with sequence 3, thereby preventing formation of the 3, 4 termination hairpin. In (c) (no protein synthesis), if no ribosome begins translation of the leader peptide AUG, the hairpin forms by pairing of sequences 1 and 2, preventing formation of the 2, 3 hairpin, and allowing formation of the hairpin at sequences 3, 4. The Trp enzymes are not expressed.
involves recognition of specific secondary structures in the mRNA. We consider here the regulation of the genes that encode ribosomal proteins. Correct expression of ribosomal protein genes poses an interesting regulatory problem for the cell. Each ribosome contains some 50 distinct proteins that must be made at the same rate. Furthermore, the rate at which a cell makes protein, and thus the number of ribosomes it needs, is tied closely to the cell's growth rate: a change in growth conditions quickly leads to an increase or decrease in the rate of synthesis of all ribosomal components. How is all this coordinated regulation accomplished? Control of ribosomal protein genes is simplified by their organization into several operons, each containing genes for up to 11 ribosomal proteins (Figure 16-22). The genes for some nonribosomal proteins whose synthesis is also linked to growth rate are contained in these operons, including those for RNA polymerase subunits α, β, and β′. As with other operons, these operons are sometimes regulated at the level of RNA synthesis. But the primary control of ribosomal
protein synthesis is at the level of translation of the mRNA, not transcription. The following simple experiment shows the distinction. When extra copies of a ribosomal protein operon are introduced into the cell, the amount of mRNA increases correspondingly, but synthesis of the proteins stays nearly the same. Thus, the cell compensates for extra mRNA by curtailing its activity as a template. This happens because ribosomal proteins are repressors of their own translation. For each operon, one (or a complex of two) ribosomal proteins binds the messenger near the translation initiation sequence of one of the first genes in the operon, preventing ribosomes from binding and initiating translation. Repressing translation of the first gene also prevents expression of some or all of the rest. This strategy is very sensitive. A few unused molecules of protein L4, for example, will shut down synthesis of that protein, as well as synthesis of the other ten ribosomal proteins in its operon. In this way, these proteins are made just at the rate they are needed for assembly into ribosomes. How one protein can function both as a ribosomal component and as a regulator of its own translation is shown by comparing the sites where that protein binds to ribosomal RNA and to its messenger RNA. Those sites are similar both in sequence and in secondary structure
Box 16-4 Riboswitches
Gene regulation typically involves regulatory proteins that control the expression of genes at the level of transcription or translation. Not all gene expression is governed by regulatory proteins, however. The tryptophan operon of E. coli, as we have seen, responds to the cellular level of its end product (tryptophan) by an attenuation mechanism involving a leader RNA but no dedicated regulatory protein. Another example of gene regulation that does not involve a regulatory protein is the ribosomal RNA (rRNA) genes of E. coli, whose rate of transcription is strongly influenced by the growth rate of the cell. It turns out that RNA polymerase forms unstable complexes at the promoters for rRNA genes, and these complexes are highly sensitive to the concentration of the nucleotide that initiates transcription (usually ATP). Hence, under conditions of rapid growth when the cellular levels of ATP are high, the RNA polymerase-promoter complexes are productive and the rRNA genes are transcribed at a high rate. Conversely, under conditions of nutrient limitation when the growth rate and cellular ATP levels are low, initiation by RNA polymerase is inefficient and rRNA genes are transcribed at a low rate. This nucleotide-sensing system is perhaps the simplest of all transcriptional control mechanisms as it involves no regulatory proteins and is solely determined by the special properties of rRNA gene promoters.
Yet another example of gene regulation without regulatory proteins is the riboswitch. Riboswitches are regulatory RNA elements that act as direct sensors of small molecule metabolites to control gene transcription or translation. For example, many genes whose function is related to the amino acid methionine in the bacterium Bacillus subtilis are controlled by a 200-nucleotide-long, untranslated leader RNA that can adopt alternative structures: one involving a stem-loop transcription terminator and the other an antiterminator. S-adenosyl methionine, but not methionine itself (or other methionine-related small molecules), binds to these leader RNAs to stabilize the transcription termination structure. These leader RNAs are therefore switches (riboswitches) that sense cellular levels of S-adenosyl methionine and thereby control transcriptional read-through into the downstream gene. Many examples of riboswitches are now known, each responding to a different metabolite, such as vitamin B12, thiamine pyrophosphate, flavin mononucleotide, lysine, guanine, and adenine (Box 16-4 Figure 1). Some riboswitches operate at the level of transcription termination but others operate at the level of translation, controlling the formation of an RNA structure that blocks binding of the ribosome to the mRNA for the downstream gene. Riboswitches are found not only in bacteria, but evidently also in archaea, fungi, and plants.
Another kind of riboswitch deserves special mention. Rather than responding to a metabolite, these leader RNAs respond to uncharged tRNA. Thus, certain genes, notably genes for aminoacyl tRNA synthetases (see Chapter 14), are controlled by a transcription termination mechanism that involves a 200- to 300-nucleotide long, untranslated, leader RNA that directly and specifically interacts with the cognate, uncharged tRNA for the synthetase. This interaction stabilizes the leader RNA in its antitermination structure so that transcription into the adjacent synthetase gene can proceed. Specificity is achieved in part by a "codon-anticodon" interaction between the tRNA and the leader RNA. Because only uncharged tRNA can bind to the leader, transcriptional read-through is only stimulated when the cognate amino acid is in short supply and the level of uncharged tRNA in the cell rises.
BOX 16-4 FIGURE 1 Riboswitches participate in fundamental genetic control. The secondary structures of the seven known riboswitches and the metabolites they sense are shown here: B12 riboswitch; TPP riboswitch (thiamine pyrophosphate); FMN riboswitch (flavin mononucleotide); SAM riboswitch (S-adenosyl-methionine); lysine riboswitch (lysine); guanine riboswitch (guanine); adenine riboswitch (adenine). (Source: Adapted from Mandal M., Boese B., Barrick J.E., Winkler W.C., and Breaker R.R. 2003. Cell 113: 577-586; Figure 7 Panel A. Copyright © 2003, with permission of Elsevier.)
(Figure 16-23). The comparison suggests a precise mechanism of regulation. Since the binding site in the messenger includes the initiating AUG, mRNA bound by excess protein S8 (in this example) cannot attach to ribosomes to initiate translation. (This is analogous to Lac repressor binding to the lac promoter and thereby blocking access to RNA polymerase.) Binding is stronger to ribosomal RNA than to mRNA, so translation is repressed only when all need for the protein in ribosome assembly is satisfied.
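The consequence of stronger binding to rRNA than to mRNA can be illustrated in the limiting case where the preference is absolute. The sketch below assumes, purely for illustration, that every rRNA site is filled before any protein binds mRNA; the function name `partition_protein` and the molecule counts are hypothetical, not values from the text.

```python
def partition_protein(protein: int, rrna_sites: int, mrna_sites: int) -> dict:
    """Distribute ribosomal protein molecules between rRNA and its own mRNA.

    Limiting-case assumption: rRNA sites are filled completely before any
    protein binds the mRNA (an idealization of "binding is stronger to rRNA").
    """
    on_rrna = min(protein, rrna_sites)
    excess = protein - on_rrna              # molecules not needed for ribosome assembly
    on_mrna = min(excess, mrna_sites)       # excess protein occupies its own messenger
    return {
        "on_rrna": on_rrna,
        "on_mrna": on_mrna,
        "translatable_mrna": mrna_sites - on_mrna,
    }


if __name__ == "__main__":
    for protein in (80, 100, 105, 200):
        print(protein, partition_protein(protein, rrna_sites=100, mrna_sites=10))
```

In this idealization translation is untouched until assembly demand is met, after which each surplus protein molecule silences one messenger; real binding is graded rather than all-or-none.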
FIGURE 16-22 E. coli ribosomal protein operons. Ribosomal protein operons of E. coli. The protein that in each case acts as a translational repressor of the other proteins is shaded red. Operons labeled in the figure include the L11, str, S10, spc, and α operons. (Source: Adapted from Nomura M., Gourse R., and Baughman G. 1984. Ribosomal protein operons of E. coli. Annu. Rev. Biochem. 53: 82. Copyright © 1984 by Annual Reviews.)
FIGURE 16-23 Ribosomal protein S8 binds 16S rRNA. A comparison of the region where ribosomal protein S8 (encoded by the spc operon; Figure 16-22) binds 16S rRNA in the ribosome, with the translation initiation site in the mRNA. (a) S8 binding site in 16S ribosomal RNA. (b) Translation initiation region in the messenger RNA for S8. Similar sequences are shaded in dark green. The dashed lines box off that region of the 16S rRNA protected by the S8 protein. (Source: Cerretti D.P., Mattheakis L.C., Kearney K.R., Vu L., and Nomura M. 1988. J. Mol. Biol. 204: 309-329.)
THE CASE OF PHAGE λ: LAYERS OF REGULATION
Bacteriophage λ is a virus that infects E. coli. Upon infection, the phage can propagate in either of two ways: lytically or lysogenically, as illustrated in Figure 16-24. Lytic growth requires replication of the phage DNA and synthesis of new coat proteins. These components combine to form new phage particles that are released by lysis of the host cell. Lysogeny, the alternative propagation pathway, involves integration of the phage DNA into the bacterial chromosome, where it is passively replicated at each cell division just as though it were a legitimate part of the bacterial genome. A lysogen is extremely stable under normal circumstances, but the phage dormant within it (the prophage) can efficiently switch to lytic growth if the cell is exposed to agents that damage DNA (and thus threaten the host cell's continued existence). This switch from lysogenic to lytic growth is called lysogenic induction. The choice of developmental pathway depends on which of two alternative programs of gene expression is adopted in that cell. The
FIGURE 16-24 Growth and induction of λ lysogen. Upon infection, λ can grow either lytically or lysogenically. Figure labels: infection, λ genome, lytic growth, lysogenic growth, induction, new phage.
FIGURE 16-25 Map of phage λ in the circular form. The λ genome is linear in the phage head, but, upon infection, circularizes at the cos site. When integrated into the bacterial chromosome it is in a linear form with ends at the att site (see Chapter 11 for a description of integration). Figure labels: recombination proteins, excisionase (xis), integrase (int), att, head genes, tail genes.
program responsible for the lysogenic state can be maintained stably for many generations, but then, upon induction, switch over to the lytic program with great efficiency.
Alternative Patterns of Gene Expression Control Lytic and Lysogenic Growth
λ has a 50-kb genome and some 50 genes. Most of these encode coat proteins and proteins involved in DNA replication, recombination, and lysis (Figure 16-25). The products of these genes are important in making new phage particles during the lytic cycle, but our concern here is restricted to the regulatory proteins, and where they act. We can, therefore, concentrate on just a few of them, and start by considering a very small area of the genome, shown in Figure 16-26. The depicted region contains two genes (cI and cro) and three promoters (PR, PL, and PRM). All the other phage genes (except one minor
FIGURE 16-26 Promoters in the right and left control regions of phage λ.
FIGURE 16-27 Transcription in the λ control regions in lytic and lysogenic growth. Arrows indicate which promoters are active at the decisive period during lytic and lysogenic growth, respectively. The arrows also show the direction of transcription from each promoter.
one) are outside this region and are transcribed directly from PR and PL (which stand for rightward and leftward promoter, respectively), or from other promoters whose activities are controlled by products of genes transcribed from PR and PL. PRM (promoter for repressor maintenance) transcribes only the cI gene. PR and PL are strong, constitutive promoters; that is, they bind RNA polymerase efficiently and direct transcription without help from an activator. PRM, in contrast, is a weak promoter and only directs efficient transcription when an activator is bound just upstream. PRM resembles the lac promoter in this regard. There are two arrangements of gene expression depicted in Figure 16-27: one renders growth lytic, the other lysogenic. Lytic growth proceeds when PL and PR remain switched on, while PRM is kept off. Lysogenic growth, in contrast, is a consequence of PL and PR being switched off, and PRM switched on. How are these promoters controlled?
Regulatory Proteins and Their Binding Sites
FIGURE 16-28 λ repressor. The figure shows a monomer of λ repressor, indicating various surfaces involved in different activities carried out by the protein. N indicates the amino domain, C the carboxy domain. "Tetramerization" denotes the region where two dimers interact when binding cooperatively to adjacent sites on DNA. (Source: Adapted from Ptashne M. and Gann A. 2002. Genes & signals, p. 36, Fig 1-17. © Cold Spring Harbor Laboratory Press.)
The cI gene encodes λ repressor, a protein of two domains joined by a flexible linker region (Figure 16-28). The N-terminal domain contains the DNA-binding region (a helix-turn-helix domain, as we saw earlier). As with the majority of DNA-binding proteins, λ repressor binds DNA as a dimer; the main dimerization contacts are made between the C-terminal domains. A single dimer recognizes a 17 bp DNA sequence, each monomer recognizing one half-site, again just as we saw in the lac system. (We have already looked at the details of DNA recognition by λ repressor in Figure 16-12.) Despite its name, λ repressor can both activate and repress transcription. When functioning as a repressor, it works in the same way as does Lac repressor: it binds to sites that overlap the promoter and excludes RNA polymerase. As an activator, λ repressor works like CAP, by recruitment. λ repressor's activating region is in the N-terminal domain of the protein. Its target on polymerase is a region of the σ subunit adjacent to the part of σ that recognizes the -35 region of the promoter (region 4; see Chapter 12, Figure 12-6). Cro (which stands for control of repressor and other things) only represses transcription, like Lac repressor. It is a single domain protein and again binds as a dimer to 17 bp DNA sequences. λ repressor and Cro can each bind to any one of six operators. These sites are recognized with different affinities by each of the proteins. Three of those sites are found in the left-control region, and three in the right. We will focus on the binding of λ repressor and Cro to the sites in the right-hand region, and these are shown in Figure 16-29. Binding to sites in the left-hand control region follows a similar pattern. The three binding sites in the right operator are called OR1, OR2, and OR3; these sites are similar in sequence, but not identical, and each one, if isolated from the others and examined separately, can bind
[Figure 16-29 labels: PR, PRM, OR1, OR2, OR3, -10, -35, cI mRNA, cro mRNA.]
FIGURE 16-29 Relative positions of promoter and operator sites in OR. Note that OR2 overlaps the -35 region of PR by three base pairs, and that of PRM by two. This difference is enough for PR to be repressed and PRM activated by repressor bound at OR2. (Source: (b) Adapted from Ptashne M. 1992. A genetic switch: Phage and higher organisms, 2nd edition. Copyright © 1992 Blackwell Science Ltd. Used with permission.)
either a dimer of repressor or a dimer of Cro. The affinities of these various interactions, however, are not all the same. Thus, repressor binds OR1 tenfold better than it binds OR2. In other words, ten times more repressor (a tenfold higher concentration) is needed to bind OR2 than OR1. OR3 binds repressor with about the same affinity as does OR2. Cro, on the other hand, binds OR3 with highest affinity, and only binds OR2 and OR1 when present at tenfold higher concentration. The significance of these differences will become apparent presently.
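The tenfold difference between isolated sites can be made concrete with a simple single-site binding model. The sketch below assumes non-cooperative 1:1 binding, with occupancy = c / (c + Kd), and uses arbitrary relative units in which repressor's dissociation constant for OR1 is 1. The function name and the unit scale are illustrative assumptions, not values given in the text.

```python
def occupancy(conc, kd):
    """Fraction of time an isolated site is occupied (simple 1:1 binding)."""
    return conc / (conc + kd)


# Relative dissociation constants for repressor at isolated sites.
# Assumed units: Kd at OR1 = 1; OR2 binds repressor tenfold more weakly.
REPRESSOR_KD = {"OR1": 1.0, "OR2": 10.0}

for conc in (1.0, 10.0):
    row = {site: round(occupancy(conc, kd), 2) for site, kd in REPRESSOR_KD.items()}
    print(f"repressor = {conc:>4}: {row}")
```

In this model OR2 reaches half occupancy only at the tenfold higher concentration, which is the same half occupancy that OR1 shows at the lower one.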
λ Repressor Binds to Operator Sites Cooperatively

λ repressor binds DNA cooperatively. This is critical to its function and occurs as follows. Consider repressor binding to sites in OR. In addition to providing the dimerization contacts, the C-terminal domain of λ repressor mediates interactions between dimers (the point of contact is the patch marked "tetramerization" in Figure 16-26). In this way, two dimers of repressor can bind cooperatively to adjacent sites on DNA. For example, repressor at OR1 helps repressor bind to the lower affinity site OR2
FIGURE 16-30 Cooperative binding of λ repressor to DNA. The λ repressor monomers interact to form dimers, and those dimers interact to form tetramers. These interactions ensure that binding of repressor to DNA is cooperative. That cooperative binding is helped further by interactions between repressor tetramers at OR interacting with others at OL (see later in text
Box 16-5 Concentration, Affinity, and Cooperative Binding
What do we mean when we talk about "strong" and "weak" binding sites? When we say two molecules recognize each other, or interact with each other (such as a protein and its site on DNA), we mean they have some affinity for each other. Whether they are actually found bound together at any given time depends on two things: (1) how high that affinity is, i.e., how tightly they interact, and (2) the concentration of the molecules. As we emphasized in Chapters 3 and 5, the molecular interactions that underpin regulation in biological systems are reversible: when interacting molecules find each other, they stick together for a period of time and then separate. The higher the affinity, the tighter the two molecules stick together, and in general the longer they remain together before parting. The higher the concentration, the more often they will find each other in the first place. Thus, higher affinity or higher concentration have similar effects: they both result in the two molecules, in general, spending more time bound to each other.
Cooperativity Visualized
Cooperativity can be expressed in terms of increased affinity. Repressor has a higher affinity for OR1 than for OR2. But once repressor is bound to OR1, repressor can bind OR2 more tightly because it interacts with not only OR2, but with repressor bound at OR1 as well. Neither of these interactions is very strong alone, but when combined they substantially increase the affinity of binding of that second repressor. As we saw in Chapter 4, the relationship between binding energy and equilibrium is an exponential one (see Table 4-1). Thus, increasing the binding energy as little as twofold increases affinity by an order of magnitude.

Another way to picture how cooperativity works is to think of it as increasing the local concentration of repressor. Picture repressor bound cooperatively at OR1 and OR2. Although repressor at OR2 periodically lets go of DNA, it is holding on to repressor at OR1 and so remains in the proximity of OR2. This effectively increases the local concentration of repressor in the vicinity of that site and ensures repressor rebinds frequently. If you dispense with cooperativity and just increase the concentration of repressor in the cell, when repressor falls off OR2 it will not be held nearby by repressor at OR1 and will usually drift away before it can rebind OR2. But at the higher concentrations of repressor, another molecule of repressor will likely be close to OR2 and bind there. Thus, even if each repressor dimer only sits on OR2 for a short time, by either holding it nearby or increasing the number of possible replacements, you increase the likelihood of repressor being bound at any given time.

Yet another way of thinking about cooperative binding is as an entropic effect. When a protein goes from being free in solution to being constrained on a DNA-binding site, the entropy of the system decreases. But repressor held close to OR2 by interaction with repressor at OR1 is already constrained compare
[Plot: x-axis, repressor concentration; y-axis scale 0 to 100.]
BOX 16-5 FIGURE 1 Cooperative binding reaction. The dashed line shows the curve that describes binding of a protein to a single site. The steeper sigmoid curve shows cooperative binding of, for example, λ repressor to its operator sites. (Source: Adapted from Ptashne M. 1992. A genetic switch: Phage and higher organisms, 2nd edition. Copyright © 1992 Bl
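The contrast between the single-site curve and the steeper sigmoid curve can be sketched numerically. The example below uses the Hill equation as one common description of cooperative binding; treating λ repressor binding as a Hill curve, and the particular Hill coefficients shown, are assumptions made for illustration rather than values from the text.

```python
def fraction_bound(conc, k_half, n=1.0):
    """Hill-type binding curve; n = 1 gives the simple single-site curve."""
    return conc**n / (k_half**n + conc**n)


def fold_change_10_to_90(n):
    """Fold increase in concentration needed to go from 10% to 90% bound.

    Solving the Hill equation for concentration gives
    c = k_half * (f / (1 - f)) ** (1 / n), so the ratio c(0.9) / c(0.1)
    is (9 / (1 / 9)) ** (1 / n) = 81 ** (1 / n).
    """
    return 81.0 ** (1.0 / n)


for n in (1, 2, 4):  # hypothetical Hill coefficients
    print(f"n = {n}: {fold_change_10_to_90(n):.0f}-fold change spans 10% to 90% bound")
```

For a single site (n = 1) an 81-fold change in concentration is needed to move from 10% to 90% bound; with n = 2 a 9-fold change is enough. A steeper curve therefore means the site switches between mostly empty and mostly full over a much narrower concentration range.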
Box 16-5 (Continued)
fraction of free repressor (i.e., that not bound to DNA) is found as monomer in the cell; thus it is in essence a cooperative binding of four monomers rather than two stable dimers, adding to the concerted nature of complex formation on DNA, and so adding to the steepness of the curve. But why does cooperativity make the binding curve steeper? We have already seen how the site is filled at a lower concentration of repressor than its affinity would suggest; but why is it that, as repressor concentration decreases, binding falls
ensures the protein will often sample them while attempting to reach its correct site. What is needed is a strategy that increases affinity for the correct site without aiding interactions with the incorrect sites. Increasing the number of contacts between the protein and its DNA site (for example by making the protein larger) does not necessarily help because it also tends to increase binding to the incorrect sites. Once affinity for the incorrect sites gets too high, the protein essentially never finds its correct site; it spends too long sampling incorrect sites. Thus a kinetic problem replaces the specificity one and it can be just as disruptive. Cooperativity solves the problem. By binding to two adjacent sites cooperatively, a protein increases dramatically its affinity for those sites, without increasing affinity for other sites. The reason it does not increase affinity for the incorrect sites is simply because the chance of two molecules of protein binding incorrect sites close together at the same time (allowing cooperativity to stabilize that binding) is extremely remote. Only when they find the correct sites do they remain bound long enough to give a second protein a chance to turn up.

Cooperativity and Allostery
In this chapter we use the term cooperativity to refer to a particular mechanism of cooperative binding. The term is also used in other contexts where different mechanisms apply. In general we might say that cooperativity describes any situation in which two ligands bind to a third molecule in such a way that the binding of one of those ligands helps the binding of the other. Thus, for the DNA-binding proteins we considered here, cooperativity is mediated by simple adhesive interactions, but in other situations cooperativity can be mediated by allosteric events. Perhaps the best example of that is the binding of oxygen molecules to hemoglobin. Hemoglobin is a homotetramer, and each subunit binds one molecule of oxygen.
That binding is cooperative: when the first oxygen binds, it causes a conformational change which fixes the binding site for the next oxygen in a conformation of higher affinity. Thus, in this case there is no direct interaction between the ligands, but by triggering an allosteric transition one ligand increases affinity for a second. Although
Repressor and Cro Bind in Different Patterns to Control Lytic and Lysogenic Growth

How do repressor and Cro control the different patterns of gene expression associated with the different ways λ can grow? For lytic growth, a single Cro dimer is bound to OR3; this site overlaps PRM and so Cro represses that promoter (which would only work at a low level anyway in the absence of activator because the promoter is weak) (Figure 16-31). As neither repressor nor Cro is bound to OR1 and OR2, PR binds RNA polymerase and directs transcription of lytic genes; PL does likewise. Recall that both PR and PL are strong promoters that need no activator.
[Figure 16-31 panels: lysogen; induction; lytic growth. Labels: RNA polymerase, Cro, cro, cI.]
FIGURE 16-31 The action of λ repressor and Cro. Repressor bound to OR1 and OR2 turns off transcription from PR. Repressor bound at OR2 contacts RNA polymerase at PRM, activating expression of the cI (repressor) gene. OR3 lies within PRM; Cro bound there represses transcription of cI. (Source: Adapted from Ptashne M. and Gann A. 2002. Genes & signals, p. 30, Fig 1-13. © Cold Spring Harbor Laboratory Press.)
During lysogeny, PRM is on, while PR (and PL) are off. Repressor bound cooperatively at OR1 and OR2 blocks RNA polymerase binding at PR, repressing transcription from that promoter. But repressor bound at OR2 activates transcription from PRM. We return to the question of how the phage chooses between these alternative pathways shortly. But first we consider induction: how the lysogenic state outlined above switches to the alternative lytic one when the cell is threatened.

Lysogenic Induction Requires Proteolytic Cleavage of λ Repressor

E. coli senses and responds to DNA damage. It does this by activating the function of a protein called RecA. This enzyme is involved in recombination (which accounts for its name; see Chapter 10) but it has another function: it stimulates the proteolytic autocleavage of certain proteins. The primary substrate for this activity is a bacterial repressor protein called LexA that represses genes encoding DNA repair enzymes. Activated RecA stimulates autocleavage of LexA, releasing repression of those genes. This is called the SOS response (see Chapter 9). If the cell is a lysogen, it is in the best interests of the prophage to escape under these threatening circumstances. To this end, λ repressor has evolved to resemble LexA, ensuring that λ repressor too undergoes autocleavage in response to activated RecA. The cleavage reaction removes the C-terminal domain of repressor, and so dimerization and cooperativity are immediately lost. As these functions are critical for repressor binding to OR1 and OR2 (at concentrations of repressor found
in a lysogen), loss of cooperativity ensures that repressor dissociates from those sites (as well as from OL1 and OL2). Loss of repression triggers transcription from PR and PL, leading to lytic growth.

For induction to work efficiently, the level of repressor in a lysogen must be tightly regulated. If levels were to drop too low, under normal conditions, the lysogen might spontaneously induce; if levels rose too high, appropriate induction would be inefficient. The reason for the latter is that more repressor would have to be inactivated (by RecA) for the concentration to drop enough to vacate OR1 and OR2. We have already seen how repressor ensures that its level never drops too low: it activates its own expression, an example of positive autoregulation. But how does it ensure levels never get too high? Repressor also regulates itself negatively. This negative autoregulation works as follows. As drawn, Figure 16-31 shows PRM being activated by repressor (at OR2) to make more repressor. But if the concentration gets too high, repressor will bind to OR3 as well, and repress PRM (in a manner analogous to Cro binding OR3 and repressing PRM during lytic growth). This prevents synthesis of new repressor until its concentration falls to a level at which it vacates OR3.

As an aside, it is interesting to note that the term "induction" is used to describe both the switch from lysogenic to lytic growth in λ, and the switching on of the lac genes in response to lactose. This common usage stems from the fact that both phenomena were studied in parallel by Jacob and Monod (see Box 16-3). It is also worth noting that, just as lactose induces a conformational change in Lac repressor to relieve repression of the lac genes, so too the inducing signals of λ work by causing a structural change (in this case proteolytic cleavage) in λ repressor.
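The occupancy patterns described so far for the right operator can be summarized as a small truth table. The sketch below is a deliberately simplified on/off model: the function name is hypothetical, promoters are treated as fully on or fully off, and the weak unactivated activity of PRM is ignored.

```python
def right_operator_state(repressor_sites=frozenset(), cro_sites=frozenset()):
    """Return which of PR and PRM are active for a given occupancy of OR1-OR3."""
    bound = set(repressor_sites) | set(cro_sites)
    # PR is a strong promoter; it is shut off when OR1 or OR2 is occupied.
    pr_on = not (bound & {"OR1", "OR2"})
    # PRM is weak: it needs repressor at OR2 as an activator, and is blocked
    # when repressor or Cro occupies OR3, which lies within PRM.
    prm_on = "OR2" in repressor_sites and "OR3" not in bound
    return {"PR": pr_on, "PRM": prm_on}


scenarios = {
    "lysogen": dict(repressor_sites={"OR1", "OR2"}),
    "lytic growth": dict(cro_sites={"OR3"}),
    "excess repressor": dict(repressor_sites={"OR1", "OR2", "OR3"}),
    "after induction": dict(),  # repressor cleaved, nothing bound
}
for name, occupancy in scenarios.items():
    print(name, right_operator_state(**occupancy))
```

The "excess repressor" case corresponds to negative autoregulation: PR stays off, but PRM is also shut off until repressor vacates OR3.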
Negative Autoregulation of Repressor Requires Long-Distance Interactions and a Large DNA Loop

We have discussed cooperative binding of repressor dimers to adjacent operators such as OR1 and OR2. There is yet another level of cooperative binding seen in the prophage of a lysogen, one critical to proper negative autoregulation. Repressor dimers at OR1 and OR2 interact with repressor dimers bound cooperatively at OL1 and OL2. These interactions produce an octamer of repressor; each dimer within the octamer is bound to a separate operator. To accommodate the long-distance interaction between repressors at OR and OL, the DNA between those operator regions (some 3.5 kb, including the cI gene itself) must form a loop (Figure 16-32). When the loop is formed, OR3 is held close to OL3. This allows another two dimers of repressor to bind cooperatively to these two sites. This cooperativity means OR3 binds repressor at a lower concentration than
FIGURE 16-32 Interaction of repressors at OR and OL. Repressors at OR and OL interact as shown. These interactions stabilize binding. In this way, the interactions increase repression of PR and PL, and allow repressor to bind OR3 at a lower concentration than it otherwise could. (Source: Adapted
it otherwise would; indeed, at a concentration only just a little higher than that required to bind OR1 and OR2. Thus, repressor concentration is very tightly controlled indeed: small decreases are compensated for by increased expression of its gene, and increases by switching the gene off. This explains why lysogeny can be so stable while also ensuring that induction is very efficient.

The structure of the C-terminal domain of λ repressor, interpreted in the light of earlier genetic studies, reveals the basis of dimer formation. But it also shows how two dimers interact to form the tetrameric form (as occurs when repressor is bound cooperatively to OR1 and OR2). Moreover, the structure reveals the basis for the octamer form, and shows that this is the highest order oligomer repressor can form (Figure 16-33).
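The way activation at OR2 and repression at OR3 combine to shut PRM off at high repressor levels can be illustrated with a toy calculation. In the sketch below, PRM activity is modeled as the product of "OR2 is filled" and "OR3 is empty", each described by a Hill function. The Hill coefficient of 2 and the two half-saturation constants are hypothetical values chosen only so that OR3 fills at a somewhat higher concentration than OR2; they are not measurements.

```python
def hill(conc, k_half, n=2):
    """Fractional occupancy of a site, modeled as a Hill function."""
    return conc**n / (k_half**n + conc**n)


def prm_activity(repressor, k_activate=1.0, k_repress=3.0):
    """Relative PRM activity: requires OR2 filled, shut off when OR3 is filled.

    k_activate and k_repress are assumed, illustrative thresholds
    (arbitrary concentration units).
    """
    return hill(repressor, k_activate) * (1.0 - hill(repressor, k_repress))


for conc in (0.3, 1.0, 3.0, 10.0):
    print(f"repressor = {conc:>4}: relative PRM activity = {prm_activity(conc):.2f}")
```

With these assumed parameters, activity rises as OR2 fills, peaks between the two thresholds, and falls again as OR3 fills, so synthesis of new repressor is curtailed once its concentration climbs too high.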
Another Activator, λ CII, Controls the Decision between Lytic and Lysogenic Growth upon Infection of a New Host

We have seen how λ repressor and Cro control lysogenic and lytic growth, and the switch from one to the other upon induction. Now we turn to the early events of infection, those that determine which pathway the phage chooses in the first place. Critical to this choice are the products of two other λ genes, cII and cIII. We need only expand slightly our map of the regulatory region of λ to see where cII and cIII lie: cII is on the right of cI and is transcribed from PR; cIII, on the left of cI, is transcribed from PL (Figure 16-34). These and other genes were isolated in elegant genetic screens outlined in Box 16-6, Genetic Approaches that Identified Genes Involved in the Lytic/Lysogenic Choice.

Like λ repressor, CII is a transcriptional activator. It binds to a site upstream of a promoter called PRE (for repressor establishment) and
[Figure 16-33 labels: dimer-dimer interface; exposed interface.]
FIGURE 16-33 Interactions between the C-terminal domain of λ repressors. The figure shows, at the top, a schematic representation of two dimers of the C-terminal domain of λ repressor. Indicated are the two patches here called B and R on the surface of that domain that mediate interactions between two dimers, to give a tetramer, in the first instance, and then between two tetramers to give an octamer (the form found when repressor is bound cooperatively to the four sites, OR1, OR2, OL1, and OL2). Once the octamer has formed, there is no space left for a further dimer to enter the complex, and so the octamer is the highest order structure that forms. (Source: Modified, with permission, from Bell et al. 2000. Cell 101: 801-811, Figures 4 (parts a, b) and 5 (parts a, b, c). Copyright © 2000. Used with permission from Elsevier.)
FIGURE 16-34 Genes and promoters involved in the lytic/lysogenic choice. Not shown here is the gene N, which lies between PL and cIII (see Figure 16-25).
stimulates transcription of the cI (repressor) gene from that promoter. Thus, the repressor gene can be transcribed from two different promoters (PRE and PRM). PRE is a weak promoter because it has a very poor -35 sequence. CII protein binds to a site that overlaps the -35 region but is located on the opposite face of the DNA helix; by directly interacting with polymerase, CII helps polymerase bind to the promoter. Only once sufficient repressor has been made from PRE can that repressor bind to OR1 and OR2 and direct its own synthesis from PRM. Thus we see that repressor synthesis is established by transcription from one promoter (stimulated by one activator) and then maintained by transcription from another (under its own control, by positive autoregulation).

We can now see in summary how CII orchestrates the choice between lytic and lysogenic development. Upon infection, transcription is immediately initiated from the two constitutive promoters PR
Box 16-6 Genetic Approaches that Identified Genes Involved in the Lytic/Lysogenic Choice
Genes involved in the lytic/lysogenic choice were identified by screening for λ mutants that efficiently grow only either lytically or lysogenically. To understand how these mutants were found, we need to consider how phage are grown in the laboratory (see Chapter 21). Bacterial cells can be grown as a confluent, opaque lawn across an agar plate. A lytic phage, grown on that lawn, produces clear plaques, or holes (Figure 21-3). Each plaque is typically initiated by a single phage infecting a bacterial cell. The progeny phage from that infection then infect surrounding cells, and so on, killing off (lysing) the bacterial cells in the vicinity of the original infected cell and causing a clear cell-free zone in the otherwise opaque lawn of bacterial cells. Phage λ forms plaques too, but they are turbid (or cloudy); that is, the region within the plaque is clearer than the uninfected lawn, but only marginally so. The reason for this is that λ, unlike a purely lytic phage, kills only a proportion of the cells it infects, the others surviving as lysogens. Lysogens are resistant to subsequent infection and so can grow within the plaque unharmed by the mass of phage particles found there. The reason for this "immunity" is quite simple: in a lysogen, the integrated phage DNA (the prophage) continues making repressor from PRM. Any new λ genome entering that cell will at once be bound by repressor, giving no chance of lytic growth.

In one classic study, mutants of λ that formed clear plaques were isolated. These mutant phage are unable to form lysogens but still grow lytically. The λ clear mutations identified three phage genes, called cI, cII, and cIII (for clear I, II, and III). In other studies, so-called virulent (vir) mutations were isolated. These define the operator sites where λ repressor binds, and were isolated by virtue of the fact that such phage can grow on lysogens.
By analogy to the lac system, the cI mutants are comparable to the Lac repressor (lacI) mutants, and vir mutants are the equivalent of the lac operator (lacO) mutants (see Box 16-3).

Another revealing mutation was identified in a different experiment, this one a mutation in a host gene. The mutant is called hfl for high frequency of lysogeny. When infected with wild-type λ, this strain almost always forms lysogens, very rarely allowing the phage to grow lytically. This bacterial strain lacks the protease that degrades the λ CII protein (see text).
and PL. PR directs synthesis of both Cro and CII. Cro expression favors lytic development: once Cro reaches a certain level it will bind OR3 and block PRM. CII expression, on the other hand, favors lysogenic growth by directing transcription of the repressor gene (Figure 16-35). For successful lysogeny, repressor must then bind to OR1 and OR2 and activate PRM before Cro can inhibit that promoter.
Growth Conditions of E. coli Control the Stability of CII Protein and thus the Lytic/Lysogenic Choice

The efficiency with which CII directs transcription of the cI gene, and hence the rate at which repressor is made, is the critical step in deciding how λ will develop. What determines how efficiently CII works in any given infection? When the phage infects a population of bacterial cells that are healthy and growing vigorously, it tends to propagate lytically, releasing progeny into an environment rich in fresh host cells. When conditions are poor for bacterial growth, however, the phage is more likely to form lysogens and sit tight; there will likely be few host cells in the vicinity for any progeny phage to infect.

These different growth conditions impinge on CII as follows. CII is a very unstable protein in E. coli; it is degraded by a specific protease called FtsH (HflB), encoded by the hfl gene. The speed with which CII can direct synthesis of repressor is thus determined by how quickly it is being degraded by FtsH. Cells lacking the hfl gene (and thus FtsH) almost always form lysogens upon infection by λ: in the absence of the protease, CII is stable and directs synthesis of ample repressor. FtsH activity is itself regulated by the growth conditions of the bacterial cell, and, although it is not understood exactly how that is achieved, we can say the following. If growth is good, FtsH is very active, CII is destroyed efficiently, repressor is not made, and the phage tend to grow lytically. In poor growth conditions the opposite happens: low FtsH activity, slow degradation of CII, repressor
[Figure 16-35 labels: PRE; CII binding site.]
FIGURE 16-35 Establishment of lysogeny. The cI gene is transcribed from PRE when establishing lysogeny and from PRM when maintaining that state. Repressor bound at OR1 and OR2 not only activates the maintenance mode but it also turns off the establishment mode of expression. Note that PR controls not only lytic genes but also expression of cII, and is thus important in lysogeny as well as lytic development. Similarly, though not shown in the figure, PL, which controls many lytic genes, also controls the cIII gene which helps establish lysogeny (see text). (Source: Adapted from Ptashne M. and Gann A. 2002. Genes & signals, p. 31, Fig 1-14. © Cold Spring Harbor Laboratory Press.)
A second CII protein-dependent promoter, PI, has a sequence similar to that of PRE and is located in front of the phage gene int (see Figure 16-25); this gene encodes the integrase enzyme that catalyzes site-specific recombination of λ DNA into the bacterial chromosome to form the prophage (see Chapter 11). A third CII-dependent promoter, PAQ, located in the middle of gene Q, acts to retard lytic development and thus to promote lysogenic development. This is because the PAQ RNA acts as an antisense message, binding to the Q message and promoting its degradation. Q is another regulator, one that promotes the late stages of lytic growth, as we will see in the next section.
Transcriptional Antitermination in λ Development

We earlier saw examples of gene regulation that operated at stages after transcription initiation. Two more examples are found in λ development, as we now describe, starting with a type of positive transcriptional regulation called antitermination.

The transcripts controlled by λ N and Q proteins are initiated perfectly well in the absence of those regulators. But the transcripts terminate a few hundred to a thousand nucleotides downstream of the promoter unless RNA polymerase has been modified by the regulator; λ N and Q proteins are therefore called antiterminators. N protein regulates early gene expression by acting at three terminators: one to the left of the N gene itself, one to the right of cro, and one between genes P and Q (Figures 16-25 and 16-36). Q protein has one target, a terminator 200 nucleotides downstream of the late gene promoter, PR', located between the Q and S genes (see Figure 16-36). The late gene operon of λ, transcribed from PR', is remarkably large for a prokaryotic transcription unit: about 26 kb, a distance that takes about 10 minutes for RNA polymerase to traverse.

Our understanding of how antiterminators work is incomplete. Like other regulatory proteins, N and Q only work on genes that carry particular sequences. Thus, N protein prevents termination in the early operons of λ, but not in other bacterial or phage operons. The specific recognition sequences for antiterminators are not found in the terminators where they act, but instead occur in the operons well before the
GCCCTGAAAAAGGGC
[Figure 16-36 labels: cro, BoxA, BoxB, nutR, QBE, Pause, -35, -10, PR', tR'.]
FIGURE 16-36 Recognition sites and sites of action of the λ N and Q transcription antiterminators. The upper line shows the early rightward promoter PR and its initial terminator, tR1. The nut site is divided into two regions, called BoxA (7 bp) and BoxB, separated by a spacer region of 8 bp. The sequence of BoxB has dyad symmetry and forms a stem-loop structure once transcribed into RNA. The sequence of the RNA-like strand is shown above. The lower line shows the promoter PR', the sequences essential for Q protein function, and the terminator at which Q protein acts.
terminators. N protein requires sites named nut (for N utilization) that are 60 and 200 nucleotides downstream of PL and PR (see Figure 16-36). But N does not bind to these sequences within DNA. Rather, it binds to RNA transcribed from DNA containing a nut sequence. Thus, once RNA polymerase has passed a nut site, N binds to the RNA and from there is loaded on to the polymerase itself. In this state, the polymerase is resistant to the terminators found just beyond the N and cro genes.

λ N works along with the products of the bacterial genes nusA, nusB, nusE, and nusG. The NusA protein is an important cellular transcription factor. NusE is the small ribosomal subunit protein S10, but its role in N protein function is unknown. No cellular function of NusB protein is known. These proteins form a complex with N at the nut site, but N can work in their absence if present at high concentration, suggesting that it is N itself that promotes antitermination.

Unlike N protein, the λ Q protein recognizes DNA sequences (QBE) between the -10 and -35 regions of the late gene promoter (PR') (see Figure 16-36). In the absence of Q, polymerase binds PR' and initiates transcription, only to pause after a mere 16 or 17 nucleotides; it then continues but terminates when it reaches the terminator (tR') some 200 bp downstream. If Q is present, it binds to QBE once the polymerase has left the promoter, and transfers from there to the nearby paused polymerase. With Q on board, the polymerase is then able to transcribe through tR'.
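The dyad symmetry that lets a nut-site transcript fold into a stem-loop can be checked directly on a sequence. The sketch below uses the 15-nucleotide sequence printed with Figure 16-36 (GCCCTGAAAAAGGGC, written as DNA), which appears to be the BoxB region; that identification, and the helper function names, are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_length(seq):
    """Length of the longest 5' arm that can base-pair with the 3' end of seq."""
    n = 0
    while (2 * (n + 1) <= len(seq)
           and reverse_complement(seq[: n + 1]) == seq[len(seq) - (n + 1):]):
        n += 1
    return n


box_b = "GCCCTGAAAAAGGGC"  # sequence as printed with Figure 16-36
arm = stem_length(box_b)
print("stem arm:", box_b[:arm])
print("loop:    ", box_b[arm:len(box_b) - arm])
print("pairs with:", box_b[len(box_b) - arm:])
```

For this sequence the two five-nucleotide arms (GCCCT and AGGGC) are reverse complements of each other, leaving a five-nucleotide loop between them, which is the arrangement expected of a stem-loop in the transcribed RNA.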
Retroregulation: An Interplay of Controls on RNA Synthesis and Stability Determines int Gene Expression

The CII protein activates the promoter PI that directs expression of the int gene, as well as the promoter PRE responsible for repressor synthesis (see Figure 16-25). The Int protein is the enzyme that integrates the phage genome into that of the host cell during formation of a lysogen (see Chapter 11). Therefore, upon infection, conditions favoring CII protein activity give rise to a burst of both repressor and integrase enzyme.

But the int gene is transcribed from PL as well as from PI, so one would have thought that integrase should be made even in the absence of CII protein. This does not happen. The reason is that int messenger RNA initiated at PL is degraded by cellular nucleases, whereas mRNA initiated at PI is stable and can be translated into integrase protein. This occurs because the two messages have different structures at their 3' ends. RNA initiated at PI stops at a terminator about 300 nucleotides after the end of the int gene; it has a typical stem-and-loop structure followed by six uridine nucleotides (Figure 16-37; see Chapter 12, Figure 12-9). When RNA synthesis is initiated at PL, on the other hand, RNA polymerase is modified by the N protein and thus goes through and beyond the terminator. This longer mRNA can form a stem that is a substrate for nucleases. Because the site responsible for this negative regulation is downstream of the gene it affects, and because degradation proceeds backward through the gene, this process is called retroregulation.

The biological function of retroregulation is clear. When CII activity is low and lytic development is favored, there is no need for integrase enzyme; thus, its mRNA is destroyed. But when CII activity is high and lysogeny is favored, the int gene is expressed to promote recombination of the repressed phage DNA into the bacterial chromosome.
[Figure 16-37 labels: site of termination in absence of N protein; int; direction of transcription. DNA sequence as printed, 5' end at left: TGATGACAAAAAATT AGCGCAAGAAG~CAAAAAT CACCTTGCGCT~ATGCTCTGT]
FIGURE 16-37 DNA site and transcribed RNA structures active in retroregulation of int expression. At the top is shown the DNA sequence and, below, the small cylinders show the symmetric sequences that form hairpins in RNA. The structure on the left shows the terminator formed ... which is a target for cleavage by RNase III and degradation by nucleases.
There is yet a further subtlety in this regulatory device. When a prophage is induced, it needs to make integrase (together with another enzyme, called excisionase; see Chapter 11) to cal
SUMMARY

A typical gene is switched on and off in response to the need for its product. This regulation is predominantly at the level of transcription initiation. Thus, for example, in E. coli, a gene encoding the enzyme that metabolizes lactose is transcribed at high levels only when lactose is available in the growth medium. Furthermore, when glucose (a better energy source) is also available, the gene is not expressed even when lactose is present.
Signals, such as Ihe presence of a specific sugar. aro communicaled to genes by regulatory proleins. These are of Iwo types: oclivatol's, positi ve rcgulators Ihal switch genes on: and repressors, ncgative rcgulalors Iha! switch genes off. Typically Ihese regu lalon; are ONJ\·binding proleins thal recogni;zc specific sites al or ncar Ihe genes they Control. Activato.rs, in tho simplest (and most common) cases, work on promoters Iha! are ¡nheren!l)' weak. That is, RNA polymerase hinds lo Ihe promoter (and thus íniUales transulplion) poorly in the absence of any rcgu lator. An el;livator binds lo ONA wl1 h one surCare and with enother surface binds polymerase and recruits it lo Ihe promoter. This process is an example of cooperative binding, and is sufficient to sti mulate transcription. Reprossors can in hibi! lranscriplion by binding lo (j si tu thal overlaps the promoter. thereby blocking RNA polymerase binding. Rcpressol"S" can work in other wa}'sas wel!, for example by binding lo a site beside fue promoter and, by inleracting wilh polymerase bound at !he pro· motur, inhibiting in¡lia tion. Tho loe genes of E. coli are conlrolled by an al;tivator and a reprossoc that work: in Ihe simplest way juSI ootlined . CAP, in the absence ar glucose, binds DNA ncar lhe loc promolcr and, by rocruit ing polymerase to Iha( promoter, ilctivates exprússion of (hose genes. Thc Lac rcprcssor binds a site thal overlaps the promoter and shuts off expression in !he ahsence of laclose. Another way in which RNA polymerase is recruitúd lo differcnt genes is by Ihe use of alternative u I"actors. Thll S, different 17 factars can rcplare Ihe most prevalen! one (fr l ll in E. oolí) tllld dime! the enzyme tu promoters of differenl wquences. Examp les include ¡rI Z, which direcls t.ranscri plion of genes in response lo heat shock, ami (r~·. which dirocls transcriplion of genes invoh'od in nitrogen metabolism. 
Phage SPO1 uses a series of alternative σ factors to control the ordered expression of its genes during infection. There are, in bacteria, examples of other kinds of transcriptional activation as well. Thus, at some promoters, RNA polymerase binds efficiently unaided, and forms a stable, but inactive, closed complex. That closed complex does not spontaneously undergo transition to the open complex and initiate transcription. At such a promoter, an activator must stimulate the transition from closed to open complex. Activators that stimulate this kind of promoter work by allostery: they interact with the stable closed complex and
induce a conformational change that causes transition to the open complex. In this chapter we saw two examples of transcriptional activators working by allostery. In one case, the activator (NtrC) interacts with the RNA polymerase (bearing σ54) bound in a stable closed complex at the glnA promoter, stimulating transition to the open complex. In the other example, the activator (MerR) induces a conformational change in the merT promoter DNA. In all the cases we have considered, the regulators themselves are controlled allosterically by signals. That is, the shape of the regulator changes in the presence of its signal: in one state it can bind DNA, in the other it cannot. Thus, for example, the Lac repressor is controlled by the ligand allolactose (a product made from lactose). When allolactose binds repressor it induces a change in the shape of that protein: in that state the protein cannot bind DNA. Gene expression can be regulated at steps after transcription initiation. For example, regulation can be at the level of transcriptional elongation. Three cases were discussed here: attenuation at the trp genes and antitermination by the N and Q proteins of phage λ. The trp genes encode enzymes required for the synthesis of the amino acid tryptophan. These genes are only transcribed when the cell lacks tryptophan. One way that amino acid controls expression of these genes is attenuation: a transcript initi
CHAPTER 17
Gene Regulation in Eukaryotes
In eukaryotic cells, expression of a gene can be regulated at all those steps we saw in bacteria (Chapter 16), and a few additional ones besides. Most striking among the additional steps is splicing. As we saw in Chapter 13, many eukaryotic genes come in pieces and,
OUTLINE
• Conserved Mechanisms of Transcriptional Regulation from Yeast to Mammals
• Recruitment of Protein Complexes to Genes by Eukaryotic Activators
• Signal Integration and Combinatorial Control
• Transcriptional Repressors
• Signal Transduction and the Control of Transcriptional Regulators
• Gene "Silencing" by Modification of Histones and DNA
• Eukaryotic Gene Regulation at Steps after Transcription Initiation
• RNAs in Gene Regulation
associated with a typical gene. As in bacteria, individual regulators bind short sequences, but in eukaryotes these binding sites are often more numerous and positioned further from the start site of transcription than they are in bacteria. We call the region at the gene where the transcriptional machinery binds the promoter; the individual binding sites, regulator binding sites; and the stretch of DNA encompassing the complete collection of regulator binding sites for a given gene, the regulatory sequences. The expansion of regulatory sequences (that is, the increase in the number of binding sites for regulators at a typical gene) is most striking in multicellular organisms such as Drosophila and mammals. This situation reflects the more extensive signal integration found in those organisms: that is, the tendency for more signals to be required to switch a given gene on at the right time and place. We saw examples of signal integration in bacteria (Chapter 16), but those examples typically involved just two different regulators integrating two signals to control a gene (glucose and lactose at the lac genes, for example). Yeast have less signal integration than multicellular organisms (indeed they are not so different from bacteria in this regard) and their genes have less extensive regulatory sequences than those of multicellular eukaryotes (Figure 17-1). In multicellular organisms, regulatory sequences can spread thousands of nucleotides from the promoter, both upstream and downstream, and can be made up of tens of regulator binding sites. Often these binding sites are grouped in units called enhancers, and a given enhancer binds regulators responsible for activating the gene at a given time and place. Alternative enhancers bind different groups of regulators and control expression of the same gene at different times and places in response to different signals.
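The two-signal integration at the lac genes can be written out as a small truth table. The sketch below is only an illustration of the logic described in the text: the function name and the all-or-nothing treatment of expression are assumptions made here, and basal (low-level) expression is ignored.

```python
def lac_genes_expressed(lactose_present: bool, glucose_present: bool) -> bool:
    """Toy truth table for high-level expression of the lac genes.

    Assumptions (simplifications of the text, not a quantitative model):
    - Lac repressor sits on its site unless lactose is present.
    - CAP binds DNA and recruits polymerase only when glucose is absent.
    - Expression is treated as simply ON or OFF.
    """
    repressor_bound = not lactose_present
    cap_bound = not glucose_present
    return cap_bound and not repressor_bound


if __name__ == "__main__":
    # Enumerate all four combinations of the two signals.
    for lactose in (False, True):
        for glucose in (False, True):
            state = "ON" if lac_genes_expressed(lactose, glucose) else "OFF"
            print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

Only one of the four combinations (lactose present, glucose absent) switches the genes on. A gene in a multicellular organism that integrates many signals would need a correspondingly larger table, which is one way to picture why its regulatory sequences are more extensive.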
Having more extensive regulatory sequences means that some regulators bind sites far from the genes they control, in some cases 50 kb or more. How can regulators act from such a distance? In bacteria we encountered DNA-binding proteins that communicate over a range of a few kb: λ repressors at OR interacting with those at OL; and NtrC, which can activate the glnA gene from sites placed 1 kb or more upstream. In those examples of "action at a distance," the intervening DNA loops out to accommodate the interaction between the proteins. The same mechanism explains
FIGURE 17-1 The regulatory elements of a bacterial, yeast, and human gene. Illustrated here is the increasing complexity of regulatory sequences from a simple bacterial gene controlled by a repressor to a human gene controlled by multiple activators and repressors. In each case, a promoter is shown at the site where transcription is initiated. While this is accurate for the bacterial case, in the eukaryotic examples transcription initiates somewhat downstream of where the transcription machine binds (see Chapter 12).
eukaryotic cases as well, though in some cases the distances over which proteins work are very large and it is not clear how the looping occurs. Activation at a distance raises another problem. When bound at an enhancer, there may be several genes within range of an activator, yet a given enhancer typically regulates only one gene. Other regulatory sequences, called insulators or boundary elements, are found between enhancers and some promoters. Insulators block activation of the promoter by activators bound at the enhancer. These elements, although still poorly understood, ensure activators do not work indiscriminately.
CONSERVED MECHANISMS OF TRANSCRIPTIONAL REGULATION FROM YEAST TO MAMMALS
In this chapter we consider gene regulation in organisms ranging from single-celled yeast to mammals. All these organisms have both the more elaborate transcriptional machinery and the nucleosomes and their modifiers typical of eukaryotes. So it is not surprising that many of the basic features of gene regulation are the same in all eukaryotes. As yeast are the most amenable to a combination of genetic and biochemical dissection, much of the information about how activators and repressors work comes from that organism. Important for the conclusions drawn from this work, when expressed in a mammalian cell, a typical yeast activator can stimulate transcription. This is tested using a reporter gene. The reporter gene consists of binding sites for the yeast activator inserted upstream of the promoter of a gene whose expression level is readily measured (as we discuss below). We will see that the typical eukaryotic activator works in a manner similar to the simplest bacterial case: it has separate DNA binding and activating regions, and activates transcription by recruiting protein complexes to specific genes. In contrast, repressors work in a variety of ways, some different from anything we encountered in bacteria. These include examples of what is called gene silencing, in which modifications to regions of chromatin keep genes in sometimes large stretches of DNA switched off. Despite having so much in common, not all details of gene regulation are the same in all eukaryotes. Most importantly, as we have mentioned, a typical yeast gene has less extensive regulatory sequences than its multicellular counterpart. So we must look to higher organisms to see how the basic mechanisms of gene regulation are extended to accommodate more complicated cases of signal integration and combinatorial control.
Regulation at later stages of gene expression (transcript elongation, RNA splicing, and translation) is dealt with later in the chapter.
Activators Have Separate DNA Binding and Activating Functions
In bacteria we saw that a typical activator, such as CAP, has separate DNA binding and activating functions. We described the genetic demonstration of this: positive control (or pc) mutants bind DNA normally, but
are defective in activation. Eukaryotic activators have separate DNA binding and activating regions as well. Indeed, in that case, the two surfaces are very often on separate domains of the protein. We take as an example the most studied eukaryotic activator, Gal4 (Figure 17-2). This protein activates transcription of the galactose genes in the yeast S. cerevisiae. Those genes, like their bacterial counterparts, encode enzymes required for galactose metabolism. One such gene is called GAL1. Gal4 binds to four sites located 275 bp upstream of GAL1 (Figure 17-3). When bound there, in the presence of galactose, Gal4 activates transcription of the GAL1 gene 1,000-fold. The separate DNA binding and activating regions of Gal4 were revealed in two complementary experiments. In one experiment, expression of a fragment of the GAL4 gene (encoding the N-terminal third of the activator) produced a protein that bound DNA normally but did not activate transcription. This protein contained the DNA-binding domain but lacked the activating region and was, therefore, formally comparable to the pc mutants of bacterial activators (Figure 17-4a). In a second experiment, a hybrid gene was constructed that encoded the C-terminal three-quarters of Gal4 fused to the DNA-binding domain of a bacterial repressor protein, LexA. The fusion protein was expressed in yeast together with a reporter plasmid bearing LexA binding sites upstream of the GAL1 promoter. The fusion protein activated transcription of this reporter (Figure 17-4b). This experiment shows that activation is not mediated by DNA binding alone, as it was in one of the alternative mechanisms we encountered in bacteria: activation by MerR. Instead, the DNA-binding domain serves merely to tether the activating region to the promoter, just as in the most common mechanism we saw in bacteria (Chapter 16). Many other eukaryotic activators have been examined in similar experiments and, whether from yeast, flies, or mammals,
the same story typically holds: DNA-binding domains and activating regions are separable. In some cases they are even carried on separate polypeptides: one has a DNA-binding domain, the other an activating region, and they form a complex on DNA. An example of this is the herpes virus activator VP16, which interacts with the Oct1 DNA-binding protein found in cells infected by that virus. Another example is the Drosophila activator Notch, described in the next chapter. The separable nature of DNA binding and activating regions of eukaryotic activators is the basis for a widely used assay to detect protein-protein interactions (see Box 17-1, The Two Hybrid Assay).
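The logic of the domain swap experiments can be summarized in a few lines of code. This is a minimal sketch, assuming a deliberately simple rule: a reporter is switched on only when the regulator carries a DNA-binding domain that matches the site on the reporter and also carries an activating region. The class and function names are illustrative only, and the requirement for galactose is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Regulator:
    name: str
    dna_binding_domain: Optional[str]  # site recognized, e.g. "Gal4" or "LexA"; None if absent
    has_activating_region: bool


def reporter_activated(regulator: Regulator, reporter_site: str) -> bool:
    """Assumed rule: activation requires tethering to the reporter's site
    AND an activating region on the tethered protein."""
    tethered = regulator.dna_binding_domain == reporter_site
    return tethered and regulator.has_activating_region


# The constructs discussed in the text (names here are illustrative labels).
intact_gal4 = Regulator("Gal4", "Gal4", True)
gal4_n_terminal = Regulator("Gal4 N-terminal third", "Gal4", False)
gal4_c_terminal = Regulator("Gal4 C-terminal three-quarters", None, True)
lexa_gal4_fusion = Regulator("LexA-Gal4 fusion", "LexA", True)

if __name__ == "__main__":
    for construct in (intact_gal4, gal4_n_terminal, gal4_c_terminal, lexa_gal4_fusion):
        for site in ("Gal4", "LexA"):
            state = "ON" if reporter_activated(construct, site) else "OFF"
            print(f"{construct.name:32} reporter with {site} sites: {state}")
```

In this sketch the DNA-binding domain does nothing except decide where the activating region is tethered, which is the conclusion the experiments support.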
FIGURE 17-2 Gal4 bound to its site on DNA. The yeast activator Gal4 binds as a dimer to a 17 bp site on DNA. The DNA-binding domain of the protein is separate from the region of the protein containing the activating region (the activation domain).
FIGURE 17-3 The regulatory sequences of the yeast GAL1 gene. The UASG (Upstream Activating Sequence for GAL) contains 4 binding sites, each of which binds a dimer of Gal4 as shown in Figure 17-2. Though not shown here, there is another site between these and the GAL1 gene that binds a repressor called Mig1, which we will hear about later in the chapter (see Figure 17-20).
FIGURE 17-4 Domain swap experiment. Part (a) shows that the DNA-binding domain of Gal4, without that protein's activation domain, cannot activate transcription. In another experiment (not shown) the activation domain, without the DNA-binding domain, also does not activate transcription. Part (b) shows that attaching the activation domain of Gal4 to the DNA-binding domain of the bacterial protein LexA creates a hybrid protein that activates transcription of a gene in yeast as long as that gene bears a binding site for LexA. Expression is measured using a reporter plasmid in which the GAL1 promoter is fused to the E. coli lacZ gene whose product (β-galactosidase) is readily assayed in yeast cells. Levels of expression from the GAL1 promoter in response to the various activator constructs can therefore easily be measured. Similar reporter plasmids are used in many experiments in this chapter.
Box 17-1 The Two Hybrid Assay
This assay is used to identify proteins that interact with each other. Thus, in the case shown in Box 17-1 Figure 1, activation of a reporter gene depends on the fact that protein A interacts with protein B (even though those proteins need not themselves normally have a role in transcriptional activation). The assay is predicated on the finding, discussed in the text, that the DNA-binding domain and activating region can be on separate proteins, as long as those proteins interact, and the activating region is thereby tethered to the DNA near the gene to be activated. Practically, the assay is carried out as follows. The gene encoding protein A is fused to a DNA fragment encoding the DNA-binding domain of Gal4. The gene for a second protein (B) is fused to a fragment encoding an activating region. Neither protein alone, when expressed in a yeast cell, activates the reporter gene carrying Gal4 binding sites (as shown in the first two lines of the figure). When both hybrid genes are expressed together in a yeast cell, however, the interaction between proteins A and B generates a complete activator, and the reporter is expressed, as shown in the bottom line of the figure. In a widely used elaboration of this simple assay, the two hybrid assay is employed to screen a library of candidates to find any protein that will interact with a known starting protein. So now, protein A in the figure would be the starting protein (called the "bait"), while protein B (the "prey") represents one of many alternatives encoded by the library (see Chapter 20 for a description of how libraries are made). Yeast cells are transfected with the construct encoding protein A fused to the DNA-binding domain, together with the library encoding many unknown proteins fused to the activating region. Thus, each transfected yeast cell contains protein A tethered to DNA and one or another alternative protein B fuse
BOX 17-1 FIGURE 1
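The reasoning behind the two hybrid screen can be sketched as a short program. This is an illustration under stated assumptions, not a description of any real analysis software: the protein names ("A", "B1", and so on) and the set of interactions are hypothetical, and the reporter is treated as simply on or off.

```python
def reporter_on(bait_expressed: bool, prey_expressed: bool, proteins_interact: bool) -> bool:
    """Assumed rule from the box: the reporter is expressed only when both
    hybrid proteins are present and the bait and prey interact, so that the
    activating region is tethered near the gene."""
    return bait_expressed and prey_expressed and proteins_interact


def screen_library(bait, library, interactions):
    """Return the prey proteins whose cells would express the reporter.

    `interactions` is a set of frozensets, each naming a pair of proteins
    that bind one another (hypothetical data supplied by the caller).
    """
    hits = []
    for prey in library:
        interact = frozenset((bait, prey)) in interactions
        if reporter_on(True, True, interact):
            hits.append(prey)
    return hits


if __name__ == "__main__":
    # Hypothetical example: only B2 binds the bait A.
    known = {frozenset(("A", "B2"))}
    print(screen_library("A", ["B1", "B2", "B3"], known))
```

Each transfected cell corresponds to one pass through the loop: it contains the bait plus one candidate prey, and only cells in which the pair interacts score as positive.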
Eukaryotic Regulators Use a Range of DNA-Binding Domains, but DNA Recognition Involves the Same Principles as Found in Bacteria
The experiments described above show that a bacterial DNA-binding domain can function in place of the DNA-binding domain of a eukaryotic activator. That result suggests there is no fundamental difference in the ways DNA-binding proteins from these organisms recognize their sites. Recall from the previous chapter that most bacterial regulators bind as dimers to DNA target sequences which are twofold rotationally symmetric; each monomer inserts an α helix into the major groove of the DNA over one-half of the site and detects the edges of base pairs found there. Binding typically requires no significant alteration in the structure of either the protein or the DNA. The vast majority of bacterial regulatory proteins use the so-called helix-turn-helix motif. This motif, as we saw, consists of two α helices separated by a short turn. One helix (the recognition helix) fits in the major groove of the DNA and recognizes specific base pairs. The other helix makes contacts with the DNA backbone, positioning the recognition helix properly and increasing the strength of binding (see Figure 5-20). The same basic principles of DNA recognition are used in most eukaryotic cases, despite variations in detail. Thus, proteins often bind as dimers and recognize specific DNA sequences using an α helix inserted into the major groove. One class of eukaryotic regulatory protein presents the recognition helix as part of a structure very like the helix-turn-helix domain; others present the recognition helix within quite different domain structures. In a variation we did
not see in prokaryotes, several of the regulatory proteins we encounter in eukaryotes bind DNA as heterodimers.
FIGURE 17-5 DNA recognition by a homeodomain. The homeodomain consists of three α helices, of which two (helices 2 and 3 in the figure) form the structure resembling the helix-turn-helix motif (compare this figure with Figure 16-12, for example). Thus, helix 3 is the recognition helix and, as shown, it is inserted
FIGURE 17-6 Zinc finger domain. The α helix on the left of the structure is the recognition helix, and it is presented to the DNA by the β sheet on the right. The zinc is coordinated by the two His residues in the α helix and two Cys residues in the β sheet as shown. This arrangement stabilizes the structure and is essential for DNA binding. (Source: Adapte
additional finger, the length of the DNA sequence recognized, and thus the affinity of binding. There are other DNA-binding domains that use zinc. In those cases, the Zn is coordinated by four Cys residues, and stabilizes a rather different DNA recognition motif, one resembling a helix-turn-helix. An example of this is found in the mammalian regulatory protein, the glucocorticoid receptor, which regulates genes in response to certain hormones.
FIGURE 17-7 Leucine zipper bound to DNA. Two large
length. Thus, as shown toward the top, the two helices interact to form a coiled coil that holds the monomers together; further down, the helices separate enough to embrace the DNA, inserting into the major groove on opposite sides of the DNA helix. Once again, specificity is provided by contacts made between amino acid side chains on the
FIGURE 17-8 Helix-loop-helix motif. In this case, we ag
in both DNA recognition and, in combination with a second, shorter,
Perspectives on DNA recognition and implications for transcriptional activation. Cell 77: 451, figure 2A. Copyright © 1994. Used with permission from Elsevier.)
Leucine Zipper Motif. This motif combines dimerization and DNA-binding surfaces within a single structural unit. As shown in Figure 17-7, two long α helices form a pincer-like structure that grips the DNA, with each α helix inserting into the major groove half a turn apart. Dimerization is mediated by another region within those same α helices: in this region they form a short stretch of coiled coil, wherein the two helices are held together by hydrophobic interactions between appropriately-spaced leucine (or other hydrophobic) residues. We discussed this protein-protein interaction in more detail in Chapter 5 (Figure 5-15). Leucine-zipper-containing proteins often form heterodimers as well as homodimers. That is also true of our final category, the so-called helix-loop-helix proteins (HLH proteins).
Helix-Loop-Helix Proteins. As in the example of the leucine zipper, an extended α helical region from each of two monomers inserts into the major groove of the DNA. As shown in Figure 17-8, the dimerization surface is formed from two helical regions: the first is part of the same helix involved in DNA recognition; the other is a shorter α helix. These two helices are separated by a flexible loop that allows them to pack together (and gives the motif its name). Leucine zipper and HLH proteins are often called basic zipper and basic HLH proteins: this is because the region of the α helix that binds DNA contains basic amino acid residues.
Activating Regions Are Not Well-Defined Structures
In contrast to DNA-binding domains, activating regions do not always have well-defined structures. They have been shown to form helical structures when interacting with their targets within the transcriptional machinery, but it is believed those structures are "induced" by that binding. As we shall see, the lack of defined structure is consistent with the idea that activating regions are adhesive surfaces capable of interacting with several other protein surfaces. Instead of being characterized by structure, therefore, activating regions are grouped on the basis of amino acid content. The activating region of Gal4, for example, is called an "acidic" activating region, reflecting a preponderance of acidic amino acids. The importance of these acidic residues is highlighted by mutations that increase the activator's potency: such mutations invariably increase the overall acidity (negative charge) of the activating region. But despite this, the activating region contains equally critical hydrophobic residues. Many other activators have acidic activating regions like Gal4. Although these show little sequence similarity, they retain the characteristic pattern of acidic and hydrophobic residues. It is believed that activating regions consist of reiterated small units, each of which has a weak activating capacity on its own. Each unit is a short sequence of amino acids. The greater the number of units, and
the more acidic each unit, the stronger the resulting activating region. This is consistent with the idea that activating regions lack an overall structure and act simply as rather indiscriminate "sticky" surfaces. (To understand this reasoning, imagine instead that an activating region folded into a precise, stable three-dimensional structure, comparable to, for example, a DNA-binding domain. Under those circumstances, fragments of that domain would not be expected to retain a fraction of the DNA-binding activity of the intact domain; rather, the entire domain is needed for any significant activity. But if each activating region is simply a general adhesive surface, it is easy to imagine it being made up of smaller, weaker units.) There are other kinds of activating regions. These include glutamine-rich activating regions such as that found on the mammalian activator SP1. Also, Pro-rich activating regions have been described, for example on another mammalian activator, CTF1. These too lack defined structure. In general, whereas acidic activating regions are typically strong and work in any eukaryotic organism in which they have been tested, other activating regions are weaker and work less universally than members of the acidic class.
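The idea that an acidic activating region is built from small, weak, additive units can be made concrete with a toy calculation. Everything in the sketch below is an assumption made for illustration: the unit sequences are invented, and scoring a unit by counting its acidic residues (D and E) is a crude stand-in for activating capacity that ignores the hydrophobic residues the text describes as equally critical.

```python
ACIDIC_RESIDUES = set("DE")  # aspartate and glutamate, one-letter codes


def unit_strength(unit: str) -> int:
    """Toy score for one unit: its number of acidic residues (an assumption)."""
    return sum(1 for residue in unit.upper() if residue in ACIDIC_RESIDUES)


def region_strength(units) -> int:
    """Toy score for a whole activating region: the sum over its units."""
    return sum(unit_strength(unit) for unit in units)


if __name__ == "__main__":
    # Hypothetical units; these are not sequences taken from Gal4.
    units = ["DDFE", "EDLW", "DFDL"]
    print("intact region:", region_strength(units))
    print("first unit only:", region_strength(units[:1]))
    print("first two units:", region_strength(units[:2]))
```

In this additive picture a fragment of the region keeps a proportional share of the activity, and adding units or making them more acidic raises the score. That is the behavior expected of a general adhesive surface and not of a precisely folded domain.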
RECRUITMENT OF PROTEIN COMPLEXES TO GENES BY EUKARYOTIC ACTIVATORS
Activators Recruit the Transcriptional Machinery to the Gene
We saw in bacteria that, in the most common case, an activator stimulates transcription of a gene by binding to DNA with one surface, and with another, interacting with RNA polymerase and recruiting the enzyme to that gene (see Chapter 16, Figure 16-1). Eukaryotic activators also work this way, but rarely, if ever, through a direct interaction between the activator and RNA polymerase. Instead, the activator recruits polymerase indirectly in two ways. First, the activator can interact with parts of the transcription machinery other than polymerase, and, by recruiting them, recruit polymerase as well. Second, activators can recruit nucleosome modifiers that alter chromatin in the vicinity of a gene and thereby help polymerase bind. In many cases, a given activator can work in both ways. We first consider recruitment of the transcriptional machinery. The eukaryotic transcriptional machinery contains numerous proteins in addition to RNA polymerase, as we saw in Chapter 12. Many of these proteins come in preformed complexes such as the Mediator and the TFIID complex (see Table 12-2 and Figure 12-16 in Chapter 12). Activators interact with one or more of these complexes and recruit them to the gene (Figure 17-9). Other components that are not directly recruited by the activator bind cooperatively with those that are recruited. This means that, despite the large number of components needed to transcribe a gene, activators may have to recruit only a relatively few entities. Indeed, according to one view, most of the machinery comes to the gene in a single, very large complex called the holoenzyme, which contains the Mediator, RNA polymerase, and some of the general transcription factors (as we described in Chapter 12). This leaves just a couple of other complexes to arrive separately, such as TFIID and TFIIE.
These latter components may be recruited themselves by activators or bind cooperatively with holoenzyme.
FIGURE 17-9 Activation of transcription initiation in eukaryotes by recruitment of the transcription machinery. A single activator is shown recruiting two possible target complexes: the Mediator, and, through that, RNA polymerase II; and also the general transcription factor TFIID. Other general transcription factors are recruited as part of the Mediator/Pol II complex (holoenzyme); separately (through direct recruitment by the activator); or bind spontaneously in the presence of the recruited components. These are not shown here. In reality, this recruitment would usually be mediated by more than one activator bound upstream of the gene.
FIGURE 17-10 Activation of transcription through direct tethering of mediator to DNA. This is an example of an activator bypass experiment, as described in Chapter 16, Box 16-2. In this case, the GAL1 gene is
Whatever the precise details, an activator promotes formation of the entire preinitiation complex by recruiting one or more of the constituents to the promoter. Many proteins in the transcriptional machinery have been shown to bind to activating regions in vitro. For example, a typical acidic activating region can interact with components of the Mediator and with subunits of TFIID.

Recruitment can be visualized using the technique called chromatin immunoprecipitation (ChIP), described in Box 17-2, Chromatin Immunoprecipitation. This technique reveals when a given protein binds to a defined region of DNA within a cell. At most genes, the transcriptional machinery appears at the promoter only upon activation of the gene. That is, the machinery is not prebound, and so activation is not typically mediated by an alternative mechanism we encountered in rare cases in bacteria: the allosteric modification of prebound polymerase.

In bacteria we saw that genes activated by recruitment (such as the lac genes) can be activated in so-called activator bypass experiments (Box 16-2). In such an experiment, activation is observed when RNA polymerase is recruited to the promoter without using the natural activator-polymerase interaction. Similar experiments work in yeast. Thus, the GAL1 gene (normally activated by Gal4) can be activated equally well by a fusion protein containing the DNA-binding domain of the bacterial protein LexA fused directly to a component of the Mediator complex (Figure 17-10).

It is important to note that these experiments do not exclude the possibility that at least some activators not only recruit parts of the transcriptional machinery but also induce allosteric changes in them. Such changes might stimulate the efficiency of transcription initiation. Nevertheless, the recruitment of the machinery to one or another gene is the basis of specificity; that is, which gene is activated depends on which gene has the machinery recruited to it. Also, the success of the activator bypass experiments suggests that any allosteric events that might exist are not essential for successful gene expression in these cases.

Box 17-2 Chromatin Immunoprecipitation

This technique, often just called ChIP, enables an investigator to identify where a given protein is bound in the genome of a living cell. Thus, for example, it is possible to determine whether components of the transcriptional machinery are bound to a given promoter at a given time. It is also possible to determine whether a specific regulatory protein is bound at a given gene, and so on.

In outline, the technique is performed as follows: formaldehyde is added to cells, cross-linking to the DNA any proteins that are bound to DNA at that moment. The cells are then lysed and the DNA is broken into small fragments (200-300 bp each). Using an antibody specific for the protein of interest, the fragments of DNA attached to that protein can be separated from the majority of the DNA in the cell. The cross-linking is then reversed and the protein removed. To determine whether a particular region of DNA is bound by the protein, PCR is performed (Chapter 20) using primers designed [...]

Although this technique is very powerful and routinely used, it does have limitations of which the investigator needs to be aware. First, the resolution of the method is limited. It is not possible to show that a protein is bound to a specific site, merely that it is bound to a site within a given 200-300 bp fragment. Thus, it is adequate to show that a regulatory protein is bound upstream of one rather than another gene, but it does not show you exactly where upstream of the gene the protein is bound. Second, only proteins for which antibodies are available can be looked at. Even more important, proteins can only be identified if the relevant epitope recognized by the antibody is exposed when the protein in question is cross-linked to the DNA (and perhaps to other proteins with which it interacts at the gene).

In an extension of this complication, if a given protein is not detected under one environmental or physiological condition and then is detected under another, the obvious interpretation is that the protein in question binds to that region of DNA only in response to the change in environmental conditions. But it might be that it is bound all the time and undergoes a conformational change in response to the change in conditions, and only then is the epitope revealed. Or the epitope may be concealed by another protein under one set of conditions but not the other.

BOX 17-2 FIGURE 1 (panel labels: immunoprecipitate DNA-protein complex; amplify DNA by PCR)
Activators also Recruit Nucleosome Modifiers that Help the Transcription Machinery Bind at the Promoter

In addition to direct recruitment of the transcriptional machinery, recruitment of nucleosome modifiers can help activate a gene inaccessibly packaged within chromatin. As discussed in Chapter 7 (Table 7-7), nucleosome modifiers come in two types: those that add chemical groups to the tails of histones, such as histone acetyl transferases (HATs), which add acetyl groups; and those that remodel the nucleosomes, such as the ATP-dependent activity of SWI/SNF. How do these modifications help activate a gene?

There are two basic models to explain how changes in nucleosomes can help the transcriptional machinery bind at the promoter (Figure 17-11). First, remodeling, and certain modifications, can uncover DNA-binding sites that would otherwise remain inaccessible within the nucleosome. For example, by increasing the mobility of nucleosomes, remodelers free up binding sites for regulators and for the transcription machinery. Similarly, the addition of acetyl groups to histone tails alters the interactions between those tails and adjacent nucleosomes. This modification is also believed to "loosen" chromatin structure, freeing up sites (see Chapter 7 for a more complete description).

But adding acetyl groups also helps binding of the transcriptional machinery (and other proteins) in another way: it creates specific binding sites on nucleosomes for proteins bearing so-called bromodomains (Figure 7-39). One component of the TFIID complex bears bromodomains, and so binds to acetylated nucleosomes better than to unacetylated ones. Thus, a gene bearing acetylated nucleosomes at its promoter will likely have a higher affinity for the transcriptional machinery than one with unacetylated nucleosomes.

Which parts of the transcription machinery, and which nucleosome modifiers, are required to transcribe a given gene? And which components are directly recruited by a given activator working at a given gene? The answers to these questions are not known in most cases, but some components of the transcriptional machinery are more stringently required at some genes than at others, and the same applies to nucleosome modifiers. These differences are in many cases not absolute. Thus, while all genes absolutely require RNA polymerase itself, a given gene may depend on another particular component of the transcription machinery, or a nucleosome modifier, or it may not. In some cases, a component of the transcription machinery might be required partially (that is, in the absence of that component, activation is reduced but not eliminated).

In addition, what is needed to activate a given gene can vary depending on circumstances, such as the stage of the cell cycle. For example, Gal4 usually activates the GAL1 gene efficiently in the absence of a histone acetylase. During mitosis, however, when chromatin is more condensed (Chapter 7), activation is eliminated unless that acetylase is recruited to the gene.
FIGURE 17-11 Local alterations in chromatin structure directed by activators. Activators, capable of binding to their sites on DNA within a nucleosome, are shown bound upstream of a promoter that is [...] allow binding of the transcription machinery to the promoter. (Figure labels: chromatin remodeling complex; histone acetylation; remodeled nucleosomes; transcription machinery binds promoter.)

Action at a Distance: Loops and Insulators

Many eukaryotic activators, particularly in higher eukaryotes, work from a distance. Thus, in a mammalian cell, for example, enhancers can be found several tens or even hundreds of kb upstream (or downstream) of the genes they control. We saw in bacteria that proteins bound to separated sites on DNA can nevertheless interact, a reaction accommodated by DNA looping. But in those cases, we were considering proteins binding only a few hundred base pairs apart. Under that condition, the proteins are bound sufficiently close to each other that their chance of interacting is much higher on DNA than off it. Once the sites to which they bind are separated [...] to help communication between distantly bound proteins.

Recall, from bacteria, one way this can be done. The "architectural" protein IHF binds to sites on DNA and bends it. At some genes controlled by NtrC, IHF sites are found between the activator-binding sites and the promoter. By bending the DNA, IHF helps the DNA-bound activator reach RNA polymerase at the promoter (see Chapter 16, Figure 16-4).

Various models have been proposed to explain how proteins binding in between enhancers and promoters might help activation in the cells of higher eukaryotes. In Drosophila, the cut gene is activated from an enhancer some 100 kb away. A protein called Chip (nothing to do with the technique of that name!) aids communication between enhancer and gene. Thus, mutants in the gene encoding Chip affect the strength of activation. How Chip works is still not clear, but one model is that it binds to multiple DNA sites between the enhancer and the promoter and, by interacting with itself, forms multiple miniloops in the intervening DNA, the cumulative effect of which is to bring the promoter and enhancer into closer proximity.

There are other models. In eukaryotes, the DNA is wrapped in nucleosomes, as we have seen, and the histones within those nucleosomes are subject to various modifications that affect their disposition and compactness. Thus, sites separated by many base pairs may not, in effect, be as far apart in the cell as might have been thought. Also, chromatin may in some places form special structures that actively bring enhancers and promoters closer together.

If an enhancer activates a specific gene 50 kb away, what stops it from activating other genes whose promoters are within that range? Specific elements called insulators control the actions of activators. When placed between an enhancer and a promoter, an insulator inhibits activation of the gene by that enhancer. As shown in Figure 17-12, the insulator does not inhibit activation of that same gene by a different enhancer, one placed downstream of the promoter; nor does it inhibit the original activator from working on a different gene. Thus, the proteins that bind insulators do not actively repress the promoter, nor do they inhibit the activities of the activators. Rather, they block communication between the two.

In other assays, insulators also seem able to inhibit the spread of chromatin modifications. As we have seen, the modification state of local chromatin influences whether genes are expressed or not. We will see below that propagation of certain repressing histone modifications over stretches of chromatin lies at the heart of a phenomenon called transcriptional silencing. Silencing is a specialized form of repression that can spread along chromatin, switching off multiple genes without the need for each to bear binding sites for specific repressors. Insulator elements can block this spreading, so insulators protect genes from both indiscriminate activation and repression.

This situation has consequences for some experimental manipulations. A gene inserted at random into the mammalian genome is often "silenced" because it becomes incorporated into a particularly dense form of chromatin called heterochromatin. But if insulators are placed up- and downstream of that gene, they protect it from silencing.

FIGURE 17-12 Insulators block activation by enhancers. [...] (Panels a to d show different arrangements of enhancer, insulator, and promoter, each marked ON or OFF.)
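The positional logic of enhancer blocking described above can be written down in a few lines. The following Python sketch is only an illustration: the function name, the coordinates (in kb), and the rule that an insulator blocks an enhancer solely when it lies between that enhancer and the promoter are simplifying assumptions, not a description of how insulators act mechanistically.

```python
def enhancer_can_activate(enhancer_kb, promoter_kb, insulators_kb=()):
    """Return True if no insulator lies between the enhancer and the promoter.

    Positions are hypothetical coordinates along the DNA, in kb.
    Assumption: an insulator blocks communication only when it sits
    strictly between the two elements; it neither represses the promoter
    nor inactivates the enhancer.
    """
    low, high = sorted((enhancer_kb, promoter_kb))
    return not any(low < position < high for position in insulators_kb)


# Hypothetical layout, loosely following the arrangements in Figure 17-12.
upstream_enhancer = 0
promoter = 50
insulator = 25
downstream_enhancer = 60   # a different enhancer, beyond the promoter
other_promoter = -30       # a different gene on the far side of the enhancer

# No insulator present.
print(enhancer_can_activate(upstream_enhancer, promoter))
# Insulator between enhancer and promoter.
print(enhancer_can_activate(upstream_enhancer, promoter, [insulator]))
# Same insulator, but the enhancer is on the other side of the promoter.
print(enhancer_can_activate(downstream_enhancer, promoter, [insulator]))
# Same enhancer working on a different gene.
print(enhancer_can_activate(upstream_enhancer, other_promoter, [insulator]))
```

In this toy model the insulator changes the outcome only in the second case, which matches the observation that insulators block communication without repressing the promoter or disabling the activator.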
Appropriate Regulation of Some Groups of Genes Requires Locus Control Regions

The human globin genes are expressed in red blood cells of adults and in various cells in the lineage that forms red blood cells during development. There are five different globin genes in humans (Figure 17-13a). Although clustered, these genes are not all expressed at the same time. Rather, the different genes are expressed at different stages of development, starting with ε, then the γ genes, followed by β and δ. How is their expression regulated?

Each gene has its own collection of regulatory sites needed to switch that gene on at the right time during development and in the proper tissues. Thus the β-globin gene (which is expressed in adult bone marrow) has two enhancers: one upstream of the promoter, the other downstream. Only in adult bone marrow are the correct regulators all active and present in appropriate concentrations to bind these enhancers. But more than this is required to switch on these genes in the correct order.

A group of regulatory elements collectively called the locus control region, or LCR, is found 30-50 kb upstream of the whole cluster of globin genes. How the LCR works is still unclear, but it binds regulatory proteins that cause the chromatin structure around the whole globin gene cluster to "open up," allowing access to the array of regulators that control expression of the individual genes in a defined order.
FIGURE 17-13 Regulation by LCRs. The human globin genes, and the LCR that ensures their ordered expression, are shown in part (a). Not shown is the α-globin gene, which is expressed throughout development; its product combines with that of the globins shown here, in turn, to produce different forms of hemoglobin at different stages of development. In part (b) are the globin genes from mice, which are also regulated by an LCR. In part (c) is shown the HoxD gene cluster from the mouse, controlled by an element called the GCR which, like the LCRs, appears to impose ordered expression on the gene cluster. (Panel labels: (a) locus control region (LCR), cluster of globin genes, 10 kb scale bar; (b) mouse β-globin locus, LCR, 10 kb scale bar; (c) mouse HoxD locus, GCR, 20 kb scale bar.)
The LCR is made up of multiple sequence elements. Some of these have the properties of enhancers: that is, if those sequences are attached experimentally upstream of a reporter gene, they can activate that gene. Other parts of the LCR act more like insulator elements, and still others seem to have properties of promoters. This diversity of elements has led to numerous models for how LCRs might work.

The simplest is that regulatory proteins bind to the LCR and recruit chromatin modifying complexes to the region. Recent experiments have used techniques that allow the locations of the LCR and promoter to be visualized in cells during activation. These suggest the LCR is in close proximity to each promoter as that promoter is activated, consistent with the idea that proteins bound at the LCR interact with others at the promoter.

Another model has been proposed, however, in which the entire transcriptional machinery is recruited to the LCR and from there transcribes all the way through the locus, opening up the chromatin as it goes and freeing up the local control elements in front of each gene. These individual promoters would then produce high-level expression of each gene as required.

Figure 17-13b shows the mouse globin genes and their associated LCR, and Figure 17-13c shows another group of mouse genes whose expression is regulated in a temporally and spatially ordered sequence. These are the so-called HoxD genes. They are involved in patterning the developing embryo (Chapter 19, Box 19-3). The HoxD genes are controlled by an element called the GCR (global control region) in a manner very like that seen with the globin genes and their LCR.
SIGNAL INTEGRATION AND COMBINATORIAL CONTROL

Activators Work Together Synergistically to Integrate Signals

In bacteria we saw examples of signal integration in gene regulation. Recall, for example, that the lac genes of E. coli are efficiently expressed only when both lactose is present and glucose absent. The two signals are communicated to the gene through separate regulators: one an activator and the other a repressor.

In multicellular organisms signal integration is used extensively. In some cases numerous signals are required to switch a gene on. But just as in bacteria, each signal is transmitted to the gene by a separate regulator, so at many genes multiple activators must work together to switch the gene on.

When multiple activators work together, they do so synergistically. That is, the effect of, say, two activators working together is greater (usually much greater) than the sum of each of them working alone. Synergy can result from multiple activators recruiting a single component of the transcriptional machinery; multiple activators each recruiting a different component; or multiple activators helping each other bind to their sites upstream of the gene they control. We briefly consider all three strategies before giving examples.

Two activators can recruit a single complex (for example, the Mediator) by touching different parts of it. The combined binding energy will have an exponential effect on recruitment (see Chapter 3, Table 3-1). In cases where the activators recruit different complexes (neither of which would bind efficiently without help), synergy is even easier to picture.
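The exponential effect of combined binding energy can be made concrete with a small calculation. The Python sketch below is illustrative only: it assumes that the interaction energies of two activators with a single complex simply add, that RT is about 0.593 kcal/mol (25 °C), and that each activator contributes a hypothetical -1.4 kcal/mol. None of these numbers describes a real activator.

```python
import math

# Assumption: RT is about 0.593 kcal/mol at 25 degrees C. The energies used
# below are illustrative values, not measurements for any real activator.
RT_KCAL_PER_MOL = 0.593


def fold_recruitment(interaction_energies_kcal):
    """Fold increase in the equilibrium constant for recruiting one complex.

    Each entry is the favorable free energy (negative, in kcal/mol) that one
    activator contributes by touching the complex. Because the energies add,
    the fold change exp(-sum(dG) / RT) multiplies.
    """
    total = sum(interaction_energies_kcal)
    return math.exp(-total / RT_KCAL_PER_MOL)


one_activator = fold_recruitment([-1.4])
two_activators = fold_recruitment([-1.4, -1.4])

print(f"one activator alone:      {one_activator:.1f}-fold")
print(f"two activators together:  {two_activators:.1f}-fold")
print(f"sum of two working alone: {2 * one_activator:.1f}-fold")
```

Under these assumptions a single activator improves recruitment roughly tenfold, whereas two together improve it by the square of that factor, far more than the sum of their separate effects. That is the sense in which the activators act synergistically.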
Synergy can also result from activators helping each other bind under conditions where the binding of one depends on binding of the other. This cooperativity can be of the type we encountered in bacteria, whereby the two activators touch each other when they bind their sites on DNA. But it can work in other ways as well: one activator can recruit something that helps the second activator bind. Figure 17-14 illustrates the different ways activators help each other bind DNA. These include "classical" cooperative binding; recruitment of a modifier by one activator to help a second bind; and binding of one activator to nucleosomal DNA uncovering the binding site for another.

Synergy is critical for signal integration by activators. Consider a gene whose product is only needed when two signals are received. Each signal is communicated to the gene by a separate activator. The gene must be efficiently expressed when both activators are present but be relatively impervious to the action of either activator alone.
FIGURE 17-14 Cooperative binding of activators. Here are shown four ways that the binding of one protein to a site on DNA can help the binding of another to a nearby site. In part (a) is shown cooperative binding through direct interaction between the two proteins, as we saw for λ repressor in Chapter 16, and will see between many regulators in eukaryotes as well. In (b) a similar effect is achieved by both proteins interacting with a common third protein. Parts (c) and (d) show indirect effects in which binding of one protein to its site on DNA within nucleosomes helps binding of a second protein. In (c) the first protein recruits a nucleosome remodeller whose action reveals a binding site for a second protein. In part (d) the binding of the first protein to its site occurs because that site is on the DNA just where it exits the nucleosome. By binding there, it unwinds the DNA from the nucleosome a little, revealing the binding site for the second protein. Each of these mechanisms can explain how one regulator can help others bind, or indeed, how an activator can help the transcription machinery bind to a promoter.
Signal Integration: the HO Gene Is Controlled by Two Regulators; One Recruits Nucleosome Modifiers and the Other Recruits Mediator

The yeast S. cerevisiae divides by budding. That is, instead of dividing to produce two identical daughter cells, the so-called mother cell buds to produce a daughter cell. We will focus here on the expression of a gene called HO. (We need not concern ourselves with the function of this gene, which is described in Chapter 11.) The HO gene is expressed only in mother cells and only at a certain point in the cell cycle. These two conditions are communicated to the gene through two activators: SWI5 and SBF. SWI5 binds to multiple sites some distance from the gene, the nearest being more than 1 kb from the promoter (Figure 17-15). SBF also binds multiple sites, but these are located closer to the promoter.

Why does expression of the gene depend on both activators? SBF (which is active only at the correct stage of the cell cycle) cannot bind its sites unaided; their disposition within chromatin prohibits it. SWI5 (which acts only in the mother cell) can bind to its sites unaided but cannot, from that distance, activate the HO gene (remember that in yeast, activators do not work over long distances). SWI5 can, however, recruit nucleosome modifiers (a histone acetyl transferase followed by the remodelling enzyme SWI/SNF). These act on nucleosomes over the SBF sites. Thus, if both activators are present and active, the action of SWI5 enables SBF to bind, and that activator, in turn, recruits the transcriptional machinery (by directly binding Mediator) and activates expression of the gene.
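The dependence of HO expression on both activators can be summarized as a short chain of conditions. The following Python sketch is a deliberately crude, all-or-nothing rendering of the steps described above; the function and variable names are hypothetical, and real activation is of course quantitative rather than Boolean.

```python
def ho_expressed(is_mother_cell, at_correct_cell_cycle_stage):
    """Boolean sketch of signal integration at the HO gene.

    Assumptions (simplifications of the text): each activator is either
    fully active or absent, and each step either happens or does not.
    """
    swi5_active = is_mother_cell                 # SWI5 acts only in the mother cell
    sbf_active = at_correct_cell_cycle_stage     # SBF is active only at the right stage

    # SWI5 binds its distant sites unaided and recruits a histone acetyl
    # transferase followed by SWI/SNF, which act on nucleosomes over the SBF sites.
    sbf_sites_accessible = swi5_active

    # SBF can bind only once its sites have been made accessible.
    sbf_bound = sbf_active and sbf_sites_accessible

    # Bound SBF recruits the transcriptional machinery by binding Mediator.
    mediator_recruited = sbf_bound
    return mediator_recruited


for mother in (False, True):
    for stage in (False, True):
        print(f"mother cell={mother}, correct stage={stage}: "
              f"HO on={ho_expressed(mother, stage)}")
```

Only the combination of mother cell and correct cell-cycle stage switches the gene on in this sketch, because SWI5 alone is too far away to activate and SBF alone cannot bind.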
FIGURE 17-15 Control of the HO gene. SWI5 can bind its sites within chromatin unaided, but SBF cannot. Remodellers and histone acetylases recruited by SWI5 alter nucleosomes over the SBF sites, allowing that activator to bind near the promoter and activate the gene. In the figure, for simplicity, the nucleosomes are not drawn. (Source: Adapted from Ptashne M. and Gann A. 2002. Genes & Signals, p. 95, Fig 2-18. © Cold Spring Harbor Laboratory Press.) (Figure labels: chromatin remodeling complex; histone acetylase; SWI5.)

Signal Integration: Cooperative Binding of Activators at the Human β-Interferon Gene

The human β-interferon gene is activated in cells upon viral infection. Infection triggers three activators: NFκB, IRF, and Jun/ATF. These proteins bind cooperatively to sites adjacent to one another within an enhancer located about 1 kb upstream of the promoter. The structure formed by these regulators bound to the enhancer is called an enhanceosome (Figure 17-16).

The binding of the activators is cooperative for two reasons. First, the activators interact with each other. Second, an additional protein, called HMG-I, binds within the enhancer and aids binding of the activators by bending the DNA in a way that facilitates the interactions among them. HMG-I, which is constitutively active in the cell, thus has an architectural role in the process. These layers of cooperativity ensure tight integration of signals: for the gene to be activated, all three activators and HMG-I must be present. Once the enhanceosome has formed, the activators within it contact the transcriptional machinery and activate the gene.
Combinatorial Control Lies at the Heart of the Complexity and Diversity of Eukaryotes

We encountered simple cases of combinatorial control in bacteria. For example, CAP is involved in regulating many genes, in collaboration with other regulators. At the lac genes it works with the Lac repressor; at the gal genes with the Gal repressor. There is extensive combinatorial control in eukaryotes. We first consider a generic case (Figure 17-17). Gene A is controlled by four signals (1, 2, 3, and 4), each working through a separate activator (activators 1, 2, 3, and 4). Gene B is controlled by three signals (3, 5, and 6), working through activators 3, 5, and 6. Note that there is one signal in common between these two cases, and the activator through which that signal works is the same at both genes.

In complex multicellular organisms, such as Drosophila and humans, combinatorial control involves many more regulators and genes than shown in this kind of example; and, of course, repressors as well as activators can be involved. How is it that the regulators can intermix so promiscuously?
FIGURE 17-16 The human β-interferon enhanceosome. Cooperative binding of the three activators, together with the architectural protein HMG-I, activates the β-interferon gene. (Figure labels: IFN-β gene; enhancer; IRF.)

FIGURE 17-17 Combinatorial control. Two genes are shown, each controlled by multiple signals: four in the case of gene A, three in the case of gene B. Each signal is communicated to a gene by one regulatory protein. Regulatory protein 3 acts at both genes, in combination with different additional regulators in the two cases.
As we discussed above, multiple activators work synergistically. In fact, even multiple copies of a single activator work synergistically, suggesting that a given activator can interact with multiple targets. This provides an explanation for why different regulators can work together in so many combinations: because each can use any of an array of targets, the combinations that work together are unrestricted.

Both the examples of signal integration we considered above (the HO gene in yeast and the human β-interferon gene) involve activators that also regulate other genes in examples of combinatorial control. Thus, from the yeast example, SWI5 is involved in regulating several other genes. And in the mammalian case, NFκB regulates not only the β-interferon gene but numerous other genes, including the immunoglobulin κ light chain gene in B cells. Jun/ATF, likewise, works with other regulators to control other genes.

We described earlier that some DNA-binding proteins bind as heterodimers with alternative partners. This offers another level of combinatorial control.
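The generic case of Figure 17-17 can be restated as a small lookup table. The Python sketch below is an illustration under a stated assumption: that a gene is switched on only when every one of its signals is present. The text says only that each gene is controlled by these signals, so the all-required rule, like the names used here, is a simplification introduced for the example.

```python
# Signals (and hence activators) controlling each gene, as in the generic case.
GENE_SIGNALS = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({3, 5, 6}),
}


def expressed_genes(signals_present):
    """Genes switched on by a given set of signals.

    Assumption: a gene is expressed only if all of its signals are present.
    """
    present = set(signals_present)
    return sorted(gene for gene, needed in GENE_SIGNALS.items() if needed <= present)


def shared_signals(gene_x, gene_y):
    """Signals, and therefore regulators, used at both genes."""
    return sorted(GENE_SIGNALS[gene_x] & GENE_SIGNALS[gene_y])


print(expressed_genes({1, 2, 3, 4}))         # signals for gene A only
print(expressed_genes({3, 5, 6}))            # signals for gene B only
print(expressed_genes({1, 2, 3, 4, 5, 6}))   # every signal present
print(shared_signals("A", "B"))              # the regulator common to both genes
```

Six regulators suffice here to give two genes distinct patterns of expression, with one regulator reused at both. Extending the table shows how a modest number of regulators can specify many different expression patterns.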
Combinatorial Control of the Mating-Type Genes from Saccharomyces cerevisiae

The yeast S. cerevisiae exists in three forms: two haploid cells of different mating types, a and α, and the diploid formed when an a and an α cell mate and fuse. Cells of the two mating types differ because they express different sets of genes: a specific genes and α specific genes. These genes are controlled by activators and repressors in various combinations, as we now briefly describe.

The a cell and the α cell each encode cell type specific regulators: a cells make the regulatory protein a1; α cells make the proteins α1 and α2. A fourth regulatory protein, called Mcm1, is also involved in regulating the mating-type specific genes (and many other genes) and is present in both cell types. How do these various regulators work together to ensure that in a cells, a specific genes are switched on and α specific genes are off; vice versa in α cells; and in diploid cells, both sets are kept off?

The arrangement of regulators at the promoters of a specific genes and α specific genes is shown in Figure 17-18.

• In a cells, the α specific genes are off because no activators are bound there, while the a specific genes are on because Mcm1 is bound and activates those genes.

• In α cells, the α specific genes are on because Mcm1 is bound upstream and activates them. At these genes, Mcm1 binds to a weak site and does so only when it binds cooperatively with a monomer of the protein α1. This ensures that Mcm1 activates these genes only in α cells. The a specific genes are kept off in α cells by the repressor α2. This repressor binds, as a dimer, cooperatively with Mcm1 at these genes. Two properties of α2 ensure a specific genes are not expressed here: it covers the activating region of Mcm1, preventing that protein from activating; and it also actively represses the genes. The mechanism by which α2 acts as a repressor is described in the next section.

• In diploid cells, both a and α specific genes are off. This is done as follows: the a specific genes bind Mcm1 and α2, just as they do in α cells. This keeps those genes off. The α specific genes are off because, as in a cells, no activators bind there.

• Both the haploid cell types (a and α) express another class of genes called haploid-specific genes. These are switched off in the diploid cell by α2, which binds upstream of them as a heterodimer with the a1 protein. Only in diploid cells are both these regulators present.

FIGURE 17-18 Control of cell-type specific genes in yeast. As described in detail in the text, the three cell types of the yeast S. cerevisiae (the haploid a and α cells, and the a/α diploid) are defined by the sets of genes they express. One ubiquitous regulator (Mcm1) and three cell-type specific regulators (a1, α1, and α2) together regulate three classes of target genes. The MAT locus is the region of the genome which encodes the [...] (Figure rows: cell type; MAT locus; gene regulatory proteins; target genes. aSG, a specific genes; αSG, α specific genes; hSG, haploid specific genes; each marked ON or OFF for the a cell, the α cell, and the a/α diploid.)
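The rules in the list above amount to a small truth table, which the following Python sketch spells out. It is an illustration only. The regulator names are written in ASCII ("alpha1" for α1, and so on), the function name is hypothetical, and two assumptions go beyond the text: haploid-specific genes are taken to be on whenever the a1/α2 heterodimer is absent, and α1 is left out of the diploid's regulator set, in keeping with the statement that no activator binds the α specific genes in diploid cells.

```python
def target_gene_states(regulators):
    """Map the regulators present in a cell to ON (True) / OFF (False) states."""
    present = set(regulators)
    has_mcm1 = "Mcm1" in present

    # a specific genes: activated by Mcm1 unless alpha2 binds alongside it.
    a_specific_on = has_mcm1 and "alpha2" not in present
    # alpha specific genes: Mcm1 binds its weak site only together with alpha1.
    alpha_specific_on = has_mcm1 and "alpha1" in present
    # haploid specific genes: switched off by the a1/alpha2 heterodimer.
    # Assumption: they are on whenever that heterodimer cannot form.
    haploid_specific_on = not ({"a1", "alpha2"} <= present)

    return {"aSG": a_specific_on, "alphaSG": alpha_specific_on, "hSG": haploid_specific_on}


CELL_TYPES = {
    "a": {"a1", "Mcm1"},
    "alpha": {"alpha1", "alpha2", "Mcm1"},
    # Assumption: alpha1 is omitted here (see the note above the code).
    "a/alpha": {"a1", "alpha2", "Mcm1"},
}

for cell_type, regulators in CELL_TYPES.items():
    print(cell_type, target_gene_states(regulators))
```

Four regulators, one of them present in every cell, are enough in this sketch to give each of the three cell types its own pattern of expression across the three classes of target genes.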
TRANSCRIPTIONAL REPRESSORS

In bacteria we saw that many repressors work by binding to sites that overlap the promoter and thus block binding of RNA polymerase. But we also saw other ways they can work: they can bind to sites adjacent to promoters and, by interacting with polymerase bound there, inhibit the enzyme from initiating transcription. They can also interfere with the action of activators. In eukaryotes we see all these except the first (ironically the most common in bacteria).

We also see another form of repression, perhaps the most common in eukaryotes, which works as follows. As with activators, repressors can recruit nucleosome modifiers, but in this case the enzymes have the opposite effects to those recruited by activators: they compact the chromatin or remove groups recognized by the transcriptional machinery. So, for example, histone deacetylases repress transcription by removing acetyl groups from the tails of histones; as we have already seen, the presence of acetyl groups helps transcription. Other enzymes add methyl groups to histone tails, and this frequently represses transcription. These kinds of modification also form the basis of a type of repression called "silencing," which we consider in some detail later in this chapter.

These various examples of repression are shown schematically in Figure 17-19. Here we consider just one specific example, the repressor called Mig1 which, like Gal4, is involved in controlling the GAL genes of the yeast S. cerevisiae.
FI<;URE 17-19 Waysinwhich eukaryotfc repressors work. Transcription of eukdryotic genes can be repressed In various wa'j's. These indude Ihe loor mechanisrm. shov,,-¡ in lhe figure. Pan (a) sho::Jo..vs that. by anding te a site 011 DNA mat ovedaps lhe bmding <;Íte 01 an activamr, a repreSSQf can loolbn bu"dlng 01 the actlVillor 10 a gene, and mus block activatian of ¡hal gene. In a varialion on Ihis Iheme, a repressor can be a derivative of the same praein as the actNator, bUl lad !he é!CIÍ\Iaung region, In anolher vanallon, an activator thal binds lo DNA as a dimer can be inhibiled ¡rorn doing 50 by a deñvative that retains the regíen of lhe protein requíred far dimerizatioo, bu! iads Ihe ONA-binding domain. Sudl a derivative forms inactive helerodimers witl1 the actlvator. In pan (b), a repressor binds to a site on DNA be5lde an actl\lator and Interacts wilh ¡ha! activalor, ocduding Its actIvatmg region.ln pan (e), a repressor binds 10 a site lIpslream of a gene and, by interadilll .....;m the Irnnscnpllonal machlnery al Ihe prornoter In sorne specifie way, Ir;híbits transoiption initianon. Pan (d) shows !€pression by JeOU¡fing histone modiliers lha! alter I)lIdeosornes in ways thm Inhibit tr<'InscnpllOTl (ter example, deacetylatlCfl, as shown here, bul also methylation In sorne cases, or even remodeIing at sorne p(QlT)()\ers).
[Figure 17-19 panels: (a) competition, with an activator binding site overlapping a repressor binding site near the promoter; (b) inhibition; (c) direct repression; (d) indirect repression through a histone deacetylase]
[Figure 17-20 labels: UASG; Mig1 site]

FIGURE 17-20 Repression of the GAL1 gene in yeast. In the presence of glucose, Mig1 binds a site between the UASG and the GAL1 promoter. By recruiting the Tup1 repressing complex, Mig1 represses expression of GAL1. Repression is a result of deacetylation of local nucleosomes (Tup1 recruits a deacetylase), and also probably of directly contacting and inhibiting the transcription machinery. In an experiment not shown, if Tup1 is fused to a DNA-binding domain, and a site for that domain is placed upstream of a gene, expression of the gene is repressed.
Figure 17-20 shows the GAL genes as we saw them earlier (Figure 17-3), but with the addition of a site between the Gal4 binding sites and the promoter: this is where, in the presence of glucose, Mig1 binds and switches off the GAL genes. Thus, just as in E. coli, the cell only makes the enzymes needed to metabolize galactose if the preferred energy source, glucose, is not present. How does Mig1 repress the GAL genes? Mig1 recruits a "repressing complex" containing the Tup1 protein. This complex is recruited by many yeast DNA-binding proteins that repress transcription, including the α2 protein involved in controlling mating-type specific genes we described above. Tup1 also has counterparts in mammalian cells. Two mechanisms have been proposed to explain the repressing effect of Tup1. First, Tup1 recruits histone deacetylases, which deacetylate nearby nucleosomes. Second, Tup1 interacts directly with the transcription machinery at the promoter and inhibits initiation.
SIGNAL TRANSDUCTION AND THE CONTROL OF TRANSCRIPTIONAL REGULATORS

Signals Are Often Communicated to Transcriptional Regulators through Signal Transduction Pathways

As we have seen, whether or not a given gene is expressed very often depends on environmental signals. Signals come in many forms: they can, as we saw was typically the case in bacteria, be small molecules such as sugars. But they can also be proteins released by one cell and received by another. This is particularly common during the development of multicellular organisms (Chapter 18). There are various ways that signals are detected by a cell and communicated to a gene. In bacteria we saw that signals control the activities of regulators by inducing allosteric changes in those regulators. Often that effect is direct: a small molecular signal, such as a sugar, enters the cell and binds the transcriptional regulator directly. But we saw one example where the effect of the signal is indirect (control of the activator NtrC). In that case, the signal (low ammonia levels) induces a kinase that phosphorylates NtrC. This type of indirect signaling is an example of a signal transduction pathway.
The term "signal" refers to the initiating ligand itself, that is, the sugar or protein for example. This is how we have defined it previously. It can also refer to the "information" as it passes from detection of that ligand to the regulators that directly control the genes, that is, as it passes along a signal transduction pathway. In the simplest of bacterial cases there was no distinction of course, but once a signal transduction pathway is involved, there is. And in eukaryotes we will see, particularly in Chapter 18, that most signals are communicated to genes through signal transduction pathways, sometimes very elaborate ones. In this section we first look at a couple of cases of signals being passed along signal transduction pathways in eukaryotes. We then consider more generally how signals, emerging from such pathways, control the transcriptional regulators themselves.

In a signal transduction pathway, the initiating ligand is typically detected by a specific cell surface receptor: the ligand binds to an extracellular domain of the receptor and this binding is communicated to the intracellular domain. From there the signal is relayed to the relevant transcriptional regulator, often through a cascade of kinases. How is the binding of ligand to the extracellular domain communicated to the intracellular domain? This can be through an allosteric change in the receptor, whereby binding of ligand alters the shape (and thus activity) of the intracellular domain. Alternatively, the ligand can act simply to bring together two or more receptor chains, allowing interactions between the intracellular domains of those receptors to activate each other.

Figure 17-21 shows two examples of signal transduction pathways. The first is a relatively simple case, the STAT pathway (Figure 17-21a). In this example, a kinase is bound to the intracellular domain of a receptor.
When the receptor is activated by its ligand (a cytokine), it brings together two receptor chains and triggers the kinase to phosphorylate a particular sequence in the intracellular domain of the opposing receptor. This phosphorylated site is then recognized by a particular STAT protein which, once bound, gets phosphorylated itself. Once phosphorylated, the STAT dimerizes, moves to the nucleus, and binds DNA. The other example is more elaborate (Figure 17-21b): the MAP kinase pathway that controls activators such as Jun. In this case, the activated receptor induces a cascade of signaling events, ending in activation of a MAP kinase that phosphorylates Jun (and other transcriptional regulators). The most common way in which information is passed through signal transduction pathways is via phosphorylation, but proteolysis, dephosphorylation, and other modifications are also used.
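The relay character of the STAT pathway can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: the step labels, the function name `run_stat_pathway`, and the `kinase_active` switch are invented for this example, and each step is treated as all-or-nothing.

```python
# Ordered steps of the STAT pathway as described in the text.
# Step labels are illustrative, chosen for this sketch.
STAT_STEPS = [
    "cytokine brings two receptor chains together",
    "receptor-bound kinase phosphorylates the opposing receptor",
    "STAT binds the phosphorylated receptor site",
    "STAT is phosphorylated",
    "STAT dimerizes",
    "STAT moves to the nucleus",
    "STAT binds DNA",
]


def run_stat_pathway(cytokine_present, kinase_active=True):
    """Return the steps completed, in order, stopping where the relay breaks."""
    if not cytokine_present:
        return []
    completed = [STAT_STEPS[0]]
    if not kinase_active:
        # Hypothetical kinase-dead case: no phosphorylation, so nothing downstream occurs.
        return completed
    completed.extend(STAT_STEPS[1:])
    return completed
```

Calling `run_stat_pathway(True)` returns every step through DNA binding, whereas the hypothetical kinase-dead case stops after the receptor chains are brought together. This mirrors the point that each step depends on the one before it.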
Signals Control the Activities of Eukaryotic Transcriptional Regulators in a Variety of Ways

Once a signal has been communicated, directly or indirectly, to a transcriptional regulator, how does it control the activity of that regulator? In bacteria we saw that the allosteric changes that control transcriptional regulators very often affect the ability of the regulator to bind DNA. This is true in cases where the signaling ligand itself acts directly on the transcriptional regulator and in cases where the presence of the signaling ligand is communicated to the regulator through a signal transduction pathway. Thus, Lac repressor binds DNA only when free of allolactose, and phosphorylation
[Figure 17-21 labels: JAK; Ras; MAPK; cytoplasm; enters nucleus, binds DNA, and activates transcription]

FIGURE 17-21 Two signal transduction pathways from mammalian cells. Shown are the STAT [...] called SH2 domain. These recognize phosphorylated Tyr residues in certain sequence contexts, and this is the basis of [...]
of NtrC triggers an allosteric change controlling DNA binding by that activator. In eukaryotes, transcriptional regulators are not typically controlled at the level of DNA binding (though there are exceptions). Regulators are instead usually controlled in one of two basic ways:

Unmasking an Activating Region. This is done either by a conformational change in the DNA-bound activator, revealing a previously buried activating region; or by release of a masking protein that previously interacted with, and eclipsed, an activating region. The conformational changes required in each case can be triggered either by binding ligand directly or through a ligand-dependent phosphorylation. Gal4 is controlled by a masking protein. In the absence of galactose, Gal4 is bound to its sites upstream of the GAL1 gene, but it does not activate that gene because another protein, Gal80, binds to Gal4 and occludes its activating region. Galactose triggers the release of Gal80 and activation of the gene (Figure 17-22). In many cases the masking protein not only blocks the activating region but is itself (or recruits)
[Figure 17-22 panels: without galactose, GAL1 is OFF (basal); with galactose, GAL1 is ON. Labels: UASG; Mig1 site]

FIGURE 17-22 The yeast activator Gal4 is regulated by the Gal80 protein. Gal4 is active only in the presence of galactose. Even in the absence of galactose, Gal4 is found bound to its sites upstream of the GAL1 gene. But it does not under these circumstances activate that gene because the activating region is bound by a protein called Gal80. In the presence of galactose, Gal80 undergoes a conformational change, the activating regions are revealed, and the GAL1 gene is activated. In the figure, Gal80 is shown dissociating from Gal4 in the presence of galactose. In reality it is thought to change its position and weaken its binding, but not completely fall off. As shown, Mig1 is not bound at its site because there is no glucose present (see Figure 17-20).
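The combined control of GAL1 by Gal4, Gal80, and Mig1 amounts to a small piece of logic, sketched below in Python. This is an illustrative simplification, not a quantitative model: the function name `gal1_state` and the three output labels are assumptions made for this example, and each protein is treated as simply bound or not bound.

```python
def gal1_state(glucose, galactose):
    """Return the expression state of GAL1 for a given sugar environment."""
    gal4_bound = True                 # Gal4 occupies its sites with or without galactose
    gal80_masks_gal4 = not galactose  # galactose relieves masking by Gal80
    mig1_bound = glucose              # Mig1 binds its site only when glucose is present

    if mig1_bound:
        # Mig1 recruits the Tup1 repressing complex; repression overrides activation.
        return "repressed"
    if gal4_bound and not gal80_masks_gal4:
        return "ON"
    return "OFF (basal)"
```

With galactose and no glucose the sketch returns "ON"; with neither sugar it returns "OFF (basal)"; and whenever glucose is present it returns "repressed". This matches the statement that the cell makes the galactose enzymes only when glucose is absent.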
proliferation in these cells. Mutations affecting this pathway are often associated with cancer.

Transport Into and Out of the Nucleus. When not active, many activators and repressors are held in the cytoplasm. The signaling ligand causes them to move to the nucleus where they act. There are many variations on this theme. Thus, the regulator can be held in the cytoplasm through interaction with an inhibitory protein or with the cell membrane, or it can be in a conformation in which a signal sequence required for its nuclear import is concealed. Release and transport into the nucleus in response to a signal can be mediated through proteolysis of an inhibitor or tethering region, or by allosteric changes. We will see an example of this in the next chapter when we consider the formation of the dorsal-ventral axis of the Drosophila embryo. There, Cactus is an inhibitory protein that binds the transcriptional regulator Dorsal in the cytoplasm. In response to a specific signal, Cactus is phosphorylated and destroyed, allowing Dorsal to enter the nucleus and act (Figure 18-13).
Activators and Repressors Sometimes Come in Pieces

We have, on the whole, considered activators and repressors in their simplest forms, though we have alluded to some additional complexities. For example, the activator can come in pieces: the DNA-binding domain and activating region can be on separate polypeptides, which come together on DNA to form the activator. In addition, in considering the regulation of regulators by their signals, we again see examples of protein complexes forming on DNA, and the nature of the complex can determine whether the DNA-binding protein activates or represses nearby genes. For example, we just saw a case (E2F/Rb) where an activator can bind a protein and become a repressor.

There are even more elaborate cases, such as the glucocorticoid receptor (GR). This mammalian protein can either activate or repress transcription depending on the nature and arrangement of its DNA-binding sites at a given gene. In the absence of its ligand, GR is held in the cytoplasm through interaction with a protein called hsp90. Upon ligand binding, the receptor is released and moves to the nucleus. (Thus GR is another example of a regulator whose activity is controlled by nuclear localization.) Once in the nucleus, the GR binds sites called GREs. These sites come in two types. When bound to one, it activates transcription; when bound to the other, it represses, as we now describe. When bound to the second of these sites, the receptor adopts a conformation that allows it to bind a histone deacetylase. When bound to the first site, the conformation of the receptor is such that it does not bind the histone deacetylase but rather binds another molecule called CBP. Binding of CBP leads to activation of the nearby gene, partly because CBP is itself a histone acetylase but also because it can recruit components of the transcriptional machinery. (CBP is recruited by many activators in mammalian cells, and often several activators can interact with it at once. Indeed, the activators bound at the human β-interferon enhancer, shown in Figure 17-16, are an example.)

The terms "co-repressor" and "co-activator" are often applied to any auxiliary protein which is neither part of the transcriptional machinery nor itself a DNA-binding regulator, but which is nevertheless involved in transcriptional regulation. CBP is an example. The term is also often applied to other nucleosome modifying complexes.
GENE "SILENCING" BY MODIFICATION OF HISTONES AND DNA

We have thus far considered regulation by activators and repressors that bind near a gene and switch it on or off. The effects are local, and the actions of the regulators are often controlled by specific extracellular signals. We now turn to mechanisms of gene silencing. Silencing is a position effect: a gene is silenced because of where it is located, not in response to a specific environmental signal. Also, silencing can "spread" over large stretches of DNA, switching off multiple genes, even ones quite distant from the initiating event. Despite these differences, understanding silencing does not require entirely new principles, just extensions of those we have already encountered in this chapter.
The most common form of silencing is associated with a dense form of chromatin called heterochromatin. Heterochromatin was named for its appearance under the light microscope: it appears dense compared to other chromatin, the euchromatin. Heterochromatin is frequently associated with particular regions of the chromosome, notably the telomeres (the structures found at the ends of chromosomes) and the centromeres. As you learned in Chapter 7, telomeres and centromeres are typically composed of repetitive sequences and contain few, if any, protein coding genes. If a gene is experimentally moved into these regions, that gene is typically switched off. In fact, there are other regions of the chromosome that are also in a heterochromatic state, and in which genes are found, such as the silent mating-type locus in yeast. And in mammalian cells, about 50% of the genome is estimated to be in some form of heterochromatin.

We have already seen that the density of chromatin can be altered by enzymes that chemically modify the tails of histones. Such packaging affects accessibility of the DNA and therefore affects processes such as replication and recombination, as well as transcription. As we have described, both activation and repression of transcription often involve modification of nucleosomes to alter the accessibility of a gene to the transcriptional machinery and other regulatory proteins. We have also encountered proteins that recognize modified nucleosomes and bind specifically to them. Heterochromatic silencing can be understood as an extension of these same principles, as we describe momentarily.

Transcription can also be silenced by methylation of DNA by enzymes called DNA methylases. This kind of silencing is not found in yeast but is common in mammalian cells. Methylation of DNA sequences can inhibit binding of proteins, including the transcriptional machinery, and thereby block gene expression. But methylation can inhibit expression in another way as well: some sequences are recognized by specific repressors only when methylated, and these repressors then switch off nearby genes, often by recruiting histone deacetylases.
Silencing in Yeast Is Mediated by Deacetylation and Methylation of Histones

The telomeres, the silent mating-type locus (Chapter 10), and the rDNA genes are all "silent" regions in S. cerevisiae. We consider the telomere as an example. The final 1-5 kb of each chromosome is found in a folded, dense structure, as shown in Figure 17-23. Genes taken from other chromosomal locations and moved to this region are often silenced, particularly if they are only weakly expressed in their usual location. The chromatin at the telomere is less acetylated than that found in most of the rest of the genome, where genes are more readily expressed. Mutations have been isolated in which silencing is relieved, that is, in which a gene placed at the telomere is expressed at higher levels. These studies implicate three genes encoding regulators of silencing, SIR2, 3, and 4 (SIR stands for silent information regulator). The three proteins encoded by these genes form a complex that associates with silent chromatin, and Sir2 is a histone deacetylase. The silencing complex is recruited to the telomere by a DNA-binding protein that recognizes the telomere's repeated sequences. This recruitment initiates local deacetylation of histone tails. The deacetylated histones are, in turn, recognized directly by the silencing complex, and so the local deacetylation readily spreads along the chromatin in a self-perpetuating manner, producing an extended region of dense heterochromatin.

[Figure 17-23 labels: Sir2 protein; Sir3, 4 protein]

FIGURE 17-23 Silencing at the yeast telomere. Rap1 recruits SIR complex to the telomere. SIR2, a component of that complex, deacetylates nearby nucleosomes. The unacetylated tails themselves then bind SIR3 and SIR4, recruiting more SIR complex, allowing the SIR2 within it to act on nucleosomes further away, and so on. This explains the spreading of the silencing effect produced by deacetylation. (Source: Adapted from Grunstein M. et al. 1998. Yeast heterochromatin: Regulation of its assembly and inheritance by histones. Cell 93: 325-328. Copyright © 1998. Used with permission from Elsevier.)

How is this spreading limited to the telomere (and other silenced regions)? Other kinds of histone modification block binding of the Sir2 proteins, and thereby stop spreading. Methylation of the tail of histone H3 is believed to do this. Histone methyl transferases attach methyl groups to histone tails. As we saw in Chapter 7, these enzymes add methyl groups to specific lysine residues in the tails of histones H3 and H4. Histone methyl transferases have recently been described in S. cerevisiae, where they are believed to help repression of some genes and, as just noted, block spreading of Sir2-mediated silencing in others. But histone methylases have been better characterized in higher eukaryotes and in the yeast Schizosaccharomyces pombe. In those organisms, silencing is typically associated with chromatin containing histones that are not only deacetylated, but methylated as well. Thus, methylation of lysine 9 in the H3 tail is a modification associated with silenced heterochromatin in these organisms (Figure 7-38). Other sites of methylation (lysine 4 on that same tail, for example) are associated with increased transcription.
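The self-perpetuating spread of deacetylation from the telomere, and its arrest at a methylated nucleosome, can be sketched as a walk along a row of nucleosomes. The Python below is a toy model under stated assumptions: the marks "Ac", "deAc", and "Me" and the function name `spread_silencing` are invented for illustration, position 0 is taken to be the nucleosome next to the telomere, and a methylated H3 tail is treated as an absolute barrier.

```python
def spread_silencing(nucleosomes):
    """Spread deacetylation outward from the telomere (index 0).

    Each nucleosome is 'Ac' (acetylated) or 'Me' (H3 tail methylated).
    Deacetylated tails recruit more silencing complex, so the effect
    propagates to the next nucleosome until a methylated tail blocks binding.
    """
    state = list(nucleosomes)
    for i, mark in enumerate(state):
        if mark == "Me":
            break              # methylation blocks binding; spreading stops here
        state[i] = "deAc"      # Sir2 deacetylates this nucleosome
    return state
```

For the input `["Ac", "Ac", "Me", "Ac"]` the first two nucleosomes become deacetylated, and the nucleosome beyond the methylated one stays acetylated.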
Just as acetylated residues within histones are recognized by proteins bearing bromodomains, methylated residues bind proteins with chromodomains (see Figure 7-39). One such protein is the Drosophila protein HP1, a component of silent heterochromatin in that organism.
Histone Modifications and the Histone Code Hypothesis

It has been proposed that a histone code exists. According to this idea, different patterns of modifications on histone tails can be "read" to mean different things (Figure 7-39). The "meaning" would, in part, be the result of the direct effects of these modifications on chromatin density and form. But in addition, the particular pattern of modifications at any given location would recruit specific proteins, the particular set depending on the number, type, and disposition of recognition domains those proteins carry. We have already seen that a component of the TFIID complex recognizes acetylated lysines (it has two bromodomains and recognizes, specifically, H4 N-terminal tails modified on two particular lysine groups). And we have just seen that HP1 recognizes H3 tails modified by methyl groups on a particular lysine residue. There are also proteins that phosphorylate serine residues in H3 and H4 tails and proteins that bind those modifications. Thus, multiple modifications at several positions in the histone tails are possible; the examples of H3 and H4, together with H2A and H2B, are shown in Figure 7-40. Add to this the observation that many of the proteins that carry modification-recognizing domains are themselves enzymes that modify histones further, and we start to see how a process of recognizing and maintaining patterns of modification could be achieved.

Consider one simple case: lysine 9 on the tail of histone H3 (see Figure 7-39). Different modification states of this residue have different meanings. Thus, acetylation of this residue is associated with actively transcribed genes. That residue is recognized by various histone acetylases bearing bromodomains, and these stimulate additional acetylation of other nearby nucleosomes. When lysine 9 is unmodified, it is associated with silenced regions (as we saw in S. cerevisiae above). Unacetylated histones often recruit deacetylating enzymes better than acetylated histones, reinforcing and maintaining the deacetylated state (as we saw in the spreading of silenced regions in S. cerevisiae). Finally, that same lysine can in some organisms be methylated: in that case, the modified residue then binds proteins that establish and maintain a heterochromatic state, stronger than that associated with deacetylated histones.
DNA Methylation Is Associated with Silenced Genes in Mammalian Cells

Some mammalian genes are kept silent by methylation of nearby DNA sequences. In fact, large regions of the mammalian genome are marked in this way, and often DNA methylation is seen in regions that are also heterochromatic. This is because methylated sequences are often recognized by DNA-binding proteins (such as MeCP2) that recruit histone deacetylases and histone methylases, which then modify nearby chromatin. Thus, methylation of DNA can mark sites where heterochromatin subsequently forms (Figure 17-24). DNA methylation lies at the heart of a phenomenon called imprinting, as we now describe. In a diploid cell, there are two copies of most genes,
[Figure 17-24 labels: methylation; proteins that bind methylated DNA; chromatin remodeling complex; OFF]

FIGURE 17-24 Switching a gene off through DNA methylation [...] expression is never firmly shut off; it is leaky. Often that is not good enough: sometimes a gene must be completely shut off, on occasion permanently. This is achieved through methylation of the DNA and modification of the local nucleosomes. Thus, when the gene is not being expressed, a DNA methyltransferase (a methylase) can gain access and methylate cytosines within the promoter sequence, the gene itself, and the upstream activator binding sites. The methyl group is added to the 5' position in the cytosine ring, generating 5-methylcytosine (see Chapter 6). This modification alone can disrupt binding of the transcription machinery and activators in some cases. But it also binds other proteins (for example, MeCP2) that recognize DNA sequences containing methylcytosine. These proteins, in turn, recruit complexes that remodel and modify local nucleosomes, switching off expression of the gene completely.
one copy on a chromosome inherited from the father, the other on the equivalent chromosome from the mother. In most cases, the two alleles are expressed at comparable levels. This is hardly surprising: they carry the same regulatory sequences and are in the presence of the same regulators; they are also located in an equivalent region of two very similar chromosomes. But there are a few cases where one copy of a gene is expressed while the other is silent. Two well-studied examples are the human H19 and Igf2 genes (Figure 17-25). These are located close to each other on human chromosome 11. In a given cell, one copy of H19 (that on the maternal chromosome) is expressed, while the other copy (on the paternal chromosome) is switched off; for Igf2 the reverse is true: the paternal copy is on and the maternal copy off. Two regulatory sequences are critical for the differential expression of these genes: an enhancer (downstream of the H19 gene) and an insulator (located between the H19 and Igf2 genes). The enhancer
[Figure 17-25 panels: (a) maternal chromosome, showing Igf2, the insulator bound by CTCF, H19, and the enhancer; (b) paternal chromosome, showing the same elements]

FIGURE 17-25 Imprinting. Shown are two examples of genes controlled by imprinting: the mammalian Igf2 and H19 genes. As described in the text, in a given cell, the H19 gene is expressed from only the maternal chromosome, Igf2 from the paternal chromosome. The methylation state of the insulator determines whether or not the insulator binding protein (CTCF) can bind and block activation of the Igf2 gene from the downstream enhancer.
(when bound by activators) can, in principle, activate either of the two genes. So why does it activate only H19 on the maternal chromosome and Igf2 on the paternal chromosome? The answer lies in the role of the insulator and its methylation state. Thus, the enhancer cannot activate the Igf2 gene on the maternal chromosome because on that chromosome, the insulator binds a protein, CTCF, that blocks activators at the enhancer from activating the Igf2 gene. On the paternal chromosome, in contrast, the insulator element and the H19 promoter are methylated. In that state, the transcription machinery cannot bind the H19 promoter, and CTCF cannot bind the insulator. As a result, the enhancer now activates the Igf2 gene. The H19 gene is further repressed on the paternal chromosome by the binding of MeCP2 to the methylated insulator. This, as we have seen, recruits deacetylases, and these repress the H19 promoter.
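The reciprocal expression of H19 and Igf2 follows from a single variable, the methylation state of the region containing the insulator and the H19 promoter. The Python sketch below restates that logic; it is a simplification that assumes the enhancer is always occupied by activators, and the function name `imprinted_expression` and its boolean encoding are invented for this example.

```python
def imprinted_expression(region_methylated):
    """Return which of H19 and Igf2 is expressed on one chromosome.

    region_methylated: True for the paternal pattern (insulator and H19
    promoter methylated), False for the maternal pattern (unmethylated).
    Assumes activators are bound at the shared downstream enhancer.
    """
    ctcf_bound = not region_methylated           # CTCF cannot bind a methylated insulator
    h19_promoter_usable = not region_methylated  # machinery cannot bind a methylated promoter
    return {
        "H19": h19_promoter_usable,
        "Igf2": not ctcf_bound,                  # bound CTCF blocks the enhancer from reaching Igf2
    }
```

The maternal pattern, `imprinted_expression(False)`, gives H19 on and Igf2 off; the paternal pattern, `imprinted_expression(True)`, gives the reverse.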
Some States of Gene Expression Are Inherited through Cell Division even when the Initiating Signal Is No Longer Present

Patterns of gene expression must sometimes be inherited. A signal released by one cell during development causes neighboring cells to switch on specific genes. Those genes may have to remain switched on in those cells for many cell generations, even if the signal that induced them is present only fleetingly. The inheritance of gene expression patterns, in the absence of either mutation or the initiating signal, is called epigenetic regulation. The imprinting example we discussed above reveals one way the expression of a gene can be regulated epigenetically.

Contrast this with some of the examples of gene regulation we have discussed. If a gene is controlled by an activator, and that activator is only active in the presence of a given signal, then the gene will remain on only as long as the signal is present. Indeed, under normal conditions, the lac genes of E. coli will only be expressed while lactose is present and glucose absent. Likewise the GAL genes of yeast are expressed only as long as glucose is absent and galactose present, and human β-interferon is made only while cells are stimulated by viral infection. But we have also already encountered an example of gene regulation which can be inherited epigenetically. The reason that case, maintenance of a phage λ lysogen (Chapter 16), can be described as epigenetic is discussed in Box 17-3, λ Lysogens and the Epigenetic Switch.
Nucleosome and DNA modifications can provide the basis for epigenetic inheritance. Consider a gene switched off by methylation of local histones. When that region of the chromosome is replicated during cell division, the methylated histones from the parental DNA molecule end up distributed equally between the two daughter duplexes (see Figure 7-42). Thus, each of the daughter molecules carries some methylated and some unmethylated nucleosomes. The methylated nucleosomes recruit proteins bearing chromodomains, including the histone methylase itself, which then methylates the adjacent unmodified nucleosomes. A daughter strand that lacked methylated histones altogether (that is, one from an unmethylated parent) would not recruit the methylase. In this way, the state of chromatin modification can be maintained through generations.

DNA methylation is even more reliably inherited, as shown in Figure 17-26. Thus, certain DNA methylases can methylate, at low frequency, previously unmodified DNA; but far more efficiently, so-called maintenance methylases modify hemimethylated DNA, the very substrate provided by replication of fully methylated DNA. In mammalian cells, DNA methylation may be the primary marker of regions of the genome that are silenced. After DNA replication,
[Figure 17-26 labels: DNA replication; not recognized by maintenance methylase; methylation]

FIGURE 17-26 Patterns of DNA methylation can be maintained through cell division. As we saw in Figure 17-24, DNA involved in expression of a vertebrate gene can get methylated, and expression of that gene [...]
Box 17-3 λ Lysogens and the Epigenetic Switch
Heritable patterns of gene expression can be established without the use of nucleosome, or DNA, modification. Consider a bacterial example we discussed in Chapter 16, a λ lysogen. In a lysogen, the phage is in a dormant state within the bacterial host cell. This state is associated with a specific pattern of gene expression, and in particular with sustained expression of the λ repressor protein (see Figure 16-27). Lysogenic gene expression is established in an infected cell in response to poor growth conditions. Once established, however, the lysogenic state is maintained stably despite improvements in growth conditions: moving a lysogen into rich growth medium does not lead to induction. And indeed, induction essentially never occurs until a suitable inducing signal (such as UV light) is received.

Maintenance of the lysogenic state through cell division is thus an example of epigenetic regulation. Instead of any form of modification, this epigenetic control results from a two-step strategy for repressor synthesis. In the first step, synthesis is initially established through activation of the repressor (cI) gene by the activator CII (which is sensitive to growth conditions). In the second step, repressor synthesis is maintained by autoregulation: repressor activates expression of its own gene (see Chapter 16, Figure 16-35). In this way, when the lysogenic cell divides, each daughter cell inherits a copy of the dormant phage genome and some repressor protein. That repressor is sufficient to stimulate further repressor synthesis from the phage genome in both cells. Much of gene regulation during the development of multicellular organisms works in just this way. We will see examples in the next chapter.
hemimethylated sites are remethylated. These can then be recognized by the repressor MeCP2, which in turn recruits histone deacetylases and methylases, reestablishing silencing (Figure 17-24).
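The inheritance of a DNA methylation pattern through replication can be followed site by site. In the Python sketch below, a duplex is a list of sites, each recorded as a pair of booleans for the two strands. This representation and the function names `replicate` and `maintenance_methylate` are assumptions made for illustration. The rare de novo methylation mentioned in the text is ignored, and the maintenance methylase is assumed to act with perfect efficiency.

```python
def replicate(duplex):
    """Replicate a duplex given as a list of (top, bottom) methylation flags.

    Each daughter keeps one parental strand and gains a new, unmethylated
    strand, so fully methylated sites become hemimethylated.
    """
    daughter_a = [(top, False) for top, _ in duplex]        # keeps the parental top strand
    daughter_b = [(False, bottom) for _, bottom in duplex]  # keeps the parental bottom strand
    return daughter_a, daughter_b


def maintenance_methylate(duplex):
    """Methylate the new strand at hemimethylated sites only.

    Sites unmethylated on both strands are not recognized and stay unmethylated.
    """
    return [(True, True) if (top or bottom) else (False, False)
            for top, bottom in duplex]
```

Starting from a parent such as `[(True, True), (False, False), (True, True)]`, both daughters are hemimethylated at the first and third sites immediately after replication, and `maintenance_methylate` restores the parental pattern in each while leaving the unmethylated site untouched.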
EUKARYOTIC GENE REGULATION AT STEPS AFTER TRANSCRIPTION INITIATION
Some Activators Control Transcriptional Elongation rather than Initiation
In the previous chapter we encountered the N and Q proteins of phage λ: these regulators control the elongation of a transcript after initiation (Figure 16-36). Specifically, they act as "antiterminators." In eukaryotes we see regulation at this step as well. The elaborate transcriptional machinery of a eukaryotic cell contains numerous proteins required for initiation. It also contains some that aid in elongation (see Chapter 12). At some genes there are sequences downstream of the promoter that cause pausing or stalling of the polymerase soon after initiation. At those genes, the presence or absence of certain elongation factors greatly influences the level at which the gene is expressed.
One example is the HSP70 gene from Drosophila. This gene, activated by heat shock, is controlled by two activators working together. The GAGA binding factor is believed to recruit enough of the transcription machinery to the gene for initiation of transcription. But, in the absence of a second activator, HSF, the initiated polymerase stalls some 100 bp downstream of the promoter. In response to heat shock, HSF binds to specific sites at the promoter and recruits a kinase, P-TEF, to the stalled initiated machinery. The kinase phosphorylates the C-terminal domain of the largest subunit of RNA polymerase (the so-called polymerase "tail"), freeing the enzyme from the stall and allowing transcription to proceed through the gene. We saw in Chapter 12 that phosphorylation of the polymerase tail is an important step in the early stages of transcription at all genes, and the kinase TFIIH can perform that phosphorylation. Whether P-TEF is also needed at most genes is not clear. A strong acidic activator like Gal4 is able to recruit P-TEF along with the rest of the machinery. It may be that only at certain genes is the recruitment of the machinery partitioned between regulators in the way we see at the HSP70 gene, allowing an extra layer of control.
HIV, the virus that causes AIDS, transcribes its genes from a promoter controlled by P-TEF. Again, polymerase initiates transcription at that promoter, under the control of the activator SP1, but stalls soon afterward. In that case, P-TEF is brought to the stalled polymerase by an RNA-binding protein, not a DNA-bound one. The protein responsible is called TAT. TAT recognizes a specific sequence near the start of the HIV RNA, and so present in the transcript made by the stalled polymerase. Another domain of TAT interacts with P-TEF and recruits it to the stalled polymerase.
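The two-step logic shared by HSP70 and the HIV promoter (one regulator permits initiation, a second recruits P-TEF to release the stalled polymerase) can be written as a small truth table. The sketch below is only an illustration of that logic: the class, function names, and the three outcome labels are our own, and GAGA factor is assumed to be bound at all times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoterState:
    # An activator that recruits enough machinery to initiate
    # (GAGA factor at HSP70, SP1 at the HIV promoter).
    initiator_bound: bool
    # Whether the kinase P-TEF has been brought to the stalled polymerase
    # (by HSF after heat shock, or by TAT bound to the nascent HIV RNA).
    ptef_recruited: bool


def transcription_outcome(state: PromoterState) -> str:
    """Return 'silent', 'stalled', or 'full-length' (illustrative labels)."""
    if not state.initiator_bound:
        return "silent"        # no initiation at all
    if not state.ptef_recruited:
        return "stalled"       # polymerase initiates, then stalls downstream
    return "full-length"       # tail phosphorylated, polymerase released


def hsp70(heat_shock: bool) -> str:
    # Assumption: GAGA factor is always bound; HSF acts only on heat shock.
    return transcription_outcome(PromoterState(True, heat_shock))


if __name__ == "__main__":
    print("HSP70 without heat shock:", hsp70(False))
    print("HSP70 with heat shock:   ", hsp70(True))
```

Splitting recruitment between two regulators is what gives the extra layer of control: neither condition alone yields a full-length transcript.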
The Regulation of Alternative mRNA Splicing Can Produce Different Protein Products in Different Cell Types
As we saw in Chapter 13, the coding region of many individual eukaryotic genes is split, with stretches of coding sequence (exons) interrupted by (sometimes much larger) regions of noncoding sequence (called introns). The whole gene is transcribed before the coding regions are spliced together, discarding the noncoding regions. The number of genes with introns, and the number of introns per gene, increase with the complexity of the organism. In some cases a given precursor mRNA can be spliced in alternative ways to produce different mRNAs that encode different protein products. The choice of splicing variant produced at a given time or in a given cell type can be regulated.
The regulation of alternative splicing works in a manner reminiscent of transcriptional regulation and was discussed in Chapter 13. To recap, the splicing machinery binds to splice sites and carries out the splicing reaction. Binding of the machinery to a given splice site depends on the affinity of that site for the machinery and the actions of proteins that regulate splicing. For example, a strong splice site can direct efficient constitutive splicing. But that can be blocked by a splicing repressor that binds to sites overlapping the strong splice site and excludes the splicing machinery (Figure 13-17a). This mechanism of splicing repression is analogous to mechanisms of both transcriptional and translational repression we encountered in E. coli. In other cases, sequences called splicing enhancers are found near splice sites. These sequences are recognized by regulatory proteins that recruit the splicing machinery to the splice site. Like transcriptional activators, these regulatory proteins have separate domains, one that binds the nucleic acid (in this case RNA) and one that binds the splicing machinery (Figure 13-17b).
The regulation of a splicing cascade by repressors and activators lies at the heart of sex determination in Drosophila, as we now briefly describe. The sex of a fly is determined by the ratio of X chromosomes to autosomes. A female results from a ratio of 1 (two Xs and two sets of autosomes), and a male from a ratio of 0.5. This ratio is initially measured at the level of transcription using two activators, called SisA and SisB. The genes encoding these regulators are both on the X chromosome, and so, in the early embryo, the prospective female makes twice as much of their products as does the male (Figure 17-27). These activators bind to sites in the regulatory sequence upstream of the gene Sex-lethal (Sxl). Another regulator that binds to and controls the Sxl gene is a repressor called Dpn (Deadpan); this is encoded by a gene found on one of the autosomes (chromosome 2). Thus, the ratio of activators to repressor differs in the two sexes, and this makes the difference between the Sxl gene being activated (in females) and repressed (in males).
FIGURE 17-27 Early transcriptional regulation of Sxl in male and female flies. The SisA and SisB genes are found on the X chromosome and encode transcriptional activators that control expression of the Sxl gene. Dpn, a repressor of Sxl, is encoded by a gene on chromosome 2. While both males and females express the same amount of the autosomally encoded Dpn, females make twice as much of the activators as males (because females have two X chromosomes and males only one). The difference in ratio of activators to repressor ensures that Sxl is expressed in females but not males. The Sxl protein then autoregulates its own expression as described in the text and the next figure. (Source: Adapted from Estes P. A. et al. 1995. Multiple response elements in the sex-lethal early promoter ensure its female-specific expression pattern. Mol. Cell. Biol. 15: 904-917.)
The Sxl gene is expressed from two promoters, Pe and Pm. The former (promoter for establishment) is the one controlled by SisA and SisB (and hence expressed in females only). Later in development, this promoter is switched off permanently. In female embryos, expression of Sxl is maintained by expression from Pm (promoter for maintenance). Transcription from Pm is constitutive in both females and males, but the RNA produced from this promoter contains one exon more than the transcript produced from Pe. If that exon remains in the mature message, it fails to produce an active protein. That is what happens in the male. But in the female, splicing removes that exon and functional Sxl protein continues to be produced. As shown in Figure 17-28, it is Sxl protein itself, present in the female but not the male (thanks to earlier expression from Pe), that directs splicing of the RNA made from Pm and ensures the inhibitory exon is spliced out. Sxl does this by working as a splicing repressor. Thus, functional Sxl protein continues to be made in females. That protein regulates the splicing of other RNAs in the female as well as its own. One of these is the RNA made constitutively (in males and females) from the tra gene (Figure 17-28). Again, in the absence of Sxl-directed splicing, this RNA fails to give protein (in males), but in the presence of Sxl it is spliced to give functional Tra protein (in females).
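The chain of dependencies just described (X:A ratio, early Sxl from Pe, self-maintained Sxl from Pm, then Tra) can be traced in a few lines of code. This is a minimal sketch rather than a quantitative model: the function names are our own, and the cutoff of 1.0 on the X:A ratio simply stands in for the balance between the activators SisA and SisB and the repressor Dpn.

```python
def splice_out_inhibitory_exon(sxl_present: bool) -> bool:
    """Pm and tra transcripts yield functional protein only if Sxl protein,
    acting as a splicing repressor, is already present."""
    return sxl_present


def sxl_cascade(x_chromosomes: int, autosome_sets: int = 2) -> dict:
    """Follow the Sxl -> Tra splicing cascade for a given X:A ratio."""
    ratio = x_chromosomes / autosome_sets

    # Establishment: Pe fires only when the X-encoded activators outweigh
    # the autosomally encoded repressor Dpn. The 1.0 cutoff is an assumption.
    sxl_from_pe = ratio >= 1.0

    # Maintenance: Pm is transcribed in both sexes, but its extra exon is
    # removed only when Sxl protein made earlier from Pe is present.
    sxl_from_pm = splice_out_inhibitory_exon(sxl_present=sxl_from_pe)

    # tra RNA is made in both sexes; it gives protein only with
    # Sxl-directed splicing.
    tra = splice_out_inhibitory_exon(sxl_present=sxl_from_pm)

    return {
        "x_to_a_ratio": ratio,
        "sxl_from_Pe": sxl_from_pe,
        "sxl_from_Pm": sxl_from_pm,
        "tra": tra,
    }


if __name__ == "__main__":
    print("2X:2A", sxl_cascade(2))
    print("1X:2A", sxl_cascade(1))
```

The point the sketch makes is that a transient early difference in transcription is converted into a stable difference in splicing, because each later step requires the product of the step before it.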
Tra protein is also a splicing regulator. Whereas Sxl is a splicing repressor, Tra is an activator (Figure 17-28). One of its targets is RNA made from the gene encoding Doublesex (Dsx). This RNA is spliced in two alternative forms, both encoding regulatory proteins but with different activities. Thus, in the presence of Tra, dsx RNA is spliced in a way that gives rise to a protein that represses expression of male-specific genes. In the absence of Tra protein, the form of Dsx produced represses female-specific genes.
FIGURE 17-28 A cascade of alternative splicing events determines the sex of a fly. As described in detail in the text, the Sex-lethal protein is produced in females (on the left). The presence of that protein is maintained by autoregulation of the splicing of its own message. In the absence of that regulation, no functional protein is produced (in males). Sex-lethal also controls splicing of the tra gene, producing functional Tra protein in females (but not males). Tra is itself a splicing regulator. It acts on pre-mRNA from the doublesex gene. When the dsx mRNA is spliced in response to Tra protein, a version of Doublesex protein is produced (in females) with a stretch of 30 amino acids at its C-terminal end that distinguishes it from the form of the protein produced in the absence of the Tra regulator (in males). The female form of Dsx activates genes required for female development and represses those for male development. The male form, which has a stretch of 150 amino acids at the C-terminal end, represses genes that direct female development. Sxl protein acts as a splicing repressor by binding to the pyrimidine tract at the 3' splice site (see Figure 13-2). The Tra protein, in contrast, acts as a splicing activator. It binds to an enhancer sequence in one of the exons of dsx RNA (see Figure 13-13).
Expression of the Yeast Transcriptional Activator Gcn4 Is Controlled at the Level of Translation
Gcn4 is a yeast transcriptional activator that regulates the expression of genes encoding enzymes that direct amino acid biosynthesis. Although it is a transcriptional activator, Gcn4 is itself regulated at the level of translation. In the presence of low levels of amino acids, the Gcn4 mRNA is translated (and so the biosynthetic enzymes are expressed). In the presence of high levels of amino acids, the Gcn4 mRNA is not translated. How is this regulation achieved?
FIGURE 17-29 Translational control of Gcn4 in response to amino acid starvation. As described in the text, the first of the upstream ORFs is translated initially. When amino acids are scarce (starvation conditions), it takes longer for the translational machinery to re-initiate translation, and so it tends to reach the Gcn4-encoding open reading frame before re-initiating and translates that to give Gcn4 protein. When amino acids are plentiful (nonstarvation conditions), re-initiation takes place at the intervening open reading frames, and the translation machinery then dissociates from the RNA template and Gcn4 is never translated. (Source: Hinnebusch A. G. 1997. Journal of Biological Chemistry 272: 21661-21664, fig. 1. Copyright © 1997 The American Society for Biochemistry & Molecular Biology.)
The mRNA encoding the Gcn4 protein contains four small open reading frames (called uORFs) upstream of the coding sequence for Gcn4. The most upstream of these short open reading frames (uORF1) is efficiently recognized by ribosomes that scan along the message from the 5' end (see Chapter 14). Once they have translated uORF1, a unique property of this ORF allows 50% of the small subunits of the ribosome to remain bound to the RNA and resume scanning for downstream initiation (AUG) codons (Figure 17-29). Before initiating translation of any downstream open reading frame, scanning 40S ribosome subunits must bind the translation factor eIF2 complexed with the initiating tRNA molecule, Met-tRNAi. (Recall from Chapter 14 that, in the absence of the initiator tRNA, the 40S subunit cannot recognize the AUG sequence in the mRNA.) Under conditions of amino acid starvation, eIF2 is phosphorylated, a modification that reduces the efficiency with which it binds the ribosome; also, under those conditions, there is less charged initiating Met-tRNA available. Thus, when amino acids are scarce, ribosomes that resume scanning after translation of uORF1 pass through uORF2-4 before rebinding eIF2-tRNAi, and so re-initiate at the Gcn4 coding sequence instead (Figure 17-29).
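The re-initiation logic summarized in Figure 17-29 amounts to asking which start codon a scanning 40S subunit meets first once it has rebound eIF2 and initiator tRNA. The sketch below illustrates that idea only. The start-codon positions and the two rebinding distances are invented round numbers (arbitrary units measured from the end of uORF1), not measurements, and the function names are our own.

```python
from typing import Optional

# Hypothetical positions of downstream start codons, measured from the end
# of uORF1 in arbitrary units. These are illustrative values only.
START_CODONS = [("uORF2", 50), ("uORF3", 100), ("uORF4", 150), ("GCN4", 350)]


def reinitiation_site(rebinding_distance: int) -> Optional[str]:
    """Return the first start codon reached after the scanning 40S subunit
    has rebound eIF2-tRNA, or None if it runs past every start codon."""
    for name, position in START_CODONS:
        if position >= rebinding_distance:
            return name
    return None


def gcn4_translated(amino_acids_scarce: bool) -> bool:
    # Assumption: starvation (phosphorylated eIF2, less charged initiator
    # tRNA) lengthens the distance scanned before rebinding. The values
    # 200 and 30 are placeholders chosen to fall on either side of uORF4.
    rebinding_distance = 200 if amino_acids_scarce else 30
    return reinitiation_site(rebinding_distance) == "GCN4"


if __name__ == "__main__":
    print("plentiful amino acids:", reinitiation_site(30))
    print("scarce amino acids:   ", reinitiation_site(200))
```

In this picture the uORFs act as decoys: a subunit that re-initiates at one of them is lost to the Gcn4 coding sequence, so slowing the rebinding step is what allows Gcn4 to be made.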
RNAs IN GENE REGULATION
We saw in Chapter 16 a few examples in which RNA molecules are central to regulating expression of a gene or set of genes. Recall, for example, attenuation of the trp genes of E. coli. In that case, the secondary structure of a short RNA transcript determined whether RNA polymerase transcribed the trp genes, or terminated transcription before reaching them (Figure 16-21). We also saw (Box 16-4) how so-called riboswitches work in a similar way. Once again, alternative secondary structures of leader RNAs determine whether polymerase continues transcribing a set of genes, or terminates instead. In this chapter we have seen how regulatory elements in RNA can bind proteins involved in transcriptional regulation. The HIV TAT protein was an example.
Recently, however, it has become apparent that RNAs have a more general and mechanistically distinct role in gene regulation. Short RNAs, generated by the action of enzymes we will discuss in this section, can direct repression of genes with homology to those short RNAs. This repression, called RNA interference (RNAi), can manifest as translational inhibition of the mRNA, destruction of the mRNA, or transcriptional silencing of the promoter that directs expression of that mRNA. How widespread the action of RNAs will turn out to be is still unclear, and the details of the mechanism used to silence the target genes in any given case are also typically unresolved. But as we will see, the role of these RNAs ranges from developmental regulation (in, for example, the worm C. elegans) to the protection against infection by certain viruses (in plants). RNAi has also been adapted for use as a powerful experimental technique allowing specific genes to be switched off in any of many organisms.
Double-Stranded RNA Inhibits Expression of Genes Homologous to that RNA
The discovery that simply introducing double-stranded RNA (dsRNA) into a cell can repress genes containing sequences identical to (or very similar to) that dsRNA was remarkable in 1998 when it was reported. In that case, the experiment was done in the worm C. elegans (see Chapter 21). A similar effect is seen in many other organisms in which it has subsequently been tried. Earlier than this report, however, it had been known that in plants genes could be silenced by copies of homologous genes in the same cell. Those additional transgenes were often found in multiple copies, some integrated in direct repeat orientation. Also, in plants, it was known that infection by viruses was combated by a mechanism that involved destruction of viral RNA. These two cases were brought together in the following observation: infection of a plant with an RNA virus that carried a copy of an endogenous plant gene led to silencing of that endogenous gene. All these phenomena are now known to be mechanistically linked. In this section we consider how dsRNA can switch off expression of a gene.
Short Interfering RNAs (siRNAs) Are Produced from dsRNA and Direct Machinery that Switches Off Genes in Various Ways
Dicer is an RNase III-like enzyme that recognizes and digests long dsRNA. The products of this are short double-stranded fragments about 23 nucleotides long. This is shown in the first step of Figure 17-30. These short RNAs (often called short interfering RNAs, or siRNAs) inhibit expression of a homologous gene in three ways: they trigger destruction of its mRNA; they inhibit translation of its mRNA; or they induce chromatin modifications within the promoter that silence the gene. Remarkably, whichever route is used in any given case, much of the same machinery is required. That machinery includes a complex called RISC (RNA-induced silencing complex). A RISC complex contains, in addition to the siRNAs themselves, various proteins including members of the Argonaut family, which are believed to interact with the RNA component.
As shown in Figure 17-30, once a given siRNA has been produced and assembled within RISC, it is denatured in an ATP-dependent manner. The appearance of single-stranded RNA activates the RISC complex (indicated by an asterisk in the figure). Once activated, the complex is directed to an RNA containing sequence complementary to the siRNA. Once there it can degrade that RNA, or it can inhibit its translation. Typically it seems that the route chosen depends, at least in part, on how close the match is between the siRNA and the target mRNA: if they are completely complementary, the latter is degraded; if the match is less good, the response is largely an inhibition of translation. A nuclease activity within RISC is responsible for degradation when that is seen. A RISC complex can also be directed by an siRNA into the nucleus, where it associates with regions of the genome complementary to that siRNA (Figure 17-30, on the left). Once there, the complex recruits other proteins that modify the chromatin around the promoter of the gene. This modification leads to silencing of transcription.
FIGURE 17-30 RNAi silencing. RNAi switches off the expression of a given gene when double-stranded RNA molecules with homology to that gene are introduced, or made, in that cell. This effect involves processing of the dsRNA to make short interfering RNAs by the enzyme Dicer. These siRNAs then direct a complex called RISC (RNA-induced silencing complex) to repress genes in three ways. It attacks and digests mRNA with homology to the siRNA; it interferes with translation of those mRNAs; or it directs chromatin-modifying enzymes to the promoters that direct expression of those mRNAs. Although in the figure RISC performs some functions in the cytoplasm and enters the nucleus for another, all could take place in the nucleus. (Source: Adapted from Hannon G. J. 2002. RNA interference. Nature 418: 244-251, Fig. 5, p. 249. Copyright © 2002 Nature Publishing Group. Used with permission.)
We have already described silencing mediated by chromatin modification. Establishing silencing in the centromeric regions of the yeast S. pombe has recently been shown to require the RNAi machinery. In that case, it is believed that regions of the centromere (see Chapter 7) are transcribed to produce RNAs that either fold to form stem loops or hybridize with other RNAs from the same region. The resulting dsRNAs are recognized by Dicer and cleaved to produce the siRNAs responsible for directing the RNAi machinery to the centromeres. The extent to which RNAi might turn out to be involved in other cases of chromatin modification and silencing in other organisms is still unclear.
There is another feature of RNAi silencing worth noting: its extreme efficiency. Very small amounts of dsRNA are enough to induce complete shutdown of target genes. While it remains unclear why the effect is so strong, it might involve an RNA-dependent RNA polymerase, which is required in many cases of RNAi. The involvement of this enzyme suggests some aspect of the inhibitory "signal" might be amplified as part of the process. One way this might be achieved is revealed by the following observation: when a given siRNA targets a region of a specific mRNA, additional siRNAs are often generated that target adjacent regions of that same mRNA. The RNA-dependent RNA polymerase might have a role in generating these additional siRNAs after recruitment to the mRNA by the original siRNA (see Figure 17-30, on the right).
MicroRNAs Control the Expression of Some Genes during Development
We have alluded to the long dsRNA precursors of the siRNAs as either being provided experimentally or, in the case of centromeric silencing in S. pombe, being transcripts that base-pair with themselves or other transcripts. There is another class of naturally occurring RNAs, called microRNAs (miRNAs), that direct repression of genes in the same way as siRNAs. MicroRNAs are most extensively characterized in plants and worms (in which they were first recognized). The miRNAs, typically 21 or 22 nt long, arise from larger precursors (about 70-90 nt long) transcribed from non-protein-encoding genes. These transcripts contain sequences that form stem-loop structures, which are processed by Dicer (or DCL1, for Dicer-like 1, in plants). The miRNAs they produce lead to the destruction (typically the case in plants) or translational repression (in worms) of target mRNAs with homology to the miRNA. It is estimated that there are about 120 genes that encode miRNA precursors in worms, and 250 in humans. Often these miRNAs are expressed in developmentally regulated patterns, and, where characterized, their targets are typically mRNAs that encode regulatory proteins with important roles in the development of the organism in question. Also, strikingly, 30% of the miRNAs found in worms have close homologs in flies and/or mammals. Thus, it seems that miRNAs are an ancient part of programs of gene regulation during development, and that RNAi-like mechanisms have a wider role in gene regulation than was initially thought likely.
Despite this, the mechanism of RNAi may have evolved originally to protect cells from any infectious, or otherwise disruptive, element that employs a dsRNA intermediate in its replicative cycle. This would include certain viruses and many transposons that replicate via a dsRNA intermediate (see Chapter 11). RNAi turns off genes expressed by those agents, as well as destroying the dsRNA intermediates themselves.
The importance of this function for RNAi remains evident in plants. Many plant viruses have evolved mechanisms to counteract the host-mounted RNAi defense response. These viral functions, called viral suppressors of gene silencing (VSGSs), are normally essential virulence determinants, but can be dispensed with when infecting plants defective in RNAi pathways. It has also been reported that some mutants of C. elegans that affect RNAi have increased endogenous transposon activity.
As an experimental method, RNAi has had swift and wide-ranging impact. It enables an experimenter to silence any given gene in almost any organism simply by introducing into that organism short dsRNA molecules with sequence complementary to that gene. The effectiveness with which RNAi eliminates expression of target genes is critical, as is the relative ease of the procedure. Thus, when it comes to inactivating the gene, it is much easier than disrupting the coding sequence within the genome, an operation which, even where possible, is laborious in all but the most amenable of model organisms.
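Two steps of the RNAi pathway described in this section lend themselves to a short sketch: Dicer cutting long dsRNA into fragments of about 23 nucleotides, and RISC choosing between degradation and translational inhibition according to how well the siRNA matches its target. The code below is an illustration under stated assumptions, not a description of the actual enzymology. The function names are our own, Dicer is treated as cutting at fixed intervals, only ungapped matches are scored, and the 80% cutoff for a "less good" match is an arbitrary placeholder.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(strand: str, size: int = 23) -> List[str]:
    """Cut one strand of a long dsRNA into consecutive fragments of `size`
    nucleotides. Any shorter leftover at the end is discarded."""
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


def risc_response(guide: str, mrna: str, min_partial: float = 0.8) -> str:
    """Classify the effect of a single-stranded siRNA (guide) on an mRNA.

    A fully complementary site gives degradation; a close but imperfect
    site gives translational inhibition. min_partial is an assumed cutoff.
    """
    site = reverse_complement(guide)  # sequence the guide pairs with perfectly
    best = 0
    for i in range(len(mrna) - len(site) + 1):
        window = mrna[i:i + len(site)]
        matches = sum(a == b for a, b in zip(window, site))
        best = max(best, matches)
    if best == len(site):
        return "degrade mRNA"
    if best >= min_partial * len(site):
        return "inhibit translation"
    return "no effect"


if __name__ == "__main__":
    target_site = "GCUACGUAGCUAGCUAGGCUAAC"
    mrna = "AUGG" + target_site + "GUAGCAUAA"
    print(risc_response(reverse_complement(target_site), mrna))
```

The sketch leaves out the third route, in which RISC enters the nucleus and directs chromatin modification at the promoter, because that outcome depends on factors other than the quality of the sequence match.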
SUMMARY
As in bacteria, transcription initiation is the most frequently regulated step in gene expression in eukaryotes, despite the additional steps that can be regulated in these organisms. Also as in bacteria, transcription initiation is typically regulated by proteins that bind to specific sequences on DNA near a gene and either switch that gene on (activators) or switch it off (repressors). This conservation of regulatory mechanism holds in the face of several complexities in the organization and transcription of eukaryotic genes not found in bacteria, as we now summarize.
Nucleosomes and their modification. The DNA in a eukaryotic cell is wrapped in histones to form nucleosomes. Thus, the DNA sequences to which the transcriptional machinery and the regulatory proteins bind are in many cases occluded. Enzymes that modify histones, by adding (or removing) small chemical groups, alter the histones in two ways: they change how tightly the nucleosomes are packed (and thus how accessible the DNA within them is); and they form (or remove) binding sites for other proteins involved in transcribing the gene. Other enzymes "remodel" the nucleosomes: they use the energy from ATP hydrolysis to move the nucleosomes around, influencing which sequences are available.
Many regulators and larger distances. Genes of multicellular eukaryotes are typically controlled by more regulatory proteins than their bacterial counterparts, some bound far from the gene. This reflects the larger number of physiological signals that control a typical gene in multicellular organisms.
The elaborate transcriptional machinery. The enzyme RNA polymerase is largely conserved between bacteria and eukaryotes (Chapter 12). But the eukaryotic enzyme contains more subunits, and there are some 50 or so additional proteins that bind at the typical eukaryotic promoter along with polymerase. While we do not know what many of these proteins do, the majority are essential for efficient transcription of many genes. Many of these proteins come to the promoter as large protein complexes.
In eukaryotes, just as we saw in bacteria, activators predominantly work by recruitment. In these organisms, however, the activators do not recruit polymerase directly, or alone. Instead, they recruit the other protein complexes required to initiate transcription of a given gene. RNA polymerase itself is brought in along with these other complexes. The activator can recruit histone-modifying enzymes as well, and the effects of those modifications may help the transcription machinery bind the promoter. The activators can interact with one or more of many different components of the transcriptional machinery or the nucleosome modifiers. This explains how they can so readily work together in large numbers and various combinations, and accounts for the widespread use of signal integration and combinatorial control we see, particularly in multicellular organisms.
Some activators work from sites far from the gene, requiring that the DNA between their binding sites and the promoter loops out. How loops can form over the very large distances called for in some cases is not clear, but it very likely involves changes in the chromatin structure between the activator binding site and the promoter, bringing those two elements closer together. DNA sequences called insulators bind proteins that interfere with the interaction between activators bound at distant enhancers and their promoters. These could work by inhibiting mechanisms that facilitate looping (such as changes in chromatin structure). Insulators help ensure that activators work only on the correct genes.
Eukaryotic repressors work in various ways, just as they do in bacteria. However, the simplest and most common mechanism seen in bacteria is for the repressor to bind to a site overlapping the promoter, thus blocking binding of RNA polymerase. That mechanism is not typically seen in eukaryotes. Most commonly, eukaryotic repressors work by recruiting histone modifiers that reduce transcription. For example, whereas a histone acetylase is typically associated with activation, a histone deacetylase (that is, an enzyme that removes acetyl groups) acts to repress a gene.
In some cases, long stretches of nucleosomal DNA can be kept in a relatively inert state by appropriate nucleosome modification, most notably deacetylation and methylation. In this way, groups of genes can be kept in a "silent" state without the need for specific repressors bound at each individual gene. Once established, this condition can be maintained because the modification enzymes themselves are often preferentially recruited to nucleosomes that are in that state. Thus, the modification state recruits the enzymes that produce that particular pattern of modifications. This means that once initiated, the silent state can be extended and inherited rather easily. In some eukaryotic organisms, such as mammals, silent genes are also associated with methylated DNA. Methylated sequences can either block the binding of the transcription machinery and activators, or those sequences can specifically bind a class of repressors that recruit histone-modifying enzymes that repress nearby genes.
We also saw how various steps in gene expression after transcription initiation can be regulated. These include transcriptional elongation and translation, just as we saw in bacteria. But most striking (and something we did not see in bacteria) is the regulation of splicing. In multicellular eukaryotes the majority of RNAs require splicing. In some cases, alternative patterns of splicing lead to different protein products. That process can be regulated. We considered the example of sex determination in Drosophila, where a cascade of regulated, alternative splicing events determines whether a fly develops as a male or female.
Another form of gene regulation we described in this chapter involves small RNA molecules that inhibit expression of homologous genes. These RNAs include regulatory RNAs used in animal development and others generated in plants upon viral infection. The mechanisms by which these RNAs inhibit expression of genes can involve destruction of mRNA, inhibition of translation, and RNA-directed modification of nucleosomes in the promoters of genes. This strategy for repression is the basis of a widely used experimental technique (called RNAi) used to switch off expression of genes of choice.
'.L.
391 -398.
Oamell
11 3- 127.
I.E. Ir.
1997. STATs anu gene regulation. Science 277: 1630-1635.
HjJl C.S. and Traisman R. 1995. Transcriptional regulatiofl by exLrncellular signals: Mochanism and spocificity. Cel/OO: 1 99-21 1.
Bailis J.M. and Forsburg S.L. 2002. RNAi hushes helerochromatin . Cenome BiD/. 3: 12. Oen li A.M. and Hannon C.J. 2003. RNAi: An ever-growing puzzle. Trcnds BiDchem Sci. 4: 196-201 . Crewal S.J. and Moa7.ed D. 2003. Heterochromatin and epigenetic control of gene expression . Science 301: 798-8U2.
Ha nnon G.J. 2002. RNA inlerference. Nalure 41B: 244-251. Kidner C.A. and Marlienssen R.A. 2003 . Macro affects of microRNAs in planls. 7hmds in Genetics 19: 1 3 - 1 G. Matzkc M.• Matzke A.• Pruss e., ¡¡nd Vanee V. 2001 . RNAbased silencing stralegies in plants. Curr. Opino ('.ene/. De\'. 11 : 221-227.
'njsterman M" Kelting R.F.. and Plasterk RH. TIte genaties ofRNi\ s ilencing. Annu. Re\'. Cenet. 36: 489-519. Zamorc P.D. 2001. RNA inlerfercnce: Listcning to the sound of silencl:!. No l . Strucl. Biol. 9: 746 - 750.
CHAPTER 18
Gene Regulation during Development
There are more than 200 different cell types in a human, all of which arise from a single cell, the fertilized egg. These genetically identical cells come to differ from one another by expressing distinct sets of genes during development. For example, developing muscle cells express specialized forms of actin, myosin, and tropomyosin that are absent in other organs such as the liver or kidney.

To appreciate the extent of differential gene expression, consider the following. A typical invertebrate, such as a fruit fly or worm, contains approximately 15,000-20,000 genes, whereas vertebrates contain perhaps double this number, between 30,000 and 40,000 genes. Whole-genome microarray methods make it possible to identify which genes are expressed in a given tissue. As an example, approximately 7% or 8% (~1500 genes) of all genes in the genome of the nematode worm C. elegans are expressed in the muscles (Figure 18-1). Different cell types (say, a muscle cell and a neuron) express somewhat different, but overlapping, subsets of genes. Typically, less than half of the genes expressed in one cell type are also expressed in another given cell type, and a specific cell may be defined by the expression of about 100 to 200 "signature" genes that are responsible for its unique characteristics. (See Box 18-1, Microarray Assays: Theory and Practice.)

How do cells that are derived from the same fertilized egg establish different programs of gene expression? Most differential gene expression is regulated at the level of transcription initiation, and we described the basic mechanisms of this regulation in the preceding two chapters. In the first half of this chapter, we describe how cells communicate with each other during development to ensure that each expresses the particular set of genes required for their proper development. Simple examples of each of these strategies are then described. In the second half of the chapter, we describe how these strategies are used in combination with the transcriptional regulatory mechanisms described in Chapter 17 to control the development of an entire organism, in this case, the fruit fly.

OUTLINE
• Three Strategies by which Cells Are Instructed to Express Specific Sets of Genes during Development
• Examples of the Three Strategies for Establishing Differential Gene Expression
• The Molecular Biology of Drosophila Embryogenesis

FIGURE 18-1 Microarray grids comparing expression patterns in two tissues (muscles and neurons) in C. elegans. Each circle in the grid contains a short DNA segment from the coding region of a single gene in the C. elegans genome. RNA was extracted from muscles and neurons, and labeled with fluorescent dyes (red and green, respectively). Thus, the red circles indicate genes expressed in muscle, whereas the green reflect genes expressed in neurons. The yellow circles indicate genes expressed in both cell types. It is clear that the two samples express distinct sets of genes. (Source: Courtesy of Stuart Kim.)
THREE STRATEGIES BY WHICH CELLS ARE INSTRUCTED TO EXPRESS SPECIFIC SETS OF GENES DURING DEVELOPMENT
We have already seen how gene expression can be controlled by "signals" received by a cell from its environment. For example, the sugar lactose activates the transcription of the lac operon in E. coli, while viral infection activates the expression of the β-interferon gene in mammals. In this chapter we focus on the strategies that are used to instruct genetically identical cells to express distinct sets of genes and thereby differentiate into diverse cell types. The three major strategies are mRNA localization, cell-to-cell contact, and signaling through the diffusion of a secreted signaling molecule (Figure 18-2). Each of these strategies is introduced briefly in the following sections.
FIGURE 18-2 The three strategies for initiating differential gene activity during development. (a) In some animals, certain "maternal" RNAs present in the egg become localized either before or after fertilization. In this example, a specific mRNA (green squiggles) becomes localized to vegetal (bottom) regions after fertilization. (b) Cell A must physically interact with cell B to stimulate the receptor present on the surface of cell B. This is because the "ligand" produced by cell A is tethered to the plasma membrane. (c) In this example of long-range cell signaling, cell 0 secretes a signaling molecule that diffuses through the extracellular matrix. Different cells (1, 2, 3) receive the signal and ultimately undergo changes in gene activity.

Some mRNAs Become Localized within Eggs and Embryos due to an Intrinsic Polarity in the Cytoskeleton
One strategy to establish differences between two genetically identical cells is to distribute a critical regulatory molecule asymmetrically during cell division, thereby ensuring that the daughter cells inherit different amounts of that regulator and thus follow different pathways of development. Typically, the asymmetrically distributed molecule is an mRNA. These mRNAs can encode RNA-binding proteins or cell signaling molecules, but most often they encode transcriptional activators or repressors. Despite this diversity in the function of their protein products, there is a common mechanism for localizing mRNAs. Typically, they are transported along elements of the cytoskeleton: actin filaments or microtubules. The asymmetry in this process is provided by the intrinsic asymmetry of these elements. Actin filaments and microtubules possess an intrinsic polarity, with directed growth at the + ends (Figure 18-3). An mRNA molecule can be transported from one end of a cell to the other by means of an "adapter" protein, which binds to a specific sequence within the noncoding 3' untranslated trailer (3' UTR) region of an mRNA. Adapter proteins contain two domains. One recognizes the 3' UTR of the mRNA, while the other associates with a specific component of the cytoskeleton, such as myosin. Depending on the specific adapter that is used, the mRNA-adapter complex either "crawls" along an actin filament, or directly moves with the + end of a growing microtubule. We will see how this basic process is used to localize mRNA determinants within the egg or to restrict
FIGURE 18-3 An adapter protein binds to specific sequences within the 3' UTR of the mRNA. The adapter also binds to myosin, which "crawls" along the actin filament in a directed fashion, from the "-" end to the growing "+" end of the filament.

Cell-to-Cell Contact and Secreted Cell Signaling Molecules both Elicit Changes in Gene Expression in Neighboring Cells

A cell can influence which genes are expressed in neighboring cells by producing extracellular signaling proteins. These proteins are synthesized in the first cell and then either deposited in the plasma membrane of that cell or secreted into the extracellular matrix. These two approaches have features in common, so we consider them together here. We will then see how secreted signals can be used in other ways.

A given signal (of either sort) is generally recognized by a specific receptor on the surface of recipient cells. When that receptor binds to the signaling molecule, it triggers changes in gene expression in the recipient cell. This communication from the cell surface receptor to the nucleus often involves signal transduction pathways of the sort we considered in Chapter 17. Here we summarize a few basic features of these pathways. Sometimes ligand-receptor interactions induce an enzymatic cascade that ultimately modifies regulatory proteins already present in the nucleus (Figure 18-4a). In other cases, activated receptors cause the release of DNA-binding proteins from the cell surface or cytoplasm into the nucleus (Figure 18-4b). These regulatory proteins bind to specific DNA recognition sequences and either activate or repress gene expression. Ligand binding can also cause proteolytic cleavage of the receptor. Upon cleavage, the intracytoplasmic domain of the receptor is released from the cell surface and enters the nucleus, where it associates with DNA-binding proteins and influences how those proteins regulate transcription of the associated genes (Figure 18-4c). For example, the transported protein might convert what was a transcriptional repressor into an activator. In this case, target genes that were formerly repressed prior to signaling are now induced. We will consider examples of each of these variations in cell signaling in this chapter.

Box 18-1 Microarray Assays: Theory and Practice

Microarray assays permit the genome-wide analysis of gene expression profiles. The microarray, typically encompassing thousands to tens of thousands of known sequences immobilized on a microscope slide, can be subjected to a series of hybridization experiments performed in parallel. To generate the arrayed material for the microarray, protein coding sequences are prepared using the polymerase chain reaction (PCR; see Chapter 20). The most common amplification method involves the use of short oligonucleotide sequences (typically on the order of 20 nucleotides in length) that bracket an exon for a particular protein coding gene in the genome. Paired oligonucleotides, each pair representing an exon for every protein coding gene, are then hybridized to genomic DNA and amplified by PCR. The resulting amplified genomic DNA fragments are then attached to glass slides in a series of spots. Each spot on the slide, therefore, contains a discrete amplified DNA fragment representing a unique protein coding gene. Slides the size of a typical microscope slide can carry as many as 40,000 PCR fragments. This collection represents the entire protein coding capacity of the human genome on a single slide.

To investigate whole-genome patterns of gene expression, the slide is hybridized with differentially labeled fluorescent RNA probes. Consider the case shown in Figure 18-1, which compares gene activity in the muscles and neurons of the nematode worm, C. elegans. Total mRNA was isolated from each tissue and labeled with different dyes. It is possible to label the muscle mRNAs red and the neuronal mRNAs green. These two samples of labeled mRNAs are then simultaneously hybridized on the same glass slide containing PCR fragments representing each of the nearly 20,000 genes in the C. elegans genome. When both samples hybridize to a particular spot, or gene fragment, a yellow color is emitted. This hybridization result indicates that the particular gene is significantly expressed in both tissues. Spots that strongly stain red correspond to genes that are mainly expressed in the muscles, but not neurons. Conversely, those spots that stain green represent genes that are expressed in neurons but not muscles.

The basic method can be used to compare the gene expression profiles of any two samples. For example, there have been extensive studies that compare mRNA profiles in normal tissues and tumors. It is also possible to isolate RNA from normal yeast cells, or Drosophila embryos, and compare these with mutant yeast cells, or mutant fly embryos.
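The red, green, and yellow spot colors described in Box 18-1 can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the intensity values, the gene names, the detection cutoff (`min_signal`), and the twofold ratio (`fold`) are all hypothetical choices made for illustration, not values taken from the experiment in Figure 18-1.

```python
import math

def classify_spot(red, green, min_signal=100.0, fold=2.0):
    """Classify one spot from its red (muscle) and green (neuron) intensities.

    min_signal and fold are hypothetical cutoffs chosen for illustration.
    """
    # A spot with little signal in either channel is treated as unexpressed.
    if max(red, green) < min_signal:
        return "not detected"
    # A pseudocount of 1 avoids dividing by zero for an empty channel.
    log_ratio = math.log2((red + 1.0) / (green + 1.0))
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red: mainly muscle"
    if log_ratio <= -threshold:
        return "green: mainly neuron"
    # Comparable signal in both channels appears yellow on the slide.
    return "yellow: both tissues"

# Hypothetical (red, green) intensities for four spots.
spots = {
    "gene_1": (5000.0, 120.0),
    "gene_2": (150.0, 4000.0),
    "gene_3": (2000.0, 1800.0),
    "gene_4": (20.0, 30.0),
}

for gene, (red, green) in spots.items():
    print(gene, classify_spot(red, green))
```

Real analyses also normalize the two dye channels against each other before comparing them; that step is omitted here to keep the sketch short.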
Signaling molecules that remain on the surface control gene expression only in those cells that are in direct, physical contact with the signaling cell. We refer to this process as cell-to-cell contact. In contrast, signaling molecules that are secreted into the extracellular matrix can work over greater distances. Some travel over a distance of just 1 or 2 cell diameters, whereas others can act over a range of 50 cells or more. Long-range signaling molecules are sometimes responsible for positional information, which is discussed in the next section.
Gradients of Secreted Signaling Molecules Can Instruct Cells to Follow Different Pathways of Development based on Their Location

A recurring theme in development is the importance of a cell's position within a developing embryo or organ in determining what it will become. Cells located at the front of a fruit fly embryo (that is, in anterior regions) will form portions of the adult head such as the antenna or brain but will not develop into posterior structures such as the abdomen or genitalia. Cells located on the top, or dorsal, surface of a frog embryo can develop into portions of the backbone in the tadpole or adult but do not form ventral, or "belly," tissues such as the gut. These examples illustrate the fact that the fate of a cell (what it will become in the adult) is constrained by its location in the developing embryo. The influence of location on development is called positional information.

The most common way of establishing positional information involves a simple extension of one of the strategies we have already encountered in Chapter 17: the use of secreted signaling molecules (Figure 18-5). A small group of cells synthesize and secrete a signaling molecule that becomes distributed in an extracellular gradient (Figure 18-5a). Cells located near the "source" receive high concentrations of the secreted protein and develop into a particular cell type. Those cells located at progressively farther distances follow different pathways of development as a result of receiving lower concentrations of the signaling molecule. Signaling molecules that control positional information are sometimes called morphogens.

Cells located near the source of the morphogen receive high concentrations of the signaling molecule and, therefore, experience peak activation of the specific cell surface receptors that bind it. In contrast, cells located far from the source receive low levels of the signal, and consequently, only a small fraction of their cell surface receptors are activated. Consider a row of three cells adjacent to a source of a secreted morphogen. Something like 1,000 receptors are activated in the first cell, while only 500 receptors are activated in the next cell, and just 200 in the next (Figure 18-5b). These different levels of receptor occupancy are directly responsible for differential gene expression in the responding cells.

As we have seen, binding of signaling molecules to cell surface receptors leads (in one way or another) to an increase in the concentration of specific transcriptional regulators, in an active form, in the nucleus of the cell. Each receptor controls a specific transcriptional regulator (or regulators), and this controls expression of particular genes. The number of cell surface receptors that are activated by the binding of a morphogen determines how many molecules of the particular regulatory protein appear in the nucleus. The cell closest to the morphogen source, containing 1,000 activated receptors, will possess high concentrations of the transcriptional activator in its nucleus (Figure 18-5c). In contrast, the cells located farther from the source contain intermediate and low levels of the activator, respectively. Thus, there is a correlation between the number of activated receptors on the cell surface and the amount of transcriptional regulator present in the nucleus.

FIGURE 18-4 Different mechanisms of signal transduction. A ligand (or "signaling molecule") binds to a cell surface receptor. (a) The activated receptor induces latent cellular kinases that ultimately cause the phosphorylation of DNA-binding proteins within the nucleus. This phosphorylation causes the regulatory protein to activate (or repress) the transcription of specific genes. (b) The activated receptor releases a dormant DNA-binding protein from the cytoplasm so that it can now enter the nucleus. Once in the nucleus, the regulatory protein activates (or represses) the transcription of specific genes. (c) The activated receptor is cleaved by cellular proteases that cause a C-terminal portion of the receptor to enter the nucleus and interact with specific DNA-binding proteins. The resulting protein complex activates the transcription of specific genes.

FIGURE 18-5 A cluster of cells produces a signaling molecule, or morphogen, that diffuses through the extracellular matrix. (a) Cells 1, 2, and 3 receive progressively lower amounts of the signaling molecule since they are located progressively farther from the source. (b) Cells 1, 2, and 3 contain progressively lower numbers of activated surface receptors. (c) The three cells contain different levels of one or more regulatory proteins. In the simplest scenario, there is a linear correlation between the number of activated cell surface receptors and the amount of a regulatory factor that enters the nuclei. (d) The different levels of the regulatory factor lead to the expression of different sets of genes. Cell 1 expresses genes A, B, and C because it contains the highest levels of the regulatory factor. Cell 2 expresses genes B and C, but not A, because it contains intermediate levels of the regulatory factor. These levels are not sufficient to activate gene A. Finally, cell 3 contains the lowest levels of the regulatory factor and expresses only gene C since expression of genes A and B requires higher levels.

How are these different levels of the same transcriptional regulator able to trigger different patterns of gene expression in these different cells? In Chapter 16 we learned that a small change in the levels of the λ repressor determines whether an infected bacterial cell is lysed or lysogenized. Similarly, small changes in the amount of morphogen, and hence small differences in the levels of a transcriptional regulator within the nucleus, determine cell identity. Cells that contain high concentrations of a given transcriptional regulator express a variety of target genes that are inactive in cells containing intermediate or low levels of the regulator (Figure 18-5d). The differential regulation of gene expression by different concentrations of a regulatory protein is one of the most important and pervasive mechanisms encountered in developmental biology. We will consider several examples in the course of this chapter.
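The threshold logic of a morphogen gradient can be made concrete with a minimal Python sketch. The receptor counts (1,000, 500, and 200) and the resulting gene sets follow the three-cell example of Figure 18-5; the numerical activation thresholds for genes A, B, and C are hypothetical values chosen only so that the model reproduces that outcome, and the regulator level is simply assumed to equal the number of activated receptors (the "simplest scenario" of a linear correlation).

```python
# Hypothetical thresholds: the minimum regulator level needed to turn on
# each target gene. Only their ordering (A > B > C) reflects the text.
GENE_THRESHOLDS = {"A": 800, "B": 400, "C": 100}

def expressed_genes(activated_receptors, thresholds=GENE_THRESHOLDS):
    """Return the target genes switched on in one cell."""
    # Simplest scenario: nuclear regulator level is proportional to the
    # number of activated receptors (proportionality constant of 1 here).
    regulator_level = activated_receptors
    return sorted(
        gene for gene, needed in thresholds.items()
        if regulator_level >= needed
    )

# Receptor counts for the three cells in the worked example.
cells = {"cell 1": 1000, "cell 2": 500, "cell 3": 200}

for name, receptors in cells.items():
    print(name, receptors, expressed_genes(receptors))
```

Under these assumptions cell 1 expresses A, B, and C, cell 2 expresses B and C, and cell 3 expresses only C. A single graded signal is thereby converted into three discrete cell identities.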
EXAMPLES OF THE THREE STRATEGIES FOR ESTABLISHING DIFFERENTIAL GENE EXPRESSION

The Localized Ash1 Repressor Controls Mating Type in Yeast by Silencing the HO Gene

Before describing mRNA localization in animal embryos, we first consider a case from a relatively simple single-cell eukaryote, the yeast S. cerevisiae. This yeast can grow as haploid cells that divide by budding (Figure 18-6). Replicated chromosomes are distributed between two asymmetric cells: the larger progenitor cell, or mother cell, and a smaller bud, or daughter cell (Figure 18-6a). These cells can exist as either of two mating types, called a and α, as discussed in Chapters 10 and 17.

A mother cell and its daughter cell can exhibit different mating types. This difference arises by a process called mating-type switching. After budding to produce a daughter, a mother cell can "switch" mating type, with, for example, an a cell giving rise to an a daughter, but subsequently switching to the α mating type (Figure 18-6b).

Switching is controlled by the product of the HO gene. We saw in Chapter 10 that the HO protein is a sequence-specific endonuclease. HO triggers gene conversion within the mating-type locus by creating a double-strand break at one of the two silent mating-type cassettes. We also saw in Chapter 17 how HO is activated in the mother cell. It is kept silent in the daughter cell due to the selective expression of a repressor called Ash1 (Figure 18-7), and this is why the daughter cell does not switch mating type. The ash1 gene is transcribed in the mother cell prior to budding, but the encoded RNA becomes localized within the daughter cell through the following process. During budding, the ash1 mRNA attaches to the growing ends of microtubules. Several proteins function as "adapters" that bind the 3' UTR of the ash1 mRNA and also to the microtubules. The microtubules extend from the nucleus of the mother cell to the site of budding, and in this way the ash1 mRNA is transported to the daughter cell. Once localized within the daughter cell, the ash1 mRNA is translated into a repressor protein that binds to, and inhibits the transcription of, the HO gene. This silencing of HO expression in the daughter cell prevents that cell from undergoing mating-type switching.

In the second half of this chapter, we will see the localization of mRNAs used in the development of the Drosophila embryo. Once again this localization is mediated by adapter proteins that bind to the mRNAs, specifically, to sequences found in their 3' UTRs. (See Box 18-2, Review of Cytoskeleton: Asymmetry and Growth.)

FIGURE 18-6 A haploid yeast cell of mating type a undergoes budding to produce a mother

FIGURE 18-7 Localization of ash1 mRNA during budding. (a) The ash1 gene is transcribed in the mother cell during budding. The encoded mRNA moves from the mother cell into the bud by sliding along polarized actin filaments. Movement is directed and begins at the "-" ends of the filament and extends with the growing "+" ends. (b) The ash1 mRNA transport depends on the binding of the She2 and She3 adapter proteins to specific sequences contained within the 3' UTR. These adapter proteins bind myosin, which "crawls" along the actin filament and brings the ash1 mRNA along for the ride. (Source: Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 971, f 16-84, part c. Reproduced by permission of Routledge/Taylor & Francis Books, Inc.)

Box 18-2 Review of Cytoskeleton: Asymmetry and Growth

The cytoskeleton is composed of three types of filaments: intermediate filaments, actin filaments, and microtubules. Actin filaments and microtubules are used to localize specific mRNAs in a variety of different cell types, including budding yeast and Drosophila oocytes. Actin filaments are composed of polymers of actin. The actin polymers are organized as two parallel helices that form a complete twist every 37 nm. Each actin monomer is located in the same orientation within the polymer, and as a result, actin filaments contain a clear polarity. The plus (+) end grows more rapidly than the minus (-) end, and consequently, mRNAs slated for localization move along with the growing "+" end (Box 18-2 Figure 1).

Box 18-2 FIGURE 1 Structures of the actin monomer and filament. (a) Crystal structure of the actin monomer. The four domains of the monomer are shown, in different colors, with ATP (in red and yellow) in the center. The "-" end of the monomer is at the top; the "+" end is at the bottom. (Otterbein L.R., Graceffa P., and Dominguez R. 2001. Science 293: 708-711.) Image prepared with MolScript, BobScript, and Raster 3D. (b) The monomers are assembled, as a single helix, into a filament.

Microtubules are composed of polymers of a protein called tubulin, which is a heterodimer composed of related α and β chains. Tubulin heterodimers form extended, asymmetric protofilaments. Each tubulin heterodimer is located in the same orientation within the protofilament. Thirteen different protofilaments associate to form a cylindrical microtubule, and all of the protofilaments are aligned in parallel. Thus, as seen for actin filaments, there is an intrinsic polarity in microtubules, with a rapidly growing "+" end and more stable "-" end (Box 18-2 Figure 2).

Both actin and tubulin function as enzymes. Actin catalyzes the hydrolysis of ATP to ADP, while tubulin hydrolyzes GTP to GDP. These enzymatic activities are responsible for the dynamic growth, or "treadmilling," seen for actin filaments and microtubules. Typically, it is the actin or tubulin subunits at the "-" end of the filament that mediate the hydrolysis of ATP or GTP, and as a result, these subunits are somewhat unstable and lost from the end. In contrast, newly added subunits at the "+" end have not hydrolyzed ATP or GTP, and this causes them to be more stable components of the filament. Directed growth of actin filaments or microtubules at the "+" ends depends on a variety of proteins that associate with the cytoskeleton. One such protein is called profilin, which interacts with actin monomers and augments their incorporation into the "+" ends of growing actin filaments. Other proteins have been shown to enhance the growth of tubulin protofilaments at the "+" ends of microtubules.

Box 18-2 FIGURE 2 Structures of the tubulin monomer and filament. (a) The crystal structure of the tubulin monomer shows the α subunit in turquoise and the β subunit in purple. The GTP molecules in each subunit are shown in red. (Lowe J., Li H., Downing K.H., and Nogales E. 2001. J. Mol. Biol. 313: 1045-1057.) Image prepared with MolScript, BobScript, and Raster 3D. (b) The protofilament of tubulin consists of adjacent monomers assembled in the same orientation.

A second general principle that emerges from studies on yeast mating-type switching is seen again when we consider Drosophila development: the interplay between broadly distributed activators and localized repressors to establish precise patterns of gene expression within individual cells. In yeast, the SWI5 protein is responsible for activating expression of the HO gene (see Chapter 17). This activator is present both in the mother cell and the daughter cell during budding, but its ability to turn on HO is restricted to the mother cell because of the presence of the Ash1 repressor in the daughter cell. In other words, Ash1 keeps the HO gene off in the daughter cell despite the presence of SWI5.
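The interplay between a broadly distributed activator and a localized repressor amounts to a simple logical rule, sketched below in Python. This is a deliberately coarse Boolean abstraction, not a quantitative model: the class name `YeastCell` and the function `ho_is_on` are hypothetical names introduced for illustration, and protein presence is reduced to true or false.

```python
from dataclasses import dataclass

@dataclass
class YeastCell:
    name: str
    swi5: bool  # activator, present in both mother and daughter
    ash1: bool  # repressor, translated from mRNA localized to the daughter

def ho_is_on(cell):
    """HO is expressed only when SWI5 is present and Ash1 is absent."""
    return cell.swi5 and not cell.ash1

mother = YeastCell("mother", swi5=True, ash1=False)
daughter = YeastCell("daughter", swi5=True, ash1=True)

for cell in (mother, daughter):
    state = "ON" if ho_is_on(cell) else "OFF"
    # Only a cell that expresses HO can switch mating type.
    print(f"{cell.name}: HO {state}")
```

The mother cell expresses HO and can switch mating type, while the daughter cell cannot, even though both contain SWI5.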
A Localized mRNA Initiates Muscle Differentiation in the Sea Squirt Embryo

Localized mRNAs can establish differential gene expression among the genetically identical cells of a developing embryo. Just as the fate of the daughter cell is constrained by its inheritance of the ash1 mRNA in yeast, the cells in a developing embryo can be instructed to follow specific pathways of development through the inheritance of localized mRNAs. (See Box 18-3, Overview of Ciona Development.) In the case of muscle differentiation in sea squirts, a major determinant for programming cells to form muscle is a regulatory protein called Macho-1. Macho-1 mRNA is initially distributed throughout the cytoplasm of unfertilized eggs but becomes restricted to the vegetal (bottom) cytoplasm shortly after fertilization (Figure 18-8). It is ultimately inherited by just two of the cells in eight-cell embryos, and as a result those two cells go on to form the tail muscles. The Macho-1 mRNA encodes a zinc finger DNA-binding protein that is believed to activate the transcription of muscle-specific genes, such as actin and myosin. Thus, these genes are expressed only in muscles because Macho-1 is made only in those cells. In the second part of this chapter, we will see how regulatory proteins synthesized from localized mRNAs in the Drosophila embryo activate and repress gene expression and control the formation of different cell types.

Cell-to-Cell Contact Elicits Differential Gene Expression in the Sporulating Bacterium, B. subtilis

The second major strategy for establishing differential gene expression is cell-to-cell contact. Again, we begin our discussion with a relatively simple case, this one from the bacterium Bacillus subtilis. Under adverse conditions, B. subtilis can form spores. The first step in this process is the formation of a septum at an asymmetric location within the sporangium, the progenitor of the spore. The septum produces two
FIGURE 18-8 The Macho-1 mRNA becomes localized in the fertilized egg. (a) The mRNA is initially distributed throughout the cytoplasm of unfertilized eggs. At fertilization the egg is induced to undergo a highly asymmetric division to produce a small polar body (top). At this time, the Macho-1 mRNA becomes localized to bottom (vegetal) regions. Shortly thereafter, and well before the first division of the 1-cell embryo, the Macho-1 mRNA undergoes a second wave of localization. This occurs during the second highly
[Panel labels: (a) unfertilized egg; first phase of segregation; second phase of segregation. (b) anterior; tail muscles.]
Examples of the Three Strategies for Establishing Differential Gene Expression
cells of differing size that remain attached through abutting membranes. The smaller cell is called the forespore; it ultimately forms the spore. The larger cell is called the mother cell; it aids the development of the spore (Figure 18-9). The forespore influences the expression of genes in the neighboring mother cell, as follows.
Box 18-3 Overview of Ciona Development
Adult sea squirts are immobile filter-feeders that live in shallow ocean waters (Box 18-3 Figure 1). They are hermaphrodites and possess both sperm and eggs. They can self-fertilize but prefer not to do so. Instead, sperm from one animal typically fertilizes eggs from another. The resulting embryos are transparent and composed of relatively few cells (hundreds, rather than the tens of thousands seen in vertebrate embryos). These embryos develop rapidly into swimming tadpoles just 18-24 hours after fertilization. Complete cell lineages are known for each of the major tissues. This makes it possible to visualize the sequence of cell divisions from fertilization to the formation of specialized tissues in the tadpole. For example, the tadpole tail contains 36-40 muscle cells (depending on the species), and the lineage that forms these cells can be traced back to the fertilized egg.
The tail muscles represent the first cell lineage that was visualized in any animal embryo, about 100 years ago. This visualization was made possible by a yellow pigment that is present in the unfertilized eggs of certain ascidians. The pigment is initially distributed throughout the egg but becomes localized to vegetal (bottom) regions shortly after fertilization (Box 18-3 Figure 2). The localized pigment is inherited by just two of the cells, or blastomeres, in eight-cell embryos. These two cells give rise to most of the tail muscles in the tadpole. The yellow pigment is not the actual muscle "determinant"; that is, it is not responsible for programming the cells to form muscle. Rather, the pigment is merely a visible marker that is associated with the determinant.
Box 18-3 Figure 1 Ciona life cycle. The adult sea squirt is shown in the upper left panel. The orange material corresponds to developing eggs and the white is the sperm. Progressively older embryos are shown in the remaining panels. The embryos in the third row are undergoing gastrulation. A young tadpole can be seen in the lower right panel. This stage is reached 12-14 hours after fertilization (see 1
The forespore contains an active form of a specific σ factor, σF, which is inactive in the mother cell. In Chapter 16 we saw how σ factors associate with RNA polymerase and select specific target promoters for expression. σF activates the spoIIR gene, which encodes a secreted signaling protein. SpoIIR is secreted into the space between the abutting membranes of the mother cell and the forespore, where it triggers the proteolytic processing of pro-σE in the mother cell. Pro-σE is an inactive precursor of the σE factor. The pro-σE protein contains an N-terminal inhibitory domain that blocks σE activity and tethers the protein to the membrane of the mother cell (Figure 18-9). SpoIIR induces the proteolytic cleavage of the N-terminal peptide and the release of the mature and active form of σE from the membrane. σE activates a set of genes in the mother cell that is distinct from those expressed in the forespore. In this example, SpoIIR functions as a signaling molecule that acts at
Box 18-3 Figure 2 Early cleavages in Ascidians. The fertilized, 1-cell ascidian embryo contains a number of localized "determinants" that control the development of different tissues. For example, the yellow determinant is inherited by cells that form the tail muscles. The red determinant is inherited by cells that form the endoderm, or gut. (Source: Re
[Panel labels: (a) ectoderm; vegetal pole. (b) anterior; vegetal pole. (c) view from vegetal pole.]
FIGURE 18-9 Asymmetric gene activity in the mother cell and forespore of B. subtilis depends on the activation of different classes of σ factors. The spoIIR gene is activated by σF in the forespore. The encoded SpoIIR protein becomes associated with the septum separating the mother cell (on the left) and forespore (on the right). It triggers the proteolytic processing of an inactive form of σE (pro-σE) in the mother cell. The activated σE protein leads to the recruitment of RNA polymerase and the activation of specific genes in the mother cell. (Source: Losick R. and Stragier P. 1996. Sporulation in Bacillus subtilis. Ann. Rev. Genet. 30: 209, fig 3, part a. With permission from the Annual Review of Genetics, Vol. 30. © 1996 by Annual Reviews. www.annualreviews.org.)
the interface between the forespore and the mother cell and elicits differential gene expression in the abutting mother cell through the processing of σE. Induction requires cell-to-cell contact because the forespore produces small quantities of SpoIIR that can interact with the abutting mother cell but which are insufficient to elicit the processing of σE in the other cells of the population.
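The contact requirement described above can be written as a small piece of Boolean logic. The following is only an illustrative sketch: the `Cell` class, its field names, and the two helper functions are hypothetical and reduce the pathway to yes/no states, ignoring quantities and timing.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Toy cell state; every field name here is an illustrative assumption."""
    has_active_sigma_f: bool = False  # sigma-F is active only in the forespore
    has_pro_sigma_e: bool = False     # membrane-tethered, inactive precursor
    touches_forespore: bool = False   # shares abutting membranes with a forespore


def secretes_spoiir(cell: Cell) -> bool:
    # sigma-F activates spoIIR, so only a cell with active sigma-F secretes SpoIIR.
    return cell.has_active_sigma_f


def sigma_e_active(cell: Cell, neighbor: Cell) -> bool:
    # Pro-sigma-E is processed only when SpoIIR arrives across abutting membranes.
    return (cell.has_pro_sigma_e
            and cell.touches_forespore
            and secretes_spoiir(neighbor))


forespore = Cell(has_active_sigma_f=True)
mother = Cell(has_pro_sigma_e=True, touches_forespore=True)
bystander = Cell(has_pro_sigma_e=True, touches_forespore=False)

print("mother cell sigma-E active:", sigma_e_active(mother, forespore))
print("distant cell sigma-E active:", sigma_e_active(bystander, forespore))
```

In this sketch the distant cell carries the same precursor as the mother cell, so the only thing that distinguishes the two is contact with the forespore.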
A Skin-Nerve Regulatory Switch Is Controlled by Notch Signaling in the Insect CNS

We now turn to an example of cell-to-cell contact in an animal embryo that is surprisingly similar to the one just described in B. subtilis. In that earlier example, SpoIIR causes the proteolytic activation of σE, which, in its active state, directs RNA polymerase to the promoter sequences of specific genes. In the following example, a cell surface receptor is cleaved and the intracytoplasmic domain moves to the nucleus where it binds a sequence-specific DNA-binding protein that activates the transcription of selected genes.

For this example, we must first briefly describe the development of the ventral nerve cord in insect embryos (Figure 18-10). This nerve cord functions in a manner that is roughly comparable to the spinal cord of humans. It arises from a sheet of cells called the neurogenic ectoderm. This tissue is subdivided into two cell populations: one group remains on the surface of the embryo and forms ventral skin (or epidermis); the other population moves inside the embryo to form the neurons of the ventral nerve cord (Figure 18-10a). This decision about whether to become skin or neuron is reinforced by signaling between the two populations. The developing neurons contain a signaling molecule on their surface called Delta, which binds to a receptor on the skin cells called Notch (Figure 18-10b). The activation of the Notch receptor on skin cells by Delta renders them incapable of developing into neurons, as follows. Activation causes the intracytoplasmic domain
[Figure 18-10 panel labels: ectoderm; neuroblast specification; lateral inhibition of surrounding cells by neuroblast.]
of Notch (NotchIC) to be released from the cell membrane and enter nuclei, where it associates with a DNA-binding protein called Su(H). The resulting Su(H)-NotchIC complex activates genes that encode transcriptional repressors which block the development of neurons.

Notch signaling does not cause a simple induction of the Su(H) activator protein but instead triggers an on/off regulatory switch. In the absence of signaling, Su(H) is associated with several proteins, including Hairless, CtBP, and Groucho (Figure 18-11). Su(H) complexed with any of these proteins actively represses Notch target genes. When NotchIC enters the nucleus, it displaces the repressor proteins in complex with Su(H), turning that protein into an activator instead. Thus, Su(H) now activates the very same genes that it formerly repressed.

Delta-Notch signaling depends on cell-to-cell contact. The cells that present the Delta ligand (neuronal precursors) must be in direct physical contact with the cells that contain the Notch receptor (epidermis) in order to activate Notch signaling and inhibit neuronal differentiation. In the next section we will see an example of a secreted signaling molecule that influences gene expression in cells located far from those that send the signal.
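The on/off character of the Su(H) switch can be made concrete with a minimal sketch. The function names and string labels below are hypothetical, and the model treats the switch as strictly binary, which is a simplification of the description above.

```python
def notch_target_state(delta_on_neighbor: bool, in_contact: bool) -> str:
    """Return 'ON' or 'OFF' for the Notch target (neuronal repressor) genes."""
    # Notch is cleaved only when a Delta-presenting cell touches this cell.
    notch_ic_in_nucleus = delta_on_neighbor and in_contact
    # Su(H) stays on the DNA either way; its partner sets the outcome.
    partner = "NotchIC" if notch_ic_in_nucleus else "corepressors"
    return "ON" if partner == "NotchIC" else "OFF"


def cell_fate(delta_on_neighbor: bool, in_contact: bool) -> str:
    # Target genes ON -> neuronal repressors are made -> the cell forms epidermis.
    if notch_target_state(delta_on_neighbor, in_contact) == "ON":
        return "epidermis"
    return "neuron-competent"


for delta, contact in [(True, True), (True, False), (False, True)]:
    print(delta, contact, notch_target_state(delta, contact), cell_fate(delta, contact))
```

The point of the sketch is that the same DNA-bound protein yields two opposite outputs, so the genes are actively repressed rather than merely unactivated when no signal is present.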
A Gradient of the Sonic Hedgehog Morphogen Controls the Formation of Different Neurons in the Vertebrate Neural Tube
FIGURE 18-10 The neurogenic ectoderm forms two major cell types: neurons and skin cells (or epidermis). (a) Cells in the early neurogenic ectoderm can form either type of cell. However, once one of the cells begins to form a neuron or "neuroblast" (dark cell in the center of the grid of cells), it inhibits all of the neighboring cells that it directly touches. (b) This inhibition causes most of the cells to remain on the surface of the embryo and form skin cells. In contrast, the developing neuron moves into the embryo cavity and forms neurons. (Source: Photo from Skeath J.B. and Carroll S.B. 1992. Regulation of proneural gene e
We now turn to an example of a long-range signaling molecule, a morphogen, that imposes positional information on a developing organ. For this example, we continue our discussion of neuronal differentiation, but this time we consider the neural tube of vertebrates. In all vertebrate embryos, there is a stage when cells located along the future back (the dorsal ectoderm) move in a coordinated fashion toward internal regions of the embryo and form the neural tube, the forerunner of the adult spinal cord. Cells located in the ventralmost region of the neural tube form a specialized structure called the floorplate (Figure 18-12). The floorplate is the site of expression of a secreted cell signaling molecule called Sonic hedgehog (Shh), which functions as a gradient morphogen. Shh is secreted from the floorplate and forms an extracellular gradient in the ventral half of the neural tube (Figure 18-12a). Neurons develop within the neural tube into different cell types based on the amount of Shh protein they receive. This is determined by their location relative to the floorplate; cells located near the floorplate receive the highest concentrations of Shh, while those located farther away receive lower levels. The extracellular Shh gradient leads to different degrees of activation of Shh receptors in different cells in the neural tube.

The Shh gradient specifies at least four different types of neurons (Figure 18-12b). Cells located near the floorplate, those that receive the highest concentrations of Shh, have a high number of Shh receptors activated on their surface. This instructs those cells to form a neuronal cell type called V3, which is distinct from the other neurons that arise from the Shh gradient. Cells located in more lateral regions of the neural tube (farther from the floorplate) receive progressively lower levels of the Shh protein. This results in fewer Shh receptors being activated in those cells, which therefore become motorneurons. Yet lower levels of Shh direct the formation of the V2 and V1 interneurons, respectively (Figure 18-12b).

How does this differential activation of Shh receptors produce different cell types? The activation of the Shh receptor causes a transcriptional activator called Gli to activate the expression of specific "target" genes. The induction of the Gli activator is controlled, in part, by its regulated transport into the nucleus. Binding of Shh to its receptor on the cell surface allows a previously inactive form of Gli to enter the nucleus of that cell in an active form. The extracellular Shh gradient present in the neural tube thus leads to the formation of a corresponding Gli activator gradient. That is, the amount of active Gli in the nucleus of any given cell depends on how far that cell is from the floorplate: the closer it is, the higher the concentration of Gli.

Once in the nucleus, Gli activates gene expression in a concentration-dependent fashion. Peak concentrations of Gli, present in cells immediately adjacent to the floorplate, activate target genes needed for the differentiation of the V3 neurons. Slightly lower levels of Gli activate target genes that specify the formation of motorneurons, while intermediate and low levels of Gli induce the formation of the V2 and V1 interneurons, respectively. We will see, in the next section, that the different binding affinities of Gli recognition sequences within the regulatory DNAs of the various target genes likely play an important role in this differential regulation of Shh-Gli target genes. Thus, V1 genes can be activated by low levels of Gli because they have high-affinity recognition sequences for that activator in their nearby regulatory DNA. In contrast, V3 target genes might contain regulatory DNA with low-affinity Gli recognition sequences that can be activated only by peak levels of Shh signaling and the Gli activator.

This principle of a regulatory gradient producing multiple "thresholds" of gene expression and cell differentiation is again illustrated particularly well in the early Drosophila embryo.
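The idea that one graded signal yields several discrete cell types can be sketched as a threshold lookup. Everything numerical below is an assumption made for illustration: the exponential shape of the gradient, the decay length, and the four threshold values are not taken from the text, and nuclear Gli activity is simply assumed to track the local Shh level.

```python
import math

# Hypothetical thresholds on relative Gli activity (1.0 = level at the floorplate).
# Ordered from the highest requirement to the lowest, following the text's order.
FATE_THRESHOLDS = [
    (0.75, "V3 neuron"),
    (0.50, "motorneuron"),
    (0.25, "V2 interneuron"),
    (0.10, "V1 interneuron"),
]


def shh_level(distance: float, decay_length: float = 1.0) -> float:
    """Relative Shh level at a distance from the floorplate (assumed exponential)."""
    return math.exp(-distance / decay_length)


def neuronal_fate(distance: float) -> str:
    # Assumption: active nuclear Gli is proportional to the local Shh level.
    gli = shh_level(distance)
    for threshold, fate in FATE_THRESHOLDS:
        if gli >= threshold:
            return fate
    return "no Shh-dependent fate"


for d in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"distance {d:.1f}: relative Gli {shh_level(d):.2f} -> {neuronal_fate(d)}")
```

A cell's fate in this sketch depends only on which thresholds its Gli level clears, which is the sense in which a smooth gradient is converted into sharp boundaries between cell types.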
FIGURE 18-11 Notch-Su(H) regulatory switch. The developing neuron (neuronal precursor cell) does not express neuronal repressor genes (top). These genes are kept off by a DNA-binding protein called Su(H) and associated repressor proteins (Hairless, CtBP, Groucho). The neuronal precursor cell expresses a signaling molecule, called Delta, that is tethered to the cell surface. Delta binds to the Notch receptor in neighboring cells that are in direct physical contact with the neuron. Delta-Notch interactions cause the Notch receptor to be activated in the neighboring cells, which differentiate into epidermis. The activated Notch receptor is cleaved by cellular proteases (scissors) and the intracytoplasmic region of the receptor is released into the nucleus. This piece of the Notch protein causes the Su(H) regulatory protein to function as an activator rather than a repressor. As a result, the neuronal repressor genes are activated in the epidermal cells so that they cannot develop into neurons.
[Figure labels: OFF; ON; Su(H); CtBP; neuronal repressor gene (Notch target genes); cell-to-cell; Notch by Delta; neuroblast; epidermal.]

FIGURE 18-12 Formation of different neurons in the vertebrate neural tube. (a) The secreted signaling molecule Sonic hedgehog (Shh) is expressed in the floorplate of the developing neural tube (see the brown circle at the bottom of the diagram). The Shh protein diffuses through the extracellular matrix of the neural tube. The highest levels are present in ventral (bottom) regions and progressively lower in more lateral regions (arrows). (b) The graded distribution of the Shh protein leads to the formation of distinct neuronal cell types in the ventral half of the neural tube. High and intermediate levels lead to the development of the V3 neurons and motorneurons, respectively. Low and lowest levels lead to the development of the V2 and V1 interneurons. (Source: Adapted from Jessell T. 2000. Neuronal specification in the spinal cord: Inductive signals and transcriptional codes. Nature Rev. Genet. 1: 20-29. Copyright © 2000 Nature Publishing Group. Used with permission.)
[Figure labels: interneurons; neurons; neural tube.]
THE MOLECULAR BIOLOGY OF DROSOPHILA EMBRYOGENESIS

In the remaining sections of this chapter we focus on the early embryonic development of the fruit fly, Drosophila melanogaster. The molecular details of how development is regulated are better understood in this system than in any other animal embryo. The various mechanisms of cell communication discussed in the first half of this chapter, and those of gene regulation discussed in the previous chapters, are brought together in this example. Localized determinants and cell signaling pathways are both used to establish positional information that results in gradients of regulatory proteins that pattern the anterior-posterior (head-tail) and dorsal-ventral (back-belly) body axes. These regulatory proteins (activators and repressors) control the expression of genes whose products define different regions of the embryo. A recurring theme is the use of complex regulatory DNAs, particularly complex enhancers, to bring transcriptional activators and repressors to genes where they function in a combinatorial manner to produce sharp on/off patterns of gene expression.
An Overview of Drosophila Embryogenesis

Life begins for the fruit fly as it does for humans: adult males inseminate females. A single sperm cell enters a mature egg, and the haploid sperm and egg nuclei fuse to form a diploid, "zygotic" nucleus. This nucleus undergoes a series of nearly synchronous divisions within the central regions of the egg. Because there are no plasma membranes separating the nuclei, the embryo now becomes what is called a syncytium, that is, a single cell with multiple nuclei. With the next series of divisions, the nuclei begin to migrate toward the cortex or periphery of the egg. Once located in the cortex, the nuclei undergo another three divisions leading to the formation of a monolayer of approximately 6,000 nuclei surrounding the central yolk. During a 1-hour period, from 2 to 3 hours after fertilization, cell membranes form between adjacent nuclei.

Before the formation of cell membranes, the nuclei are totipotent or uncommitted; they have not yet taken on an identity and can still give rise to any cell type. Just after cellularization, however, nuclei have become irreversibly "determined" to differentiate into specific tissues in the adult fly. This process is described in Box 18-4, Overview of Drosophila Development. The molecular mechanisms responsible for this dramatic process of determination are described in the remaining sections of this chapter.
A Morphogen Gradient Controls Dorsal-Ventral Patterning of the Drosophila Embryo

The dorsal-ventral patterning of the early Drosophila embryo is controlled by a regulatory protein called Dorsal, which is initially distributed throughout the cytoplasm of the unfertilized egg. After fertilization, and after the nuclei reach the cortex of the embryo, the Dorsal protein enters nuclei in ventral and lateral regions but remains in the cytoplasm in dorsal regions (Figure 18-13). The formation of this Dorsal gradient in nuclei across the embryo is very similar, in
FIGURE 18-13 Spätzle-Toll and Dorsal gradient. (a) The circles represent cross-sections through early Drosophila embryos. The Toll receptor is uniformly distributed throughout the plasma membrane of the precellular embryo. The Spätzle signaling molecule is distribute ... protein. Phosphorylation of Cactus causes its degradation, so that Dorsal is released from the cytoplasm into nuclei.
[Figure labels: (a) Dorsal nuclei at periphery; Dorsal protein released to nuclei; ventral; D, V, A, P. (b) nucleus; cytoplasm; Cactus; Pelle; Tube; Toll receptor; perivitelline space; extracellular ventral signal (Spätzle fragment).]
principle, to the formation of the Gli activator gradient within ventral cells of the vertebrate neural tube. Regulated nuclear transport of the Dorsal protein is controlled by a cell signaling molecule called Spätzle. This signal is distributed in a ventral-to-dorsal gradient within the extracellular matrix present between the plasma membrane of the unfertilized egg and the outer egg shell. After fertilization, Spätzle binds to the cell surface Toll receptor. Depending on the concentration of Spätzle, and thus the degree of receptor occupancy in a given region of the syncytial embryo, Toll is activated to a greater or lesser extent. There is peak activation of Toll receptors in ventral regions, where the Spätzle concentration is highest, and progressively lower activation in more lateral regions. Toll signaling causes the degradation of a cytoplasmic inhibitor, Cactus, and the release of Dorsal from the cytoplasm into nuclei. This leads to the formation of a corresponding Dorsal nuclear gradient in the ventral half of the early embryo. Nuclei located in the ventral regions of the embryo contain peak levels of the Dorsal protein, while those nuclei located in lateral regions contain lower levels of the protein. The activation of some Dorsal target genes requires peak levels of the Dorsal protein, whereas others can be activated by intermediate and low levels, respectively. In this way, the Dorsal gradient specifies three major thresholds of gene expression across the dorsal-ventral
Box 18-4 Overview of Drosophila Development
After the sperm and egg haploid nuclei fuse, the diploid, zygotic nucleus undergoes a series of ten rapid and nearly synchronous cleavages within the central yolky regions of the egg. Large microtubule arrays emanating from the centrioles of the dividing nuclei help direct the nuclei from central regions toward the periphery of the egg (Box 18-4 Figure 1). After eight cleavages, the 256 zygotic nuclei begin to migrate to the periphery. During this migration they undergo two more cleavages (Box 18-4 Figure 1, Nuclear cleavage cycle 9). Most, but not all, of the resulting approximately 1,000 nuclei enter the cortical regions of the egg (Box 18-4 Figure 1, Nuclear cleavage cycle 10). The others ("vitellophages") remain in central regions where they play a somewhat obscure role in development.

Once the majority of the nuclei reach the cortex at about 90 minutes following fertilization, they first acquire competence to transcribe Pol II genes. Thus, as in many other organisms such as Xenopus, there seems to be a "mid-blastula transition," whereby early blastomeres (or nuclei) are transcriptionally silent during rapid periods of mitosis. While causality is unclear, it does seem that DNA undergoing intense bursts of replication cannot simultaneously sustain transcription. These and other observations have led to the suggestion that there is competition between the large macromolecular complexes promoting replication and transcription.

Because transcriptional competence is only achieved when the nuclei reach the cortex, it has been suggested that peripheral regions contain localized determinants. However, recent gene expression studies have stripped much of the mystery from the cortex. For example, the segmentation gene, hunchback, is uniformly transcribed in all of the nuclei present in the anterior half of the early embryo. This expression encompasses both the peripheral nuclei that have entered cortical regions, as well as the vitellophages that remain in the yolk.

Box 18-4 Figure 1 Drosophila embryogenesis. Drosophila embryos are oriente

After the nuclei reach the cortex, they undergo another three rounds of cleavage (for a total of 13 divisions after fertilization), leading to the dense packing of about 6,000 columnar-shaped nuclei enclosing the central yolk (Box 18-4 Figure 1, Nuclear cleavage cycle 14). Technically, the embryo is still a syncytium, although histochemical staining of early embryos with antibodies against cytoskeletal proteins indicates a highly structured meshwork surrounding each nucleus. During a 1-hour period, from 2 to 3 hours after fertilization, the embryo undergoes a dramatic cellularization process, whereby cell membranes are formed between adjacent nuclei (Box 18-4 Figure 1, Nuclear cleavage cycle 14). By 3 hours after fertilization, the embryo has been transformed into a cellular blastoderm, comparable to the "hollow ball of cells" that characterizes the blastulae of most other embryos. One of the most compelling aspects of classical embry
most marine organisms, such as ascidians, are visually stunning. Unfortunately, the Drosophila embryo is rather ugly; its salvation has been the unprecedented visualization of gene expression patterns. The differential gene activity that has been so graphically visualized in the early embryo using a variety of molecular and histochemical tools is not simply a manifestation of cell fate specification. Rather, some of the first genes to be visualized encode regulatory proteins that actually dictate cell fate. Thus, the molecular studies have literally illuminated the mysterious process of cell fate specification and determination.

When the nuclei enter the cortex of the egg, they are totipotent and can form any adult cell type. The location of each nucleus, however, now determines its fate. The 30 or so nuclei that migrate into posterior regions of the cortex encounter localized protein determinants, such as Oskar, which program these naive nuclei to form the germ cells (Box 18-4 Figure 2). Among the putative determinants contained in the polar plasm are large nucleoprotein complexes, called polar granules. The posterior nuclei bud off from the main body of the embryo along with the polar granules, and the resulting pole cells differentiate into either sperm or eggs, depending on the sex of the embryo. The microinjection of polar plasm into abnormal locations, such as central and anterior regions, results in the differentiation of supernumerary pole cells.

Cortical nuclei that do not enter the polar plasm are destined to form the somatic tissues. Again, these nuclei are totipotent and can form any adult cell type. However, within a very brief period, perhaps as little as 30 minutes, each nucleus is rapidly programmed (or specified) to follow a particular pathway of differentiation. This specification process occurs during the period of cellularization, although there is no reason to believe that the deposition of cell membranes between neighboring nuclei is critical for determining cell fate. Different nuclei exhibit distinct patterns of gene transcription prior to the completion of cell formation. By 3 hours after fertilization, each cell possesses a fixed positional identity, so that those located in anterior regions of the embryo will form head structures in the adult fly, whereas cells located in posterior regions will form abdominal structures.
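The nuclear counts quoted earlier in this box follow from simple doubling, which the short sketch below makes explicit. The function name is hypothetical, and the calculation assumes perfectly synchronous divisions in which every nucleus divides at every cycle, so it gives an upper bound rather than an observed count.

```python
def max_nuclei(n_divisions: int) -> int:
    """Upper bound on nuclei after n synchronous divisions of one zygotic nucleus."""
    return 2 ** n_divisions


# Division counts mentioned in the box: 8 (migration begins), 10, and 13 (total).
for n in (8, 10, 13):
    print(f"after {n} divisions: at most {max_nuclei(n)} nuclei")
```

Eight divisions give the 256 nuclei stated in the box, and ten give 1,024, in line with "approximately 1,000". Thirteen divisions give a ceiling of 8,192, higher than the roughly 6,000 cortical nuclei reported. That difference is at least in the direction expected, given that the box describes some nuclei remaining in the yolk and others budding off as pole cells, though the sketch does not attempt to account for it quantitatively.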
Box 18-4 Figure 2 Development of germ cells. Polar granules located in the posterior cytoplasm of the unfertilized egg contain germ cell determinants, and the Nanos mRNA, which is important for the development of the abdominal segments. Nuclei (central dots) begin to migrate to the periphery. Those that enter posterior regions sequester the polar granules and form the pole cells, which form the germ cells. The remaining cells (somatic cells) form all of the other tissues in the adult fly. (Source: Adapted from Schneiderman H.A. 1976. Insect development. In Symposia of the Royal Entomological Society of London 8: 3-34 (ed. P.A. Lawrence). Copyright © 1976. Reprinted by permission of Blackwell Science.)
[Panel labels: fertilized; pole granules; many nuclei in a syncytium; pole cells.]
A variety of genetic and experimental studies have shown that cell fate specification is controlled by localized maternal determinants that are deposited into the egg during oogenesis. The first evidence for such determinants came from ligation experiments, in which a hair was tied around the middle of Drosophila embryos. If this separation between the anterior and posterior halves occurred early, during syncytial blastoderm stages, central regions of the embryo failed to form thoracic structures such as wings and halteres (Box 18-4 Figure 3a). However, if the ligation was done later, after cellularization, then these structures were properly formed (Box 18-4 Figure 3b). These and related experiments suggested that one or more critical determinants diffused into posterior regions from the anterior pole and that this determinant(s) could be trapped in anterior regions by separating the halves of early embryos with a hair.

Systematic genetic screens by Eric Wieschaus and Christiane Nüsslein-Volhard identified approximately 30 "segmentation genes" that control the early patterning of the Drosophila embryo. This involved the examination of thousands of dead embryos. At the midpoint of embryogenesis, the ventral skin, or epidermis, secretes a cuticle that contains many fine hairs, or denticles. Each body segment of the embryo contains a characteristic pattern of denticles. Three different classes of segmentation genes were identified on the basis of causing specific disruptions in the denticle patterns of dead embryos.

Mutations in the so-called "gap" genes cause the deletion of several adjacent segments (Box 18-4 Figure 4). For example, mutations in the gap gene knirps cause the loss of the second through seventh abdominal segments (normal embryos possess eight such segments). Mutations in the "pair-rule" genes cause the loss of alternating segments. For example, mutations in the even-skipped (eve) gene cause the loss of the even-numbered abdominal segments. Finally, mutations in segment polarity genes do not alter the normal number of segments, but instead cause patterning defects within every segment. For example, normal segments contain denticles in one region, but are naked in the other. In certain segment polarity mutants, such as hedgehog, both regions of every segment contain denticles.
BOX 18-4 FIGURE 3 Ligation experiment. (Panel labels: no thorax formation; thorax forms.) When a hair is used to separate the anterior and posterior halves of early embryos, then determinants from the anterior pole fail to enter posterior regions. As a result, the embryos develop into abnormal flies that lack thoracic structures. In contrast, when the hair separates older embryos (series on the right), then the determinant has already entered ...
BOX 18-4 FIGURE 4 Darkfield images of normal and mutant cuticles. (a) The pattern of denticle hairs in this normal embryo is slightly different among the different body segments (labeled T1 through A8 in the image). (b) The knirps mutant (having a mutation in the gap gene knirps), shown here, lacks the second through seventh abdominal segments. (Source: Nüsslein-Volhard C. and Wieschaus E. 1980. Mutations affecting segment number and polarity in Drosophila. Nature 287: 795-801. Images courtesy of Eric Wieschaus, Princeton University.)
axis of embryos undergoing cellularization about two hours after fertilization. These thresholds initiate the differentiation of three distinct tissues: mesoderm, ventral neurogenic ectoderm, and dorsal neurogenic ectoderm (Figure 18-14). Each of these tissues goes on to form distinctive cell types in the adult fly. The mesoderm forms flight muscles and internal organs, such as the fat body, which is analogous to our liver. The ventral and dorsal neurogenic ectoderm form distinct neurons in the ventral nerve cord.

We now consider the regulation of three different target genes that are activated by high, intermediate, and low levels of the Dorsal protein: twist, rhomboid, and sog. The highest levels of the Dorsal gradient (that is, in nuclei with the highest levels of Dorsal protein) activate the expression of the twist gene in the ventralmost 18 cells that form the mesoderm (Figure 18-14). The twist gene is not activated in lateral regions, the neurogenic ectoderm, where there are intermediate and low levels of the Dorsal protein. The reason for this is that the twist 5' regulatory DNA contains two low-affinity Dorsal binding sites (Figure 18-14). Therefore, peak levels of the Dorsal gradient are required for the efficient occupancy of these sites; the lower levels of Dorsal protein present in lateral regions are insufficient to bind and activate the transcription of the twist gene.
FIGURE 18-14 Three thresholds and three types of regulatory DNAs. (Panel labels: twist, 2 low-affinity Dorsal binding sites; rhomboid, 1 high-affinity Dorsal binding site; sog, 4 optimal Dorsal binding sites.) The twist 5' regulatory DNA ... both high and intermediate levels of the Dorsal gradient to activate rhomboid expression in ventral-lateral regions. Finally, the sog intronic enhancer contains four evenly spaced optimal Dorsal binding sites. These allow high, intermediate, and low levels of the Dorsal gradient to activate sog expression throughout lateral regions.
The rhomboid gene is activated by intermediate levels of the Dorsal protein in the ventral neurogenic ectoderm. The rhomboid 5' flanking region contains a 300 bp enhancer located about 1.5 kb 5' of the transcription start site (Figure 18-15a). This enhancer contains a cluster of Dorsal binding sites, mostly low-affinity sites as seen in the twist 5' regulatory region. At least one of the sites,
FIGURE 18-15 Regulatory DNAs. (a) The rhomboid enhancer contains binding sites for both Dorsal and the Snail repressor. Since the Snail protein is only present in ventral regions (the mesoderm), rhomboid is kept off in the mesoderm and restricted to ventral regions of the neurogenic ectoderm (ventral NE). (b) The intronic sog enhancer also contains Snail repressor sites. These keep sog expression off in the mesoderm and restricted to broad lateral stripes that encompass both ventral and dorsal regions of the neurogenic ectoderm (NE).
however, is an optimal, high-affinity site that permits the binding of intermediate levels of Dorsal protein, the amount present in lateral regions. In principle, the rhomboid enhancer can be activated by both the high levels of Dorsal protein present in the mesoderm and the intermediate levels present in the ventral neurogenic ectoderm, but it is kept off in the mesoderm by a transcriptional repressor called Snail. The Snail repressor is only expressed in the mesoderm; it is not present in the neurogenic ectoderm. The 300 bp rhomboid enhancer contains binding sites for the Snail repressor, in addition to the binding sites for the Dorsal activator. This interplay between the broadly distributed Dorsal gradient and the localized Snail repressor leads to the restricted expression of the rhomboid gene in the ventral neurogenic ectoderm. We have already seen how the localized Ash1 repressor blocks the action of the SWI5 activator in the daughter cell of budding yeast, and further along in this chapter we will see the extensive use of this principle in other aspects of Drosophila development.

The lowest levels of the Dorsal protein, present in lateral regions of the early embryo, are sufficient to activate the sog gene in broad lateral stripes that encompass both the ventral and dorsal neurogenic ectoderm. Expression of sog is regulated by a 400 bp enhancer located within the first intron of the gene (Figure 18-15b). This enhancer contains a series of four evenly spaced high-affinity Dorsal binding sites that can therefore be occupied even by the lowest levels of the Dorsal protein. As seen for rhomboid, the presence of the Snail repressor precludes activation of sog expression in the mesoderm despite the high levels of the Dorsal protein found there. Thus, the differential regulation of gene expression by different thresholds of the Dorsal gradient depends on the combination of the Snail repressor and the affinities of the Dorsal binding sites.

The occupancy of Dorsal binding sites is not only determined by the intrinsic affinities of the sites but also depends on protein-protein interactions between Dorsal and other regulatory proteins bound to the target enhancers. For example, we have seen that the 300 bp rhomboid enhancer is activated by intermediate levels of the Dorsal gradient in the ventral neurogenic ectoderm. This enhancer contains mostly low-affinity Dorsal binding sites. However, intermediate levels of Dorsal are sufficient to bind these sites due to protein-protein interactions with another activator protein called Twist. Dorsal and Twist bind to adjacent sites within the rhomboid enhancer. Not only do the two proteins help one another bind the enhancer, but once bound, they work in a synergistic fashion to stimulate transcription (see Box 18-5, The Role of Activator Synergy in Development).
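The threshold logic described above can be illustrated with a minimal sketch. It assumes simple equilibrium binding of Dorsal to a single effective site per enhancer; the dissociation constants, Dorsal concentrations, and the 0.5 occupancy cutoff are hypothetical values chosen only to reproduce the qualitative pattern in the text, not measured quantities. In particular, the single effective affinity assigned to rhomboid stands in for the combined effect of its one optimal site and the help it receives from Twist.

```python
def occupancy(conc, kd):
    """Fractional occupancy of one site at equilibrium: conc / (conc + kd)."""
    return conc / (conc + kd)

# Hypothetical effective dissociation constants (arbitrary units).
# A lower kd means a higher-affinity enhancer.
ENHANCERS = {
    "twist":    {"kd": 8.0, "snail_sites": False},  # low-affinity sites
    "rhomboid": {"kd": 2.0, "snail_sites": True},   # intermediate overall
    "sog":      {"kd": 0.3, "snail_sites": True},   # optimal sites
}

# Hypothetical Dorsal levels; Snail is present only in the mesoderm.
REGIONS = {
    "mesoderm":   {"dorsal": 10.0, "snail": True},
    "ventral NE": {"dorsal": 3.0,  "snail": False},
    "dorsal NE":  {"dorsal": 0.5,  "snail": False},
}

ACTIVATION_CUTOFF = 0.5  # assumed occupancy needed for transcription


def expressed(gene, region):
    """True if the gene is predicted to be on in the given region."""
    enhancer = ENHANCERS[gene]
    tissue = REGIONS[region]
    # Snail overrides Dorsal activation wherever the enhancer has Snail sites.
    if enhancer["snail_sites"] and tissue["snail"]:
        return False
    return occupancy(tissue["dorsal"], enhancer["kd"]) >= ACTIVATION_CUTOFF


def expression_pattern(gene):
    """Predicted on/off state of a gene in each region."""
    return {region: expressed(gene, region) for region in REGIONS}


if __name__ == "__main__":
    for gene in ENHANCERS:
        print(gene, expression_pattern(gene))
```

With these assumed numbers, twist is on only in the mesoderm, rhomboid only in the ventral neurogenic ectoderm, and sog in both ventral and dorsal neurogenic ectoderm, which is the pattern the text attributes to binding-site affinity combined with Snail repression.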
Box 18-5 The Role of Activator Synergy in Development

Perhaps as little as a twofold difference in the levels of the Dorsal protein determines whether a naive embryonic cell forms a muscle cell or neuron. This regulatory switch in cell identity depends on the sharp lateral limits of the Snail expression pattern, which demarcate the boundary between the presumptive mesoderm and neurogenic ectoderm (Box 18-5 Figure 1). Cells that express snail invaginate to form mesoderm, while cells located in more lateral regions (which lack snail expression) form derivatives of the neurogenic ectoderm. The formation of the sharp snail borders depends, in part, on the multiplication of the Dorsal and Twist gradients. The idea is that the broad Dorsal gradient triggers a slightly steeper
Twist pattern, and then the Dorsal and Twist proteins function synergistically within the limits of the snail 5' regulatory DNA to activate expression (Box 18-5 Figure 1). There is a cluster of low-affinity Dorsal sites located about 1 kb upstream of the transcription start site of the snail gene and two Twist binding sites near the snail promoter. Because of the distance separating these sites, it is unlikely that Dorsal and Twist physically interact to facilitate cooperative binding to DNA. Instead, they might make separate contacts with different rate-limiting transcription complexes ("promiscuous synergy," see Chapter 17). For example, Dorsal might render the snail 5' regulatory region in an "open" conformation by recruiting an enzymatic complex that modifies chromatin, such as SWI/SNF or HAT. This opening of the snail 5' regulatory region might facilitate the binding of Twist, which subsequently recruits the TFIID/Pol II complex to the core promoter (see Chapter 17). We see later in this chapter that Bicoid and Hunchback function in a synergistic fashion to activate eve stripe 2.

A similar principle is used to specify the dorsal mesoderm in a vertebrate embryo, as we now discuss. The dorsal mesoderm of the Xenopus embryo is the source of important signaling molecules that control the development of the central nervous system (CNS) during gastrulation. The formation of the dorsal mesoderm depends on localized mRNAs in the unfertilized egg, including VegT. The VegT gene encodes a sequence-specific transcription factor that leads to the activation of the Xnr gene throughout the presumptive mesoderm. Xnr encodes a TGF-β signaling molecule that is necessary but not sufficient to activate gene expression within the dorsal mesoderm. Instead, activation depends on Xnr and Wnt signaling.

After fertilization, a process called cortical rotation occurs, during which the internal cytoplasm of the egg rotates relative to the plasma membrane (Box 18-5 Figure 2a). Cortical rotation leads to the stabilization of β-catenin along one side of the early embryo, which corresponds to the future dorsal surface. A cell surface protein, β-catenin, is normally released into nuclei upon activation of Frizzled receptors by secreted, extracellular signaling proteins called Wnts. However, cortical rotation may circumvent the need for Wnts and directly induce Frizzled receptors to release β-catenin. Once in the nucleus, β-catenin interacts with a sequence-specific transcription factor, called Tcf or Pangolin. The Tcf/β-catenin complex activates a target gene called siamois, which encodes a homeodomain regulatory protein. Siamois expression is distributed throughout dorsal regions, where there are high levels of β-catenin. This Siamois expression profile intersects with the Xnr signaling molecules distributed throughout the mesoderm (Box 18-5 Figure 2b). The point of intersection corresponds to the dorsal mesoderm; Siamois functions synergistically with Xnr to activate target genes in the dorsal mesoderm. One of the first genes to be activated is called goosecoid, which encodes a homeodomain regulatory protein.

The 5' regulatory DNA of the goosecoid gene contains binding sites for Siamois as well as for "Smad" proteins. Smads are transcription factors that are induced by the activation of TGF-β cell surface receptors (Box 18-5 Figure 2b). In the absence of signaling, Smads are inactive due to their association with the intracytoplasmic domains of the TGF-β receptors at the cell surface. Upon signaling, however, the Smads are released into nuclei. This results in the binding of Smads to the goosecoid 5' regulatory DNA. Smads and Siamois now function synergistically to activate goosecoid expression within the dorsal mesoderm. The site of expression corresponds to the one region of the embryo where there are high levels of both activators.
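The idea that multiplying two gradients sharpens a border can be made concrete with a short numerical sketch. The exponential shape of the Dorsal gradient, its length constant, and the exponent relating Twist to Dorsal are all assumptions introduced for illustration; the only point is that the product of two graded inputs changes more steeply across a fixed distance than either input alone.

```python
import math


def dorsal(x, length_constant=0.3):
    """Assumed exponential Dorsal gradient; x = distance from the ventral midline."""
    return math.exp(-x / length_constant)


def twist(x):
    """Assumed Twist pattern, slightly steeper than Dorsal (exponent is hypothetical)."""
    return dorsal(x) ** 1.5


def snail_input(x):
    """Synergy modeled as multiplication of the two activator gradients."""
    return dorsal(x) * twist(x)


def fold_change(profile, x_inner, x_outer):
    """How many times higher the profile is at x_inner than at x_outer."""
    return profile(x_inner) / profile(x_outer)


if __name__ == "__main__":
    inner, outer = 0.2, 0.3  # two nearby positions straddling a border
    print("Dorsal alone:   ", round(fold_change(dorsal, inner, outer), 2))
    print("Twist alone:    ", round(fold_change(twist, inner, outer), 2))
    print("Dorsal x Twist: ", round(fold_change(snail_input, inner, outer), 2))
```

Under these assumptions the combined input drops more than twice as fast across the same interval as Dorsal alone, which is one way to picture how a modest difference in Dorsal can be converted into a sharp on/off border of snail expression.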
BOX 18-5 FIGURE 1 Model for Dorsal-Twist synergy. The broad Dorsal nuclear gradient activates the twist gene in ventral regions. The Dorsal and Twist proteins work synergistically to activate a variety of genes in ventral and ventral-lateral regions. It has been suggested that Dorsal recruits chromatin-modifying complexes while Twist stimulates transcription by interacting with Mediator or TFIID complexes. (Source: Stathopoulos A. and Levine M. 2002. Dorsal gradient networks in the Drosophila embryo. Dev. Biology 246: 57-67, fig 2, p. 59. Copyright © 2002 with permission from Elsevier.)
BOX 18-5 FIGURE 2 Specification of the dorsal mesoderm in the Xenopus embryo. (a) The Xenopus egg contains a number of localized mRNAs including VegT and Vg1. VegT encodes a T-box DNA-binding protein while Vg1 encodes an activin/TGF-β signaling molecule. They lead to the expression of Xnr in vegetal regions. Cortical rotation occurs after fertilization and leads to the stabilization of β-catenin along the future dorsal surface. The point of intersection between the Xnr and β-catenin domains defines the dorsal mesoderm and leads to the activation of a number of genes such as goosecoid. (b) β-Catenin in dorsal regions leads to the activation of the siamois gene, which encodes a homeobox regulatory protein. The Xnr signaling molecule leads to the activation of Smads. In this region they work synergistically to activate the goosecoid gene. (Source: (a) Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 1211, f21-66. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc. (b) Adapted from Gilbert S.F. 2000. Developmental biology, 6th edition, p. 322, fig. 10.25. Copyright © 2000 Sinauer Associates. Used with permission. And from Moon R. and Kimelman D. 1998. From cortical rotation to organizer gene expression. BioEssays 20: 542, fig. 3. © 1998. Used by permission of John Wiley & Sons, Inc.)
Segmentation Is Initiated by Localized RNAs at the Anterior and Posterior Poles of the Unfertilized Egg

At the time of fertilization, the Drosophila egg contains two localized mRNAs. One, the bicoid mRNA, is located at the anterior pole, while the other, the oskar mRNA, is located at the posterior pole (Figure 18-16a). The oskar mRNA encodes an RNA-binding protein that is responsible for the assembly of polar granules. These are large macromolecular complexes composed of a variety of different proteins and RNAs. The polar granules control the development of tissues that arise from posterior regions of the early embryo, including the abdomen and the pole cells, which are the precursors of the germ cells (Figure 18-16b).

The oskar mRNA is synthesized within the ovary of the mother fly. It is first deposited at the anterior end of the immature egg, or oocyte, by "helper" cells called nurse cells. But, as the oocyte enlarges to form the mature egg, the oskar mRNA is transported from anterior to posterior regions. This localization process depends on specific sequences within the 3' UTR of the oskar mRNA (Figure 18-17). We have already
seen how the 3' UTR of the ash1 mRNA mediates its localization to the daughter cell of budding yeast by interacting with the growing ends of microtubules. A remarkably similar process controls the localization of the oskar mRNA in the Drosophila oocyte.

The Drosophila oocyte is highly polarized. The nucleus is located in anterior regions; growing microtubules extend from the nucleus into the posterior cytoplasm. The oskar mRNA interacts with adapter proteins that are associated with the growing + ends of the microtubules and are thereby transported away from anterior regions of the egg, where the nucleus resides, into the posterior plasm. After fertilization, the cells that inherit the localized oskar mRNA (and polar granules) form the pole cells.

The localization of the bicoid mRNA in anterior regions of the unfertilized egg also depends on sequences contained within its 3' UTR. The nucleotide sequences of the oskar and bicoid mRNAs are distinct. As a result, they interact with different adapter proteins and become localized to different regions of the egg. The importance of the 3' UTRs in determining where each mRNA becomes localized is revealed by the following experiment. If the 3' UTR from the oskar mRNA is replaced with that from bicoid, the hybrid oskar mRNA is localized to anterior regions (just as bicoid normally is). This mislocalization is sufficient to induce the formation of pole cells at abnormal locations in the early embryo (see Figure 18-17). In addition, the mislocalized polar granules suppress the expression of genes required for the differentiation of head tissues. As a result, embryonic cells that normally form head tissues are transformed into germ cells.
FIGURE 18-16 Localization of maternal mRNAs in the Drosophila egg and embryo. (a) The unfertilized Drosophila egg contains two localized mRNAs, bicoid in anterior regions and oskar in posterior regions. (b) The Oskar protein helps coordinate the assembly of the polar granules in the posterior cytoplasm. Nuclei that enter this region bud off the posterior end of the embryo and form the pole cells. (c) During formation of the Drosophila egg, polarized microtubules are formed that extend from the oocyte nucleus and grow toward the posterior plasm. The oskar mRNA binds adapter proteins that interact with the microtubules, and thereby transport the RNA to the posterior plasm. The "-" and "+" symbols indicate the direction of the growing strands of the microtubules.

FIGURE 18-17 The bicoid and oskar mRNAs contain different UTR sequences. The bicoid UTR causes it to be localized to the anterior pole while the distinct oskar UTR sequence causes localization in the posterior plasm. An engineered oskar mRNA that contains the bicoid UTR is localized to the anterior pole, just like the normal bicoid mRNA. This mislocalization of oskar causes the formation of pole cells in anterior regions. Pole cells also form from the posterior pole due to localization of the normal oskar mRNA in the posterior plasm.
The Bicoid Gradient Regulates the Expression of Segmentation Genes in a Concentration-Dependent Fashion

The Bicoid regulatory protein is synthesized prior to the completion of cellularization. As a result, it diffuses away from its source of synthesis at the anterior pole and becomes distributed in a broad concentration gradient along the length of the early embryo. The Bicoid gradient is formed in a way that is different from the Gli and Dorsal gradients. We have already seen that these latter gradients depend on the differential activation of cell surface receptors. By simply diffusing across the syncytial embryo, Bicoid bypasses the need for cell signaling. Once formed, however, it produces multiple thresholds of gene expression, just like the Gli and Dorsal gradients.

There are peak levels of the Bicoid protein in anterior regions, intermediate levels in central regions, and low levels in posterior regions (Figure 18-18). Different concentrations of the Bicoid protein are required for the regulation of different target genes, just as we have seen for Dorsal. Peak levels of Bicoid are required for the activation of genes in anterior regions that will form head structures; intermediate levels are sufficient for the activation of those genes required for the differentiation of the thorax.

We consider the differential regulation of two Bicoid target genes, orthodenticle and hunchback. Only high concentrations of Bicoid activate the expression of orthodenticle, which is essential for the differentiation of head structures (Figure 18-18a). In contrast, both high and intermediate concentrations of Bicoid are sufficient to activate hunchback, which is required for the development of the thorax. This differential regulation of orthodenticle and hunchback depends on the binding affinities of Bicoid recognition sequences. We have already seen that Dorsal binding affinities are important for ensuring different thresholds of gene expression across the dorsal-ventral axis.

FIGURE 18-18 The Bicoid gradient activates gene expression in a concentration-dependent fashion. (a) The broad anterior-posterior Bicoid protein gradient produces different thresholds of orthodenticle and hunchback gene expression. Orthodenticle is activated only by high levels of the Bicoid gradient in the head; hunchback is activated by both high and intermediate levels of the Bicoid gradient in head and thorax. The orthodenticle and hunchback 5' regulatory DNAs contain Bicoid binding sites. However, all three Bicoid sites in the orthodenticle enhancer bind with low affinity, whereas the three sites in the hunchback regulatory region are high-affinity sites. (b) In central regions of the embryo, the orthodenticle gene is off because the levels of Bicoid protein are insufficient to bind the low-affinity sites in the orthodenticle 5' regulatory DNA. In contrast, hunchback is on because these levels of Bicoid are sufficient to bind the high-affinity sites in the hunchback regulatory region.

FIGURE 18-19 Hunchback protein gradient and translation inhibition by Nanos. The nanos mRNA is associated with polar granules. After its translation, the protein diffuses from posterior regions to form a gradient. The maternal hunchback mRNA is distributed throughout the early embryo, but its translation is arrested ...

The restricted expression of the orthodenticle gene is regulated by a 186 bp enhancer located 5' of the transcription start site (Figure 18-18a). This enhancer contains a series of low-affinity Bicoid binding sites, which can be occupied only when Bicoid is present at high concentrations (that is, in nuclei at the high end of the Bicoid gradient). As a result, orthodenticle is transcribed only in anterior regions and not in posterior regions where there are lower levels of the Bicoid activator (Figure 18-18b). In contrast, the hunchback gene is regulated by a 5' enhancer that contains high-affinity Bicoid binding sites. These are bound by both high and intermediate levels of the Bicoid protein, and consequently, hunchback is transcribed in both anterior and central regions of the embryo.

The Bicoid protein binds to DNA as a monomer. This is different from many other regulatory proteins, such as the λ repressor and Dorsal, which bind to DNA as dimers. Bicoid monomers interact with one another to foster the cooperative occupancy of adjacent sites. This cooperative binding produces sharp on/off borders in the hunchback expression pattern. Perhaps as little as a twofold decline in the levels of the Bicoid gradient determines whether the Bicoid binding sites in the hunchback enhancers are occupied or not, and hence, a sharp border of hunchback expression is established in the middle of the embryo. We have already encountered this principle with regard to the λ repressor (Chapter 16) and the regulation of the β-interferon gene in mammals (Chapter 17).
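A small numerical sketch may help show why cooperative binding turns a twofold change in Bicoid into a near on/off switch, and why a lower-affinity enhancer places its border closer to the anterior pole. The exponential form of the gradient, the decay length, the half-saturation constants, and the Hill coefficient are all assumptions introduced for illustration; they are not values given in the text.

```python
import math


def bicoid(x, peak=1.0, decay_length=0.2):
    """Assumed exponential Bicoid gradient; x = 0 is the anterior pole, x = 1 the posterior."""
    return peak * math.exp(-x / decay_length)


def hill_occupancy(conc, half_max, n):
    """Hill function: n = 1 is independent binding, n > 1 mimics cooperativity."""
    return conc ** n / (half_max ** n + conc ** n)


def border_position(half_max, peak=1.0, decay_length=0.2):
    """Position where the assumed gradient falls to the enhancer's half-maximal level."""
    return decay_length * math.log(peak / half_max)


if __name__ == "__main__":
    # Twofold decline centered on the half-maximal concentration (set to 1.0).
    high, low = math.sqrt(2), 1 / math.sqrt(2)
    for n in (1, 5):
        print(f"n={n}: occupancy {hill_occupancy(high, 1.0, n):.2f} -> "
              f"{hill_occupancy(low, 1.0, n):.2f}")

    # Hypothetical half-maximal levels: orthodenticle (low affinity) needs more Bicoid.
    print("orthodenticle border:", round(border_position(0.5), 2))
    print("hunchback border:    ", round(border_position(0.1), 2))
```

With independent binding (n = 1) a twofold decline only moves occupancy from roughly 0.59 to 0.41, whereas with an assumed Hill coefficient of 5 the same decline moves it from about 0.85 to 0.15. The second part of the sketch shows the low-affinity enhancer switching off nearer the anterior pole than the high-affinity one.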
Hunchback Expression Is also Regulated at the Level of Translation

The localized expression of the hunchback gene in the anterior half of the early embryo is a major event in the subdivision of the embryo into a series of segments. We will see that the encoded Hunchback regulatory protein controls the expression of several genes that are essential for segmentation. Before describing this process we first consider the regulation of Hunchback expression in a bit more detail.

The hunchback gene is actually transcribed from two promoters: one activated by the Bicoid gradient as discussed above; the other controls expression in the developing oocyte. The latter, "maternal" promoter leads to the synthesis of a hunchback mRNA that is evenly distributed throughout the cytoplasm of unfertilized eggs. The translation of this maternal transcript is blocked in posterior regions by an RNA-binding protein called Nanos (Figure 18-19). Nanos is found only in posterior regions because its mRNA is, in turn, selectively localized there through interactions between its 3' UTR and the polar granules we encountered earlier. Nanos protein binds specific RNA sequences, NREs (Nanos response elements), located in the 3' UTR of the maternal hunchback mRNAs, and this binding causes a reduction in the hunchback poly-A tail, which in turn destabilizes the RNA and inhibits its translation (see Chapter 14). Thus, we see that the Bicoid gradient activates the zygotic hunchback promoter in the anterior half of the embryo, while Nanos inhibits the translation of the maternal hunchback mRNA in posterior regions (see Figure 18-19). This dual regulation of hunchback expression produces a steep Hunchback protein gradient with the highest concentrations located in the anterior half of the embryo, and sharply diminishing levels in the posterior half.
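The dual regulation just described can be summarized in a toy calculation. Everything numerical here is an assumption made for illustration: the zygotic contribution is treated as a step confined to the anterior half, the maternal mRNA as uniform, and Nanos as an exponential gradient peaking at the posterior pole that proportionally blocks maternal translation.

```python
import math


def zygotic_hunchback(x):
    """Assumed step: Bicoid-driven transcription only in the anterior half (x < 0.5)."""
    return 1.0 if x < 0.5 else 0.0


def nanos(x, decay_length=0.2):
    """Assumed Nanos gradient, highest (1.0) at the posterior pole x = 1."""
    return math.exp(-(1.0 - x) / decay_length)


def maternal_hunchback_protein(x, mrna_level=1.0):
    """Uniform maternal mRNA whose translation is reduced in proportion to Nanos."""
    return mrna_level * (1.0 - nanos(x))


def hunchback_protein(x):
    """Total Hunchback protein: zygotic plus translated maternal contribution."""
    return zygotic_hunchback(x) + maternal_hunchback_protein(x)


if __name__ == "__main__":
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(position, round(hunchback_protein(position), 3))
```

In this sketch the total is highest across the anterior half and falls to zero at the posterior pole, matching the qualitative description of a steep anterior-to-posterior Hunchback gradient.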
The Gradient of Hunchback Repressor Establishes Different Limits of Gap Gene Expression

Hunchback functions as a transcriptional repressor to establish different limits of expression of the so-called "gap" genes, Krüppel, knirps, and giant (discussed in Box 18-4). We will see that Hunchback also works in concert with the proteins encoded by these gap genes to produce segmentation stripes of gene expression, the first step in subdividing the embryo into a repeating series of body segments.

The Hunchback protein is distributed in a steep gradient that extends through the presumptive thorax and into the abdomen. High levels of the Hunchback protein repress the transcription of Krüppel, whereas intermediate and low levels of the protein repress the expression of knirps and giant, respectively (Figure 18-20a). We have seen that the binding affinities of the Bicoid and Dorsal activators are responsible for producing different thresholds of gene expression. The Hunchback repressor gradient might not work in the same way. Instead, the number of Hunchback repressor sites may be a more critical determinant for distinct patterns of Krüppel, knirps, and giant expression (Figure 18-20b). The Krüppel enhancer contains only three Hunchback binding sites and is repressed by high levels of the Hunchback gradient. In contrast, the giant enhancer contains seven Hunchback sites and is repressed by low levels of the Hunchback gradient. The underlying mechanism here is unknown. Perhaps different thresholds of repression are produced by the additive effects of the individual Hunchback repression domains.
FIGURE 18-20 Hunchback forms sequential gap expression patterns. (a) The anterior-posterior Hunchback repressor gradient establishes different limits of Krüppel, knirps, and giant expression. High levels of Hunchback are required for the repression of Krüppel, but low levels are sufficient to repress giant. (b) The Krüppel and giant 5' regulatory DNAs contain different numbers of Hunchback repressor sites. There are three sites in Krüppel, but seven sites in giant. The increased number of Hunchback sites in the giant enhancer may be responsible for its repression by low levels of the Hunchback gradient. (Source: (a) Redrawn from Gilbert S.F. 1997. Developmental biology, 5th edition, p. 565, fig 14-23. Copyright © 1997 Sinauer Associates. Used with permission.)
Hunchback and Gap Proteins Produce Segmentation Stripes of Gene Expression

A culminating event in the regulatory cascade that begins with the localized bicoid and oskar mRNAs is the expression of a "pair-rule" gene called even-skipped, or simply eve. The eve gene is expressed in a series of seven alternating, or "pair-rule," stripes that extend along the length of the embryo (Figure 18-21). Each eve stripe encompasses four cells, and neighboring stripes are separated by "interstripe" regions (also four cells wide) that express little or no eve. These stripes foreshadow the subdivision of the embryo into a repeating series of body segments.

The eve protein coding sequence is rather small, less than 2 kb in length. In contrast, the flanking regulatory DNAs that control eve expression encompass more than 12 kb of genomic DNA: about 4 kb located 5' of the eve transcription start site, and about 8 kb in the 3' flanking region (see Figure 18-21). The 5' regulatory region is responsible for initiating stripes 2, 3, and 7, while the 3' region regulates stripes 1, 4, 5, and 6. The 12 kb of regulatory DNA contains five separate enhancers that together produce the seven different stripes of eve expression seen in the early embryo. Each enhancer initiates the expression of just one or two stripes. We will now consider the regulation of the enhancer that controls the expression of eve stripe 2.

The stripe 2 enhancer is 500 bp in length and located 1 kb upstream of the eve transcription start site. It contains binding sites for four different regulatory proteins: Bicoid, Hunchback, Giant, and Krüppel (Figure 18-22). We have seen how Hunchback functions as a repressor when controlling the expression of the gap genes; in the context of the eve stripe 2 enhancer, it works as an activator. We will return to this issue (how Hunchback can function as both an activator and repressor) a bit later. In principle, Bicoid and Hunchback can activate the stripe 2 enhancer in the entire anterior half of the embryo because both proteins are present there, but Giant and Krüppel function as repressors that establish the edges of the stripe 2 pattern, the anterior and posterior borders, respectively (see Figure 18-22). (See Box 18-6, Bioinformatics Methods for Identification of Complex Enhancers.)
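The stripe 2 logic described above (broad activators, with repressors carving out the borders) can be sketched as a simple Boolean rule over a row of cells. The cell indices at which each protein is present are hypothetical and chosen only so that the resulting stripe is four cells wide, as stated in the text; requiring both activators at once is also a simplifying assumption.

```python
def stripe2_on(bicoid, hunchback, giant, kruppel):
    """Enhancer is active when both activators are present and neither repressor is."""
    return bicoid and hunchback and not giant and not kruppel


def stripe2_pattern(n_cells=20):
    """On/off state of the stripe 2 enhancer along a row of cells (index 0 = anterior)."""
    pattern = []
    for cell in range(n_cells):
        # Hypothetical expression domains, anterior to posterior.
        bicoid = cell < 12      # activator, broad anterior domain
        hunchback = cell < 12   # activator, broad anterior domain
        giant = cell < 6        # repressor setting the anterior border
        kruppel = cell >= 10    # repressor setting the posterior border
        pattern.append(stripe2_on(bicoid, hunchback, giant, kruppel))
    return pattern


if __name__ == "__main__":
    row = stripe2_pattern()
    print("".join("#" if on else "." for on in row))
```

Running the sketch prints a row in which only cells 6 through 9 are on: the activators alone would switch on the whole anterior domain, and the two repressors trim it to a narrow stripe.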
FIGURE 18-21 Expression of the eve gene in the developing embryo. (Panel (b) labels, 5' to 3' along a scale from -4 kb to +8 kb: stripe #3, #7 enhancer; stripe #2 enhancer; coding; stripe #4, #6 enhancer; stripe #1, #5 enhancer.) (a) Eve expression pattern in the early embryo. (b) The eve locus contains over 12 kb of regulatory DNA. The 5' regulatory region contains two enhancers. These control the expression of stripes 2, 3, and 7. Each enhancer is 500 bp in length. The 3' regulatory region ...
FIGURE 18-22 Regulation of eve stripe 2. (Panel labels: (a) repressors, Krüppel and Giant; activators, Bicoid and Hunchback. (b) Protein levels plotted against position along the embryo, anterior to posterior.) (a) The 500 bp enhancer contains a total of twelve binding sites for the Bicoid, Hunchback, Krüppel, and Giant proteins. The distribution of these regulatory proteins in the early Drosophila embryo is summarized in the diagram shown in (b). There are high levels of the Bicoid and Hunchback proteins in the cells that express eve stripe 2. The borders of the stripe are formed by the Giant and Krüppel repressors. (Giant is expressed in ...
Krüppel mediates transcriptional repression through two distinct mechanisms. One is competition, which is similar to the strategy employed by many prokaryotic repressors (discussed in Chapter 16). There are three Krüppel binding sites in the stripe 2 enhancer (Figure 18-23). Two of these sites directly overlap Bicoid activator sites, and so it appears that the binding of Krüppel to these sites precludes the binding of the activator. The third Krüppel repressor
Box 18-6 Bioinformatics Methods for the Identification of Complex Enhancers
A variety of computer programs have been developed to identify regulatory DNAs within genomes that have been completely sequenced, known as "whole-genome assemblies." These programs take advantage of the fact that regulatory DNAs contain dense clusters of DNA-binding sites. For example, the eve stripe 2 enhancer is 500 bp and contains 12 separate binding sites for four different regulatory proteins: Bicoid, Hunchback, Krüppel, and Giant (see Figure 18-22). Thus, there is more than one binding site per 50 bp over the length of the enhancer. This density of binding sites is typical of enhancers that direct localized patterns of gene expression in the early Drosophila embryo.

As we have discussed in this chapter, a number of regulatory proteins have been implicated in the regulation of pair-rule stripes of gene expression in the Drosophila embryo. These include Bicoid, Hunchback, Krüppel, Giant, and Knirps. Unfortunately, an insufficient number of Giant binding sites have been identified to determine the range of sequences that this protein is likely to recognize. In contrast, there is extensive DNA binding information for the other four regulatory proteins, as well as for a homeodomain protein called Caudal, which is expressed in a broad gradient in the posterior half of the embryo, where it functions as a transcriptional activator.
Bicoid, Caudal, Hunchback, Krüppel, and Knirps each bind DNA as a monomer and recognize relatively simple sequences that are present in extremely high copy number in the Drosophila genome. Bicoid, for example, recognizes a simple sequence that contains ATTA and occurs roughly every 1 kb in the Drosophila genome. Therefore, the use of Bicoid binding sites for identifying segmentation enhancers would be futile because there are more than 100,000 such sites in the genome (nearly ten sites per gene). However, clustering Bicoid binding sites, together with the binding sites of regulatory proteins that work together with Bicoid, provides a powerful filter for eliminating fortuitous binding sites (or "noise").

Consider a 1 Mb region encompassing the eve locus (Box 18-6 Figure 1). There are thousands of Bicoid, Caudal, Hunchback, Krüppel, and Knirps binding sites in this interval (Box 18-6 Figure 1a). There are, however, only three clusters that contain at least 13 binding sites in a window of 700 bp or less, a density of nearly one binding site per 50 bp (Box 18-6 Figure 1b). Remarkably, these three clusters map in the 5' and 3' regulatory regions of the eve gene. One cluster corresponds to the eve stripe 3/7 enhancer, another cluster coincides with the eve stripe 2 enhancer, and the third cluster is located in the 3' regulatory region and coincides with the eve stripe 4/6 enhancer (Box 18-6 Figure 1).

Clustering of DNA-binding sites has proven to be a valuable tool for identifying enhancers in the Drosophila genome. However, the current computer programs are not 100% accurate. In the best cases, only approximately one-third of the identified clusters correspond to actual enhancers. It is conceivable that a higher hit rate will be obtained by placing spatial constraints on binding sites rather than relying solely on simple clustering of sites. We saw in Chapter 17, for example, that the interferon enhanceosome contains binding sites with fixed spacing, including helical phasing between neighboring sites.
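The clustering filter described in this box can be sketched as a sliding window over sorted binding-site coordinates. The code below is a minimal illustration, not the program used to produce Box 18-6 Figure 1: the function name, the merging of overlapping windows, and the synthetic site positions are assumptions. Only the default thresholds (at least 13 sites within 700 bp) come from the text.

```python
def find_site_clusters(site_positions, min_sites=13, max_window=700):
    """Return (start, end) intervals in which at least `min_sites` binding
    sites fall within a span of `max_window` bp or less.

    Overlapping qualifying windows are merged into a single interval.
    """
    positions = sorted(site_positions)
    clusters = []  # list of [start, end] intervals
    left = 0
    for right, pos in enumerate(positions):
        # Shrink the window from the left until it spans at most max_window bp.
        while pos - positions[left] > max_window:
            left += 1
        if right - left + 1 >= min_sites:
            start = positions[left]
            if clusters and start <= clusters[-1][1]:
                clusters[-1][1] = pos          # extend the previous cluster
            else:
                clusters.append([start, pos])  # open a new cluster
    return [tuple(c) for c in clusters]

# Synthetic example: sparse "noise" sites every 1 kb across 1 Mb,
# plus one dense group of 13 sites spaced 50 bp apart.
background = list(range(250, 1_000_000, 1000))
dense_group = [500_010 + 50 * i for i in range(13)]
print(find_site_clusters(background + dense_group))
```

With these synthetic coordinates the thousand scattered background sites never qualify on their own, and only the dense group is reported. A real analysis would first have to locate candidate sites by matching each protein's recognition sequence, which this sketch does not attempt.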
BOX 18-6 FIGURE 1 Clusters of binding sites identify eve stripe enhancers. Panel titles: (a) High stringency matches; (b) High stringency matches and clustering filter; (c) Expanded view of even-skipped region. (a) Individual Bicoid, Caudal, Hunchback, Krüppel, and Knirps binding sites in a 1 Mb region that contains the even-skipped locus (in center, along with other intron-exon structures of neighboring genes). (b) High density clustering of binding sites is uniquely detected near eve and not elsewhere in the 1 Mb region. (c) There are three high density clusters of binding sites associated with eve. These coincide with the stripe 3/7, stripe 2, and stripe 4/6 enhancers. (Source: Redrawn from Berman B.P. et al. 2002. Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc. Natl. Acad. Sci. 99: 757-762, fig. 1, p. 759.)
site maps about 50 bp from the nearest Bicoid activator site within the stripe 2 enhancer. In this case Krüppel and Bicoid can co-occupy the neighboring sites. Once bound to DNA, however, Krüppel is able to inhibit the action of the Bicoid activator bound nearby. This second form of repression, quenching, depends on the recruitment of a transcriptional repressor called CtBP (see Figure 18-23), which we considered earlier in the context of the Notch signaling pathway. Recent studies suggest that CtBP possesses an enzymatic activity, which somehow impairs the function of neighboring activators. It is likely that Giant employs a similar combination of competition and inhibition to establish the anterior border of the stripe.

This basic mechanism of stripe formation (broadly distributed activators and localized repressors) is a recurring theme in development. The same principle governs HO expression in yeast, and we also saw how the localized Snail repressor restricts the action of the broad Dorsal nuclear gradient and limits the expression of the rhomboid and sog genes to lateral regions that form the neurogenic ectoderm.

It is not known how Hunchback is able to function as an activator in the context of the eve stripe 2 enhancer, but it is indispensable in this role. The removal of the single Hunchback binding site within the stripe 2 enhancer essentially abolishes stripe 2 expression. Moreover, replacing this site with an optimal Bicoid recognition sequence causes only a partial restoration in enhancer function. We have seen other examples of this type of transcription synergy in Chapter 17, including the activation of HO expression by SWI5 and SBF in yeast, and the activation of the interferon gene by NF-κB and Jun/ATF in mammals. In all of these examples, the presence of two different classes of transcriptional activators induces far more robust expression than does either one alone. In the case of HO regulation, the SWI5 and SBF activators function synergistically by recruiting different transcription complexes required for activation: SWI5 recruits the SWI/SNF nucleosome remodeling complex, whereas SBF recruits the Mediator Complex at the core promoter. It is easy to imagine that a similar mechanism applies to the activation of the eve stripe 2 enhancer by Bicoid and Hunchback.
Gap Repressor Gradients Produce Many Stripes of Gene Expression

Eve stripe 2 is formed by the interplay of broadly distributed activators (Bicoid and Hunchback) and localized repressors (Giant and Krüppel). The same basic mechanism applies to the regulation of the other eve enhancers as well. For example, the enhancer that directs the expression of eve stripe 3 can be activated throughout the early embryo by ubiquitous transcriptional activators. The stripe borders are defined by localized gap repressors: Hunchback establishes the anterior border, while Knirps specifies the posterior border (Figure 18-24).

The enhancer that controls the expression of eve stripe 4 is also repressed by Hunchback and Knirps. However, different concentrations of these repressors are required in each case. Low levels of the Hunchback gradient that are insufficient to repress the eve stripe 3 enhancer are sufficient to repress the eve stripe 4 enhancer (Figure 18-24). This differential regulation of the two enhancers by the Hunchback repressor gradient produces distinct anterior borders for the stripe 3 and stripe 4 expression patterns. The Knirps protein is also distributed in a gradient in the pre-cellular embryo. Higher levels
FIGURE 18-23 Two distinct modes of transcriptional repression. (a) Competition: the binding of Krüppel repressor to the Kr1 and Kr3 sites precludes the binding of Bicoid to overlapping sites. (b) Quenching: the binding of Krüppel repressor to the Kr2 site does not interfere with the binding of the Bicoid activator to neighboring sites. In this case, Krüppel mediates repression by recruiting the CtBP repressor protein. CtBP contains an enzymatic activity that might modify the Bicoid activator so that it can no longer stimulate transcription.
FIGURE 18-24 Differential regulation of the stripe 3 and stripe 4 enhancers by opposing gradients of the Hunchback and Knirps repressors. The two stripes are positioned in different regions of the embryo. The eve stripe 3 enhancer is repressed by high levels of the Hunchback gradient but low levels of the Knirps gradient. Conversely, the stripe 4 enhancer is repressed by low levels of the Hunchback gradient but high levels of Knirps. The stripe 3 enhancer contains just a few Hunchback binding sites, and as a result, high levels of the Hunchback gradient are required for its repression. The stripe 3 enhancer contains many Knirps binding sites, and consequently, low levels of Knirps are sufficient for repression. The stripe 4 enhancer has the opposite organization of repressor binding sites. There are many Hunchback sites, and these allow low levels of Hunchback to mediate repression. The stripe 4 enhancer contains just a few ...
of this gradient are required to repress the stripe 4 enhancer than are needed to repress the stripe 3 enhancer. This distinction produces discrete posterior borders of the stripe 3 and stripe 4 expression patterns.

We have seen that the Hunchback repressor gradient produces different patterns of Krüppel, Knirps, and Giant expression. This differential regulation might be due to the increasing number of Hunchback binding sites in the Krüppel, Knirps, and Giant enhancers. A similar principle applies to the differential regulation of the stripe 3 and stripe 4 enhancers by the Hunchback and Knirps gradients. The eve stripe 3 enhancer contains relatively few Hunchback binding sites but many Knirps sites, whereas the eve stripe 4 enhancer contains many Hunchback sites but relatively few Knirps sites (see Figure 18-24). Similar principles are likely to govern the regulation of the remaining stripe enhancers that control the eve expression pattern.
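The link between the number of repressor binding sites and the repressor concentration needed for repression can be made concrete with a deliberately simple occupancy model. Everything here is an assumption for illustration: each site is taken to be occupied independently with probability c / (c + K_d), an enhancer is treated as repressed once the mean number of occupied sites reaches `sites_needed`, and the site counts (3 versus 10) are invented. Setting n * c / (c + K_d) = m and solving for c gives the threshold c = K_d * m / (n - m).

```python
def repression_threshold(n_sites, sites_needed=2.0, k_d=1.0):
    """Lowest repressor concentration at which the mean number of occupied
    sites, n_sites * c / (c + k_d), reaches sites_needed.

    Toy model: independent sites with identical affinity (an assumption).
    Returns infinity if the enhancer has too few sites to ever qualify.
    """
    if n_sites <= sites_needed:
        return float("inf")
    return k_d * sites_needed / (n_sites - sites_needed)

# Hypothetical site counts reflecting "few" versus "many" sites.
site_counts = {
    "stripe 3": {"Hunchback": 3, "Knirps": 10},
    "stripe 4": {"Hunchback": 10, "Knirps": 3},
}
for enhancer, counts in site_counts.items():
    for repressor, n in counts.items():
        threshold = repression_threshold(n)
        print(f"{enhancer}: {repressor} threshold = {threshold:.2f} (units of K_d)")
```

Under these assumptions an enhancer with many sites for a repressor responds to a lower concentration of it, so the same two gradients place the borders of the two stripes at different positions.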
Short-Range Transcriptional Repressors Permit Different Enhancers to Work Independently of One Another within the Complex eve Regulatory Region

We have seen that eve expression is regulated in the early embryo by five separate enhancers. In fact, there are additional enhancers that control eve expression in the heart and CNS of older embryos. This type of complex regulation is not a peculiarity of eve. There are genetic loci that contain even more enhancers distributed over even larger distances. For example, in the next chapter we will discuss the regulation of homeotic genes, which are responsible for making the body segments of the adult fly morphologically distinct from one another. Several of these genes are regulated by as many as ten different enhancers, perhaps more, that are scattered over distances approaching 100 kilobases. Thus, genes engaged in important developmental processes are often regulated by multiple enhancers. How do these enhancers work independently of one another to produce additive patterns of gene expression? In the case of eve, five separate enhancers produce seven different stripes.

Short-range transcriptional repression is one mechanism for ensuring enhancer autonomy: the independent action of multiple enhancers to generate additive patterns of gene expression. This means that repressors bound to one enhancer do not interfere with the activators bound to another enhancer within the regulatory region of the same gene. For example, we have seen that the Krüppel repressor binds to the eve stripe 2 enhancer and establishes the posterior border of the stripe 2 pattern. The Krüppel repressor works only within the limits of the 500 bp stripe 2 enhancer. It does not repress the core promoter or the activators contained within the stripe 3 enhancer, both of which map more than 1 kb away from the Krüppel repressor sites within the stripe 2 enhancer (Figure 18-25). If Krüppel were able to function over long distances, then it would interfere with the expression of eve stripe 3, because high levels of the Krüppel repressor are present in that region of the embryo where the eve stripe 3 enhancer is active.

The underlying mechanism is not fully understood. We have already seen that the Krüppel repressor mediates two forms of repression: competition and quenching. In the case of competition, the activator must bind to a sequence that directly overlaps the core Krüppel recognition sequence. Krüppel also recruits the CtBP protein, which is able to function over a distance of 100 bp or less to inhibit nearby activators within the stripe 2 enhancer. The CtBP repressor does not inhibit activators whose binding sites map more than 100 bp away, for example, those bound within the stripe 3 enhancer.
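The distance rule behind enhancer autonomy can be written as a one-line check. In the sketch below, the function name and all coordinates are hypothetical (positions are in bp on an arbitrary scale); only the 100 bp limit is taken from the text.

```python
def quenched_activators(repressor_sites, activator_sites, max_range=100):
    """Return the activator positions that lie within `max_range` bp of at
    least one bound repressor site (short-range repression)."""
    return sorted(
        site for site in activator_sites
        if any(abs(site - rep) <= max_range for rep in repressor_sites)
    )

# Hypothetical layout: one Krüppel site inside the stripe 2 enhancer,
# activators within the same enhancer, and activators in a second
# enhancer that lies more than 1 kb away.
kruppel_sites = [300]
stripe2_activators = [250, 380]
stripe3_activators = [1800, 1900]

affected = quenched_activators(kruppel_sites, stripe2_activators + stripe3_activators)
print(affected)  # only activators close to the Krüppel site are listed
```

Because the repressor acts only on neighbors within its short range, activators in the distant enhancer are left untouched even when the repressor is abundant, which is the behavior the text attributes to CtBP-dependent quenching.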
FIGURE 18-25 Short-range repression and enhancer autonomy. Different enhancers work independently of one another in the eve regulatory region due to short-range transcriptional repression. Repressors bound to one enhancer do not interfere with activators in the neighboring enhancer ... [Diagram labels: Krüppel gradient; enhancer 2 with Krüppel repressor; enhancer 3 with stripe 3 activators.]
SUMMARY

The cells of a developing embryo follow divergent pathways of development by expressing different sets of genes. Most differential gene expression is regulated at the level of transcription initiation. There are three major strategies: mRNA localization, cell-to-cell contact, and the diffusion of secreted signaling molecules.

mRNA localization is achieved by the attachment of specific 3' UTR sequences to the growing ends of microtubules. This mechanism is used to localize the ash1 mRNA to the daughter cells of budding yeast. It is also used to localize the oskar mRNA to the posterior plasm of the unfertilized egg in Drosophila.

In cell-to-cell contact, a membrane-bound signaling molecule alters gene expression in neighboring cells by activating a cell signaling pathway. In some cases, a dormant transcriptional activator, or co-activator protein, is released from the cell surface into the nucleus. In other cases, a quiescent transcription factor (or transcriptional repressor) already present in the nucleus is modified so that it can activate gene expression. Cell-to-cell contact is used by B. subtilis to establish different programs of gene expression in the mother cell and forespore. A remarkably similar mechanism is used to prevent skin cells from becoming neurons during the development of the insect central nervous system.

Extracellular gradients of secreted cell signaling molecules can establish multiple cell types during the development of a complex tissue or organ. These gradients produce intracellular gradients of activated transcription factors, which, in turn, control gene expression in a concentration-dependent fashion. An extracellular Sonic Hedgehog gradient leads to a Gli activator gradient in the ventral half of the vertebrate neural tube. Different levels of Gli regulate distinct sets of target genes, and thereby produce different neuronal cell types. Similarly, the Dorsal gradient in the early Drosophila embryo elicits different patterns of gene expression across the dorsal-ventral axis. This differential regulation depends on the binding affinities of Dorsal binding sites in the target enhancers.

The segmentation of the Drosophila embryo depends on a combination of localized mRNAs and gradients of regulatory factors. Localized bicoid and oskar mRNAs, at the anterior and posterior poles, respectively, lead to the formation of a steep Hunchback repressor gradient across the anterior-posterior axis. This gradient establishes sequential patterns of Krüppel, Knirps, and Giant in the presumptive thorax and abdomen. These four proteins are collectively called gap proteins; they function as transcriptional repressors that establish localized stripes of pair-rule gene expression.

Individual stripes are regulated by separate enhancers located in the regulatory regions of pair-rule genes such as eve. Each enhancer contains multiple binding sites for both activators and gap repressors. It is the interplay of broadly distributed activators, such as Bicoid, and localized gap repressors that establishes the anterior and posterior borders of individual pair-rule stripes. Separate stripe enhancers work independently of one another to produce composite, 7-stripe patterns of pair-rule expression. This enhancer autonomy is due, in part, to short-range transcriptional repression. A gap repressor bound to one enhancer does not interfere with the activities of a neighboring stripe enhancer located in the same gene.
BIBLIOGRAPHY

Books
Alberts B., Johnson A., Lewis J., Raff M., Roberts K., and Walter P. 2002. Molecular biology of the cell, 4th edition. Garland Science, New York.
Davidson E.H. 2001. Genomic regulatory systems. Academic Press, San Diego.
Gilbert S.F. 2000. Developmental biology, 6th edition. Sinauer Associates, Sunderland, Massachusetts.
Lawrence P.A. 1992. The making of a fly: The genetics of animal design. Blackwell Science, Oxford.
Wolpert L., Beddington R., Lawrence P., Meyerowitz E., Smith J., and Jessell T.M. 2002. Principles of development, 2nd edition. Oxford University Press, England.

mRNA Localization
Doe C.Q. and Bowerman B. 2001. Asymmetric cell division: Fly neuroblast meets worm zygote. Curr. Opin. Cell Biol. 13: 68-75.
Grunert S. and St. Johnston D. 1996. RNA localization and the development of asymmetry during Drosophila oogenesis. Curr. Opin. Genet. Dev. 6: 395-402.
Kwon S. and Schnapp B.J. 2001. RNA localization: Shedding light on the RNA-motor linkage. Curr. Biol. 11: R166-168.
Mowry K.L. and Cote C.A. 1999. RNA sorting in Xenopus oocytes and embryos. FASEB J. 13: 435-445.
Riechmann V. and Ephrussi A. 2001. Axis formation during Drosophila oogenesis. Curr. Opin. Genet. Dev. 11: 374-383.
Strome S. 1989. Generation of cell diversity during early embryogenesis in the nematode Caenorhabditis elegans. Int. Rev. Cytol. 114: 81-123.

Cell-to-Cell Contact
Artavanis-Tsakonas S., Rand M.D., and Lake R.J. 1999. Notch signaling: Cell fate control and signal integration in development. Science 284: 770-776.
Doe C.Q. and Skeath J.B. 1996. Neurogenesis in the insect central nervous system. Curr. Opin. Neurobiol. 6: 18-24.
Goodman C.S., Bastiani M.J., Doe C.Q., and Dulac S. 1986. Growth cone guidance and cell recognition in insect embryos. Dev. Biol. 3: 283-300.
Greenwald I. 1998. LIN-12/Notch signaling: Lessons from worms and flies. Genes Dev. 12: 1751-1762.
Losick R. and Dworkin J. 1999. Linking asymmetric division to cell fate: Teaching an old microbe new tricks. Genes Dev. 13: 377-381.

Morphogen Gradients
Belvin M.P. and Anderson K.V. 1996. A conserved signaling pathway: The Drosophila Toll-Dorsal pathway. Annu. Rev. Cell Dev. Biol. 12: 393-416.
Drier E.A. and Steward R. 1997. The dorsoventral signal transduction p...
Strigini M. and Cohen S.M. 1999. Formation of morphogen gradients in the Drosophila wing. Semin. Cell Dev. Biol. 10: 335-344.
Wolpert L. 1996. One hundred years of positional information. Trends Genet. 12: 359-364.

Segmentation
Hulskamp M. and Tautz D. 1991. Gap genes and gradients: The logic behind the gaps. Bioessays 13: 261-268.
Johnstone O. and Lasko P. 2001. Translational regulation and RNA localization in Drosophila oocytes and embryos. Annu. Rev. Genet. 35: 365-406.
Lawrence P.A. and Struhl G. 1996. Morphogens, compartments, and pattern: Lessons from Drosophila? Cell 85: 951-961.
Mahowald A.P. 2001. Assembly of the Drosophila germ plasm. Int. Rev. Cytol. 203: 187-213.
Mannervik M., Nibu Y., Zhang H., and Levine M. 1999. Transcriptional coregulators in development. Science 284: 606-609.
Nüsslein-Volhard C. 1996. Gradients that organize embryo development. Sci. Am. 275: 54-55; 58-61.
Pankratz M.J. and Jackle H. 1990. Making stripes in the Drosophila embryo. Trends Genet. 6: 287-292.
Pick L. 1998. Segmentation: Painting stripes from flies to vertebrates. Dev. Genet. 23: 1-10.
Rongo C., Broihier H.T., Moore L., Van Doren M., Forbes A., and Lehmann R. 1997. Germ plasm assembly and germ cell migration in Drosophila. Cold Spring Harbor Symp. Quant. Biol. 62: 1-11.
Scott M.P. and O'Farrell P.H. 1986. Spatial programming of gene expression in early Drosophila embryogenesis. Annu. Rev. Cell Biol. 2: 49-80.
Struhl G. 1989. Morphogen gradients and the control of body pattern in insect embryos. Ciba Found. Symp. 144: 65-86; discussion 86-91, 92-98.
Wilson J.E. and Macdonald P.M. 1993. Formation of germ cells in Drosophila. Curr. Opin. Genet. Dev. 3: 562-565.
CHAPTER 19

Comparative Genomics and the Evolution of Animal Diversity

At the end of his book, On the Origin of Species, Charles Darwin speculates that

OUTLINE
• Most Animals Have Essentially the Same Genes
• Three Ways Gene Expression Is Changed during Evolution
• Experimental Manipulations that Alter Animal Morphology
• Morphological Changes in Crustaceans and Insects
• Genome Evolution and Human Origins

FIGURE 19-1 Summary of phyla. Each phylum represents a basic type of animal. The bilaterians are divided into three major groups: the deuterostomes (purple), the lophotrochozoans (orange), and the ecdysozoans (blue). (Source: Adapted from Davidson E.H. 2001. Genomic regulatory systems, p. 22, f 1.6. Copyright © 2001, with permission from Elsevier.)
FIGURE 19-2 Phylogeny of assembled genomes. The figure shows the relationships among those animals whose genomes have been sequenced to date. The genomes of the organisms shown in the figure represent three phyla: nematode worm, arthropods (fruit fly and mosquito), and chordates (sea squirt [Ciona], pufferfish, mouse, human).
protostomes.) Chordates such as vertebrates are deuterostomes. The ecdysozoans include the two major model organisms for studies in genetics and developmental biology: the fruit fly, Drosophila melanogaster, and the nematode worm, Caenorhabditis elegans (Chapter 21). Whole-genome sequence information is now available for both ecdysozoans and deuterostomes. Unfortunately, there is very little molecular information available for any of the lophotrochozoans, which include two fascinating phyla, mollusks and annelids.

The systematic comparison of different animal genomes offers the promise of identifying the genetic basis for diversity. As of this writing, the genomes of seven different animals from three phyla (nematodes, arthropods, and chordates) have been sequenced and assembled (Figure 19-2). It is likely that genome assemblies will be available for species representing most of the remaining animal phyla in the next few years.
MOST ANIMALS HAVE ESSENTIALLY THE SAME GENES

Comparison of the currently available genomes reveals one particularly striking feature: different animals share essentially the same genes. Thus, the three known vertebrate genomes (pufferfish, mice, and humans) each contain about 30,000 genes. With very few exceptions, just about every human gene has a clear counterpart in the mouse genome. In other words, no new genes were "invented" during the 50 million years of evolutionary divergence that separate mice and humans from their last shared ancestor. Similarly, humans and pufferfish last shared a common ancestor over 400 million years ago. Yet the two genomes contain the same number of genes, and most of these genes (more than three quarters) can be unambiguously aligned.
FIGURE 19-3 Phylogenetic tree showing gene duplication of the fibroblast growth factor genes (FGF). Ciona FGFs are shown in orange, whereas vertebrate FGFs are in black lettering. Branchless is an FGF found in Drosophila. EGL-17 and let-756 are found in C. elegans. (Source: Adapted from Satou Y. et al. 2002. FGF genes in the basal chordate Ciona intestinalis. Dev. Genes Evol. 212: 437, fig. 3. Copyright © 2002 Springer Verlag.)
The genetic conservation seen among vertebrates extends to the humble sea squirt, Ciona intestinalis, which is an invertebrate chordate (see Chapter 18). It contains half the number of genes present in vertebrates and last shared a common ancestor with that group more than 500 million years ago. Nonetheless, nearly two-thirds of the protein coding genes in sea squirts contain a clear, recognizable counterpart in vertebrates. Moreover, the increase in gene number seen in vertebrates is primarily due to the duplication of genes already present in the sea squirt. For example, the sea squirt genome contains six different FGF (fibroblast growth factor) genes (Figure 19-3). There are at least 22 FGF genes in the mouse and human genomes: each gene in the sea squirt duplicated into an average of four copies in vertebrates.

The genetic conservation seen among chordates appears to extend to other phyla. The genomes of three different ecdysozoans (nematode worm, fruit fly, and mosquito) have been sequenced and assembled. They contain an average of 15,000 genes, similar to the number in sea squirts. As seen for the sea squirt, the increase in gene number in vertebrates is primarily due to the duplication of genes already present in the ecdysozoans rather than the invention of entirely new genes.

How Does Gene Duplication Give Rise to Biological Diversity?

The increase in gene number seen in vertebrates is largely due to gene duplication. But how does increasing the number of copies of certain genes lead to increased morphological diversity? There are two ways this can happen. First, the conventional view is that an ancestral gene produces multiple genes via duplication, and the coding regions of the new genes undergo mutation. This duplication process does not typically produce new genes that encode proteins of entirely new function. Rather, it creates genes encoding related proteins with slightly different activities.

The second way that duplicated genes can generate diversity has been rather neglected until very recently. According to this model, the duplicated genes do not necessarily take on new functions, but instead acquire new regulatory DNA sequences. This allows different copies of the gene to be expressed in different patterns within the developing organism. Consider the specific example of the FGF genes. The 22 FGF genes of vertebrates are expressed in a far broader spectrum of cell types than is the single gene present in Drosophila. Thus, while FGF is expressed in the developing respiratory organs of fruit flies and those of higher vertebrates as well, several of the "new" FGF genes are additionally expressed in the developing limbs of vertebrates, where flies do not exhibit a comparable pattern of expression. Another example is described in Box 19-1, Gene Duplication and the Importance of Regulatory Evolution.

Thus, we have two models for how duplicated genes can create diversity. According to one scenario, the function of the gene is modified through mutation of the coding sequence. According to the other scenario, the two genes are expressed in different patterns within
Box 19-1 Gene Duplication and the Importance of Regulatory Evolution

The regulatory proteins Gooseberry and Paired probably arose from an ancient gene duplication event. Each contains two distinct DNA-binding domains: a homeodomain and a paired domain (Box 19-1 Figure 1). The two proteins possess similar overall structures, but share only 25% amino acid sequence identity. In addition to substantial sequence divergence, the two genes exhibit totally distinct patterns of expression in the developing embryo. The paired gene is expressed in a series of seven stripes across the anterior-posterior axis of cellularizing embryos. In contrast, gooseberry is expressed in every segment and exhibits 14 stripes of expression in somewhat older embryos. Mutant embryos exhibit distinct phenotypes: paired mutants lack alternating segments, while gooseberry mutants contain patterning defects in every segment.

What is more important in the evolution of these distinct activities: changes in protein sequence or changes in gene expression? The following experiment provides a definitive answer. It is the changes in expression that produce the distinctive activities of Paired and Gooseberry. The Paired protein coding region was placed under the control of the gooseberry regulatory DNA. The resulting gooseberry-paired fusion gene was expressed in transgenic Drosophila embryos that lack the endogenous gooseberry gene (Box 19-1 Figure 1). Normally, gooseberry mutant embryos die and exhibit patterning defects in every body segment. However, the gooseberry-paired fusion gene completely rescues gooseberry mutants. Normal embryos are formed, and these go on to hatch and produce normal (but sterile) adult flies. This experiment demonstrates that the Paired protein, although quite distinct from Gooseberry, can fulfill most of the regulatory activities of Gooseberry when given the chance, that is, when expressed in every segment using gooseberry regulatory sequences.
BOX 19-1 FIGURE 1 Comparison of the Gooseberry and Paired proteins. The two diagrams summarize the structures of the genes encoding the Gooseberry (Gsb) and Paired (Prd) proteins. Both proteins contain paired box and homeobox domains, and the regions of the genes encoding these ...

BOX 19-1 FIGURE The Prd protein can rescue gsb mutants. (a) The gooseberry-paired fusion gene. The fusion gene contains about 6 kb of 5' flanking sequence from the gsb gene attached to the Prd protein coding region, thereby placing the Prd coding sequence under the control of the gsb 5' regulatory DNA. The gsb regulatory DNA contains two enhancers, GEE and GLE, that control the initiation and maintenance of expression in the ectoderm of developing embryos, respectively. (b) The gsb mutant that contains the gsb-prd transgene. The fusion gene completely rescues the mutant phenotype of gsb mutants, indicating that the Prd protein can fulfill Gsb function. Note that the embryo displays a completely normal pattern of denticles. (c) The gsb mutant that lacks the gsb-prd transgene. In the gsb mutant (without the transgene) the pattern of denticle hairs is abnormal and there is very little naked cuticle separating neighboring segments. (Source: Courtesy of Markus Noll; Li X. and Noll M. 1994. Nature 367: 83-87, Figure 3.)
the organism. In some cases both mechanisms operate. In Box 19-2, Duplication of Globin Genes Produces New Expression Patterns and Diverse Protein Functions, we describe the cluster of human globin genes. These arose by gene duplication, and, while the different protein products all bind oxygen as part of hemoglobin, they show subtly different affinities for their ligand. The different genes are expressed at different times during development as well.

The high degree of conservation of the genes found in different animals has recently focused attention on the role of changes in gene expression as a general mechanism in generating evolutionary diversity. The importance of this mechanism is highlighted by the striking changes in morphology caused by misexpressing genes in new places during the development of the fruit fly. In this chapter, we emphasize how evolutionary diversity can be generated by expressing a fixed set of genes in different patterns.
Box 19-2 Duplication of Globin Genes Produces New Expression Patterns and Diverse Protein Functions
Gene duplication events offer the opportunity to expand the repertoire of protein functions and expression profiles. Both forms of evolution are seen for the β-globin genes in mammals (see Chapter 17). Four related globins have arisen from gene duplication events in humans: ε, γ, δ, and β (Box 19-2 Figure 1). All four genes are linked within a common "complex." The four genes exhibit subtle changes in their expression profiles and protein structures. The ε- and γ-globins bind oxygen more tightly than do δ and β. They are used by the fetus, which lacks functioning lungs and must obtain oxygen by exchange from its mother's blood. The δ- and β-globins bind oxygen with lower affinity, and are used by newborns and adults, which contain higher levels of oxygen. In this example, the evolution of both the protein coding genes and the associated regulatory DNAs has led to the specialization of globin function.
BOX 19-2 FIGURE 1 Duplication of the β-globin gene family in the evolution of vertebrates. (Source: Adapted ... fig 26-15. Copyright © 2000 ...)
Box 19-3 Creation of New Genes Drives Bacterial Evolution

Simple bacteria appeared more than three billion years ago, while animals have been around for just over half a billion years. The rapid evolution of bacteria, along with their extended evolutionary history, has created different forms of metabolism so that they can live in highly diverse and extreme environments. Some live within thermal vents beneath the sea, while others live in sulfur hot springs on land. There is tremendous variation in both the number and types of genes present in different bacterial genomes. The simplest bacteria, such as mycoplasma, contain as few as 500 genes, while the most sophisticated bacteria, such as Streptomyces, encode over 7,000 genes. This huge range in gene number sharply contrasts with the modest, twofold variation seen among different animals.

The genetic content is also highly divergent among even closely related species of bacteria. For example, Staphylococcus and E. coli last shared a common ancestor about 50 million years ago, which is comparable to the time of divergence of mice and humans. Nonetheless, only approximately 75% of the protein coding genes are shared by the two bacteria. A stunning 25% of the genes are unique and have no clear counterpart in the other species.

In contrast, all animals inhabit similar, and far more temperate, environments. They employ similar metabolic pathways, but exhibit distinctive morphologies. As we will see in the course of this chapter, these diverse morphologies depend on changing the activities of a fixed set of genes rather than inventing new ones.
Before beginning that discussion, however, it is worth noting that evolution need not work by redeploying the same genes to generate diversity as seen for animals. For example, bacteria possess the most highly diverse genomes among all living organisms. They contain more than a tenfold range in the number of genes, and live in remarkably diverse environments (Box 19-3, Creation of New Genes Drives Bacterial Evolution).
THREE WAYS GENE EXPRESSION IS CHANGED DURING EVOLUTION

How do genes acquire new patterns of expression during evolution? Regulatory genes encode proteins that control the expression of other genes (see Chapters 16 and 17). Most often these proteins are transcription factors, but some influence other steps of gene expression instead. Of particular interest from the perspective of the current discussion is a class of regulatory genes called pattern determining genes. Changes in the activities and expression patterns of these during evolution seem to cause significant changes in animal morphology. The distinguishing characteristic of pattern determining genes is that they cause the correct structures to develop, but in the wrong place, when they are misexpressed during development. For example, we will see that the misexpression of the pattern determining gene Pax6 causes eyes to develop on the legs of fruit flies. We will consider several additional examples in this chapter.

The average animal genome encodes approximately 1,000 different regulatory genes. We do not have an accurate estimate of the number of regulatory genes that function as pattern determining genes, but it is just a subset of them. To accurately assess the number, it would be necessary to misexpress every regulatory gene in the wrong tissues during development to see which cause transformations in morphology. Our best guess is that something like 10% of all regulatory genes would fulfill the operational definition of a pattern determining gene. So, the typical animal genome might contain about 100 such genes. The major focus of this chapter is to describe how changes in the deployment or activities of these pattern determining genes produce diversity during evolution.

There are three major strategies for altering the activities of pattern determining genes (Figure 19-4).

1. A given pattern determining gene can itself be expressed in a new pattern. This, in turn, will cause those genes whose expression it controls (so-called target genes) to acquire new patterns of expression (Figure 19-4a).
2. The regulatory protein encoded by a pattern determining gene can acquire new functions; for example, a transcriptional activation domain can be converted into a repression domain. Thus, a regulatory protein that was an activator of a set of genes might now repress them (Figure 19-4b). Note that, although this strategy involves a change in protein function, the evolutionary consequence is a result of changes in expression pattern of target genes.
3. Target genes of a given p...
FIGURE 19-4 Summary of the three strategies for altering the roles of pattern determining genes. (a) Hypothetical mechanism for evolutionary change in two extinct trilobites. In Zacanthoides, repressor X is expressed in thoracic segments T1-T7. In Olenoides, repressor X is expressed in thoracic segments T1-T8. This suppresses the development of the axial spine, which arises from the T8 segment. (b) Proteins encoded by pattern determining genes acquire new functions through mutation; in the diagram, an activation domain attached to a DNA-binding domain is replaced by a repression domain. (c) Different target genes are regulated due to changes in enhancer ...
EXPERIMENTAL MANIPULATIONS THAT ALTER ANIMAL MORPHOLOGY

The first pattern determining gene was identified in Drosophila in the Morgan fly lab (see Chapter 1, Box 1-2 and Chapter 21). A mutation called bxd causes a partial transformation of halteres into wings. (As we shall see, normal fruit flies have a pair of wings and a pair of vestigial hindwings called halteres.) During the past 20 years, a variety of manipulations in Drosophila embryos and larvae have documented the importance of several pattern determining genes in development. Abnormal morphologies are obtained through each of the three mechanisms described above: altering the expression, function, and targets of pattern determining genes. We first describe how the morphology of the fruit fly can be altered by manipulating the activities of specific pattern determining genes. We then apply these strategies to the interpretation of the evolutionary diversification seen in different groups of arthropods.
Changes in Pax6 Expression Create Ectopic Eyes

The most notorious pattern determining gene is Pax6, which controls eye development in most or all animals. Changes in the expression pattern of the Pax6 gene are probably responsible for some of the morphological diversity seen among the eyes of different animals. Pax6 is normally expressed within developing eyes, but when misexpressed in the wrong tissues, Pax6 causes the development of extra eyes in those tissues (Figure 19-5). In particular, extra eyes form in the wings and legs of adult flies. Changes in the Pax6 expression pattern during evolution probably account for differences in the positioning of eyes in different animals. Most animals contain bilateral eyes that reside within the head capsule. But altered expression of Pax6 has been correlated with the formation of eye spots on the stalks of snails.

Evolutionary changes in the regulation of Pax6 expression have been more important for the creation of morphologically diverse eyes than have changes in Pax6 protein function. Thus, Pax6 genes from other animals also produce ectopic eyes when misexpressed in Drosophila. For
FIGURE 19-5 Misexpression of Pax6 (also called ey) and eye formation in Drosophila. Misexpression of the Pax6 gene results in the formation of eyes in inappropriate places. (a) Wild-type fly, showing the eye and leg imaginal disks whose cells give rise to the adult eye and adult leg, alongside a fly with the ey gene artificially expressed in leg precursor cells. (b) Abnormal leg with misplaced eye. The eyes and legs arise from imaginal disks in the larvae. (Source: (a) Adapted from Alberts B. et al. 2002. Molecular biology of the cell, 4th edition, p. 426, f 7-74, parts a & b. Copyright © 2002. Reproduced by permission of Routledge/Taylor & Francis Books, Inc. (b) Courtesy of Georg Halder.)
example, fruit flies were engineered to misexpress the squid Pax6 gene. Extra eyes were obtained in the wings and legs, similar to those obtained when the Drosophila Pax6 was misexpressed (see Figure 19-5). The fly and squid Pax6 proteins share only 30% overall amino acid sequence identity, yet they mediate similar activities in transgenic flies.
Changes in Antp Expression Transform Antennae into Legs

A second Drosophila pattern determining gene, Antp (Antennapedia), controls the development of the middle segment of the thorax, the mesothorax. The mesothorax produces a pair of legs that are morphologically distinct from the forelegs and hindlegs. Antp encodes a homeodomain regulatory protein that is normally expressed in the mesothorax of the developing embryo (Figure 19-6). The gene is not expressed, for example, in the developing head tissues. But a dominant Antp mutation, caused by a chromosome inversion, brings the Antp protein coding sequence under the control of a "foreign" regulatory DNA that mediates gene expression in head tissues, including the antennae (see Figure 19-6). When misexpressed in the head, Antp causes a striking change in morphology: legs develop instead of antennae.

Importance of Protein Function: Interconversion of ftz and Antp

Pattern determining genes need not be expressed in different places to produce changes in morphology. A second mechanism for evolutionary diversity is changes in the sequence and function of the regulatory proteins encoded by pattern determining genes, that is, the second strategy shown in Figure 19-4. Consider two related pattern determining genes in Drosophila, the segmentation gene ftz (fushi tarazu) and the homeotic gene Antp (Figure 19-7). These genes are linked and arose from an ancient duplication event that predated the divergence of crustaceans and insects more than 400 million years ago. The two encoded proteins are related and
FIGURE 19-6 A dominant mutation in the Antp gene results in the homeotic transformation of antennae into legs. The fly on the right is normal. Note the rudimentary set of antennae at the front end of the head. The fly on the left is heterozygous for a dominant Antp mutation (AntpD/+). It is fully viable and mainly normal in appearance except for the remarkable set of legs ... (Source: Courtesy of Matthew Scott.)
contain very similar DNA-binding domains (homeodomains). The Antp and Ftz proteins recognize distinct DNA-binding sites because they form heterodimers with different "partner" proteins. These protein-protein interactions are mediated by short peptide motifs that map outside the DNA-binding domain (see Chapter 17). Antp contains a tetrapeptide sequence motif, YPWM, which mediates interactions with a ubiquitous regulatory protein called Exd (Extradenticle). In contrast, Ftz contains a pentapeptide sequence, LRALL, which mediates interactions with a different ubiquitous regulatory protein, FtzF1 (see Figure 19-7). Ftz-FtzF1 dimers recognize DNA sequences that are distinct from those bound by Antp-Exd dimers. As a result, Antp and Ftz regulate different target genes.

In this example, after the gene duplication event that produced Antp and ftz, the two encoded proteins acquired distinct regulatory activities through sequence divergence. Interestingly, the Ftz protein in more primitive insects, such as the flour beetle Tribolium castaneum, contains both the LRALL and YPWM motifs. Thus, it would appear that the Tribolium Ftz protein has hybrid properties and can function as both a segmentation gene and homeotic gene. Indeed, when misexpressed in Drosophila embryos, the Tribolium Ftz protein causes both segmentation defects and homeotic transformations.
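The motif-to-partner logic described here can be sketched in a few lines of Python. This is only an illustration: the two motifs and their partners come from the text, but the short protein sequences are invented placeholders (not real Antp or Ftz sequences), and the function name `partners` is hypothetical.

```python
# Motif-to-partner table taken from the text above.
PARTNER_BY_MOTIF = {
    "YPWM": "Exd",     # tetrapeptide motif described for Antp
    "LRALL": "FtzF1",  # pentapeptide motif described for Ftz
}


def partners(protein_sequence: str) -> list:
    """Return the partner proteins whose interaction motif occurs in the sequence."""
    return sorted(
        partner
        for motif, partner in PARTNER_BY_MOTIF.items()
        if motif in protein_sequence
    )


# Invented toy sequences: only the embedded motifs reflect the text.
toy_proteins = {
    "Drosophila Antp": "MSSAAYPWMRSQFGK",
    "Drosophila Ftz": "MATTLRALLTNPVKK",
    "Tribolium Ftz": "MATTLRALLGGSYPWMKK",
}

for name, sequence in toy_proteins.items():
    print(name, "->", partners(sequence))
```

In this toy setting the Tribolium-like sequence is matched to both partners, which mirrors the "hybrid" behavior the text attributes to the Tribolium Ftz protein.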
Subtle Changes to an Enhancer Sequence Can Produce New Patterns of Gene Expression

The third mechanism for evolutionary diversity (Figure 19-4) is changes in the target enhancers that are regulated by pattern determining genes. In this case neither the expression pattern nor the function of the encoded regulatory protein is altered. This mechanism is nicely illustrated by the Dorsal regulatory gradient in the early fly embryo. In Chapter 18, we saw how the binding affinities of Dorsal recognition sequences produce distinct patterns of gene expression. Target enhancers that contain low-affinity Dorsal binding sites are expressed in the mesoderm, where there are high levels of the Dorsal gradient. In contrast, enhancers with high-affinity sites are expressed in the neurogenic ectoderm, where there are intermediate and low levels of the gradient.

The principle that changes in enhancers can rapidly evolve new patterns of gene expression stems from the experimental manipulation of a 200 bp tissue specific enhancer that is activated only in the mesoderm. The enhancer contains two low-affinity Dorsal binding sites and is activated by high levels of the Dorsal gradient in ventral regions (the future mesoderm). Single nucleotide substitutions that convert each site into an optimal Dorsal binding site cause the modified enhancer to be activated in a broader pattern (Figure 19-8a and b). Dorsal functions synergistically with another transcription factor, Twist, to activate gene expression in the neurogenic ectoderm. There are no Twist binding sites in the native enhancer. However, a total of eight nucleotide substitutions are sufficient to create two Twist binding sites (CACATG). When combined with the two nucleotide substitutions that produce high-affinity Dorsal binding sites, the modified enhancer now directs a broad pattern of gene expression in both the mesoderm and neurogenic ectoderm (Figure 19-8c). A few additional nucleotide
FIGURE 19-7 Duplication of ancestral gene leading to Antp and ftz. An ancestral Hox gene underwent a duplication event to produce the modern ftz and Antp genes. The encoded proteins contain similar homeodomains, but have acquired distinct protein-protein interaction motifs. Ftz (left pathway) contains LRALL, which permits it to interact with FtzF1, while Antp (right pathway) contains YPWM and interacts with Exd. Ftz-FtzF1 and Antp-Exd dimers recognize distinct binding sites and therefore regulate different target genes.
FIGURE 19-8 Regulation of transgene expression in the early Drosophila embryo. The figure shows a series of cross sections of early Drosophila embryos that express different lacZ transgenes. (a) Expression of lacZ controlled by an enhancer with two low-affinity Dorsal binding sites. (b) Expression of lacZ controlled by a modified enhancer with two high-affinity Dorsal binding sites. (c) Expression of lacZ controlled by a modified enhancer containing two high-affinity Dorsal binding sites and two Twist sites. (d) Expression of lacZ controlled by a modified enhancer containing two high-affinity Dorsal binding sites, two Twist sites, and two Snail repressor sites; expression is blocked by Snail in the mesoderm.
changes create binding sites for a zinc finger repressor, Snail. The Snail repressor is expressed only in the mesoderm. A modified enhancer, containing optimal Dorsal sites, Twist activator sites, and Snail repressor sites, is expressed only in the neurogenic ectoderm where there are low levels of the Dorsal gradient (see Figure 19-8d). Altogether, a series of 2, 10, and 14 nucleotide substitutions produce a spectrum of Dorsal target enhancers which direct expression in the mesoderm, the mesoderm and neurogenic ectoderm, or just in the neurogenic ectoderm. These observations suggest that enhancers can evolve quickly to create new patterns of gene expression.
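The way each set of substitutions shifts the expression domain can be mimicked with a small threshold model in Python. It is a rough sketch under stated assumptions, not a description of the actual experiment: the Dorsal levels, the activation thresholds, the split of the neurogenic ectoderm into ventral and dorsal parts, and the `Enhancer` class are all invented for illustration. Only the qualitative rules (high-affinity sites respond to less Dorsal, Twist sites broaden activation, Snail sites block expression in the mesoderm) are taken from the text.

```python
from dataclasses import dataclass

# Assumed Dorsal levels (arbitrary units) in four regions, ventral to dorsal.
DORSAL_LEVEL = {
    "mesoderm": 1.0,
    "ventral neurogenic ectoderm": 0.5,
    "dorsal neurogenic ectoderm": 0.25,
    "dorsal ectoderm": 0.0,
}
# From the text: the Snail repressor is expressed only in the mesoderm.
SNAIL_REGIONS = {"mesoderm"}


@dataclass(frozen=True)
class Enhancer:
    name: str
    high_affinity_dorsal: bool = False
    twist_sites: bool = False
    snail_sites: bool = False

    def dorsal_threshold(self) -> float:
        """Minimum Dorsal level needed for activation (assumed values)."""
        if not self.high_affinity_dorsal:
            return 0.9  # low-affinity sites need peak Dorsal levels
        # Simplification: Twist synergy is modeled as a lower threshold.
        return 0.2 if self.twist_sites else 0.4

    def expression_domain(self) -> list:
        """Regions where the enhancer is active in this toy model."""
        active = []
        for region, level in DORSAL_LEVEL.items():
            if level < self.dorsal_threshold():
                continue  # not enough Dorsal to activate
            if self.snail_sites and region in SNAIL_REGIONS:
                continue  # Snail represses the enhancer here
            active.append(region)
        return active


native = Enhancer("native enhancer")
two_subs = Enhancer("2 substitutions", high_affinity_dorsal=True)
ten_subs = Enhancer("10 substitutions", high_affinity_dorsal=True, twist_sites=True)
fourteen_subs = Enhancer(
    "14 substitutions", high_affinity_dorsal=True, twist_sites=True, snail_sites=True
)

for enhancer in (native, two_subs, ten_subs, fourteen_subs):
    print(enhancer.name, "->", enhancer.expression_domain())
```

With these assumed numbers the native enhancer is confined to the mesoderm, the versions with 2 and 10 substitutions spread progressively into the neurogenic ectoderm, and the version with 14 substitutions is excluded from the mesoderm, matching the qualitative series in the text.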
The Misexpression of Ubx Changes the Morphology of the Fruit Fly

The analysis of a Drosophila pattern determining gene called Ubx illustrates all three principles of evolutionary change: new patterns of gene expression are produced by changing the Ubx expression pattern, the encoded regulatory protein, or its target enhancers. Ubx
FIGURE 19-9 Ubx mutants cause the transformation of the metathorax into a duplicated mesothorax. (a) A normal fly is shown that contains a pair of prominent wings and a smaller set of halteres just behind the wings. (b) A mutant that is homozygous for a weak mutation in the Ubx gene is shown. The metathorax is transformed into a duplicated mesothorax. As a result the fly has two pairs of wings ...
encodes a homeodomain regulatory protein that controls the development of the third thoracic segment, the metathorax. Ubx specifically represses the expression of genes that are required for the development of the second thoracic segment, or mesothorax. Indeed, Antp is one of the genes that it regulates: Ubx represses Antp expression in the metathorax and restricts its expression to the mesothorax of developing embryos. Mutants that lack the Ubx repressor exhibit an abnormal pattern of Antp expression. The gene is not only expressed within its normal site of action in the developing mesothorax, but it is also misexpressed in the developing metathorax. This misexpression of Antp causes a transformation of the metathorax into a duplicated mesothorax (Figure 19-9).

In adult flies, the mesothorax contains a pair of legs and wings, while the metathorax contains a pair of legs and halteres (see Figure 19-9). The halteres are considerably smaller than the wings and function as balancing structures during flight. Ubx mutants exhibit a spectacular phenotype: they have four fully developed wings, due to the transformation of the halteres into wings. This mutant phenotype stems, in part, from the misexpression of Antp. Later, we will look more closely at how Ubx specifies halteres through the repression of several target genes required for the development of wings.

The expression of Ubx in the different tissues of the metathorax depends on regulatory sequences that encompass more than 80 kb of genomic DNA. A mutation called Cbx (Contrabithorax) disrupts this Ubx regulatory DNA without changing the Ubx protein coding region. The Cbx mutation causes Ubx to be misexpressed in the mesothorax, in
FIGURE 19-10 Misexpression of Ubx in the mesothorax results in the loss of wings. The Cbx mutation disrupts the regulatory region of Ubx, causing its misexpression in the mesothorax and resulting in its transformation into the metathorax. The panels compare a wild-type fly (wing on T2, haltere on T3) with a wingless Cbx mutant.
addition to its normal site of expression in the metathorax (Figure 19-10). Ubx now represses the expression of Antp, as well as the other genes needed for the normal development of the mesothorax. As a result, the mesothorax is transformed into a duplicated copy of the normal metathorax. This is a striking phenotype: the wings are transformed into halteres, and the resulting Cbx mutant flies look like wingless ants. This example clearly illustrates the consequences of misexpressing a pattern determining gene: a dramatic change in morphology results. We will see how this mechanism is used to convert swimming limbs into feeding appendages in certain shrimp.
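The regulatory logic behind the Ubx and Cbx phenotypes can be summarized as a small Boolean sketch in Python. It assumes, for illustration only, that Antp is active in both thoracic segments T2 and T3 unless Ubx represses it, and that a segment's identity follows directly from which of the two genes is on. The function name `thorax_identities` and the string labels are hypothetical.

```python
SEGMENTS = ("T2", "T3")  # mesothorax and metathorax positions


def thorax_identities(ubx_domain: set) -> dict:
    """Map each segment to the identity it adopts, given where Ubx is expressed."""
    identities = {}
    for segment in SEGMENTS:
        ubx_on = segment in ubx_domain
        # Assumption: Ubx represses Antp, and Antp is otherwise on in T2 and T3.
        antp_on = not ubx_on
        identities[segment] = (
            "mesothorax (wings)" if antp_on else "metathorax (halteres)"
        )
    return identities


genotypes = {
    "wild type": {"T3"},         # Ubx confined to the metathorax
    "Ubx mutant": set(),         # no Ubx repressor anywhere
    "Cbx mutant": {"T2", "T3"},  # Ubx misexpressed in the mesothorax as well
}

for name, ubx_domain in genotypes.items():
    print(name, "->", thorax_identities(ubx_domain))
```

Under these assumptions the Ubx mutant yields two wing-bearing segments (four wings) and the Cbx mutant yields two haltere-bearing segments (no wings), as described in the text.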
Changes in Ubx Function Modify the Morphology of Fruit Fly Embryos

We have seen that the Ubx protein can function as a transcriptional repressor that precludes the expression of Antp and other "mesothorax" genes in the developing metathorax. The conversion of Ubx into a transcriptional activator causes it to function like Antp and promote the development of the mesothorax. This example illustrates how changes in the function of a pattern determining regulatory protein can alter morphology.

It is not currently known how Ubx functions as a repressor. However, the Ubx protein contains specific peptide sequences that recruit repression complexes. One such peptide is composed of a stretch of alanine residues. Alanine-rich repression domains are seen in other pattern determining regulatory proteins, such as Eve, which we discussed in Chapter 18.

Transgenic fly embryos have been created that contain either the Antp or Ubx protein coding sequence under the control of the hsp70 heat shock cis-regulatory DNA. When these embryos are placed at elevated temperatures, there is ubiquitous expression of either Antp or Ubx in most, or all, tissues. The misexpression of Antp causes all of the head and thoracic segments of the embryo to develop as duplicated mesothoracic segments. These embryos are dead, but different segments can be identified by the pattern of fine hairs, or denticles, on the surface of the embryo. In the case of misexpressing Antp, all of the thoracic segments contain denticle patterns that look like the one normally present only on the mesothorax. In contrast, the misexpression of Ubx causes all three thoracic segments to develop denticle patterns typical of the normal metathorax (Figure 19-11).

Ubx normally functions as a repressor. It can be converted into an activator by fusing the Ubx DNA-binding domain (homeodomain) to the potent activation domain from the viral VP16 protein, which we encountered in Chapter 17. The protein sequences that mediate transcriptional repression map outside the Ubx homeodomain and are not present in the Ubx-VP16 fusion protein. The misexpression of the Ubx-VP16 fusion protein causes all of the segments to develop as mesothoracic segments, not metathoracic segments as seen when the normal Ubx protein is misexpressed in engineered embryos. Thus, rather than behaving like the normal Ubx protein, the Ubx-VP16 fusion protein produces the same phenotype as that obtained with Antp (see Figure 19-11).
FIGURE 19-11 Changing the regulatory activities of the Ubx protein. The panels show the anterior segments of advanced-stage embryos. (a) Normal embryo. Note how the denticle hairs become narrower from A1 (the first abdominal segment) to more anterior regions (T3, T2, and so forth). (b) The misexpression of Ubx causes the anterior denticle hairs to become thicker, as seen for the normal A1 segment. The T1, T2, and T3 segments now look like duplicated copies of A1. (c) The misexpression of a Ubx-VP16 fusion protein causes anterior segments (T1 and some of the head segments) to look like T2 or T3 segments. This is different from the A1 duplications obtained with the normal Ubx protein. In fact, the transformations obtained with Ubx-VP16 are similar to those seen upon misexpression of the normal Antp protein (d). (Source: Reproduced from Li X. and McGinnis W. 1999. Activity regulation of Hox proteins, a mechanism for altering functional specificity in development and evolution. Proc. Natl. Acad. Sci. 96: 6802-6807, fig 1, parts a, b, and c, p. 6804. Image courtesy of William McGinnis.)
Changes in Ubx Target Enhancers Can Alter Patterns of Gene Expression

The Ubx protein contains a homeodomain that mediates sequence-specific DNA binding. Ubx also contains a tetrapeptide motif (YPWM) that mediates interactions with Exd. We have already encountered this motif in our discussion of the evolutionary divergence of Antp and Ftz. Antp also contains the YPWM motif and binds DNA as an Antp-Exd dimer. Similarly, Ubx binds DNA as a Ubx-Exd dimer. Many homeotic regulatory proteins interact with Exd and bind a composite Exd-Hox recognition sequence. Exd binds to a half-site with the core sequence, TGAT, whereas Hox proteins such as Ubx bind an adjacent half-site with a different core consensus sequence,

Box 19-4 The Homeotic Genes of Drosophila Are Organized in Special Chromosome Clusters

Antp and Ubx represent only two of the eight homeotic genes in the Drosophila genome. The eight homeotic genes of Drosophila are located in two clusters, or gene complexes. Five of the eight genes are located within the Antennapedia complex, while the remaining three genes are located within the Bithorax complex (Box 19-4 Figure 1). Do not confuse the names of the complex with the individual genes within the complex. For example, the Antennapedia complex is named in honor of the Antennapedia gene (Antp), which was the first homeotic gene identified within the complex. There are four other homeotic genes in the Antennapedia complex: labial (lab), proboscipedia (pb), Deformed (Dfd), and Sex combs reduced (Scr). Similarly, the Bithorax complex is named in honor of the Ultrabithorax gene (Ubx), but there are two others in this complex: abdominal-A (abd-A) and Abdominal-B (Abd-B). Another insect, the flour beetle, contains a single complex of homeotic genes that includes homologs of all eight homeotic genes contained in the Drosophila Antennapedia and Bithorax complexes. The two complexes probably arose from a chromosomal rearrangement within a single ancestral complex.

There is a colinear correspondence between the order of the homeotic genes along the chromosome and their patterns of expression across the anterior-posterior axis in developing embryos (see Box 19-4 Figure 1). For example, the lab gene, located in the 3'-most position of the Antennapedia complex, is expressed in the anteriormost head regions of the developing Drosophila
Box 19-4 FIGURE 1 Organization and expression of Hox genes in Drosophila and in the mouse. The figure compares the colinear sequences and transcription patterns of the Hox genes in Drosophila and in the mouse. Labeled regions: Drosophila embryo; mouse embryo (hindbrain, midbrain, spinal cord, cervical). (Source: Adapted from McGinnis W. and Krumlauf R. 1992. Homeobox genes and axial patterning. Cell 68: 285, fig 2.)
embryo. In contrast, the Abd-B gene, which is located in the 5'-most position of the Bithorax complex, is expressed in the posteriormost regions (see Box 19-4 Figure 1). The significance of this colinearity has not been established, but it must be important because it is preserved in each of the major groups of arthropods (including flour beetles), as well as in all vertebrates that have been studied, including mice and humans.

Mammalian Hox Gene Complexes Control Anterior-Posterior Patterning

Mice contain 38 Hox genes arranged within four clusters (Hox a, b, c, d). Each cluster or complex contains nine or ten Hox genes and corresponds to the single homeotic gene cluster in insects that formed the Antennapedia and Bithorax complexes in Drosophila (Box 19-4 Figure 2). For example, the Hoxa-1 and Hoxb-1 genes are most closely related to the lab gene in Drosophila, while Hoxa-9 and Hoxb-9, located at the other end of their respective complexes, are similar to the Abd-B gene. In addition to this "serial" homology between mouse and fly Hox genes, each mouse Hox complex exhibits the same type of colinearity as that seen in Drosophila. For example, Hox genes located at the 3' end of each complex, such as Hoxa-1 and Hoxb-1, are expressed in the anteriormost regions of developing mouse embryos (future hindbrain). In contrast, Hox genes located near the 5' end of each complex, such as Hoxa-9 and Hoxb-9, are expressed in posterior regions of the embryo (thoracic and lumbar regions of the developing spinal cord). The Hoxd complex exhibits sequential expression across the anterior-posterior axis of the developing limbs. A comparable pattern is not observed in insect limbs, suggesting that the Hoxd genes have acquired novel
Box 19-4 FIGURE 2 Conservation of organization and expression of the homeotic gene complexes in Drosophila and in the mouse. Labeled clusters: Drosophila Hom-C; mouse Hoxa, Hoxb, Hoxc, and Hoxd, aligned by paralogous subgroup. (Source: Adapted from Gilbert S. F. 2000. Developmental biology, 6th edition, fig 11.36, part a. Copyright © 2000 Sinauer Associates. Used with permission.)
regulatory DNAs during vertebrate evolution. Indeed, we have already seen in Chapter 17 that a specialized "global control region" (GCR) coordinates the expression of the individual Hoxd genes in developing limbs.

Altered Patterns of Hox Expression Create Morphological Diversity in Vertebrates

Mutations in mammalian Hox genes cause disruptions in the axial skeleton, which consists of the spinal cord and the different vertebrae of the backbone. These alterations are evocative of some of the changes in morphology we have seen for the Antp and Ubx mutants in Drosophila. Consider the Hoxc-8 gene in mice, which is most closely related to the abd-A gene of the Drosophila Bithorax complex. It is normally expressed near the boundary between the developing rib cage and lumbar region of the backbone, the anterior "tail" (Box 19-4 Figure 3). (The abd-A gene is expressed in the anterior abdomen of the Drosophila embryo.) The first lumbar vertebra normally lacks ribs. However, mutant embryos that are homozygous for a knockout mutation in the Hoxc-8 gene exhibit a dramatic phenotype: the first lumbar vertebra develops an extra pair of vestigial ribs (see Box 19-4 Figure 3). This type of developmental abnormality is sometimes called a "homeotic" transformation, one in which the proper structure develops in the wrong place. In this case a vertebra that is typical of the posterior thoracic region develops within the anterior lumbar region.
Box 19-4 FIGURE 3 Partial transformation of the first lumbar vertebra in a mutant mouse embryo. The figure shows a close-up view of the thoracic-lumbar region of a mutant mouse embryo that lacks Hoxc-8 gene activity. The mutant (shown on the right) contains a vestigial pair of ribs on the L1 vertebra. Normal mice contain ribs only on thoracic vertebrae. (Source: Adapted from Gilbert S. F. 2000. Developmental biology, 6th edition, fig 11.38, p. 368. Copyright © 2000 Sinauer Associates. Used with permission.)
FIGURE 19-12 Interconversion of Labial and Ubx binding sites. Most Hox proteins contain a variant of the YPWM motif [...] two nucleotides (NN). The exact sequence of these residues strongly influences specificity. For example, Exd-Ubx dimers preferentially bind to composite sites with a TT core, while Exd-Lab binds sites with a GG core. (Lab is a Hox protein that controls the development of anterior head structures.) Panel labels: (a), (b), (c); Hox binding site.
A-T-T/G-A/G (Figure 19-12). The two half-sites are often separated by two nucleotides that are important for determining which Exd-Hox dimer can bind. For example, Exd-Ubx dimers prefer recognition sequences that contain T-T in the central position. In contrast, Exd-Labial dimers prefer G-G central residues. (Labial is encoded by the 3'-most Hox gene in the Antennapedia complex.) This observation raises the possibility that a target enhancer regulated by one Hox protein can rapidly evolve into a target enhancer for a different Hox protein. We will see how this principle might explain the different wing morphologies seen in fruit flies and butterflies.

These results suggest that altering the function or expression of the Ubx protein or its target enhancers profoundly changes patterning in the Drosophila embryo. It is easy to imagine that similar changes in protein function and expression have occurred during evolution and are responsible for making related animals morphologically distinct (see Box 19-4, The Homeotic Genes of Drosophila Are Organized in Special Chromosome Clusters).
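The composite Exd-Hox site described above can be sketched as a simple pattern search. The following Python fragment is only an illustration: it assumes the Exd half-site (TGAT) is followed directly by the two central nucleotides and then by the Hox half-site (A-T-T/G-A/G), and the example sequences and the function name `classify_sites` are hypothetical, not taken from any real enhancer or library.

```python
import re

# Assumed layout: Exd half-site, two central nucleotides, Hox half-site.
COMPOSITE_SITE = re.compile(r"TGAT([ACGT]{2})AT[TG][AG]")

# Central dinucleotide preferences stated in the text.
CENTRAL_PREFERENCE = {"TT": "Exd-Ubx", "GG": "Exd-Labial"}


def classify_sites(sequence):
    """Return (position, site, preferred dimer) for each composite site found."""
    hits = []
    for match in COMPOSITE_SITE.finditer(sequence.upper()):
        central = match.group(1)
        dimer = CENTRAL_PREFERENCE.get(central, "unassigned")
        hits.append((match.start(), match.group(0), dimer))
    return hits


# Hypothetical sequences that differ only in the two central nucleotides.
ubx_like = "CCTGATTTATTACC"
lab_like = "CCTGATGGATTACC"

print(classify_sites(ubx_like))
print(classify_sites(lab_like))
```

Because the two example sequences differ at only two positions, the sketch also shows why a target enhancer for one Hox protein could, in principle, be converted into a target for another by very few nucleotide substitutions.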
MORPHOLOGICAL CHANGES IN CRUSTACEANS AND INSECTS

Thus far we have discussed how changes in pattern determining genes alter morphology in fruit flies. We now discuss how the three strategies for altering the activities of pattern determining genes can explain examples of natural morphological diversity found among different arthropods. The first two mechanisms, changes in the expression and function of pattern determining genes, can account for changes in limb morphology seen in certain crustaceans and insects. The third mechanism, changes in regulatory sequences, might provide an explanation for the different patterns of wing development in fruit flies and butterflies.
Arthropods Are Remarkably Diverse

Arthropods embrace five groups: trilobites (sadly extinct), hexapods (such as insects), crustaceans (shrimp, lobsters, crabs, and so on), myriapods (centipedes and millipedes), and chelicerates (horseshoe crabs, spiders, and scorpions). The success of the arthropods derives, in part, from their modular architecture. These organisms are composed of a series of repeating body segments that can be modified in seemingly limitless ways. Some segments carry wings, whereas others have antennae, legs, jaws, or specialized mating devices. We know more about the evolutionary processes responsible for the diversification of arthropods than for any other group of animals.
Changes in Ubx Expression Explain Modifications in Limbs among the Crustaceans

Crustaceans include most, but not all, of the arthropods that swim. Some live in the ocean, while others prefer fresh water. They include some of our favorite culinary dishes, such as shrimp, crab, and lobster. One of the most popular groups of crustaceans for study is Artemia, also known as "sea monkeys." Their embryos arrest as tough spores that can be purchased at toy stores. The spores quickly resume development upon addition of salt water. The heads of these shrimp contain feeding appendages. The thoracic segment nearest the head, T1, contains swimming appendages that look
like those further back on the thorax (the second through eleventh thoracic segments, T2-T11). Artemia belongs to an order of crustaceans known as branchiopods. Consider a different order of crustaceans, called isopods. Isopods contain swimming limbs on the second through eighth thoracic segments, just like the branchiopods. But the limbs on the first thoracic segment of isopods have been modified. They are smaller than the others and function as feeding limbs (Figure 19-13). These modified limbs are called maxillipeds (otherwise known as jaw feet), and look like appendages found on the head (though these are not shown in the figure).

Slightly different patterns of Ubx expression are observed in branchiopods and isopods. These different expression patterns are correlated with the modification of the swimming limbs on the first thoracic segment of isopods. Perhaps the last shared ancestor of the present branchiopods and isopods contained the arrangement of thoracic limbs seen in Artemia (which is itself a branchiopod): all thoracic segments contain swimming limbs. During the divergence of branchiopods and isopods, the Ubx regulatory sequences changed in isopods. As a result of this change, Ubx expression was eliminated in the first thoracic segment and restricted to segments T2-T8. It is easy to imagine that Ubx represses one or more "head" patterning genes in the thorax. In Artemia, these head genes are kept off in all 11 thoracic segments, but in isopods the head genes can be expressed in the T1 segment due to the loss of the Ubx repressor. Indeed, expression of the Scr gene is restricted to head regions in branchiopods but extends into T1 in isopods. The expression of Scr in T1 causes maxillipeds to develop in place of normal swimming limbs (see Figure 19-13).

What is the basis for the different patterns of Ubx expression in isopods and branchiopods? There are several possible explanations, but the most likely one is that the Ubx regulatory DNA of isopods acquired mutations. By this model, the Ubx enhancer no longer mediates expression in the first thoracic segment. In fact, there is a tight correlation between the absence of Ubx expression in the thorax and the development of feeding appendages in different crustaceans. For example, lobster embryos lack Ubx expression in the first two thoracic segments and contain two pairs of maxillipeds. Cleaner shrimp lack Ubx expression in the first three thoracic segments and contain three pairs of maxillipeds.
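The correlation described above, in which each anterior thoracic segment lacking Ubx carries a maxilliped, can be written as a small rule. This is a minimal sketch under stated assumptions: the anterior Ubx boundaries are the ones given in the text, but the rule itself, the function names, and the choice to display only the first four thoracic segments are illustrative simplifications rather than a description of any published model.

```python
# First thoracic segment that expresses Ubx, as described in the text.
UBX_ANTERIOR_BOUNDARY = {
    "Artemia (branchiopod)": 1,  # Ubx in every thoracic segment
    "isopod": 2,                 # Ubx absent from T1
    "lobster": 3,                # Ubx absent from T1-T2
    "cleaner shrimp": 4,         # Ubx absent from T1-T3
}


def maxilliped_pairs(first_ubx_segment):
    """Segments anterior to the Ubx boundary are assumed to form maxillipeds."""
    return first_ubx_segment - 1


def thoracic_appendages(first_ubx_segment, segments_shown=4):
    """Label the first few thoracic segments (segments_shown is arbitrary)."""
    return [
        "maxilliped" if segment < first_ubx_segment else "locomotory limb"
        for segment in range(1, segments_shown + 1)
    ]


for animal, boundary in UBX_ANTERIOR_BOUNDARY.items():
    print(animal, maxilliped_pairs(boundary), thoracic_appendages(boundary))
```

Shifting a single number, the anterior boundary of Ubx expression, is enough to reproduce the limb arrangements of all four animals, which is the sense in which a change in Ubx regulatory DNA could account for the observed diversity.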
FIGURE 19-13 Changing morphologies in two different groups of crustaceans. In branchiopods Scr expression is restricted to head regions where it helps promote the development of feeding appendages, while Ubx is expressed in the thorax where it controls the development of swimming limbs. In isopods, Scr expression is detected in both the head and the first thoracic segment (T1), and as a result, the swimming limb in T1 is transformed into a feeding appendage (the maxilliped). This posterior expansion of Scr was made possible by the loss of Ubx expression in T1, since Ubx normally represses Scr expression. (Source: Adapted [...])

Why Insects Lack Abdominal Limbs

All insects have six legs, two on each of the three thoracic segments; this applies to every one of the more than one million species of insects. In contrast, other arthropods, such as crustaceans, have a variable number of limbs. Some crustaceans have limbs on every segment in both the thorax and abdomen. This evolutionary change in morphology, the loss of limbs on the abdomen of insects, is not due to altered expression of pattern determining genes, as seen in the case of maxilliped formation in isopods. Rather, the loss of abdominal limbs in insects is due to functional changes in the Ubx regulatory protein.

In insects, Ubx and abd-A repress the expression of a critical gene that is required for the development of limbs, called Distalless (Dll). In developing Drosophila embryos, Ubx is expressed at high levels in the metathorax and anterior abdominal segments; abd-A expression extends into more posterior abdominal segments. Together, Ubx and abd-A keep Dll off in the first seven abdominal segments. Although Ubx is expressed in the metathorax, it does not interfere with the expression of Dll in that segment because Ubx is not expressed in the developing T3 legs until after the time when Dll is activated. As a result, Ubx does not interfere with limb development in T3.

In crustaceans, such as the branchiopod Artemia already mentioned, there are high levels of both Ubx and Dll in all 11 thoracic segments (Figure 19-14). The expression of Dll promotes the development of swimming limbs. Why does Ubx repress Dll expression in the abdominal segments of insects, but not crustaceans? The answer is that the Ubx protein has diverged between insects and crustaceans. This was demonstrated in the following experiment. The misexpression of the fly Ubx protein throughout all of the tissues of the presumptive thorax in transgenic Drosophila embryos suppresses limb development due to the repression of Dll. In contrast, the misexpression of the crustacean Ubx protein in transgenic flies does not interfere with Dll gene expression and the formation of thoracic limbs. These observations indicate that the Drosophila Ubx protein is functionally distinct from Ubx in crustaceans. The fly protein represses Dll gene expression, whereas the crustacean Ubx protein does not.

What is the basis for this functional difference between the two Ubx proteins? (They share only 32% overall amino acid identity, but their homeodomains are virtually identical, with 59 of 60 residues matching.) It turns out that the crustacean protein has a short motif containing 29 amino acid residues that blocks repression activity. When this sequence is deleted, the crustacean Ubx protein is just as effective as the fly protein at repressing Dll gene expression (Figure 19-15). Both the crustacean and fly Ubx proteins contain multiple repression domains. As discussed in Chapter 17, it is likely that these domains interact with one or more transcriptional repression complexes. The "antirepression" peptide present in the crustacean Ubx protein might interfere with the ability of the repression domains to recruit these complexes. When this peptide is attached to the fly protein, the hybrid protein behaves like the crustacean Ubx protein and no longer represses Dll (see Box 19-5, Co-option of Gene Networks for Evolutionary Innovation).
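The deletion and swap experiments described above follow a simple logic that can be made explicit. The sketch below is a toy model, not a biochemical one: it assumes that a Ubx protein represses Dll whenever it has repression domains and lacks the antirepression peptide, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UbxProtein:
    source: str
    has_repression_domains: bool = True
    has_antirepression_peptide: bool = False

    def represses_dll(self) -> bool:
        # Assumed rule: the peptide masks the repression domains.
        return self.has_repression_domains and not self.has_antirepression_peptide


fly = UbxProtein("Drosophila")
crustacean = UbxProtein("Artemia", has_antirepression_peptide=True)

# Crustacean protein with the 29-residue motif deleted.
crustacean_deleted = replace(crustacean, has_antirepression_peptide=False)
# Fly protein with the crustacean peptide attached.
fly_hybrid = replace(fly, has_antirepression_peptide=True)

for label, protein in [
    ("fly Ubx", fly),
    ("crustacean Ubx", crustacean),
    ("crustacean Ubx, peptide deleted", crustacean_deleted),
    ("fly Ubx, peptide attached", fly_hybrid),
]:
    print(label, "represses Dll:", protein.represses_dll())
```

The four constructs give the same pattern of outcomes reported in the text: only the proteins without the antirepression peptide repress Dll.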
Modification of Flight Limbs Might Arise from the Evolution of Regulatory DNA Sequences

Ubx has dominated our discussion of morphological change in arthropods. Changes in the Ubx expression pattern appear to be responsible for the transformation of swimming limbs into
FIGURE 19-14 Evolutionary changes in Ubx protein function. (a) The Dll enhancer (Dll304) is normally activated in three pairs of "spots" in Drosophila embryos. These spots go on to form the three pairs of legs in the adult fly. (b) The misexpression of the Drosophila Ubx protein (DmUbxHA) strongly suppresses expression from the Dll enhancer. (c) In contrast, the misexpression of the Ubx protein from the brine shrimp Artemia (AfUbxHA) causes only a slight suppression of the Dll enhancer. (Source: Adapted from Ronshaugen M. et al. 2002. Hox protein mutation and macroevolution of the insect body plan. Nature 415: 914-917, fig 2, part c, p. 915. Copyright © 2002 Nature Publishing Group. Used with permission. Images courtesy of William McGinnis and Matt Ronshaugen.)
maxillipeds in crustaceans. Moreover, the loss of the antirepression motif in the Ubx protein likely accounts for the suppression of abdominal limbs in insects. In this final section on that theme, we review evidence that changes in the regulatory sequences of Ubx target genes might explain the different wing morphologies found in fruit flies and butterflies.

In Drosophila, Ubx is expressed in the developing halteres, where it functions as a repressor of wing development. Approximately five to ten target genes are repressed by Ubx. These genes encode proteins that are crucial for the growth and patterning of the wings (Figure 19-16), and all are expressed in the developing wing. In Ubx mutants, these genes are no longer repressed in the halteres, and as a result, the halteres develop into a second set of wings.

Fruit flies are dipterans, and all of the members of this order contain a single pair of wings and a set of halteres. It is likely that Ubx functions as a repressor of wing development in all dipterans. Butterflies belong to a different order of insects, the lepidopterans. All of the members of this order (which also includes moths) contain two pairs of wings rather than a single pair of wings and a set of halteres. What is the basis for these different wing morphologies in dipterans and lepidopterans?

The two orders diverged from a common ancestor more than 250 million years ago. This is about the time of divergence that separates humans and nonmammalian vertebrates such as frogs. It would seem to be a sufficient period of time to alter Ubx gene function through any or all of the three strategies that we have discussed. The simplest mechanism would be to change the Ubx expression pattern so that it is lost in the progenitors of the hindwings in lepidopterans. Such a loss would permit the developing hindwings to express all of the genes that are normally repressed by Ubx. The transformation of swimming limbs into maxillipeds in isopods provides a clear precedent for such a mechanism. However, there is no obvious difference in the Ubx expression pattern between flies and butterflies; Ubx is expressed at high levels throughout the developing hindwings of butterflies.

That leaves us with two possibilities. The first is that the Ubx protein is functionally distinct in flies and butterflies. The second is that each of the approximately five to ten target genes that are repressed by Ubx in Drosophila has evolved changes in its regulatory DNA so that
FIGURE 19-15 Comparison of Ubx in crustaceans and in insects. (a) Ubx in crustaceans. The C-terminal antirepression peptide blocks the activity of the N-terminal repression domain. (b) Ubx in insects. The C-terminal antirepression peptide was lost through mutation. (Source: Adapted from Ronshaugen M. et al. 2002. Hox protein mutation and macroevolution of the insect body plan. Nature 415: 914-917, fig 4, part b, p. 916. Copyright © 2002 Nature Publishing Group. Used with permission.)
Box 19-5 Co-option of Gene Networks for Evolutionary Innovation

The regulatory gene Distalless (Dll) has been implicated in the development of most or all animal limbs, including the antennae and legs of Drosophila, the swimming limbs and maxillipeds of crustaceans, the fins of fish, and the limbs of mice (Box 19-5 Figure 1). In all of these cases, Dll is required for the extension of limbs away from the body. The extensive conservation of Distalless expression in virtually all animals has led to the proposal that the ancestral animals, perhaps the pre-Cambrian flattish round worm, contained small protuberances or "placodes" with sites of Dll expression. These rudimentary placodes in the ancestor might have led to the evolution of limbs in the higher animals.

Dll is not dedicated to the elongation of animal limbs, since it is also expressed in other types of tissues. One interesting example is seen in the wings of butterflies. Many consider the eyespots of butterfly wings to be among the most beautiful patterns encountered in nature. It is thought that these eyespots are used as decoys that allow butterflies to evade predators. Dll is expressed in the progenitors of the eyespots, called foci (Box 19-5 Figure 2). It is difficult to argue that the eyespot is a degenerate limb. Rather, it would [...]
Box 19-5 FIGURE 1 Distalless expression in various animal embryos. The embryos shown are stained with Dll antibody. Top row: arthropod (fruit fly in left panel and butterfly in center panel) and crustacean (right panel). Bottom row from the left: echinoderm (sea urchin), annelid, and vertebrate (chicken and zebrafish). (Source: Photos provided courtesy of Steve Paddock and Sean Carroll.)

Box 19-5 FIGURE 2 The expression of Dll and other pattern determining genes in the eyespot of B. anynana. Dll (red) is expressed in the eyespots of the developing butterfly wings. (Source: Courtesy of Craig Brunetti and Sean Carroll. Brunetti et al. 2001. Current Biology 11: 1578, fig 2, parts b and d.)
they are no longer repressed by Ubx in butterflies (see Figure 19-16). An individual predisposed to gambling would lay odds on the former mechanism: a change in Ubx protein function. It seems easier to modify repression activity than to change the regulatory sequences of five to ten different Ubx target genes. We have seen that this type of mechanism accounts for the repression of abdominal limbs in insects as compared with crustaceans.

Surprisingly, it appears that the less likely explanation, changes in the regulatory sequences of several Ubx target genes, accounts for the different wing morphologies. The Ubx protein appears to function in the same way in fruit flies and butterflies. For example, in butterflies, the loss of Ubx in patches of cells in the hindwing causes them to be transformed into forewing structures. (See Figure 19-16 for the difference between forewings and hindwings.) This observation suggests that the butterfly Ubx protein functions as a repressor that suppresses the development of forewings. While not proven, it is possible that the regulatory DNAs of the wing patterning genes have lost the Ubx binding sites (Figure 19-16b). As a result, they are no longer repressed by Ubx in the developing hindwing.

An implication of the preceding arguments is that evolutionary changes readily occur in regulatory DNAs. This is consistent with various experimental manipulations in Drosophila. We have seen how changing just 7% of the nucleotides in a mesoderm-specific enhancer converts it into a neurogenic enhancer in the fruit fly embryo.

FIGURE 19-16 Changes in the regulatory DNA of Ubx target genes. (a) The Ubx repressor is expressed in the halteres of dipterans and the hindwings of lepidopterans (orange). (b) Different target genes contain Ubx repressor sites in dipterans. These have been lost in lepidopterans.
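The proposed mechanism, in which the Ubx protein is unchanged but its target enhancers have lost their Ubx sites, can be stated as a rule over target genes. The sketch below assumes that a wing patterning gene is repressed only when Ubx is present and the gene's enhancer still carries a Ubx repressor site. The three gene names are hypothetical placeholders, and the model deliberately ignores any targets that Ubx may still repress in butterfly hindwings.

```python
def expressed_genes(ubx_present, enhancers):
    """Return the target genes left on in a developing appendage.

    enhancers maps gene name -> True if its enhancer retains a Ubx site.
    """
    return sorted(
        gene
        for gene, has_ubx_site in enhancers.items()
        if not (ubx_present and has_ubx_site)
    )


# Hypothetical wing patterning genes (placeholders, not real gene names).
dipteran_enhancers = {"wing_gene_1": True, "wing_gene_2": True, "wing_gene_3": True}
lepidopteran_enhancers = {"wing_gene_1": False, "wing_gene_2": False, "wing_gene_3": False}

print("fly haltere:", expressed_genes(True, dipteran_enhancers))
print("fly Ubx mutant haltere:", expressed_genes(False, dipteran_enhancers))
print("butterfly hindwing:", expressed_genes(True, lepidopteran_enhancers))
```

Under this rule the fly haltere expresses none of the wing genes, the Ubx mutant haltere expresses all of them, and the butterfly hindwing expresses all of them even though Ubx is present.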
GENOME EVOLUTION AND HUMAN ORIGINS

We have described how changes in gene expression cause morphological diversity among different groups of arthropods. We now consider functional diversity among different mammals. The genomes of mice and humans have been sequenced and assembled, and their comparison should shed light on our own human origins.
Humans Contain Surprisingly Few Genes

A variety of gene prediction programs are used to identify protein coding genes in whole-genome assemblies (see Chapter 20). These programs identify distinctive DNA sequence features associated with protein coding genes, including putative open-reading frames, spliceosome recognition signals, and core promoter elements. Predicted genes are sometimes confirmed by independent tests, most frequently the isolation of cDNAs corresponding to the encoded mRNAs. But the gene prediction programs are not completely accurate
(Chapter 20). Short, fortuitous open-reading frames can be falsely identified as protein coding genes. Conversely, authentic genes composed of many small exons can be missed because they lack obvious extended open-reading frames. Finally, there are numerous inaccuracies in the intron-exon structure of predicted genes due to the degeneracy and simplicity of the sequence signals required for splicing (as we saw in Chapter 13).

Despite these many caveats, the human genome appears to contain only 25,000-30,000 protein coding genes. This number came as quite a shock to many scientists working in the area of human genetics. There was a general sense that the remarkable sophistication in human morphology and behavior required many more genes. Before the human genome was sequenced, popular estimates ran to 100,000 protein coding genes.

Based on the logic that we have introduced in this chapter, we anticipate that higher vertebrates, such as humans, contain sophisticated mechanisms for gene regulation in order to produce many patterns of gene expression. In other words, organismal complexity is not correlated with gene number, but instead depends on the number of gene expression patterns. Consider the following argument. The nematode worm, C. elegans, contains nearly 20,000 genes (see Chapter 21), while the fruit fly, Drosophila melanogaster, contains significantly fewer genes, fewer than 14,000. Nonetheless, fruit flies exhibit a far more sophisticated range of morphologies and behaviors than those seen in worms. This increased complexity might result from an increase in the number of gene expression patterns. For example, the average fly gene might be regulated by three or four separate enhancers that together produce about 50,000 total patterns of gene expression. In contrast, each worm gene is probably regulated by only one or two enhancers. As a result, the worm might be built from about 30,000 total patterns of gene expression, significantly fewer than the number of patterns produced in flies even though the worm possesses more genes.
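The estimates in the preceding argument follow from multiplying gene number by enhancers per gene. The short calculation below uses the rounded gene counts and enhancer ranges quoted in the text; treating each enhancer as producing exactly one expression pattern is a simplifying assumption made here for illustration.

```python
def total_patterns(gene_count, enhancers_per_gene):
    """Assume one expression pattern per enhancer."""
    return gene_count * enhancers_per_gene


FLY_GENES = 14_000    # "fewer than 14,000" in the text
WORM_GENES = 20_000   # "nearly 20,000" in the text

fly_range = (total_patterns(FLY_GENES, 3), total_patterns(FLY_GENES, 4))
worm_range = (total_patterns(WORM_GENES, 1), total_patterns(WORM_GENES, 2))

# Midpoints of the ranges, for comparison with the rounded figures in the text.
fly_mid = sum(fly_range) / 2
worm_mid = sum(worm_range) / 2

print("fly patterns:", fly_range, "midpoint", fly_mid)
print("worm patterns:", worm_range, "midpoint", worm_mid)
```

The midpoints, 49,000 for the fly and 30,000 for the worm, agree with the rounded figures of about 50,000 and about 30,000, so the organism with fewer genes ends up with more expression patterns.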
The Human Genome Is Very Similar to That of the Mouse and Virtually Identical to That of the Chimp

Mice and humans contain roughly the same number of genes, about 28,000 protein coding genes. Approximately 80% of these genes possess a clear and unique one-to-one sequence alignment with one another between the two species. The proteins encoded by these genes are highly conserved and share an average of 80% amino acid sequence identity. Most of the remaining 20% of the genes in mice and humans differ by virtue of lineage-specific gene duplication events. For example, mice contain more copies of a gene called cytochrome P450 than do humans. Of course, there are also examples of gene families that are more extensively expanded in humans than in mice. The main point here is that there are few, if any, "new" genes in humans that are completely absent in mice. The chimp and human genomes are even more highly conserved. They vary by
as much as 2.5% in sequence. There is also extensive synteny between chimps and humans (and for that matter, mice). The order of neighboring genes and the distances separating them are highly conserved. We have seen that regulatory DNAs evolve more rapidly than proteins. Perhaps the limited sequence divergence between chimps and humans is sufficient to alter the activities of several key regulatory DNAs.
The Evolutionary Origins of Human Speech

Given the similar genetic compositions of mice, chimps, and humans, it is interesting to consider how new evolutionary innovations suddenly appear in humans. We speculate on the origin of one such trait, speech, as it is one of the defining features of being human. We alone possess the capacity for precise communication in the form of speech and written language. Our closest cousins, the chimpanzees, display a simple form of language that is quite crude in comparison to our own. How did our distinctive form of language arise in human evolution?

Speech depends on the precise coordination of the small muscles in our larynx and mouth. Reduced levels of a regulatory protein called FOXP2 cause severe defects in speech. Afflicted individuals exhibit a variety of difficulties in articulation. The FOXP2 gene was isolated in a variety of mammals, including mice, chimps, and orangutans (Figure 19-17). The human form of the protein is slightly different from those present in mice and in other primates. In particular, there are two amino acid residues at positions 303 and 325 that are unique to humans: Thr to Asn (T to N) at position 303 and Asn to Ser (N to S) at position 325 (Figure 19-18). Perhaps these changes have altered the function of the human FOXP2 protein. For example, there is evidence that these changes occur within a repression domain of the protein, thereby raising the possibility that human FOXP2 fails to regulate target genes that are repressed in mice and chimps. This would be comparable to the antirepression peptide that evolved in the Ubx protein of crustaceans. Alternatively, changes in the expression pattern or changes in FOXP2 target genes might be responsible for the ability of FOXP2 to promote speech in humans, as we now discuss.

How FOXP2 Fosters Speech in Humans

In this chapter we have discussed three mechanisms for changing the function of regulatory genes such as Ubx. The same principles apply to FOXP2. Perhaps a combination of all three mechanisms (changes in the FOXP2 expression pattern, changes in its amino acid sequence, and changes in FOXP2 target genes) might explain its emergence as an important mediator of human speech. For example,
FIGURE 19-17 Summary of amino acid changes in the FOXP2 proteins of mice and primates. The tree relates mouse, orangutan, gorilla, pygmy chimpanzee, chimpanzee, and human to their common ancestor. The numbers indicate nonconservative amino acid substitutions. (Source: Adapted from Zhang J. et al. 2002. Accelerated protein evolution and origins of human-specific features. Genetics 162: 1829, fig 4.)
FIGURE 19-18 Comparison of the FOXP2 gene sequences in human, chimp, and mouse. The figure shows the alignment of the FOXP2 sequences for human, chimp, and mouse with amino acid changes. There are two differences between human and chimp (N to T at position 303 and S to N at position 325 in the human sequence) and three differences between human and mouse (the third change is D to E at position 80). (Source: Data from Enard W. et al. 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418: 869-872.)
changes in the FOXP2 regulatory DNA might cause the gene to acquire a new pattern of gene expression in the human brain. In chimps the gene might not be expressed in the appropriate region of the brain at the right time during development. In contrast, in humans FOXP2 might be expressed at the right levels in the correct time and place to foster the development of language in the brains of infants.

In the next section we discuss the possibility that FOXP2 might regulate different sets of target genes in chimps and humans. This discussion is speculative, but serves to provide a framework for how subtle changes in just a few regulatory genes and their targets might lead to the innovation of a critical trait such as the use of language.

Consider potential target genes of the FOXP2 regulatory protein. Some might encode neurotransmitters or other critical signals that are expressed within the developing larynx. Perhaps these changes have augmented the levels or timing of gene expression, so that critical signals are active in the larynx during the time when we are most susceptible to acquiring language as infants. The corresponding genes might be expressed at lower levels, at later stages, or in the wrong regions of the developing chimp larynx (Figure 19-19). FOXP2 is just one example of a regulatory gene that underlies human speech. It is difficult to estimate the number of "speech regulatory genes" that have evolved after the divergence of chimps and humans. However, we have seen that fewer than 100 pattern determining genes are sufficient to account for the morphological diversification of different arthropod groups. Perhaps a significantly smaller set can account for the acquisition of language.
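The human-specific substitutions in FOXP2 discussed in this section can be recovered by comparing residues position by position. The sketch below records only the three positions named in the text and in the legend to Figure 19-18 (80, 303, and 325), not the full protein sequences; the dictionary layout and the function name `differences` are illustrative choices.

```python
# Residues at the three positions named in the text and figure legend.
FOXP2_RESIDUES = {
    "human": {80: "D", 303: "N", 325: "S"},
    "chimp": {80: "D", 303: "T", 325: "N"},
    "mouse": {80: "E", 303: "T", 325: "N"},
}


def differences(species_a, species_b):
    """Return {position: (residue in a, residue in b)} where the two differ."""
    residues_a = FOXP2_RESIDUES[species_a]
    residues_b = FOXP2_RESIDUES[species_b]
    return {
        position: (residues_a[position], residues_b[position])
        for position in sorted(residues_a)
        if residues_a[position] != residues_b[position]
    }


print("human vs chimp:", differences("human", "chimp"))
print("human vs mouse:", differences("human", "mouse"))
print("chimp vs mouse:", differences("chimp", "mouse"))
```

Among these positions, chimp and mouse differ only at position 80, whereas human differs from both at positions 303 and 325, which is why those two substitutions are described as unique to humans.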
The Future of Comparative Genome Analysis
Given the extensive body of information that has been compiled for a variety of different proteins, it is possible to infer the function of roughly half of all predicted protein coding genes based solely on primary DNA sequence information. In contrast, there is a glaring limitation in our ability to infer the function of regulatory DNA from simple sequence inspection. Fewer than 100 regulatory DNAs have been carefully characterized in all animals combined. This is not a sufficient data set to determine whether regulatory DNAs that mediate similar patterns of gene expression share a common "code", that is, whether conserved clusters of binding sites for particular combinations of regulatory proteins can be identified by simple sequence inspection. If such a code exists, then it might be possible to infer both the timing and sites of gene expression by
FIGURE 19-19 A scenario for the evolution of speech in humans. A hypothetical regulatory protein is expressed in the neocortex of both chimps and humans. However, it possesses slightly different activities in these groups. The human gene is strongly expressed at the critical time in the development of the speech center and activates three hypothetical target genes in the neocortex. These target genes might encode neurotransmitters required for the formation of the speech center. In contrast, the chimp form of the gene might not be expressed at optimal levels at the right time in the development of the speech center. Alternatively, it might be expressed at the right time, but amino acid differences cause it to be a weaker activator than its human counterpart. As a result, the chimp regulatory protein is unable to activate the full spectrum of target genes in the neocortex. Consequently, the chimp possesses a more primitive form of language. (Figure labels: primate neocortex; FOXP2 protein with N terminus, activation domain, winged helix DNA binding domain, and C terminus; human and chimpanzee target genes A, B, and C, each marked ON or OFF.)
simply scanning the DNA sequences associated with any given gene (in 5', intronic, and 3' positions relative to the transcription unit). This would permit a far more robust brand of comparative genome analysis than is currently available. For now, we must be content with comparisons of protein coding genes as discussed for FOXP2. In the future it might also be possible to identify changes in the expression profiles of homologous genes. The continuing development of new computational methods and the availability of new genome assemblies offer exciting prospects for the use of comparative methods to reveal the mechanisms of evolutionary diversity.
SUMMARY
In Chapter 18, we saw how differential gene expression is responsible for the establishment of different cell types in the developing embryo. In this chapter we argued that the same concept of differential gene expression can explain the evolution of animal diversity. It is becoming clear that the evolution of diversity among organisms is not due to the presence of different specialized genes. Rather, animal evolution depends on deploying the same set of genes in different ways. Evolutionary change can therefore be viewed as a problem in gene regulation, and comparative genome analysis offers the promise of identifying the regulatory mechanisms responsible for this diversity. At the time of this writing, seven different animal genomes have been sequenced and assembled. Increasingly
sophisticated methods of genome analysis are revealing a number of unexpected findings. First, there are fewer protein coding genes in a typical genome than expected. In humans, for example, this number dropped from an expected value of around 100,000 genes (before the sequencing of the genome in the year 2000) to just under 30,000. Invertebrates, including Ciona, nematode, and fruit fly, have approximately half this number (15,800, 19,000, and 14,000 genes, respectively). Second, comparative genome studies have revealed a striking constancy in genetic composition: most animals have essentially the same set of genes. Thus, between human and chimp, we find 98% conservation in the protein coding genes, but more surprisingly, the conservation between human and mouse is over 80%. Furthermore, the increase in gene number seen for vertebrates (as compared with invertebrates) is primarily due to the duplication of "old" genes rather than the invention of new ones.

Changes in gene expression during evolution depend on altering the activities of a special class of regulatory genes, called pattern determining genes. Whereas a typical animal genome might encode approximately 1,000 different regulatory genes, roughly 10%, or 100, of these correspond to pattern determining genes. These genes are characterized by the ability, when misexpressed during development, to cause the "right" structures to appear in the "wrong" place. For example, the misexpression of the pattern determining gene Pax6 causes the formation of extra eyes in the wings and legs of adult flies. There are three major strategies for altering the activities of pattern determining genes: changes in their expression profiles, changes in the function of the encoded regulatory proteins, and changes in the enhancers that are recognized and regulated by pattern determining proteins.
The pattern determining gene Ubx (Ultrabithorax) in Drosophila provides examples of all three strategies. The misexpression of Ubx in the developing wings causes the development of wingless flies. In an extreme change of function, the conversion of Ubx from a repressor into an activator, the modified Ubx gene behaves like another pattern determining gene, Antp, and controls the development of T2 (rather than T3) segments in developing embryos. In principle, it might be possible to convert a Ubx target enhancer into an Antp enhancer by simply changing the spacing between Exd and Ubx half-sites.

In terms of sheer numbers and diversity, the arthropods can be considered the most successful of all animal phyla. More is known about the molecular basis of arthropod diversity than that of any other group of animals. It is clear that the three strategies for altering the activities of pattern determining genes have been critical in generating wide morphological diversity among arthropods. Crustaceans and insects represent two of the five major groups of arthropods, and changes in their morphologies have been correlated with altered activities of pattern determining genes, particularly Ubx. Changes in the expression profile of the Ubx gene are correlated with the conversion of swimming limbs into maxillipeds. Functional changes in the Ubx protein might account for the repression of abdominal limbs in insects. Finally, changes in Ubx target enhancers might explain the different morphologies of the halteres in dipterans and the hindwings of lepidopterans.

We are fast entering a golden era of comparative genome analysis. The amount of information that is becoming available is staggering. At the current rate of DNA sequence production, the equivalent of 20 human genomes will be sequenced every year. The human genome contains surprisingly few genes, and these are highly conserved in other primates, mammals, and vertebrates. It is likely, therefore, that the acquisition of many of the remarkable characteristics unique to humans, such
PART 5
METHODS

PART OUTLINE
• Chapter 20 Techniques of Molecular Biology
• Chapter 21 Model Organisms
In Parts 2 and 3, we outlined our understanding of the molecular mechanisms underlying the central dogma; Part 4 focused on the mechanisms of gene expression and how differential gene expression controls the development and evolution of diverse animals. Most of what we know in these areas stems from the study of a few model organisms using techniques of genetics, molecular biology and biochemistry, and more recently from genome analysis. The last part of this book is devoted to summarizing some of these methods and organisms.

Chapter 20 outlines basic techniques of molecular biology and biochemistry. These allow molecules (DNA, RNA, and proteins) to be isolated from cells (isolated, that is, from complex mixtures of such molecules) and studied in pure form in vitro. Chapter 21 outlines key features of a few model organisms whose study underpins modern biological thinking. These are: phage and bacteria; yeast; the worm C. elegans; the Drosophila fruit fly; and the mouse. Genetic analysis of these organisms has enabled the study of biological processes in vivo. The power of molecular biology, and the revolution in our understanding of biology gained from it over the last 50 years, stems from using in vivo genetic and in vitro biochemical approaches in combination.

A golden era of molecular biology was launched once it became possible to isolate specific DNA segments representing individual genes. In earlier times, it was possible to obtain bulk DNA from an organism, but only during the mid-1970s were methods developed that permitted the isolation of specific genes. The use of restriction enzymes and gel electrophoresis to isolate specific DNA fragments is described early in Chapter 20, and this is followed by a consideration of how such fragments can be amplified and expressed in vivo. Next, we turn to techniques associated with in vitro amplification by polymerase chain reaction, and DNA sequencing.
Both of these require the chemical synthesis of DNA fragments for use as primers. This technique is briefly described. PCR permits the purification of virtually unlimited quantities of any given DNA segment, even when starting with just a single DNA molecule. PCR amplification has revolutionized many scientific disciplines, including forensics, medicine, ecology, and of course, molecular biology.

In the mid-1970s and 1980s, methods for DNA sequencing were still manual and somewhat laborious. During the 1990s, stimulated by the ambitions of the human genome project, DNA sequencing became highly mechanized and has now developed to the point where it is possible to determine the exact nucleotide sequence of entire genomes in just days or weeks. Chapter 20 also includes a description of the computational methods from the emerging discipline of bioinformatics that are used to assemble complex genomes and identify both protein coding genes and associated regulatory DNAs. Considerable efforts focus on comparing the genetic content of different genomes, and thereby determining the basis for organismal diversity. In the second half of Chapter 20, we deal with methods of protein purification and analysis. This closes with an outline of the new field of proteomics.

Chapter 21, in which we describe a handful of model organisms, stresses the principle that researchers employ the simplest organism in which the problem of interest can be studied. The simplest organisms of all, in terms of genome complexity and rapidity of the life cycle, are bacterial viruses, or bacteriophage. The study of bacteria and bacteriophage determined many of the basic features of DNA function, including the induction of gene expression, DNA replication, recombination, and repair. E. coli was the key organism of study in elucidating the genetic code during the early 1960s.

In the 1970s, molecular biologists were getting restless. Many felt that prokaryotes such as bacteria and their viruses had been conquered and that answering the next round of biological questions demanded experiments on eukaryotes. Most accessible of these is the yeast, Saccharomyces cerevisiae. It has a very rapid life cycle, like bacteria, but nonetheless exhibits many of the properties of more elaborate eukaryotic cells. Yeast has been used for a variety of studies, including DNA replication, the cell cycle, and transcription regulation; these studies proved most valuable because in each case it was found that yeast contain many of the molecular machines used in higher eukaryotes as well.

Chapter 21 ends with the three most popular animal models: the nematode worm, Caenorhabditis elegans; the fruit fly, Drosophila melanogaster; and the house mouse, Mus musculus. One of the big surprises in the past 20 years is the realization that many genetic processes are highly conserved among a broad spectrum of animals, from nematode worms to humans. Exhaustive genetic screens in the fruit fly, for example, have identified many of the signaling pathways and regulatory genes that control basic developmental processes common to higher animals as well. The development of highly sophisticated gene manipulation methods in transgenic mice has permitted researchers to determine what processes are controlled by the genetic pathways found in fruit flies. Genetically altered mice also provide models for testing ideas about, and treatments for, many human disorders, including Alzheimer's, Parkinson's disease, and rheumatoid arthritis.
PHOTOS FROM THE COLD SPRING HARBOR LABORATORY ARCHIVES

Seymour Benzer, 1975 Symposium on the Synapse. Using phage genetics, Benzer define

Werner Arber and Daniel Nathans, 1978 Symposium on DNA: Replication and Recombination. These two shared, with Hamilton Smith, the 1978 Nobel Prize in Medicine for the discovery of restriction enzymes and their application to the molecular analysis of DNA. This was one of the key discoveries in the development of recombinant DNA technology in the early 1970s.

Paul Berg, 1961 Symposium on Synthesis and Structure of Macromolecules. Berg was a pioneer in the construction of recombinant DNA molecules in vitro, work reflected in his share of the 1980 Nobel Prize for Chemistry.

Dale Kaiser, 1985 Symposium on Molecular Biology of Development. Kaiser contributed much to the early studies of phage lambda propagation. One aspect of this work led him to recognize that DNA molecules with complementary single-stranded ends can readily be joined together, a finding critical to the development of recombinant DNA technologies.

Walter Gilbert and David Botstein, 1986 Symposium on Molecular Biology of Homo sapiens. Gilbert, who invented a chemical method for sequencing DNA, is shown here with Botstein during the historic debate about whether it was feasible and sensible to attempt to sequence the human genome. Botstein, after working with phage for many years, contributed much to the development of the yeast S. cerevisiae as a model eukar

Albert Keston, Sidney Udenfriend, and Frederick Sanger, 1949 Symposium on Amino Acids and Proteins. Keston (inventor of the test tape for detecting glucose) and Udenfriend (who developed screens for, and tests of, antimalarial drugs) are here shown with Sanger, the only person to win two Nobel Prizes in Chemistry. The first, in 1958, was for developing a method to determine the amino acid sequence of a protein; the second, 22 years later, was for developing the method for sequencing DNA that is now used almost exclusively, including in the automated machines used to sequence whole genomes (Chapters 2 and 20). Beyond the obvious technological achievement, determining that a protein had a defined sequence revealed for the first time that it likely had a defined structure as well.
CHAPTER 20
Techniques of Molecular Biology
INTRODUCTION
The living cell, as we have seen, is an extraordinarily complicated entity, producing thousands of different macromolecules and harboring a genome that ranges in size from millions to billions of base pairs. Understanding how the genetic processes of the cell work requires powerful and complementary experimental approaches, including the use of suitable model organisms in which the tools of genetic analysis are available, as discussed in Chapter 21. They also include, as discussed here, methods for separating individual macromolecules from the myriad mixtures found in the cell, and for dissecting the genome into manageably sized segments for manipulation and analysis of specific DNA sequences. The successful development of such methods has been one of the major driving forces in the field of molecular biology over the last several decades, as well as one of its greatest triumphs.

Recently, it has become possible to apply molecular approaches to the large-scale analysis of the full complement of RNAs and proteins in the cell and to determine the nucleotide sequence of entire genomes. With a rapidly increasing number of genome sequences becoming available, it is possible, using computational or bioinformatics approaches, to undertake large-scale genomic comparisons of both the coding and noncoding regions of various organisms. In this chapter, we provide a brief introduction to these molecular and computational methods and to the principles upon which they are based.

As we shall see, the methods of molecular biology depend upon, and were developed from, an understanding of the properties of biological macromolecules themselves. For example, an understanding of the structure and base-pairing characteristics of DNA and RNA gave rise to the development of techniques of hybridization and sequencing that allow for the rapid and detailed analysis of gene structure and gene expression. Insight into the activities of DNA polymerases, restriction endonucleases, and DNA ligases gave birth to the techniques of DNA cloning and the polymerase chain reaction, which allow scientists to isolate essentially any DNA segment (even some from prehistoric life forms) in unlimited quantities.

This chapter is divided into two parts. The first part is devoted to techniques for the manipulation and characterization of nucleic acids, from the isolation of RNAs and DNAs to the sequencing of entire genomes and comparative genomics. The second part is concerned with the isolation and analysis of proteins, from the purification of individual proteins to proteomic methods for analyzing the full array of proteins in a cell or tissue. Although these categories of techniques are dissimilar in detail, many of the procedures for isolating and
OUTLINE
Introduction
Nucleic Acids
Proteins
manipulating nucleic acids and proteins are, as we shall see, based on common underlying principles. Finally, a note: it is important to appreciate that when we talk about isolating and purifying a given macromolecule in the ensuing discussion we rarely (if ever) mean that a single molecule is isolated. Rather, the goal of these procedures is to isolate a large population of identical molecules from all of the other kinds of molecules in the cell.
NUCLEIC ACIDS
Electrophoresis through a Gel Separates DNA and RNA Molecules According to Size
We begin by discussing the separation of DNA and RNA molecules by the technique of gel electrophoresis. Linear DNA molecules separate according to size when subjected to an electric field through a gel matrix, an inert, jello-like porous material. Because DNA is negatively charged, when subjected to an electrical field in this way, it migrates through the gel toward the positive pole (Figure 20-1). DNA molecules are flexible and occupy an effective volume. Pores in the gel matrix sieve the DNA molecules according to this volume; large molecules migrate more slowly through the gel because they have a larger effective volume than do smaller DNAs, and thus have more difficulty passing through the interstices of the gel. This means that once the gels have been "run" for a given time, molecules of different sizes are separated because they have moved different distances through the gel.

After electrophoresis is complete, the DNA molecules can be visualized by staining the gel with fluorescent dyes, such as ethidium, which binds to DNA and intercalates between the stacked bases (see Figure 6-2IH). Each band reveals the presence of a population of DNA molecules of a specific size. Two alternative kinds of gel matrices are used: polyacrylamide and agarose. Polyacrylamide has high resolving capability but can
FIGURE 20-1 DNA separation by gel electrophoresis. The figure shows a gel from the side in cross-section. Thus the "well" into which the DNA mixture is loaded onto the gel is indicated at the left, at the head of the gel. That is also the end at which the cathode of the electric field is located, the anode being at the foot of the gel. As a result the DNA fragments, which are negatively charged, move through the gel from the head to the foot. The distance they travel is inversely related to the size of the DNA fragment, as shown. (Source: Adapted from Micklos D.A. and Freyer G.A. 2003. DNA science: A first course, 2nd edition, p. 114. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.) (Figure labels: electrophoresis chamber; agarose gel; buffer solution; DNA fragments; small DNA fragments move further through the gel than large fragments.)
separate DNAs only over a narrow size r
Restriction Endonucleases Cleave DNA Molecules at Particular Sites
Most naturally occurring DNA molecules are much larger than can readily be managed, or analyzed, in the lab. Thus, as we have seen, chromosomes are extremely long single DNA molecules that can contain thousands of genes (see Chapter 7). If we are to study individual genes and individual sites on DNA, the large DNA molecules found in cells must be broken into manageable fragments. This is done using restriction endonucleases. These are nucleases that cleave DNA at particular sites by the recognition of specific sequences.
FIGURE 20-2 Pulsed-field gel electrophoresis. In this figure, the agarose gel is shown from above with the head of the gel, and a series of sample wells, at the top. A and B represent two sets of electrodes. These are switched on and off alternately, as described in the text. When A is on, the DNA is driven toward the bottom right corner of the gel where the anode of that pair is situated. When A is switched off, and B is switched on, the DNA moves toward the bottom left corner. The arrows thus show the path followed by the DNA as electrophoresis proceeds. (Source: Adapted from Sambrook J. and Russell D.W. 2001. Molecular cloning: A laboratory manual, 3rd edition, p. 555, fig 5-7. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.)
Restriction enzymes used in molecular biology typically recognize short (4-8 bp) target sequences, usually palindromic, and cut at a defined position within those sequences. Thus, consider one widely used restriction enzyme, EcoRI, so named because it was found in certain strains of Escherichia coli, and was the first (I) such enzyme found in that species. This enzyme recognizes and cleaves the sequence 5'-GAATTC-3'. (Because the two strands of DNA are complementary, we need specify only one strand and its polarity to describe a recognition sequence unambiguously.)

This hexameric sequence (like any other) would be expected to occur once in every 4 kilobases on average. (This is because there are four possible bases that can occur at any given position within a DNA sequence, and so the chance of finding any given specific 6 bp sequence is 1 in 4^6, that is, 1 in 4,096.) So, consider a linear DNA molecule with six copies of the GAATTC sequence: EcoRI would cut it into seven fragments in a range of sizes reflecting the distribution of those sites in the molecule. Suppose we then subject the EcoRI-cut DNA to electrophoresis through a gel: the seven fragments would separate from each other on the basis of their different sizes (Figure 20-3). Thus, in the experiment shown, EcoRI has dissected the DNA into specific fragments, each corresponding to a particular region of the molecule.

If the same DNA molecule had been cleaved with a different restriction enzyme (for example, HindIII, which also recognizes a 6 bp target, but of a different sequence, 5'-AAGCTT-3'), the molecule would have been cut at different positions and generated fragments of different sizes. Thus, the use of multiple enzymes allows different regions of a DNA molecule to be isolated. It also allows a given molecule to be identified: a given molecule will generate a characteristic series of patterns when digested with a set of different enzymes.

Other restriction enzymes such as Sau3A1 (which is found in the bacterium Staphylococcus aureus) recognize tetrameric sequences (5'-GATC-3') and so cut DNA more frequently, approximately once every 250 bp. At the other extreme is NotI, which recognizes an octameric sequence (5'-GCGGCCGC-3') and cuts, on average, only once every 65 kilobases (Table 20-1).
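The arithmetic and the "six sites give seven fragments" reasoning above can be checked with a short Python sketch. It is an illustration only: it assumes equal base frequencies and a linear molecule, the function names and the toy sequence are invented for the example, and the cut offset of 1 reflects EcoRI cutting between G and A on the strand shown.

```python
import re

def expected_spacing_bp(site_length):
    """Average spacing between sites, assuming random sequence with equal
    base frequencies: one site per 4**n bp for an n-bp recognition site."""
    return 4 ** site_length

def digest(seq, site, cut_offset):
    """Cut a linear sequence at every occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    Only the strand written 5'->3' is modeled here.
    """
    # Lookahead so that overlapping occurrences would also be found.
    cuts = [m.start() + cut_offset for m in re.finditer("(?=" + site + ")", seq)]
    fragments, start = [], 0
    for pos in cuts:
        fragments.append(seq[start:pos])
        start = pos
    fragments.append(seq[start:])
    return fragments

# Expected spacing for the three enzymes named in the text.
for name, site in [("Sau3A1", "GATC"), ("EcoRI", "GAATTC"), ("NotI", "GCGGCCGC")]:
    print(name, expected_spacing_bp(len(site)), "bp")

# Hypothetical toy molecule with three EcoRI sites: n sites give n + 1 fragments.
toy = "AA" + "GAATTC" + "TTTT" + "GAATTC" + "AAAAAA" + "GAATTC" + "CC"
fragments = digest(toy, "GAATTC", 1)
# Larger fragments first, mirroring the head-to-foot order of bands on a gel.
print(sorted((len(f) for f in fragments), reverse=True))
```

The computed spacings (256, 4,096, and 65,536 bp) round to the 250 bp, 4 kb, and 65 kb figures quoted in the text. Real genomes deviate from these averages because base composition is not uniform.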
FIGURE 20-3 Digestion of a DNA fragment with endonuclease EcoRI. At the top is shown a DNA molecule and the positions within it at which EcoRI cleaves. When the molecule, digested with that enzyme, is run on an agarose gel, the pattern of bands shown are (Figure labels: EcoRI sites; fragments A through G; decreasing size.)
TABLE 20-1 Some Restriction Endonucleases and Their Recognition Sequences

Enzyme    Sequence          Cut Frequency*
Sau3A1    5'-GATC-3'        0.25 kb
EcoRI     5'-GAATTC-3'      4 kb
NotI      5'-GCGGCCGC-3'    65 kb

*Frequency = 1/4^n, where n = the number of bps in the recognition sequence.
Restriction enzymes differ not only in the specificity and length of their recognition sequences, but also in the nature of the DNA ends they generate. Thus, some enzymes, like HpaI, generate flush ends; others, such as EcoRI, HindIII and PstI, generate staggered ends (Figure 20-4). For example, EcoRI cleaves covalent (phosphodiester) bonds between G and A at staggered positions on each strand. The hydrogen bonds between the 4 base pairs between these cut sites are easily broken to generate 5' protruding ends of 4 nucleotides in length (Figure 20-5). Notice that these ends are complementary to each other. They are said to be "sticky" because they readily anneal through base-pairing to each other or to other DNA molecules cut with the same enzyme. This is a useful property that we consider when we discuss DNA cloning.
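The relationship between cut position and end type can be sketched in a few lines of Python. This is a simplified model, not a description of any particular software: it assumes a palindromic site cut at mirror-image positions on the two strands, the function names are invented for the example, and the HpaI, HindIII, and PstI cut positions are supplied from general knowledge of these enzymes rather than from the text above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """True if the site reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)

def end_type(site, top_cut):
    """Classify the ends left by cutting a palindromic site.

    top_cut is the number of site bases 5' of the cut on the strand shown.
    By symmetry the other strand is cut at len(site) - top_cut, measured
    in the same coordinates.
    Returns (kind, single-stranded overhang written 5'->3').
    """
    bottom_cut = len(site) - top_cut
    if top_cut == bottom_cut:
        return ("blunt", "")
    if top_cut < bottom_cut:
        return ("5' overhang", site[top_cut:bottom_cut])
    return ("3' overhang", site[bottom_cut:top_cut])

# (recognition site, bases 5' of the cut); cut positions are assumptions
# taken from general knowledge, e.g. EcoRI cuts G^AATTC.
enzymes = {
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "HpaI": ("GTTAAC", 3),
    "PstI": ("CTGCAG", 5),
}

for name, (site, cut) in enzymes.items():
    kind, overhang = end_type(site, cut)
    print(name, is_palindromic(site), kind, overhang)
```

For EcoRI the sketch reports a 4-nucleotide 5' overhang, AATT. Because AATT is its own reverse complement, any two EcoRI ends can pair with each other, which is the "sticky" property described above.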
FIGURE 20-4 Recognition sequences and cut sites of various endonucleases. As shown, not only do different endonucleases recognize different target sites, they also cut at different positions within those sites. Thus molecules with blunt ends or with 5' or 3' overhanging ends can be generated.

DNA Hybridization Can Be Used to Identify Specific DNA Molecules
As we saw in Chapter 6, the capacity of denatured DNA to reanneal (that is, to re-form base pairs between complementary strands) allows for the formation of hybrid molecules when homologous, denatured DNAs from two different sources are mixed with each other under the appropriate conditions of ionic strength and temperature. This process of base-pairing between complementary single-stranded polynucleotides from two different sources is known as hybridization.

Many techniques rely on the specificity of hybridization between two DNA molecules of complementary sequence. For example, this property underlies how specific sequences within complicated mixtures of nucleic acids can be identified. In this case, one of the molecules is a probe of defined sequence, either a purified fragment or a chemically synthesized DNA molecule. The probe is used to search mixtures of nucleic acids for molecules containing a complementary sequence. The probe DNA must be labeled so that it can be readily located once it has found its target sequence. The mixture being probed has typically either been separated by size on a gel, or is distributed as a library in different colonies (see below).

There are two basic methods for labeling DNA. The first involves synthesizing new DNA in the presence of a labeled precursor, as we describe below. The other involves adding a label to the end of an intact DNA molecule. Thus, for example, the enzyme polynucleotide kinase adds the γ-phosphate from ATP to the 5'-OH group of DNA. If that phosphate is radioactive, this process labels the DNA molecule to which it is transferred. Labeling by incorporation (the other mechanism) is often carried out by using polymerase chain reaction (PCR) with a labeled precursor, or even by hybridizing short random hexameric oligonucleotides

FIGURE 20-5 Cleavage of an EcoRI site. EcoRI cuts the two strands within its recognition site to give 5' overhanging ends. These are called "sticky" ends because they readily adhere to other molecules cut with the same enzyme: they provide complementary single-stranded ends that come together through base-pairing.
to DNA and a llowing a DNA polymerase to extend tbem. Tbe labeled precuJ'Sors are most commonly nucleotides modified with either a fluorescent moiety 01' radioactive atoms. Typica lly the flu orescent moiety need only be attacbed to tbe base of one of the fom nucleotides use
Hybridization Probes Can Identify Electrophoretically... Separate
example, this can be useful when determining the amount of a specific mRNA that is expressed in two different cell types, or the length of a restriction fragment that contains the gene you are studying. This type of information can be obtained using blotting methods that localize specific nucleic acids after they have been separated by electrophoresis. Suppose that you have cleaved the yeast genome with the restriction enzyme EcoRI and want to know the size of the fragment that contains your gene of interest. When stained with ethidium bromide, the thousands of DNA fragments generated by cutting the yeast genome are too numerous to resolve into discretely visible bands, and instead look like a smear centere
that are sensitive to the light or electrons emitted by the labeled DNA. When, for example, an X-ray film is exposed to the filter and then developed, an autoradiogram is produced in which the pattern of exposure on the film corresponds to the position of the hybrids on the filter (Figure 20-6).

A similar procedure called northern blot hybridization (to distinguish it from Southern blot hybridization) can be used to identify a particular mRNA in a population of RNAs. Because mRNAs are relatively short (typically less than 5 kb), there is no need for them to be digested with any enzymes (there are only a limited number of specific RNA-cleaving enzymes anyway). Otherwise, the protocol is fairly similar to that described for Southern blotting. The separated mRNAs are transferred to a positively charged membrane and probed with a radioactive DNA of choice. (In this case, hybrids are formed by base-pairing between complementary strands of RNA and DNA.)

An experimenter might carry out northern blot hybridization to ascertain the amount of a particular mRNA present in a sample rather than its size. This measure is a reflection of the level of expression of the gene that encodes that mRNA. Thus, for example, one might use northern blot hybridization to ask how much more mRNA of a specific type is present in a cell treated with an inducer of the gene in question compared to an uninduced cell. As another example, northern blot hybridization might be carried out to compare the relative levels of a particular transcript (and hence the expression level of the gene in question) between different tissues of an organism. Because an excess of DNA probe is used in these assays, the amount of hybridization is related to the amount of mRNA present in the original sample, allowing the relative amounts of mRNA to be determined.

The principles of Southern and northern blot hybridization also underlie gene microarray analysis, which we consider in Chapter 18. In microarray analysis, the hybridization probe comprises amplified cDNA generated from total RNA from a cell or tissue. These probes are hybridized to an array of DNAs, each corresponding to a different gene in the organism under study. The intensity of the hybridization signal to each of the DNAs in the array is a measure of the level of expression of the gene in question.
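The Southern blot logic described above (digest, separate by size, then ask which fragment hybridizes to the probe) can be mimicked on a sequence string. The following is only a minimal sketch: the toy "genome" and probe are invented for illustration, only one strand is represented, and a probe that spans a cut site is simply not detected.

```python
def ecori_fragments(dna):
    """Cut a linear DNA at every EcoRI site (G^AATTC) and return the fragments."""
    site = "GAATTC"
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + 1  # EcoRI cleaves between the G and the first A
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def reverse_complement(seq):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def southern(dna, probe):
    """Sizes of the EcoRI fragments that contain a sequence complementary to the probe."""
    target = reverse_complement(probe)  # what the probe base-pairs with
    return [len(f) for f in ecori_fragments(dna) if target in f or probe in f]


# Hypothetical 37-base "genome" with two EcoRI sites, and a 9-base probe.
genome = "ATGCGTGAATTCTTAGGCATCGATCCGGAATTCAAGT"
probe = "ATCGATGCC"
print([len(f) for f in ecori_fragments(genome)])
print(southern(genome, probe))
```

On a real blot the fragment size is read from the position of the band on the gel; here the length of the matching string plays that role.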
Isolation of Specific Segments of DNA

Much of the molecular analysis of genes and their function requires the separation of specific segments of DNA from much larger DNA molecules, and their selective amplification. This allows the information encoded in that particular DNA molecule to be analyzed. Thus, the DNA can be sequenced, or it can be expressed and its product studied. The ability to purify specific DNA molecules in significant quantities allows them to be manipulated in various other ways as well. Thus, recombinant DNA molecules can be created. These can be used to alter the expression of a particular gene (by fusing its coding sequence to a promoter, for example) or even to generate DNAs that encode so-called fusion proteins, that is, hybrid proteins made up of parts derived from different proteins. The techniques of DNA cloning and amplification by PCR have become essential tools in asking questions about the control of gene expression and maintenance of the genome.

FIGURE 20-6 A Southern blot. DNA fragments, generated by digestion of a DNA molecule by a restriction enzyme, are run out on an agarose gel. Once stained, a pattern of fragments is seen. When transferred to a filter and probed with a DNA fragment homologous to just one sequence in the digested molecule, a single band is seen, corresponding to the position on the gel of the fragment containing that sequence.
DNA Cloning

The ability to construct recombinant DNA molecules and maintain them in cells is called DNA cloning. This process typically involves a vector that provides the information necessary to propagate the cloned DNA in the cell and an insert DNA that is inserted within the vector and includes the DNA of interest. Key to creating recombinant DNA molecules are the restriction enzymes that cut DNA at specific sequences, and other enzymes that join the cut DNAs to one another. By creating recombinant DNA molecules that can be propagated in a host organism, a particular insert DNA can be both purified from other DNAs and amplified to produce large quantities.

In the remainder of this section, we describe how DNA molecules are cut, recombined, and propagated. We then discuss how large collections of such hybrid molecules, called libraries, can be created. In a library, a common vector carries many alternative inserts. We describe how libraries are made and how specific DNA segments can be identified and isolated from them.
Cloning DNA in Plasmid Vectors

Once the DNA is cleaved into fragments by a restriction enzyme, it typically needs to be inserted into a vector for propagation. That is, the DNA fragment must be inserted within that second DNA molecule (the vector) to be replicated in a host organism, as we described above. By far the most common host used to propagate DNA is the bacterium E. coli. Vector DNAs typically have three characteristics:

1. They contain an origin of replication that allows them to replicate independently of the chromosome of the host.
2. They contain a selectable marker that allows cells that contain the vector (and any attached DNA) to be readily identified.
3. They have single sites for one or more restriction enzymes. This allows DNA fragments to be inserted at a defined point within an otherwise intact vector.

The most common vectors are small (approximately 3 kb) circular DNA molecules that are called plasmids. These molecules were originally derived from circular DNA molecules that are found naturally in many bacteria and single-cell eukaryotes (Chapter 21). In many cases, these DNAs carry genes encoding resistance to antibiotics. Thus, naturally occurring plasmids already have two of the characteristics desirable for a vector: they can propagate independently in the host and they carry a selectable marker. A further benefit is that these plasmids are sometimes present in multiple copies per cell. This increases the amount of DNA that can be isolated from a population. In some cases these plasmids also have useful unique restriction sites. However, since their discovery the plasmids have been simplified and modified such that a typical plasmid vector now has more than 20 unique restriction sites within a small region. This allows a much more diverse array of restriction enzymes to be used to cut the target DNA. Bacterial viruses (phage) have been modified to allow their use as cloning vectors as well (see Chapter 21).
To insert a fragment of DNA into a vector is a relatively simple process (Figure 20-7). Suppose that a plasmid vector has a unique recognition site for EcoRI. Treatment with that restriction enzyme would linearize the plasmid. Because EcoRI generates protruding 5' ends that are complementary to each other (Figure 20-5), the sticky ends are capable of reannealing to re-form a circle with two nicks. Thus, treatment of the circle with the enzyme DNA ligase and ATP would seal the nicks to re-form a covalently closed circle.

A target DNA is cleaved with a restriction enzyme to generate potential insert DNAs. Vector DNA that has been cut with the same enzyme is mixed with these insert DNAs, and DNA ligase is used to link the compatible ends of the two DNAs. If an excess of the insert DNA relative to the plasmid DNA is added, the majority of vectors will reseal with insert DNA incorporated (Figure 20-7).

Some vectors not only allow the isolation and purification of a particular DNA, but also drive the expression of genes within the insert DNA. These plasmids are called expression vectors and have transcriptional promoters immediately adjacent to the site of insertion. If the coding region of a gene (without its promoter) is placed at the site of insertion in the proper orientation, then the inserted gene will be transcribed into mRNA and translated into protein by the host cell. Expression vectors are frequently used to express heterologous or mutant genes to assess their function. They can also be used to produce large amounts of a protein for purification. In addition, the promoter in the expression vector can be chosen such that expression of the insert is regulated by the addition of a simple compound to the growth media (for example, a sugar or an amino acid). This control of when the gene will be expressed is particularly useful if the gene product is toxic.
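The cut-and-ligate steps just described can be followed on strings. This is a rough sketch under simplifying assumptions: only the top strand is modeled, the vector and insert sequences are invented, and insert orientation is ignored. The names `circular_sites`, `linearize`, and `ligate` are hypothetical helpers, not part of any library.

```python
SITE = "GAATTC"  # EcoRI recognition sequence; cleavage is G^AATTC


def circular_sites(circle):
    """Positions of EcoRI sites in a circular molecule (top strand only)."""
    wrapped = circle + circle[:len(SITE) - 1]  # lets a site span the join
    return [i for i in range(len(circle)) if wrapped.startswith(SITE, i)]


def linearize(plasmid):
    """Open a circular vector at its single EcoRI site."""
    hits = circular_sites(plasmid)
    if len(hits) != 1:
        raise ValueError("vector must carry exactly one EcoRI site")
    cut = hits[0] + 1  # cleavage falls just after the G
    return plasmid[cut:] + plasmid[:cut]  # linear molecule reading AATTC...G


def ligate(vector_linear, insert):
    """Join two EcoRI-cut molecules and return the recombinant circle."""
    for molecule in (vector_linear, insert):
        if not (molecule.startswith("AATTC") and molecule.endswith("G")):
            raise ValueError("ends are not EcoRI-compatible")
    return vector_linear + insert  # the last base is joined back to the first


vector = "TTGACAGAATTCCGTAGC"  # hypothetical 18-base circular "plasmid"
insert = "AATTCAAAGGGTTTG"     # hypothetical EcoRI fragment
recombinant = ligate(linearize(vector), insert)
print(len(recombinant), circular_sites(recombinant))
```

Ligation regenerates an EcoRI site at each junction, so the recombinant circle carries two sites flanking the insert, which is why the insert can later be released again with the same enzyme.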
Vector DNA Can Be Introduced into Host Organisms by Transformation

Propagation of the vector with its insert DNA requires that this recombinant molecule be introduced into a host cell by transformation. Transformation is the process by which a host organism can take up DNA from its environment. Some bacteria, but not E. coli, can do this naturally and are said to have genetic competence. E. coli can be rendered competent to take up DNA, however, by treatment with calcium ions. Although the exact mechanism for DNA uptake is not known, it is likely that the Ca2+ ions shield the negative charge on the DNA, allowing it to pass through the cell membrane. Calcium-treated cells are thus said to be competent to be transformed. An antibiotic to which the plasmid imparts resistance is then used to select transformants that have acquired the plasmid; cells harboring the plasmid will be able to grow in the presence of the antibiotic whereas those lacking it will not.

Transformation generally is a relatively inefficient process. Only a small percentage of the DNA-treated cells take up the plasmid. It is this low efficiency of transformation that makes selection with the antibiotic necessary. After DNA treatment, the cells are transferred onto medium containing the relevant antibiotic and only those cells that have taken up the plasmid and maintain it stably are able to grow. The inefficiency of transformation also ensures that, in most cases, each cell receives only a single molecule of DNA. This property makes
FIGURE 20-7 Cloning in a plasmid vector. A fragment of DNA, generated by cleavage with EcoRI, is inserted into the plasmid vector linearized by that same enzyme. Once ligated (see text), the recombinant plasmid is introduced into bacteria by transformation (see text). Cells containing the plasmid can be selected by growth on the antibiotic to which the plasmid confers resistance. (Source: Adapted from Micklos D.A. and Freyer G.A. 2003. DNA Science: A first course, 2nd edition, p. 129, left column. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.)
each transformed cell and its progeny a carrier of a unique DNA molecule and effectively allows the purification of that molecule away from all other DNAs in the transforming mixture.
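One way to see why low efficiency favors single-molecule uptake is to treat uptake as a Poisson process. That model is an assumption made here for illustration, not something the text states, and the mean values below are arbitrary.

```python
import math


def fraction_single(mean_per_cell):
    """Among cells that took up at least one plasmid, the fraction holding exactly one.

    Assumes (hypothetically) that the number of molecules taken up per cell
    follows a Poisson distribution with the given mean.
    """
    p_exactly_one = mean_per_cell * math.exp(-mean_per_cell)
    p_at_least_one = 1.0 - math.exp(-mean_per_cell)
    return p_exactly_one / p_at_least_one


for mean in (0.01, 0.1, 2.0):
    print(mean, round(fraction_single(mean), 3))
```

Under this assumption, when only about 1 cell in 100 is transformed, nearly all transformants carry a single molecule; if uptake were efficient (a mean of 2 per cell), most transformants would carry mixtures.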
Libraries of DNA Molecules Can Be Created by Cloning

FIGURE 20-8 Construction of a DNA library. To construct the library, genomic DNA and vector DNA, digested with the same restriction enzyme, are incubated together with ligase. The resulting pool or library of hybrid vectors (each vector carrying a different insert of genomic DNA, represented in a different color) is then introduced into E. coli, and the cells are plated onto a filter place
It is trivial to generate a specific clone if the starting donor DNA is simple. Thus, if the starting DNA is small (derived from a small virus, for example, with a genome of perhaps only 10 kb), then this can be accomplished simply by digesting the DNA with restriction enzymes and separating the fragments by gel electrophoresis. Once separated, DNAs of different sizes can be excised from the gel and purified prior to insertion into a vector. This is harder to do if the starting DNA is more complex (for example, the human genome). In this case, simple electrophoretic separation of DNA treated with a restriction enzyme will result in very many fragments distributed in a broad range of sizes around the average distance between cut sites. Thus, it is easier under these circumstances to clone the whole population of fragments and separate the individual clones afterwards.

A DNA library is a population of identical vectors that each contains a different DNA insert (Figure 20-8). To construct a DNA library, the target DNA (for example, human genomic DNA) is digested with a restriction enzyme that gives a desired average insert size. The insert size can range from less than 100 base pairs to more than a megabase (for very large insert sizes the DNA is typically incompletely cut with a restriction enzyme). The cleaved DNA is then mixed with the appropriate vector cut with the same restriction enzyme in the presence of ligase. This creates a large collection of vectors with different DNA inserts.

Different kinds of libraries are made using insert DNA from different sources. The simplest are derived from total genomic DNA cleaved with a restriction enzyme; these are called genomic libraries. This type of library is most useful when generating DNA for sequencing a genome. If, on the other hand, the objective is to clone a DNA fragment encoding a particular gene, a genomic library can be used efficiently only when the organism in question has relatively little noncoding DNA. For an organism with a more complex genome, this type of library is not suitable for this task because many of the DNA inserts will not contain coding DNA sequences.

To enrich for coding sequences in the library, a cDNA library is used. This is made as follows (Figure 20-9). Instead of starting with genomic DNA, mRNA is converted into DNA sequence. The process that allows this is called reverse transcription and is performed by a special DNA polymerase (reverse transcriptase) that can make DNA from an RNA template (see Chapter 11). When treated with reverse transcriptase, mRNA sequences can be converted into double-stranded DNA copies that are called cDNAs (for copy DNAs). These fragments are then ligated into the vector.

To isolate individual inserts from a library, E. coli cells are transformed with the entire library. Each transformed cell typically contains only a single vector with its associated insert DNA. Thus, each cell that propagates after transformation will contain multiple copies of just one of the possible clones from the library. The colony produced from cells carrying any cloned sequence of interest can be
identified and the DNA retrieved. There are various ways to identify the clone. For example, as we will describe below, hybridization with a unique DNA or RNA probe can identify a population of cells that include a particular insert DNA.
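Two pieces of arithmetic sit behind the choice of restriction enzyme and library size discussed above. The sketch below is illustrative only: it assumes a random sequence with equal base frequencies for the spacing estimate, and it uses the standard sampling estimate N = ln(1 - P) / ln(1 - f) for the number of clones, which the text itself does not give. The genome and insert sizes are round, hypothetical figures.

```python
import math


def mean_fragment_length(site_length):
    """Expected spacing (bp) between occurrences of a site of this length,
    assuming random sequence with all four bases equally frequent."""
    return 4 ** site_length


def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clones required so that a given sequence is present with the stated probability.

    Uses N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome
    carried by one insert; clones are assumed to be independent random samples.
    """
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log1p(-f))


print(mean_fragment_length(6))                 # a 6-bp site such as EcoRI's
print(clones_needed(3_000_000_000, 20_000))    # hypothetical human-scale genome, 20-kb inserts
print(clones_needed(10_000, 1_000))            # hypothetical 10-kb viral genome, 1-kb inserts
```

The contrast between the last two lines is the point made in the text: a small viral genome needs only a few dozen clones, whereas a human-scale genome needs on the order of several hundred thousand.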
Hybridization Can Be Used to Identify a Specific Clone in a DNA Library

When attempting to clone a gene, a common step is to identify fragments of that gene among clones in a library. This can be achieved using a DNA probe whose sequence matches part of the gene of interest. Such a probe can be used to identify colonies of cells harboring clones containing that region of the gene, as we now describe.

The process by which a labeled DNA probe is used to screen a library is called colony hybridization. A typical cDNA library will have thousands of different inserts, each contained within a common vector (see above). After transformation of a suitable bacterial host strain with the library, the cells are plated out on petri dishes containing solid growth medium (usually agar; see Chapter 21). Each cell grows into an isolated colony of cells, and each cell within a given colony contains the same vector and insert from the library (there are typically a few hundred colonies per dish).

The same type of positively charged membrane filter used in the Southern and northern blotting techniques is again used to secure small amounts of DNA for probing. In this case, pieces of the membrane are pressed on top of the dish of colonies, and imprints of cells (including some DNA) from each colony are left on the filter. Thus, the filter retains a sample of each DNA clone positioned on the filter in a pattern that matches the pattern of colonies on the plate. This ensures that once the desired clone has been identified by probing the filter, the colony of cells carrying that clone can be readily identified and the plasmid containing the appropriate insert DNA can be purified.

Probing of the filters is carried out as follows. They are treated under conditions that cause the cells on the membrane to break open and the DNA to leak out and bind to the filter at the same location as the cells the DNA was derived from. The filters can then be incubated with the labeled probe under the same conditions that were used in the northern and Southern blotting experiments.

As we mentioned earlier and discuss in Chapter 21, bacteriophage (particularly λ) have also been modified for use as vectors. When libraries are made using a phage vector, they can be screened in much the same way as just described for the screening of plasmid libraries. The difference is that the plaques formed by growth of the phage on bacterial lawns are screened rather than colonies (see Chapter 21).
Chemically Synthesized Oligonucleotides

Short, custom-designed segments of DNA known as oligonucleotides are critical for several techniques we describe in this chapter. Although DNA polymerases are the most efficient machines for synthesizing DNA molecules, DNA can also be synthesized chemically. The most common methods of chemical synthesis are performed on solid supports using machines that automate the process. The precursors used for nucleotide
FIGURE 20-9 Construction of a cDNA library. The RNA-dependent DNA polymerase reverse transcriptase (RT) transcribes RNA into DNA (copy or cDNA). In the first step (first strand synthesis), oligos of poly-T sequence serve as primers by hybridizing to the poly-A tails of the mRNAs. Reverse transcriptase extends the dT primer to complete a DNA copy of the mRNA template. The product is a duplex composed of one strand of mRNA and its complementary strand of DNA. The RNA strand is removed by treatment with base (NaOH), and the remaining single-stranded DNA now serves as template for the second step (second strand synthesis). Short random sequences of DNA, usually approximately 6 bp long (called random hexamers), serve as primers by hybridizing to various sequences along the copy DNA template. These primers are then extended by DNA polymerase to create double-stranded DNA products that can be cloned into a plasmid vector (see Figure 20-8) to create a cDNA library.

FIGURE 20-10 Protonated phosphoramidite. As shown, the 5'-hydroxyl group is blocked by the addition of a dimethoxytrityl (DMT) protecting group.
addition are chemically protected molecules called phosphoramidites (Figure 20-10). Growth of the DNA chain is by addition to the 5' end of the molecule, in contrast to the direction of chain growth used by DNA polymerases.

Chemical synthesis of DNA molecules up to 30 bases long is efficient and accurate, and takes only a few hours. It is a routine procedure: a researcher can simply program a DNA synthesizer to make any desired sequence by typing the base sequence into a computer controlling the machine. But as the synthetic molecules get longer, the final product is less uniform due to the inherent failures that occur during any cycle of the process. Thus, molecules over 100 nucleotides or so are difficult to synthesize in the quantity and with the accuracy desirable for most molecular analysis.

The rather short DNA molecules that can readily be made, however, are well suited for many purposes. For example, a custom-designed oligonucleotide harboring a mismatch to a segment of cloned DNA can be used to create a directed mutation in that cloned DNA. This method, called site-directed mutagenesis, is performed as follows. The oligonucleotide is hybridized to the cloned fragment, and used to prime DNA synthesis with the cloned DNA as template. In this way, a double-stranded molecule with one mismatch is made. The two strands are then separated and that with the desired mismatch amplified further. Custom-designed oligonucleotides can be used in this manner to introduce restriction sites into cloned DNAs, which are then used to create fusions between a coding sequence and another coding sequence or a promoter or ribosome binding site. As another example, synthetic oligonucleotides that have been labeled fluorescently or radioactively can be used as probes in hybridization experiments. Moreover, custom-designed oligonucleotides are critical in the polymerase chain reaction, which we describe next, and are an indispensable feature of the DNA sequencing strategies that we describe below.

Therefore, a common feature in designing experiments to construct new molecular clones of genes, to detect specific DNAs, to amplify DNAs, and to sequence DNAs is to design and have synthesized a short synthetic DNA oligonucleotide of desired sequence.
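The loss of uniformity with length follows from compounding a small failure rate over many cycles. As a back-of-the-envelope sketch, suppose each coupling cycle succeeds with a fixed probability; the 99% figure below is an assumed value chosen for illustration, not one given in the text.

```python
def full_length_fraction(length, coupling_efficiency=0.99):
    """Fraction of chains that reach full length.

    An oligonucleotide of n bases needs n - 1 successful couplings, since the
    first nucleotide starts out attached to the solid support. The per-cycle
    efficiency is a hypothetical parameter.
    """
    return coupling_efficiency ** (length - 1)


for n in (30, 100, 200):
    print(n, round(full_length_fraction(n), 2))
```

With these assumed numbers roughly three quarters of the chains are full length at 30 bases, but only about a third at 100 bases, which is consistent with the practical limits described above.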
The Polymerase Chain Reaction (PCR) Amplifies DNAs by Repeated Rounds of DNA Replication in Vitro

A powerful method for amplifying particular segments of DNA, distinct from cloning and propagation within a host cell, is the polymerase chain reaction (PCR). This procedure is carried out entirely biochemically, that is, in vitro. PCR uses the enzyme DNA polymerase that directs the synthesis of DNA from deoxynucleotide substrates on a single-stranded DNA template. As we saw in Chapter 8, DNA polymerase synthesizes DNA in a 5' to 3' direction and can add nucleotides to the 3' end of a custom-designed oligonucleotide. Thus, if a synthetic oligonucleotide is annealed to a single-stranded template that contains a region complementary to the oligonucleotide, DNA polymerase can use the oligonucleotide as a primer and elongate it in a 5' to 3' direction to generate an extended region of double-stranded DNA.

How are this enzyme and reaction exploited to amplify specific DNA sequences? Two single-stranded oligonucleotides are synthesized. One is complementary in sequence to the 3' end of one strand of the DNA to be amplified, the other complementary to the 3' end of the other strand, so that each primer can be extended across the region of interest (Figure 20-11). The DNA to be amplified is then denatured and the oligonucleotides annealed to their target sequences. At this
FIGURE 20-11 Polymerase chain reaction. In the first step of the PCR the DNA template is denatured by heating and annealed with synthetic oligonucleotide primers.
point, DNA polymerase and deoxynucleotide substrates are added to the reaction and the enzyme extends the two primers. This reaction generates double-stranded DNA over the region of interest on both of the strands of DNA. Thus two double-stranded copies of the starting fragment of DNA are produced in this, the first, cycle of the PCR. Next, the DNA is subject to another round of denaturation and DNA synthesis using the same primers. This generates four copies of the fragment of interest. In this way, additional repeated cycles of denaturation and primer-directed DNA synthesis amplify the region between the two primers in a geometric manner (2, 4, 8, 16, 32, 64, and so forth). So a fragment of DNA that was originally present in vanishingly small amounts is amplified into a relatively large quantity of a double-stranded DNA (see Figure 20-11).

In a sense, DNA cloning and the polymerase chain reaction (PCR) rely on the same concept: repeated rounds of DNA duplication, whether carried out by cycles of cell division or cycles of DNA synthesis in vitro, amplify tiny samples of DNA into large quantities. In cloning, however, we often rely on a selective reagent or other device to locate the amplified sequence in an already existing library of clones, whereas in PCR, the selective reagent, the pair of oligonucleotides, limits the amplification process to the particular DNA sequence of interest from the beginning (see Box 20-1, Forensics and the Polymerase Chain Reaction).
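Both ideas in this passage, that the primer pair defines what is amplified and that copy number doubles each cycle, can be sketched in a few lines. The template and primers below are invented, the template is given as its top strand only, primers must match exactly, and perfect doubling in every cycle is an idealization.

```python
def reverse_complement(seq):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def pcr_product(top_strand, forward, reverse):
    """Top-strand sequence of the region bounded by the two primers, or None.

    The forward primer has the same sequence as the top strand (it anneals to
    the bottom strand); the reverse primer anneals to the top strand, so its
    reverse complement is what appears in the top-strand sequence.
    """
    start = top_strand.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = top_strand.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return top_strand[start:end + len(reverse_site)]


def copies_after(cycles, starting_copies=1):
    """Idealized copy number after the given number of cycles."""
    return starting_copies * 2 ** cycles


template = "TTACGGATCCATGAAACCCGGGTTTTAGCTCGAGGCAT"  # hypothetical
product = pcr_product(template, forward="ACGGATCC", reverse="GCCTCGAG")
print(product, len(product))
print([copies_after(n) for n in range(1, 7)])
print(copies_after(30))
```

Thirty idealized cycles give about a billion copies from a single starting molecule, which is why vanishingly small samples become workable quantities.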
Nested Sets of DNA Fragments Reveal Nucleotide Sequences

We next consider how nucleotide sequences are determined. In a sense, nucleotide sequencing represents the ultimate in probing a genome with high selectivity. We determine the entire sequence of nucleotides for a genome, as has now been done for organisms ranging in complexity from bacteria to Homo sapiens, and this permits us to find any specific sequence with great rapidity and accuracy through the use of a computer and appropriate algorithms. In other words, our "selective reagent" when dealing with nucleotide sequences is a string of bases that we feed into a computer. The increasing availability of large numbers of genome sequences makes it possible to search with high precision for copies of related sequences both within and between organisms in silico. Obviously, nucleotide sequencing generates extraordinarily powerful databases, as we shall describe below.

The underlying principle of DNA sequencing is based on the separation, by size, of nested sets of DNA molecules. Each of the DNA molecules starts at a common 5' end, and terminates at one of several alternative 3' endpoints. Members of any given set have a particular type of base at their 3' ends. Thus, for one set, the molecules all end with a G, for another a C, for a third an A, and for the final set a T. Molecules within a given set (the G set, for example) vary in length depending on where the particular G at their 3' end lies in the sequence. Each fragment from this set, therefore, tells you where there is a G in the DNA molecule from which they were generated. How these fragments are generated we return to below (and is shown later in Figure 20-14).

The different lengths of these fragments can be determined by electrophoresis through a polyacrylamide gel. Running the G set on a gel in this way gives a ladder of fragments, with each rung corresponding to a fragment whose length reveals the position of a G in the DNA sequence. The four nested sets can be run out on the gel side by side, generating four ladders and revealing where there are Gs, Cs, As, and Ts within the
Box 20-1 Forensics and the Polymerase Chain Reaction

Imagine that you are in a forensic laboratory and have a DNA sample from a suspected criminal. You wish to determine whether the suspect's DNA contains a polymorphism that is present in DNA found at the scene of the crime. Polymorphisms are alternative DNA sequences (alleles) found in a population of organisms at a common, homologous region of the chromosome, such as a gene. A polymorphism can be as simple as alternative, single base pair differences at the same site in the chromosome among different members of the population or differences in the length of a simple nucleotide repeat sequence such as CA (see Chapter 9). What we want to do is amplify DNA surrounding and including the site of the polymorphism so that we can subject it to nucleotide sequencing (below) and determine if there is a match to the sequence found in the crime scene sample. The nucleotide sequence of amplified DNA helps to determine (along with checks for additional polymorphisms) whether the two DNA samples match.
sequence. Comparing the positions of the rungs in these four ladders reveals the entire sequence of the starting DNA molecule. Alternatively, the four nested sets can be differentially labeled with distinct fluorophores, allowing them to be subjected to electrophoresis as a single mixture and distinguished later using fluorometry.

How are nested sets of DNA molecules created? Two methods were invented for doing this. In one, DNA molecules are radioactively labeled at their 5' termini and are then subjected to four different regimens of chemical treatment that cause them to break preferentially at Gs, Cs, Ts, or As. This chemical procedure is no longer in wide use, and we will not consider it further. The other procedure, which employs chain-terminating nucleotides, continues to be used to this day and is the technology upon which modern, automatic sequencing machines called Sequenators are based.

In the chain termination method, DNA is copied by DNA polymerase from a DNA template starting from a fixed point specified by the use of an oligonucleotide primer. As we saw in Chapter 8, DNA polymerase uses 2'-deoxynucleoside triphosphates as substrates for DNA synthesis, and DNA synthesis occurs in a 5' to 3' direction. Phosphodiester bonds are formed by the nucleophilic attack of the 3'-hydroxyl at the 3' end of the growing polynucleotide chain on the α-phosphate of an incoming substrate molecule. (The chain termination method relies on the principles of enzymatic synthesis of DNA, which we discussed in Chapter 8.)

The chain termination method employs special, modified substrates called 2',3'-dideoxynucleotides (ddNTPs), which lack the 3'-hydroxyl group on their sugar moiety as well as the 2'-hydroxyl (Figure 20-12). DNA polymerase will
FIGURE 20-12 Dideoxynucleotides used in DNA sequencing. On the left is 2'-deoxy ATP. This can be incorporated into a growing DNA chain and allow another nucleotide to be incorporated directly after it. On the right is 2',3'-dideoxy ATP. This can be incorporated into a growing DNA chain, but once in place it blocks further nucleotides being added to the same chain.

FIGURE 20-13 Chain termination in the presence of dideoxynucleotides. In the top line is a DNA chain being extended at the 3' end with addition of an adenine nucleotide onto the previously incorporated cytosine. The presence of dideoxycytosine in the growing chain (shown at the bottom) blocks further addition of incoming nucleotides as described in the text.
incorporate a 2'-,3'-dideoxynucleotide at the 3' end of a growing polynucleotide chain, but once incorporated, the modified nucleotide causes elongation to terminate. The reason for this is the absence at the 3' end of the growing chain of a 3'-hydroxyl, which is needed for nucleophilic attack on the next incoming substrate molecule (Figure 20-13). Now suppose that we "spike" a cocktail of the nucleotide substrates with the modified substrate 2'-,3'-dideoxyguanosine triphosphate (ddGTP) at a ratio of one ddGTP molecule to 100 2'-deoxy-GTP molecules (dGTP). This will cause DNA synthesis to abort at a frequency of one in one hundred every time the DNA polymerase encounters a C on the template strand (Figure 20-14a). Because all of the DNA chains commence growth from the same point, the chain-terminating
FIGURE 20-14 DNA sequencing by the chain termination method. As described in the text, chains of different length are synthesized in the presence of dideoxynucleotides. The length of the chains produced depends on the sequence of the DNA template and on which dideoxynucleotide is included in the reaction. In the figure, the sequence of the template is shown at the top of (a). In this reaction, all bases are present as deoxynucleotides, but G is present in the dideoxy form as well. Thus, when the elongating chain reaches a C in the template, it will, in some fraction of the molecules, add the ddGTP instead of dGTP. In those cases, chains terminate at that point. Part (b) shows fragments separated on a polyacrylamide gel. The lengths of fragments seen on the gel reveal the positions of cytosines in the template DNA being sequenced in the reaction described.
nucleotides will generate a nested set of polynucleotide fragments, all sharing the same 5' end but differing in their lengths and hence their 3' ends. The length of the fragments, therefore, specifies the position of Cs in the template strand. If the fragments are labeled at their 5' end through the use of a radioactively labeled primer or a primer that has been tagged with a fluorescent adduct, or at their 3' end with fluorescently labeled derivatives of ddGTP, then upon electrophoresis through a polyacrylamide gel the nested set of fragments would yield a ladder of fragments, each rung of the ladder representing a C on the template strand (Figure 20-14b). If we similarly spike DNA synthesis reactions with ddCTP, ddATP, and ddTTP, then in toto we will generate four nested sets of fragments, which together provide the full nucleotide sequence of the DNA. To read that sequence, the fragments generated in each of the four reactions are resolved on a polyacrylamide gel (Figure 20-15). As we shall see below, this conceptually simple approach, developed initially to sequence short, defined DNA fragments, has undergone a series of technical adaptations and improvements that allow the analysis of whole genomes (see Box 20-2, Sequenators Are Used for High Throughput Sequencing).
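The logic of reading a sequence from four nested sets can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the template is a made-up sequence written in the order in which the polymerase copies it, fragment lengths are counted from the end of the primer, and every possible termination product is assumed to be present on the gel.

```python
# Watson-Crick pairing used to work out which base is added opposite the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(template, dd_base):
    """Lengths of the chains that end with the given dideoxynucleotide.

    `template` is read in the order in which the polymerase copies it;
    lengths count the nucleotides added after the primer.
    """
    return [
        position
        for position, template_base in enumerate(template, start=1)
        if COMPLEMENT[template_base] == dd_base
    ]


def read_ladders(template):
    """Build the four ladders and read the new strand from shortest to longest rung."""
    ladders = {base: termination_lengths(template, base) for base in "GATC"}
    rung_to_base = {
        length: base for base, lengths in ladders.items() for length in lengths
    }
    synthesized = "".join(rung_to_base[length] for length in sorted(rung_to_base))
    return ladders, synthesized


# Hypothetical template, not taken from any figure.
template = "ATGCCGATTAC"
ladders, synthesized = read_ladders(template)

print(ladders["G"])   # rungs of the ddGTP ladder, one per C in the template
print(synthesized)    # newly synthesized strand, read 5' to 3'
```

Each rung in the ddGTP ladder marks a C in the template, and complementing the strand read from the combined ladders recovers the template itself.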
Shotgun Sequencing a Bacterial Genome
The bacterium Hemophilus influenzae was the first free-living organism to have a complete genome sequence and assembly. It was a logical choice since it has a small, compact genome that is composed of just 1.8 megabase pairs (Mb) of DNA. The H. influenzae genome was sheared into many random fragments with an average size of 1 kb. These pieces of genomic DNA were cloned into a plasmid recombinant DNA vector. DNA was prepared from individual recombinant DNA colonies and separately sequenced on Sequenators using the dideoxy method that was discussed earlier in this chapter. This method is called "shotgun" sequencing: random recombinant DNA colonies are picked, processed, and sequenced. In order to make certain that every single nucleotide in the genome was captured in the final genome assembly, something like 30,000-40,000 separate recombinant clones were sequenced. A total of about 20 Mb of raw genome sequence was produced (600 bp of sequence is produced in an average reaction, and 600 bp × 33,000 different colonies = 20 Mb of total DNA sequence). This is called 10X sequence coverage. In principle, every nucleotide in the genome was sequenced about ten times on average.

This method might seem tedious, but it is considerably faster and less expensive than the techniques that were originally envisioned. One early strategy called for systematically sequencing every defined restriction DNA fragment on the physical map of the bacterial chromosome. A drawback of this procedure is that most of the known restriction fragments are larger than the amount of DNA sequence information that can be generated in a single reaction. Consequently, additional rounds of digestion, mapping, and sequencing would be required to obtain a complete sequence for any given defined region of the genome.
These additional steps of cloning and restriction mapping are considerably more time consuming than the repetitive automated sequencing of random DNA fragments. In other words, the computer assembles random DNA sequences much faster than fine-scale restriction mapping of the bacterial chromosome can be performed. The approximately 30,000 random sequencing reads derived from random genomic DNA fragments are directly loaded into the computer,
FIGURE 20-15 DNA sequencing gel.
and different programs are used to assemble overlapping DNA sequences. This process is conceptually similar to the assembly of a jigsaw puzzle. Random DNA fragments are "assembled" on the basis of the matching sequences they contain. The sequential assembly of such short DNA sequences ultimately leads to a single continuous assembly, also called a contig (see Figure 20-17 later in this chapter).
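The jigsaw-puzzle idea can be made concrete with a toy assembler. The sketch below is not the program used for any real genome; it is a minimal greedy scheme, assuming error-free reads from one strand, no repeats, and a made-up 20-nucleotide "genome". The function names and the minimum overlap of three nucleotides are choices made for the illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    Returns 0 if the best match is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    # Drop reads that lie entirely inside another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]


# Hypothetical genome and three overlapping reads taken from it.
genome = "ATGGCGTACGTTAGCCGATA"
reads = [genome[10:20], genome[0:8], genome[5:13]]

contigs = greedy_assemble(reads)
print(contigs)
```

With these reads the three pieces collapse into a single contig equal to the starting sequence. Real reads contain errors and repeats, which is why practical assemblers are far more elaborate.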
The Shotgun Strategy Permits a Partial Assembly of Large Genome Sequences
From our preceding discussion it is clear that sequencing short 600 bp DNA fragments is fast and efficient. In fact, the automated sequencing machines are so efficient that they far surpass our ability to assemble and annotate the raw DNA sequence information. In other words, the rate-limiting step in determining the complete DNA sequence of complex genomes, such as the human genome, is the analysis of the data, rather than the production of the data.
FIGURE 20-16 Strategy for construction and sequencing of whole genome libraries.
Box 20-2 Sequenators Are Used for High Throughput Sequencing
When the sequencing of the human genome was first envisioned, it seemed like a daunting, virtually hopeless enterprise. After all, the complete human genome consists of a staggering 3 billion (3 × 10^9) base pairs, and the early methods for determining the nucleotide sequence of even short DNA fragments were quite tedious. In the 1980s and early 1990s, an individual researcher could produce only a few hundred base pairs, perhaps 500 bp, of DNA sequence in a day or two of concentrated effort. Several technical innovations have greatly accelerated the speed and reliability of DNA sequencing. As we described in the preceding section, the chain termination method produces nested sets of DNAs that differ in size by just a single nucleotide. Initially, large polyacrylamide gels were used to fractionate these nested DNAs (see Figure 20-15). However, in recent years cumbersome gels have been replaced by short columns, which permit the resolution of nested DNAs in just 2 to 3 hours. These short reusable columns permit the fractionation of DNA fragments ranging from 700 to as many as 800 bps, similar to the capacity of the far more cumbersome
polyacrylamide gels that they have replaced. A major technical advance in DNA sequencing came from the use of fluorescent chain-terminating nucleotides. In principle, it is possible to label each of the nested
depends on the identification of the last nucleotide. For example, DNAs ending with a T residue at position 50 in the template DNA might be labeled red, while those nested DNAs ending with a G residue at position 51 might be labeled black. Thus, each nested DNA has a unique size and color. As they are fractionated on the sequencing columns based on size, fluorescent sensors detect the color of each nested DNA (Box 20-2 Figure 1). In this way, a single column produces 600 to 800 bp of DNA sequence after less than three hours of size separation. Automated sequencing machines (Sequenators) have been developed that have 96 and, most recently, even 384 separate fractionation columns. In principle, the 384-column machines can generate over 200,000 nucleotides (200 kb) of raw DNA sequence in just a few hours. In a 9-hour day, each machine can produce three sequencing "runs" and more than one-half a megabase (500 kb) of sequence information. A cluster of 100 such machines could generate the equivalent of one human genome, 3 × 10^9 bp, in just two months. There are currently five major sequencing centers in the United States and the United Kingdom. Each contains large clusters of automated DNA sequencing machines. Together, these five centers produce a staggering 60 × 10^9 bp of raw DNA sequence information per year. (This corresponds to the equivalent of 20 human genomes per year!)
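The throughput figures quoted in this box follow from simple arithmetic, which the short Python sketch below reproduces. The constants are the round numbers given in the text (600 bp per column at the low end of the range, 384 columns, three runs per day, 100 machines); the variable names are only illustrative.

```python
READ_LENGTH_BP = 600            # low end of the 600 to 800 bp range per column
COLUMNS_PER_MACHINE = 384
RUNS_PER_DAY = 3                # three runs in a 9-hour day
MACHINES_IN_CLUSTER = 100
HUMAN_GENOME_BP = 3_000_000_000

# Raw sequence from one run of one 384-column machine.
bp_per_run = READ_LENGTH_BP * COLUMNS_PER_MACHINE

# Raw sequence from one machine in one working day.
bp_per_machine_day = bp_per_run * RUNS_PER_DAY

# Days for a 100-machine cluster to produce one genome equivalent,
# using the conservative round figure of 500 kb per machine per day.
days_per_genome = HUMAN_GENOME_BP / (500_000 * MACHINES_IN_CLUSTER)

# Genome equivalents per year from the five centers combined.
genomes_per_year = 60_000_000_000 / HUMAN_GENOME_BP

print(bp_per_run)          # a little over 200 kb per run
print(bp_per_machine_day)  # more than half a megabase per day
print(days_per_genome)     # about two months
print(genomes_per_year)
```

Using the exact product of the first three constants instead of the round 500 kb figure shortens the estimate to roughly six weeks, which is consistent with the "just two months" quoted above.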
BOX 20-2 FIGURE 1 DNA sequence readout. In this reaction, as described in the text, fluorescent end-labeled dideoxynucleotides are used and the chains are separated by column chromatography. The profile of positions of As is represented in green; Ts in red; Gs in black; and Cs in blue.
quickly sequenced using the automated sequencing machines. To ensure that every sequence is sampled in the complete chromosome, an average of two million random DNA fragments are processed. With an average of 600 bp of DNA sequence per fragment, this procedure produces over one billion bp of sequence data, or nearly ten times the amount of DNA in a typical chromosome. As discussed earlier for the sequencing of the bacterial chromosome, by sampling about ten times the amount of sequence in a chromosome we can be confident that every portion of the chromosome is captured. The process of producing "shotgun" recombinant libraries and huge excesses of random DNA sequencing reads seems very wasteful. However, a cluster of one hundred 384-column automated sequencing machines can generate tenfold coverage of a human chromosome in just
three weeks. This is considerably faster than the methods involving the isolation of known regions within the chromosome and sequentially sequencing a known set of staggered DNA fragments. Thus, the key technological insight that facilitated the sequencing of the human genome was the reliance on automated shotgun sequencing and the subsequent use of the computer to assemble the different pieces like a jigsaw puzzle. The combination of automated sequencing machines and computers proved to be a potent one-two punch that led to the completion of the human genome sequence years earlier than originally planned.

Sophisticated computer programs have been developed that assemble the short sequences from random shotgun DNAs into larger contiguous sequences called contigs. Reads containing identical sequences are assumed to overlap and are joined to form larger contigs (Figure 20-17). The sizes of these contigs depend on the amount of sequence obtained: the more sequence, the larger the contigs and the fewer gaps in the sequence. Individual contigs are typically composed of 50,000 to 200,000 bp. This is still far short of a typical human chromosome. However, such contigs are useful for analyzing compact genomes. For example, the Drosophila genome contains an average of one gene every 10 kb, so a typical contig has several linked genes. Unfortunately, more complex genomes often contain considerably lower gene densities. The human genome contains an average of one gene every 100 kb, so a typical contig is often insufficient to capture an entire gene, let alone a series of linked genes. We now consider how relatively short contigs are assembled into larger scaffolds that are typically 1-2 Mb in length.
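The link between the amount of raw sequence and the completeness of an assembly can be illustrated with a short calculation. The sketch below computes fold coverage from the numbers given in the text and then applies a standard idealization that is not part of the text: if reads land at random, the chance that a given base is missed by every read is approximately e raised to the power of minus the coverage. The 125 Mb chromosome size is an assumed round figure chosen only for the example.

```python
import math


def fold_coverage(n_reads, read_length_bp, target_size_bp):
    """Total raw sequence divided by the size of the target."""
    return n_reads * read_length_bp / target_size_bp


def expected_fraction_missed(coverage):
    """Idealized (Poisson) chance that a base is covered by no read at all."""
    return math.exp(-coverage)


# H. influenzae: 33,000 reads of 600 bp against a 1.8 Mb genome.
bacterial = fold_coverage(33_000, 600, 1_800_000)

# Human chromosome: two million reads of 600 bp against an assumed 125 Mb.
chromosome = fold_coverage(2_000_000, 600, 125_000_000)

for label, c in [("bacterial genome", bacterial), ("chromosome", chromosome)]:
    print(f"{label}: {c:.1f}-fold, fraction missed ~ {expected_fraction_missed(c):.1e}")
```

At roughly tenfold coverage the idealized fraction of missed bases is a few in 100,000, which is why about ten times the target size is a sensible amount of raw sequence. Repeats and cloning biases make real projects fall short of this ideal.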
The Paired-End Strategy Permits the Assembly of Large Genome Scaffolds
A major limitation to producing larger contigs is the occurrence of repetitive DNAs. Such sequences complicate the assembly process since random DNA fragments from unlinked regions of a chromosome or genome might appear to overlap due to the presence of the same
FIGURE 20-17 Contigs are linked by sequencing the ends of large DNA fragments. For example, one end of a random 100 kb genomic DNA fragment might contain sequence matches within contig 1, while the other end matches sequences in contig 2. This places the two contigs on a common scaffold. (Source: Adapted from Griffiths A.J.F. et al. Modern gene… 2nd e…
repetitive DNA sequence. One method that is used to overcome this difficulty is called paired-end sequencing. This is a simple technique that has produced powerful results. In addition to producing shotgun DNA libraries composed of short DNA fragments, the same genomic DNA is also used to produce recombinant libraries composed of larger fragments, typically 3-100 kb in length. Consider a DNA sample from a single human chromosome. Some of the DNA is used to produce 1 kb fragments, while another aliquot of the same sample is used to produce 5 kb fragments. The end result is the construction of two libraries, one with small inserts and a second with larger inserts (see Figure 20-16). Universal primers are made that anneal at the junction between the plasmid and both sides of the large inserted DNA fragment. Individual runs will produce about 600 bp of sequence information at each end of the random insert. A record is kept of which end-sequences are derived from the same inserted fragment. One end might align with sequences contained within contig A, while the other end aligns with a different contig, contig B. Contigs A and B are now assumed to derive from the same region of the chromosome since they share sequences with a common 5 kb fragment. Most repetitive DNA sequences are less than 2 or 3 kb in length, so the "paired-end" sequences from the 5 kb insert are sufficient to span contigs interrupted by repetitive DNAs.

The preceding methods usually produce contigs that are less than 500 kb in length. In order to obtain long-range sequence data, on the order of several megabases or more, it is necessary to obtain paired-end sequence data from large DNA fragments that are at least 100 kb in length. These can be obtained using a special cloning vector called a BAC (bacterial artificial chromosome). The principle of how these are used to produce long-range sc
Analyses
For the genomes of bacteria and simple eukaryotes, the process of finding protein coding genes is relatively straightforward, essentially amounting to the identification of open-reading frames. Although not
all open-reading frames (especially small ones) are real protein coding genes, this process is fairly effective, and the key challenge is in identifying the functions of these genes. For animal genomes with complex exon-intron structures, the challenge is far greater. In this case, a variety of bioinformatics tools are required to identify genes and determine the genetic composition of complex genomes. Computer programs have been developed that identify potential protein coding genes through a variety of sequence criteria, including the occurrence of extended open-reading frames that are flanked by appropriate 5' and 3' splice sites (Figure 20-18). However, these methods have not yet been refined to the point of 100% accuracy. Perhaps something like three-fourths of all genes can be identified in this way, but many are missed, and even among the predicted genes that are identified, small exons (particularly noncoding exons) are missed.

A notable limitation of current gene finder programs is the failure to identify promoters. A typical metazoan core promoter is about 60 bp in length and contains sequence motifs, such as TATA, INR, and DPE, which are sufficient for the binding of the TFIID initiation complex and recruitment of the Pol II transcription complex (see Chapters 12 and 17). Unfortunately, core promoter elements are highly degenerate, and although the transcription complex is smart enough to identify these elements within the cell, we are not yet smart enough to write programs that identify them in silico, even when other sequence constraints are invoked (for example, associated exons). It is conceivable that computer programs will be created that exploit all of the aforementioned properties of a gene (core promoter elements, open-reading frames, splice sites, and so on) to identify protein coding genes in a consistent and efficient manner. The most important
method for validating predicted protein coding genes and identifying those missed by current gene finder programs is the use of cDNA sequence data (see Figure 20-18). cDNAs are generated by reverse transcription (see Figure 20-9) from mature mRNAs and hence represent bona fide exon sequences. The cDNAs are used to generate EST data. An EST, or expressed sequence tag, is simply a short sequence read from a larger cDNA. These reads are typically obtained
FIGURE 20-18 Gene finder methods; analysis of protein
from either the 5' or 3' end of the cDNA, usually the 3' end. Random cDNAs are sequenced using shotgun sequencing methods and then aligned onto genomic scaffolds. Regions of alignment correspond to exons, while genome sequences located between regions of alignment often correspond to introns (although alternative splicing might utilize an exon not contained in the particular cDNA or EST that was sequenced). Shotgun cDNA sequence information can help link different contigs or scaffolds. Consider the case of a cDNA that is transcribed from a very large gene with introns of 100 kb or more in length. Two different scaffolds that share different sequences from this common cDNA are likely to arise from linked regions of the genome and represent a single large gene.
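The open-reading-frame search mentioned at the start of this section, which suffices for most bacterial genes, can be sketched as follows. This is a minimal illustration rather than a real gene finder: it scans only the given strand (a full search would also scan the reverse complement), requires an ATG start, and ignores splice sites entirely. The function name, the minimum length, and the test sequence are all made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop stretch on the given strand.

    `start` is the index of the A of the ATG, `end` is one past the stop codon,
    and `min_codons` counts the coding codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence with one short ORF in the third reading frame.
dna = "CCATGGCTAAAGGTTAACC"
orfs = find_orfs(dna)
for start, end, frame in orfs:
    print(frame, dna[start:end])
```

In a genome with introns, a scan like this finds only fragments of genes, which is why the splice-site criteria and cDNA evidence described above are needed.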
Comparative Genome Analysis
The comparison of different animal genomes permits a direct assessment of changes in gene structure and sequence that have arisen during evolution (Figure 20-18). Such comparisons also refine the identification of protein coding genes within a given genome. For example, the exons of orthologous genes are highly conserved relative to noncoding DNA sequences such as introns. Simple comparisons of the mouse and human genomes have identified a large number of highly conserved exons. Given the conservation of protein-coding sequences, there is no ambiguity in distinguishing conserved exons from other conserved sequences, such as enhancers (see below). Comparative analysis helps identify short exons, some located near the 5' end of the gene and the core promoter, that are often missed by gene prediction programs.

One of the striking findings of comparative genome analysis is the high degree of synteny, or conservation in genetic linkage, between distantly related animals. There is extensive synteny between mice and humans (Figure 20-19). In many cases, this linkage even extends to the pufferfish, which last shared a common ancestor with mammals more than 400 million years ago. The extensive synteny seen for vertebrate genomes, along with the coordinate expression of linked genes in Drosophila, raises the possibility that neighboring genes share common regulatory sequences. A recent bioinformatics survey in Drosophila suggests that 10-20 linked genes within a chromosome domain spanning 100-200 kb exhibit similar patterns of gene expression. Each of the estimated 500-1,000 chromosome domains in Drosophila might retain fixed synteny due to a reliance on common regulatory sequences.

Protein-coding sequences are not the only regions of the genome that are under functional constraints. Regulatory sequences (transcription factor binding sites and larger elements of gene regulation, such as enhancers) tend to be selectively conserved.
These regulatory elements can often be recognized as short but conserved non-protein-coding sequences. For example, a computer program called VISTA aligns the sequences contained in different genomes over short windows, on the order of 10-20 bp. Conservation in the range of 70% identified over distances of 50-75 bp is seen for certain regulatory DNAs (Figure 20-20). Pufferfish and mice share something like 10,000 short noncoding sequences. It is conceivable that many of these correspond to tissue-specific enhancers. However, it is likely that both animals, particularly mice, have many more enhancers that were missed by simple sequence conservation. The humble sea squirt, Ciona intestinalis, is estimated to contain on the order of 20,000 different tissue-specific enhancers and it
FIGURE 20-19 Synteny in the mouse and human chromosomes.
would not be surprising for mice and humans to contain more like 50,000-100,000 such enhancers. Other methods have been used to identify enhancers, based on the clustering of binding sites for sequence-specific transcriptional activators and repressors (see Chapter 18, Box 18-6). The recognition of regulatory sequences in DNA poses a much greater challenge than the identification of protein-coding sequences, as regulatory sequences are not subject to constraints as stringent as those of the genetic code. Hence, it is likely that a combination of bioinformatics methods will be required to identify regulatory DNAs in whole-genome sequences.

The most commonly used genome tool is BLAST (basic local alignment search tool). There are variations in BLAST programs, but they all share the common feature of finding regions of similarity between different protein coding genes (Figure 20-21). There are many ways in which a BLAST search can be done. One involves searching a genome, or many genomes, for all of the predicted protein sequences that are related to a so-called query sequence. Consider the following example. We have already discussed the even-skipped (eve) gene in Chapter 18. The eve gene encodes a homeodomain protein that is essential for the segmentation of the Drosophila embryo. The Eve protein is composed of 376 amino acid residues. The homeodomain
FIGURE 20-20 Comparison of a 14 kb region of the mouse and human genomes. This interval contains two linked genes, gene 1 and gene 2, which are transcribed
resides between amino acid residues 71-130. When this 60 amino acid long polypeptide is used as a query, it identifies about 75 homeobox-containing genes in the Drosophila genome. Thus, BLAST quickly identifies a variety of genes with similar functions; in this case, genes that encode regulatory proteins containing a specialized form of the helix-turn-helix DNA binding motif (see Chapters 16 and 17). There are other ways that this type of BLAST search could be done. In the preceding example, we used a 60 amino acid polypeptide sequence. It is
CG1046-PA translation from gene zen
Length = 353
Score = 150 (57.9 bits), Expect = 2.1e-11, P = 2.1e-11
Identities = 31/57 (54%), Positives = 39/57 (68%)
Sbjct: residues 91-147

translation from gene unpg
Length = 485
Score = 152 (58.6 bits), Expect = 2.4e-11, P = 2.4e-11
Identities = 30/57 (52%), Positives = 39/57 (68%)
Query: residues 1-57; Sbjct: residues 320-376

translation from gene Ubx
Length = 346
Score = 149 (57.5 bits), Expect = 2.6e-11, P = 2.6e-11
Identities = 31/58 (53%), Positives = 40/58 (68%)
Query: residues 1-58; Sbjct: residues 253-310
FIGURE 20-21 Example of a BLAST search. A sequence of 57 amino acid residues from the homeodomain of the Eve protein was used to query the Drosophila genome. This sequence was entered in the publicly available Fly BLAST Web site (www.fruitfly.org/blast/). There are 3 steps in this process. First, you are asked which program you wish to use. In this case, the AA program was selected as the Eve polypeptide is an amino acid sequence. A nucleotide BLAST search could be done by selecting the "NT" database. The second step is to select a dataset. In this example, the predicted proteins dataset was selected because we are comparing protein sequences. For a DNA search, one of several nucleotide datasets could be used, including the total genomic DNA or just the predicted genes. The results of the search are usually obtained in less than a minute. First you see a list of the top matches, and when you scroll down on the computer screen the detailed results are obtained, as shown in the figure. The first "hit" is the eve gene itself, which is not shown here. The second "hit" corresponds to the zen gene, which encodes a homeodomain protein that is important for dorsal-ventral patterning. The zen gene is represented by a specific code, CG1046, which is one of the predicted genes in the Drosophila genome. A score of 150 is assigned to the match between the Eve and Zen homeodomains. A total of 31 of 57 amino acid residues are identical between the two (54%), and 39 of the residues are either identical or similar (that is, they represent conservative amino acid substitutions). A score of 152 was obtained for the homeodomain protein Unplugged (Unpg), which is essential for the development of the central nervous system. In this case there are 30 of 57 exact matches with the Eve homeodomain, and 39 of 57 total similarities. The third highest score, 149, was obtained with the Ubx homeodomain. Ubx is a homeotic gene that was extensively discussed in Chapter 19.
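The "Identities" and "Positives" counts reported for each hit can be illustrated with a small Python sketch. It assumes two already-aligned sequences of equal length and uses a hand-made grouping of chemically similar amino acids to decide what counts as a conservative substitution. That grouping is an assumption made for this example; BLAST itself decides positives from a substitution scoring matrix. The two short sequences are invented and are not taken from the figure.

```python
# Illustrative groups of similar residues (an assumption, not BLAST's matrix).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def is_similar(a, b):
    """True if two different residues fall in the same illustrative group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def score_alignment(query, subject):
    """Count identical and 'positive' (identical or similar) aligned positions."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    identities = sum(q == s and q != "-" for q, s in zip(query, subject))
    similar = sum(q != s and is_similar(q, s) for q, s in zip(query, subject))
    return identities, identities + similar, len(query)


def percent(count, total):
    """Whole-number percentage, truncated as in the figure (30/57 is shown as 52%)."""
    return 100 * count // total


identities, positives, length = score_alignment("RRYRTAFT", "KRSRTAFS")
print(f"Identities = {identities}/{length} ({percent(identities, length)}%)")
print(f"Positives = {positives}/{length} ({percent(positives, length)}%)")
```

Applying `percent` to the counts in the figure gives the same whole-number percentages, for example 31 of 57 identities for zen.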
PROTEINS

Specific Proteins Can Be Purified from Cell Extracts
The purification of individual proteins is critical to understanding their function. Although in some instances the function of a protein can be studied in a complex mixture, these studies can often lead to ambiguities. For example, if you are studying the activity of one specific DNA polymerase in a crude mixture of proteins (such as a cell lysate), other DNA polymerases and accessory proteins may be partly or completely responsible for any DNA synthesis activity that you observe. For this reason, the purification of proteins is a major part of understanding their function. Each protein has unique properties that make its purification somewhat different. This is in contrast to different DNAs, which all share the same helical structure and are only distinguished by their precise sequence. The purification of a protein is designed to exploit its unique characteristics, including size, charge, shape, and in many instances, function.
Preparation of Cell Extracts Containing Active Proteins
The starting material for almost all protein purifications is an extract derived from cells. Unlike DNA, which is very resilient to temperature, proteins are readily denatured by even moderate temperatures once they are released from a cell. For this reason, most extract preparation and protein purification is performed at 4 °C. Cell extracts are prepared in a number of different ways. Cells can be lysed by detergent, shearing forces, treatment with low ionic salt (which causes cells to osmotically absorb water and pop easily), or rapid changes in pressure. In each case, the goal is to weaken and break the membrane surrounding the cell to allow proteins to escape. In some instances this is performed at very low temperatures by freezing the cells prior to applying shearing forces (typically, using a blender similar to the one in many kitchens).
Proteins Can Be Separated from One Another Using Column Chromatography
The most common method for protein purification is column chromatography. In this approach, protein fractions are passed through glass columns filled with appropriately modified small acrylamide or agarose beads. There are various ways columns can be used to separate proteins, and each technique separates proteins on the basis of a different property. Three basic approaches are described here. The first two, in this section, separate proteins on the basis of their charge or size, respectively. These methods are summarized in Figure 20-22.
Purification of a Protein Requires a Specific Assay
To purify a protein requires that you have an assay that is unique to that protein. For the purification of a DNA, the same assay is almost always used: hybridization to its complement. As you will learn in the discussion of immunoblotting, an antibody can be used to detect specific proteins in the same way. In many instances, it is more convenient to use a more direct measure for the function of the protein. For example, a specific DNA-binding protein can be assayed by determining its interaction with the appropriate DNA (for example, using a gel shift assay; see Chapter 16). Similarly, a DNA or RNA polymerase can be assayed by adding the appropriate template and radioactive nucleotide precursor to a crude extract in a manner similar to the methods used to label DNA described above. This type of assay is called an incorporation assay. Incorporation assays are useful for monitoring the purification and function of many different enzymes catalyzing the synthesis of polymers like DNA, RNA, or proteins.
FIGURE 20-22 Ion exchange and gel filtration chromatography. As described in the text, these two commonly used forms of chromatography separate proteins on the basis of their charge and size, respectively. Thus, in each case, a glass tube is packed with beads, and the protein mixture is passed through this matrix. The nature of the beads dictates the basis of protein separation. (a) The beads are negatively charged. Thus, positively-charged proteins bind to them and are retained on the column, while negatively-charged proteins pass through. (b) The beads contain aqueous spaces into which small proteins can pass, slowing down their progress through the column. Larger proteins cannot enter the beads and so pass freely through the column.
Ion exchange chromatography In this technique, the proteins are separated by their surface ionic charge using beads that are modified with either positively-charged or negatively-charged chemical groups. Proteins that interact weakly with the beads (such as a weakly positively-charged protein passed over beads modified with a negatively-charged group) are released from the beads (or eluted) in a low salt buffer. Proteins that interact more strongly require more salt to be eluted (the salt masks the charged regions, allowing the protein to be released from the beads). By gradually increasing the concentration of salt in the eluting buffer, even proteins with rather similar charge characteristics can be separated into different fractions as they elute from the column.
Gel filtration chromatography This technique separates proteins on the basis of size and shape. The beads used for this type of chromatography do not have charged moieties attached, but instead have a variety of different sized pores throughout. Small proteins can enter all the pores and, therefore, can access more of the column and take longer to elute (in other words, they have more space to explore). Large proteins can access less of the column and elute more rapidly.
For each type of column chromatography, fractions are collected at different salt concentrations or elution times and assayed for the protein of interest. The fractions with the most activity are pooled and subjected to additional purification. By passing proteins through a number of different columns, they are increasingly purified. Although it is rare that an individual column will purify a protein to homogeneity, by repeatedly separating fractions that contain the protein of interest (as determined by the assay for the protein), a series of chromatographic steps can result in a fraction that contains many molecules of a specific protein. For example, although there are many proteins that elute in high salt from a positively-charged column (indicating a high negative charge) or slowly from a gel filtration column (indicating a relatively small size), there will be far fewer that satisfy both of these criteria.
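The pooling of active fractions can be illustrated with a short sketch. The Python below is illustrative only: the function names, the cutoff of 50% of the peak activity, and the example numbers are assumptions introduced here rather than part of any standard protocol. Specific activity (activity units per milligram of total protein) is a conventional bookkeeping measure, not described in the text above, for tracking how much purer the target protein becomes from one column to the next.

```python
def pool_active_fractions(activity_by_fraction, cutoff=0.5):
    """Return the fraction numbers whose assay activity is at least
    `cutoff` times the activity of the peak fraction (hypothetical rule)."""
    peak = max(activity_by_fraction.values())
    return sorted(
        fraction
        for fraction, activity in activity_by_fraction.items()
        if activity >= cutoff * peak
    )


def specific_activity(total_activity_units, total_protein_mg):
    """Activity units per milligram of total protein in a sample."""
    return total_activity_units / total_protein_mg


def fold_purification(step_specific_activity, crude_specific_activity):
    """How many times purer the sample is than the crude extract."""
    return step_specific_activity / crude_specific_activity


# Made-up assay readings for six fractions eluted from one column.
activity = {1: 0, 2: 5, 3: 40, 4: 100, 5: 60, 6: 10}
pooled = pool_active_fractions(activity)  # fractions 4 and 5

# Made-up totals: the crude extract versus the pooled fractions.
crude = specific_activity(1000, 200)   # 5 units/mg
after_column = specific_activity(500, 2)  # 250 units/mg
enrichment = fold_purification(after_column, crude)  # 50-fold
```

In this sketch some total activity is lost at the column step, yet the specific activity rises sharply because most of the unrelated protein has been discarded. Repeating the calculation after each column shows whether a given step is worth keeping in the purification scheme.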
Affinity Chromatography Can Facilitate More Rapid Protein Purification Specific knowledge of a protein can frequently be exploited to purify a protein more rapidly. For example, if you know that a protein binds ATP during its function, the protein can be applied to a column of beads that are coupled to ATP. Only proteins that bind to ATP will bind to the column, allowing the large majority of proteins that do not bind ATP to pass through the column. This approach to purification is called affinity chromatography. Other reagents can be attached to columns to allow the rapid purification of proteins; these include specific DNA sequences (to purify DNA-binding proteins) or even specific proteins that are suspected to interact with the protein to be purified. Thus, before beginning a purification, it is important to think about what information is known about the target protein and to try to exploit this knowledge.
One very common form of protein affinity chromatography is immunoaffinity chromatography. In this approach, an antibody that is specific for the target protein is attached to beads. Ideally, this antibody will interact only with the intended target protein and allow all other proteins to pass through the beads. The bound protein can then be eluted from the column using salt or, in some cases, mild detergent. The primary difficulty with this approach is that frequently the antibody binds the target protein so tightly that the protein must be completely denatured before it can be eluted. Because protein denaturation is often irreversible, the target protein obtained in this manner may be inactive and therefore less useful.
Proteins can be modified to facilitate their purification. This modification usually involves adding short additional amino acid sequences to the beginning (N-terminus) or the end (C-terminus) of a target protein. These additions, or "tags," can be generated using molecular cloning methods. The peptide tags add known properties to the modified proteins that assist in their purification. For example, adding six histidine residues in a row to the beginning or end of a protein will make the modified protein bind tightly to a column with immobilized Ni²⁺ ions attached to beads, a property that is uncommon among proteins in general. In addition, specific epitopes (a sequence of 7-10 amino acids recognized by an antibody) have been defined that can be attached to any protein. This procedure allows the modified protein to be purified using immunoaffinity purification and a heterologous antibody that is specific for the added epitope. Importantly, such antibodies and epitopes can be chosen such that they bind with high affinity under one condition (for example, in the absence of Ca²⁺) but readily elute under a second condition (such as the addition of low amounts of Ca²⁺). This avoids the need to use denaturing conditions for elution.
Immunoaffinity chromatography can also be used to rapidly precipitate a specific protein (and any proteins tightly associated with it) from a crude extract. In this case, precipitation is achieved by attaching the antibody to the same type of bead used in column chromatography. Because these beads are relatively large, they rapidly sink to the bottom of a test tube along with the antibody and any proteins bound to the antibody. This process, called immunoprecipitation, is used to rapidly purify proteins or protein complexes from crude extracts. Although the protein is rarely completely pure at this point, this is often a useful method to determine what proteins or other molecules (for example, DNA; see the section on Chromatin Immunoprecipitation in Chapter 17) are associated with the target protein.
Separation of Proteins on Polyacrylamide Gels Proteins have neither a uniform negative charge nor a uniform secondary structure. Rather, they are constructed from 20 distinct amino acids, some of which are uncharged, some positively charged, and still others negatively charged (Figure 5-4). Also, as we discussed in Chapter 5, proteins have extensive secondary and tertiary structures and are often in multimeric complexes (quaternary structure). If, however, a protein is treated with the strong ionic detergent sodium dodecyl sulphate (SDS) and a reducing agent, such as mercaptoethanol, the secondary, tertiary, and quaternary structure is usually eliminated. Once coated with SDS, the protein behaves as an unstructured polymer. SDS ions coat the polypeptide chain and thereby impart on it a uniform negative charge. Mercaptoethanol reduces disulphide bonds and thereby disrupts intramolecular and intermolecular disulphide bridges formed between cysteine residues. Thus, as is the case with mixtures of DNA and RNA, electrophoresis in the presence of SDS can be used to resolve mixtures of proteins according to the length of individual polypeptide chains. After electrophoresis, the proteins can be visualized with a stain, such as Coomassie brilliant blue, that binds to protein. When the SDS is omitted, electrophoresis can be used to separate proteins according to properties other than molecular weight, such as net charge and isoelectric point (see below).
Antibodies Visualize Electrophoretically-Separated Proteins Proteins are, of course, quite different from DNA and RNA, but the procedure known as immunoblotting, by which an individual protein is visualized amidst thousands of other proteins, is analogous in concept to Southern and northern blot hybridization. In immunoblotting, electrophoretically separated proteins are transferred and bound to a filter. The filter is then incubated in a solution of an antibody that had been raised against an individual purified protein of interest. The antibody finds the corresponding protein on the filter, to which it avidly binds. Finally, a chromogenic enzyme is used to visualize the filter-bound antibody. Southern, northern, and immunoblotting have in common the use of selective reagents to visualize particular molecules in complex mixtures.
Protein Molecules Can Be Directly Sequenced Although more complex than the sequencing of nucleic acids, protein molecules can also be sequenced: that is, the linear order of amino acids in a protein chain can be directly determined. There are two widely used methods for determining protein sequence: Edman degradation using an automated protein sequencer and tandem mass spectrometry. The ability to determine a protein's sequence is very valuable for protein identification. Furthermore, because of the vast resource of complete or nearly complete genome sequences, the determination of even a small stretch of protein sequence is often sufficient to identify the gene that encoded that protein by finding a matching open reading frame.
Edman degradation is a chemical reaction in which the amino acid residues are sequentially released from the N-terminus of a polypeptide chain (Figure 20-23). One key feature of this method is that the N-terminal-most amino acid in a chain can be specifically modified by a chemical reagent called phenylisothiocyanate (PITC), which modifies the free α-amino group. This derivatized amino acid is then cleaved off the polypeptide by treatment with acid under conditions that do not destroy the remaining protein. The identity of the released amino acid derivative can be easily determined by its elution profile using a column chromatography method called High Performance Liquid Chromatography (HPLC) (each of the amino acids has a characteristic retention time). Each round of peptide cleavage regenerates a normal N-terminus with a free α-amino group. Thus, Edman degradation can be repeated for numerous cycles, and thereby reveal the sequence of the N-terminal segment of the protein. In practice, 8 to 15 cycles of degradation are commonly performed for protein identification. This number of cycles is nearly always sufficient to uniquely identify an individual protein.
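The cyclic logic of Edman degradation, and the way a short N-terminal sequence is used to find the matching open reading frame, can be mimicked in a few lines of code. The following Python is a toy model only: it treats a peptide as a string of one-letter amino acid codes, ignores all of the chemistry, and uses made-up sequences and hypothetical names (`edman_cycles`, `find_matching_orfs`, `orfA`, `orfB`) introduced here for illustration.

```python
def edman_cycles(peptide, n_cycles):
    """Simulate repeated Edman cycles on a peptide string.

    Each cycle releases the current N-terminal residue and leaves a
    peptide that is one residue shorter, with a new free N-terminus.
    Returns (residues read in order, peptide remaining).
    """
    released = []
    remaining = peptide
    for _ in range(min(n_cycles, len(peptide))):
        released.append(remaining[0])   # residue identified this cycle
        remaining = remaining[1:]       # shortened peptide for next cycle
    return "".join(released), remaining


def find_matching_orfs(sequence_tag, predicted_proteins):
    """Return names of predicted proteins that contain the sequence tag.

    A substring search is used so that tags from internal peptides
    (for example, after protease digestion) are also found.
    """
    return [
        name
        for name, protein in predicted_proteins.items()
        if sequence_tag in protein
    ]


# Made-up protein sequences standing in for genome-predicted ORF products.
predicted = {
    "orfA": "MKTAYIAKQRQISFVK",
    "orfB": "MSTNPKPQRKTKRNTN",
}

tag, rest = edman_cycles("MKTAYIAKQRQISFVK", 8)  # tag == "MKTAYIAK"
hits = find_matching_orfs(tag, predicted)        # ["orfA"]
```

With only two candidate proteins a very short tag is already unique; against a whole genome's worth of predicted proteins, longer tags are needed, which is consistent with the 8 to 15 cycles quoted above.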
N-terminal sequencing by automated Edman degradation is a widespread and robust technique. Problems arise, however, when the N-terminus of a protein is chemically modified (for example, by formyl or acetyl groups). Such blockage may occur in vivo or during the process of protein isolation. When a protein is N-terminally blocked, it can usually be sequenced after digestion with a protease to reveal an internal region for sequencing. Tandem mass spectrometry (MS/MS) can also be used to determine regions of protein sequence. Mass spectrometry is a method in which the mass of very small samples of a material can be determined with great accuracy. Very briefly, the principle is that material travels through the instrument (in a vacuum) in a manner that is sensitive to
FIGURE 20-23 Protein sequencing by Edman degradation. The N-terminal residue is labeled and can be removed without hydrolyzing the rest of the peptide. Thus, in each round, one residue is identified, and that residue represents the next one in the sequence of the peptide.
its mass/charge ratio. For small biological macromolecules such as peptides and small proteins, the mass of a molecule can be determined with the accuracy of a single Dalton. To use MS/MS to determine protein sequence, the protein of interest is usually digested into short peptides (often less than 20 amino acids) by digestion with a specific protease such as trypsin. This mixture of peptides is subjected to mass spectrometry, and each individual peptide will be separated from the others in the mixture by its mass/charge ratio. The individual peptides are then captured and fragmented into all the component peptides, and the mass of each of these component fragments is then determined (Figure 20-24). Deconvolution of these data reveals an unambiguous sequence of the initial peptide. As with Edman degradation, the sequence of a single approximately 15 amino acid peptide from a protein is nearly always sufficient to identify the protein by comparison of the sequence to that predicted from DNA sequences. MS/MS has revolutionized protein sequencing and identification. Only very small amounts of material are needed, and complex mixtures of proteins can be simultaneously analyzed.
FIGURE 20-24 Analysis of the proteome by 2D electrophoresis and mass spectrometry. (a) Example of proteins from a cell extract separated by 2D gel electrophoresis. Note that in this example, only proteins with a small range of isoelectric points (between 5 and 5.5) are analyzed (here separated left to right). IEF stands for isoelectric focusing. The vertical direction separates proteins by their SDS-
Proteomics The availability of whole genome sequences in combination with analytic methods for protein separation and identification has ushered in the field of proteomics. Proteomics is concerned with the identification of the full set of proteins produced by a cell or tissue under a particular set of conditions, their relative abundance, and their interacting partner proteins. Whereas microarray analysis (see Chapter 18) makes it possible to visualize gene transcription on a genome-wide basis, the tools of proteomics provide a snapshot of the cell's full repertoire of proteins. Proteomics is based on three principal methods: two-dimensional gel electrophoresis for protein separation, mass spectrometry for the precise determination of the molecular weight and identity of a protein (or peptides generated from the protein), and bioinformatics for assigning proteins and peptides to the predicted products of protein-coding sequences in the genome. A single cell often produces thousands of different proteins, far too many to separate and identify by SDS gel electrophoresis alone. As its name implies, two-dimensional gel electrophoresis separates proteins in two dimensions and does so in successive steps. In the first step, the proteins are fractionated according to their isoelectric point by isoelectric focusing. During isoelectric focusing, a gradient of pH is generated in a gel. The isoelectric point is the pH at which a protein exhibits no net charge and hence becomes stationary (focuses) in the pH gradient. In the second step, the proteins are separated according to size by SDS gel electrophoresis as described above. Because proteins are separated on the basis of two properties (isoelectric point and molecular weight), thousands of different proteins can be resolved from each other in a single experiment. After fractionation
by two-dimensional gel electrophoresis, each protein is separately subjected to mass spectrometry in order to determine its exact molecular weight. As discussed above, it is generally more effective to first treat the protein with a protease and then determine the molecular weight of the resulting proteolytic fragments, rather than the intact protein itself. MS/MS analysis also allows the precise sequence of the polypeptide fragments of each protein to be identified. Finally, given a complete genome sequence for the organism under study and these peptide sequences from the proteins of interest, the tools of bioinformatics make it possible to assign each protein (that is, its proteolytic fragments) to a particular protein-coding sequence (gene) in the genome.
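The bioinformatic assignment step can be sketched in simplified form. The Python below digests predicted protein sequences in silico with a commonly used approximation of trypsin specificity (cleavage after lysine or arginine, except when the next residue is proline), computes the mass of each resulting peptide from standard monoisotopic residue masses, and looks for predicted peptides whose mass matches an observed value. This is a deliberate simplification: it matches whole-peptide masses only and does not model the fragmentation and deconvolution performed in MS/MS. The function names, the 0.01 Da tolerance, and the example sequence are assumptions made for illustration.

```python
# Standard monoisotopic residue masses (Da) for the 20 amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water per peptide (free N- and C-termini)


def tryptic_digest(protein):
    """Split a protein after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def match_mass(observed_mass, predicted_proteins, tolerance=0.01):
    """List (protein name, peptide) pairs whose mass fits the observation."""
    hits = []
    for name, protein in predicted_proteins.items():
        for peptide in tryptic_digest(protein):
            if abs(peptide_mass(peptide) - observed_mass) <= tolerance:
                hits.append((name, peptide))
    return hits


# Made-up example: one predicted protein and one "observed" peptide mass.
predicted = {"orfA": "AKGRPLKM"}
peptides = tryptic_digest(predicted["orfA"])  # ["AK", "GRPLK", "M"]
hits = match_mass(peptide_mass("GRPLK"), predicted)
```

Note that leucine and isoleucine have identical masses, so mass alone cannot distinguish them; this is one reason a match against genome-predicted sequences is so useful.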
CHAPTER
Model Organisms
A well-known adage in molecular biology is that fundamental problems are most easily solved in the simplest and most accessible system in which the problem can be addressed. For this reason, over the years molecular biologists have focused their attention on a relatively small number of so-called model organisms. Among the most important of these, in order of increasing complexity, are: Escherichia coli and its phage, the T phage and phage λ; baker's yeast Saccharomyces cerevisiae; the nematode Caenorhabditis elegans; the fruit fly Drosophila melanogaster; and the house mouse Mus musculus.
What is it that model systems have in common? An important feature of all model systems is the availability of powerful tools of traditional and molecular genetics, making it possible to manipulate and study the organism genetically. Second is that the study of each model system attracted a critical mass of investigators. This meant that ideas, methods, tools, and strains could be shared among scientists investigating the same organism, facilitating rapid progress. For example, beginning in the 1940s a circle of scientists gathered around Max Delbrück, Salvador Luria, and Alfred D. Hershey, spending the summers at the Cold Spring Harbor Laboratories in New York studying the multiplication of the T phage of E. coli. This group, called the Phage Group, was among those who were important in establishing the field of molecular biology. Many of the members of the Phage Group were physicists attracted to phage not only because of their relative simplicity, but because the large numbers of phage that could be studied in each experiment generated results that were quantitative and statistically significant. By the late 1950s Cold Spring Harbor offered an annual phage course, where ever-growing numbers of investigators came to learn the new system. This was a case where focusing on the same model organism guaranteed faster progress than would have been made if these individuals had studied many different organisms.
The choice of a model organism depends on what question is being asked. When studying fundamental issues of molecular biology, it is often convenient to study simpler unicellular organisms or viruses. These organisms can be grown rapidly and in large quantities and typically allow genetic and biochemical approaches to be combined. Other questions, for example those concerning development, can often only be addressed using more complicated model organisms. Thus, the T phage (and its best-known member, T4, in particular) proved to be an ideal system for tackling fundamental aspects of the nature of the gene and information transfer. Meanwhile, yeast, with its powerful mating system for genetic analysis, became the premier system for elucidating fundamental aspects of the eukaryotic cell.
OUTLINE
• Bacteriophage
• Bacteria
• Baker's Yeast, Saccharomyces cerevisiae
• The Nematode Worm, Caenorhabditis elegans
• The Fruit Fly, Drosophila melanogaster
• The House Mouse, Mus musculus
Evolutionary conservation from fungi to higher cells has meant that discoveries made in yeast frequently hold true for humans. The nematode and the fruit fly also offer well-developed genetic systems for tackling problems that cannot be effectively addressed in lower organisms, such as development and behavior. Finally, the mouse, though less facile to study than nematodes and fruit flies, is a mammal and hence the best model system for gaining insights into human biology and human disease.
In this chapter we will describe some of the most commonly studied experimental organisms and present the principal features and advantages of each as a model system. We shall also consider the kind of experimental tools that are available for studying each organism and some of the biological problems that have been studied in each case. This chapter is not intended as a comprehensive presentation of all the model organisms that have had an important impact in molecular biology. For example, not included here is the mustard Arabidopsis thaliana, which has emerged as a powerful model organism for understanding the molecular biology of plants.
BACTERIOPHAGE Bacteriophage (and viruses in general) offer the simplest system to examine the basic processes of life. Their genomes, typically small, are replicated, and the genes they encode expressed, only after being injected into a host cell (in the case of phage, a bacterial cell). The genome can also undergo recombination during these infections. Because of the relative simplicity of the system, phage were used extensively in the early days of molecular biology; indeed, they were vital to the development of that field. Even today they remain a system of choice when studying the basic mechanisms of DNA replication, gene expression, and recombination. In addition, they have been important as vectors in recombinant DNA technology (Chapter 20) and are used in assays for assessing the mutagenic activity of various compounds.
Phage typically consist of a genome (DNA or RNA, most commonly the former) packaged in a coat of protein subunits, some of which form a head structure (in which the genome is stored) and some a tail structure. The tail attaches the phage particle to the outside of a bacterial host cell, allowing the genome of the phage to be passed into that cell. There is specificity here: each phage attaches to a specific cell surface molecule (usually a protein), and so only cells bearing that "receptor" can be infected by a given phage.
Phage come in two basic types: lytic and temperate. The former, examples of which include the T phage, grow only lytically. That is, as shown in Figure 21-1, when the phage infects a bacterial cell, its DNA is replicated to produce multiple copies of its genome (anything up to several hundred copies) and expresses genes that encode new coat proteins. These events are highly coordinated to ensure new phage particles are constructed before the host cell is lysed to release them. The progeny phage are then free to infect further host cells. Temperate phage (such as phage λ) can also replicate lytically. But they can adopt an alternative developmental pathway called lysogeny (Figure 21-2). In lysogeny, instead of being replicated, the phage genome is integrated into the bacterial genome, and the coat protein genes are not expressed. In this integrated, repressed state the phage is called a prophage. The prophage is replicated passively as part of the
FIGURE 21-1 The lytic growth cycle of a bacteriophage. The phage particle sticks to the outer surface of a suitable bacterial host cell (one bearing the appropriate receptor) and injects its genome, usually a DNA molecule. That DNA is replicated, and the genes expressed to produce many new phage. Once the progeny phage are assembled into mature particles, the bacterial cell is lysed, and the progeny released to infect another host cell.
bacterial chromosome at cell division, and so both daughter cells are lysogens. The lysogenic state can be maintained in this way for many generations but is also poised to switch to lytic growth at any time. This switch from the lysogenic to lytic pathway, called induction, involves excision of the prophage DNA from the bacterial genome,
FIGURE 21-2 The lysogenic cycle of a bacteriophage. The initial steps of infection are the same as seen in the lytic case (see Figure 21-1). But once the DNA has entered the cell, it is integrated into the bacterial chromosome where it is passively replicated as part of that genome. Also, the genes encoding the coat proteins are kept switched off. The integrated phage is called a prophage. The lysogen can be stably maintained for many generations, but can also switch to the lytic cycle efficiently under appropriate circumstances. See Chapter 16 for a fuller description.
replication, and the activation of genes needed to make coat proteins and to regulate lytic growth (shown in Figure 16-24).
Assays of Phage Growth For bacteriophage to be useful as an experimental system, methods are needed to propagate and quantify phage. Propagation is needed to generate material (high titer phage stocks) for use in experiments, or for DNA extraction. Phage are typically propagated by growth on a suitable bacterial host in liquid culture. Thus, for example, a vigorously growing flask of bacterial cells can be infected with phage. After a suitable time, the cells lyse, leaving a clear liquid suspension of phage particles.
To quantify the numbers of phage particles in a solution, a plaque assay is used (Figure 21-3). This is done as follows: phage are mixed with, and adsorb to, bacterial cells into which they inject their DNA. The mix is then diluted, and those dilutions are added to "soft agar," which contains many more (and uninfected) bacterial cells. These mixtures are poured onto a hard agar base in a petri dish, where the soft agar sets to form a jelly-like top layer in which the bacterial cells are suspended; some are infected, but most are not. The plates are then incubated for several hours to allow bacterial growth and phage infection to take their course. Each infected cell (from the original mix) will lyse during subsequent incubation in the soft agar. The consistency of the agar allows the progeny phage to diffuse, but not far, so they infect only bacterial cells growing in the immediate vicinity. Those cells, in turn, lyse, releasing more progeny, which again infect local cells, and so on. The result of multiple rounds of infection is formation of a plaque, a circular clearing in the otherwise opaque lawn of densely grown uninfected bacterial cells. This is because the uninfected bacterial cells grow into a dense population within the soft agar, while those bacterial cells located in areas around each initial infection are killed off, leaving a clear patch. Knowing the number of plaques on a given plate, and the extent to which the original stock was diluted before plating, makes it trivial to calculate the number of phage in that original stock.
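The arithmetic behind that last step is simple enough to write down. The Python sketch below assumes that each plaque arises from a single infectious phage particle, so the result is expressed in plaque-forming units (pfu) per milliliter, the conventional unit for phage titer. The function name, the parameter names, and the example numbers are hypothetical and were chosen only for illustration.

```python
def phage_titer(plaque_count, dilution_factor, volume_plated_ml):
    """Estimate the titer of the undiluted stock in pfu per milliliter.

    plaque_count      number of plaques counted on the plate
    dilution_factor   total fold-dilution of the stock before plating
                      (for example, 1e6 for a one-in-a-million dilution)
    volume_plated_ml  volume of the diluted phage added to the plate
    """
    return plaque_count * dilution_factor / volume_plated_ml


# Made-up example: 150 plaques from 0.1 ml of a million-fold dilution.
titer = phage_titer(150, 1e6, 0.1)  # about 1.5e9 pfu/ml
```

In practice, a series of dilutions is usually plated and the count is taken from a plate on which the plaques are numerous enough to be statistically meaningful but still well separated.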
FIGURE 21-3 Plaques formed by phage infection of a lawn of bacterial cells. In the case shown, the plaques are produced by a lytic T-phage. (Source: Stent G.S. Molecular biology of bacterial viruses, p. 41.)
The Single-Step Growth Curve
This classic experiment revealed the life cycle of a typical lytic phage and paved the way for many subsequent experiments that examined that life cycle in detail. The essential feature of this procedure is the synchronous infection of a population of bacteria and the elimination of any re-infection by the progeny. This allows the progress of a single round of infection to be followed (Figure 21-4). Phage are mixed with bacterial cells for 10 minutes. This is long enough for phage to adsorb to bacterial cells, but it is too short for infection to progress much further. This mixture is then diluted (with fresh growth media) by a factor of 10,000. This dilution ensures that only those cells that bound phage in the initial incubation will contribute to the infected population; also, it ensures that progeny phage produced from those infections will not find host cells to infect. The diluted population of infected cells is then incubated to allow infection to proceed. At intervals, a sample can be removed from the mixture and the number of free phage counted using a plaque assay. Initially that number is very low (comprising just the phage from the initial infection that did not infect a cell before being diluted). Once sufficient time has elapsed for infected cells to lyse and release their progeny, a big increase in the number of free phage is detected. (This takes about 30 minutes for the lytic phage T4.) The time lapse between infection and release of progeny is called the latent period.
Phage Crosses and Complementation Tests Being able to count the number of phage within a population allows researchers to measure whether a given phage derivative can grow on a given bacterial host cell (and the efficiency with which it does so, for example, the burst size). Also, the plate assay allows certain types of phage derivatives to be distinguished because of the different plaque morphologies they produce. Differences in host range and plaque morphologies were very often the result of genetic differences between otherwise identical phage. In the early days of molecular biology, this provided genetic markers in a system in which they could be analyzed, enabling researchers to ask how genetic information is encoded and functions.
The ability to perform mixed infections, in which a single cell is infected with two phage particles at once, makes genetic analysis possible in two ways. First, it allows one to perform phage crosses. Thus, if two different mutants of the same phage (and thus harboring homologous chromosomes) co-infect a cell, recombination, and thus genetic exchange, can occur between the genomes. The frequency of this genetic exchange can be used to order genes on the genome. A high recombination frequency indicates that the mutations are relatively far apart, whereas a low frequency indicates that the mutations are located close to each other. The large numbers of phage particles that can be used in such experiments ensures that even very rare events will occur (recombination between two very closely positioned mutations) as long as there is a way to screen for, or better still, select for, the rare event. Second, co-infection also allows one to assign mutations to complementation groups; that is, one can identify when two or more mutations are in the same or in different genes. Thus, if
FIGURE 21-4 The single-step growth curve. As described in the text, the single-step growth curve reveals the length of time it takes a phage to undergo one round of lytic growth, and also the number of progeny phage produced per infected cell. These are the latent period and burst size, respectively.
two different mutant phage are used to co-infect the same cell and as a result each provides the function that the other was lacking, the two mutations must be in different genes (complementation groups). If, on the other hand, the two mutants fail to complement each other, then that can be taken as evidence that the two mutations are likely located in the same gene.
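The logic of the complementation test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mutant names and their gene assignments are hypothetical, and the `complements` function stands in for the outcome of a real co-infection experiment. Mutants that fail to complement one another are merged into the same group.

```python
from itertools import combinations

def complementation_groups(mutants, complements):
    """Group mutants so that any pair that fails to complement shares a group."""
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the representative of m's group.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(mutants, 2):
        if not complements(a, b):
            # No complementation: treat both mutations as lying in the same gene.
            parent[find(a)] = find(b)

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical data: which gene each mutation actually lies in.
GENE_OF = {"m1": "geneA", "m2": "geneA", "m3": "geneB", "m4": "geneC", "m5": "geneC"}

def complements(a, b):
    # Stand-in for the experiment: co-infection succeeds only if the
    # two mutations are in different genes.
    return GENE_OF[a] != GENE_OF[b]

groups = complementation_groups(sorted(GENE_OF), complements)
print(groups)
```

In a real experiment the pairwise results come from plating co-infected cells; the grouping step itself is the same.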
Transduction and Recombinant DNA
Phage crosses and complementation tests allow the genetics of the phage themselves to be analyzed. These same vehicles and techniques can, however, also be used to investigate the genetics of other systems. Initially these observations were restricted to bacterial genes inadvertently picked up during an infection (as we describe below). With the advent of recombinant DNA techniques in the 1970s, however, these studies were extended to DNA from any organism.
During infection, a phage might occasionally (and accidentally) pick up a piece of bacterial DNA. The most common way in which a phage picks up a section of the host DNA is when a prophage excises from the bacterial chromosome during induction of a lysogen. That process involves a site-specific recombination event (see Chapter 11), and if that event occurs at slightly the wrong position, phage DNA is lost and bacterial DNA included. As long as that exchange does not eliminate part of the phage genome required for propagation, the resulting recombinant phage can still grow and can be used to transfer the bacterial DNA from one bacterial host to another. This process is known as specialized transduction. The bacterial DNA included in the specialized transducing phage is amenable to the same kind of genetic analysis as is possible for the phage itself.
Because of its ability to promote specialized transduction, it was natural that phage λ was chosen as one of the original cloning vectors (Chapter 20). Thus, by eliminating many of the sites for a particular restriction enzyme, and leaving only one (insertion vector) or two (replacement vector) in a region of the phage not essential for lytic growth, λ can be made to accept the insertion (in vitro) of DNA from any source.
That DNA can be propagated and analyzed much more easily than it could in its organism of origin. The restriction endonuclease sites in λ were eliminated by repeatedly selecting phage that plated with higher and higher efficiencies on strains expressing the restriction system in question. By enriching for resistance to endonuclease in this way, and then, in vitro, mapping which sites were lost and which retained, the desired derivative was identified.
Many different λ vectors were developed, all differing in the restriction sites used and in how recombinant phage could be identified. One selection system worked as follows: a λ derivative was constructed in which a solitary restriction site was retained within the cI gene, the gene that encodes the repressor (see Chapter 16). In the parent vector, therefore, this gene is intact and the phage can, if it chooses, form a lysogen; the phage, therefore, forms turbid plaques. When a piece of DNA is inserted at this site, however, the resulting recombinant phage has a disrupted cI gene, cannot form lysogens, and so forms only clear plaques. This change in plaque morphology provides an easy way of distinguishing recombinant from nonrecombinant phage. Moreover, this approach can be made into a selection (rather than a screen) if
the bacterial strain used is an hfl strain (see Box 16-5 in Chapter 16). On that strain, any phage that can form a lysogen invariably does so. Thus, only recombinant phage produce plaques on the hfl strain.
BACTERIA
The attraction of bacteria such as E. coli or B. subtilis as experimental systems is that they are relatively simple cells and can be grown and manipulated with comparative ease. Bacteria are single-celled organisms in which all of the machinery for DNA, RNA, and protein synthesis is contained in the same cellular compartment (bacteria have no nucleus). Bacteria usually have a single chromosome, typically much smaller than the genome of higher organisms. Also, bacteria have a short generation time (the cell cycle can be as short as 20 minutes) and a genetically homogeneous population of cells (a clone) can easily be generated from a single cell. Finally, bacteria are convenient to study genetically because, on the one hand, they are haploid (which means that the phenotypes of mutations, even recessive mutations, manifest readily), and, on the other hand, because genetic material can be conveniently exchanged between bacteria.
Molecular biology owes its origin to experiments with bacterial and phage model systems. Up until the famous fluctuation analysis experiments of Salvador Luria and Max Delbrück in 1943, the study of bacteria (bacteriology) had remained largely outside the realm of traditional genetics. Taking a statistical approach, Luria and Delbrück demonstrated that bacteria can undergo a change in which they become resistant to infection by a particular phage. Critically, they showed that this change arises spontaneously, rather than as a response (adaptation) to the phage. Thus, like other organisms, bacteria can inherit traits (for example, sensitivity or resistance to a phage), and occasionally this inheritance can undergo a spontaneous change (mutation) to an alternative inheritable state. The experiments of Luria and Delbrück showed that, like other organisms, bacteria exhibit genetically determined characteristics. But because of their simplicity, bacteria would be ideal experimental systems in which to elucidate the nature of the genetic material and the trait-determining factors (genes) of Gregor Mendel.
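The statistical reasoning behind the fluctuation analysis can be illustrated with a small simulation. The sketch below is not a reconstruction of the original experiment; the mutation rate, response probability, number of generations, and number of cultures are all assumed values chosen for illustration. If resistance arises spontaneously during growth, an early mutation founds a large resistant clone, so the number of resistant cells varies enormously from culture to culture. If resistance arose only as a response to the phage, each cell would respond independently and the counts would be close to Poisson, with a variance-to-mean ratio near 1.

```python
import random
import statistics

def spontaneous_culture(generations, mutation_rate, rng):
    """Resistant cells in one culture when mutations arise at random during growth."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        # At each division a sensitive cell yields one resistant daughter
        # with a small probability (assumed rate).
        new_mutants = sum(rng.random() < mutation_rate for _ in range(sensitive))
        sensitive = 2 * sensitive - new_mutants
        resistant = 2 * resistant + new_mutants
    return resistant

def induced_culture(generations, response_prob, rng):
    """Resistant cells if each cell becomes resistant only on exposure to phage."""
    final_cells = 2 ** generations
    return sum(rng.random() < response_prob for _ in range(final_cells))

def variance_to_mean(counts):
    """Ratio of sample variance to mean; about 1 for Poisson-like counts."""
    return statistics.variance(counts) / statistics.mean(counts)

rng = random.Random(1943)      # fixed seed so the run is repeatable
GENERATIONS, CULTURES = 12, 200  # assumed, illustrative sizes

spontaneous = [spontaneous_culture(GENERATIONS, 1e-3, rng) for _ in range(CULTURES)]
induced = [induced_culture(GENERATIONS, 6e-3, rng) for _ in range(CULTURES)]

print("spontaneous variance/mean:", round(variance_to_mean(spontaneous), 1))
print("induced variance/mean:", round(variance_to_mean(induced), 1))
```

A variance far in excess of the mean across parallel cultures is the signature that distinguishes spontaneous mutation from an adaptive response.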
Assays of Bacterial Growth
Bacteria can be grown in liquid or on solid (agar) medium. Bacterial cells are large enough (about 2 µm in length) to scatter light, allowing the growth of a bacterial culture to be monitored conveniently in liquid culture by the increase in optical density. Actively growing bacteria that are dividing with a constant generation time increase in numbers exponentially. They are said to be in the exponential phase of growth. As the population increases to high numbers of cells, the growth rate slows and bacteria enter the stationary phase (Figure 21-5).
The number of bacteria can be determined by diluting the culture and plating the cells on solid (agar) medium in a petri dish. Single cells grow into macroscopic colonies consisting of millions of cells within a relatively brief period of time. Knowing how many colonies are on the plate and how much the culture was diluted makes it possible to calculate the concentration of cells in the original culture.
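The two calculations described above are simple enough to write out. The following is a minimal sketch; the function names and the example numbers (150 colonies, a million-fold dilution, 0.1 ml plated, a 20-minute generation time) are illustrative assumptions rather than values from the text.

```python
def cells_per_ml(colonies, dilution_factor, volume_plated_ml):
    """Concentration of viable cells in the undiluted culture."""
    return colonies * dilution_factor / volume_plated_ml

def cells_after(initial_cells, minutes, generation_time_min):
    """Cell number after exponential growth with a constant generation time."""
    return initial_cells * 2 ** (minutes / generation_time_min)

# 150 colonies from 0.1 ml of a 10^6-fold dilution of the culture.
titer = cells_per_ml(colonies=150, dilution_factor=10**6, volume_plated_ml=0.1)

# One cell dividing every 20 minutes for one hour.
descendants = cells_after(initial_cells=1, minutes=60, generation_time_min=20)

print(f"{titer:.2e} cells per ml; {descendants:.0f} cells after 60 minutes")
```

The exponential formula applies only during exponential phase; it does not describe the lag or stationary phases.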
FIGURE 21-5 Bacterial growth curve. As described in the text, bacterial cells, such as E. coli, can grow very rapidly when not overcrowded and when propagated in well oxygenated rich medium. This phase of growth is called the exponential phase because the cells are replicating exponentially. Once the number of cells gets too high, and the culture becomes very dense, growth tails off into the so-called stationary phase. Cells taken from stationary phase and diluted to low density in fresh medium will again enter exponential phase growth, but only after a lag phase. The rate of cell number increase in each of these phases is shown.
FIGURE 21-6 The three forms of F-plasmid carrying cells. F+ cells harbor a single copy of the F-plasmid, which replicates as an independent mini-chromosome. In an Hfr strain, the F-plasmid is integrated into the bacterial chromosome and is replicated as part of that larger molecule. In an F' strain, an F-plasmid that had previously been integrated into the host chromosome excises, bringing with it a region of adjacent host DNA. All three cell types can be [...] transferred to a recipient F- cell. If the donor cell is an F+ strain, it copies and transfers just the F-plasmid; if an F', it copies [...] genome of the recipient cell. [Figure labels: chromosome, plasmid, bacterial cell, lac.]
Bacteria Exchange DNA by Sexual Conjugation, Phage-Mediated Transduction, and DNA-Mediated Transformation
A principal advantage of bacteria as a model system in molecular biology is the availability of facile systems for genetic exchange. Genetic exchange makes it possible to map mutations, to construct strains with multiple mutations, and to build partially diploid strains for distinguishing recessive from dominant mutations and for carrying out cis-trans analyses.
Bacteria often harbor autonomously replicating DNA elements known as plasmids (Figure 21-6). Some of these plasmids, such as the fertility plasmid of E. coli (known as the F-factor), are capable of transferring themselves from one cell to another. Thus, a cell harboring an F-factor (which is said to be F+) can transfer the plasmid to an F- cell. F-factor-mediated conjugation is a replicative process. Thus, the F+ cell transfers a copy of the F-factor, while still retaining a copy, such that the products of conjugation are two F+ cells.
Sometimes the F-factor integrates into the chromosome and as a consequence mobilizes conjugative transfer of the host chromosome to an F- cell. A strain harboring such an integrated F-factor is said to be an Hfr (for high frequency recombinant) strain and is enormously useful for carrying out genetic exchange. Precisely which parts of the host chromosome are transferred during any given example of this exchange varies for two reasons. First, different Hfr strains have the F-plasmid integrated at different locations within the host chromosome. Transfer of the host chromosome into the recipient cell takes place linearly, starting with that region of the chromosome closest to one end of the integrated F-plasmid. Thus, where the plasmid is integrated determines which part of the chromosome is transferred first. Second, it is rare that the entire chromosome gets transferred before mating is broken off. Thus, genes far from the transfer start point are transferred with low frequency, and distant genes may never get transferred in a given mating. Note that a complete copy of the integrated F-factor is transferred last, if at all.
A third and extremely important form of the F-factor is the F' plasmid. The F' is a fertility plasmid that contains a small segment of chromosomal DNA, which is transferred along with the plasmid from cell to cell with high frequency. For example, one such F' of historic importance is F'-lac, an F-factor that contains the lactose operon. F'-factors can be used to create partially diploid strains that have two copies of a particular region of the chromosome. This was precisely how Jacob and Monod created partially diploid strains for carrying out their cis-trans analyses of mutations in the lactose operon repressor gene and the operator site at which the repressor binds (see Box 16-3 in Chapter 16).
The F-factor can undergo conjugation only with other E. coli strains; however, certain other conjugative plasmids are promiscuous and can transfer DNA to a wide variety of unrelated strains, even to yeast. Such promiscuous conjugative plasmids provide a convenient means for introducing DNA, including DNA that has been modified by recombinant DNA technology, into bacterial strains that are otherwise lacking in their own systems of genetic exchange.
Yet another powerful tool for genetic exchange is phage-mediated transduction (Figure 21-7). Generalized transduction is mediated by phage that occasionally package a fragment of chromosomal DNA during maturation of the virus rather than viral DNA. When such a phage particle infects a cell, it introduces the segment of chromosomal DNA
from its previous host in place of infectious viral DNA. The injected chromosomal DNA can recombine with the chromosome of the infected host cell, effecting the permanent transfer of genetic information from one cell to another. This kind of transduction is called generalized transduction because any segment of host chromosomal DNA can be transferred from one cell to another. Depending on the size of the virion, some generalized transducing phages transduce only a few kilobases of chromosomal DNA, whereas others transduce well over 100 kb of DNA.
Another kind of phage-mediated transduction is called specialized transduction, as already mentioned. This process involves a lysogenic phage such as λ that has incorporated a segment of chromosomal DNA in place of a segment of phage DNA. Such a specialized transducing phage can, upon infection, transfer this bacterial DNA to a new bacterial host cell.
Finally, we come to the case of DNA-mediated transformation, which we described in Chapter 20. Certain experimentally important bacterial species (for example, B. subtilis but not E. coli) possess a natural system of genetic exchange that enables them to take up and incorporate linear, naked DNA (released or obtained from their siblings) into their own chromosome by recombination. Often the cells must be in a specialized state known as "genetic competence" to take up and incorporate DNA from their environment. Genetic competence is especially useful as it is possible to use recombinant DNA technology to modify a cloned segment of chromosomal DNA and then have it taken up and incorporated into the chromosomes of competent recipient cells.
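The linear, origin-dependent transfer from an Hfr donor described earlier in this section lends itself to a simple model. The sketch below treats the chromosome as a circular map of 100 minutes and assumes, purely for illustration, that one map minute is transferred per minute of mating; the gene names and positions are hypothetical. It shows how the site and orientation of the integrated F-factor fix the order of transfer, and how breaking off mating early excludes distant genes.

```python
MAP_LENGTH_MIN = 100  # circular map length in minutes (assumed convention)

def transfer_order(genes, origin_min, clockwise=True):
    """Return (gene, minutes after the start of transfer), earliest first."""
    entries = []
    for name, position in genes.items():
        if clockwise:
            delay = (position - origin_min) % MAP_LENGTH_MIN
        else:
            delay = (origin_min - position) % MAP_LENGTH_MIN
        entries.append((name, delay))
    return sorted(entries, key=lambda entry: entry[1])

def transferred_before(genes, origin_min, interrupt_min, clockwise=True):
    """Genes that have entered the recipient when mating is broken off."""
    return [name for name, delay in transfer_order(genes, origin_min, clockwise)
            if delay <= interrupt_min]

# Hypothetical map positions, in minutes.
GENES = {"geneA": 10, "geneB": 35, "geneC": 60, "geneD": 90}

print(transfer_order(GENES, origin_min=30))
print(transfer_order(GENES, origin_min=30, clockwise=False))
print(transferred_before(GENES, origin_min=30, interrupt_min=40))
```

Two Hfr strains with different integration sites or orientations give different transfer orders for the same set of genes, which is what makes a collection of Hfr strains useful for mapping.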
Bacterial Plasmids Can Be Used as Cloning Vectors
As we have seen, bacteria frequently harbor circular DNA elements known as plasmids that can replicate autonomously. Such plasmids can serve as convenient vectors for bacterial DNA as well as foreign DNA. Indeed, the initial (and successful) attempts to clone recombinant DNA involved a plasmid (pSC101) of E. coli that contains a unique restriction site for EcoRI into which DNA could be inserted without impairing the capacity of the plasmid to replicate (Chapter 20).
Transposons Can Be Used to Generate Insertional Mutations and Gene and Operon Fusions
As we discussed in Chapter 11, transposons are not only fascinating genetic elements in their own right but are enormously useful tools for carrying out molecular genetic manipulations in bacteria. For example, transposons that integrate into the chromosome with low sequence specificity (that is, with a high degree of randomness), such as Tn5 and Mu, can be used to generate a library of insertional mutations on a genome-wide basis (Figure 21-8). Such mutations have two important advantages over traditional mutations induced by chemical mutagenesis. One advantage is that the insertion of a transposon into a gene is more likely to result in complete inactivation (a null mutation) of the gene (when such is desired) than a simple nucleotide switch created by a mutagen. The second advantage is that, having inactivated the gene, the presence of the inserted DNA makes it easy to isolate and clone that gene. Even more simply, with the appropriate DNA primers, the identity of the inactivated gene can be
FIGURE 21-7 Phage-mediated generalized transduction. As described in the text, during some phage infections, the host chromosome is fragmented, and segments of that DNA can be packaged in the phage particles instead of the replicated phage DNA. This host DNA is thereby delivered to [...] [Figure labels: phage DNA, fragments of cellular DNA.]
FIGURE 21-8 Transposon-generated insertional mutagenesis. The transposon, carried into a cell on a plasmid, can then transpose from that vehicle into the host genome. Because of the high density of coding regions (genes) on a typical bacterial chromosome, the transposon will very often insert into a gene. A marker carried on the transposon (such as antibiotic resistance) allows cells harboring insertions to be isolated. Knowing the sequence at the ends of the transposon, and of the genome into which it has inserted, makes identifying its location straightforward. [Figure labels: target gene, interrupted gene.]
determined by DNA sequence analysis from chromosomal DNA harboring the transposon insertion.
Transposons can also be used to create gene and operon fusions on a genome-wide basis. Modified transposons have been created that harbor a reporter gene such as a promoter-less lacZ (for example, Tn5lac). When this transposon inserts into the chromosome (in the appropriate orientation), transcription of the reporter is brought under the control of the disrupted target gene. Such a fusion is known as an operon or transcriptional fusion (Figure 21-9). Other fusion-generating transposons have been created that harbor a reporter gene lacking both a promoter and sequences for the initiation of translation. In these cases, expression of the reporter requires both that it is brought under the transcriptional control of the target gene and that it is introduced into the reading frame of the target gene so that it can be translated properly. A fusion in which the reporter is joined both transcriptionally and translationally to the target gene is known as a gene fusion.
Studies on the Molecular Biology of Bacteria Have Been Enhanced by Recombinant DNA Technology, Whole-Genome Sequencing, and Transcriptional Profiling
The advent of recombinant DNA technologies, such as DNA cloning, the availability of whole-genome sequences, and methods for
FIGURE 21-9 Transposon-generated lacZ fusions. The method of transposon mutagenesis outlined in the previous figure can be modified to allow insertion of a reporter gene (for example, lacZ) into any region of the genome. This allows expression of a host gene (the one in which the transposon-lacZ fusion is inserted [...] [Figure labels: promoter, lacZ.]
studying gene transcription on a genome-wide basis have, of course, revolutionized molecular biological studies of higher cells. But these same technologies have had an impact on the study of bacterial model systems as well, especially when used in conjunction with the traditional tools of bacterial genetics. For example, the development of tailor-made derivatives of transposons for creating gene fusions is facilitated by recombinant DNA methodologies. As another example, the use of genetic competence in combination with recombinant methods for creating precise mutations and gene fusions has expanded the kinds and number of molecular genetic manipulations. The availability of microarrays representing all of the genes in a bacterium has made it possible to study gene expression on a genome-wide basis. In combination with the tools described above, the function of genes identified as being expressed under a particular set of conditions can be rapidly and conveniently elucidated. Methods for rapidly identifying proteins that interact with each other (such as two-hybrid analysis; see Chapter 17, Box 17-1), which have had a great impact in yeast and other eukaryotic systems, are also powerful tools for elucidating networks of interactions among bacterial proteins. The availability of whole-genome sequences and promiscuous conjugative plasmids has created opportunities for carrying out molecular genetic manipulations in bacterial species that otherwise lack sophisticated, traditional tools of genetics.
Biochemical Analysis Is Especially Powerful in Simple Cells with Well-Developed Tools of Traditional and Molecular Genetics
Since the earliest days of molecular biology, bacteria have occupied center stage for biochemical studies of the machinery for DNA replication, information transfer, and gene regulation, among many other topics. There are several reasons for this. First, large quantities of bacterial cells can be grown in a defined and homogeneous physiological state. Second, the tools of traditional and molecular genetics make it possible to purify protein complexes harboring precisely engineered alterations or to overproduce and thereby obtain individual proteins in large quantities. Third, and of great importance, the machinery for carrying out DNA replication, gene transcription, protein synthesis, and so forth is much simpler (having far fewer components) in bacteria than in higher cells, as we have seen repeatedly in this text. Thus, elucidating fundamental mechanisms proceeds more rapidly in bacteria, in which fewer proteins need to be isolated and in which mechanisms are generally more streamlined than in higher cells.
Bacteria Are Accessible to Cytological Analysis
Despite their apparent simplicity and the absence of membrane-bound cellular compartments (for example, a nucleus and a mitochondrion), bacteria are not simply bags of enzymes, as had been thought for many decades. Instead, as we now know, proteins and protein complexes have characteristic locations within the cell. Even the chromosome is highly organized inside bacteria. Despite their small size, bacteria are accessible to the tools of cytology, such as immunofluorescence microscopy for localizing proteins in fixed cells with specific antibodies, fluorescence microscopy with the Green Fluorescent Protein for localizing proteins in living cells, and fluorescence in situ hybridization (FISH) for localizing chromosomal regions and plasmids within cells. The applications of such methods have provided invaluable insights into several of the molecular processes considered in this text. For example, we now know that the replication machinery of the bacterial cell is relatively stationary and is localized to the cell center (Chapter 8). This finding tells us that the DNA template is threaded through a relatively stationary replication "factory" during its duplication, as opposed to the traditional view in which the DNA polymerase traveled along the template like a train on a track. As another example, the application of cytological methods has taught us (again contrary to the traditional view) that during replication the two newly duplicated origin regions of the chromosome migrate toward opposite poles of the cell. Cytological methods are an important part of the arsenal for molecular studies on the bacterial cell.
Phage and Bacteria Told Us Most of the Fundamental Things about the Gene
Molecular biology owes its origin to experiments with bacterial and phage model systems. Indeed, as we saw in Chapter 2, groundbreaking work with a pneumococcus bacterium led to the discovery that the genetic material is DNA. Since then, experiments with E. coli and its phage have led the way, as we have seen throughout this book. For example, the experiment of Hershey and Chase convinced people that the genetic material of phage is DNA; the experiment of Meselson and Stahl proved that DNA replicates semiconservatively in E. coli; the phage crosses of Crick and Brenner (Chapter 15) revealed that the genetic code is built of triplet codons; the elegant genetic studies carried out by Yanofsky in E. coli demonstrated genetic colinearity; and the work of Jacob and Monod (see Chapter 16, Box 16-3) uncovered the fundamental strategies of gene regulation. There are countless other examples where, by choosing these simplest of systems, fundamental processes of life were understood.
An important example comes from the classic work of Seymour Benzer, who examined intensely a single genetic locus in phage T4, called rII. Wild-type T4 is capable of growing in either of two strains of E. coli known as B and K, but rII mutants grow only in strain B. This makes it possible to detect wild-type phage (arising, for example, from recombination between two different rII mutants) at frequencies of less than 0.01%. That is, a single wild-type phage can be detected among 10,000 rII mutant phage when plated on a lawn of strain K bacteria, where only the rare recombinant will form a plaque. Taking advantage of this seemingly arcane property of rII mutations, Benzer carried out recombination experiments between pairs of rII mutants and was thereby able to map the order of such mutations at a high level of resolution (approaching or reaching that of the nucleotide base pair). He also devised a "complementation" test (discussed above) for showing that the rII locus comprises two adjacent genes. Benzer introduced the term cistron to describe the gene (based on the words cis and trans). As an aside, it is interesting to note that it was this work that enabled this same locus to be exploited by Crick and Brenner in their genetic studies on the genetic code.
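The arithmetic behind this kind of selective plating can be sketched as follows. The example assumes that the progeny of a cross between two rII mutants are plated in parallel on strain K (where only wild-type recombinants form plaques) and on strain B (where all phage form plaques). The doubling step is a common convention that is an assumption here, not something stated in the text: the reciprocal double-mutant recombinants cannot grow on strain K and so go uncounted. The plaque counts and dilutions are invented for illustration.

```python
def recombination_frequency(plaques_on_K, plaques_on_B, dilution_K, dilution_B):
    """Estimate recombination frequency from parallel platings on strains K and B."""
    wild_type_titer = plaques_on_K * dilution_K   # wild-type recombinants only
    total_titer = plaques_on_B * dilution_B       # all progeny phage
    # Assumed convention: double the wild-type count to account for the
    # reciprocal double-mutant recombinants, which do not plate on strain K.
    return 2 * wild_type_titer / total_titer

# Illustrative numbers: 50 plaques on K at a 10^2 dilution,
# 200 plaques on B at a 10^6 dilution.
rf = recombination_frequency(plaques_on_K=50, plaques_on_B=200,
                             dilution_K=1e2, dilution_B=1e6)
print(f"recombination frequency = {rf:.1e}")
```

Because the selection on strain K removes the overwhelming majority of phage, very small frequencies of this kind can be measured from a single plate.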
BAKER'S YEAST, Saccharomyces cerevisiae
Unicellular eukaryotes offer many advantages as experimental model systems. They have relatively small genomes compared to other eukaryotes (see Chapter 7) and a similarly smaller number of genes. Like E. coli, they can be grown rapidly in the laboratory (approximately 90 minutes per cell division under ideal conditions), allowing cloned populations to be propagated from a single precursor cell. Despite this simplicity, yeast cells have the central characteristics of all eukaryotic cells. They contain a discrete nucleus with multiple linear chromosomes packaged into chromatin, and their cytoplasm includes a full spectrum of intracellular organelles (for example, mitochondria) and cytoskeletal structures (such as actin filaments).
The best studied unicellular eukaryote is the budding yeast S. cerevisiae, often referred to as brewer's or baker's yeast because of its use as a fermenting agent. S. cerevisiae has been intensely studied for more than 100 years. In experiments in the 1860s, Louis Pasteur identified this yeast as the catalyst for fermentation (sugar had been believed to break down spontaneously into alcohol and carbon dioxide). These studies eventually led to the identification of the first enzymes and the development of biochemistry as an experimental approach. The genetics of S. cerevisiae has been studied since the 1930s, resulting in the characterization of many of its genes. Thus, like E. coli, S. cerevisiae allows investigators to attack fundamental problems of biology using both genetic and biochemical approaches.
The Existence of Haploid and Diploid Cells Facilitates Genetic Analysis of S. cerevisiae
S. cerevisiae cells can grow in either a haploid state (one copy of each chromosome) or diploid state (two copies of each chromosome) (Figure 21-10). Conversion between the haploid and diploid states is
FIGURE 21-10 The life cycle of the budding yeast S. cerevisiae. As described in the text here and elsewhere, S. cerevisiae exists in three forms: two haploid cell types, a and α, and the diploid product of mating between these two. Replication of these different cell types, mating, and sporulation are shown. [Figure labels: diploid, mitotic division, sporulation and meiotic division, diploid cell mating.]
FIGURE 21-11 Recombinational transformation in yeast. As described in the text, any region of the yeast genome can readily be replaced by sequences of choice. The DNA to be inserted is flanked with short sequences homologous to those flanking the region in the chromosome to be replaced. When the donor fragments are introduced to the cell, high levels of homologous recombin[...] [Figure labels, in order: gene of interest; transform with linear DNA with ends homologous to the chromosome; homologous ends recombine with chromosome DNA; DNA between homologous regions replaces the gene of interest.]
mediated by mating (haploid to diploid) and sporulation (diploid to haploid). There are two haploid cell types called a- and α-cells. When grown together, these cells mate to form a/α diploid cells. Under conditions of reduced nutrients, a/α diploids undergo meiotic division to generate a structure known as the ascus that contains four haploid spores (two a-spores and two α-spores). When growth conditions improve, these spores can germinate and grow as haploid cells or mate to re-form a/α diploids.
In the laboratory, these cell types can be manipulated to perform a variety of genetic assays. Genetic complementation can be performed by simply mating two haploid strains, each of which contains one of the two mutations whose complementation is being tested. If the mutations complement each other, the diploid will be wild type for the mutant phenotype. To test the function of an individual gene, mutations can be made in haploid cells in which there is only a single copy of that gene. For example, to ask if a given gene is essential for cell growth, the gene can be deleted in a haploid. Only deletions of nonessential genes can be tolerated by haploid cells.
Generating Precise Mutations in Yeast Is Easy
The genetic analysis of S. cerevisiae is further enhanced by the availability of techniques used to precisely and rapidly modify individual genes. When linear DNA with ends homologous to any given region of the genome is introduced into S. cerevisiae cells, very high rates of homologous recombination are observed, resulting in the replacement of chromosomal sequences with DNA used in the transformation (Figure 21-11). This property can be exploited to make precise changes within the genome. This approach can be used to precisely delete the coding region of an entire gene, change a specific codon in an open reading frame, or even change a specific base pair in a promoter. The ability to make such precise changes in the genome allows very detailed questions concerning the function of particular genes or their regulatory sequences to be pursued with relative ease.
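The replacement described above can be mimicked on a toy sequence. The sketch below is only a string-level illustration: the "genome", the insert, and the 8-base homology arms are invented, and real homology arms are much longer. It builds a linear donor whose ends match the sequences flanking the target, then replaces whatever lies between those flanks with the donor's insert.

```python
def build_donor(genome, start, end, insert, arm_length):
    """Linear donor: insert flanked by the sequences bordering genome[start:end]."""
    left_arm = genome[start - arm_length:start]
    right_arm = genome[end:end + arm_length]
    return left_arm + insert + right_arm

def recombine(genome, donor, arm_length):
    """Replace the region between the two homology arms with the donor's insert."""
    left_arm = donor[:arm_length]
    right_arm = donor[len(donor) - arm_length:]
    insert = donor[arm_length:len(donor) - arm_length]
    left_at = genome.find(left_arm)
    if left_at == -1:
        return genome  # no homology on the left: nothing happens
    right_at = genome.find(right_arm, left_at + arm_length)
    if right_at == -1:
        return genome  # no homology on the right: nothing happens
    return genome[:left_at + arm_length] + insert + genome[right_at:]

# Toy sequences (hypothetical).
UPSTREAM, GENE, DOWNSTREAM = "GATTACAGGCTA", "ATGAAACCCTAA", "CGTTAGCATCGA"
MARKER = "TTTTCCCCGGGG"
genome = UPSTREAM + GENE + DOWNSTREAM
start, end = len(UPSTREAM), len(UPSTREAM) + len(GENE)

replaced = recombine(genome, build_donor(genome, start, end, MARKER, 8), 8)
deleted = recombine(genome, build_donor(genome, start, end, "", 8), 8)
print(replaced)
print(deleted)
```

The same two functions cover a gene replacement (insert is a marker), a precise deletion (insert is empty), or a point change (insert is the original sequence with one base altered).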
S. cerevisiae Has a Small, Well-Characterized Genome
Because of its rich history of genetic studies and its relatively small genome, S. cerevisiae was chosen as the first eukaryotic (nonviral) organism to have its genome entirely sequenced. This landmark was accomplished in 1996. Analysis of the sequence (1.3 × 10^7 base pairs) identified approximately 6,000 genes and provided the first view of the genetic complexity required to direct the formation of a eukaryotic organism.
The availability of the complete genome sequence of S. cerevisiae has allowed "genome-wide" approaches to studies of this organism. For example, DNA microarrays that include sequences from each of the approximately 6,000 S. cerevisiae genes have been used extensively to characterize patterns of gene expression under different physiological conditions. Indeed, the levels of gene expression in S. cerevisiae cells have now been tested in more than 200 different conditions, including different carbon sources (such as glucose vs. galactose), cell types, and growth temperatures. These findings are not only useful to determine the expression of individual genes but have
also led to the grouping of genes into coordinately regulated sets, which all respond similarly to changes in conditions.
Other genome-wide resources include a library of 6,000 strains, each deleted for only one gene. More than 5,000 of these strains are viable as haploids, indicating that the majority of yeast genes are nonessential. This collection of strains has allowed the development of new genetic screens in which every gene in the S. cerevisiae genome can be tested individually for its role in a particular process. The use of microarrays has also allowed the genome-wide mapping of binding sites for transcriptional regulators using chromatin immunoprecipitation techniques (see Chapter 17, Box 17-2).
S. cerevisiae Cells Change Shape as They Grow
As S. cerevisiae cells progress through the cell cycle, they undergo characteristic changes in shape (Figure 21-12). Immediately after a new cell is released from its mother, the daughter cell appears slightly elliptical in shape. As the cell progresses through the cell cycle, it forms a small "bud" that will eventually become a separate cell. The bud grows until it reaches a size approximately equal to the size of the "mother" cell from which it arose. At this point the bud is released from the mother and both cells start the process again.
Simple microscopic observation of S. cerevisiae cell shape can provide a lot of information about the events occurring inside the cell. A cell that lacks a bud has yet to start replicating its genome. This is because in a wild-type S. cerevisiae cell, the emergence of a new bud is tightly connected to the initiation of DNA replication. Similarly, a growing cell with a very large bud is almost always in the process of executing chromosome segregation.
The powerful genetic, biochemical, and genomic tools available to study S. cerevisiae have made it a favored organism for the analysis of
FIGURE 21-12 The mitotic cell cycle in yeast. S. cerevisiae divides by budding. The development of a daughter bud through the mitotic cycle is shown, and described in the text.
basic molecular and cell biological questions. Studies of S. cerevisiae have made fundamental contributions to our understanding of eukaryotic transcription and gene regulation, DNA replication, recombination, translation, and splicing. Genetic studies in baker's yeast have identified proteins involved in all of these events.
THE NEMATODE WORM, Caenorhabditis elegans

Sydney Brenner, after making seminal contributions in molecular genetics, identified a small metazoan in which to study the important questions of development and the molecular basis of behavior. Learning from the success of molecular genetic studies in phage and bacteria, he wanted the simplest possible organism that had differentiated cell types, but that was also amenable to microbiological-like genetics. In 1965 he settled on the small nematode worm Caenorhabditis elegans (C. elegans) because it contained a variety of suitable characteristics. These include a rapid generation time to enable genetic screens; hermaphrodite reproduction producing hundreds of "self-progeny" so that large numbers of animals could be generated; sexual reproduction so that genetic stocks could be constructed by mating; and a small number of transparent cells so that development could be followed directly. Brenner set two ambitious initial goals that would be essential for the long-term success of this endeavor. One was a complete mapping of all cells by reconstructing serial section electron micrographs (completed by John White in 1986), and the other was the mapping of the cell lineage (completed by John Sulston in 1983). Seven years later Brenner established the genetics of the new model organism with the isolation of over 300 morphological and behavioral mutants. These defined over 100 complementation groups mapping to six linkage groups. Nearly 30 years later there are 400 laboratories worldwide that study C. elegans. Due to its simplicity and experimental accessibility, it is now one of the most completely understood metazoans.
C. elegans Has a Very Rapid Life Cycle

C. elegans is cultured on petri dishes and fed a simple diet of bacteria. The animals grow well at a range of temperatures, growing twice as fast at 25°C as at 15°C. At 25°C fertilized embryos complete development in 12 hours and hatch into free-living animals capable of complex behaviors. The first-stage juvenile (L1) passes through four juvenile stages (L1-L4) over the course of 40 hours to become a sexually mature adult (Figure 21-13). The adult hermaphrodite can produce up to 300 self-progeny over the course of about 4 days, or can be mated with rare males to produce up to 1,000 hybrid progeny. The adult lives for about 15 days. Under stressful conditions (low food, increased temperatures, high population density), the L1 stage animal can enter an alternative developmental stage in which it forms what is called a dauer. Dauers are resistant to environmental stresses and can live many months while waiting for environmental conditions to improve. The study of mutants that fail to enter the dauer stage, or that enter it inappropriately, has identified genes expressed in specific neurons that function to sense environmental conditions, genes expressed throughout the animal that control body growth, and genes that control life span.
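The timing figures quoted above can be combined into a rough egg-to-adult time and an upper bound on how quickly a selfing culture could expand. The following is only a back-of-the-envelope sketch: the function names are illustrative, and the assumption that every animal survives and produces the maximum brood is deliberately unrealistic.

```python
# Timing figures quoted in the text for growth at 25 degrees C.
EMBRYOGENESIS_HOURS = 12      # fertilization to hatching
JUVENILE_HOURS = 40           # L1 through L4 to sexually mature adult
SELF_PROGENY_PER_ADULT = 300  # upper bound quoted for one hermaphrodite


def generation_time_hours():
    """Hours from fertilized egg to sexually mature adult."""
    return EMBRYOGENESIS_HOURS + JUVENILE_HOURS


def max_descendants(generations, progeny_per_adult=SELF_PROGENY_PER_ADULT):
    """Upper bound on animals in generation n descended from one founder.

    Hypothetical assumption: every animal survives and reaches the
    quoted maximum brood size by self-fertilization.
    """
    return progeny_per_adult ** generations


hours = generation_time_hours()
print(f"egg-to-adult time: {hours} h ({hours / 24:.1f} days)")
for n in range(1, 4):
    print(f"generation {n}: up to {max_descendants(n):,} animals")
```

Even as a loose upper bound, the numbers suggest why large genetic screens are practical in this organism: a few generations take only days.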
FIGURE 21-13 The life cycle of the worm, C. elegans. Shown is the life cycle in hours of development, from first-stage juvenile to adult, as described in the text. The developmental stage that an L1 juvenile enters to become a dauer is also shown. (Panel labels: embryogenesis; hatching; post-embryogenesis.)
Activation of these latter genes in the adult can dramatically extend the life span of the animal, and homologs of these genes have been implicated in life extension in mammals.
C. elegans Is Composed of Relatively Few, Well-Studied Cell Lineages
C. elegans has a simple body plan (Figure 21-14). The prominent organ in the adult hermaphrodite is the gonad, which contains the proliferating and differentiating germ cells (sperm and oocytes), fertilization chamber (spermatheca), and uterus for temporary storage of young embryos. The embryos pass from the uterus to the outside through the vulva, a structure formed from 22 epidermal cells. Mutations that disrupt the formation of the vulva do not interfere with production of embryos, but do prevent the eggs from being laid. Consequently, the embryos develop and hatch inside the uterus. The hatched worms then devour their mother and become trapped inside her skin (cuticle layer), forming a "bag of worms." This readily identified phenotype has allowed the isolation of hundreds of vulva-less mutants, identifying scores of genes that function to control the generation, specification, and differentiation of the vulva cells. Among these genes are components of a highly conserved receptor tyrosine kinase signaling pathway that controls cell proliferation. Many of the mammalian homologs of these genes are oncogenes and tumor-suppressor genes that when altered can lead to cancer. In C. elegans, mutations that inactivate this pathway eliminate vulva development because the vulval cells are never generated, whereas mutations that activate this pathway cause overproliferation of the vulva precursor cells, resulting in a multiple vulva phenotype. Because the animal is transparent and the vulva is generated from only 22 cells, it is possible to describe the mutant defect with cellular resolution such that the type of mutation can be associated with a specific cellular transformation.

FIGURE 21-14 The body plan of the worm. Above (in part a) is shown a section through an adult hermaphrodite worm. The various organs are identified in the sketch below (in part b) and are described in the text. (Source: (a) Sulston J.E. and Horvitz H.R. 1977. Dev. Biol. 56: 110-156.)
The Cell Death Pathway Was Discovered in C. elegans

The most notable achievement to date in C. elegans research has been the elucidation of the molecular pathway that regulates apoptosis or cell death. Early analysis of cell lineages noted that the same set of cells died in every animal, suggesting that cell death was under genetic control. The first cell death defective (ced) mutants isolated were defective for the consumption of the cell corpse by neighboring cells; thus in the mutants cell corpses persisted for many hours. Using these ced mutants, H. Robert Horvitz and his colleagues isolated many additional ced mutants that failed to produce persistent cell corpses. These mutants proved to be defective at initiating the cell death program. Analysis of the ced mutants showed that, in all but one case, developmentally programmed cell death is cell autonomous, that is, the cell commits suicide. The one exception occurs in males, where a cell known as the linker cell is killed by its neighbor. The molecular identification of the ced genes provided the means to identify proteins in mammals that carry out essentially the identical biochemical reactions to control cell death in all animals. In fact, expressing human homologs in C. elegans can substitute for a mutated ced gene. Cell death is as important as cell proliferation in development and disease and is the focus of intense research to develop therapeutics for the control of cancer and neurodegenerative diseases.
RNAi Was Discovered in C. elegans

In 1998 a remarkable discovery was announced. The introduction of double-stranded RNA (dsRNA) into C. elegans silenced the gene homologous to the dsRNA. This unexpected discovery and the subsequent analysis of RNA interference (RNAi) are significant in two respects. One is that RNAi appears to be universal, since introduction of dsRNA into nearly all animal, fungal, or plant cells leads to homology-directed mRNA degradation. Indeed, much of what we know about RNAi comes from studies in plants (Chapter 17). The second is the rapidity with which experimental investigation of this mysterious process revealed the molecular mechanisms (see Chapter 17, Figure 17-30). These investigations intersected with the analysis of another RNA-mediated gene regulatory process that involves tiny endogenous microRNAs that have been
shown to regulate gene expression in plants and animals, coordinate genome rearrangements in ciliates, and regulate chromatin structure in yeast. The first two microRNAs were discovered in genetic screens in C. elegans. A fraction of these worm microRNAs is conserved in flies and mammals, where their functions are just beginning to be revealed. It is likely that more examples of RNA-directed gene regulation will be discovered in the coming years.
THE FRUIT FLY, Drosophila melanogaster

We are approaching the 100th anniversary of the fruit fly as a model organism for studies in genetics and developmental biology. In 1908 Thomas Hunt Morgan and his research associates at Columbia University placed rotting fruit on the window ledge of their laboratory in Schermerhorn Hall. Their goal was to isolate a small, quickly reproducing animal that could be cultured in the lab and used to study the inheritance of quantitative traits, such as eye color. Among the menagerie of creatures that were captured, the fruit fly emerged as the animal of choice. Adults produced large numbers of progeny in just two weeks. Culturing was done in recycled milk bottles using an inexpensive concoction of yeast and agar.
Drosophila Has a Rapid Life Cycle

The salient features of the Drosophila life cycle are a very rapid period of embryogenesis, followed by three periods of larval growth prior to metamorphosis (Figure 21-15). Embryogenesis is completed within 24 hours after fertilization and culminates in the hatching of a first-instar larva. As we discussed in Chapter 18, the early periods of Drosophila embryonic development exhibit the most rapid nuclear cleavages known for any animal. A first-instar larva grows for 24 hours and then molts into a larger, second-instar larva. The process is repeated to yield a third-instar larva that feeds and grows for two to three days. One of the key processes that occurs during larval development is the growth of the imaginal disks, which arise from invaginations of the
FIGURE 21-15 The Drosophila life cycle. (Legible panel labels: first-instar larva; second-instar larva.)
epidermis in mid-stage embryos (Figure 21-16). There is a pair of disks for every set of appendages (for example, a set of foreleg imaginal disks and a set of wing imaginal disks). There are also imaginal disks for eyes, antennae, the mouthparts, and genitalia. Disks are initially small and composed of fewer than 100 cells in the embryo but contain tens of thousands of cells in mature larvae. The development of the wing imaginal disk has become an important model system for understanding how gradients of secreted signaling molecules such as Hedgehog and Dpp (TGF-β) control complex patterning processes. Imaginal disks differentiate into their appropriate adult structures during metamorphosis (or pupation).
The First Genome Maps Were Produced in Drosophila

In 1910 the Morgan lab identified a spontaneous mutant male fly that had white eyes rather than the brilliant red seen for normal strains. This single fly launched an incisive series of genetic studies that led to two major discoveries: genes are located on chromosomes, and each gene is composed of two alleles that assort independently during meiosis (see Mendel's first law: Chapter 1). The identification of additional mutations led to the demonstration that genes located on separate chromosomes segregate independently (Mendel's second law), whereas those linked on the same chromosome do not. An undergraduate at Columbia University, Alfred H. Sturtevant (a member of the Morgan lab), developed a simple mathematical algorithm for mapping the distances between linked genes based on recombination frequencies. By the 1930s, extensive genetic maps were produced that identified the relative positions of numerous genes controlling a variety of physical characteristics of the adult, such as wing size and shape and eye color and shape. Hermann J. Muller, another scientist trained in the Morgan fly lab, provided the first evidence that environmental factors, such as ionizing radiation, can cause chromosome rearrangements and genetic mutations. Large-scale "genetic screens" are routinely performed by feeding adult males a mutagen, such as EMS (ethylmethanesulfonate), and then mating them with normal females. The F1 progeny are heterozygous and
FIGURE 21-16 Imaginal disks in Drosophila. The positions of various imaginal disks in the larva are shown on the right. On the left are shown the limbs and organs they form in the adult fly.
contain one normal chromosome and one random mutation. A variety of methods are used to study these mutations, as described below. In addition to its remarkable fecundity (a single female can produce thousands of eggs) and rapid life cycle, the fruit fly was found to possess several very useful features that guaranteed it a sustained and prominent role in experimental research. It contains only four chromosomes: two large autosomes, chromosomes 2 and 3, a smaller X chromosome (which determines sex), and a very small fourth chromosome. Calvin B. Bridges, yet another of Muller's colleagues, discovered that certain tissues in Drosophila larvae undergo extensive endoreplication without mitosis. In the salivary gland, this process produces remarkable giant chromosomes composed of approximately 1,000 copies of each chromatid. Bridges used these polytene chromosomes to determine a physical map of the Drosophila genome (the first produced for any organism) (Figure 21-17). Bridges identified a total of approximately 5,000 "bands" on the four chromosomes and established a correlation between many of these bands and the locations of genetic loci identified in the classical recombination maps. For example, female fruit flies that are heterozygous for the recessive white mutation exhibit normal red eyes. However, similar females that contain the white mutation and a small deletion in the other X chromosome, which removes polytene bands 3C2-3C3, exhibit white eyes. This is because there is no longer a normal, dominant copy of the gene. This type of analysis led to the conclusion that the white gene is located somewhere between polytene bands 3C2 and 3C3 on the X chromosome. A variety of additional genetic methods were created to establish the fruit fly as the premier model organism for studies in animal inheritance.
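The recombination-based mapping credited to Sturtevant earlier in this section can be illustrated with a short sketch. The text does not spell out his procedure, so what follows is the standard textbook formulation under stated assumptions: map distance is taken as the percentage of recombinant progeny, and for three linked genes the pair with the largest distance is placed at the ends. The gene labels and counts are hypothetical.

```python
from itertools import combinations


def map_distance_cm(recombinants, total):
    """Map distance in centimorgans: percent of progeny that are recombinant."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need total > 0 and 0 <= recombinants <= total")
    return 100.0 * recombinants / total


def order_three_genes(distances):
    """Infer linear gene order from pairwise map distances.

    `distances` maps frozenset({gene_a, gene_b}) -> distance in cM.
    The two genes that are farthest apart form the ends; the
    remaining gene lies between them.
    """
    genes = sorted(set().union(*distances))
    if len(genes) != 3:
        raise ValueError("expected exactly three genes")
    left, right = max(combinations(genes, 2),
                      key=lambda pair: distances[frozenset(pair)])
    (middle,) = [g for g in genes if g not in (left, right)]
    return [left, middle, right]


# Hypothetical recombinant counts out of 1,000 scored progeny per gene pair.
counts = {("a", "b"): 10, ("b", "c"): 300, ("a", "c"): 310}
distances = {frozenset(pair): map_distance_cm(n, 1000)
             for pair, n in counts.items()}

gene_order = order_three_genes(distances)
print(gene_order)
```

Observed distances are only approximately additive; over long intervals double crossovers cause the recombinant fraction to underestimate the true distance, which is one reason maps are built from many short intervals.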
Balancer chromosomes, for example, were created that contain a series of inversions relative to the organization of the native chromosome (Figure 21-18). Critically, such balancers fail to undergo recombination with the native chromosome during meiosis. As a result, it is possible to maintain permanent cultures of fruit flies that contain recessive, lethal mutations. Consider a null mutation in the
FIGURE 21-17 Genetic maps, polytene chromosomes, and deficiency mapping. Endoreplication in the absence of mitosis generates enlarged chromosomes in some tissues of the fly, most notably the salivary glands, where the giant chromosomes are composed of a thousand chromatids. It was possible, for the first time, to correlate the occurrence of genes for certain traits with given physical segments of chromosomes. Specifically, phenotypes of flies (white eyes) were correlated with deletions in the chromosomes. (Panel labels: region band numbers; specific deletions; regions of genes; phenotype of deletion/mutation heterozygote.) (Source: Hartwell L. et al. 2003. Genetics: From genes to genomes, 2nd edition, p. 816, fig. D-4.)
FIGURE 21-18 Balancer chromosome. Balancer chromosomes (bottom panel) contain a series of inversions when compared with the original, parental chromosome (top panel). In this diagram, a hypothetical chromosome has two arms. The left arm of the balancer chromosome has an internal inversion that reverses the order of genes a, b, ...
even-skipped (eve) gene, which we discussed in Chapter 18. Embryos that are homozygous for this mutation die and fail to produce viable larvae and adults. The eve locus maps on chromosome 2 (at polytene band 46C). The null mutation can be maintained in a population that is heterozygous for a "normal" chromosome containing the null allele of eve and a balancer second chromosome, which contains a normal copy of the gene. Since the eve null allele is strictly recessive, these flies are completely viable. However, only heterozygotes are observed among adult progeny in successive generations. Embryos that contain two copies of the balancer chromosome die because some of the inversions produce recessive disruptions in critical genes. In addition, embryos that contain two copies of the normal chromosome die because they are homozygous for the eve null mutation.
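The logic of a balanced lethal stock can be checked by enumerating the offspring of two heterozygous parents. This is a minimal sketch of the cross described above, assuming no recombination between the balancer and the eve-bearing chromosome; the labels `eve-` and `Bal` and the function names are illustrative, not standard notation from the text.

```python
from collections import Counter
from itertools import product

# Each parent carries one "normal" second chromosome bearing the eve null
# allele and one balancer second chromosome bearing a wild-type copy of eve.
EVE_NULL = "eve-"
BALANCER = "Bal"
PARENT = (EVE_NULL, BALANCER)


def is_viable(genotype):
    """Apply the two lethality rules described in the text."""
    first, second = genotype
    if first == second == EVE_NULL:   # homozygous for the eve null mutation
        return False
    if first == second == BALANCER:   # homozygous for recessive balancer lesions
        return False
    return True


def cross(mother=PARENT, father=PARENT):
    """Count zygote genotypes, one chromosome inherited from each parent."""
    return Counter(tuple(sorted(pair)) for pair in product(mother, father))


zygotes = cross()
survivors = {g: n for g, n in zygotes.items() if is_viable(g)}
print("zygotes:  ", dict(zygotes))
print("survivors:", survivors)
```

Of the four equally likely zygote classes, the two homozygous classes die, so every surviving adult has the same genotype as its parents and the stock perpetuates itself without selection.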
Genetic Mosaics Permit the Analysis of Lethal Genes in Adult Flies

Mosaics are animals that contain small patches of mutant tissue in a generally "normal" genetic background. Such small patches do not kill the individual since most of the tissues in the organism are normal. For example, small patches of engrailed/engrailed homozygous mutant tissue can be produced by inducing mitotic recombination in developing larvae using X-rays. When such patches are created in posterior regions of the developing wings, the resulting flies exhibit abnormal wings that have duplicated anterior structures in place of the normal posterior structures. The analysis of genetic mosaics provided the first evidence that Engrailed is required for subdividing the appendages and segments of flies into anterior and posterior compartments.

The most spectacular genetic mosaics are gynandromorphs (Figure 21-19). These are flies that are literally half male and half female. Sexual identity in flies is determined by the number of X chromosomes. Individuals with two X chromosomes are females, while those with just one X are males (the Y chromosome does not define sexual identity in flies as it does in mice and humans; in flies, Y is only needed for the production of sperm). Rarely, one of the two X chromosomes is lost at the first mitotic division following the fusion of the sperm and egg pronuclei in a newly fertilized XX embryo. This X instability occurs only at the first division. In all subsequent divisions, nuclei containing two X chromosomes give rise to daughter nuclei with two X chromosomes, while nuclei with just one X chromosome give rise to daughters containing a single X. As we discussed in Chapter 18, these nuclei undergo rapid cleavages without cell membranes and then migrate to the periphery of the egg. This migration is coherent and there is little or no intermixing of nuclei containing one X chromosome with nuclei containing two X chromosomes. Thus, half the embryo is male and half is female, although the "line" separating the male and female tissues is random. Its exact position depends on the orientation of the two daughter nuclei after the first cleavage.
The Yeast FLP Recombinase Permits the Efficient Production of Genetic Mosaics

What was not anticipated during the classical era of genetic analysis is the fact that Drosophila possesses several favorable attributes for molecular studies and whole-genome analysis. Most notably, the genome is relatively small. It is composed of only approximately 150 Mb and contains fewer than 14,000 protein coding genes. This represents just 5% of the amount of DNA that makes up the mouse and human genomes. As the fruit fly entered the modern era, several methods were established that improved some of the older techniques of genetic manipulation and also led to completely new experimental methods, such as the production of stable transgenic strains carrying recombinant DNAs.

As we discussed earlier, genetic mosaics are produced by mitotic recombination in somatic tissues. Initially, X-rays were used to induce recombination, although this method is inefficient and produces small patches of mutant tissue. More recently, the frequency of mitotic recombination was greatly enhanced by the use of the FLP recombinase from yeast (Figure 21-20). FLP recognizes a simple sequence motif, FRT, and then catalyzes DNA rearrangement (see Chapter 11). FRT sequences were inserted near the centromere of each of the four chromosomes using P-element transformation (see below). Heterozygous flies are then produced that contain a null allele in gene Z on one chromosome and a wild-type copy of that gene on the homologous chromosome. Both chromosomes contain the FRT sequences. These flies are stable and viable as there is no endogenous FLP recombinase in Drosophila. It is, however, possible to introduce the recombinase in transgenic strains that contain the yeast FLP protein coding sequence under the control of the heat-inducible hsp70 promoter. Upon heat shock, FLP is synthesized in all cells. FLP binds to the FRT motifs in the two homologs containing gene Z and catalyzes mitotic recombination (Figure 21-20). This method is quite efficient. In fact, short pulses of heat shock are often sufficient to produce enough FLP recombinase to produce large patches of z-/z- tissue in different regions of an adult fly.
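How a single exchange at FRT in a heterozygous cell can give rise to a homozygous mutant patch is easier to see by tracking chromatids through one division. The sketch below rests on assumptions not spelled out in the text: the exchange happens after DNA replication, it involves one chromatid from each homolog, and gene Z lies distal to the FRT site so that the exchanged arm carries the allele with it. All names are illustrative.

```python
NULL = "z-"
WILD = "z+"


def replicate(allele):
    """Return the two sister chromatids of one homolog after S phase."""
    return (allele, allele)


def flp_exchange(homolog_a, homolog_b):
    """Swap the arms distal to FRT between chromatid a1 and chromatid b0.

    Because FRT sits near the centromere, the allele of gene Z moves
    with the exchanged arm to the other homolog's centromere.
    """
    a0, a1 = homolog_a
    b0, b1 = homolog_b
    return (a0, b0), (a1, b1)


def segregations(homolog_a, homolog_b):
    """Both ways the sister centromeres can separate at mitosis.

    Each daughter cell receives one chromatid from each homolog.
    """
    a0, a1 = homolog_a
    b0, b1 = homolog_b
    return [
        (tuple(sorted((a0, b0))), tuple(sorted((a1, b1)))),
        (tuple(sorted((a0, b1))), tuple(sorted((a1, b0)))),
    ]


mutant_homolog = replicate(NULL)
wild_homolog = replicate(WILD)

without_flp = segregations(mutant_homolog, wild_homolog)
with_flp = segregations(*flp_exchange(mutant_homolog, wild_homolog))

print("no exchange:  ", without_flp)
print("after exchange:", with_flp)
```

Without an exchange, both daughters remain heterozygous whichever way the chromatids separate. After an exchange, one of the two segregation patterns yields a z-/z- daughter alongside a wild-type sister, and the descendants of the z-/z- cell form the mutant patch.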
FIGURE 21-19 Gynandromorphs. Gynandromorph mutants are a particularly striking form of genetic mosaic. (a) The blue X chromosome carries the recessive (white) mutation, whereas the red X chromosome has a normal dominant copy of the gene. The mutant is the result of X chromosome loss at the first mitotic division. (b) In the resulting mutant, one side of the fly is female, the other is male.

FIGURE 21-20 FLP-FRT. The use of this site-specific recombination system from yeast (described in Chapter 11) promotes high levels of mitotic recombination in flies. The recombination is controlled by expressing the recombinase in flies only when required.

It Is Easy to Create Transgenic Fruit Flies that Carry Foreign DNA

P-elements are transposable DNA segments that are the causal agent of a genetic phenomenon called hybrid dysgenesis (Figure 21-21, see also Box 19-3). Consider the consequences of mating females from the "M" strain of Drosophila melanogaster with males from the "P" strain (same species, but different populations). The F1 progeny are often sterile. The reason is that the P strain contains numerous copies of the P-element transposon that are mobilized in embryos derived from M eggs. These eggs lack a repressor protein that inhibits P-element mobilization. P-element excision and insertion is limited to the pole cells, the progenitors of the gametes (sperm in males and eggs in females). Sometimes the P-elements insert into genes that are essential for the development of these germ cells, and, as a result, the adult flies derived from these matings are sterile.

FIGURE 21-21 Hybrid dysgenesis. P-element transposons reside passively in P strains because they express a repressor that keeps the transposons silent. When P strains are mated with an M strain lacking such a repressor, the transposons are mobilized within the pole cells, and often integrate into genes required for germ cell formation. This explains the high frequency of sterility in the offspring from such a cross.

P-elements are used as transformation vectors to introduce recombinant DNAs into otherwise normal strains of flies (Figure 21-22). A full-length P-element transposon is 3 kb in length. It contains inverted repeats at the termini that are essential for excision and insertion. The intervening DNA encodes both a repressor of transposition and a transposase that promotes mobilization. The repressor is expressed in the developing eggs of P strains. As a result, there is no movement of P-elements in embryos derived from females of the P strain (these contain P-elements). Movement is seen only in embryos derived from eggs produced by M strain females, which lack P-elements. Recombinant DNA is inserted into defective P-elements that lack the internal genes encoding repressor and transposase. This DNA is injected into posterior regions of early, precellular embryos (as we saw in Chapter 18, this is the region that contains the polar granules). The transposase is injected along with the recombinant P-element vector. As the cleavage nuclei enter posterior regions, they acquire both the polar granules and recombinant P-element DNA together with transposase. The pole cells bud off from the polar plasm and the recombinant P-elements insert into random positions in the pole cells. Different pole cells contain different P-element insertion events. The amount of recombinant P-element DNA and transposase is calibrated so that, on average, a given pole cell receives just a single integrated P-element. The embryos are allowed to develop into adults and then mated with appropriate tester strains. The recombinant P-element contains a "marker" gene such as white+ and the strain used for the injections is a white mutant. The tester strains are also white-, so that any F2 fly that has red eyes must contain a copy of the recombinant P-element. This method of P-element transformation is routinely used to identify regulatory sequences such as those governing eve stripe 2 expression (which we discussed in Chapter 18).
In addition, this strategy is used to examine protein coding genes in various genetic backgrounds.

In summary, Drosophila offers many of the sophisticated tools of classical and molecular genetics that, as we have seen, are available in microbial model systems. One conspicuous exception has been the absence of methods for precise manipulation of the genome by homologous recombination with recombinant DNA, such as in the creation of gene deletions. However, such methods were recently developed, and are now being streamlined for routine use. Ironically, such manipulations are readily available, as we shall see, in the more complicated model system, the mouse. Nevertheless, because of the wealth of genetic tools available in Drosophila and the extensive groundwork of knowledge about this organism resulting from decades of investigation,
FIGURE 21-22 P-element transformation. P-elements can be used as vectors in the transformation of fly embryos. Thus, as discussed in the text, sequences of choice can be inserted into a modified P-element. A single copy of this recombinant molecule is stably incorporated into a single location of a fly chromosome.
the fruit fly remains one of the premier model systems for studies of development and behavior.
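As a compact recap of the hybrid dysgenesis rule described earlier in this section (Figure 21-21), the four possible crosses between M and P strains can be tabulated in a few lines. This is a simplified sketch with an illustrative function name: it encodes only the qualitative rule from the text, that P-elements are mobilized when they arrive from a P father in an egg from an M mother, which lacks the repressor.

```python
def is_dysgenic(mother_strain, father_strain):
    """Return True when a cross is expected to mobilize P-elements.

    Mobilization requires P-elements contributed by a "P" father and an
    egg that lacks the repressor, that is, an egg from an "M" mother.
    """
    for strain in (mother_strain, father_strain):
        if strain not in ("M", "P"):
            raise ValueError(f"unknown strain: {strain!r}")
    return mother_strain == "M" and father_strain == "P"


for mother in ("M", "P"):
    for father in ("M", "P"):
        if is_dysgenic(mother, father):
            outcome = "dysgenic (F1 often sterile)"
        else:
            outcome = "normal progeny"
        print(f"{mother} female x {father} male: {outcome}")
```

The same asymmetry is what makes P-element transformation workable: the recipient embryos lack the repressor, so the injected element can move when transposase is supplied.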
THE HOUSE MOUSE, Mus musculus

By the standards of C. elegans and Drosophila, the life cycle of the mouse is slow and cumbersome. Embryonic development, or gestation, occurs over a period of three weeks and the newborn mouse does not reach puberty for another 5-6 weeks. Thus, the effective life cycle is roughly 8-9 weeks, more than five times longer than that of Drosophila. The mouse, however, enjoys a special status due to its exalted position on the evolutionary tree: it is a mammal and, therefore, related to humans. Of course, chimps and other higher primates are closer to humans than mice, but they are not amenable to the various experimental manipulations available in mice. Thus, the mouse provides the link between the basic principles, discovered in simpler creatures like worms and flies, and human disease. For example, the patched gene of Drosophila encodes a critical component of the Hedgehog receptor (Chapter 18). Mutant fly embryos that lack the wild-type patched gene activity exhibit a variety of patterning defects. The orthologous genes in mice are also important in development. Unexpectedly, however, certain patched mutants cause various cancers, such as skin cancer, in both mice and humans. No amount of analysis in the fly would reveal such a function. In addition, methods have been developed that permit the efficient removal of specific genes in otherwise normal mice. This "knockout" technology continues to have an enormous impact on our understanding of the basic mechanisms underlying human development, behavior, and disease. We shall briefly review the salient features of the mouse as an experimental system.

The chromosome complement of the mouse is similar to that seen in humans: there are 19 autosomes in mice (22 in humans), as well as X and Y sex chromosomes. There is extensive synteny between mice and humans: extended regions of a given mouse chromosome contain the same set of genes (in the same order) as the "homologous" regions of the corresponding human chromosomes. The mouse genome has been sequenced and assembled. As discussed in Chapter 19, the mouse has virtually the same complement of genes as those present in the human
genome: each contains approximately 30,000 genes and there is a one-to-one correspondence for more than 85% of these genes. Most, if not all, of the differences between the mouse and human genomes arise from the selective duplication of certain gene families in one lineage or the other. Comparative genome analysis confirms what we have known for some time: the mouse is an excellent model for human development and disease.

FIGURE 21-23 Overview of mouse embryogenesis. (Panel labels: fertilization, sperm head expands; 1st through 4th cleavage; ICM (inner cell mass); blastocyst formation; implantation; uterine wall.)

Mouse Embryonic Development Depends on Stem Cells
Mouse eggs are small and difficult to manipulate. Like human eggs, they are just 100 microns in diameter. Their small size prohibits grafting experiments of the sort done in zebrafish and frogs, but microinjection methods have been developed for introducing recombinant DNA into mouse cell lines so as to create transgenic strains, as discussed below. In addition, it is possible to harvest enough mouse embryos, even at the earliest stages, for in situ hybridization assays and the visualization of specific gene expression patterns. Such visualization methods can be applied to both normal embryos and mutants carrying disruptions in defined genetic loci.

Figure 21-23 shows an overview of mouse embryogenesis. The initial divisions of the early mouse embryo are very slow and occur with an average frequency of just once every 12-24 hours. The first obvious diversification of cell types is seen at the 16-cell stage, called the morula (Figure 21-23, panel 6). The cells located in outer regions form tissues that do not contribute to the embryo, but instead develop into the placenta. Cells located in internal regions generate the inner cell mass (ICM). At the 64-cell stage, there are only 13 ICM cells, but these form all of the tissues of the adult mouse. The ICM is the prime source of embryonic stem cells, which can be cultured and induced to form any adult cell type upon addition of the appropriate growth factors. Human stem cells have become the subject of considerable social controversy, but offer the promise of providing a renewable source of tissues that can be used to replace defective cells in a variety of degenerative diseases such as diabetes and Alzheimer's.

At the 64-cell stage (about 3-4 days after fertilization) the mouse embryo, now called a blastocyst, is finally ready for implantation. Interactions between the blastocyst and uterine wall lead to the formation of the placenta, a characteristic of all mammals except the primitive egg-laying platypus. After formation of the placenta, the embryo enters gastrulation, whereby the ICM forms all three germ layers: endoderm, mesoderm, and ectoderm. Shortly thereafter, a fetus emerges that contains a brain, a spinal cord, and internal organs such as the heart and liver.

The first stage in mouse gastrulation is the subdivision of the ICM into two cell layers: an inner hypoblast and an outer epiblast, which form the endoderm and ectoderm, respectively. A groove called the primitive streak forms along the length of the epiblast and the cells that migrate into the groove form the internal mesoderm. The anterior end of the primitive streak is called the node; it is the source of a variety of signaling molecules that are used to pattern the anterior-posterior axis of the embryo, including two secreted inhibitors of TGF-β signaling, Chordin and Noggin. Double mutant mouse embryos that lack both genes develop into fetuses that lack head structures such as the forebrain and nose.
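The timing of the early cleavage stages described above can be checked with a little arithmetic. The following is a minimal sketch, assuming idealized synchronous doublings from a single fertilized egg; the function names are illustrative and do not come from the text.

```python
def divisions_to_reach(cell_count: int) -> int:
    """Synchronous doublings needed to go from 1 cell to at least cell_count cells."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (cell_count - 1).bit_length()


def days_to_reach(cell_count: int, hours_per_division: float) -> float:
    """Elapsed days if every division takes hours_per_division hours."""
    return divisions_to_reach(cell_count) * hours_per_division / 24.0


# Division interval of 12-24 hours, as quoted in the text.
for stage, cells in [("morula", 16), ("blastocyst", 64)]:
    n = divisions_to_reach(cells)
    fast = days_to_reach(cells, 12)
    slow = days_to_reach(cells, 24)
    print(f"{stage}: {n} divisions, {fast:.1f}-{slow:.1f} days")
# morula: 4 divisions, 2.0-4.0 days
# blastocyst: 6 divisions, 3.0-6.0 days
```

Under these assumptions, the figure of about 3-4 days for the 64-cell blastocyst corresponds to division intervals toward the faster end of the 12-24 hour range.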
It Is Easy to Introduce Foreign DNA into the Mouse Embryo

Microinjection methods have been developed for the efficient expression of recombinant DNA in transgenic strains of mice. DNA is injected into the egg pronucleus, and the embryos are placed into the oviduct of a female mouse and allowed to implant and develop. The injected DNA integrates at random positions in the genome (Figure 21-24). Integration is quite efficient and usually occurs during early stages of development, often in one-cell embryos. As a result, the fusion gene inserts into most or all of the cells in the embryo, including the ICM cells that form the somatic tissues and germline of the adult mouse. Approximately 50% of the transgenic mice that are produced using this simple method of microinjection exhibit germline transformation; that is, their offspring also contain the foreign recombinant DNA.

Consider as an example a fusion gene containing the enhancer from the Hoxb-2 gene attached to a lacZ reporter gene. Embryos and fetuses can be harvested from transgenic strains carrying this reporter and stained to reveal the pattern of lacZ expression. In this case, staining is observed in the hindbrain (Figure 21-25). Transgenic mice have been used to characterize several regulatory sequences, including those that regulate the β-globin genes and HoxD genes. Both complex loci contain long-range regulatory elements (the LCR and GCR, respectively) that coordinate the expression of the different genes over distances of several hundred kilobases (see Chapter 17).
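Since only about half of the transgenic mice transmit the foreign DNA through the germline, it can help to estimate how many founder animals are needed to establish a strain. The sketch below is illustrative only: it assumes each founder transmits independently with probability 0.5 (the approximate figure quoted above), and the function names are hypothetical.

```python
GERMLINE_RATE = 0.5  # "approximately 50%" from the text; treated here as exact


def prob_at_least_one(n_founders: int, rate: float = GERMLINE_RATE) -> float:
    """Probability that at least one of n_founders shows germline transmission."""
    return 1.0 - (1.0 - rate) ** n_founders


def founders_needed(target: float, rate: float = GERMLINE_RATE) -> int:
    """Smallest number of founders giving at least `target` chance of success.

    Assumes 0 < target < 1 and 0 < rate <= 1.
    """
    n = 1
    while prob_at_least_one(n, rate) < target:
        n += 1
    return n


print(prob_at_least_one(5))   # 0.96875
print(founders_needed(0.95))  # 5
```

Under the independence assumption, five transgenic founders already give a better than 95% chance that at least one will pass the transgene to its offspring.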
FIGURE 21-24 Creation of transgenic mice by microinjection of DNA into the egg pronucleus. Steps shown: one-cell embryos from sacrificed female; DNA to be injected into the pronucleus; embryos are placed in oviduct of receptive female.

FIGURE 21-25 In situ expression patterns of embryos obtained from transgenic mice. A transgenic strain of mice was created that contains a portion of the Hoxb-2 regulatory region attached to a lacZ reporter gene. Embryos were obtained from transgenic females and stained to reveal sites of β-galactosidase (lacZ) activity. There are two prominent bands of staining detected in the hindbrain region of 10.5 day embryos. The embryo is displayed with the head up and the tail down. (Source: Nonchev et al. 1996. PNAS USA 93: 9339-9345, F1C.)

Homologous Recombination Permits the Selective Ablation of Individual Genes

The single most powerful method of mouse transgenesis is the ability to disrupt, or "knock out," single genetic loci. This permits the creation of mouse models for human disease. For example, the p53 gene encodes a regulatory protein that activates the expression of genes required for DNA repair. It has been implicated in a variety of human cancers. When p53 function is lost, cancer cells become highly invasive due to rapid accumulation of DNA mutations. A strain of mice has been established that is completely normal except for the removal of the p53 gene. These mice, which are highly susceptible to cancer, die young. There is the hope that these mice can be used to test potential drugs and anticancer agents for use in humans. Although Drosophila contains a p53 gene, and mutants have been isolated, it does not provide the same opportunity for drug discovery as does the mouse model.

Gene disruption experiments are done with embryonic stem (ES) cells (Figure 21-26). These cells are obtained by culturing mouse blastocysts so that ICM cells proliferate without differentiating. A recombinant DNA is created that contains a mutant form of the gene of interest. For example, the protein coding region of a given target gene is modified by deleting a small region near the beginning of the gene that removes codons for essential amino acids from the encoded protein and causes a frameshift in the remaining coding sequence. The modified form of the target gene is linked to a drug resistance gene, such as NEO, that confers resistance to neomycin. Only those ES cells that contain the transgene are able to grow in medium containing the antibiotic. The NEO gene is placed downstream of the modified target gene, but upstream of a flanking region of homology with the chromosome such that double recombination with the chromosome will result in the replacement of the target gene with the mutant gene and the drug resistance gene. (Alternatively, the NEO gene can be inserted into the target gene.) There is, however, a high incidence of nonhomologous recombination in which recombination occurs illicitly at sites other than the endogenous gene. To enrich for homologous recombination events, the recombinant vector also contains a marker, the gene for the enzyme thymidine kinase (TK), th
rise to both somatic tissues and the germline. Once mice are produced that contain transformed germ cells, matings among siblings are performed to obtain homozygous mutants. Sometimes these mutants must be analyzed as embryos due to lethality. With other genes, the mutant embryos develop into full-grown mice, which are then examined using a variety of techniques.

FIGURE 21-26 Gene knockout via homologous recombination. Steps shown: homologous versus nonhomologous recombination; add cloned DNA to ES cell culture; select for NEO with neomycin (positive selection); select against TK with GANC (counterselection); create clonal line; clone "knockout" mouse.
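The positive and negative selection steps of Figure 21-26 amount to a simple two-filter rule, which can be sketched as follows. This is a deliberately simplified model rather than a laboratory protocol: it assumes that every random (nonhomologous) insertion carries both NEO and TK and that every homologous event carries NEO alone. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """An ES cell clone described only by which marker genes it carries."""
    label: str
    has_neo: bool  # NEO confers resistance to neomycin
    has_tk: bool   # TK makes the cell sensitive to gancyclovir (GANC)


def survives_neomycin(clone: Clone) -> bool:
    # Positive selection: only cells carrying NEO grow in neomycin.
    return clone.has_neo


def survives_gancyclovir(clone: Clone) -> bool:
    # Counterselection: cells carrying TK are killed by gancyclovir.
    return not clone.has_tk


def select(clones: list[Clone]) -> list[Clone]:
    """Apply both selections and return the surviving clones."""
    return [c for c in clones if survives_neomycin(c) and survives_gancyclovir(c)]


clones = [
    Clone("no integration", has_neo=False, has_tk=False),
    Clone("homologous recombination", has_neo=True, has_tk=False),
    Clone("nonhomologous (random) insertion", has_neo=True, has_tk=True),
]

print([c.label for c in select(clones)])
# ['homologous recombination']
```

Neither selection alone is sufficient in this model: neomycin removes cells that took up no DNA, and gancyclovir removes the random insertions that retained TK.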
Figure 21-26 outlines the method used to create a cell line lacking any given gene. Homologous recombination that occurs within a target gene (shown in green) results in the incorporation of NEO and disruption of that gene. Nonhomologous, or random, recombination can result in the incorporation of the disrupted gene containing NEO, and the gene encoding thymidine kinase (TK). Clones carrying both constructs survive exposure to neomycin, but the clones also carrying TK are subsequently counterselected by growth in gancyclovir (GANC). Clones containing the construct carrying the target gene with the NEO insertion are thus the only survivors. Once produced, these cells can be cloned and used to generate a complete mouse lacking that same gene (see Figure 21-24).

Mice Exhibit Epigenetic Inheritance

Studies on manipulated mouse embryos led to the discovery of a very peculiar mechanism of non-Mendelian, or epigenetic, inheritance. This phenomenon is known as parental imprinting (Figure 21-27). The basic idea is that only one of the two alleles for certain genes is active. This is because the other copy is selectively inactivated either in the developing sperm cell or the developing egg. Consider the case of the Igf-2 gene. It encodes an insulin-like growth factor that is expressed in the gut and liver of developing fetuses. Only the Igf-2 allele inherited from the father is actively expressed in the embryo. The other copy, although perfectly normal in sequence, is inactive. The differential activities of the maternal and paternal copies of the Igf-2 gene arise from the methylation of an associated silencer DNA that represses Igf-2 expression. During spermiogenesis, the DNA is methylated, and as a result, the Igf-2 gene can be activated in the developing fetus. The methylation inactivates the silencer. In contrast, the silencer DNA is not methylated in the developing oocyte. Hence, the Igf-2 allele inherited from the female is silent. In other words, the paternal copy of the gene is "imprinted" (in this case, methylated) for future expression in the embryo. This specific example is discussed in greater detail in Chapter 17.

There are approximately 30 imprinted genes in mice and humans. Many of the genes, including the preceding example of Igf-2, control the growth of the developing fetus. It has been suggested that imprinting has evolved to protect the mother from her own fetus. The Igf-2 protein promotes the growth of the fetus. The mother attempts to limit this growth by inactivating the maternal copy of the gene.

FIGURE 21-27 Imprinting in the mouse. The permanent silencing of one allele of a given gene in a mouse. As outlined in the text and described in detail in Chapter 17, imprinting ensures that only one copy of the mouse Igf-2 gene is expressed in each cell. It is always the copy carried on the paternal chromosome that is expressed. Meiosis removes the existing imprint, and a female or male imprint is then established in the eggs or sperm.

We have considered how every organism must maintain and duplicate its DNA to survive, adapt, and propagate. The overall strategies for achieving these basic biological goals are similar in the vast majority of organisms and, therefore, may be examined rather successfully using simple organisms. It is, however, clear that the more intricate processes found in higher organisms, such as differentiation and development, require more complicated systems for regulating gene expression and that these can be studied only in more complex organisms. We have seen that a wide range of powerful experimental techniques can be used with success to manipulate the mouse and to explore various complex biological problems. As a result, the mouse has served as an excellent model system for studying developmental, genetic, and biochemical processes that are likely to occur in more highly evolved mammals. The recent publication and annotation of the mouse genome has underscored the importance of the mouse as a model for further exploring and understanding problems in human development and disease.
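As a closing illustration of the parental imprinting rule described above for Igf-2, the following minimal sketch models the imprint as a single flag that is erased and reset each time an allele passes through a germline. It assumes only what the text states (the silencer is methylated in developing sperm, left unmethylated in the oocyte, and methylation of the silencer permits expression); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Igf2Allele:
    """One copy of the Igf-2 gene together with its silencer methylation state."""
    name: str
    silencer_methylated: bool

    @property
    def expressed(self) -> bool:
        # Methylation inactivates the silencer, so the allele can be expressed.
        return self.silencer_methylated


def transmit(allele: Igf2Allele, parent_sex: str) -> Igf2Allele:
    """Pass an allele through a parent's germline, resetting its imprint.

    The old imprint is discarded; the new one depends only on the sex of
    the transmitting parent, not on the allele's previous history.
    """
    if parent_sex not in ("male", "female"):
        raise ValueError("parent_sex must be 'male' or 'female'")
    return Igf2Allele(allele.name, silencer_methylated=(parent_sex == "male"))


# An allele inherited from a mother is silent in her son...
from_mother = transmit(Igf2Allele("A", silencer_methylated=True), "female")
print(from_mother.expressed)  # False

# ...but the same allele is expressed once the son passes it to his offspring.
from_father = transmit(from_mother, "male")
print(from_father.expressed)  # True
```

The sequence of the allele never changes in this model; only the methylation flag does, which is what makes the inheritance epigenetic rather than Mendelian.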
Index Page rc fcrences in ¡talia; mfer 10 ¡nronnalion found in
~ures
A A complex. 385 A ONA, 106, JOB A sites, 430. 432, 450. 454 abdomino/-A (oM-A) gene. 627, 631-32 J\bdominaJ-B (Ahd-B) gene, 627. 628 abortive initialion . 358- 59 absorbance, DNA. ll0. 110 acceplor !lites, 381 accommoda tion, 442 acetate. 46 acrid ioe.246 actin filaments. 582, 582 - 83 activated molecules, 62 ar: tivated states, 56 ectil'ating regions, 492 acll vatio n energies, 56. 57 acti valor bypass experiments. 492, 493-94 acll valors assc mbly, 555 chromatin alterations, 54! eukaryotes, 531-33, 534 - 36 s igmli integmtion , 544 - 45 synerg}', 597-99 lrens lalioo . 483 Iransporl.555 adaptar h ypolhcsis. 31 - 32 adaptors. bindins, 577 ADAR (adenosine deaminase acl ing on RNA). 405 adenine. 9 9, 10 0. 100, J01 , 102 adenosine nronophosphate (A MP). 63 adenosine triphos phate, see ATP (adenosine triphospha le) adenavj rus-2 genome. 399 adenovirusell, 398. 399 adenylation. tRN A. 4 J 8, 418-19 affinity, oooperative binding ando 51 6-17 alfinity chromatography, 674 - 75 agarosc gel electrophore~is, 214- 15, 216. 648-49 Agassiz, Jean L" 5 alaninc (Ala. Al. 50. 73 alanine-tRNAc". 423. 423 alkylation.244-45 aHeles. 7, 26(1 a llolaclose. 89. 496 all()!; teric efrectDrs. 88 allosteric regulalion. 88 a Uoslery activators. 485 CAP conttol . 496- 98 cooperati ve bindif!g and oS t '1 gene tegulatiofl. 487 lac repressor contrul. 496 - 96 protejn.88-9 1 traoscripti unaJ activator runclion •. 500
and tablcs. a hlllice; characleristics, 74- 78 coilod-ooils ando 79 dipole momenls, 85 polypcpticlc chaio foMed into. 76 proteio stTUcture and, 72 alternativc sp licing. 394 - 400 constitutive. 394, 395 dCSCri¡llion, 380 mRNA,583-85
regu laled. 394, 396 Ames lest. 243 amino acids activation.63-64 aUachmcn t t o I RNA, 417 -23 incorpornl ion by mRNA. 465 - 66
non polar side groups, 52 protejn incorporation or, 46 7 in prolc ins, 73
stan -alion.566-67 struct l.lI'eli. 72.422
synonyrns, 461 lrip1CIS. 35 --,36
IRNhand, 415
lurns in protein structure, 76 amino groups, 48 ami noacyl synthetases, 63 aminoacyl-tRNA a mino ac id altar.hme nl, 428 binding, 468 ri boSomc selection. 441-42. 443
aminoacyl tRNA synlhctascs. 41 5 c1asses o r, 419 editing pockets , 422 formatí on , 42]-22 fu ncl;on.918 - 19 lraoslation end. 411 tRNA rocognition, 42U anaphase, 146, 147, 148, 149 anterior, definilion, 578 an lerior-poslflrior pallcminll, 628- 29 onli-oonrormetion s, DNA, 107- 8 anlibiolic.s, 453, 453 anl ibodies, 338, 339 antibody-antigen complexos, 47, tUl anlicodons, 463, 463. 465 antiparallcl ¡3shecls, 76. 77, 79 An¡irrh¡num (snapdrasons), 328 antisense RNA, 329 anlilerminetion, 487, 523-24 , 562 Anlp (Antennopcdio) genes. 622,622. 6. epolipoprOlein -B gene . 404 .104 apoptosis. 698 (lIVBJ\Doperon. 503. 50 3-4
/ndex
714
arabi nose, 503- 4 AmC, 50:1-4 Are rapm6SOJ", 400 An;hOOll, 275 archilectural prote.ins, 802 arginine (Arg, R), 73 Argoll3ut prolein ramily. 568 At1eIT1io (sea munkeys). 630- 31 arlhropod s, 630
Asdd;ans, 586 oshl.58 1
Ashl repressor. 580- 83. 597 asparagine (Asn . NI. 73 aspartale lra nscarbanwylaso. 69 ¡u;par1ic ac:id (Asp. DI, 73 Astbury. William, 21 k T base pairo 103 II.T - AC spliceOSOme. 400, 401 (ltoms. 42 - 43 A1P (nrlenosine Iriphos pha lcl. 206-7 dinucleot ide fa lda Bnd hinding oro8 2 onf'.rs)' donation. 6 1 oncrgy Iran8fcr. 58 on:lyme billlling of. 83 group- ITlInsrer fflactiOni and, 62 - 63 hydrol}'SiS, 61-67 allenuation. 5 05. 507 a lltmUlll ors, 505. 507
oltPettachrncnl si/c, 304 AUC tri p lel. 38 aulonomous lransposons. 31 3 autonomuus l}' repliC8lin(l sequences (ARSt;) , 214
aul oradiographs. 36. 653 aUlcresulalion . 519 Avel)'. OswaJ¡! T.• 20 -21
bacteriuPhage P22. 496 bacteriuphll8a PI Cre. 297 bacleriophago S P01 . 49
/J sheet!>
B B complex. 38S B ONA, J06, 107, IOR
lJocillus s ub/ilis. 584 - 87. 587. 687 BACI; (I)!1clflriAI iHlificial c ruomosomes). 667 bacleria. r;ee o. /so s pccific bacteria c)'lologir.al analysili. 691 - 92 gcne regulalury elernen!s. fi30 genomc sequencing. 663 - 64 growl h cu",c5, 687 as models. 687-93 phage íllroctiun. 521 , 684 ribowme. 425 RNA ltyn!h08is. 34 tnlfl5Cription cycle. 353-63 ltanSCTiplion Initialion. 4t18- 504 lrensformillions in "\l'\Iloooe. :to bader.oph8f:e <,/129. 503 bacteriophagc A. 111 doning veclors. 68ti map o f circular fuml, 5 13 as mode l, 661 promolers, 5 J3 reoonlhinlll ion si!es. 294. 294 - 95 reguJa tioll,512-25 lranscti ption , 5 14 bac leriophll€c A Inlegrasc, 297, 303. 303 - 01 bactcriophego Mu, 331 -34 . 332
ch8racleristics.74 -78 ONA dist orlion a nd o366- 67 fumUl. 76. n hyd rogen bond<;, 77 prolcill struc ture a ot! . 7i twi st. 7R bicoid mRNA. 599- 002. 600.604, 607 Bicj'cJus (ll1ynoll o. 634 biuinfunnalics. 605 - 6. 008 biulQgica/ divcl'$lty. 616 - 19 biomulccules. lroo enflrgy, 58-69 bivalonl allachmenl . 146 Blair, Tuny. 667 BLAS'T (basic loca / il lignmeol scareh 1001) r 670 -7 1,672 blaslOCySIS. 706 bloonlyt:in.245 bond angles. 42 boundary clernenls. 53 1 Brachet , lean, 30 brachi opods. 63 1 bnmc h m isrBtion, 260. 262. 276 branch polnl site, 381 Bn:mner. Sydney. 36, 692. 696. 699 Bridg4!S. C8 lvi n B., 10. !J . 701 bromOtlomainll. 112-74.540 S-bromouracil. 245-46 bubble 1ltC8, 2J 5 burst siw, 6115 b"d genes, Dro&ophll«. 620
e
cell exlmc..1s. 672-75
e complex, 387
censurfuce receptors, 552, 579. 579
C ·tennimtl domllins (CTD), 492 CA repeatl;, 237 Caclus protein. 591 , 591 Cmmor/Jobdiris e/egOIlS body plan, 697 expression paIlOJJ7ls, 575
cell-Io-c.e ll r.
g(:Il Umll size. 636 lir~ cycJe. 696-97. 697 linooges.697-98 as model, 681. 696-99 pbyla.614
RNAi diS(~overy, 698-99 Te l e leQlellts, 334 - 35 CAG repoolli, 237 CAK.90 calones. 43 5' cnp, 414 cAMP (cyclic aclcllosine monophosphate), 83, 83 - 64 CAP- aCTO-ONA comp leJ(, 493 CAP (cat;;bolite gene Ilctivator pmlein) aIJ ost.erie,: oonlrol of. 496 - 98 combímllorial con trol , 499 DNA binding motU, 493-gfi durnajns,83 inlernction wilh cAMPo 83-84 loe promoler nCliva tioll, 492
RNA'JlO1yrncrase binding. 489-9Z surfaces, 492 lranscription and, 488 Cllpping. 371, 372, 3 72
cawonyl bonds, 56 CIlrbonyl groups. 74. 74
carooxyl groups, '18 CaSpC~OIl, Torbjorn. 21. 30 U1tabolile ge ne activlllor protein. see CAP (c.1Iabolitfl gene activator prutein) catalysts, activation energies and, 56 l".ati!nation , 11 7 ¡k:atcnin.596 CBP,555 Cbx (contrabithomx) mutation, 625-26 cccDNA (covalently clOIitld, cill:ulilr DNA), 11 2- 14, 113 cee repoots, 237 edc6 recruitmen t. 223 Cdks (cyclin-dependent kinases) activat ion, 90. 9 1 pre-RC actival ion , 223 promoter regulati on, 485 regu lation of pre-RCn. 226_ 226- 27. 227 cDNAs kopy DNAsI. 324, 656, 657 edil rocrultmenl. 223 red (cell death defectivo) muta nlS, 696 r.ell cyele cllCckpoinlS, 146 -46 chromosome replicalion, 223 chromosomes duriog, 141 - 43 gap phases. 146-46 pre-RC regulalion during. 226-27 cell dp.i!th pathway, 696 cell division antibiolics ando 453 chmmosomes during. 136-41 DNA melh ylat ion pattcms and, 561 inherited gene expression, 560- 62 origins of replication ando 219
716
Index
dgen6.514 CU proleill slabili l)'. 522-23 0000 in,estinali:; (sea squirt). 584. 585, 615.668 Clol1ll life C)'d e. 585 dSIrOns. 692 cJaSlogenlc IIgenlfi, 245 Clinlon, CilI , 667 clones. idcntific.allon, 657 c loning. ONA. 654 c10ning vllCtOrs, 686 t:106t!d compl(t)(e¡¡, 352 Cochran. WilIlam. 22 cOOUIl - .. ntíoodon IJlliring. ~62 cudons, 412 aminoadd synon)'mS. 461 8ssignmenlfi, 468-69 dt.'finUJon. 36 direcl choln termination. 463-64 mixed oopol)'mers. 467 nonsense, 38 Irnnsloca lioll. 440 tRNA and, 415 rohesins. 142. 144. 145 coiled
ONA. 151
rore enzymEl6. 349, 356 oorehislunes. 153-54.154 oonservation of. 163 ONA inlerot.1íollS. 156-59 N-terlll)naltails, 156 structullll rold, 154
corepressors, 504 rurn (Zt!a maysJ. 11. J J. 328 Cotnlf15, Karl. 6 OOV810nl bonds, 41. 42. 44 oovnlenl ly c1osed, cin;;ular ONA (t:eeONA), 11 2- 14. 113 cuxll gene. 405 CPSF (dcavage and polyadenylalion specificily factor), 374 Cro recombi nascs. 297. 300. 301 Creighlon, Hllrriet B.. 11 Criel:, Froncis H ... 22, 31-32, 692 ero (¡:rmlml of repressor and Olhcr things¡ pmlein, 495, 51~ 517-18.518 CrtJlllSl:llll. bal..1eriuphago:s. 685 -8 6 CI'1.I56ing over ch romQSome IMpping a nd o12 c:ylologirnl View. 260 dil6l..-riplion. 259 gtlne Iinkage and , 11 durillg meiosis, 148 I.hoory of. 11 thre&fSl(."'tor. /3 two-factOf, 12 c russover products. 262, 266 CRP h:AMP receptor protein), 488 cruslaream. 630-35. 63 " 633 CstF Icloovage slimulatioll factor), 374 oCJ1) lcarboxy lenninal domain), 356 Cl'Ft. 537 crp, 89 c;ot-o nd·paste mechanlsms, 314- 16 , 315 cut gll llll, 541 cuticltl!;, 595 cydin-dt::pondenl kinllses (Cdks). 90 r:ycllns, 90, 485 (.-yclobu!anc rings Ihymint diroor.;. 245 cy5teinc teys, C), 73, 423 cysleinyl-IRNAc,o, 423, 423 C)'lidinc dcaminasc, 404 cytochrome P4SO, 636-37 cytokllle6is, 146, 14 7
cytulogiOll I analys.is, b8ct.,..iol, 691 -92 eytOllim: I\:C incompalibillty. 102 doomination, 244 p.1iring. 101 struelure, loo laulomeric slates, 100 c)'loskelelon. 576. 582-83
o IlBm rtlftth )·¡ transfenl!ill, 21 7 Dan¡ rntlthy lase, 240 IJom me lhylation. 24 1 Oarwin. Charlefi-, 5. 613 uauers. t;9(i [)(, Vries, Hugo. 6 dtlltCelyI3Iion.556-58 OEAlJ.box helicase proteins, 384 deamillillion, 244. 404, 404 - 6 dccatcna lion. 1 17 deo::od ing ceolers. 425 deRciency mo pping. 701 Dt1formed (DfdJ gene, 627 utlfonny lascs. 433 dllge.n6tllC)', genelic code, 461 - 69
Index degrado li ve palhwoys. 58 dehyd rogcnase enzymes. 82-83 Dclbriick. MOll:. 681. 687 Delta molecu le . 587. 588 denalurution. 71. t08 - 11 . 109. tlI dlms ity gradirm t r:enhifugn tion. 26. 27 denlicJe hnirs. 595 2' .deuxyoflellosine. 99 :r .d eoxYlldenus ine 5 ' phosphal~ . 99 dooxynucleosi de triphosp ha ' P.!I. 1 HZ dooxyrihonucleases. 2 1 2' .d coxyribose. 98 dLVurulRlhm, Z4~ d1.1C!rmined cells, 590 deu t eru.~I om~, 6 14 - 15
devolopment D . me/allu~:usler. 592-94
gene ACli viry, 576 gtlne reguJAlio n dUring, 575 - 611 mouSIJ. 706-7 2' -.J '-didoo¡o:,yguanos illE: trip hos phillll [ddGTP). f,62 2 ' '.J' -dideoxynudllOtidf!ti (ddNTPs ). 6131 . 6U l , 862 di/ Sites, J07-8. 308 7.8-dihydru-&-OJmguonine (moC). 244 di hyd rouridiOll (O), 41 5, 4 15 din udoolide folds. 8 1. 82 d lnucJL'Olide rl!peals, 137 di pl uid cells, 132,148 , 497. 693- 94 d ipole mumenls, 45 diplet"llns. 6.15 dirw:1 reJ>RIIl. 295 discri minalors. 420 d ispl':rsive processes, 27. 28 DJI (Dist o/-Joss) genl':, 631 - 32. 6.1.1, 634 Omr.l protein . 283. 283
DNA A formo 100, 100- 7 Hocessibil il )'. 1 75 IImp lifical.i on. 658-60. 659 B foml . 106. 106-7
OOnl. l68 lutlokinglerminlll pairs, 70 circular, 111 , 117. 3V 7. 30 7- 10. 1273 dOElval!ll.276 - 77 compnclion,11:i2 - 63 core, 151 dcnol uralion. 109 doub lc helix. l/ 1 -:al gel I! leclrophDl"fJ5is. 648- 4t1 genomic. 212. 577 helical slructure. Y8 Iw:liK formalio n. 69- 71 , 70 histol"lll Slabil izati on of. 159 hybridiza tio n. 651 -52 Inoo ling. 651 Iinkflr. 151, 153 ma jur gfOOvO of. 84 - 85 mmh ylalion . 558- 60. 559. 56 1, 710 nuclCClsomll·ossocialfJd . 16 (; o ucll!Qlides, 2S packoging. ISl polv nuc1eotid e m ajns, 98 - 100 polypnptido dloin , 23 prOCUf"SOni. 64 pnbf.'s, 652 -53 protl'!in afflnity furo85. 48t,. recugnilion by proteios . 8 4
regula I0f)'.596 rep lico ti on, 23. 181 -233 RNA compare
lopoisomers. 120 lupology. 111 - 22 w~nk bonds. 70 X-tay photogra ph . 22 Z form o 106 ONA, mismatchctl , 191 -112 ONA· binding prutcins. 84 - 87, 487 h ls"lo ne-Ii-etl ONA and , 166 leucine zipper ramily. 80 mcth yla tal sequence n!l:CJgn itiOfl . 558 nuclaosome posilionl ng depeJIIlctlt 011.1(',8. 168 - 60 ret:eptor bi nding. 577 ONA-bi nding site de locliun. 490 ONA dOl1ing. 654 ONA demage. 242- 46, 246- 57 , 247 -48 ONA excha ng,e, 300 - 302 ON A foolprioting. 490 ONA (ragmenl s, 660-63 ON A glyoosy lll5es. 248, 250 ONA srm~, 116 ONA 11I:!l ic<1!;es. 194 - 95 biochernical assay rOl" acl;vily. 106, 197 ONA "'01111 helicase binding lo. 2 11 E. col; mooe l for inilialion. 220 Nnd ion, 195. 222 rullClio n al replical.ion furb, 199. 199-200 iNll:live !>Iole, 219 in leracliorl wi th pri mases. 210 poll1rity. 195, 196, 197 ONA ligase5. 194, 194. 236 .251.3 16. 655 ONA loops. 486 - 87 . 519 - 20 DNA· rnedia lcd lmnsforrnal iullS. 681l ONA me1 hylll5e1i. 556-62 DN" lI1 icronrrays. 694 ONA m icrulialeUites. 237 DNA pholol)'ases, 247 IJNA Poi n/ primase. 200-201. 208. 226 DNA PuIS. 200. 208-9, 226 ONA Pu l e, 200. 208-9, 226 ONA PolI, 24. 25. 200 ONA PollI!, 200. 236. 241 ONA Pul 111 helicase, 211 ONA Poi 111 holoonzyme. 200. 206. 208 DNA pulymerases, 24 . 25. 500 o/so s pecific ONA polymcnI &CliV e. sile. J8 7 ¡:IIlalys is.168-91 ONA &ynthesis and o 184-65. 205-9 dumain~ , 186 fl.ngl..'r dornaio. 188
fIlI1CI;O" 6, 201
ga p fi lling by. 251 mechanism of. ' 84-92 meta l ions, 187
71R
1m/ex
DNA pol)'1l1eraStl (Ccm t inuedJ pa hll d omain. 188 pruct!!>S ivily, 190 slid ing d lllllps and prucess ivil y. 2U1 - 4. 2U4 b-pec::iali1.8liUII o f. 200- 205 s ludure, 186. 166 - 88 switchi ng. 201 , 202 leloll.eraSf!. 140 Ir.mp tale billd ing. 189 lem p lole P.11h Ih rough. 100 IlI umb dU"'!'iill. 168, ' 1)1 Y lamil)'. 254 _ 57 y fami ly p hylogt:.}otir. lree, 256 DNA p ri mase, 220 DNA rocog1lillOfl. 53.'; DNA repair. 247 ONA repll ir polymel'llloc. 316 DNA rep licalloll E. (;'U1i mudc l for inili il!íuu. 220 fin ishi ng, 228 - :i2 hi slone inhetll once a nd o1 76, J 76. 1'18 inoolllp le lc.22J in il i;,lio ll ill cu kllryo les. 223 - 26. 22'1- 26 inili" lic)n in proMI')'ul e$, 22'1 - 26 ¡nilialio n of. 212 - 14 melh)'latiun "oIltlcrnll , ::;61 ONA s lrnnd Irn fl sfcr, 316 DNA lrilllsposiliotl cul-a nd -posle meclul1lismIi, J I 5 lIIer;;ha nism . 314 - 16 non lra llsftlrred ~1rll n d cle8vagtl, J 16- 18 , 3 1 7 repLiCOlivll nlflCh.1 l1ism fl)l", J IIJ-2 0 ONA lra nsposons. 312 - 1 J Oria A protei n. 8J. 2 14 . 2 17 DnaC helicose looder, :l1 9, 220 DNA - hi slo lltl inleroc1io ns . 156-5 9. 157. 158. 165-66 DNA.se 1, 104 , JJ 4 Docl protein. 275 Dobzhansky. Thetx!OSiUIi, 16 dOll 1<1ins . d di "i!lo n . 81 d omi nan l Irai lS. 7 d onar s iles. 361 Dorsal proleln , 590- 91. $96, 596- 98. 623 Dorsa l-Twist syno rgy. 598
d orsal·ventrul plll!crn;llg, 5t10 -~~ do ublo hel;" base pa iring. 100 busll pai rs per t um o104 com ph:lllentllty IICXjUtlllCe5. 101 - 2 UNA. 21-28 groow!!l. 10:1 - 6 helicol lleri odicily. E2 1 llIu JlipJa CQflfO n lllllions, 10fi- 7 prope.llerhvisl, 101, 107 righ t-handed. IOJ RNfI..123 unwind ing, 104 - 95. E95 dOllb/c. sex gene, 39 7 duuIJ Ie-strand br'ea k repair. 2 47. 253 - 54, :lfN -66 . 265 doub lN lm lldcd bNlllks 105&). 264 . 266, 266- 61. 280- 82 do ubl C-Slnmdcd RNA IdsRNt\.s). 568
downslream prolUoter elen>cnt {DP\:.l, JtiJ-64
Dpp wlllrol oomplex. 100 Dro9op/ll/o melanogastilr (f"lit fly) A flfpge no5, 62:.!. 622 Cor.tu s pruloin . 51'1 1 cl! rornosornll 2, E5
cut genfl, 54\ Do rsal p roHlin. 500- 01 [k.ü.llJle sex (O'<;X). 565 doubl" ' !reJi genfl. 397 DSCJ\M8f!OI;. ,'192. JQ4 enlbryogenesis . 500 - 609, ."i92 eflgroiled/f'ngmílod IlI ut:ml!l, 10 2 genome mnps. 700. 701 genome si:r.e. 636 ha lf. p;nt ¡¡rotein . 397 homeolic gelle com plexes, 629 homeolic gP.nfl urgll:nit.1Uoll. 621- 29 l-1o.~ genes, 1;28 HSP70gf3 nlJ, 562 - 63 /lIInclilJacJ.: {,'t!1l0, 59 2. il1logi.na lllisks, 700
Krií l' PP'-J 11111 1111115. 595 life cyd e. Im!J. fill!J - 700 /IItlrinf'r e lements, J3 4- J5 as lI1od"' . 681, 691l-70 5 mUlulll ,:lo nt.'s . 1-1 OsklJr prolein. 59:1 u ... erview of d evclop me n t. 59 2- \14 p!1 ' Ienl dnll.'flll ining genes. 620 Pa:d j gene . fi 2 1- 22 ph~ l ll . 6 14 SflgITIenlfll ioll,509- (;00 SllX IIelerll1 inu!ion, 563-6 5. 56-1 . 565 scx·linl:t«I ge nfl!S. 10 Spiilzle p m luin . 591 slod y (j f lIIu11l1l15, 10 TolI recepto rs. SU I Il'Ilnliseoa m pw.ssl(jll. 624 w hil e-eye Ill ulanl gene. 10 . 10 wild. lY)lfI gelle, 10 DSCAM gl:!lle, 392. 3~14 flyad a)(06. 156 d ystrop h in genC'. 380
E El prulcin. 76 E ,i les. 4:m
t:.a rl y (E) r.(lm pl li~. J 65 t.'CdySOWllIlS. 614 - 1 ro Ecolll l'f!Slri r.liOIl 'lm:yml:!. 6 50. 650 . 65 1. 6 5 1 Edmall degnu]¡¡lioll. 676 . 677 EF-G. 4 4<1 - 45, 445, 4 4Ij.45 0. 4!i 1 E2F re lOOSfl. 554 EF-Ts. 446 EF- Th . 441 . 442. 446, 448 t~. 576. 600. 106-1 alF2.4:15 ... W5B,435 eIF4 F. 43ti, 437 aloctrull pIIirs. 45 e loclroncgoli ve aloms, 45 elw ro pho fl..'Sis , 1 20. 120 , 2 14- 15. 648 _ 49. ti elElCtrapo!iili ve ¡lI ums. 45 eler.lros lalic foI"Cf!S. 42-43. Ifl7 - U8 eloll8llli nC polymemse. 35Q elollglllion Po i U, 310 - 11 RNA synlh csis befol'l'! . 356-59 lro nscriplioll cycle. 350. 351. 352, 562- (;3 Inm51111¡01l. 4.. 0 - 41\ e long.1lio n fal,10rs. 440
In< EF-G, 444 -46,445. 446.450
EF-C and RRF combinalíon, 45 1 EF-Ts . 446 EF-Tu. 44 1, 44 2, 446, 448
CT P excha nge. 446 s lruc hlral comparison. 446 embryogenesis . 590 -609, 5.92. 706 ernbryos, 576. 578. 587- 86. 6.14
emply 'arge! sheli. 330 EM!\ (ethylillelnll nesulloll ale). 700 Enl1ll\' ex pressiull, 634 cad rltplication problem . 228-30. 229, 232 .mdonuclcHSes. 649-51
onrlonucleolytic d eovage. 248 engrr¡jledl el\grtli/lJd h OIllOZygOll s rn ul a n hl. 70 2
enhanCflosomes, 546. 54 7 enhancers. 530 . 60S -6, 606- U entropy. 44 mlviroml \e nud fllctors . 242 - 44
enzynlCS. see olso specifi c cnzymcs ilclivalion cncrgies am I. 57 allOllh:,rir. transfamllllions anJ, /18
atlacnmllnt lo s ubstrate:;. 53 reactioo rot es ando 57 Ephrussi, Boris. 17 epige nelic inhe ri tllnce, 709-11 cpigenelic: regu lation . 560 - 62 cpígenCli(; s wi tc he!>, 562 epitopes, 675 oqui Ji brium conslan!!> (K... ) biosynlhelic reaclions. 60 c hcmicallJonds.43 t ne11Jies of aclivalion and , 57 fn:e ellergy a n(1. 44 free energ}' chauge and o 5 7 ERCC1 -SP F. 251 Eschericllia co!i araBADoperon.503 - 4 el! pro te in s tabililY. 522- 23 drcularchromosome. 111 DnaA pru tein. , 14 go l genes. 4!)9 ga m gene~ , 49!l gell o" .e. 1:14 as modal, 681. 687 model for in itiation of DNA replica tion. 220 MulS prolein. 238 nucleot ide 0)(C i8ion ropair enzymes, 251 oríC. 213 phage A regu la t ion. 512 - 25 regula tion or replication. 217-2 1 replisome, 210- 12 riOOso mal proh>.in ov orons, 5 11 Up !oader sequcllce. 413 XerC fUIll: tioll. 297 Xr.rO function , 297 EST (oxpressed seqLll:! nCe lagsl. 688 clhidium lons. 120-22, 122. 246. 648- 4!j Eubactoria. 275 e ukaryotes c h romooolllll l DNA. 135- .16 chromosome makeup. 131 . 132 cumbinatoria! oon trol. 547 - 49 clIcc t uf anlibiotics. 453 gene dens it )' in genome, 133 gene rcgulatioll in. 529- 73 genes. 379
hOlllologous recombination, 277- 78 ini liation of DNA rep lication, 223- 28 ITIRNA. 414 - 15. 4.18. 440 protein synt hesis, 28. 30 R(!CA homo log. 275 ribosomes. 42to, 4;j5 - 3G stan o.:ooons, 41 2 lransc riplíon , 363 -76, .~30. 5 38, 5 49-5 1 Irons lation , 424 (el/fj) tJV8.n-.~kippll{/g,me. 604. ijM - 6. 6ÓS. 6u6. 67'2. ellúlution c unserlled inlron sequences, 401 cQnservoo mechanisms, 531 - 37 of ,!illcrsily. 613- 41 CXOIl shuffiing. 403 gene oouscrvalion. 6 14- 19 gene duplicatioll, 6 16 - 17 gene expression r.hangp.s. 619-20 genome, 635- 39 huma n origin s, 635 - 39 insocl w ingli. 632'- 35 Mediator Com p!ex r.onserval ion , 369, 369-70 morphologica J changcs . 630 - 35 new genes. 618 RN A in, 126 p'xc,ision repllir sys lp.IlIl;, 247 exon shllffii ng, 400 ... 403 cxonic npli cing 'm lllJm:ers (ESEs), 393. 396 'CXOJlS, 379, 402 t>-'lolludellSes DNA Poi I. 200 proorrelldi ng by. 11)1 - 92, 192 remova! of miSllmtchf!d ON A, 242 ex púnontial phaoo o r gro\o\'th . 687 exprtlssio n wlclors. 5 04, 6 ~5 exlracell u lllr llHltrix. 579 eyes, ectopic. 621 , 62 1- 22
F F-fac tor, 666 F' pJ;¡smids. 68fl f -plasmids, 688 factor binding r.enters. 441 , 44 7 fa lly acid biosynthesis. 139 fibroblast growth factor genes (FCF). 6 15. 8J 5 Fís (facto r for j ll\~I'liion s timu la tion) prolein. 306 - 7 Fis her, Romlld 1'.., 15 n;¡ge lH ns, 305 Oanking hosl DNA, :n5 fijA promote r. 305 fl¡8 promoter, 305 notorp lales.588 n011l' boot!e (Tribolium oosWneum). 6
Franklin, Rosalind, 22 free energy in biomolecules, 58-59 concept of, 44 conservation, 58 molecular stability and, 55-57 secondary bonds, 52 free energy change (ΔG) ATP hydrolysis, 61-67 concept of, 44 equilibrium constants and, 57 high-energy bonds and, 58-59 in metabolic pathway, 60 freedom of rotation, 42 Frizzled receptors, 598 fruit fly. see Drosophila melanogaster (fruit fly) FtsH protein, 522 FtsK protein, 306-10, 309 ftz (fushi tarazu) gene, 622-23, 623
G γδ resolvase, 297 GAGA binding factor, 562 Gal80, 554 GAL1 gene, 532, 533, 551, 551 gal genes, 499 Gal4 protein, 532, 532, 534-37, 554 Gal repressor, 503 β-galactosidase, 486 galR gene, 499 gametes, alleles in, 7 gancyclovir, 708 gap genes, 603 gap phases, 146-48 gap repair, 316 Garrod, Archibald E., 16-17 GATC sequence distribution, 240 G:C content, 110 Gcn4, 565-67, 566 GDP/GTP exchange, 450 gel electrophoresis DNA topoisomer separation, 120, 120, 121 nucleic acids, 648, 646-49 polyacrylamide, 675 proteomics and, 677-78 sequencing, 663 gel filtration chromatography, 673, 674 gel matrix, nucleic acids, 648-49 gel mobility shift assays, 490, 491 gene conversion events, 264, 284 mating-type switching and, 266-88 meiotic recombination, 289 mismatch repair and, 200 gene density, chromosomal, 134 gene duplication, 616-19 gene finder methods, 668 gene finder programs, 668 gene linkage, 9-11 gene Q, 523 gene silencing description, 531 HO genes, 580-83 mechanisms, 556-62 telomeres, 556, 557 gene-enzyme relationships, 17
general transcription factors (GTFs), 363, 367-t generalized transduction, 688-89, 689
composition, 403 conservation, 614-19 definition, 7 density in genomes, 133 linkage, 10 mutant, 10 proteins and, 16-17 regulation after transcription initiation, 504 regulation during development, 575-611 regulation in eukaryotes, 529-73 regulation in prokaryotes, 483-527 suppressor mut
globin genes, 543-44, 617, 618 glucocorticoid receptors (GR), 555 glucose, 42 glucose-6-phosphate, 59 glutamic acid (Glu, E), 73 glutamine (Gln, Q), 73 glutaminyl aminoacyl tRNA synthase, 421 glycine (Gly, G), 46, 47, 73 glycosylase, 248 Gooseberry protein, 616-18, 617 goosecoid gene, 598 Gre factors, 359 green fluorescent protein (GFP), 78, 691-92 Griffith, Frederick, 20 group activation, 62 group-transfer reactions, 61-67, 63 growth curves, bacterial, 687 GTP, 446-47, 450 GTP-binding proteins, 435, 447 G:U base pairs, 123-24, 124 guanine (G) modification, 244 pairing, 101 structure, 100 tautomeric states, 100 van der Waals forces, 46 guide RNAs (gRNAs), 405 gynandromorphs, 702, 703
H H19 gene, 559-60 Haeckel, Ernst, 6 hairpin DNA, 318 hairpin loops, 79 hairpin RNA, 361 Haldane, John Burdon Sanderson, 15, 17 half-pint protein, 397 Hammarsten, Olof, 21 hammerhead ribozyme, 125, 126 haploid cells, 132, 146, 581, 693-94 haploid-specific genes, 549 H2A.z, 165 heat denaturation, 109 heat shock σ factor, 499 Hedgehog, 700 helices, handedness, 103 helix-breaking residues, 74 helix-loop-helix motifs, 536, 536 helix-turn-helix motifs, 85, 354, 493, 491 hemoglobin, 29, 2R, 37 Hemophilus influenzae, 663, 664-65 heredity, 8-9 Hershey, Alfred D., 21, 681, 692 heterochromatin, 556-62 heterodimers, 535 heteroduplex DNA, 262 heterologous recombination, intermediates, 266 heterozygous genes, 7 hexapods, 630 hfl gene, 522 Hfr (high frequency recombinant) strains, 666 high-energy bonds, 58, 58, 60-61 Hin invertase, 297 Hin recombinase, 305-7, 306 HindIII restriction enzyme, 650 hinge helices, 89
histidine (His, H), 73 histone acetyltransferase (HAT) complexes, 174, 540 histone chaperones, 173, 178-79 histone code hypothesis, 558 histone deacetylation, 557 histone deacetylase complexes, 174, 174 histone-DNA interactions, 158 histone-fold domains, 153, 154 histone H1, 160, 161 histone methyltransferases, 557 histone methylases, 174, 174 histones chromosomes and, 129 gene silencing, 556-58 inheritance, 176, 176, 177, 176 modification, 174-75, 556, 559 N-terminal tails, 159, 162, 169-74, 172, 17. nucleosome function and, 163-65 properties, 153, 153-54 structure, 154-56 HIV virus, 563, 567 hix recombination sites, 305, 306 b lfslnJ ill.687 HLH proteins, basic, 536 HMG-I, 547 hnRNPA1, 397 hnRNPI, 397, 398 HO genes, 546, 546, 580-83 Holliday junctions cleavage, 263 definition, 260 DNA cleavage, 276-77 formation, 300 generation, 262 recombination intermediates, 266 RuvAB complex recognition of, 276, 277 RuvC dimer binding, 278 Holliday model, 260-64, 261 holoenzymes, 537 homeodomain proteins, 535 homeodomains, 535 homeotic genes, 627-29 homologous recombination catalysis of, 269 chromosome segregation during meiosis, 2:; circular DNA multimers, 307, 307-10 description, 259 double-strand break repair model, 264-66 in eukaryotes, 277-78 genetic consequences, 288-89 Holliday model, 260-64 knockout technology, 707-9, 709 models of, 259-66 molecular level, 259-61 protein machines, 268-78 homologs, 131, 146 homozygous genes, 7 Horvitz, H. Robert, 698 house mouse. see mouse; Mus musculus Hox genes, 628, 629 Hoxb-2 genes, 707 Hoxb genes, 626 Hoxd genes, 628 HpaI restriction enzyme, 651 HSF, 562 hsp70 control, 626
HSP70 gene, 562-63 HU protein function, 306 humans evolution, 635-39 FGF genes, 615 FOXP2 gene, 638 gene count, 635-36 gene regulatory elements, 530 genome content, 137, 671 speech, 637-39 synteny, 670 hunchback gene, 592, 601, 602, 602-5, 603 Hunchback repressor, 608 Huxley, Julian, 15-16 Huxley, Thomas H., 5 hybrid DNA, 262 hybrid dysgenesis, 703-5, 704 hybridization, 108-9, 552-53, 657 hydrogen atoms, 42 hydrogen bonds β sheet structure and, 77 base pairing and, 102 in biological molecules, 48 bond lengths, 47 characteristics, 47-48 directional properties, 49 DNA helix formation, 69-71, 70 protein structure and, 78-84 water molecules, 49 hydrolysis ATP, 61-67 definition, 56 in DNA synthesis, 163-64 peptide bonds, 60-61 types of damage by, 244 hydrolytic editing, 359 hydrophobic bonds, 51-52, 78 hydroxyl groups, 48 hyperchromicity, 110 hypoxanthine, 415
IF1 (initiation factor 1), 434-35 IF2 (initiation factor 2), 434-35 IF3 (initiation factor 3), 434-35 Igf2 gene, 559-60, 709-10 IHF binding, 326, 501, 541 imaginal disks, 700 imino groups, 74, 74 immune systems, 338-41 immunoaffinity chromatography, 674-75 immunoblotting, 676 immunofluorescence microscopy, 691-92 imprinting, 558-59, 560, 709-11, 710 incorporation assays, 673 inducers, description, 504 induction, 663 Ingram, Vernon M., 29 inheritance, 10, 562 initiation RNA primers and, 193-94 steps in, 352 transcription, 350, 351, 538 translation, 432-40, 433 initiation complexes, 434 80S, 438
initiation factors (IFs), 433-35, 434 initiator (Inr), 363-64 initiator proteins, 212, 214, 214-28 initiator tRNA, 433 insects, 630-35, 633. see also Drosophila melanogaster insertion sequence (IS), 328-29 insertional mutagenesis, 690 insulators, 531, 542, 542 int gene, 523, 524-25 Int proteins, 524 integrases, 297, 303-4, 321-22, 322. see also transposases
integration host factor (IHF), 304, 304 intercalating agents, 245-46 β-interferon gene, 546-47, 547 intergenic sequences, 135, 137-38 intermediate filaments, 582-83 interphase, 147 interwound writhes, 112 intragenic suppression, 472-74 intron-exon boundary, 381 intronic splicing enhancers (ISEs), 396 introns conserved sequences, 401 contribution to genomes, 136 description, 379 early model, 400-403 group I, 286-91, 389, 389-90 lariat form, 381-83 late model, 402 moc:ticJIls, 300 removal, 125 self-splicing, 387-88 inverted repeats, 295 ion exchange chromatography, 673, 674 IRESs (internal ribosome entry sites), 435, 4 IRF, 546 IS4 family transposons, 327-29 isoelectric focusing, 675 isoforms, 397 isoleucine (Ile, I), 73, 422 isomerization, 357 isopods, 631
J Jacob, François, 496, 497, 668, 692 Janssens, F. A., 11 joint molecules, 275 Jun/ATF, 546
K k~ I i.n, Guiltld-ooils, Bn
Khorana, Har Gobind, 36 kinetochores, 143 knirps gene, 603-5 Knirps mutants, 595 Knirps protein, 607-8, 6(NJ knockout technology, 705, 709 knotting, DNA, 118 Kornberg, Arthur, 24 Kozak, Marilyn, 415 Kozak sequence, 415, 435 Krüppel gene, 603-5, 601 Kuhn, A., 17
L labeling, 37, 65" pulse-labeling, 37 labial (lab) gene, 627 Labial protein, 630 lac genes, 488-89, 489, 496 lac operator, 489 lac operons, 488, 490, 498 lac promoters, 489-92, 492 lac repressors, 488 allosteric control of, 496-98 DNA binding motif, 493-96 domains, 83 function, 495, 495, 502-3 regulation, 89, 89 RNA polymerase binding, 489-92 lactose permease, 488 lactose repressors, 85 lacZ genes, 690, 707 lagging strands, 192, 228-30 λ integrase (λInt), 297, 303, 303-4 λ lysogen, 512 λ repressor, 514 C-terminal domain interactions, 5;;0 DNA interaction, 85 folding, 79 function, 495, 517-18, 516 geometry of repressor-operator complex, 85 in major groove, 495 negative autoregulation, 519-20 operator site binding, 515, 515-17 λcII function, 520-22 language, evolution of, 637-39 latent periods, 685 law of mass action, 65 LDL (low-density lipoprotein) receptor, 403 leader sequences, 505 leading strands, 192 lepidopterans, 635 leucine (Leu, L), 73 leucine zipper DNA-binding motifs, 85, 536, 536 leucine zipper family proteins, 80 LexA repressor, 257, 518-19, 532 libraries cDNA, 657 DNA, 654, 656, 656-57 whole genome, 664 ligation experiment, 594 limbs, insect, 630-32 LINEs (long interspersed nuclear elements), 337, 337-38 linked genes, 9-11 linker cells, 60S linker DNA, 151 histone H1 binding, 160, 160-61, 161 lengths, 153 linker histones, 153 linking differences, 114-15 linking numbers (Lk), 38, 73, 112-14, 116, 433, 434 locus control regions (LCRs), 543-44 long terminal repeats (LTRs), 314 lophotrochozoans, 614-15 lox sites, 300 LTRs (long terminal repeats), 314 Luria, Salvador, 661, 687 lysine (Lys, K), 73 lysogenic growth
bacteriophages, 683 choice of, 521 control of, 517-18 epigenetic switches and, 562 establishment, 522 gene expression in, 513-14 λcII and, 520-23 lysogenic induction, 512 lysogenic propagation, 512 lysogenic state, 303 lysogens, epigenetic switches, 562 lysogeny, 682 lytic growth, 303 bacteriophages, 683 choice of, 521 control of, 517-18 gene expression in, 513-14 λcII and, 520-23 lytic phages, 682 lytic propagation, 512
M M phase (mitosis), 142, 14.1
Macho-1, 584, 584 MacLeod, Colin M., 20 macromolecules, 51-52, 69-92 macromutations, 15 maintenance methylases, 561 major grooves, 103-6, 105, 495 major late promoter, 398 mammals. see also chimpanzees; humans; mouse conserved mechanisms, 531-37 FGF genes, 615 Hox genes, 628 signal transduction pathways, 553 MAP kinase pathways, 552 map units, 13 mariner elements, 335_53 masking proteins, 554-55 mass spectrometry (MS/MS), 676-77, 676 MAT locus, 28o¡, 28tí mating-type loci, 285, 285-86, 560-83 mating-type switching, 266-88, 287, 581 Matthaei, Heinrich, 36 maxillipeds, 631 Mayr, Ernst, 16 McCarty, Maclyn, 20 McClintock, Barbara, 11 Mcm complex, 226 Mechanism of Mendelian Heredity, 13 Mediator Complex, 368-70, 369, 537, 538, 544, 546 meiosis chromosome number and, 148-49 chromosome segregation during, 278-84 DNA dynamics, 279 gene crossing over, 11 phases, 149 meiotic cell cycle, 148 meiotic recombination, 280, 282, 283-84, 289 melting point, DNA, 110 Mendel, Gregor, 6-8, 687 Mendel's first law. see Principle of Independent Segregat Mendel's second law. see Principle of Independent Asso mercaptoethanol, 675 mercury ions, 501-2 MerR, 486, 500, 501-2, 502
merT gene, 500, 501 merT promoter, 486, 502
Meselson, Matthew, 26, 260, 692 mesoderm differentiation, 595 messenger RNA (mRNA), 123 alternative splicing, 583-85 assembly, 436 broken, 452-56 circularization, 438, 438 degradation, 457 discovery, 33 localization, 576 monocistronic, 413 polycistronic, 413 premature stop codons, 456, 457 recruitment in prokaryotes, 433 structure, 414 synthetic, 465-66 translation and, 411, 412-15 translation-dependent regulation, 452-57 translocation, 444-46 transport, 406-8, 407 Met-tRNAiMet, 435 metabolism, "inborn errors" in, 16 metal ions, J87, 349 metaphase, 147, 148, 149 methyl groups, removal, 248 N-methyl-N'-nitro-N-nitrosoguanidine mutagenesis by, 244 methylation, 218, 556-58 methylguanine, 415 Mica el
morphology, 620-30 morula, 706
mosaics, 702 mouse virus polyoma, 21 Mre11 protein, 283 MRX protein, 282-83 MSH protein, 242 Mu genome, 331 Mu transposons, 689 MuA proteins, 331, 332-33, 333 MuB proteins, 331, 332-33, 333 Muller, Hermann J., 10, 13, 16, 700 Mus musculus (mouse). SI
N N proteins, 523-24 Nanos protein, 602, 602-3 negative autoregulation, 519 NEO gene, 708 neomycin resistance, 708 neurogenic ectoderm, 587, 588, 595, 596, 597 neurons, 588 NF-I
NtrC, 485-86, 500-501, 501, 541 nuclear magnetic resonance techniques, 75 nuclear pore complex, 407 nuclear scaffolds, 163 nuclease protection footprinting method, ... oo nuclei chromosomes within, 132-33 embryogenesis, 5~J RNA synthesis, 36 nucleic acids biosynthesis, 66 comparative genome analysis, 669-72 DNA cloning, 654-55 DNA hybridization, 651-53 DNA libraries, 656-57 DNA segment isolation, 653 electrophoresis, 648-49 genetic information conveyed by, HI-40 genome-wide analyses, 667-69 oligonucleotide synthesis, 657-58 paired-end strategy, 666-67 PCR, 658-60
precursors, 64 restriction endonucleases, 640-51 sequences, 660-64 synthesis, 64-65 vector DNA, 655-56 viral genes, 21 nucleoids, 131 nucleophilic attack, 126 nucleoside-5'-triphosphates, 64 nucleoside phosphates, 64 nucleosides, 98 nucleosomal DNA, 162-63 nucleosomes arrays, 161-62 assembly, 155, 175-79 atomic structure, 154-56 axis of symmetry, 157 chromosome structure and, 150-53 core particles, 152 DNA binding, 168
DNA-binding protein dependent positioning, 168, 168-69 DNA packaging, 151 formation of, 130 function in eukaryotes, 529 histone variants and, 163-65 lacking H2A and H2B, 156
micrococcal nuclease, 152 movement, 166-67, 167 negative supercoiling and, 115 positioning in cells, 170, 171 protein core, 153 remodeling, 167 remodeling complexes, 166-67 nucleotide excision repair, 147, 246, 250-53, 152 nucleotides 3'-5' phosphodiester linkages, 23 chain-terminating, 661 conveyance of genetic information, 28-31 creation, 98 formation, 98 triplets, 36 nurse cells, 500 Nus proteins, 359
NusA protein, 524 Nüsslein-Volhard, Christiane, 594
O O6-methylguanine, 244, 247 Okazaki fragments completion, 208-9 length, 211
primers, 194 replication fork, 103, 193 separation, 242 Olenoides, 620 oligo-ribonucleotides, 468 oligonucleotides, 657-58 On the Origin of Species (Darwin), 613 oncogenes, 696 oocytes, 599
open complexes, 352, 356-58, 357 open reading frames (ORFs), 412-13, 6r.7 - 1 operators, repressor binding, 464 operons, 507, 511 optical density, 110 ORF genes, 314 ORF proteins, 323 organic molecules, solubility, 51 oriC, 213, 214 origin recognition complexes (ORCs), 217 origins of replication, 138 during cell division, 138-39, 139 identification of, 214-16 molecular identification, 216 prior to cell division, 219 orthodenticle gene, 601, 601-2 oskar mRNA, 590, 600, 604 Oskar protein, 503 oxidation, DNA damage, 244-45 oxoG (7,8-dihydro-8-oxoguanine), 244 oxygen, 42, 45
P P-element transformations, 703-5, 705
p'
paired-end sequencing, 667 Paired protein, 616-18, 617 Pangolin, 596 parallel β sheets, 76, 77 parental imprinting, 709-11 Pasteur, Louis, 603 patch products, 262, 266 patched mutants, 705 pattern-determining genes, 619-20, 620 Pauling, Linus, 22 Pax6 genes, 619, 621, 621-22 PCNA, 242 Pe promoter, 564 Pelle kinase activation, 591
peptide bonds characteristics,
n
evolution, 126 hydrolysis, 60-61 illustration, 74 planar shape, 42 peptide groups, 48 peptides, staining, 675
peptidyl transferase, 126, 440 center, 425, 442-44, 444 reaction, 428, 428 tRNA binding and, 430 peptidyl-tRNA, 428 PGNA, 104 Phage Group, 681 phages. see bacteriophages phenotypes, 7, 8 phenylalanine (Phe, F), 16, 36, 73, 422 phenylisothiocyanate (PITC), 676 phosphoamidines, 656 phosphodiester linkages, 99-100 phosphodiesterases, 65 phosphoramidite, protonated, 658 phosphorylation, 88, 90, 553 photoreactivation, 247, 247, 248 phyla, summary, 613 plaques, 664, 664 plasmid vectors, 654, 654-55 plasmids, 111, 131, 688 plectonemic writhe, 112 PM promoter, 564 point mutations, 236, 470-71 Pol II core promoters, 363-64, 364 polar granules, 599 polar molecules, 45 polarity, DNA helic polyadenylation, 371, 372 polyadenylic acid (poly-A), 314, 466 polycistronic mRNAs, 413, 488 polymerase chain reactions (PCRs) DNA amplification, 658-66, 659 DNA labeling by, 651-52 forensics and, 661 microarray assays and, 577 polymerase switching, 101, 202 polynucleotide kinase, 651 polynucleotide phosphorylase, 466, 466 polynucleotides, 36, 99, 122 polypeptide backbone, 74-78 polypeptide chains, 37, 76, 79 polypeptides, 432 polyphenylalanine, 466-67 polyploid cells, 132 polypurine tracts (PPTs), 324 polyribosomes, 33, 34, 426, 427 polysomes, 426 polytene chromosomes, 701, 701 positional information, 578 positive autoregulation, 519 positive control mutants, 492 posterior, definition, 578 Prd protein, 617 pre-initiation complexes, 364-66, 369, 435 pre-mRNA, 380 PRE promoter, 520 pre-replicative complex (pre-RC), 222-27, 225, 226, 227 primary structure, protein, 72, 75
primases, 193-94, 199, 199-200, 210 primates, 637. see also humans primer-binding sites (PBSs), 324 primers, 194 RNA, 193-94 primer:template junctions, 162, 186-88 primitive streak, 706 Principle of Independent Assortment, 8-9, 9 Principle of Independent Segregation, 6-8, 7 pro-¡¡!\ 5t16 proα2 collagen gene, 379 probes, 651-53 proboscipedia (pb) gene, 627 processed pseudogenes, 337 processivity, 189, 190, 201-4
profilin, 583 proflavin, 246 programmed rearrangements, 305 prokaryotes chromosome makeup, 131, 132 effect of antibiotics, 453 gene density in genome, 133 gene regulation in, 483-527 initiation of DNA replication, 227-26 mRNA recruitment, 433 ribosomes, 425, 426 RNA RBSs, 413-14 topoisomerases, 116 translation, 424, 424 translation initiation, 435 proline (Pro, P), 73, 74 promoters bacterial, 354 bacteriophage λ, 513 consensus sequences, 355 core enzyme recruitment, 356 description, 530 regulation, 464-85 RNA Pol I, 275, 374-76 RNA Pol III, 275, 374-76 transcription cycle, 350 proofreading, 359, 370-71 proofreading exonucleases, 191-92, 192, 237 propeller twist arrangement, 107, 107 prophages, 511, 682-83 prophase, 146, 147 protein domains, 402
protein stability, 452-57 protein subunits, 72 protein synthesis, 16, 29, 30, 37-38. see also protein-DNA interactions DNA binding, 84-87 initiation of DNA replication and, 217-21 nonspecific, 86 weak bonds and, 53 protein-protein interactions, 53 allosteric transformations and, 88, 90 initiation of DNA replication and, 217-21 at the replication fork, 210-12 protein-RNA complex, 385 protein-RNA interactions, 86-87 proteins affinity for DNA, 480 allostery, 88-91 amino acid incorporation, 467 amino acids in, 73 building blocks, 71-72
in chromosomes, 129 determination of structure, 75 folding, 76, 78, 79 genes and, 16-17 hydrogen bonds and structure of, 78-84 immunoblotting, 676 large, 82 levels of structure, 72-78 purification, 672-73 separation, 673-75 structural features, 75, 78, 81-84 transcription regulation, 483-87 two hybrid assay, 533-34 unfolding, 74 proteomics, 677-78 protonated phosphoramidite, 658 pseudogenes, processed, 337 pseudoknots, RNA, 123, 124 pseudouridine (ψU), 415, 415 PSTAIRE helices, 90 P-TEFb, 371 pulse. see pulse-labeling pulse-labeling, radioactive, 37 pulsed-field gel electrophoresis, 649, 649 purines, 100, 100 puromycin, 454 pyrimidines, 100, 100 pyrophosphatases, 184 pyrophosphate groups, 64-65, 65 pyrophosphorolytic editing, 359 pyruvate kinase, 81
recessive traits, 7
recognition helix, 85, 493 recombinant phenotypes, 8
recombinase recognition sequences, lO! recombinases, 204, 297 recombination, 12, 262 recombination factories, 283-84 recombination signal sequences, 340 recombination sites, 294 recombinational repair, 247 recombinational transformation, 694 recruitment, 485, 537-44 reducing agents, 675
regulator binding sites, 530 regulatory elements, 530 regulatory s
relaxation, 116-20 relaxed, definition, 114
release factors (RFs) class I, 449, 449-50 class II, 450 function, 448-49 peptide release, 450
remodeling, nucleosomal, 166
renaturation, DNA, 110 repair, mutability and, 235-58 repeated sequences, 136
replication DNA, 181-233 errors in, 235 finishing, 228-32
of telomeres, 231
Q
Q proteins, 523-24 quantum mechanics, 42-43 quaternary structures, 72, 75, 82 query sequences, 670
R R groups, categories, 72 R-loop mapping, 398, 399 Rad50 protein, 283 Rad51 protein, 275, 283, 283 Rad52 protein, 264 radiation, DNA damage, 242-44, 244-45 RAG recombinase, 341 RAG1 (recombination activating gene) subunit, 340 RAG2 (recombination activating gene) subunit, 340 RALL motif, 623 Ran protein, 408 Rap1, 557 rats, Slo gene, 394 Rb (retinoblastoma) proteins, 554 reactive oxygen species (ROS), 244 reading frames, types, 413 RecA-like strand-exchange proteins, 282-83 RecA protein assembly, 272-74, 274 ATP binding, 83 base-paired partners within, 274-75 function, 257, 518-19 homologs, 275-76, 276 ssDNA coating, 270 RecBCD hcli C..1I hway. 208
replication bubbles, 216
repli C
assembly, 555 interaction at OR or OL, 519-20 transcriptional, 549-51 translation, 483 transport, 555 resolution, 260 l'Cl;Oh·
retrotransposons LTR, 312 poly-A, 312, 322-26 poly-A sequences, 314 RNA, 325 viral-like, 312, 313-14, 320, 321 retroviral integrases, 321-22 retroviruses cDNA formation, 324 composition, 313-14 DNA ends, 324 integration, 321 movement, 320 RNA ends, 324 reverse (back) mutations, 471 reverse transcriptase, 136, 324 reverse transcription, 325, 326, 656 Rho-dependent terminators, 301 Rho-independent terminators, 361, 361, 362 rhomboid gene, 595, 596, 597 ribonuclear proteins (RNPs), 62, 67 ribonucleases, 21, 74 ribose, 30, 122-23 ribosomal proteins, 506-11 ribosomal RNA (rRNA), 413, 428-29, 43-1 ribosome binding sites (RBSs), 413 ribosome recycling factors (RRFs), 450-52, 451 ribosomes, 411, 423-32 aminoacyl-tRNA selection, 441-42, 443 channels, 430-32, 432 composition, 425 crystal structure, 429 cycles, 426 mRNA transport, 430-32 recycling, 450 rescue, 5~5 as ribozyme, 442-44 rRNA in, 428-29 scanning, 414 separation, OS structure, 431 subunits, 429 subunits during translation, 425-27 tRNA binding sites, 429-30 tRNA discrimination, 422-23 riboswitches, 509, 510 ribozymes, 125, 389-90, 442-44 RISC (RNA-induced silencing complex), 566 RNA-annealing factors, 364 RNA-dependent RNA polymerase, 569-70 RNA hairpins, 361 RNA helicases, 436 RNA interference (RNAi), 567, 568-70, 569, 698-99 RNA polymerase holoenzymes, 353, 353 RNA polymerase I promoters, 374-76, 375 RNA polymer
primase, 193-94
recruitment, 484 single-subunit, 360 subunits, 349 transcription cycle and, 348-52 transcription initiation by, 247 transcription process, 350-52 RNA primers, 193-94, 194 RNA-recognition motifs (RRMs), 396 RNA splicing, 135, 135, 379-410 chemistry of, 380-83 classes, 387, 387-88 description, 380 discovery, 398 evolution, 389-90 intron removal and, 125 pathways, 385-93 reaction, 382 three-way junction, 382 RNA-RNA hybrids, 384 RNA-RNA interactions, 384 RNAs. see also messenger RNA (mRNA); rRNA (rRNA); transfer RNA (tRNA) bacterial, 34 base composition, 35 catalytic region folding, 391 chain folding, 123-24 composition, 30, 30-31 DNA compared with, 31 double helical charact
S S8, 16S rRNA binding, 511 S phase (synthesis), 142, 142 S8 protein, 510 Saccharomyces cerevisiae gene silencing, 557 generating mutations, 694 genome, 694-95 growth, 695-96 HO gene silencing, 580-83 introns in genome of, 136 life cycle, 693
mating-type genes, 548-49 mitotic cell division, 695
skin-nerve regulatory switch, 587-88 Sleeping Beauty elements, 335 sliding, nucleosome, 166 sliding clamp loaders, 204-5, 206-7, 220 sliding clamps, 242 sliding DNA clamps, 201-4 ATP and loading of, 206 DNA processivity and, 204 opening, 204-5 polymerase processivity and, 201-4
as model. 681 , 693-9iJ rc p!lC
structure, 203, 205 Slo gene, 394 Smads, function, 598 small nuclear ribonuclear proteins (snRNPs), 383 small nuclear RNAs (snRNAs), 87, 383-84
Scott-Moncrieff, Rose, 17 Scr gene, 631 sea monkeys (Artemia), 630-31 sea squirt (Ciona intestinalis), 584, 585, 615, 668 secondary bonds, 49 secondary structures, 72, 75 segmentation, 599-600, 601-2, 602
610-71. 672
self-splicing introns, 387-88 semiconservative p~, 27, 28 SeqA protein, 211, 218, 219, 223
VISTA, 669 sog gene, 595, 597 solenoid model, 161, 162 solubility, organic molecules, 51 Sonic hedgehog (Shh), 588-89, 589 SOS response, 256, 516-19 Southern, Edward, 652 Southern blot hybridization, 652, 653
sequenators, 661, 665 sequence coverage, 10X, 663 sequencing
DNA, 662 DNA fragments, 660-63 Edman degradation, 677 gel electrophoresis, 663 high throughput, 665 read out, 665 shotgun, 663-64
SP1. 537
serendipitous microhomologies, 254 serine recombinases, 296-91, 297, 298, 298-99 serine (Ser, S), 73 Sex combs reduced (Scr) gene, 627 Sex-lethal (Sxl) genes, 564, 564 sex linkage, 10, 702 SF2/ASF protein, 397 Shine-Dalgarno sequences, 413 short interfering RNAs (siRNAs), 568-70 shotgun sequencing, 663-64, 666 siamois gene, 500
sickle cell anemia, 29, 29 σ factors, 354, 355, 356, 499-500, 586 signal integration, 499, 544-45 signal transduction pathways, 551-55, 553, 571, 578 silencers, 531, 542 Simpson, George Gaylord, 16 SINEs (short interspersed nuclear elements), 337, 337-38 single-stranded DNA-binding proteins (SSBs), 84, 84, 195-98 single-stranded DNA (ssDNA), 264, 272-74 SIR complex recruitment, 557 SIR genes, 557 Sir., protein, 3J6
SisA, 563 SisB, 563, 564 sister chromatid cohesion, 142, 144-46 sister chromatid separation, 143 sister chromatids, 142 site-directed mutagenesis, 658 site-specific recombination, 302-10,
software
BLAST (basic local alignment search tool) program,
selenocysteine, 413
Ski7 protein, 456 skin cells, 586
SN2 reactions, 183
Snail repressor, 597 snapdragons (Antirrhinum), 8, 8, 3211 sodium dodecyl sulfate (SDS), 675
1Q4
Spätzle protein, 591 specialized transduction, 689, 689 speech, evolution of, 637-39, 639 sperm, content of, 6 spindle pole bodies, 143 spiral, 113
splice recombination products, 262 3' splice site, 381 splice-site recognition, 392, 392-93 spliceosomal proteins, 67 spliceosomes, 383-87, 386, 391-93, 400, 401 splicing, in eukaryotes, 529 splicing enhancers, 563 Spo11 protein, 282-83, 283, 291 SPOIfH gene, 586 SR (serine arginine rich) proteins, 393, 393, 396 SSBs (single-stranded DNA-binding proteins), 84, 84, 195-98, 197, 199
SsrA RNA, 452-56, 545 stable ternary complexes, 352 Stadler, L. J., 16 Stahl, Frank W., 26, 260, 692
start codons, 412, 413, 437-38, 440 STAT pathways, 552, 553 stationary phases, 687 stem-loop structures, 123 stereoisomers, 50 "sticky ends," 111 stop codons, 38, 412, 413, 448-49, 456 stop mutations, 470 strand invasion, 260 Streptococcus pneumoniae, 20, 20-21 structural maintenance of chromosome (SMC) proteins, 144-46, 165 Sturtevant, Alfred H., 10, 13, 700
Su(H)-Notch-: complex, 589, 589
Sulston, John, 696 supercoiled DNA, 113 negatively coiled, 114-15 positively coiled, Il a relaxation of, 115-16 supercoiling, 80, 114-15, 198-99 superhelical density, 115 suppressor genes, 472 suppressor mutations definition, 412 frameshift mutations, 472 function, 471-75 nonsense mutations, 473, 474 Sutton, Walter S., 8-9 SV40 virus, 111, 394 Svedberg, Theodor, 425 Svedberg units, 425 SWI proteins, 583 SWI/SNF, 546 SWI/SNF, 540 synapsis, 11 synaptic complexes, 294, 314-16 syncytia, 590 synergistically, 544 sj'll
T T antigen, SV40 virus, 39
telomerases characteristics, 230-32 end replication problem and, 232 ltIauilnu:m1 of, 140 in telomere replication, 231 telomeres during cell division, 139, 139-40 composition, 230 gene silencing, 556, 557
replication, 231 structure of, 141 telophase, 146, 147 temperate phages, 682 temperature, 45 termination chain, 463-64 nonsense suppressors, 474 polyadenylation and, 373, 374 transcription, 361-63, 362 transcription cycle, 350, 352 translation, 448-52 terminators, 361, 361 tertiary structures, protein, 72, 75 tetraloops, RNA, 123 TFIIB recognition element, 363-64, 367-68
TFIIB-TBP-promoter complex, 368 TFIID, 364, 537, 540 TFIIE, 368 TFIIF, 368 TFIIH, 251, 368, 562 TFIIS, 371
TGF-β receptors, 598 TGF-β (transforming growth factor-β), 706 TGF-β (transforming growth factor-β) receptors, thermodynamics, first law of, 43 thermodynamics, second l
trans-splicing, 383, 383 transcription abortive initiation, 358-59 accuracy, 347 bacterial, 353-63 bacteriophage λ, 514 control of lac genes, 488-89 in eukaryotes, 363-76 initiation in bacteria, 488-504 mechanism of, 347-77 modes of repression, 607 nucleotide sequences and, 34 phases, 351 regulation, 483-87 repressors, 549-51 RNAs in regulation, 567-70 termination, 361-63, 362 transfer of information via, 31 transcription-coupled repair, 251, 253 transcription initiation, 538, 562-63 transcriptional elongation, 350, 351, 352, 562-63 transcriptional silencing, 244, 542 transcriptional termination, 507 transduction generalized, 688-89, 689 specialized, 686 transesterification, :'118, 381 transfer RNA (tRNA), 123, 415-16 amino acids attachment, 411-23 charged, 411, 418, 418-19, 422-23 codon-anticodon pairing, 462 intragenic suppression, 472-74 isoaccepting, 419-20 modified nucleosides, 415 ribosomal discrimination, 422-23 ribosome binding sites for, 429-30, 430, 431 16S, 511
soconollry StlUelUteS. 416. 416- 17. 417 struclure, 411, 420 ttanslalion flnd. 411 translocfl lioll . 444 -46 trinucleotide codon bindins. 468 uncharguo. 411. 422- 23 lrans{ernsflS.62
transformations
  bacterial virulence and, 20, 20, 21
  DNA-mediated, 689
  P-element, 703, 105
  vector DNA introduction, 655-56
transgenic models, 103-5, 707, 101-9, 7()lJ
transitions, 236, 236
translation, 4U-5!;!
  antibiotics and, 453
  description, 411
  elongation, 440-48
  GTP-binding proteins in, 447
  initiation, 432-40, 433
  mRNA stability and, 452-57
  nucleotide sequences and, 34
  overview, 427
  protein stability and, 452-57
  puromycin stops, 441
  termination, 448-52
  transfer of information via, 31
translation initiation factors (IFs), 433-35, 438
translational coupling, 414
translesion DNA synthesis, 247, 254-51, 255
translesion polymerase, 241, 255
translocations, 440, 444-46
transposable elements, 13B, 310, 311-12, 329
transposase genes, 312-101
transposases, 321-22, 322, 330. See also integrase
transposition target immunity, 327, 333-34
transpositions, 138, 293, 310-26
transposon Tn3 resolvase, 297
transposons
  description, 259
  function, 310-11 ¡ II
  genomic occurrence, 312
  insertion of, 236
  IS4 family, 327-29
  lacZ fusions mediated by, fi!JO
  regulation by, 327
  uses, 699-90
transpososomes, 314-16
transversions, 236, 236
tri-snRNP particles, 385
trilobites, 630
trinucleotide-ribosome complexes, 468
tripartite leaders, 39H
triple repeats, 237
tritium (3H), .16
tRNA I't>o, 465
trombone model, 208-U
troponin T, 394
trp gene, 413, 505, 507
trp operon, 504, 505
trypsin, 31
tryptophan (Trp, W), 73
Tschermak, Erich, 6
tubulin, 583
tumor-suppressor genes, 600
Tup1 protein, 551
turns, 76
twist, 112
twist gene, 595
twist numbers, 114
Twist protein, 5!;!1
two hybrid assay, 533-34
Ty elements, 335-36, 3:16
tyrosine recombinases, 296-91, 297, 299, 299-31
tyrosine (Tyr, Y), 48, 73, 422
tyrosyl tRNA synthetase, 422
U
U insertion, 406
U2AF (U2 auxiliary factor),
U:A:U base triple, 124
~1l4
Ubx genes
  in crustaceans, 630-~ 1
  embryo morphology and, 626
  fruit fly morphology and, 624-26, 625
Ubx protein
  binding sites, 630
  embryo morphology and, 627
  evolutionary changes, 632, 632-35
  target enhancers, 627-30
Ubx repressor, 635
Ultrabithorax (Ubx) gene, 627
ultracentrifugation, 425, 425
ultraviolet light, 109-10, 245
UmuC protein, 254, 255
UmuD protein, 254
unfolding, 74. See also denaturation
3' untranslated region (3' UTR), 576
uORFs (upstream
UP-elements, 3:l4
uracil, 20, 122-23
uracil glycosylase reaction, 219
uridine, 415
UvrA proteins, 251
UvrB proteins, 251
UvrC proteins, 251
UvrD helicase, 236
UvrD proteins, 251
UvsX protein, :17:l
V
V3 cells, 588
V1 interneurons, 589
V2 interneurons, 589
valence, definition, 42
valine (Val, V), 73, 422
van der Waals bonding, 46
van der Waals forces, 42
  acetate, 46
  distance and, 46
  glycine, 46
  guanine, 46
  weak bonds, 45
van der Waals radii, 46, 47
Vand, Vladimir, 22
V(D)J recombination, 311, 336-41, 339, 340
vectors, definition, 654
VegT gene, 598, 599
ventral, definition, 578
Vg1 gene, 599
viral suppressors of gene silencing (VSGSs), 570
viroids, 125
viruses, genes, 21
VISTA software, 669

W
Wallace, Alfred R., 5
water molecules, 45, 49, 49
Watson, James D., 22
Wieschaus, Eric, 594
White, John, 696
wild-type genes, 10
Wilkins, Maurice, 22
wings, insect, 632-35, 6:15
Wnt proteins, 596
wobble concept, 463, 463
work, free energy and, 44
Wright, Sewall, 15, 17
writhe, 112, 115
writhing numbers, 113

X
X-ray crystallography, 75
X-rays, 16, 245
Xer recombinases, 307
XerC, 297, 307-8
XerD, 297, 307-6
xeroderma pigmentosum, 251
Xis binding, 304
Xnr gene, 598
XP (xeroderma pigmentosum) genes, 2
XPC, function, 251
Xrs2 protein, 283

Y
Y-arc shape, 215
Y family DNA polymerases, 256
Yanofsky, Charles, 36, 692
yeasts
  chromosome maps, 289
  conserved mechanisms, 531-37
  FLP function, 297
  Gcn4 activator, 565-67, 566
  gene regulatory elements, 530
  gene silencing, 556-58
YPWM motif, 62J, 6:Jn

Z
Z DNA, 106, 107, 107-8, 108
Zacanthoides, 620
Zea mays. See corn (Zea mays)
zigzag model, 162
zinc cluster domains, 535
zinc-containing DNA-binding domains
zinc finger DNA-binding motifs, tl5, 58
zinc fingers, ~35, 535

[Figure: the DNA double helix. (a) Schematic model of the double helix. (b) Space-filling model of the double helix. Labels: hydrogen bond; sugar-phosphate; 5'. Atom key: H; O; C in phosphate ester chain; C and N in bases; P.]

Amino acid                      Three-letter   One-letter
Alanine                         Ala            A
Arginine                        Arg            R
Asparagine                      Asn            N
Aspartic acid                   Asp            D
Asparagine or aspartic acid     Asx            B
Cysteine                        Cys            C
Glutamine                       Gln            Q
Glutamic acid                   Glu            E
Glutamine or glutamic acid      Glx            Z
Glycine                         Gly            G
Histidine                       His            H
Isoleucine                      Ile            I
Leucine                         Leu            L
Lysine                          Lys            K
Methionine                      Met            M
Phenylalanine                   Phe            F
Proline                         Pro            P
Serine                          Ser            S
Threonine                       Thr            T
Tryptophan                      Trp            W
Tyrosine                        Tyr            Y
Valine                          Val            V
Not for Sale in the U.S.A. or Canada