2020_Winter_Bis2a_Facciotti_Lecture_21 - Biology


Learning goals associated with 2020_Winter_Bis2a_Facciotti_Lecture_21

  • Differentiate and convert between coding/noncoding, template/non-template strands.
  • Define and explain the function and structure of an open reading frame (ORF).
  • Create a model for a basic transcriptional unit that includes promoters, transcriptional regulatory sites for transcription factor binding, ribosome binding site, coding region, stop codon, and transcriptional terminator.
  • Use the model of a transcriptional unit to discuss the roles of each of the structural elements of a transcriptional unit. Identify those that are transcribed, those that may be translated, and those that serve other roles.
  • Create a dynamic model (verbal, drawing, etc.) of the process of transcription that includes the reactants, products, enzymes, the sites on the DNA required for transcription to take place, and how these components interact with the DNA template at different times during the process. Put your models into action!

The flow of genetic information

In bacteria, archaea, and eukaryotes, the primary role of DNA is to store heritable information that encodes the instruction set required for creating the organism in question. While we have gotten much better at quickly reading the chemical composition (the sequence of nucleotides in a genome and some chemical modifications that are made to it), we still don't know how to reliably decode all the information within and all the mechanisms by which it is read and ultimately expressed.

There are, however, some core principles and mechanisms associated with the reading and expression of the genetic code whose basic steps are understood and that need to be part of the conceptual toolkit for all biologists. Two of these processes are transcription and translation, which are, respectively, the copying of parts of the genetic code written in DNA into molecules of the related polymer RNA and the reading and decoding of the RNA code into proteins.

In BIS2A, we focus on developing an understanding of the process of transcription (recall that an Energy Story is a rubric for describing a process) and its role in the expression of genetic information. We motivate our discussion of transcription by focusing on functional problems (bringing in parts of our problem solving/design challenge rubric) that must be solved for the process to take place. We then describe how the process is used by Nature to create a variety of functional RNA molecules (that may have various structural, catalytic, or regulatory roles), including so-called messenger RNA (mRNA) molecules that carry the information required to synthesize proteins. Likewise, we focus on challenges and questions associated with the process of translation, the process by which the ribosomes synthesize proteins.

We often depict the basic flow of genetic information in biological systems in a scheme known as "the central dogma" (see figure below). This scheme states that information encoded in DNA flows into RNA via transcription and is ultimately translated into proteins. Processes like reverse transcription (the creation of DNA from an RNA template) and replication also represent mechanisms for propagating information in different forms. This scheme, however, says nothing per se about how information is encoded or about the mechanisms by which regulatory signals move between the various layers of molecule types depicted in the model. Therefore, while the scheme below is a nearly required part of the lexicon of any biologist, perhaps left over from old tradition, students should also know that mechanisms of information flow are more complex (we'll learn about some as we go) and that "the central dogma" only represents some core pathways.

Figure 1. The flow of genetic information.
Attribution: Marc T. Facciotti (original work)

Genotype to phenotype

An important concept in the following sections is the relationship between genetic information, the genotype, and the result of expressing it, the phenotype. These two terms and the mechanisms that link the two will be discussed repeatedly over the next few weeks. Start becoming proficient with using this vocabulary.

Figure 2. Information in DNA that will be expressed by transcription is stored in the sequence of individual nucleotides read in the 5' to 3' direction. Conversion of the information from DNA into RNA (a process called transcription) expresses that information into a temporary copy that can be functional as is (e.g. tRNA, rRNA), or a message encoding the information required to build a protein (e.g. mRNA). Cells use the mRNA as the template for the creation of proteins via translation. Here, we show two different DNA sequences. The differences in each DNA sequence result in the production of two different mRNAs, followed by the synthesis of two different proteins. Ultimately, these different proteins create two different coat colors in the mice.

Genotype refers to the information stored in the DNA of the organism, the sequence of the nucleotides, and the compilation of its genes. Phenotype refers to any physical characteristic that you can measure, such as height, weight, amount of ATP produced, ability to metabolize lactose, and response to environmental stimuli. Differences in genotype, even slight ones, can lead to different phenotypes subject to natural selection. The figure above depicts this idea. Also note that, while we talk classically about the genotype and phenotype relationship in the context of multicellular organisms, this nomenclature and the underlying concepts apply to all organisms, even single-celled organisms like bacteria and archaea.


What is a gene? A gene is a segment of DNA in an organism's genome that encodes a functional RNA (such as tRNA or rRNA) or a protein product (enzymes, tubulin, etc.). A generic gene contains elements encoding regulatory regions and a region encoding a transcribed unit.

Genes can gain mutations, defined as changes in the composition and/or sequence of the nucleotides, in either the coding or the regulatory regions. These mutations can lead to several outcomes: (1) nothing measurable happens as a result; (2) the gene is no longer expressed; or (3) the expression or behavior of the gene product(s) is different. In a population of organisms sharing the same gene, different variants of the gene are known as alleles. Different alleles can lead to differences in phenotypes of individuals and contribute to the diversity in biology under selective pressure.
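
One way to make the idea of alleles concrete is to compare two versions of the same stretch of DNA position by position. The sketch below is only an illustration: the function name `point_differences` and both sequences are hypothetical, and it assumes the two alleles have equal length (substitutions only, no insertions or deletions).

```python
def point_differences(allele_a, allele_b):
    """List (position, base_a, base_b) wherever two equal-length alleles differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("this sketch only compares alleles of equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(allele_a.upper(), allele_b.upper()))
        if a != b
    ]


# Two invented alleles of a hypothetical gene, written 5'->3' (positions are 0-based).
allele_1 = "ATGGCCTTTAAG"
allele_2 = "ATGGCCTATAAG"
differences = point_differences(allele_1, allele_2)
```

Whether a difference like this changes the phenotype depends on where it falls (coding or regulatory region) and on what it does to the gene product, which the comparison alone cannot tell you.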

Start learning these vocabulary terms and associated concepts. You will then be familiar with them when we dive into them in more detail over the next lectures.

Figure 3. A gene consists of a coding region for an RNA or protein product accompanied by its regulatory regions. The coding region is transcribed into RNA, which is then translated into protein.

Transcription from DNA to RNA

Section summary

Bacteria, archaea, and eukaryotes must all transcribe genes from their genomes. While the cellular location may be different (eukaryotes perform transcription in the nucleus; bacteria and archaea perform transcription in the cytoplasm), the mechanisms by which organisms from each of these clades carry out this process are fundamentally the same and can be characterized by three stages: initiation, elongation, and termination.

A short overview of transcription

Transcription is the process of creating an RNA copy of a segment of DNA. Since this is a process, we want to apply the Energy Story rubric to develop a functional understanding of transcription. What does the system of molecules look like before the start of transcription? What does it look like at the end? What transformations of matter and transfers of energy happen during transcription, and what catalyzes the process? We also want to think about the process from a Design Challenge standpoint. If the biological task is to create a copy of DNA in the chemical language of RNA, what challenges can we reasonably hypothesize or expect must be overcome, given our knowledge about other nucleotide polymer processes? Is there evidence that Nature solved these problems in different ways? What seem to be the criteria for the success of transcription? You get the idea.

Listing some basic requirements for transcription

Let us first consider the tasks at hand by using some of our foundational knowledge and imagining what might need to happen during transcription if the goal is to make an RNA copy of a piece of one strand of a double-stranded DNA molecule. We'll see that using some basic logic allows us to infer many of the important questions and things that we need to know in order to describe the process properly.

Let us imagine that we want to design a nanomachine/nanobot that would conduct transcription. We can use some Design Challenge thinking to identify problems and subproblems that need to be solved by our little robot.

• Where should the machine start? Along the millions to billions of base pairs, where should the machine be directed?
• Where should the machine stop?
• If we have start and stop sites, we will need ways of encoding that information so that our machine(s) can read this information. How will that be accomplished?
• How many RNA copies of the DNA will we need to make?
• How fast do the RNA copies need to be made?
• How accurately do the copies need to be made?
• How much energy will the process take and where is the energy going to come from?

These are only some core questions. One can dig deeper if they wish. However, these are already good enough for us to get a good feel for this process. Notice, too, that many of these questions are remarkably similar to those we inferred might be necessary to understand DNA replication.

The building blocks of transcription

The building blocks of RNA

Recall from our discussion on the structure of nucleotides that the building blocks of RNA are very similar to those in DNA. In RNA, the building blocks comprise nucleotide triphosphates that are composed of a ribose sugar, a nitrogenous base, and three phosphate groups. The key differences between the building blocks of DNA and those of RNA are that RNA molecules are composed of nucleotides with ribose sugars (as opposed to deoxyribose sugars) and use uridine, a uracil-containing nucleotide (as opposed to thymidine in DNA). Note below that uracil and thymine are structurally very similar: uracil just lacks a methyl (CH3) functional group compared to thymine.

Figure 1. The basic chemical components of nucleotides.
Attribution: Marc T. Facciotti (original work)

Transcription initiation


Proteins responsible for creating an RNA copy of a specific piece of DNA (transcription) must first be able to recognize the beginning of the element to be copied. A promoter is a DNA sequence onto which various proteins, collectively known as the transcription machinery, bind and start transcription. In most cases, promoters exist upstream (5' to the coding region) of the genes they regulate. The specific sequence of a promoter is very important because it determines whether the cell transcribes the corresponding coding portion of the gene all the time, sometimes, or infrequently. Although promoters vary among species, a few elements of similar sequence are sometimes conserved. At the -10 and -35 regions upstream of the initiation site, there are two promoter consensus sequences, or regions similar across many promoters and across various species. Some promoters will have a sequence very similar to the consensus sequence (the sequence containing the most common sequence elements), and others will look very different. These sequence variations affect the strength with which the transcriptional machinery can bind to the promoter to start transcription. This helps to control the number of transcripts that are made and how often they get made.

Figure 2. (a) A general diagram of a gene. The gene includes the promoter sequence, an untranslated region (UTR), and the coding sequence. (b) A list of several strong E. coli promoter sequences. The -35 box and -10 box are highly conserved sequences throughout the strong promoter list. Weaker promoters will have more base pair differences when compared to these sequences.

Bacterial vs. eukaryotic promoters

In bacterial cells, the -10 consensus sequence, called the -10 region, is AT rich, often TATAAT. The -35 sequence, TTGACA, is recognized and bound by the protein σ. Once this protein-DNA interaction is made, the subunits of the core RNA polymerase bind to the site. Because of the relatively lower stability of AT associations, the AT-rich -10 region facilitates unwinding of the DNA template, and several phosphodiester bonds are made.
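
The idea that weaker promoters differ from the consensus at more positions can be sketched as a simple mismatch count. This is a rough illustration, not a model of real promoter strength: the two consensus sequences come from the text, while the function names and the example boxes are hypothetical, and factors such as the spacing between the boxes are ignored.

```python
MINUS_35_CONSENSUS = "TTGACA"  # -35 consensus given in the text
MINUS_10_CONSENSUS = "TATAAT"  # -10 consensus given in the text


def mismatches(site, consensus):
    """Count positions where a site differs from a consensus of equal length."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for a, b in zip(site.upper(), consensus) if a != b)


def promoter_mismatch_score(minus_35, minus_10):
    """Total mismatches across both boxes; 0 means a perfect consensus match."""
    return (mismatches(minus_35, MINUS_35_CONSENSUS)
            + mismatches(minus_10, MINUS_10_CONSENSUS))


# Hypothetical promoter boxes, invented for illustration only.
strong_like = promoter_mismatch_score("TTGACA", "TATAAT")
weak_like = promoter_mismatch_score("TTTACA", "TACGAT")
```

Under this crude scoring, a lower number means the promoter looks more like the consensus, which the text associates with stronger binding by the transcriptional machinery.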

Eukaryotic promoters are much larger and more complex than prokaryotic promoters, but both have an AT-rich region. In eukaryotes, it is typically called a TATA box. For example, in the mouse thymidine kinase gene, the TATA box is located at approximately -30. For this gene, the exact TATA box sequence is TATAAAA, as read in the 5' to 3' direction on the nontemplate strand. This sequence is not identical to the E. coli -10 region, but both share the quality of being an AT-rich element.

Instead of a single bacterial polymerase, the genomes of most eukaryotes encode three different RNA polymerases, each made up of ten protein subunits or more. Each eukaryotic polymerase also requires a distinct set of proteins known as transcription factors to recruit it to a promoter. In addition, an army of other transcription factors, along with regulatory elements known as enhancers and silencers, help to regulate the synthesis of RNA from each promoter. Enhancers and silencers affect the efficiency of transcription but are not necessary for the initiation of transcription or its progression. Basal transcription factors are crucial in the formation of a preinitiation complex on the DNA template that subsequently recruits RNA polymerase for transcription initiation.

The initiation of transcription begins with the binding of RNA polymerase to the promoter. Transcription requires the DNA double helix to partially unwind such that one strand can be used as the template for RNA synthesis. The region of unwinding is called a transcription bubble.

Figure 3. During elongation, RNA polymerase tracks along the DNA template, synthesizes mRNA in the 5' to 3' direction, and unwinds then rewinds the DNA as it is read.


Elongation

Transcription always proceeds from the template strand, one of the two strands of the double-stranded DNA. The RNA product is complementary to the template strand and is almost identical to the nontemplate strand, called the coding strand, with the exception that RNA contains a uracil (U) in place of the thymine (T) found in DNA. During elongation, an enzyme called RNA polymerase proceeds along the DNA template, adding nucleotides by base pairing with the DNA template in a manner similar to DNA replication, with the difference that the RNA strand being synthesized does not remain bound to the DNA template. As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme and rewound behind it. Note that the direction of synthesis is identical to that of synthesis in DNA: 5' to 3'.
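
The relationship between the template strand, the coding strand, and the RNA product can be checked with a short sketch. The function names and the example sequence are hypothetical. The template strand is written 3' to 5' here so that each position lines up with the coding strand and the RNA, both written 5' to 3'.

```python
# Base-pairing rules used to build RNA on a DNA template (A in DNA pairs with U in RNA).
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3):
    """Return the template strand, written 3'->5', for a coding strand written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3.upper())


def transcribe_template(template_3to5):
    """Return the RNA (5'->3') that is complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3to5.upper())


def coding_to_rna(coding_5to3):
    """Shortcut: the RNA matches the coding strand with U in place of T."""
    return coding_5to3.upper().replace("T", "U")


coding = "ATGGCCTTTAAG"                  # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # "TACCGGAAATTC", read 3'->5'
rna = transcribe_template(template)      # "AUGGCCUUUAAG", read 5'->3'
```

Both routes give the same RNA: base pairing against the template strand, or swapping T for U in the coding strand. That is why the nontemplate strand is called the coding strand.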

Figure 4. During elongation, RNA polymerase tracks along the DNA template, synthesizing mRNA in the 5' to 3' direction, unwinding and then rewinding the DNA as it is read.

Figure 5. The addition of nucleotides during the process of transcription is very similar to nucleotide addition in DNA replication. The RNA is polymerized from 5' to 3', and with each addition of a nucleotide, a phosphoanhydride bond is hydrolyzed by the enzyme, resulting in a longer polymer and the release of pyrophosphate (two linked phosphate groups).

Bacterial vs. eukaryotic elongation

In bacteria, elongation begins with the release of the σ subunit from the polymerase. The dissociation of σ allows the core enzyme to proceed along the DNA template, synthesizing mRNA in the 5' to 3' direction at a rate of approximately 40 nucleotides per second. As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme and rewound behind it. The base pairing between DNA and RNA is not stable enough to maintain the stability of the mRNA synthesis components. Instead, the RNA polymerase acts as a stable linker between the DNA template and the nascent RNA strands to ensure that elongation is not interrupted.

In eukaryotes, following the formation of the preinitiation complex, the polymerase is released from the other transcription factors, and elongation is allowed to proceed as it does in prokaryotes, with the polymerase synthesizing pre-mRNA in the 5' to 3' direction. As discussed previously, RNA polymerase II transcribes the major share of eukaryotic genes, so this section will focus on how this polymerase accomplishes elongation and termination.

Possible NB Discussion Point

Compare and contrast the energy story for DNA replication initiation + elongation to the energy story for transcription initiation + elongation.


Termination

In bacteria

Once a gene is transcribed, the bacterial polymerase needs to dissociate from the DNA template and liberate the newly made mRNA. Depending on the gene being transcribed, there are two kinds of termination signals. One is protein-based, and the other is RNA-based. Rho-dependent termination is controlled by the rho protein, which tracks along behind the polymerase on the growing mRNA chain. Near the end of the gene, the polymerase encounters a run of G nucleotides on the DNA template and it stalls. As a result, the rho protein collides with the polymerase. The interaction with rho releases the mRNA from the transcription bubble.

Rho-independent termination is controlled by specific sequences in the DNA template strand. As the polymerase nears the end of the gene being transcribed, it encounters a region rich in CG nucleotides. The mRNA folds back on itself, and the complementary CG nucleotides bind together. The result is a stable hairpin that causes the polymerase to stall as soon as it begins to transcribe a region rich in AT nucleotides. The complementary UA region of the mRNA transcript forms only a weak interaction with the template DNA. This, coupled with the stalled polymerase, induces enough instability for the core enzyme to break away and liberate the new mRNA transcript.
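
The rho-independent signal described above (a GC-rich stretch that can fold into a hairpin, followed by a run of U in the mRNA) can be sketched as a simple pattern search. This is a toy heuristic under stated assumptions, not a real terminator predictor: the function names, the thresholds (`stem_len`, `loop_range`, `min_u_run`, `min_gc`), and the example sequences are all hypothetical, and only perfectly base-paired stems are recognized.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would base pair with `rna` in a hairpin stem."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_terminator(rna, stem_len=6, loop_range=(3, 8), min_u_run=4, min_gc=0.6):
    """Return the start index of the first GC-rich hairpin followed by a U run, or None."""
    rna = rna.upper()
    for start in range(len(rna) - 2 * stem_len):
        left = rna[start:start + stem_len]
        if gc_fraction(left) < min_gc:
            continue  # stem arm is not GC-rich enough
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop
            right_end = right_start + stem_len
            right = rna[right_start:right_end]
            tail = rna[right_end:right_end + min_u_run]
            if right == reverse_complement(left) and tail == "U" * min_u_run:
                return start
    return None


# Invented transcripts: the first ends in a GC-rich hairpin plus a U run, the second lacks the U run.
with_terminator = "AUGGCU" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
without_u_run = "AUGGCU" + "GCCGCC" + "UUCG" + "GGCGGC" + "AAGG"
hit = find_terminator(with_terminator)
miss = find_terminator(without_u_run)
```

In the first transcript the stem arm starts at index 6, so `hit` is 6, while `miss` is `None` because the hairpin is not followed by the weakly paired U-rich stretch.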

In eukaryotes

The termination of transcription is different for the different polymerases. Unlike in prokaryotes, elongation by RNA polymerase II in eukaryotes takes place 1,000–2,000 nucleotides beyond the end of the gene being transcribed. This pre-mRNA tail is subsequently removed by cleavage during mRNA processing. On the other hand, RNA polymerases I and III require termination signals. Genes transcribed by RNA polymerase I contain a specific 18-nucleotide sequence that is recognized by a termination protein. The process of termination in RNA polymerase III involves an mRNA hairpin similar to rho-independent termination of transcription in prokaryotes.

In archaea

Termination of transcription in the archaea is far less studied than in the other two domains of life and is still not well understood. While the functional details are likely to resemble mechanisms that have been seen in the other domains of life, the details are beyond the scope of this course.

Cellular location

In bacteria and archaea

In bacteria and archaea, transcription occurs in the cytoplasm, where the DNA is located. Because the location of the DNA, and thus the process of transcription, are not physically segregated from the rest of the cell, translation often starts before transcription has finished. This means that mRNA in bacteria and archaea is used as the template for a protein before the entire mRNA has been produced. The lack of spatial segregation also means that there is very little temporal segregation for these processes. Figure 6 shows the processes of transcription and translation occurring simultaneously.

Figure 6. The addition of nucleotides during the process of transcription is very similar to nucleotide addition in DNA replication.
Source: Marc T. Facciotti (own work)

In eukaryotes

In eukaryotes, the process of transcription is physically segregated from the rest of the cell, sequestered inside of the nucleus. This results in two things: the mRNA is completed before translation can start, and there is time to "adjust" or "edit" the mRNA before translation starts. The physical separation of these processes gives eukaryotes a chance to alter the mRNA in such a way as to extend the lifespan of the mRNA or even alter the protein product that will be produced from the mRNA.

mRNA processing

5' G-cap and 3' poly-A tail

When a eukaryotic gene is transcribed, the cell processes the primary transcript in the nucleus in several ways. Eukaryotic cells modify mRNAs at the 3' end by the addition of a poly-A tail. This run of A residues is added by an enzyme that does not use genomic DNA as a template. The mRNAs also have a chemical modification of the 5' end, called a 5'-cap. Data suggest that these modifications help both to increase the lifespan of the mRNA (preventing its premature degradation in the cytoplasm) and to help the mRNA start translation.

Figure 7. Pre-mRNAs are processed in a series of steps. Introns are removed, and a 5' cap and poly-A tail are added.

Possible NB Discussion Point

Transcriptomics is a branch of “-omics” that involves studying the transcriptome of an organism or population, that is, the complete set of all RNA molecules. What kind of information can you obtain from studying the transcriptome(s)? Can you think of any cool scientific questions that a transcriptomic analysis might help resolve? What are some constraints to transcriptomic approaches one might keep in mind when conducting analyses?

Alternative splicing

Splicing occurs on most eukaryotic mRNAs: introns are removed from the mRNA sequence and exons are ligated together. This can create a much shorter mRNA than initially transcribed. Splicing allows cells to mix and match which exons are incorporated into the final mRNA product. As shown in the figure below, this can lead to multiple proteins being coded for by a single gene.

Figure 8. The information stored in the DNA is finite. In some cases, organisms can mix and match this information to create different end products. In eukaryotes, alternative splicing allows for the creation of different mRNA products, which in turn are used in translation to create different protein sequences. This ultimately leads to the production of different protein shapes, and thus different protein functions.
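
The "mix and match" idea behind alternative splicing can be sketched by enumerating which exons end up in the final mRNA. This is a simplified illustration that only models exon skipping: the function name `splice_isoforms`, the exon sequences, and the choice of which exons are optional are all hypothetical, and exon order is always preserved.

```python
from itertools import combinations
from typing import List, Set


def splice_isoforms(exons: List[str], optional: Set[int]) -> List[str]:
    """Return every mRNA made by keeping required exons and any subset of optional ones."""
    required = set(range(len(exons))) - optional
    products = []
    for n in range(len(optional) + 1):
        for chosen in combinations(sorted(optional), n):
            keep = sorted(required | set(chosen))  # exons stay in genomic order
            products.append("".join(exons[i] for i in keep))
    return products


# Hypothetical exon sequences; exons 1 and 2 (0-based) may each be included or skipped.
exons = ["AUGGCU", "GAAUUC", "CCGGAA", "UAA"]
mrnas = splice_isoforms(exons, optional={1, 2})
```

With two optional exons, this single hypothetical gene yields four distinct mRNAs, each of which could be translated into a different protein sequence.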