The genome contains the hereditary information for the structure and function of a cell or organism. This information is stored as a sequence of bases in DNA. A relatively small percentage of DNA codes for proteins and ribonucleic acids (RNAs), while a large part of the genome is composed of sequences without a clear function. The conversion of the information stored within DNA into functional molecules (RNA and proteins) is termed gene expression. Gene expression occurs in two stages: transcription and translation. During transcription, DNA is copied into RNA. RNA is then used to synthesize proteins during translation.
Key enzymes involved in transcription are DNA-dependent RNA polymerases. These enzymes synthesize the RNA molecule based on the genes encoded in DNA, which contain starting sites (promoters) where transcription begins. Transcription factors are required to recognize the promoter. RNA polymerase moves along the template strand of the double-stranded DNA. The RNA strand is synthesized until the end of the DNA segment (termination site) is reached. In eukaryotes, the newly formed primary transcript is further modified before it can be used, for example, for protein synthesis.
Gene expression is strongly regulated at all levels. Some genes are expressed in all cells and are required as housekeeping genes for basic cellular functions (i.e., constitutive expression). Other genes are only active in certain cells; their expression is regulated by a variety of mechanisms. Genes can undergo activation or silencing, and transcription depends on the presence of specific DNA-binding proteins. The newly formed RNA may also be degraded after transcription by various mechanisms before use in protein synthesis. There are also regulatory mechanisms at a translational level. Although each cell in an organism contains the same DNA, the regulated expression of certain genes causes the cells to specialize and assume different functions, e.g., muscle cells or hepatocytes.
- Gene expression: conversion of genetic information stored in DNA into a functional gene product (RNA and proteins)
- Protein synthesis: process of gene expression (consisting of transcription and translation) as well as post-transcriptional modifications (see the article on translation and protein synthesis for more information)
- Central dogma of molecular biology: genetic information generally flows in one direction, from DNA to RNA to protein
- Sense strand: the DNA strand in double-stranded DNA that is complementary to the antisense strand and has an almost identical base sequence to the mRNA that is transcribed along the antisense strand (with thymine in place of uracil). The sense strand does not serve as a template during transcription.
- Antisense strand: the DNA strand in double-stranded DNA that is used as a template for transcription to produce the complementary mRNA strand
- Promoter: specific DNA sequence located upstream (= in the 5′ region) of a gene that regulates transcription
- Contains AT-rich sequences (e.g., TATA box and CAAT box)
- Binding site for RNA polymerase II and several other transcription factors at the start of transcription
- Mutations in the promoter usually lead to a severely decreased transcription rate.
- Exon-intron structure: eukaryotic genes are composed of alternating coding and noncoding regions
- Substrates: the nucleoside triphosphates ATP, GTP, CTP, and UTP
- Enzymes: RNA polymerases
- General transcription factors: specific helper proteins that help RNA polymerase find and bind to the promoter and initiate RNA synthesis
RNA polymerases and transcription factors
Transcription reactions are catalyzed by (DNA-dependent) RNA polymerases. In eukaryotic cells, there are various types of RNA polymerase, which recognize different promoter types and transcribe different types of genes. In prokaryotes, on the other hand, there is only one type of RNA polymerase, which transcribes all three types of RNA (mRNA, rRNA, and tRNA).
- Structure: composed of two large subunits with many polypeptide chains
- Function: synthesis of a new RNA strand from 5′ to 3′ direction; reading of the DNA strand from 3′ to 5′ direction
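The relationship between the template (antisense) strand, the sense strand, and the RNA product can be illustrated with a minimal Python sketch. The sequences and the function names `transcribe` and `sense_to_rna` are made up for illustration; the sketch models only base pairing and strand orientation, not promoter recognition or termination.

```python
# Base pairing used when RNA is synthesized on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', for a template strand written 3'->5'.

    Because the template is read 3'->5' while the RNA grows 5'->3',
    pairing base by base in the given order yields the RNA in 5'->3' order.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def sense_to_rna(sense_5_to_3: str) -> str:
    """The sense strand matches the RNA except that T is replaced by U."""
    return sense_5_to_3.upper().replace("T", "U")


# Hypothetical example sequences (not taken from a real gene).
template = "TACGGCATT"  # antisense strand, written 3'->5'
sense = "ATGCCGTAA"     # sense strand, written 5'->3'

rna = transcribe(template)
print(rna)                         # RNA read off the template strand
print(rna == sense_to_rna(sense))  # same sequence as the sense strand with U for T
```

Writing the template strand 3′ to 5′ keeps the sketch free of an explicit reversal step; if the template were supplied 5′ to 3′, it would have to be reversed before pairing.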
|Overview of RNA polymerases|
|Type of RNA polymerase||Transcripts||Location|
|RNA polymerase I (most common type)|| || |
|RNA polymerase II|| || |
|RNA polymerase III|| || |
|Mitochondrial RNA polymerase|| || |
- General transcription factors: enable binding of RNA polymerase to the proximal promoter region by binding to specific base sequences of chromosomal DNA → start of transcription
- Specific transcription factors
Proteins that bind to DNA, such as transcription factors, require specific protein domains, also termed structural motifs. These structural motifs usually use either an α-helix or a β-sheet to bind to the major groove of DNA. Through these DNA-binding domains, transcription factors interact with specific DNA segments to perform their function. Numerous structural motifs of DNA-binding domains have been identified. Important examples are zinc finger domains, leucine zippers, the basic helix-loop-helix, and the homeobox.
- Zinc finger
- DNA binding: The DNA-binding hydrophilic regions of α-helices contain many basic residues that interact with the major groove of DNA.
- Basic helix-loop-helix
- Homeobox (with helix-turn-helix)
Stages of transcription
Transcription is divided into three phases: initiation, elongation, termination.
Initiation (transcription): the start of transcription by the formation of the initiation complex and unwinding of DNA
- Preinitiation complex (RNA polymerase-promoter closed complex) formation by binding of general transcription factors and RNA polymerase to the promoter region (e.g., TATA box, CAAT box, GC box)
- Formation of a transcription bubble by unwinding the DNA double helix into single strands over a length of 10–12 bases (open complex)
- Start of RNA synthesis
- Termination: RNA synthesis ends once RNA polymerase reaches the termination site.
In eukaryotes, the end-product of transcription is heterogeneous nuclear RNA (hnRNA), which is then transformed into mature mRNA through posttranscriptional modifications in the nucleus. These modifications include capping, polyadenylation, splicing, and RNA editing. mRNA then leaves the nucleus and enters the cytosol.
- Definition: addition of a 7-methylguanosine to the 5′ end of hnRNA to form the five-prime cap
- Cleavage of the 5′-phosphate group by RNA triphosphatase
- Addition of a GMP residue (formed from GTP with cleavage of pyrophosphate) to the 5′ diphosphate end of hnRNA by guanylyltransferase
- Methylation of one, two, or three ribose residues of hnRNA with S-adenosylmethionine (SAM) as a methyl group donor
- Protects against degradation (by exonucleases)
- Initiation of translation
- Definition: addition of a tail of ∼200 adenosine monophosphates (poly(A) tail) to the 3′ end of hnRNA
- ↑ Stability (protects against early degradation)
- Initiates translation
- Definition: excision of introns from hnRNA transcripts and direct linkage of exons
- Function: excision of introns so that the resulting mature mRNA only contains relevant information in the form of exons
Spliceosome formation at the exon-intron border
- Complex of:
- Involved sequence segments on the hnRNA:
- Mutations in the intronic splice site of the β-globin locus result in improper splicing, which leads to expression of abnormal β-globin in beta-thalassemia.
- Defective snRNP assembly can lead to congenital conditions such as spinal muscular atrophy, in which assembly is impaired due to decreased SMN protein.
- Opening of the exon-intron border at the 5′ splice site: A temporary lariat structure with a 2′ → 5′ phosphodiester bond is formed, which brings the two exon ends that are to be joined into proximity (loop formation)
- Opening of the exon-intron border at the 3′ splice site
- Joining of the exon ends
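The net result of the splicing steps above (introns removed, exons joined directly) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the transcript and the exon coordinates are invented, the function name `splice` is hypothetical, and the spliceosome and lariat chemistry are not modeled.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments in order and drop everything in between.

    exons holds (start, end) positions, 0-based and end-exclusive,
    listed in 5'->3' order along the primary transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy primary transcript: uppercase = exon, lowercase = intron (made-up sequence).
exon_1 = "AUGGCC"
intron = "guaaguacuaacag"
exon_2 = "GAUUAA"
pre_mrna = exon_1 + intron + exon_2

# Exon coordinates derived from the segment lengths above.
exons = [
    (0, len(exon_1)),
    (len(exon_1) + len(intron), len(pre_mrna)),
]

mature = splice(pre_mrna, exons)
print(mature)  # exons only, joined directly
```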
- Definition: alteration of RNA base sequences by the insertion, deletion, or modification of individual bases (independent of splicing)
- Function: possibility of producing various proteins
- A-to-I editing: adenosine is deaminated to inosine, i.e., the base adenine is converted to hypoxanthine
- C-to-U editing: cytidine is deaminated to uridine, i.e., the base cytosine is converted to uracil
- Occurs in mRNA
- Typical example of C-to-U editing
- The mRNA for apolipoprotein B (apoB) codes for apoB-100.
- After editing, the mRNA for apoB codes for a markedly smaller protein, apoB-48, because cytidine deaminase converts a cytidine to uridine, which generates a premature stop codon.
- As a result of C-to-U editing, enterocytes produce apoB-48, whereas hepatocytes produce apoB-100 from the unedited mRNA.
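How a single C-to-U change can shorten the protein is easy to see in a small sketch. The sequence below is a made-up toy transcript, not the apoB mRNA, and the function names are hypothetical; the only fact borrowed from apoB editing is that a CAA codon becomes the stop codon UAA, which is the commonly described change and is stated here as background rather than taken from the text above.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna: str, position: int) -> str:
    """Replace the cytidine at the given 0-based position with uridine."""
    if mrna[position] != "C":
        raise ValueError("C-to-U editing requires a C at the target position")
    return mrna[:position] + "U" + mrna[position + 1:]


def translated_codons(mrna: str) -> list[str]:
    """Return the codons read in frame from position 0 up to the first stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Toy transcript: AUG GAU CAA UAU GCA UAA (hypothetical, not the real apoB sequence).
unedited = "AUGGAUCAAUAUGCAUAA"
edited = edit_c_to_u(unedited, 6)  # CAA -> UAA creates a premature stop codon

print(len(translated_codons(unedited)))  # codons translated without editing
print(len(translated_codons(edited)))    # fewer codons after editing
```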
- Definition: removal of introns within hnRNA with differential joining of exons
- Process: similar to splicing with additional splicing factors that determine the range of splice locations
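Differential joining of exons can be sketched as an enumeration of exon combinations. The following Python example is only an illustration: the exon sequences are invented, the function name `isoforms` is hypothetical, and exon skipping is assumed as the only form of alternative splicing, with no model of the splicing factors that make the choice in a cell.

```python
from itertools import product


def isoforms(exons: list[str], optional: list[int]) -> list[str]:
    """Return every mRNA obtained by including or skipping each optional exon.

    exons    -- exon sequences in transcript order
    optional -- indices of exons that may be skipped; all others are always kept
    """
    optional = sorted(optional)
    results = []
    for choice in product([True, False], repeat=len(optional)):
        skipped = {idx for idx, keep in zip(optional, choice) if not keep}
        results.append(
            "".join(seq for i, seq in enumerate(exons) if i not in skipped)
        )
    return results


# Three made-up exons; the middle one is treated as skippable.
exons = ["AUGGCC", "GAUCCA", "UUUUAA"]
variants = isoforms(exons, optional=[1])

for mrna in variants:
    print(mrna)
```

With one skippable exon the sketch yields two transcripts; each additional skippable exon doubles the number of possible combinations.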
Quality control of mRNA
Prokaryotic gene regulation (operon model)
Regulation of gene expression was initially analyzed in E. coli. Regulatory sequences in the bacterial genome ensure gene expression of the enzyme β-galactosidase if the sugar lactose is available as an energy source. Other proteins are also synthesized, which are associated with lactose metabolism. Therefore, it involves the coordinated expression of several genes.
- Definition: a model for describing the gene regulatory mechanism in prokaryotes
- Function: adapt to changing environmental conditions by simultaneously increasing the expression of certain related genes
Example: lac operon
- Description: A transcriptional unit of genes for enzymes involved in lactose metabolism (e.g., β-galactosidase) that is only expressed in the presence of lactose. The lac operon is a classic example of how an environmental signal triggers a genetic response.
- Components (in their order in the genome)
- Regulatory gene lacI: does not directly belong to the lac operon but codes for a repressor protein that binds to the lac operator in the absence of lactose and prevents transcription
- Promoter: binding site for catabolite activator protein (CAP) and RNA polymerase in transcription
- Operator: binding site of the repressor that overlaps with the promoter
- lacZ: β-galactosidase gene
- lacY: permease gene
- lacA: transacetylase gene
- Presence of glucose and absence of lactose → the lac repressor binds to the operator → RNA polymerase cannot bind to the promoter → transcription cannot take place → very few β-galactosidase molecules in the cell
- Absence of glucose and presence of lactose → ↑ transcription
- Presence of glucose and lactose: very low basal expression of lac genes
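The three conditions above can be summarized as a small decision function. This is a simplified sketch with assumptions that go beyond the list: the function name and the labels `"off"`, `"basal"`, and `"high"` are invented, CAP is assumed to activate transcription only when glucose is absent, and the case in which both sugars are absent is not covered above and is filled in on the assumption that the repressor stays bound whenever lactose is missing.

```python
def lac_operon_expression(glucose: bool, lactose: bool) -> str:
    """Return a coarse expression level for the lac genes.

    Simplified model:
    - Without lactose, the repressor sits on the operator and blocks transcription.
    - With lactose, the repressor is released; strong transcription additionally
      needs CAP activation, which is assumed to occur only without glucose.
    """
    repressor_bound = not lactose
    cap_active = not glucose  # assumption, not stated in the list above

    if repressor_bound:
        return "off"    # also used for the uncovered case: no glucose, no lactose
    if cap_active:
        return "high"   # lactose present, glucose absent
    return "basal"      # lactose and glucose both present


for glucose in (True, False):
    for lactose in (True, False):
        level = lac_operon_expression(glucose, lactose)
        print(f"glucose={glucose}, lactose={lactose}: {level}")
```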
Eukaryotic gene regulation
Regulation of gene expression is significantly more complex in eukaryotes than in prokaryotes. One reason is the difference in genome size: eukaryotes have a significantly larger genome. Another reason is that the DNA in the eukaryotic nucleus is strongly condensed and packaged as chromatin. As a result, it is less accessible than prokaryotic DNA. However, a common feature of eukaryotes and prokaryotes is the importance of activators and repressors, which bind specific DNA sequences and increase or inhibit gene expression.
Distal regulatory elements: DNA sequences that can affect the transcription rate of a gene and can be located upstream of, downstream of, or within an intron of the gene they regulate
- Short DNA sequences ∼ 20 bp in length
- Mainly a palindrome or a tandem repeat
- When specific transcription factors (activators) bind to enhancers, the transcription rate of a gene on the same chromosome increases.
- Example of an enhancer: hypoxia-response element (HRE)
- The transcription factor hypoxia-inducible factor (HIF) binds to the HRE sequence during hypoxia and induces certain target genes that are important in the response to hypoxia, e.g., expression of EPO and VEGF.
- In normoxia (sufficient amount of oxygen), HIF is hydroxylated by HIF prolyl hydroxylase. Hydroxy-HIF is ubiquitinylated and degraded in the proteasome and is unable to increase the expression of its own target genes.
|Actinomycin D (dactinomycin)|| |