Biology
Eukaryotic genomes contain both nonrepetitive and repetitive DNA sequences
KEY TERMS:
- Nonrepetitive DNA shows reassociation kinetics expected of unique sequences.
- Repetitive DNA behaves in a reassociation reaction as though many (related or identical) sequences are present in a component, allowing any pair of complementary sequences to reassociate.
- A transposon (transposable element) is a DNA sequence able to insert itself (or a copy of itself) at a new location in the genome, without having any sequence relationship with the target locus.
- Selfish DNA describes sequences that do not contribute to the genotype of the organism but have self-perpetuation within the genome as their sole function.
KEY CONCEPTS:
- The kinetics of DNA reassociation after a genome has been denatured distinguish sequences by their frequency of repetition in the genome.
- Genes are generally coded by sequences in nonrepetitive DNA.
- Larger genomes within a phylum do not contain more genes, but have large amounts of repetitive DNA.
- A large part of repetitive DNA may be made up of transposons.
The general nature of the eukaryotic genome can be assessed by the kinetics of reassociation of denatured DNA. This technique was used extensively before large-scale DNA sequencing became possible (for review see 32.1 DNA reassociation kinetics).
Reassociation kinetics identify two general types of genomic sequences (Britten and Davidson, 1971; Davidson and Britten, 1973):
- Nonrepetitive DNA consists of sequences that are unique: there is only one copy in a haploid genome.
- Repetitive DNA describes sequences that are present in more than one copy in each genome.
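The way reassociation separates these two types can be illustrated with the idealized second-order rate equation commonly used for DNA reassociation, C/C0 = 1/(1 + k·C0t), where C/C0 is the fraction of DNA still single-stranded and C0t is the product of initial DNA concentration and time. The sketch below is only an illustration under stated assumptions: the rate constant `K_UNIQUE` and the copy number `COPIES` are hypothetical values, not measurements, and a sequence present in n copies is assumed to reassociate n times faster than a unique sequence.

```python
def fraction_single_stranded(cot: float, k: float) -> float:
    """Idealized second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k: float) -> float:
    """C0t value at which half of the DNA has reassociated (C/C0 = 0.5)."""
    return 1.0 / k


# Hypothetical values chosen only for illustration.
K_UNIQUE = 0.001   # assumed rate constant for a single-copy sequence
COPIES = 1000      # assumed repetition frequency of a repetitive sequence

# Assumption: n copies raise the effective rate constant n-fold.
k_repetitive = K_UNIQUE * COPIES

for label, k in [("nonrepetitive", K_UNIQUE), ("repetitive", k_repetitive)]:
    print(f"{label}: Cot1/2 = {cot_half(k):g}")
```

Under these assumptions the repetitive component reaches half-reassociation at a C0t value lower by a factor equal to its copy number, which is why components with different repetition frequencies appear as separate steps in a reassociation curve.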
Repetitive DNA is often divided into two general types:
- Moderately repetitive DNA consists of relatively short sequences that are repeated typically 10-1000× in the genome. The sequences are dispersed throughout the genome, and are responsible for the high degree of secondary structure formation in pre-mRNA, when (inverted) repeats in the introns pair to form duplex regions.
- Highly repetitive DNA consists of very short sequences (typically <100 bp) that are present many thousands of times in the genome, often organized as long tandem repeats (see 4.11 Satellite DNAs often lie in heterochromatin). Neither class of repetitive DNA codes for protein.
The proportion of the genome occupied by nonrepetitive DNA varies widely. Figure 3.8 summarizes the genome organization of some representative organisms. Prokaryotes contain only nonrepetitive DNA. For lower eukaryotes, most of the DNA is nonrepetitive; <20% falls into one or more moderately repetitive components. In animal cells, up to half of the DNA often is occupied by moderately and highly repetitive components. In plants and amphibians, the moderately and highly repetitive components may account for up to 80% of the genome, so that the nonrepetitive DNA is reduced to a minority component.
A significant part of the moderately repetitive DNA consists of transposons, short sequences of DNA (~1 kb) that have the ability to move to new locations in the genome and/or to make additional copies of themselves (see 16 Transposons and 17 Retroviruses and retroposons). In some higher eukaryotic genomes they may even occupy more than half of the genome (see 3.11 The human genome has fewer genes than expected).
Transposons are sometimes viewed as fitting the concept of selfish DNA, defined as sequences that propagate themselves within a genome, without contributing to the development of the organism. Transposons may sponsor genome rearrangements, and these could confer selective advantages, but it is fair to say that we do not really understand why selective forces do not act against transposons becoming such a large proportion of the genome. Another term that is sometimes used to describe the apparent excess of DNA is junk DNA, meaning genomic sequences without any apparent function. Of course, it is likely that there is a balance in the genome between the generation of new sequences and the elimination of unwanted sequences, and some proportion of DNA that apparently lacks function may be in the process of being eliminated.
The length of the nonrepetitive DNA component tends to increase with overall genome size, as we proceed up to a total genome size of ~3 × 10⁹ bp (characteristic of mammals). Further increase in genome size, however, generally reflects an increase in the amount and proportion of the repetitive components, so that it is rare for an organism to have a nonrepetitive DNA component >2 × 10⁹ bp. The nonrepetitive DNA content of genomes therefore accords better with our sense of the relative complexity of the organism. E. coli has 4.2 × 10⁶ bp, C. elegans increases an order of magnitude to 6.6 × 10⁷ bp, D. melanogaster increases further to ~10⁸ bp, and mammals increase another order of magnitude to ~2 × 10⁹ bp.
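The step sizes quoted above can be checked with a few lines of arithmetic. The following sketch uses only the nonrepetitive DNA values given in the text (the approximate values for D. melanogaster and mammals are taken at face value); the dictionary and function names are illustrative choices, not part of any established tool.

```python
import math

# Nonrepetitive DNA content in bp, as quoted in the text.
NONREPETITIVE_BP = {
    "E. coli": 4.2e6,
    "C. elegans": 6.6e7,
    "D. melanogaster": 1e8,   # quoted as ~10^8
    "mammals": 2e9,           # quoted as ~2 x 10^9
}


def fold_increases(sizes: dict) -> list:
    """Return (smaller, larger, fold change) for each consecutive pair."""
    names = list(sizes)
    return [(a, b, sizes[b] / sizes[a]) for a, b in zip(names, names[1:])]


for a, b, fold in fold_increases(NONREPETITIVE_BP):
    print(f"{a} -> {b}: {fold:.1f}x "
          f"({math.log10(fold):.2f} orders of magnitude)")
```

With these figures, the steps from E. coli to C. elegans and from D. melanogaster to mammals are each a little more than one order of magnitude, whereas the step from C. elegans to D. melanogaster is less than twofold.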
What type of DNA corresponds to protein-coding genes? Reassociation kinetics typically show that mRNA is derived from nonrepetitive DNA. The amount of nonrepetitive DNA is therefore a better indication than the total DNA of the coding potential. (However, more detailed analysis based on genomic sequences shows that many exons have related sequences in other exons [see 2.5 Exon sequences are conserved but introns vary]. Such exons evolve by a duplication to give copies that initially are identical, but which then diverge in sequence during evolution.)