Genes show a wide distribution of sizes



KEY CONCEPTS:
  • Most genes are uninterrupted in yeasts, but are interrupted in higher eukaryotes.
  • Exons are usually short, typically coding for <100 amino acids.
  • Introns are short in lower eukaryotes, but range up to several tens of kb in length in higher eukaryotes.
  • The overall length of a gene is determined largely by its introns. 

Figure 2.13 shows the overall organization of genes in yeasts, insects, and mammals. In S. cerevisiae, the great majority of genes (>96%) are not interrupted, and those that are interrupted usually remain reasonably compact. There are virtually no S. cerevisiae genes with more than 4 exons.
In insects and mammals, the situation is reversed. Only a few genes have uninterrupted coding sequences (6% in mammals). Insect genes tend to have a fairly small number of exons, typically fewer than 10. Mammalian genes are split into more pieces, and some have several tens of exons. About 50% of mammalian genes have more than 10 introns.






Examining the consequences of this type of organization for the overall size of the gene, we see in Figure 2.14 that there is a striking difference between yeast and the higher eukaryotes. The average yeast gene is 1.4 kb long, and very few are longer than 5 kb. The predominance of interrupted genes in higher eukaryotes, however, means that the gene can be much larger than the unit that codes for protein. Relatively few genes in flies or mammals are shorter than 2 kb, and many have lengths between 5 kb and 100 kb. The average human gene is 27 kb long (see Figure 3.22).
The switch from largely uninterrupted to largely interrupted genes occurs in the lower eukaryotes. In fungi (excepting the yeasts), the majority of genes are interrupted, but they have a relatively small number of exons (<6) and are fairly short (<5 kb). The switch to long genes occurs within the higher eukaryotes, and genes become significantly larger in the insects. With this increase in the length of the gene, the relationship between genome complexity and organism complexity is lost (see Figure 3.5).
As genome size increases, the tendency is for introns to become rather large, while exons remain quite small.

Figure 2.15 shows that the exons coding for stretches of protein tend to be fairly small. In higher eukaryotes, the average exon codes for ~50 amino acids, and the general distribution fits well with the idea that genes have evolved by the slow addition of units that code for small, individual domains of proteins (see 2.9 How did interrupted genes evolve?). There is no very significant difference in the sizes of exons in different types of higher eukaryotes, although the distribution is more compact in vertebrates, where there are few exons longer than 200 bp. In yeast, there are some longer exons that represent uninterrupted genes where the coding sequence is intact. There is a tendency for exons coding for untranslated 5′ and 3′ regions to be longer than those that code for proteins.
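The amino acid and base-pair figures in this paragraph are linked by the triplet code: each amino acid corresponds to three bases. A minimal Python sketch of that conversion follows. The function names are illustrative and not from the text, and the sketch assumes an exon that is entirely coding and begins in frame.

```python
BASES_PER_CODON = 3  # triplet genetic code: one amino acid per three bases


def exon_bp_for_amino_acids(n_amino_acids: int) -> int:
    """Coding bases needed to specify the given number of amino acids."""
    return n_amino_acids * BASES_PER_CODON


def amino_acids_for_exon_bp(exon_bp: int) -> int:
    """Whole codons in an exon of the given length.

    Assumption: the exon is fully coding and starts in frame.
    """
    return exon_bp // BASES_PER_CODON


print(exon_bp_for_amino_acids(50))   # 150
print(amino_acids_for_exon_bp(200))  # 66
```

On these assumptions, an average exon of ~50 amino acids is about 150 bp long, and a 200 bp exon codes for at most 66 amino acids.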






Figure 2.16 shows that introns vary widely in size. In worms and flies, the average intron is not much longer than the exons. There are no very long introns in worms, but flies have a significant proportion of them. In vertebrates, the size distribution is much wider, extending from approximately the same length as the exons (<200 bp) to lengths measured in tens of kb, and extending up to 50-60 kb in extreme cases.
Very long genes are the result of very long introns, not the result of coding for longer products. There is no correlation between gene size and mRNA size in higher eukaryotes; nor is there a good correlation between gene size and the number of exons. The size of a gene therefore depends primarily on the lengths of its individual introns. In mammals, insects, and birds, the "average" gene is approximately 5× the length of its mRNA.
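The arithmetic behind this point can be sketched in a few lines of Python. The gene below is hypothetical: its exon and intron lengths are invented for illustration and chosen so that the gene comes out at 5× the length of its spliced transcript. The function names are likewise illustrative, and the mRNA length is taken simply as the sum of the exons, ignoring the poly(A) tail.

```python
def mrna_length(exons_bp):
    """Length of the spliced transcript: the sum of the exon lengths."""
    return sum(exons_bp)


def gene_length(exons_bp, introns_bp):
    """Length of the gene: all exons plus all introns.

    A gene with n exons has n - 1 introns between them.
    """
    if len(introns_bp) != len(exons_bp) - 1:
        raise ValueError("expected exactly one intron between each pair of exons")
    return sum(exons_bp) + sum(introns_bp)


# Hypothetical gene (lengths in bp), invented for illustration only.
exons = [200, 150, 150, 500]
introns = [1000, 2500, 500]

gene = gene_length(exons, introns)
mrna = mrna_length(exons)
print(gene, mrna, gene / mrna)  # 5000 1000 5.0

# Lengthening a single intron enlarges the gene but leaves the mRNA unchanged.
long_introns = [1000, 50000, 500]
print(gene_length(exons, long_introns), mrna_length(exons))  # 52500 1000
```

The second call shows why gene size tracks intron length and not product length: the same four exons give the same 1 kb transcript whether the gene spans 5 kb or more than 50 kb.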






