Biology

Some exons can be equated with protein functions

KEY CONCEPTS:

Facts suggesting that exons were the building blocks of evolution, and that the first genes were interrupted, are:
- Gene structure is conserved between genes in very distant species.
- Many exons can be equated with coding for protein sequences that have particular functions.
- Related exons are found in different genes.

If current proteins evolved by combining ancestral proteins that were originally separate, the accretion of units is likely to have occurred sequentially over some period of time, with one exon added at a time (for review see Blake, 1985). Can the different functions from which these genes were pieced together be seen in their present structures? In other words, can we equate particular functions of current proteins with individual exons?

In some cases, there is a clear relationship between the structures of the gene and protein. The example par excellence is provided by the immunoglobulin proteins, which are coded by genes in which every exon corresponds exactly with a known functional domain of the protein. Figure 2.22 compares the structure of an immunoglobulin with its gene.

An immunoglobulin is a tetramer of two light chains and two heavy chains, which aggregate to generate a protein with several distinct domains. Light chains and heavy chains differ in structure, and there are several types of heavy chain. Each type of chain is expressed from a gene that has a series of exons corresponding with the structural domains of the protein.

In many instances, some of the exons of a gene can be identified with particular functions. In secretory proteins, the first exon, coding for the N-terminal region of the polypeptide, often specifies the signal sequence involved in membrane secretion. An example is insulin.

The view that exons are the functional building blocks of genes is supported by cases in which two genes may have some exons that are related to one another, while other exons are found only in one of the genes. Figure 2.23 summarizes the relationship between the receptor for human LDL (plasma low density lipoprotein) and other proteins. In the center of the LDL receptor gene is a series of exons related to the exons of the gene for the precursor for EGF (epidermal growth factor). In the N-terminal part of the protein, a series of exons codes for a sequence related to the blood protein complement factor C9. So the LDL receptor gene was created by assembling modules for its various functions. These modules are also used in different combinations in other proteins.
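The modular relationship described above can be sketched as a comparison of sets. This is only an illustration, not a representation of the real gene structures: each gene is reduced to the module families named in the text, the number and order of exons within each family are omitted because they are not given here, and the "-specific" labels are hypothetical placeholders for exons found in only one gene.

```python
# Module families per gene, as named in the text.
# Assumption: the "*-specific" labels are placeholders for exons found
# only in that gene; exon counts and order are deliberately left out.
GENE_MODULES = {
    "LDL receptor": {"C9-related", "EGF-precursor-related", "LDL-receptor-specific"},
    "EGF precursor": {"EGF-precursor-related", "EGF-precursor-specific"},
    "complement C9": {"C9-related", "C9-specific"},
}


def shared_modules(gene_a, gene_b, table=GENE_MODULES):
    """Return the module families that two genes have in common."""
    return table[gene_a] & table[gene_b]


def unique_modules(gene, table=GENE_MODULES):
    """Return the module families found in this gene and no other."""
    others = set().union(*(mods for name, mods in table.items() if name != gene))
    return table[gene] - others


if __name__ == "__main__":
    for other in ("EGF precursor", "complement C9"):
        print("LDL receptor shares with", other, ":",
              shared_modules("LDL receptor", other))
    print("Only in LDL receptor:", unique_modules("LDL receptor"))
```

In this simplified picture the LDL receptor shares one module family with each of the other two genes, while the EGF precursor and complement C9 share none with each other, which is the pattern expected if modules were assembled in different combinations.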

Exons tend to be fairly small (see Figure 2.12), around the size of the smallest polypeptide that can assume a stable folded structure, ~20-40 residues. Perhaps proteins were originally assembled from rather small modules. Each module need not necessarily correspond to a current function; several modules could have combined to generate a function. The number of exons in a gene tends to increase with the length of its protein, which is consistent with the view that proteins acquire multiple functions by successively adding appropriate modules.
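The sizes quoted above can be turned into a small piece of arithmetic. The sketch below assumes three nucleotides per codon and, purely for illustration, treats a protein as a chain of equal-sized modules; real exons vary in length, so the function names and the equal-size assumption are hypothetical simplifications rather than a model taken from the text.

```python
import math

NUCLEOTIDES_PER_CODON = 3  # each amino acid residue is coded by one triplet


def coding_length(residues):
    """Coding nucleotides needed for a stretch of `residues` amino acids."""
    return residues * NUCLEOTIDES_PER_CODON


def modules_needed(protein_length, residues_per_module):
    """Number of equal-sized modules needed to cover a protein.

    Assumption: every module codes for the same number of residues,
    which is a deliberate simplification.
    """
    return math.ceil(protein_length / residues_per_module)


if __name__ == "__main__":
    # The 20-40 residue range from the text, expressed in nucleotides.
    print(coding_length(20), coding_length(40))
    # A hypothetical 400-residue protein built from 40-residue modules.
    print(modules_needed(400, 40))
```

Under these assumptions a module of 20-40 residues corresponds to 60-120 coding nucleotides, and the number of modules rises in proportion to protein length, in line with the trend described above.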

This idea might explain another feature of protein structure: the sites that correspond to exon-intron boundaries are often located at the surface of a protein. As modules are added to a protein, the connections, at least those of the most recently added modules, would tend to lie at the surface.

