The sequences of DNA comprising an interrupted gene are divided into two categories, exons and introns, as depicted in Figure 2.1.
The expression of interrupted genes requires an additional step that does not occur for uninterrupted genes. The DNA gives rise to an RNA copy (a transcript) that exactly represents the genome sequence. But this RNA is only a precursor; it cannot be used for producing protein. First the introns must be removed from the RNA to give a messenger RNA that consists only of the series of exons. This process is called RNA splicing. It involves a precise deletion of an intron from the primary transcript; the ends of the RNA on either side are joined to form a covalently intact molecule (see 24 RNA splicing and processing).
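The removal of introns and joining of exons can be illustrated with a minimal sketch. It assumes the exon boundaries are already known and given as 0-based, half-open (start, end) positions; the function name `splice`, the toy sequence, and the coordinates are hypothetical and chosen only for illustration. The cell's splicing machinery locates the boundaries itself, which this sketch does not attempt to model.

```python
def splice(primary_transcript, exons):
    """Return the mature RNA formed by joining the exon segments in order.

    `exons` is a list of (start, end) pairs, 0-based and half-open,
    giving the position of each exon in the primary transcript.
    Everything outside these intervals is treated as intron and dropped.
    """
    ordered = sorted(exons)
    previous_end = 0
    for start, end in ordered:
        # Exons must be non-empty, must not overlap, and must lie inside the transcript.
        if start < previous_end or start >= end or end > len(primary_transcript):
            raise ValueError(f"invalid exon interval: ({start}, {end})")
        previous_end = end
    # Joining the slices models the deletion of each intron and the
    # rejoining of the RNA ends on either side of it.
    return "".join(primary_transcript[start:end] for start, end in ordered)


# Hypothetical toy transcript: exons in uppercase, introns in lowercase.
pre_mrna = "AUGGCC" + "guaagucuag" + "UUUGGA" + "guccauag" + "UAA"
exons = [(0, 6), (16, 22), (30, 33)]

mrna = splice(pre_mrna, exons)
print(mrna)  # AUGGCCUUUGGAUAA
```

The exons keep the same order in the mature RNA that they have in the primary transcript, and the result is a single continuous sequence, mirroring the covalently intact molecule described above.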
The structural gene comprises the region in the genome between points corresponding to the 5′ and 3′ terminal bases of mature mRNA. We know that transcription starts at the 5′ end of the mRNA, but usually it extends beyond the 3′ end, which is generated by cleavage of the RNA (see 24.19 The 3′ ends of mRNAs are generated by cleavage and polyadenylation). The gene is considered to include the regulatory regions on both sides of the gene that are required for initiating and (sometimes) terminating gene expression.