The hotspots exist because cytosine bases undergo spontaneous deamination at an appreciable frequency. In this reaction, the amino group is replaced by a keto group. Recall that deamination of cytosine generates uracil. Figure 1.24 compares this reaction with the deamination of 5-methylcytosine, which generates thymine. The effect in DNA is to generate the base pairs G·U and G·T, respectively, in which the partners are mismatched.
All organisms have repair systems that correct mismatched base pairs by removing and replacing one of the bases. The operation of these systems determines whether mismatched pairs such as G·U and G·T result in mutations.
Figure 1.25 shows that the consequences of deamination are different for 5-methylcytosine and cytosine. Deaminating the (rare) 5-methylcytosine causes a mutation, whereas deamination of the more common cytosine does not have this effect (Coulondre et al., 1978). This happens because the repair systems are much more effective in recognizing G·U than G·T.
E. coli contains an enzyme, uracil-DNA-glycosidase, that removes uracil residues from DNA. This action leaves an unpaired G residue, and a repair system then inserts a C base to partner it. The net result of these reactions is to restore the original sequence of the DNA. This system protects DNA against the consequences of spontaneous deamination of cytosine (although it is not active enough to prevent the effects of the increased level of deamination caused by nitrous acid).
But the deamination of 5-methylcytosine leaves thymine. This creates a mismatched base pair, G·T. If the mismatch is not corrected before the next replication cycle, a mutation results. At the next replication, the bases in the mispaired G·T partnership separate, and then they pair with new partners to produce one wild-type G·C pair and one mutant A·T pair.
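The segregation of an uncorrected G·T mismatch can be traced with a minimal Python sketch. The function names `deaminate` and `replicate` are hypothetical and chosen only for illustration. The model assumes that each base of the mismatched pair simply acquires its standard Watson-Crick partner at replication, and that no repair occurs beforehand.

```python
# Watson-Crick partners used when a strand is copied.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def deaminate(base, methylated):
    """Return the base left after deamination of a cytosine.

    Cytosine gives uracil; 5-methylcytosine gives thymine.
    """
    if base != "C":
        raise ValueError("only cytosine is deaminated in this sketch")
    return "T" if methylated else "U"


def replicate(pair):
    """Separate a (top, bottom) base pair and give each base a new partner."""
    top, bottom = pair
    return [(top, PAIR[top]), (PAIR[bottom], bottom)]


# Original pair: G on one strand, 5-methylcytosine on the other.
mismatch = ("G", deaminate("C", methylated=True))  # the G·T mismatch
daughters = replicate(mismatch)
print(daughters)  # one wild-type G·C pair and one mutant A·T pair
```

In this sketch the G strand yields the wild-type G·C duplex and the T strand yields the mutant A·T duplex, so half of the progeny carry the mutation.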
Deamination of 5-methylcytosine is the most common cause of production of G·T mismatched pairs in DNA. Repair systems that act on G·T mismatches have a bias toward replacing the T with a C (rather than the alternative of replacing the G with an A), which helps to reduce the rate of mutation. However, these systems are not as effective as the removal of U from G·U mismatches. As a result, deamination of 5-methylcytosine leads to mutation much more often than does deamination of cytosine.
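The two possible outcomes of repairing a G·T mismatch can be written out explicitly. The following is only an illustrative sketch: `repair_gt` is a hypothetical helper, not a model of any particular enzyme, and it assumes that repair always replaces exactly one of the two bases with the partner of the base that remains.

```python
# Watson-Crick partners for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_gt(mismatch, replace):
    """Resolve a G·T mismatch by replacing one of its two bases.

    replace="T": the T is removed and the G is given a C (sequence restored).
    replace="G": the G is removed and the T is given an A (mutation fixed).
    """
    g, t = mismatch
    if (g, t) != ("G", "T"):
        raise ValueError("this sketch handles only a G·T mismatch")
    if replace == "T":
        return (g, PAIR[g])
    if replace == "G":
        return (PAIR[t], t)
    raise ValueError("replace must be 'T' or 'G'")


print(repair_gt(("G", "T"), replace="T"))  # restores the original G·C pair
print(repair_gt(("G", "T"), replace="G"))  # produces the mutant A·T pair
```

A bias toward the first outcome lowers the mutation rate, but any repair event that takes the second route converts the mismatch into a stable mutation.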
5-methylcytosine also creates hotspots in eukaryotic DNA. It is common at CpG dinucleotides that are concentrated in regions called CpG islands. Although 5-methylcytosine accounts for ~1% of the bases in human DNA, sites containing the modified base account for ~30% of all point mutations. This makes the state of 5-methylcytosine a particularly important determinant of mutation in animal cells.
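A rough calculation shows how strongly these figures imply that mutation is concentrated at 5-methylcytosine. The sketch below treats the approximate values quoted above (~1% of bases, ~30% of point mutations) as exact and assumes that every other base mutates at a single uniform rate. Both are simplifying assumptions, and `relative_mutation_rate` is a hypothetical helper introduced only for this estimate.

```python
def relative_mutation_rate(site_fraction, mutation_fraction):
    """Per-site mutation rate at a class of sites relative to all other sites.

    site_fraction: fraction of all bases that belong to the class.
    mutation_fraction: fraction of all point mutations found in the class.
    """
    rate_in_class = mutation_fraction / site_fraction
    rate_elsewhere = (1 - mutation_fraction) / (1 - site_fraction)
    return rate_in_class / rate_elsewhere


# Approximate figures from the text: ~1% of bases, ~30% of point mutations.
ratio = relative_mutation_rate(0.01, 0.30)
print(f"5-methylcytosine sites mutate roughly {ratio:.0f}-fold more often")
```

Under these assumptions a 5-methylcytosine site is on the order of 40 times more likely to mutate than an average site elsewhere in the genome.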
The importance of repair systems in reducing the rate of mutation is emphasized by the effects of eliminating the mouse enzyme MBD4, a glycosylase that can remove T (or U) from mismatches with G. The result is to increase the mutation rate at CpG sites by a factor of 3 (Millar et al., 2002). (The effect is not greater because MBD4 is only one of several systems that act on G·T mismatches; we can imagine that eliminating all of these systems would increase the mutation rate much more.)
The operation of these systems casts an interesting light on the use of T in DNA compared with U in RNA. Perhaps the difference relates to the need of DNA for stability of sequence; the use of T means that any deaminations of C are immediately recognized, because they generate a base (U) not usually present in the DNA. This greatly increases the efficiency with which repair systems can function (compared with the situation in which they have to recognize G·T mismatches, which can also arise in circumstances where removing the T would not be the appropriate response). Also, the phosphodiester bond of the backbone is more labile when the base is U.