The rate of neutral substitution can be measured from divergence of repeated sequences

KEY CONCEPTS:
  • The rate of substitution per year at neutral sites is greater in the mouse than in the human genome.
We can make the best estimate of the rate of substitution at neutral sites by examining sequences that do not code for protein. (We use the term neutral here rather than silent, because there is no coding potential.) An informative comparison can be made between the members of a common repetitive family in the human and mouse genomes (Waterston et al., 2002).

The principle of the analysis is summarized in Figure 4.9. We start with a family of related sequences that have evolved by duplication and substitution from an original family member. We assume that the common ancestral sequence can be deduced by taking the base that is most common at each position. Then we can calculate the divergence of each individual family member as the proportion of bases that differ from the deduced ancestral sequence. In this example, individual members vary from 0.13 - 0.18 divergence, and the average is 0.16.
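The consensus-and-divergence calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences are invented toy data (not those of Figure 4.9), they are assumed to be already aligned and of equal length, and the function names `consensus` and `divergence` are hypothetical.

```python
from collections import Counter

def consensus(seqs):
    """Deduce the ancestral sequence as the most common base at each position.

    Assumes the sequences are already aligned and of equal length.
    """
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))

def divergence(seq, ancestor):
    """Proportion of positions at which seq differs from the deduced ancestor."""
    differences = sum(a != b for a, b in zip(seq, ancestor))
    return differences / len(ancestor)

# Toy family of four aligned members (invented for illustration only).
family = [
    "TCGTACGTAC",
    "ACGAACGTAC",
    "ACGTACCTAG",
    "ACGTACGTAC",
]

ancestor = consensus(family)
divergences = [divergence(member, ancestor) for member in family]
average_divergence = sum(divergences) / len(divergences)

print(ancestor)
print(divergences)
print(average_divergence)
```

A real analysis would also need to handle alignment gaps and positions with no clear majority base, which this sketch ignores.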
One family used for this analysis in the human and mouse genomes derives from a sequence that is thought to have ceased to be active at about the time of the divergence between Man and rodents (the LINEs family; see 17.9 Retroposons fall into three classes). This means that it has been diverging without any selective pressure for the same length of time in both species. Its average divergence in Man is ~0.17 substitutions per site, corresponding to a rate of 2.2 × 10⁻⁹ substitutions per base per year over the 75 million years since the separation. In the mouse genome, however, neutral substitutions have occurred at twice this rate, corresponding to 0.34 substitutions per site in the family, or a rate of 4.5 × 10⁻⁹. However, note that if we calculated the rate per generation instead of per year, it would be greater in Man than in mouse (~2.2 × 10⁻⁸ as opposed to ~10⁻⁹).
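The arithmetic behind these rates is simply divergence divided by elapsed time. The sketch below reproduces it from the figures quoted in the text; the function and variable names are hypothetical. The last step back-calculates the generation times that the quoted per-generation rates would imply. These are not independent estimates, only the ratios of the numbers given above.

```python
def rate_per_year(substitutions_per_site, years):
    """Average neutral substitution rate per base per year."""
    return substitutions_per_site / years

YEARS_SINCE_SEPARATION = 75e6  # 75 million years, as quoted in the text

human_rate = rate_per_year(0.17, YEARS_SINCE_SEPARATION)
mouse_rate = rate_per_year(0.34, YEARS_SINCE_SEPARATION)

print(f"human: {human_rate:.2e} substitutions per base per year")
print(f"mouse: {mouse_rate:.2e} substitutions per base per year")
print(f"mouse/human ratio: {mouse_rate / human_rate:.1f}")

# Per-generation rates quoted in the text (approximate values).
human_rate_per_generation = 2.2e-8
mouse_rate_per_generation = 1e-9

# Generation time (years) implied by dividing the per-generation rate
# by the per-year rate. Back-calculated for illustration only.
implied_human_generation = human_rate_per_generation / human_rate
implied_mouse_generation = mouse_rate_per_generation / mouse_rate

print(f"implied human generation time: {implied_human_generation:.1f} years")
print(f"implied mouse generation time: {implied_mouse_generation:.2f} years")
```

The per-year rates come out at about 2.3 × 10⁻⁹ and 4.5 × 10⁻⁹, in line with the rounded values in the text.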
These figures probably underestimate the rate of substitution in the mouse, because at the time of divergence the rates in both species would have been the same, and the difference must have evolved since then. The current rate of neutral substitution per year in the mouse is probably 2-3× greater than the historical average. These rates reflect the balance between the occurrence of mutations and the ability of the genetic system of the organism to correct them. The difference between the species demonstrates that each species has systems that operate with a characteristic efficiency.
Comparing the mouse and human genomes allows us to assess whether syntenic (corresponding) sequences show signs of conservation or have diverged at the rate expected from accumulation of neutral substitutions. The proportion of sites that show signs of selection is ~5%. This is much higher than the proportion that codes for protein or RNA (~1%). It implies that the genome includes many more stretches whose sequence is important for non-coding functions than for coding functions. Known regulatory elements are likely to comprise only a small part of this proportion. This number also suggests that most (i.e., the rest) of the genome sequences do not have any function that depends on the exact sequence.






