__The deadly threat of Alu-elements and other retro-transposons.__

One of the defense strategies may have been the mutation of the Alu-elements, possibly aimed at crippling their ability to proliferate. Since the entire spectrum of conceivable point mutations is consistent with the interpretation that all point mutations were caused by auto-mutagenic mechanisms, it would make sense if the genomes had unleashed this arsenal in their defense. As will be shown in this presentation, the Alu-mutants in the human genome that contain 50 or more base substitutions outnumber the 'original' Alu-copies by a wide margin. Considering that the Alu-elements are only approximately 280 bases long, such large numbers of base substitutions must have had a substantial impact on their functionality.

One might expect that random Alu-proliferation in the host genome, followed by random base substitutions in each Alu-sequence, would result in poorly reproducible, rather chaotic distributions of Alu-mutants. Surprisingly, however, the process created precisely defined frequency distributions of Alu-mutants that were the same for all human chromosomes (and chimpanzee chr.1) and depended only on the specific family to which the Alu-element belonged. In order to explain this finding, this presentation offers a simple mathematical model of the dynamics of the proliferation of Alu-elements while their capacity to proliferate is increasingly inhibited by point mutations. If correct, this model will make it possible to reconstruct the evolutionary past of the Alu-mutants and also to predict their evolutionary future. It may even serve to justify the interpretation of Alu-mutants as time stamps on the host genome.

__The genome pixel image (GPxI) of Alu-elements and their mutations.__

**Fig.1. GPxI images of the sequences of the 3 major Alu-families.
Note how similar the defining sequences are. Therefore, the present study distinguishes only between these 3 Alu-families
and ignores their further division into 217 sub-families (see text).**

The threshold N must not be chosen too small, lest the search miss too many Alu-mutants. Nor may it be too large, lest the search accept sequences that can no longer be considered Alu-mutants. As shown by their GPxIs (Fig. 2), the sequences identified by the search program were, indeed, easily recognizable as Alu-mutants even for values as high as N = 100 base substitutions, provided the size of the search primer was 200 bases or larger. The same criterion of yielding recognizable Alu-patterns was applied to the selection of a suitable size for the search primers. While a search primer of 200 bases searched with a threshold of N = 100 yielded clear Alu-patterns, a search primer of 50 bases did not yield recognizable Alu-patterns, even if the threshold N was as low as 25 base substitutions.

These and similar criteria led to the choice of a search primer size of 209 or 213 bases and a threshold of N = 100 tolerated base substitutions throughout the following.
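The role of the primer size and of the threshold N can be illustrated with a minimal sketch of a substitution-tolerant search. It is not the search program used for this study, whose internals are not described here. It assumes that only base substitutions are counted (no insertions or deletions) and that a single strand is scanned. The names `count_substitutions` and `find_mutants` and the toy sequences are hypothetical.

```python
def count_substitutions(primer: str, window: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(primer) != len(window):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(primer, window))


def find_mutants(genome: str, primer: str, max_subs: int):
    """Yield (position, n) for every window that differs from the primer
    by n <= max_subs base substitutions."""
    size = len(primer)
    for pos in range(len(genome) - size + 1):
        n = count_substitutions(primer, genome[pos:pos + size])
        if n <= max_subs:
            yield pos, n


# Toy data (made up): an 8-base "primer" and a 22-base "genome".
primer = "GGCCGGGC"
genome = "TTGGCCGGGCAAGGCTGGGCTT"
hits = list(find_mutants(genome, primer, max_subs=1))
```

With a threshold of 1, the toy search reports the exact copy and one single-substitution mutant. Raising `max_subs` to 2 admits a third window, which shows on a small scale how a threshold that is too generous for a short primer starts to accept sequences that are not recognizable mutants.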

**Fig.2. Effect of the maximal number of tolerated base substitutions on the Alu-mutant
sequences found by the search program.
The scale on top represents the positions of the bases of each found
sequence beginning with the down-stream end of the AluY-sequences in the human genome.
The GPxIs show small portions of the upstream and downstream flanks of the various Alu-mutants.
Note the appearance of poly-A stretches (= black pixel stretches) at the start of the
downstream flank of each Alu-sequence found by the search program.
**

__The universal frequency distribution of Alu-mutants.__

**Fig.3. The remarkably high degree of reproducibility of the mutant distributions of Alu elements in the human genome.
Bars indicate the standard deviations across the tested chromosomes. Abscissa:
number n of base substitutions of the Alu-mutants; ordinate: frequency of the Alu-mutants with exactly n base
substitutions, normalized to a maximum amplitude of 100.
**

In all cases, the frequency distributions showed a pronounced dominance of Alu-mutants with 50 or more base substitutions over Alu-elements that contained fewer than 50. If large numbers of base substitutions are equated with large evolutionary age, this suggests that most Alu-elements in the human genome are quite 'old'.

Testing chimpanzee chromosome 1 yielded the same distributions as the human chromosomes.
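The construction of such a normalized distribution, and the comparison between chromosomes, can be sketched as follows. This is an illustration only. The function names and the input numbers are made up, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used for the bars in Fig. 3.

```python
from collections import Counter
from statistics import mean, pstdev


def mutant_distribution(sub_counts, max_subs):
    """Frequency of mutants with exactly n substitutions (n = 0..max_subs),
    scaled so that the largest value is 100."""
    counts = Counter(sub_counts)
    raw = [counts.get(n, 0) for n in range(max_subs + 1)]
    peak = max(raw)
    if peak == 0:
        return [0.0] * (max_subs + 1)
    return [100.0 * c / peak for c in raw]


def mean_and_spread(distributions):
    """Per-n mean and (population) standard deviation across chromosomes."""
    columns = list(zip(*distributions))
    return [mean(col) for col in columns], [pstdev(col) for col in columns]


# Made-up substitution counts for two hypothetical chromosomes.
chr_a = mutant_distribution([0, 1, 1, 2, 2, 2, 2], max_subs=3)
chr_b = mutant_distribution([1, 2, 2, 2, 2, 3, 3], max_subs=3)
avg, spread = mean_and_spread([chr_a, chr_b])
```

Because each distribution is scaled to its own peak, chromosomes of very different size can be compared directly. A small spread at every n is what "the same distribution on all chromosomes" means in this picture.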

__Deciding which of 2 Alu-elements is more similar to the 'original', based on their mutant distributions.__

To be sure, there is clear evidence that Alu sequences are part of the 7SL RNA gene of numerous species, including

Obviously, we can never decide whether a particular Alu-sequence is the 'true original', because the original may no longer exist today. Therefore, many authors in the literature have placed the terms 'original' or 'source' sequence in inverted commas, as is done in the present article. Nevertheless, based on the set {M} of all Alu-mutants known today, it is quite possible to determine which of 2 Alu-mutants is more similar to the 'original'. Traditionally, students of Alu-elements have solved the problem by detailed studies of homologies between domains of different Alu-sequences, which can determine which sequence pre-dates the other and thus identify the earliest among them as the most 'original'.

The mutant distributions presented here offer another rather simple way to tell which of two Alu-mutants is more similar to the 'original' sequence. Consider the set {M} of mutants that all arose from a common original sequence Alu0 in the human or any other genome. Using Alu

The comparison between the distributions A

For example, Fig. 4 shows the mutant distributions obtained from certain AluY-mutants X(i) with i = 0, 15, 30, and 63 base substitutions, which were used as search primers on human chr.1. Clearly, the more base substitutions the search primer X(i) contained, the fewer mutants could be found that contained fewer than (say) 30 base substitutions.

On the other hand, it is extremely unlikely that {M} contains even a single sequence that could qualify as a mutant of (say) X(30) with an additional 1, 2, 3,… or other small number of base substitutions. Such sequences would have to be mutants of Alu

This reasoning was used earlier in order to conclude that AluJ was much older than AluY because it contained almost no mutants with fewer than 30 base substitutions (Fig. 3c). Likewise, one can see immediately that the age of AluS lies between those of AluY and AluJ but is closer to that of AluY, as its mutant distribution has fewer such mutants than AluY but many more than AluJ.
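The comparison described above can be reduced to a single number per distribution, namely the share of mutants below a cutoff such as 30 base substitutions. The sketch below is a simplification introduced here for illustration, not a criterion defined by the study. The function names and the toy distributions are hypothetical.

```python
def low_substitution_share(distribution, cutoff=30):
    """Share of all mutants that carry fewer than `cutoff` substitutions.

    `distribution[n]` is the count of mutants with exactly n substitutions.
    """
    total = sum(distribution)
    if total == 0:
        raise ValueError("empty distribution")
    return sum(distribution[:cutoff]) / total


def closer_to_original(name_a, dist_a, name_b, dist_b, cutoff=30):
    """Return the name whose distribution has the larger low-substitution share."""
    share_a = low_substitution_share(dist_a, cutoff)
    share_b = low_substitution_share(dist_b, cutoff)
    return name_a if share_a >= share_b else name_b


# Toy distributions over n = 0..3 (made up), compared with a cutoff of 2.
younger = [5, 3, 2, 1]   # many mutants with few substitutions
older = [0, 1, 4, 6]     # almost none with few substitutions
winner = closer_to_original("younger", younger, "older", older, cutoff=2)
```

A primer whose distribution retains many low-substitution mutants is, by the argument in the text, the one closer to the 'original'. The cutoff is a free parameter, and a single share discards the shape information that the full distributions in Fig. 3 and Fig. 4 carry.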

**Fig.4. Dependence of the mutant distribution on the number of initial base substitutions contained in its search primer.
The search program using the different search primers allowed up to 100 base substitutions for each mutant.
Abscissa: number n of base substitutions of the Alu-mutants; ordinate: absolute count of the various Alu-mutants with exactly n base substitutions.
**

__A mathematical model of the dynamics of Alu mutations.__

It expresses these increases and decreases quantitatively as a function of A[n,R] and a certain time interval δR, in which they occurred.

In addition, it assumes that there were several episodes of new bursts of fully proliferative Alu-elements ('seedings') in the evolutionary past of primate genomes. These rather minimal assumptions were sufficient to reproduce the details of the actual mutant distributions of AluY, AluS and AluJ to a high degree of accuracy. None of these assumptions expresses any chromosome specificity. Therefore, the mathematical model explains one of the main findings of this study, namely the remarkable similarity between the Alu-mutant distributions of different chromosomes.
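To make the idea of such a recursion concrete, the following toy model updates a distribution A[n] once per unit of the mathematical time R. It is emphatically not the model of this study, whose equations are not reproduced in this section. The linear loss of proliferative capacity, the fixed mutation share, and every parameter value (`growth`, `mutate`, `n_crit`) are assumptions chosen only to show the mechanics of proliferation, mutation and seeding.

```python
def step(a, growth, mutate, n_crit):
    """Advance the distribution a[n] by one recursion (one unit of R)."""
    size = len(a)
    # Proliferation (assumed form): copies inherit n, and the capacity to
    # proliferate falls linearly to zero at n_crit substitutions.
    grown = [x * (1.0 + growth * max(0.0, 1.0 - n / n_crit))
             for n, x in enumerate(a)]
    # Mutation (assumed form): a fixed share of each class moves from n to n + 1.
    nxt = [0.0] * size
    for n, x in enumerate(grown):
        moved = mutate * x if n + 1 < size else 0.0
        nxt[n] += x - moved
        if moved:
            nxt[n + 1] += moved
    return nxt


def simulate(recursions, seedings, size=101, growth=0.05, mutate=0.3, n_crit=20.0):
    """Run the toy model. `seedings` maps a recursion index to the amount of
    fully proliferative elements (n = 0) added at that recursion."""
    a = [0.0] * size
    for r in range(recursions):
        a[0] += seedings.get(r, 0.0)
        a = step(a, growth, mutate, n_crit)
    return a


# Two hypothetical seedings, at recursions 0 and 40.
present = simulate(recursions=80, seedings={0: 1.0, 40: 0.5})
```

Nothing in this update rule refers to a chromosome, which is the structural point made in the text: a model of this kind produces the same distribution wherever it runs.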

A major attraction of the mathematical model is the possibility to reconstruct the past mutation distributions and their future (Figure 5). (See also the

**Fig.5. The time development of the distribution of the AluY-mutants in the human genome, reconstructed and predicted by the
mathematical model and calibrated. The panel marked 'present' matches the AluY mutant distribution of Fig. 3a quite well. Watch also the
ANIMATION OF THE EVOLUTION of the AluY and the
AluJ mutations (the thick lines show the present day distributions of the mutations).
**

__The calibration of the evolutionary age of Alu-elements.__

There is no evidence that the actual evolutionary time interval δT that corresponds to a unit δR of the mathematical time always had the same value. It is quite possible that at some times in evolution the progression towards the next computed spectrum of Alu-mutants took much longer than at others. However, we are making here the same assumption as the field in general, namely that the rate of mutations is approximately constant, i.e.

T = τ R,

with a calibration constant τ. As explained in detail in ref 8, the generally accepted date of the first appearance of Alu-elements in primates, 60 million years ago, yields a value of τ = 250,000 years/recursion.
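The calibration amounts to a simple unit conversion, shown here as a short sketch. The two figures (60 million years and 250,000 years per recursion) are taken from the text. The function names are hypothetical.

```python
TAU_YEARS_PER_RECURSION = 250_000  # calibration constant quoted in the text


def recursions_to_years(r, tau=TAU_YEARS_PER_RECURSION):
    """Evolutionary time T = tau * R for R recursions of the model."""
    return tau * r


def years_to_recursions(t_years, tau=TAU_YEARS_PER_RECURSION):
    """Number of recursions R = T / tau that span t_years."""
    return t_years / tau


# 60 million years of Alu history correspond to 240 recursions.
total_recursions = years_to_recursions(60_000_000)
```

Under the constant-rate assumption, every recursion index of the model therefore maps to a date, which is what allows the seedings to be placed on an evolutionary time axis.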

Using this calibration to determine the times at which new bursts ('seedings') of fully proliferative Alu-elements appeared yielded the results shown in Figure 6. It turned out that these times coincided quite well with the appearance of new sub-families (see ref 8).

**Fig.6. Timing and relative magnitude of the various seedings of Alu-elements in the evolutionary past of the AluY, AluS, and AluJ
families reconstructed by the mathematical model.
**

The obvious answer is that they did not need to. The safety net of natural selection guaranteed that the Alu-elements accumulated many more base substitutions than did the vital parts of their host genome. After all, if a point mutation damaged a vital part of the genome, that organism and its genome were eventually eliminated by natural selection. In contrast, if a point mutation hit an Alu-element, it actually helped the survival of that genome, because it crippled a dangerous invader even further. Hence, selection favoured genomes which concentrated base substitutions in the Alu-sequences.

Therefore, it seems that the massive base substitutions in Alu-elements may represent a case where