Our definition of tandem repeats over the edit distance follows the model of evolutive tandem repeats. This model assumes that each copy of the repeat, from left to right, is derived from the previous copy through zero or more mutations. Thus, each copy in the repeat is similar to both its predecessor and its successor.
A K-edit repeat is an evolutive tandem repeat in which there are at most K insertions, deletions, and mismatches, overall, between all consecutive copies of the repeat.
A K-edit repeat is maximal if it cannot be extended to the right or to the left with matching characters. See this example to view a sample repeat output by our program. In the output, a copy of the repeat is written twice if it aligns differently to its predecessor and successor copies. (The second time a copy is written, its indices are omitted.) The error characters are color-coded.
A word $r$ is a K-edit repeat if it can be partitioned into consecutive subwords, $r = v' w_1 w_2 \cdots w_l v''$, $l \geq 2$, such that $ed(v', w'_1) + \sum_{i=1}^{l-1} ed(w_i, w_{i+1}) + ed(w''_l, v'') \leq K$, where $w'_1$ is some suffix of $w_1$ and $w''_l$ is some prefix of $w_l$.
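The inequality above can be checked for one given partition with a short sketch. This is only an illustration, not the program's actual algorithm: it assumes that ed is the unit-cost edit distance (insertions, deletions, and mismatches each cost 1), it takes the best suffix of the first copy and the best prefix of the last copy, and it does not search over partitions or test maximality. The names `edit_distance`, `partition_cost`, and `satisfies_k_edit_bound` are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (assumed meaning of ed in the definition)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or mismatch
            ))
        prev = curr
    return prev[-1]


def partition_cost(v_left: str, copies: Sequence[str], v_right: str) -> int:
    """Total errors for the partition r = v_left + copies[0] + ... + copies[-1] + v_right."""
    if len(copies) < 2:
        raise ValueError("the definition requires at least two full copies (l >= 2)")
    first, last = copies[0], copies[-1]
    # ed(v', w'_1): compare the left partial copy with the best suffix of w_1.
    left = min(edit_distance(v_left, first[k:]) for k in range(len(first) + 1))
    # Sum of ed(w_i, w_{i+1}) over consecutive full copies.
    middle = sum(edit_distance(x, y) for x, y in zip(copies, copies[1:]))
    # ed(w''_l, v''): compare the best prefix of w_l with the right partial copy.
    right = min(edit_distance(last[:k], v_right) for k in range(len(last) + 1))
    return left + middle + right


def satisfies_k_edit_bound(v_left: str, copies: Sequence[str], v_right: str, k: int) -> bool:
    """True if this particular partition stays within K errors."""
    return partition_cost(v_left, copies, v_right) <= k
```

For instance, under these assumptions the partition `GT | ACGT | ACCT | ACCT | AC` has one error in total (the mismatch between the first two full copies), so `satisfies_k_edit_bound("GT", ["ACGT", "ACCT", "ACCT"], "AC", 1)` holds.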