Fast and practical algorithms for computing all the runs in a string

G. Chen; S.J. Puglisi; W.F. Smyth

doi:10.1007/978-3-540-73437-6_31

Back

Fast and practical algorithms for computing all the runs in a string

Journal article

Peer reviewed

Fast and practical algorithms for computing all the runs in a string

G. Chen, S.J. Puglisi and W.F. Smyth

Combinatorial Pattern Matching, Vol.4580, pp.307-315

2007

DOI: https://doi.org/10.1007/978-3-540-73437-6_31

Files and links (1)

url

Link to Published Version *Subscription may be requiredView

Abstract

A repetition in a string x is a substring w=ue of x, maximum e ≥ 2, where u is not itself a repetition in w. A run in x is a substring w=ueu∗ of “maximal periodicity”, where ue is a repetition and u * a maximum-length possibly empty proper prefix of u. A run may encode as many as |u| repetitions. The maximum number of repetitions in any string x=x[1..n] is well known to be Θ(nlogn). In 2000 Kolpakov & Kucherov showed that the maximum number of runs in x is O(n); they also described a Θ(n)-time algorithm, based on Farach’s Θ(n)-time suffix tree construction algorithm (STCA), Θ(n)-time Lempel-Ziv factorization, and Main’s Θ(n)-time leftmost runs algorithm, to compute all the runs in x. Recently Abouelhoda et al. proposed a Θ(n)-time Lempel-Ziv factorization algorithm based on an “enhanced” suffix array — a suffix array together with other supporting data structures. In this paper we introduce a collection of fast space-efficient algorithms for computing all the runs in a string that appear in many circumstances to be superior to those previously proposed.

Details

Title: Fast and practical algorithms for computing all the runs in a string
Authors/Creators: G. Chen (Author/Creator)
S.J. Puglisi (Author/Creator)
W.F. Smyth (Author/Creator)
Publication Details: Combinatorial Pattern Matching, Vol.4580, pp.307-315
Publisher: Springer Verlag
Identifiers: 991005545118807891
Murdoch Affiliation: Murdoch University
Language: English
Resource Type: Journal article

Metrics

62 Record Views