Logo image
Lempel–Ziv factorization using less time & space
Journal article   Peer reviewed

Lempel–Ziv factorization using less time & space

G. Chen, S.J. Puglisi and W.F. Smyth
Mathematics in Computer Science, Vol.1(4), pp.605-623
2008
url
Link to Published Version *Subscription may be requiredView

Abstract

For 30 years the Lempel–Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on Θ(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel–Ziv factorization algorithm based on an “enhanced” suffix array – that is, a suffix array SA x together with supporting data structures, principally an “interval tree”. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true Θ(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.

Details

Metrics

Logo image