[Haskell-cafe] Re: [ANN] An efficient lazy suffix tree library

ChrisK haskell at list.mightyreason.com
Mon Aug 27 13:57:19 EDT 2007

Gleb Alexeyev wrote:
> Bryan O'Sullivan wrote:
>> I just posted a library named suffixtree to Hackage.
>> http://www.serpentine.com/software/suffixtree/
>> It implements Giegerich and Kurtz's lazy construction algorithm, with
>> a few tweaks for better performance and resource usage.
>> API docs:
>> http://darcs.serpentine.com/suffixtree/dist/doc/html/Data-SuffixTree.html
>> I've tested it on multi-megabyte input strings.
> I think I found a bug:
> import qualified Data.SuffixTree as T
>> T.countRepeats "ab" (T.construct "abab")
> 1

That is almost certainly because the algorithm expects the source string to have
a unique character at its end.

The library should either make that clear or add such a character on its own.

Otherwise the "ab" suffix is a prefix of the "abab" suffix and the shorter one
gets lost.  If you end in "$" then "ab$" cannot merge with "abab$" and there are
no distinct suffixes a and b for which (isPrefixOf a b) is true.


> *Data.SuffixTree> countRepeats "ab" (construct "abab")
> 1
> *Data.SuffixTree> countRepeats "ab" (construct "abab$")
> 2


More information about the Haskell-Cafe mailing list