[Haskell-cafe] Tuesday Morning Games (was: Friday Afternoon Play:)
The Burrows Wheeler Transformation
Ketil Malde
Ketil.Malde at bccs.uib.no
Tue Nov 21 03:53:09 EST 2006
apfelmus at quantentunnel.de wrote:
> sadly, nobody wants to play with you.
Okay, here's a small possible contribution.
> Most likely, it's because the next
> move is too hard: for a faster forward transform, you need (something as
> tricky as) suffix trees.
>
To make it fast, I guess you'll need a suffix array. (And in a sense,
that is what we're making when we sort the list of rotations, but
of course the time complexity is way worse than the linear time
SA algorithms.)
>>> rotations l = init $ zipWith (++) (tails l) (inits l)
>>>
I haven't done the benchmarks, but I'm a bit disturbed by the number of
list traversals. It seems to me that instead of sorting the rotations and
taking the last element, it would be more efficient to sort the
rotations by the tails,
and taking the head? Especially if the string contains little
redundancy, the
sorting wouldn't need to traverse very deeply. Here's a (rather ugly)
attempt,
which seems to work:
\begin{code}
slow_bwt2:: Ord t => [t] -> ([t], Maybe Int)
slow_bwt2 l
= (map fst tagged_bwt, index_of_original)
where tagged_bwt = map (head><id)
$ sortBy ((. tail . fst) . compare . tail .
fst) -- `on` (tail . fst)
$ rotations l`zip`[0..]
index_of_original = findIndex ((==0).snd) tagged_bwt
rotations l = zipWith (++) (drop 1 $ tails l) (repeat l)
\end{code}
More information about the Haskell-Cafe
mailing list