[Haskell-cafe] Very fast searching of byte strings
ChrisK
haskell at list.mightyreason.com
Fri Aug 17 10:55:46 EDT 2007
Post the the library mailing list at [1] is the Boyer-Moore algorithm
implemented for strict and lazy bytestrings (and combinations thereof). It
finds all the overlapping instances of the pattern inside the target.
I have performance tuned it. But the performance for searching a strict
bytestring is better then for a lazy bytestring (even if they only had a single
strict chunk), which almost certainly means I was not clever enough to get GHC
to produce the optimal code.
There is much more description in the module's haddock header.
Hopefully Don or other ByteString experts/maintainers can tweak this even further.
Also at [1] is a Knuth-Morris-Pratt module which find non-overlapping
instances and is slightly slower on my benchmarks.
Happy Searching,
Chris Kuklewicz
[1] http://www.haskell.org/pipermail/libraries/2007-August/007987.html
More information about the Haskell-Cafe
mailing list