[Haskell-cafe] [ANN] bloomfilter 1.0 - Fast immutable and mutable
bos at serpentine.com
Fri May 30 18:30:44 EDT 2008
I'm pleased to announce the availability of a fast Bloom filter library
for Haskell. A Bloom filter is a probabilistic data structure that
provides a fast set membership querying capability. It does not give
false negatives, but has a tunable false positive rate. (A false
positive arises when the filter claims that an element is present, but
in fact it is not.)
The library is easy to use. As an example, here's a reimplementation of
the Unix "spell" command.
import Data.BloomFilter.Easy (easyList, elemB)
main = do
filt <- (easyList 0.01 . words) `fmap` readFile "dictionary.txt"
let check word | word `elemB` filt = ""
| otherwise = word ++ "\n"
interact (concat . map check . lines)
It is also carefully tuned for performance. On my laptop, I can sustain
a construction or query rate well in excess of a million ByteStrings per
darcs get http://darcs.serpentine.com/bloomfilter
More information about the Haskell-Cafe