[Haskell-beginners] populating a bloom filter; stymied by ST monad
Joey Hess
joey at kitenet.net
Mon Mar 12 07:38:45 CET 2012
I have a potentially very large number of values to feed into a bloom
filter. They're coming from IO:
getValues :: (v -> k -> k) -> k -> IO k
getValues update initial = go initial =<< gen
where
go v [] = return v
go v (f:fs) = do
x <- val f
go (maybe v (`update` v) x) fs
gen = undefined
val f = undefined
This streams lazily, but if I build a list (getValues (:) []),
the laziness is lost; the whole list is constructed and returned
before any of it can be used. Which uses too much memory of course.
So I can't use Data.BloomFilter.fromListB.
Seems I need to do something like this:
let filter = newMB (cheapHashes 13) 33554432 --- sized for 1 million items
filter' <- unsafeFreezeMB <$> getValues insertMB filter
Except this won't work, the mutable bloom filter uses the ST monad.
And I cannot see a way to combine the three "MB" functions into a
single computation in the same ST monad.
Perhaps stToIO to is what I need, but I can't work out how to use it.
Help?
--
see shy jo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 828 bytes
Desc: Digital signature
URL: <http://www.haskell.org/pipermail/beginners/attachments/20120312/445439d8/attachment.pgp>
More information about the Beginners
mailing list