[Haskell-beginners] populating a bloom filter; stymied by ST monad

Joey Hess joey at kitenet.net
Mon Mar 12 07:38:45 CET 2012


I have a potentially very large number of values to feed into a bloom
filter. They're coming from IO:

getValues :: (v -> k -> k) -> k -> IO k
getValues update initial = go initial =<< gen
	where
		go v [] = return v
		go v (f:fs) = do
			x <- val f
			go (maybe v (`update` v) x) fs
		gen = undefined
		val f = undefined

This streams lazily, but if I build a list (getValues (:) []),
the laziness is lost; the whole list is constructed and returned
before any of it can be used. Which uses too much memory of course.
So I can't use Data.BloomFilter.fromListB.

Seems I need to do something like this:

let filter = newMB (cheapHashes 13) 33554432 --- sized for 1 million items
filter' <- unsafeFreezeMB <$> getValues insertMB filter

Except this won't work, the mutable bloom filter uses the ST monad.
And I cannot see a way to combine the three "MB" functions into a
single computation in the same ST monad.

Perhaps stToIO to is what I need, but I can't work out how to use it.

Help?

-- 
see shy jo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 828 bytes
Desc: Digital signature
URL: <http://www.haskell.org/pipermail/beginners/attachments/20120312/445439d8/attachment.pgp>


More information about the Beginners mailing list