[Haskell-cafe] How on Earth Do You Reason about Space?
jwlato at gmail.com
Wed Jun 1 00:30:06 CEST 2011
From: "Edward Z. Yang" <ezyang at MIT.EDU>
> Hello Aleksandar,
> It is possible that the iteratees library is space leaking; I recall some
> recent discussion to this effect. Your example seems simple enough that
> you might recompile with a version of iteratees that has -auto-all enabled.
> Unfortunately, it's not really a safe bet to assume your libraries are
> leak free, and if you've pinpointed it down to a single line, and there
> doesn't seem a way to squash the leak, I'd bet it's the library's fault.
I can't reproduce the space leak here. I tried Aleksandar's original code,
my iteratee version, the Ngrams version posted by Johan Tibell, and a lazy
bytestring version.
my iteratee version (only f' has changed from Aleksandar's code):
f' :: Monad m => I.Iteratee S.ByteString m Wordcounts
f' = I.joinI $ (enumLinesBS I.><> I.filter (not . S.null)) $
       I.foldl' (\t s -> T.insertWith (+) s 1 t) T.empty
my lazy bytestring version:
> import Data.Iteratee.Char
> import Data.List (foldl')
> import Data.Char (toLower)
> import Data.Ord (comparing)
> import Data.List (sortBy)
> import System.Environment (getArgs)
> import qualified Data.ByteString.Lazy.Char8 as L
> import qualified Data.HashMap.Strict as T
> f'2 = foldl' (\t s -> T.insertWith (+) s 1 t) T.empty
>     . filter (not . L.null) . L.lines
> main2 :: IO ()
> main2 = getArgs >>= L.readFile . head >>= print . T.keys . f'2
None of these leak space for me (all compiled with ghc-7.0.3 -O2).
Performance was pretty comparable for every version, although Aleksandar's
original did seem to have a very small edge.
As someone already pointed out, keep in mind that this will use a lot of
memory anyway, unless there's a lot of repetition of words.
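To make the point about repetition concrete, here is a minimal sketch. It is not
code from this thread: it swaps in Data.Map.Strict from containers for
Data.HashMap.Strict so that it needs no extra packages, and the names
wordCounts and distinctWords are made up for illustration. The map holds one
entry per distinct word, so residency tracks the number of distinct words
rather than the length of the input.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Count occurrences of each non-empty word with a strict left fold.
-- Data.Map.Strict stands in for Data.HashMap.Strict here (an assumption
-- made to keep the sketch dependency-free).
wordCounts :: [String] -> M.Map String Int
wordCounts = foldl' (\t s -> M.insertWith (+) s 1 t) M.empty
           . filter (not . null)

-- Number of entries the map ends up holding: one per distinct word.
distinctWords :: [String] -> Int
distinctWords = M.size . wordCounts
```

With a thousand copies of the same word the map has a single entry; with a
thousand different words it has a thousand, and no amount of strictness in the
fold will shrink that.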
I'd be happy to help track down a space leak in iteratee, but for now I'm
not seeing one.