[Haskell-cafe] Space usage problems
Chris Kuklewicz
haskell at list.mightyreason.com
Tue Jan 10 12:28:03 EST 2006
I'll make a guess...
Ian Lynagh wrote:
> Hi all,
>
> In the middle (readChunks) is the equivalent of gunzip. It repeatedly
> calls foo until there is no more input left.
>
> At the top is a simple main function that calls them.
>
> If I do
>
> dd if=/dev/zero of=data bs=1000 count=3000" # making data around 3MB
> ghc --make Test1 -o Test1 -O -Wall
> ./Test1
>
> then in top I see Test1 increasing memory usage to around 150MB. I think
> this is because the "let (ys, zs) = foo xs" means zs holds on to xs
> (it's hard to be sure as compiling for profiling is too happy to change
> the behaviour).
I don't have a comment on your guess.
>
> I tried (Test2) changing foo to be a monad transformer over the calling
> monad, so the caller's remaining input was updated as we went along, but
> (as well as memory usage not obviously being fixed) this is giving me a
> stack overflow.
I will ignore Test2
> Has anyone got any suggestions for making a constant space, constant
> stack version?
Not yet.
>
>
> Thanks
> Ian
> ------------------------------------------------------------------------
>
>
> module Main (main) where
>
> import Control.Monad (liftM)
> import Control.Monad.State (State, runState, evalState, get, put)
>
> main :: IO ()
> main = do xs <- readFile "data"
> ys <- readFile "data"
> print (evalState readChunks xs == ys)
The equality should be constant space.
>
> ---
>
> type FirstMonad = State String
>
> readChunks :: FirstMonad String
> readChunks = do xs <- get
> if null xs then return []
> else do let (ys, zs) = foo xs
> put zs
And zs is the final state of "runState bar" which is suspect is []
And ys is the whole input (which is now all in memory)
> rest <- readChunks
> return (ys ++ rest)
>
> ---
>
> type SecondMonad = State String
>
> foo :: String -> (String, String)
> foo = runState bar
>
> bar :: SecondMonad String
> bar = do inp <- get
> case inp of
> [] -> return []
> x:xs -> do put xs
> liftM (x:) bar
The liftM should be equivalent to
temp <- bar
return ( (x:) temp )
It looks like the first call to foo will have bar consuming the entire
input string.
So the flow looks like
main
readChuncks all-input
foo all-input
bar (iterated over whole input length)
foo returns (all-input, [])
"rest <- readChunks" (recursive call, sees null xs then "return []")
"return (ys ++ rest)" which is return (all-input ++ [])
In essence, your bar traverses the whole string until the state is
empty. This loads your whole file into memory
More information about the Haskell-Cafe
mailing list