[Haskell-cafe] How do I get this done in constant mem?
Thomas Hartman
tphyahoo at gmail.com
Sat Oct 10 12:33:52 EDT 2009
> Yes, you should not do this in IO. That requires the entire
> computation to finish before the result can be used.
Not really the entire computation though... whnf, no?
main = do
let thunks :: IO [Int]
thunks = (sequence . replicate (10^6) $ (randomRIO (0,10^9)))
putStrLn . show . head =<< thunks -- prints
putStrLn . show . last =<< thunks -- overflows
In the case of [[num]] from the top post, I belive that would be the
first complete list.
2009/10/9 Luke Palmer <lrpalmer at gmail.com>:
> On Fri, Oct 9, 2009 at 2:05 PM, <mf-hcafe-15c311f0c at etc-network.de> wrote:
>> Hi all,
>>
>> I think there is something about my use of the IO monad that bites me,
>> but I am bored of staring at the code, so here you g. The code goes
>> through a list of records and collects the maximum in each record
>> position.
>>
>>
>> -- test.hs
>> import Random
>> import System.Environment (getArgs)
>> import System.IO (putStr)
>>
>> samples :: Int -> Int -> IO [[Double]]
>> samples i j = sequence . replicate i . sequence . replicate j $ randomRIO (0, 1000 ** 3)
>
> Yes, you should not do this in IO. That requires the entire
> computation to finish before the result can be used. This computation
> should be pure and lazy.
>
> It is possible, using split (and I believe not without it, unless you
> use mkStdGen), to make a 2D list of randoms where the random
> generation matches exactly the structure of the list.
>
> splits :: (RandomGen g) => Int -> g -> [g]
> splits 0 _ = []
> splits n g = let (g1,g2) = split g in g1 : splits (n-1) g2
>
> samples :: (RandomGen g) => Int -> Int -> g -> [[Double]]
> samples i j gen = map row (splits i gen)
> where
> row g = take j (randomRs (0, 10^9) g)
>
> In fact, we could omit all these counts and make an infinite 2D list,
> which you can cull in the client code.
>
> splits :: (RandomGen g) => g -> [g]
> splits g = let (g1,g2) = split g in g1 : splits g2
>
> samples :: (RandomGen g) => g -> [[Double]]
> samples = map row . splits
> where
> row = randomRs (0, 10^9)
>
> I find the latter to be more straightforward and obvious. Maintaining
> the laziness here is a fairly subtle thing, so study, perturb, try to
> write it yourself in different ways, etc.
>
>> maxima :: [[Double]] -> [Double]
>> maxima samples@(_:_) = foldr (\ x y -> map (uncurry max) $ zip x y) (head samples) (tail samples)
>
> FWIW, This function has a beautiful alternate definition:
>
> maxima :: [[Double]] -> [Double]
> maxima = map maximum . transpose
>
>> main = do
>> args <- getArgs
>> x <- samples (read (head args)) 5
>> putStr . (++ "\n") . show $ maxima x
>>
>>
>> I would expect this to take constant memory (foldr as well as foldl),
>> but this is what happens:
>>
>>
>> $ ghc -prof --make -O9 -o test test.hs
>> [1 of 1] Compiling Main ( test.hs, test.o )
>> Linking test ...
>> $ ./test 100 +RTS -p
>> [9.881155955344708e8,9.910336352165401e8,9.71000686630374e8,9.968532576451201e8,9.996200333115692e8]
>> $ grep 'total alloc' test.prof
>> total alloc = 744,180 bytes (excludes profiling overheads)
>> $ ./test 10000 +RTS -p
>> [9.996199711457872e8,9.998928358545277e8,9.99960283632381e8,9.999707142123885e8,9.998952151508758e8]
>> $ grep 'total alloc' test.prof
>> total alloc = 64,777,692 bytes (excludes profiling overheads)
>> $ ./test 1000000 +RTS -p
>> Stack space overflow: current size 8388608 bytes.
>> Use `+RTS -Ksize' to increase it.
>> $
>>
>>
>> so...
>>
>> does sequence somehow force the entire list of monads into evaluation
>> before the head of the result list can be used?
>
> Yep. IO is completely strict; in some sense the same as "call by
> value" (don't take the analogy too far). Rule of thumb: keep your
> distance from it ;-)
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
More information about the Haskell-Cafe
mailing list