laziness in `length'
Daniel Fischer
daniel.is.fischer at web.de
Mon Jun 14 10:51:22 EDT 2010
On Monday 14 June 2010 16:25:06, Serge D. Mechveliani wrote:
> Dear people and GHC team,
>
> I have a naive question about the compiler and library of ghc-6.12.3.
> Consider the program
>
> import List (genericLength)
> main = putStr $ shows (genericLength [1 .. n]) "\n"
>   where
>     n = -- 10^6, 10^7, 10^8 ...
>
> (1) When it is compiled under -O, it runs in a small constant space
> in n and in a time approximately proportional to n.
> (2) When it is compiled without -O, it takes at the run-time the
> stack proportional to n, and it takes enormously large time
> for n >= 10^7.
> (3) In the interpreter mode ghci, `genericLength [1 .. n]'
> takes as much resource as (2).
>
> Are the points (2) and (3) natural for a Haskell implementation?
>
> Independently of whether lng is inlined or not, its lazy evaluation
> is, probably, like this:
> lng [1 .. n] =
> lng (1 : (list 2 n)) = 1 + (lng $ list 2 n) =
> 1 + (lng (2: (list 3 n))) = 1 + 1 + (lng $ list 3 n) =
> 2 + (lng (3: (list 4 n))) -- because this "+" is of Integer
> = 2 + 1 + (lng $ list 4 n) =
> 3 + (lng $ list 4 n)
> ...
> And this takes a small constant space.
Unfortunately, it would be
lng [1 .. n]
~> 1 + (lng [2 .. n])
~> 1 + (1 + (lng [3 .. n]))
~> 1 + (1 + (1 + (lng [4 .. n])))
~> ...
and that builds a thunk of size O(n).
The thing is, genericLength is written so that for lazy number types, the
construction of the result can begin before the entire list has been
traversed. This means, however, that for strict number types, like Int or
Integer, it is woefully inefficient.
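To illustrate what a lazy number type buys you, here is a minimal sketch.
The Nat type and the longerThan helper are made up for this example (they
are not in any library); the point is only that (+) can produce a
constructor before the rest of the list has been looked at, so a comparison
can stop early.

```haskell
import Data.List (genericLength)

-- Hypothetical lazy Peano naturals (not a library type), for illustration.
data Nat = Z | S Nat
  deriving (Eq, Ord, Show)

instance Num Nat where
  Z   + n = n
  S m + n = S (m + n)      -- yields a constructor before inspecting n
  Z   * _ = Z
  S m * n = n + m * n
  abs      = id
  signum Z = Z
  signum _ = S Z
  fromInteger k
    | k <= 0    = Z
    | otherwise = S (fromInteger (k - 1))
  -- Truncated subtraction, since naturals have no negatives.
  m   - Z   = m
  Z   - _   = Z
  S m - S n = m - n

-- Hypothetical helper: does the list have more than k elements?
-- Only as much of the list is traversed as the comparison needs.
longerThan :: Integer -> [a] -> Bool
longerThan k xs = genericLength xs > (fromInteger k :: Nat)

main :: IO ()
main = print (longerThan 3 [1 ..])  -- the argument is an infinite list
```

With the lazy definition of genericLength this should terminate and print
True, because the comparison is decided after the first four constructors.
With a strict accumulator it would loop forever on the infinite list.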
In the code above, the result type of genericLength (and the type of list
elements) is defaulted to Integer.
When you compile with optimisations, a rewrite-rule fires:
-- | The 'genericLength' function is an overloaded version of 'length'. In
-- particular, instead of returning an 'Int', it returns any type which is
-- an instance of 'Num'. It is, however, less efficient than 'length'.
genericLength :: (Num i) => [b] -> i
genericLength [] = 0
genericLength (_:l) = 1 + genericLength l
{-# RULES
  "genericLengthInt"     genericLength = (strictGenericLength :: [a] -> Int);
  "genericLengthInteger" genericLength = (strictGenericLength :: [a] -> Integer);
 #-}
strictGenericLength :: (Num i) => [b] -> i
strictGenericLength l = gl l 0
  where
    gl [] a     = a
    gl (_:xs) a = let a' = a + 1 in a' `seq` gl xs a'
which gives a reasonably efficient constant-space calculation.
Without optimisations and in ghci, you get the generic code, which is slow
and takes O(n) space.
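If you don't want to depend on the rewrite rule firing, one option is to
count with a strict accumulator yourself. A minimal sketch, assuming the
result type is a strict one such as Int or Integer (strictLength is a name
made up here, not a library function):

```haskell
import Data.List (foldl')

-- Hypothetical helper: counts elements with a strict accumulator.
-- foldl' forces the accumulator to weak head normal form at every step,
-- which is full evaluation for Int and Integer.
strictLength :: Num i => [a] -> i
strictLength = foldl' (\acc _ -> acc + 1) 0

main :: IO ()
main = do
  let n = 10 ^ (6 :: Int) :: Integer
  print (strictLength [1 .. n] :: Integer)
  -- 'length' returns an Int; convert when another numeric type is needed.
  print (fromIntegral (length [1 .. n]) :: Integer)
```

Both variants should behave the same with or without -O and in ghci, since
neither relies on a RULES pragma. The second one is limited by the range of
Int, of course.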
> Thank you in advance for your explanation,
>
> -----------------
> Serge Mechveliani
> mechvel at botik.ru