[Haskell-cafe] Re: Difference lists and ShowS
apfelmus
apfelmus at quantentunnel.de
Wed Jan 9 16:18:17 EST 2008
Albert Y. C. Lai wrote:
> apfelmus wrote:
>> I don't know a formalism for easy reasoning about time in a lazy
>> language. Anyone any pointers? Note that the problem is already
>> present for difference lists in strict languages.
>
> http://homepages.inf.ed.ac.uk/wadler/topics/strictness-analysis.html
>
> especially "strictness analysis aids time analysis".
Ah, of course, thanks. Together with
D. Sands. Complexity Analysis for a Lazy Higher-Order Language.
http://citeseer.ist.psu.edu/291589.html
for the higher-order case, a satisfactory analysis can be put together.
The formalism is basically as follows: for a function f, let f^T denote
the time needed to execute it to weak normal form, given that its
arguments are already in weak normal form. Weak normal form = full
normal form for algebraic values and lambda-abstraction for functions =
what you'd expect in a strict language. Plain values = nullary
functions. For instance
(++)^T [] ys = 1 + ys^T = 1
(++)^T (x:xs) ys = 1 + (x:(xs ++ ys))^T
= 1 + (++)^T xs ys + xs^T + ys^T -- (:) is free
= 1 + (++)^T xs ys
==>
(++)^T xs ys = O(length xs)
Substituting a function application by the function body is counted as
one time step; that's where the 1 + comes from.
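This step counting can be checked mechanically. The following sketch (my own instrumentation, not part of the original analysis) threads an explicit counter through append, charging one step per unfolding:

```haskell
-- Hypothetical instrumented append: returns the result together with
-- the number of unfoldings, i.e. (++)^T as derived above.
appendT :: [a] -> [a] -> ([a], Int)
appendT []     ys = (ys, 1)        -- one step for the base unfolding
appendT (x:xs) ys =
  let (rest, n) = appendT xs ys    -- (:) itself is free
  in  (x : rest, 1 + n)

main :: IO ()
main = do
  let (r, steps) = appendT [1,2,3] [4,5 :: Int]
  print r        -- [1,2,3,4,5]
  print steps    -- 4 = length xs + 1, i.e. O(length xs)
```

Running it confirms the derivation: the step count depends only on the length of the first argument.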
For difference lists, we have
(.)^T f g = O(1)
since it immediately returns the lambda-abstraction \x -> f (g x). But
we missed something important about difference lists, namely the function
toList f = f []
that turns a difference list into an ordinary list and this function is
O(n). In contrast, its counterpart for ordinary lists, i.e. the identity
function, is only O(1). Why is it O(n)? Well, (.) itself may be O(1) but
it constructs a function that needs lots of time to run. In particular
(f . g)^T [] = ((\x->f (g x))[])^T
= 1 + (f (g []))^T
= 1 + f^T (g []) + (g [])^T
= 1 + f^T (g []) + g^T []
So, to analyze higher-order functions, we simply have to keep track of
the "size" of the returned functions (more precisely, Sands uses
"cost-closures"). The above reduces to
(f . g)^T [] = 1 + f^T [] + g^T []
Since our difference lists don't care what they are prepended to,
f^T xs = f^T []
Cheating a bit with the notation, we can write
toList^T (f . g) = 1 + toList^T f + toList^T g
This means that a difference list built from n elements by m
applications of (.) will take O(n + m) time. This is the same as O(m),
because m >= n - 1 when our lists are concatenations of singletons. That's not
O(n) as anticipated, but it's alright: a concatenation of m empty
lists is empty but clearly takes O(m) time, so the number of
concatenations matters.
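For concreteness, here is the standard difference-list setup in Haskell (a generic sketch, not code from this thread): (.) is the O(1) concatenation, and toList pays for it later, even when every piece is empty.

```haskell
type DList a = [a] -> [a]

fromList :: [a] -> DList a
fromList = (++)

toList :: DList a -> [a]
toList f = f []

emptyD :: DList a
emptyD = id

-- Several concatenations of empty lists: the result is [], but toList
-- must still unwind every composition node, hence the O(m) term.
manyEmpties :: DList Int
manyEmpties = foldr (.) emptyD (replicate 5 emptyD)

main :: IO ()
main = do
  print (toList (fromList [1,2] . fromList [3,4]))  -- [1,2,3,4]
  print (toList manyEmpties)                        -- []
```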
Since difference lists offer such a good concatenation, why not replace
ordinary lists entirely? Well, the problem is that we have another
function that should run fast, namely head . For ordinary lists,
head^T xs = O(1)
but for difference lists, we have
(head . toList)^T f = O(m) which is >= O(n)
in the worst case, lazy evaluation notwithstanding. How to analyze lazy
evaluation? Wadler's approach is to add an extra argument to every
expression which says how much of the expression is to be evaluated.
This extra information can be encoded via projections. But I think it's
sufficient here to let (head expr)^T symbolize the time to reduce expr
to weak head normal form. For example,
(head . toList)^T (f . g) = 1 + (head . toList)^T f
assuming that f is nonempty. But due to the 1 + , any left-nested
composition like
head . toList $ (((a . b) . c) . d) . e
still needs O(m) time. So, difference lists are no "eierlegende
Wollmilchsau" (egg-laying wool-milk-sow, the proverbial all-in-one
wonder) either.
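To illustrate the two shapes (my own example, not from the post): left- and right-nested compositions produce the same list, but by the analysis above, forcing the head of the left-nested one must first unwind all m composition nodes, whereas the right-nested one exposes its head after O(1) steps.

```haskell
elems :: [[Int] -> [Int]]
elems = map (:) [1..5]

leftNested, rightNested :: [Int] -> [Int]
leftNested  = foldl (.) id elems
  -- ((((id . (1:)) . (2:)) . (3:)) . (4:)) . (5:)
rightNested = foldr (.) id elems
  -- (1:) . ((2:) . ((3:) . ((4:) . ((5:) . id))))

main :: IO ()
main = do
  print (head (leftNested  []))  -- 1, but only after O(m) rebracketing
  print (head (rightNested []))  -- 1, after O(1) steps
```

Both evaluate to the same value; the difference the analysis captures is purely in how many steps laziness must take before the head becomes available.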
Regards,
apfelmus