[Haskell-cafe] Musings on lists

Melanie Brown brown.m at pm.me
Wed Jul 12 23:06:06 UTC 2023


You are correct that (++) is associative, but associativity is only a statement about the equality of the resulting lists, not about the cost of the algorithm used to compute them in order to compare them.

If you expect to be concatenating many lists in a left-associative way, it is more efficient to opt for an alternative representation, the difference list: https://wiki.haskell.org/Difference_list

Instead of storing the list itself, one stores a function that prepends all of its elements, in the proper order, to an existing list. The isomorphism with a regular list is given by applying the function to []. This representation turns concatenation into function composition: instead of a chain of (++) we obtain a chain of (:), which, as you pointed out, is much faster than repeatedly traversing the same list to reach its ever-receding end.
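A minimal sketch of the idea (the names DList, toDList, fromDList, and append are illustrative, not from any particular library; the dlist package on Hackage provides a polished version):

```haskell
import Data.List (foldl')

-- A difference list: a function that prepends its elements
-- to whatever list it is eventually applied to.
type DList a = [a] -> [a]

-- Convert a regular list to a difference list and back.
toDList :: [a] -> DList a
toDList xs = (xs ++)

fromDList :: DList a -> [a]
fromDList f = f []

-- Concatenation is function composition: O(1) per append,
-- no matter how the appends are associated.
append :: DList a -> DList a -> DList a
append = (.)

main :: IO ()
main = do
  -- A left-nested chain that would be quadratic with plain (++):
  let parts    = map toDList [[1, 2], [3], [4, 5, 6 :: Int]]
      combined = foldl' append id parts
  print (fromDList combined)  -- [1,2,3,4,5,6]
```

Only the final fromDList pays the cost of building the list, and each element is consed exactly once.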

Isomorphisms like these are useful for changing the algorithmic complexity of operations without affecting their correctness. In the same vein, it’s more efficient to combine several folds of the same list into a single accumulator and traverse it only once. There are tools (such as the lens library) for converting between isomorphic representations, but I am not aware of any automagical solutions to this kind of problem (except, perhaps, rewrite rules, but that’s compiler wizardry).
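To illustrate the single-traversal point, here is a small sketch (meanTwoPasses and meanOnePass are hypothetical names for this example): computing a mean with sum and length walks the list twice, whereas folding both into one strict accumulator walks it once.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')

-- Two separate folds: the list is traversed twice.
meanTwoPasses :: [Double] -> Double
meanTwoPasses xs = sum xs / fromIntegral (length xs)

-- One strict fold accumulating both results: a single traversal,
-- with bang patterns to avoid building up thunks.
meanOnePass :: [Double] -> Double
meanOnePass xs = s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (!acc, !len) x = (acc + x, len + 1)

main :: IO ()
main = print (meanOnePass [1 .. 10])  -- 5.5
```

Beyond one traversal instead of two, the one-pass version also lets the list be garbage-collected as it is consumed, since no second fold needs to hold on to its head.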

Cheers
Melanie Brown


On Wed, Jul 12, 2023 at 18:06, Todd Wilson <twilson at csufresno.edu> wrote:

> Dear Cafe,
>
> I'm an educator using Haskell, and I've been telling my students certain stories about Haskell evaluation that I'm now starting to question, so I'm writing for some clarification. I've written before about similar things (e.g., "Basic list exercise" from Mar 16), so it seems my own education on this topic is not yet complete! My apologies if this is too trivial for the Cafe.
>
> One story I tell students is that list concatenation, xs ++ ys, ends up "copying" the elements of xs but not the elements of ys, and in particular an expression like xs ++ [a], putting a new element at the end of a list, is particularly expensive, at least compared to x : xs, putting one at the beginning, which seems obvious, right?
>
> But here is my question: isn't xs ++ ys just a thunk that, as elements are "requested" from it, produces them first from xs and then, when those run out, defers entirely to ys from that point on? If so, then the elements of xs aren't "copied" any more than the elements of ys are; they are just produced from different places. Even the expression xs ++ [a] isn't problematic in this light: just act like xs, but with one more "card up the sleeve" when you are done. Have I gotten that right?
>
> Second, if the above is correct, then if there is a nesting, (xs ++ ys) ++ zs, and we want an element from it, there will be two steps before we can get that element from xs. And this will continue until xs is exhausted, at which point we get new elements in one step from ys, and finally in zero steps from zs. But ++ is associative, and so this list is also xs ++ (ys ++ zs), which requires less work getting the elements from xs. Is this kind of optimization something that is or can be done (automatically)?
>
> Thanks for your indulgence,
>
> Todd Wilson

