[Haskell-cafe] Haskell's "historical futurism" needs better writing, not better tools

Fri Oct 1 09:52:53 UTC 2021

Am 01.10.21 um 03:14 schrieb Viktor Dukhovni:
> On Fri, Oct 01, 2021 at 02:40:49AM +0200, Ben Franksen wrote:
>> They define semantics in a semi-formal notation, which I find succinct
>> and very intuitive. This can be easily generalized to Foldable via
>> 'toList'. Indeed, there is almost nothing about Foldable that cannot be
>> understood once you understand Data.List and the toList method. Even
>> foldMap is (semantically) just (***):
> 
> Well a balanced Tree can have an efficient corecursive foldl, or a
> performant 'foldr`', and Sets can know their size statically, and `elem`
> runs in linear time even in structures that potentially support faster
> search.

All true, and I think it is important to document these things. The 
question is: where?

This is a general problem with all kinds of generic "container" 
classes/interfaces, and not limited to Haskell: performance 
characteristics of methods will vary widely depending on the 
implementation. In Haskell this includes semantics insofar as bottom / 
infinite structures are concerned.

Documenting the API /itself/ can go only so far before becoming a manual 
enumeration of all possible implementations one can think of or happens 
to know about. This clearly doesn't scale, so it would be better to 
leave it unspecified and attach such documentation to the instances instead.

It would help if rendering of instance docs in haddock were improved (as 
sub-paragraph under the instance instead of to the right of it).

> And it is perhaps worth asking whether you feel you still have anything
> you'd like to learn about Foldable, for if not, perhaps the
> documentation is not for you, and that's fine...

I may have earned this remark with the tone of my critique ;-) In case 
it came over as condescending, please accept my apologies.

>> In fact all class methods have simple semantic definitions in terms of
>> the same named functions from Data.List via 'toList'.
> 
> But performance may differ radically, and `toList` may diverge for
> `snocList` when infinite on the left, though that's a rather
> pathological example.
>
>> Expectations of runtime cost should be either explicitly stated in the
>> docs for each method or left out.
> 
> That's not enough if users can't reason about the available choices or
> don't know how to implement performant instances.

The truth is (and that is what the docs imply but fail to make explicit 
as that would be too embarrasing): you *cannot* reason about these 
things. If the implementation (instance) is free to choose whether 
foldr or foldl is the "natural" fold, then how can I make an informed 
choice between them for an arbitrary Foldable?

If we were to define the semantics in terms of 'toList', then we would 
acknowledge that Foldable is biased in the same direction as lists are, 
so behavior would no longer be implementation defined and could be 
easily reasoned about.

>  Function synopses
> rarely provide enough room for more than a cursory description and
> a few examples.  That's not their role.  This is why Unix manpages
> have both a SYNOPSIS and a DESCRIPTION section.
> 
> I am quite open to improved language in the overview, and less open to
> the idea that it is just baggage to throw overboard.  In particular,
> I've had positive feedback on the material, despite perhaps overly
> turgid prose in some places.  Please help to make it crisp.
> 
> I find absence of overview (DESCRIPTION if you like) sections in many a
> non-trivial Haskell library to be quite a barrier to working with the
> library, the synopses alone are rarely enough for my needs.

I agree. I do like (not too verbose) introduction of concepts when 
reading docs of a new module. See below for a proposal.

>> I may be wrong but it looks to me as if this could be derived by adding
>> one more method 'fromList' that is required to be left inverse of 'toList':
>>
>>     fromList :: Foldable a => [a] -> t a
>>     fromList . toList = id
> 
> This is roughly the sort of thing one can do with Traversable (recover
> the structure from its spine and element list, but not its element list
> alone).  The point is that various non-linear (e.g. tree-like)
> structures with the same element order have distinct "spines".
[...]
>> Is there a sensible (useful, lawful) Foldable instance which has no
>> 'fromList'?
> 
> Sure, any tree-like structure where shape is not implied by the element
> list alone.

Sorry, you are right, of course. I wasn't thinking clearly.

Regarding me helping to improve the docs: I have made concrete proposals 
to re-word the descriptions. But I really think that a large part of the 
documentation you added comes down to saying:

"""
As specified, this class is mostly useless, since it does not allow you 
to reason about the choice of methods (e.g. 'foldr' vs. 'foldl') to use 
when working with a generic Foldable container. To make this choice, you 
have to make assumptions about whether it is left-leaning (like lists) 
or right-leaning (like snoc-lists) or neither (unbiased). The usual 
assumption is that it is right-leaning or unbiased. This means that all 
methods can be considered as being defined, semantically, using 'toList' 
and the corresponding function with the same name from Data.List. (For 
fold and foldMap, see their default definitions).

If you can assume the element type is a Monoid, you can get away by 
using only the unbiased functions 'fold' and 'foldMap'. This leaves the 
choice of implementation strategy to the container (instance).
"""

Cheers
Ben
-- 
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman