[Haskell-cafe] Haskell's "historical futurism" needs better writing, not better tools

Fri Oct 1 00:40:49 UTC 2021

Am 16.09.21 um 22:52 schrieb Viktor Dukhovni:
> I am also curious whether I'm part of the solution or part of the
> precipitate.  I've recently contributed new documentation for
> Data.Foldable and Data.Traversable:
> 
>      https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Foldable.html#g:7
>      https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Traversable.html#g:4
> 
> are these a step in the right direction, or examples of more writing
> that sucks?

Sorry, but IMO it sucks.

"Foldable structures are reduced to a summary value by accumulating 
contributions to the result one element at a time."

The first sentence of the class docs says

"The Foldable class represents data structures that can be reduced to a 
summary value one element at a time."

which is more correct and (slightly) more concise with identical 
meaning. No value added here.

Next:

"The contribution of each element to the final result is combined with 
an accumulator via an operator function. [...]"

First of all, use of the term "operator function" is confusing. In 
Haskell an operator is a function, period. Infix versus prefix notation 
is completely irrelevant here. The whole sentence is formulated in a 
cryptic and complicated way. It seems to me that what you want to say is

"The result is calculated by successively applying the argument function 
to each element and an accumulator, producing a new accumulator." (*)

Your text tries to explain things in a precise manner, but is hampered 
by insisting that the words "sequence" or "list" must be avoided, even 
though words like "precede", "follow", "left", and "right" are used all 
over the place. This understandably leads to awkward formulations.

It is much easier to understand the concepts via sequences. The 
documentation of foldr and foldl has these two lines (**):

foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
foldr f z [x1, x2, ..., xn] == x1 `f` (x2 `f` ... (xn `f` z)...)

They define semantics in a semi-formal notation, which I find succinct 
and very intuitive. This can be easily generalized to Foldable via 
'toList'. Indeed, there is almost nothing about Foldable that cannot be 
understood once you understand Data.List and the toList method. Even 
foldMap is (semantically) just (***):

foldMap f = foldr mappend mempty . map f . toList
           = foldl mappend mempty . map f . toList

In fact all class methods have simple semantic definitions in terms of 
the same named functions from Data.List via 'toList'. This makes any 
further documentation of them redundant. This would also obviate the 
need to introduce ambiguous terminology like "explicit operator".

The next sentence after the example:

"The first argument of both is an explicit operator that merges the 
contribution of an element of the structure with a partial fold over, 
respectively, either the preceding or following elements of the structure."

"merges the contribution of" is a complicated way to say "combines". 
Again, the semi-formal notation (**) says it all. As does my alternative 
proposal for the first sentence above (*), which incidentally 
demonstrates that you are repeating yourself here.

The next sentence I could understand on first reading. Nevertheless, the 
content of everything up to (including) this sentence can be 
abbreviated, by adding a singe sentence to (*):

"The result is calculated by successively applying the argument function 
to each element and an accumulator, producing a new accumulator. The 
accumulator is initialized with the second argument."

Generally, the text is too long, too verbose, and the wording is too 
complicated.

Some parts e.g. the discussion about "Chirality" (an extremely obscure 
term that I have never read anywhere before) again serves to illustrate 
that, semantically, Foldable is not much more than a glorified 'toList', 
everything else is optimization. Users would be served better by plainly 
stating it in this way, rather than obscuring it, as if there were kind 
of "deep" abstraction going on.

Expectations of runtime cost should be either explicitly stated in the 
docs for each method or left out.

The "Notes" section adds useful information.

The most important documentation for a class, its laws, is (1) not 
contained in the class docs, (2) not even linked from there, and (3) too 
cryptic. I would *at least* expect the Monoid newtype wrappers (Endo, 
Dual, Sum, Product) and their inverses to be hyperlinked to their 
definition. The way the first two laws are stated in terms of these 
newtypes makes them harder to understand than necessary, especially for 
newcomers. My (equivalent) semantic definition of foldMap above (***) in 
terms of foldr/foldl does not require knowledge about these types and 
the semantics of their Monoid instances.

Yet another aside: apart from trivial reductions to the corresponding 
function in Data.List using 'toList' (except 'toList' itself as well as 
'fold' and 'foldMap'), the only non-trivial law seems to be

"If the type is also a Functor instance, it should satisfy foldMap f = 
fold . fmap f"

I may be wrong but it looks to me as if this could be derived by adding 
one more method 'fromList' that is required to be left inverse of 'toList':

   fromList :: Foldable a => [a] -> t a
   fromList . toList = id

Of course, for some Foldables (e.g. those with a fixed finite number of 
elements), fromList would be partial.

Is there a sensible (useful, lawful) Foldable instance which has no 
'fromList'?

I suspect no. In fact I suspect addition of fromList (with the left 
inverse law) would nicely serve to rule out useless/surprising instances.

Cheers
Ben
-- 
I would rather have questions that cannot be answered, than answers that
cannot be questioned.  -- Richard Feynman