[Haskell-cafe] Haskell's "historical futurism" needs better writing, not better tools
Ben Franksen
ben.franksen at online.de
Fri Oct 1 00:40:49 UTC 2021
Am 16.09.21 um 22:52 schrieb Viktor Dukhovni:
> I am also curious whether I'm part of the solution or part of the
> precipitate. I've recently contributed new documentation for
> Data.Foldable and Data.Traversable:
>
> https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Foldable.html#g:7
> https://dnssec-stats.ant.isi.edu/~viktor/haskell/docs/libraries/base/Data-Traversable.html#g:4
>
> are these a step in the right direction, or examples of more writing
> that sucks?
Sorry, but IMO it sucks.
"Foldable structures are reduced to a summary value by accumulating
contributions to the result one element at a time."
The first sentence of the class docs says
"The Foldable class represents data structures that can be reduced to a
summary value one element at a time."
which is more correct and (slightly) more concise with identical
meaning. No value added here.
Next:
"The contribution of each element to the final result is combined with
an accumulator via an operator function. [...]"
First of all, use of the term "operator function" is confusing. In
Haskell an operator is a function, period. Infix versus prefix notation
is completely irrelevant here. The whole sentence is formulated in a
cryptic and complicated way. It seems to me that what you want to say is
"The result is calculated by successively applying the argument function
to each element and an accumulator, producing a new accumulator." (*)
Your text tries to explain things in a precise manner, but is hampered
by insisting that the words "sequence" or "list" must be avoided, even
though words like "precede", "follow", "left", and "right" are used all
over the place. This understandably leads to awkward formulations.
It is much easier to understand the concepts via sequences. The
documentation of foldr and foldl has these two lines (**):
foldl f z [x1, x2, ..., xn] == (...((z `f` x1) `f` x2) `f`...) `f` xn
foldr f z [x1, x2, ..., xn] == x1 `f` (x2 `f` ... (xn `f` z)...)
They define semantics in a semi-formal notation, which I find succinct
and very intuitive. This can be easily generalized to Foldable via
'toList'. Indeed, there is almost nothing about Foldable that cannot be
understood once you understand Data.List and the toList method. Even
foldMap is (semantically) just (***):
foldMap f = foldr mappend mempty . map f . toList
= foldl mappend mempty . map f . toList
In fact all class methods have simple semantic definitions in terms of
the same named functions from Data.List via 'toList'. This makes any
further documentation of them redundant. This would also obviate the
need to introduce ambiguous terminology like "explicit operator".
The next sentence after the example:
"The first argument of both is an explicit operator that merges the
contribution of an element of the structure with a partial fold over,
respectively, either the preceding or following elements of the structure."
"merges the contribution of" is a complicated way to say "combines".
Again, the semi-formal notation (**) says it all. As does my alternative
proposal for the first sentence above (*), which incidentally
demonstrates that you are repeating yourself here.
The next sentence I could understand on first reading. Nevertheless, the
content of everything up to (including) this sentence can be
abbreviated, by adding a singe sentence to (*):
"The result is calculated by successively applying the argument function
to each element and an accumulator, producing a new accumulator. The
accumulator is initialized with the second argument."
Generally, the text is too long, too verbose, and the wording is too
complicated.
Some parts e.g. the discussion about "Chirality" (an extremely obscure
term that I have never read anywhere before) again serves to illustrate
that, semantically, Foldable is not much more than a glorified 'toList',
everything else is optimization. Users would be served better by plainly
stating it in this way, rather than obscuring it, as if there were kind
of "deep" abstraction going on.
Expectations of runtime cost should be either explicitly stated in the
docs for each method or left out.
The "Notes" section adds useful information.
The most important documentation for a class, its laws, is (1) not
contained in the class docs, (2) not even linked from there, and (3) too
cryptic. I would *at least* expect the Monoid newtype wrappers (Endo,
Dual, Sum, Product) and their inverses to be hyperlinked to their
definition. The way the first two laws are stated in terms of these
newtypes makes them harder to understand than necessary, especially for
newcomers. My (equivalent) semantic definition of foldMap above (***) in
terms of foldr/foldl does not require knowledge about these types and
the semantics of their Monoid instances.
Yet another aside: apart from trivial reductions to the corresponding
function in Data.List using 'toList' (except 'toList' itself as well as
'fold' and 'foldMap'), the only non-trivial law seems to be
"If the type is also a Functor instance, it should satisfy foldMap f =
fold . fmap f"
I may be wrong but it looks to me as if this could be derived by adding
one more method 'fromList' that is required to be left inverse of 'toList':
fromList :: Foldable a => [a] -> t a
fromList . toList = id
Of course, for some Foldables (e.g. those with a fixed finite number of
elements), fromList would be partial.
Is there a sensible (useful, lawful) Foldable instance which has no
'fromList'?
I suspect no. In fact I suspect addition of fromList (with the left
inverse law) would nicely serve to rule out useless/surprising instances.
Cheers
Ben
--
I would rather have questions that cannot be answered, than answers that
cannot be questioned. -- Richard Feynman
More information about the Haskell-Cafe
mailing list