[Haskell-cafe] A Proposed Law for Foldable?

Fri Feb 27 19:17:37 UTC 2015

On February 27, 2015 at 1:39:10 AM, David Feuer (david.feuer at gmail.com) wrote:
> I am still struggling to understand why you want this to be a law for
> Foldable. It seems an interesting property of some Foldable instances,
> but, unlike Edward Kmett's proposed monoid morphism law, it's not
> clear to me how you can use this low to prove useful properties of
> programs. Could you explain?

I think there are a number of purposes for laws. Some can be thought of as “suggested rewrite rules” — and the monoid morphism law is one such, as are many related free approaches.

Note that the monoid morphism law that Edward provides is _not_ a “proposed” law — it is an “almost free theorem” — given a monoid morphism, it follows for free for any Foldable. There is no possible foldable instance that can violate this law, assuming you have an actual monoid morphism.

So Edward may have proposed adding it to the documentation (which makes sense to me) — but it provides absolutely no guidance or constraints as to what an “allowable” instance of Foldable is or is not.

But there are other reasons for laws than just to provide rewrite rules, even though it is often desirable to express laws in such terms. Consider the first Functor law for example — fmap id === id. Now clearly we can use it to go eliminate a bunch of “fmap id” calls in our program, should we have had them. But when would that ever be the case? Instead, the law is important because it _restricts_ the range of allowable instances — and so if you know you have a data type, and you know it has a functor instance, you then know what that functor instance must do, without looking at the source code.

In “An Investigation of the Laws of Traversals” from where we get our current Traversable laws, for example, Jaskelioff and Rypacek explain their motivation as establishing coherence laws that “capture the intuition of traversals” and rule out “bad” traversals (i.e. ones that violate our intuition). That’s the sort of goal I’m after — something that rules out “obviously bad” Foldable instances and lets us know something about the instances provided on structures without necessarily needing to read the code of those instances — and beyond what we can learn “for free” from the type.

Let me give an example. If you had a structure that represented a sequence of ‘a’, and you discovered that its Foldable instance only folded over the second, fifth, and seventeenth elements and no others — you would, I expect, find this to be a surprising result. You might even say “well, that’s a stupid Foldable instance”. However, outside of your gut feeling that this didn’t make sense, there would be no even semi-formal way to say it was “wrong”. Under a law such as I am trying to drive towards, we would be able to rule out such instances, just as our traversable laws now let us rule out instances that e.g. drop elements from the resultant structure. Hence, if you saw such a structure, and saw it had a Foldable instance, you would now know something useful about the behavior of that instance, without having to examine the implementation of Foldable — one of those useful things being, for example, that it could not possibly, lawfully, only fold over the fifth and seventeenth elements contained but no others.

(Now how to specify such a law that permits the _maximum_ amount of useful information while also permitting the _maximum_ number of instances that “match our intuition” is what my last post was driving at — I think I am closer, but certainly there remains work to be done).

In fact, we should note that when a Foldable is also Traversable, we do have a law specifying how the two typeclasses cohere, and because Traversable _does_ have strong laws, this _does_ suffice to well characterize Foldable in such a case. One nice property of the sort of law I have been driving at is that it will _agree_ with the current characterization in that same circumstance — when Foldable and Traversable instances both exist — and it will allow us to extend _those same intuitions_ in a formal way to circumstances when, for whatever reason, a Traversable instance does _not_ exist.

I hope this helps explain what I’m up to?

Cheers,
Gershom

>  
> On Wed, Feb 25, 2015 at 5:40 PM, Gershom B wrote:
> > I think I can now float an amended law that is a further step in the right
> > direction. First I will give the law, and then I will attempt to both
> > motivate why the amended treatment is needed, and why the weakening of the
> > initial proposal is a legitimate approach. Finally, I will discuss some
> > remaining issues that it would be nice to examine, including some sketches
> > of related but alternate approaches.
> >
> > So, the new law:
> >
> > ===
> > for a lawful Foldable f, and
> >
> > given a fresh newtype GenericSet = GenericSet Integer deriving (Eq, Ord),
> > and mkGenericSet = GenericSet, where GenericSet is otherwise fully abstract
> >
> > then
> >
> > forall (g :: forall a. f a -> Maybe a), (x :: f GenericSet). maybe True
> > (`Foldable.elem` x) (g x) === True
> > ==
> >
> > You may recall that the prior formulation asked that this condition hold for
> > all a., (x :: f a), and so required an external notion of equality. Here, we
> > ask that it only hold at a _specified type_ -- i.e. "GenericSet".
> >
> > So, in what sense was the older formulation "too strong"? Here is a minimal
> > case.
> >
> > data TBox a where
> > TAny :: a -> TBox a
> > TString :: String -> TBox String
> >
> > TBox is just a box with some optional type information, and it should be
> > able to be equipped with the obvious Foldable instance. However, we can now
> > write a function such as
> >
> > discrim :: forall a. TBox a -> Maybe a
> > discrim (TAny x) = Just x
> > discrim (TString _) = "zoinks"
> >
> > This is to say that if we _know_ that TBox contains a string, we can now
> > take _any_ string out of it. Clearly, the existence of such functions means
> > that the obvious Foldable instance cannot satisfy the stronger law.
> >
> > By contrast, the straightforward Foldable instance for TBox still satisfies
> > the weaker law, as we only test the action on "GenericSet" -- and since we
> > have defined GenericSet as a fresh newtype, we know that it cannot have a
> > tag to be matched on within a GADT we have already defined. Hence, our law,
> > though it gives a universal property, does not give this property itself
> > "universally" _at every type a_ but rather at a "generic type". This is by
> > analogy with the technique of using "generic points" in algebraic geometry
> > [1] to give properties that are not "everywhere true" but instead are true
> > "almost everywhere", with a precise meaning given to "almost" -- i.e. that
> > the space for which they do not hold _vanishes_.
> >
> > This is to say, a Foldable satisfying the proposed law will not have the
> > property hold "at all types," but rather at "almost all types" for an
> > analogously precise meaning of "almost". I believe that a statement of why
> > GenericSet is indeed "generic" in some topological sense is possible,
> > although I have not fully fleshed this out.
> >
> > To my knowledge, the introduction of this sort of approach in an FP context
> > is new, but I would welcome any references showing prior art.
> >
> > Aside from dealing with GADTs in some fashion, the new approach has a few
> > other useful features. First, by formulating things directly in terms of
> > "Foldable.elem" we omit the need for some sort of notion of equality in the
> > metalogic -- instead we can use the "Eq" typeclass directly. Furthermore,
> > this happens to sidestep the "toList" problem in the initial approach -- it
> > directly allows search of potentially infinite structures.
> >
> > There are still a few things that could potentially be better.
> >
> > 1) The "fully internal" approach means that we are now subject to the "bias"
> > of "||" -- that is to say that _|_ || True -> _|_, but True || _|_ -> True.
> > So we do not fully capture the searchability of infinite structures. This
> > may indicate that a retreat to a metalogical description of "elem" with
> > unbiased "or" is in order.
> >
> > 2) While we can generically characterize a Foldable instance on some f
> > "almost everywhere" this is at the cost of giving it _no_ characterization
> > at the points where our generic characterization fails. It would be nice to
> > establish some sort of relation between the generic characterization and the
> > action at specified points. However, I am not quite sure how to present
> > this. An alternate approach could be to specify a Foldable law for any `f`
> > that first takes `f` (which may be a GADT) to a related type `f1` (which
> > must be an ordinary ADT) that squashes or omits dictionaries and equality
> > constraints, and likewise takes the Foldable instance to a related instance
> > on f1, and then provides a condition on f1. So rather that retreating from
> > universal to generic properties, we instead take hold of the machinery of
> > logical relations directly to establish a law. I would be interested in
> > being pointed to related work along these lines as well.
> >
> > 3) An additional drawback of the "Generic Point" approach as given is that
> > we chose to derive only two particular typeclasses -- Eq and Ord. An
> > alternate approach would be to quantify over all a, but then give the
> > property in terms of say "newtype Generify a = Generify a deriving (...)"
> > which derives all classes on "a" that do not feature "a" in a positive
> > position. Doing this would also mean a retreat from a fully internal notion
> > of equality, of course...
> >
> > Anyway, this is clearly still work in progress, but I would appreciate any
> > feedback on the direction this is going, or references that may seem useful.
> >
> > Cheers,
> > Gershom
> >
> > [1] https://en.wikipedia.org/wiki/Generic_point
> >
> > On Thu, Feb 12, 2015 at 6:33 PM, Edward Kmett wrote:
> >>
> >> With Foldable we do have a very nice law around foldMap.
> >>
> >> A monoid homomorphism g is a combinator such that
> >>
> >> g mempty = mempty
> >> g (mappend a b) = mappend (g a) (mappend (g b)
> >>
> >> For any monoid homomorphism g,
> >>
> >> foldMap (g . f) = g . foldMap f
> >>
> >> We can use that to construct proofs of an analogue to "the banana split
> >> theorem" for foldr, but rephrased in terms of foldMap:
> >>
> >> foldMap f &&& foldMap g = foldMap (f &&& g)
> >>
> >> Getting there uses the fact that fst and snd are both monoid
> >> homomorphisms.
> >>
> >> There are also laws relating the behavior of all of the other combinators
> >> in Foldable to foldMap.
> >>
> >> Ultimately the reasons for the other members of the class are a sop to
> >> efficiency concerns: asymptotic factors in terms of time or stack usage
> >> matter.
> >>
> >> -Edward
> >>
> >>
> >> On Thu, Feb 12, 2015 at 5:22 PM, Kim-Ee Yeoh wrote:
> >>>
> >>> (removing Libraries, since not everyone on HC is on that list)
> >>>
> >>> I do not know all of the context of Foldable, but one I do know that's
> >>> not been mentioned is the implicit rule-of-thumb that every type class
> >>> should have a law.
> >>>
> >>> So the undercurrent is that if Foldable doesn't have a law, should it
> >>> even be a type class? This has led to efforts to uncover laws for Foldable.
> >>>
> >>> Worth discussing in a separate thread is the criterion itself of "if it
> >>> doesn't have a law, it's not a type class". Useful sometimes, but that's not
> >>> the sine qua non of type classes.
> >>>
> >>>
> >>>
> >>> -- Kim-Ee
> >>>
> >>> On Fri, Feb 13, 2015 at 2:47 AM, Gershom B wrote:
> >>>>
> >>>> For a long time, many people, including me, have said that "Foldable has
> >>>> no laws" (or Foldable only has free laws) -- this is true, as it stands,
> >>>> with the exception that Foldable has a non-free law in interaction with
> >>>> Traversable (namely that it act as a proper specialization of Traversable
> >>>> methods). However, I believe that there is a good law we can give for
> >>>> Foldable.
> >>>>
> >>>> I earlier explored this in a paper presented at IFL 2014 but
> >>>> (rightfully) rejected from the IFL post-proceedings.
> >>>> (http://gbaz.github.io/slides/buildable2014.pdf). That paper got part of  
> the
> >>>> way there, but I believe now have a better approach on the question of a
> >>>> Foldable law -- as sketched below.
> >>>>
> >>>> I think I now (unlike in the paper) can state a succinct law for
> >>>> Foldable that has desired properties: 1) It is not "free" -- it can be
> >>>> violated, and thus stating it adds semantic content. 2) We typically expect
> >>>> it to be true. 3) There are no places where I can see an argument for
> >>>> violating it.
> >>>>
> >>>> If it pans out, I intend to pursue this and write it up more formally,
> >>>> but given the current FTP discussion I thought it was worth documenting this
> >>>> earlier rather than later. For simplicity, I will state this property in
> >>>> terms of `toList` although that does not properly capture the infinite
> >>>> cases. Apologies for what may be nonstandard notation.
> >>>>
> >>>> Here is the law I think we should discuss requiring:
> >>>>
> >>>> * * *
> >>>> Given Foldable f, then
> >>>> forall (g :: forall a. f a -> Maybe a), (x :: f a). case g x of Just a
> >>>> --> a `elem` toList x
> >>>> * * *
> >>>>
> >>>> Since we do not require `a` to be of type `Eq`, note that the `elem`
> >>>> function given here is not internal to Haskell, but in the metalogic.
> >>>>
> >>>> Furthermore, note that the use of parametricity here lets us make an
> >>>> "end run" around the usual problem of giving laws to Foldable -- rather than
> >>>> providing an interaction with another class, we provide a claim about _all_
> >>>> functions of a particular type.
> >>>>
> >>>> Also note that the functions `g` we intend to quantify over are
> >>>> functions that _can be written_ -- so we can respect the property of data
> >>>> structures to abstract over information. Consider
> >>>>
> >>>> data Funny a = Funny {hidden :: a, public :: [a]}
> >>>>
> >>>> instance Foldable Funny where
> >>>> foldMap f x = foldMap f (public x)
> >>>>
> >>>> Now, if it is truly impossible to ever "see" hidden (i.e. it is not
> >>>> exported, or only exported through a semantics-breaking "Internal" module),  
> >>>> then the Foldable instance is legitimate. Otherwise, the Foldable instance
> >>>> is illegitimate by the law given above.
> >>>>
> >>>> I would suggest the law given is "morally" the right thing for Foldable
> >>>> -- a Foldable instance for `f` should suggest that it gives us "all the as
> >>>> in any `f a`", and so it is, in some particular restricted sense, initial
> >>>> among functions that extract as.
> >>>>
> >>>> I do not suggest we add this law right away. However, I would like to
> >>>> suggest considering it, and I believe it (or a cleaned-up variant) would
> >>>> help us to see Foldable as a more legitimately lawful class that not only
> >>>> provides conveniences but can be used to aid reasoning.
> >>>>
> >>>> Relating this to adjointness, as I do in the IFL preprint, remains
> >>>> future work.
> >>>>
> >>>> Cheers,
> >>>> Gershom
> >>>>
> >>>> _______________________________________________
> >>>> Haskell-Cafe mailing list
> >>>> Haskell-Cafe at haskell.org
> >>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
> >>>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Haskell-Cafe mailing list
> >>> Haskell-Cafe at haskell.org
> >>> http://www.haskell.org/mailman/listinfo/haskell-cafe
> >>>
> >>
> >
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>