[Hs-Generics] Traversable Functor Data,or: X marks the spot

Fri Jun 27 06:04:52 EDT 2008

Hi Claus!

We have a deadline coming soon, so this is only a brief comment about your
GMap implementation.

I played a bit with your fmap' function to see whether I could find an
obvious problem with the implementation. For instance, I used this datatype:

data Tricky a = Tricky a Char deriving (Data,Typeable)

and tried to transform a value of type "Tricky Char" to "Tricky Bool". I
wanted to see whether the second argument of Tricky would be (incorrectly)
transformed to Bool. But it turned out that fmap' behaved as expected. So I
think that SYB can pass the GMap test.

On the other hand, I think that using unsafeCoerce as a way to define
generic functions is inelegant and probably bad practice due to possible
runtime failures. However, I must admit that I was surprised when I saw your
trick :).

Probably we'll get back to you around next week. Meanwhile, you may find
useful to look at the paper we wrote on comparing generic programming
libraries[1]. In particular, you can look at caveats that apply to SYB.

Cheers,

Alexey

[1]
http://www.cs.uu.nl/wiki/bin/view/Alexey/ComparingLibrariesForGenericProgrammingInHaskell

On Tue, Jun 24, 2008 at 8:45 PM, Claus Reinke <claus.reinke at talk21.com>
wrote:

> Dear Generics;-)
>
> this is a repost from cvs-libraries of an experiment which you might find
> of interest, and on which I'd welcome feedback. One impact being that SYB
> can implement your GMap test (on which it currently defaults), among other
> things (see message 1 below, the technique probably applies to Uniplate as
> well?). One issue being that Data instances for non-algebraic types that
> default to
> doing nothing render this technique unsafe (see message 2 below).
>
> Btw, the announcement of the unified generics library project was so long
> ago that I had forgotten about it and about this list (thanks to Simon PJ
> for reminding me). My interest in this is partially from my past with HaRe,
> partially from wanting generic traversal support over GHC AST types, so I'm
> more interested
> in generic traversal/analysis style libraries.
>
> Claus
>
> --------- message 1
>
> The issue at hand: can we use Data/Typeable to do what Functor and
> Traversible do so easily, namely type-changing maps? Or should
> there be support for deriving all of Data/Typeable/Functor/Traversible
> over GHC's AST?
>
> A drastically simplified example from David's Haddock 2 code
> (Haddock.Interface.Rename) needs to do a kind of mapM
>
>   renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName)
>
> which is straightforward with Traversible, but that is not derivable
> yet (David has been working on that, though), while the usual basis
> of SYB, Data/Typeable, is derivable, but all SYB transformations are based
> on gfoldl, the type of which does not permit type-changing maps:
>
>   gfoldl :: (Data a)       =>(forall a1 b. (Data a1) => c (a1 -> b) -> a1
> -> c b)
>       -> (forall g. g -> c g)
>       -> a-> c a
>
> One could probably go from heterogeneous Data types to a
> homogeneously typed tree representation, do the map there, then transform
> back, but that would require additional overhead
> and a type coercion for the backtransform. Also, it seems a pity that the
> derived code for gfoldl can handle such maps - it is just
> gfoldl's type that gets in the way (and trying to generalize that
> type is a whole different kind of headache..).
>
> While boilerplate scrappers around the globe eagerly await the
> release of
>   "Generalizing the type of gfold, or: to braindamage and beyond"
>   (author: unknown; release date: unknown; release: unlikely;-)
>
> I thought I'd have a go at the smaller problem of finding a way
> to bypass gfoldl's type systematically in order to use its derivable
> code for things like fmap and traverse. This message summarizes
> how far I've got so far, and asks for comments (does this do
> what it should? is it safe? assuming it can be cleaned up to do
> what it should in a safe way, does this mean that deriving Data/
> Typeable for GHCs AST types will be sufficient, or should we
> still have Traversible as well? etc.).
>
> First, here is an implementation of renameDoc in terms of
> gfoldl and unsafeCoerce (slightly cleaned up version of what
> I sent in the other thread earlier):
>
>   data Name    = Name    String deriving (Show,Data,Typeable)
>   data DocName = DocName String deriving (Show,Data,Typeable)
>
>   renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName)
>   renameDoc (DocIdentifier ids) =     mapM (\(Name n)->return (DocName n))
> ids >>= return . DocIdentifier
>   renameDoc hsDoc = n2d (gfoldl k return hsDoc)
>     where k c x = ap c (mkM (d2n . renameDoc) x)
>
>           n2d :: Monad m => m (HsDoc Name) -> m (HsDoc DocName)
>           n2d = unsafeCoerce
>           d2n :: Monad m => m (HsDoc DocName) -> m (HsDoc Name)
>           d2n = unsafeCoerce
>
> 'DocIdentifier :: [id] -> HsDoc id' is the only constructor in HsDoc that
> involves the parameter type 'id', so renameDoc either does the
> parameter type conversion or -for other constructors- recurses into the
> subexpressions, using gfoldl to build a monadic map.
> The important insight here is that gfoldl's code can handle the task,
> we just pretend that our map is type-preserving to conform to gfoldl's
> restrictive type, by coercing the result types (inside gfoldl, we pretend
> that renameDoc returns a (HsDoc Name), which outside gfoldl, we coerce back
> to (HsDoc DocName)).
>
> Assuming that noone looks at the return types inside gfoldl (at least,
> not in a way that would be affected by this change in type - SYB
> does support adhoc overloading, after all, as in mkM here), this
> seems to work:
>
>   testDoc =       DocAppend (DocParagraph (DocAppend (DocIdentifier [Name
> "well-typed"]) DocEmpty))            (DocAppend (DocString "programs")
>      (DocIdentifier [Name "don't",Name "go",Name "anywhere"]))
>
>   *Main> renameDoc testDoc
>   DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
> DocEmpty))
>    (DocAppend (DocString "programs")   (DocIdentifier [DocName
> "don't",DocName "go",DocName "anywhere"]))
>
> But can we generalize this, and what about those coercions?
> Can we -ideally- define something like fmap and traverse in terms of
> gfoldl, and hide the uglyness in their implementations?
>
> Well, I'll spare you (and me;-) the details of my struggle with
> the type system, and just show the results, with some comments. First, the
> simpler fmap:
>
>   -- "X marks the spots";-) X should be private
>   data X = X deriving (Data,Typeable)
>
>   fmap' :: (Data (f X)) => (a -> b) -> f a -> f b
>   fmap' f x = markTheSpots (rec (wrap f)) x
>     where
>     markTheSpots :: (f X -> f X) -> (f a -> f b)
>     markTheSpots f = unsafeCoerce . f . unsafeCoerce
>     rec ::  (Data a) => (X -> X) -> a -> a
>     rec f x = (gmapT (rec f) `extT` f) x
>     wrap :: (a -> b) -> (X -> X)
>     wrap f = unsafeCoerce . f . unsafeCoerce
>
> Surprisingly simple for something that seemed impossible
> at first, isn't it?-) Since we're already committed (for this
> experiment, at least) to some type fiddling, we can make
> more constructive use of unsafeCoerce, for two purposes:
>
> 1. We wrap the function parameter f, to make it look
>   like a type-preserving function, on some private type X.
>
> 2. We mark the occurrences of the type constructor f's   parameter type, by
> coercing 'f a' to 'f X' and 'f X' to
>   'f b'.
>
> Then, we simply use SYB to apply f, when its type matches
> the parameter, or to recurse into the subexpressions using
> gmapT, otherwise. If X is private, f will only be applied
> to the "functor parameter" _positions_ in 'f a', not to other
> "functor parameter" _type_ occurrences in 'f a':
>
>   *Main> fmap' not $ (True,True)
>   (True,False)
>   *Main> fmap' not $ [True,True]
>   [False,False]
>
>   *Main> fmap' (\(Name s)->(DocName s)) testDoc
>   DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
> DocEmpty))
>       (DocAppend (DocString "programs")       (DocIdentifier [DocName
> "don't",DocName "go",DocName "anywhere"]))
>
> Note how we use one kind of recursion where a manual implementation of fmap
> would use two: handling subexpressions
> (which we'd usually do by pattern matching and expression
> construction) and functor type recursion.
>
> Ecouraged by this success, we are ready to tackle the slightly more tricky
> traverse, using the same techniques:
> mark the spot, wrap the worker, one kind of recursion.
> Only this time, we need to take care of the applicative
> plumbing as well, so it's gfoldl instead of gmapT, and
> some more complex types.
>
> We need the usual SYB type extension support, but for
> Applicative, not Monad (SYB was defined before
> Applicative existed, it seems..):
>
>   -- type extension over Applicative f
>   mkF :: forall f a b . (Applicative f,Typeable a,Typeable b)
>        => (b -> f b) -> a -> f a
>   mkF f x = case gcast (F f) of { Just (F f) -> f x; Nothing -> pure x }
>
>   extF :: forall t f a b . (Typeable a,Typeable b)        => (a -> f a) ->
> (b -> f b) -> (a -> f a)
>   (f `extF` fspec) x = case gcast (F fspec) of { Just (F fspec) -> fspec x;
> Nothing -> f x }
>
>   newtype F f x = F { unF :: x -> f x }
>
> And here we go:
>
>   traverse' :: forall f t a b . (Applicative f,Typeable1 f,
>                                  Typeable1 t,Data (t X),
>                                  Typeable a)
>                              => (a -> f b) -> t a -> f (t b)
>   traverse' f x = markTheSpots (rec (wrap f)) x
>     where
>     markTheSpots :: forall a b . (t X -> f (t X)) -> (t a -> f (t b))
>     markTheSpots f = unsafeCoerce . f . unsafeCoerce
>     wrap :: forall a b . (a -> f b) -> (X -> f X)
>     wrap f = unsafeCoerce . f . unsafeCoerce
>
>     rec :: forall x . Data x => (X -> f X) -> x -> f x
>     rec f x = (gfoldl (k f) z `extF` f) x     k :: forall a b . Data a =>
> (X -> f X) -> f (a -> b) -> a -> f b
>     k f c x = c <*> (mkF (rec f :: Data a => a -> f a) `extF` f) x
>     z c = pure c
>
> This does seem to do the right thing, so I don't seem to be
> completely on the wrong track:
>
>   *Main> traverse' (pure . not) (True,True)
>   (True,False)
>
>   *Main> traverse' (pure . not) [True,True]
>   [False,False]
>
>   *Main> traverse' print testDoc
>   Name "well-typed"
>   Name "don't"
>   Name "go"
>   Name "anywhere"
>   DocAppend (DocParagraph (DocAppend (DocIdentifier [()]) DocEmpty))
> (DocAppend (DocString "programs") (DocIdentifier [(),(),()]))
>
>   *Main> traverse' (pure) testDoc
>   DocAppend (DocParagraph (DocAppend (DocIdentifier [Name "well-typed"])
> DocEmpty))       (DocAppend (DocString "programs")       (DocIdentifier
> [Name "don't",Name "go",Name "anywhere"]))
>
>   *Main> traverse' (pure . (\(Name s)->DocName s)) testDoc
>   DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
> DocEmpty))
>    (DocAppend (DocString "programs")    (DocIdentifier [DocName
> "don't",DocName "go",DocName "anywhere"]))
>
> but I'm too battered from trying to coerce gfoldl to analyze this
> properly at the moment, so I'm sending this in the hope of (a)
> not having to look at gfoldl's type for a while !-) and (b) getting
> some feedback (is this useful? does the extra overhead matter?..), caveats
> (cyclic programs, nested traversals, escaping Xs, ..?), etc.
>
> Over to you,
> Claus
>
> PS. the "X marks the spot" trick reminds me of the popular
>   medicine topic: delivery system plus targeted activation
>   (SYB plus unsafeCoerce).
>
> ------------- message 2
>
>    fmap' :: (Data (f X)) => (a -> b) -> f a -> f b
>>   fmap' f x = markTheSpots (rec (wrap f)) x
>>     where
>>     markTheSpots :: (f X -> f X) -> (f a -> f b)
>>     markTheSpots f = unsafeCoerce . f . unsafeCoerce
>>     rec ::  (Data a) => (X -> X) -> a -> a
>>     rec f x = (gmapT (rec f) `extT` f) x
>>     wrap :: (a -> b) -> (X -> X)
>>     wrap f = unsafeCoerce . f . unsafeCoerce
>>
> ..
>
>> 1. We wrap the function parameter f, to make it look
>>   like a type-preserving function, on some private type X.
>>
>> 2. We mark the occurrences of the type constructor f's   parameter type,
>> by coercing 'f a' to 'f X' and 'f X' to
>>   'f b'.
>>
>> Then, we simply use SYB to apply f, when its type matches
>> the parameter, or to recurse into the subexpressions using
>> gmapT, otherwise. If X is private, f will only be applied
>> to the "functor parameter" _positions_ in 'f a', not to other
>> "functor parameter" _type_ occurrences in 'f a'
>>
>
> ..but f might not be applied at all, which leads to the first
> issue with this technique:
>
> I was surprised to see Data instances for (a->b) and IO a, since for such
> non-algebraic types, there isn't anything to gfoldl or gmap over. And those
> instances do indeed seem
> to offer very little functionality (not to mention the runtime
> errors..).
> For fmap'/traverse', this means that f will not be applied, and so it is
> not a good idea to coerce the types as if f had been applied (because the
> hidden parameter could be exposed after traversal, with changed type and
> unchanged
> representation!).. So we need to restrict the types of fmap'/traverse'.
> Which leads to two questions:
>
> - what is the rationale for having these non-functional
>   Data instances?
>
>   If one is to have 'instance Data (a->b)', is there a way
>   to make it more functional?
>
> - how can we capture algebraic types in a type (class)?
>
>   I thought that Data would do just that, being designed
>   to gfoldl over concrete data constructors, but apparently
>   not. And I don't really want to have a separate list of   all the types
> for which Data works, or of all the types   for which Data doesn't quite
> work.
>
> Claus
>
> ps. What is the right list for this topic?
>
>
> _______________________________________________
> Generics mailing list
> Generics at haskell.org
> http://www.haskell.org/mailman/listinfo/generics
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/generics/attachments/20080627/6ac374e6/attachment-0001.htm