Splitting SYB from the base package in GHC 6.10

Claus Reinke claus.reinke at talk21.com
Mon Sep 1 15:04:23 EDT 2008


> These instances are defined in such a way that they do not traverse the
> datatype. In fact, there is no other possible implementation, and this
> implementation at least allows for datatypes which contain both "regular" and
> "dubious" elements to still have their "regular" elements traversed.
> However, this implies that a user cannot redefine such instances even in the
> case where s/he knows extra information about these types that would allow for
> a more useful instance definition, for instance.

|These two statements appear to be contradictory.  Perhaps an example of
|a possible instance would help.

"no other possible implementation" is an overstatement, though an easy
one to make: those 'Data' instances are incomplete because better instances
are hard to come by. One can perhaps make small improvements, like replacing
the effective 'gmapT = id' for 'IO a' and 'b -> a' with something like[1]:

    gmapT f fun = f . fun
    -- instead of gmapT f fun = fun

    gmapT f io = (return . f) =<< io
    -- instead of gmapT f io = io

but that still doesn't make those instances complete. If it weren't for
the partial uses, like skipping 'IO a' and 'b -> a' as parts of derived
'Data' instances, one wouldn't want these instances at all, imho (at
least not in their current form).
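
To make the two 'gmapT' variants above concrete, here is a minimal sketch.
It assumes hypothetical newtype wrappers 'Fun' and 'Act' (not part of any
library) so that it does not clash with existing instances for 'b -> a' and
'IO a'; 'incInt' is likewise just an illustrative helper. 'gunfold' and
'toConstr' still fail at runtime, which is the incompleteness discussed above.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Data
import Data.Maybe (fromMaybe)

-- Hypothetical wrappers standing in for 'b -> a' and 'IO a'.
newtype Fun b a = Fun (b -> a)
newtype Act a   = Act (IO a)

instance (Typeable b, Data a) => Data (Fun b a) where
  gmapT f (Fun g) = Fun (f . g)                  -- transform the result
  gunfold _ _ _   = error "gunfold: Fun has no constructors"
  toConstr _      = error "toConstr: Fun has no constructors"
  dataTypeOf _    = mkNoRepType "Fun"

instance Data a => Data (Act a) where
  gmapT f (Act io) = Act ((return . f) =<< io)   -- transform the produced value
  gunfold _ _ _    = error "gunfold: Act has no constructors"
  toConstr _       = error "toConstr: Act has no constructors"
  dataTypeOf _     = mkNoRepType "Act"

-- Illustrative generic transformation: increment an Int, leave
-- values of any other type unchanged.
incInt :: Data d => d -> d
incInt x = case cast x of
  Just n  -> fromMaybe x (cast (n + 1 :: Int))
  Nothing -> x
```

With these definitions, 'gmapT incInt (Fun (* 2) :: Fun Int Int)' yields a
wrapped function whose results have been incremented, whereas the
'gmapT = id' behaviour would return the function unchanged.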

Then there are abstract types, for which the current default when
implementing reflection is to assume "no constructors", hence no
basis for 'gunfold', hence more incomplete 'Data' instances and
runtime errors. It might be possible to experiment with associating
exactly one, abstract, constructor with each abstract type instead,
but that isn't something I'd like to bake in without more experience.
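
The difference between "no constructors" and "exactly one abstract
constructor" can be sketched as follows. 'OpaqueA' and 'OpaqueB' are
hypothetical stand-ins for abstract types (imagine their constructors are
not exported), and the constructor name "<OpaqueB>" is an arbitrary choice
made for this illustration only.

```haskell
import Data.Data

-- Hypothetical abstract types.
newtype OpaqueA = OpaqueA Int
newtype OpaqueB = OpaqueB Int

-- Current default: no constructors, so reflection and 'gunfold' both fail.
instance Data OpaqueA where
  toConstr _    = error "toConstr: OpaqueA is abstract"
  gunfold _ _ _ = error "gunfold: OpaqueA is abstract"
  dataTypeOf _  = mkNoRepType "OpaqueA"

-- Experiment: associate exactly one abstract constructor with the type.
opaqueBConstr :: Constr
opaqueBConstr = mkConstr opaqueBType "<OpaqueB>" [] Prefix

opaqueBType :: DataType
opaqueBType = mkDataType "OpaqueB" [opaqueBConstr]

instance Data OpaqueB where
  toConstr _    = opaqueBConstr
  -- Still partial: knowing the constructor does not tell us how to
  -- build a value of an abstract type.
  gunfold _ _ _ = error "gunfold: OpaqueB is abstract"
  dataTypeOf _  = opaqueBType
```

In this sketch 'toConstr' and 'dataTypeOf' become total for 'OpaqueB', but
'gunfold' remains a runtime error, so the experiment addresses reflection
only.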

Another way to look at it:

    'Data' tries to do too much in a single class, and the consequences
    are all those half-implemented 'Data' instances. The probable
    long-term solution is to split 'Data' into 2 or 3 classes,

so that we know that

    (a) any type instantiating 'DataGfoldl' really supports 'gfoldl'
    (b) any type instantiating 'DataGunfold' really supports 'gunfold'
    (c) any type instantiating 'DataReflect' really supports 'Data' reflection

Currently, too many types instantiate 'Data' without supporting (b)
or (c) (-> runtime errors), and a few instances don't even support (a).
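
One possible shape for such a split is sketched below. The class names come
from the list above; the method signatures are an assumption, copied from
the corresponding 'Data' methods, and 'P', 'Id', 'gmapT' and 'bump' are
illustrative helpers only. This is not a worked-out design.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Data (Constr, DataType)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

class Typeable a => DataGfoldl a where
  gfoldl :: (forall d b. DataGfoldl d => c (d -> b) -> d -> c b)
         -> (forall g. g -> c g) -> a -> c a

class Typeable a => DataGunfold a where
  gunfold :: (forall b r. DataGunfold b => c (b -> r) -> c r)
          -> (forall r. r -> c r) -> Constr -> c a

class Typeable a => DataReflect a where
  toConstr   :: a -> Constr
  dataTypeOf :: a -> DataType

-- A small example type that supports traversal.
data P = P Int Int deriving (Eq, Show)

instance DataGfoldl Int where gfoldl _ z = z
instance DataGfoldl P   where gfoldl k z (P a b) = (z P `k` a) `k` b

-- 'gmapT' needs only the 'gfoldl' part, so it asks only for 'DataGfoldl'.
newtype Id x = Id { unId :: x }

gmapT :: DataGfoldl a => (forall d. DataGfoldl d => d -> d) -> a -> a
gmapT f = unId . gfoldl (\(Id c) x -> Id (c (f x))) Id

-- Illustrative transformation: increment an Int, leave other types alone.
bump :: DataGfoldl d => d -> d
bump x = case cast x of
  Just n  -> fromMaybe x (cast (n + 1 :: Int))
  Nothing -> x
```

Under this arrangement a type without usable constructors would simply have
no 'DataGunfold' instance, so an attempt to unfold it would be rejected at
compile time rather than failing at runtime.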

All of which suggests that 'Data' should probably leave 'base',
as it needs to evolve further?

|Claus argued that -> and the monads could be treated by analogy
|with Show for these types.

I had mentioned 'Text.Show.Functions' as an example of "improper"
instances provided for optional import to support 'deriving Show'.
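
For reference, a small sketch of that 'Show' arrangement: importing
'Text.Show.Functions' brings a placeholder 'Show (a -> b)' instance into
scope, which is enough for 'deriving Show' to go through on a type with a
function field. 'Handler' and its field names are made up for this example.

```haskell
import Text.Show.Functions ()  -- instance only: shows any function as <function>

-- Hypothetical record with a function-typed field.
data Handler = Handler
  { label :: String
  , run   :: Int -> Int
  } deriving Show

example :: String
example = show (Handler "inc" (+ 1))
```

The function field is rendered as the placeholder '<function>'; nothing
about the function itself is shown, which is what makes the instance
"improper".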

But when I read your sentence, my first thought was: perhaps there's
also a way to apply the showList trick? That would neatly avoid either
changing the 'deriving' mechanism or having dummy instances.
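
As a reminder of what the showList trick looks like, here is a minimal
sketch using a hypothetical 'Pretty' class in place of 'Show'. How (or
whether) the same idea carries over to 'Data' is left open here; the code
only illustrates the trick itself.

```haskell
import Data.List (intercalate)

-- The class carries an extra method for lists of its elements, with a
-- default, so one element type can override how lists of it are handled.
class Pretty a where
  pretty     :: a -> String
  prettyList :: [a] -> String
  prettyList xs = "[" ++ intercalate "," (map pretty xs) ++ "]"

instance Pretty Int where
  pretty = show

instance Pretty Char where
  pretty c     = [c]
  prettyList s = "\"" ++ s ++ "\""   -- strings get special treatment

-- A single list instance delegates to the element type's choice,
-- so no overlapping instance for [Char] is needed.
instance Pretty a => Pretty [a] where
  pretty = prettyList
```

Here 'pretty "hi"' uses the 'Char' override while 'pretty [1, 2 :: Int]'
uses the default, all through the one list instance.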

More reason for moving everything to 'syb', keeping it flexible
for a while.

|There is an additional problem with types like ThreadId, Array, ST, STM,
|TVar and MVar: they're notionally defined in other packages, even though
|they're actually defined in partially-hidden GHC.* modules in base and
|re-exported.

Would it be sufficient for 'syb' to depend on both 'base' and those
notional source packages? It would be useful to keep the instances
in 'syb' until the 'Data' story has settled down, after which the instances
ought to move to their 'data' type source packages.

Claus

[1] http://www.haskell.org/pipermail/libraries/2008-July/010319.html