Syb Renovations? Issues with Data.Generics

Ian Lynagh igloo at earth.li
Wed Jul 30 21:05:39 EDT 2008


On Thu, Jul 31, 2008 at 12:11:28AM +0100, Claus Reinke wrote:
> >On Tue, Jul 29, 2008 at 08:27:00PM +0100, Claus Reinke wrote:
> >>>>   My suggestion is to split this module into two, and stop the implicit
> >>>>   import/export of the incomplete instances from Data.Generics.
> >>>I don't think that this is a good idea.
> >>Could you please elaborate on your reasons?
> >That's what the rest of that e-mail was supposed to be.
> 
> You explained why the change would not give as much flexibility as
> one might think, or at least not as easily, but you didn't explain why
> you think it is a bad idea to gain at least the flexibility to choose
> between instance and no instance for the problematic cases.

Ah, I see.

There are two reasons why one might not want to have, e.g., an IO
instance. The first is to help the programmer find errors: if your
program relies on having the instance then it's probably buggy (unless
it only needs the instance so that deriving works for an enclosing
type).

I've done a bit of generic programming (although doubtless not as much
as others on this list) and haven't found this to be an issue.

Note that if you took a correct program without the instance, and added
the instance, then the program would still compile.


The other one is because you want to define your own instance. Hopefully
your instance differs from the existing one in some way (but if it's
moved into a different module then it's likely that people will recreate
the same empty instance because they don't realise that one already
exists).

If people start doing this then we will get libraries which cannot be
used in the same project, because their instances for the same type
would clash. I therefore don't think that we should make it possible
for people to do this.

> >By the way, is there something somewhere describing the alternate
> >instance that you want to define?
> 
> That is the whole point, isn't it? The Data framework isn't designed
> to cope with things like (a->b) or (IO a), so there are no good instances 
> one could define for these types

OK, I think I've missed your point then. I've just reread the message
you started this thread with, and you say this:

>     Pro: - the instances are still available, and only one explicit import
>                 away, so 'deriving instance Data' for types containing
>                 uninteresting functions is still convenient
> 
>             - the problematic instances are no longer implicitly imported,
>                 so applications that don't want these instances can now
>                 avoid them completely, or define their own instances
> 
>             - these convenience instances are not just inconvenient for
>                 some applications, due to the way instances are handled
>                 in Haskell; they actually violate some "natural" invariants
>                 like "everything queries every substructure of the specified
>                 type", "everywhere applies a transformation at every
>                 substructure of matching type"
> 
>             - the situation is similar to Text.Show.Functions, as the
>                 convenience instances don't provide the full expected
>                 functionality, just barely enough for deriving to get by
> 
>     Cons: - due to the implicit import and use of these instances,
>                     there is no obvious transition scheme; it seems that
>                     the least painful process would be to make the change
>                     without transition/deprecation period and to document
>                     the explicit import option
> 
>                 [it would be useful to have a way of deprecating instance
>                 imports, so that any deriving scheme depending on imports
>                 from a deprecated location would trigger a warning, in this
>                 case suggesting the new import location]

These are supposed to be pros and cons of moving the instances into
their own module, as opposed to the status quo, right?

If so, the first "pro" isn't a pro at all. Debatably it's a con - it
would make it less convenient to derive instance Data for types
containing IO etc.

For the second one, not having an instance doesn't help a program
(although it may help a programmer find his errors, which I agree is a
pro), and you say that you can't think what an alternative instance
would be. Even if you could, I still think that you shouldn't be able to
define an alternative instance, as I said above, so I think that that is
a con.

The third one is more-or-less "help a programmer find his errors" again,
I think?

And the fourth one is just an observation, neither a pro nor a con.

Incidentally, another con is that it means the instances have to be
orphan instances.


In my opinion, the benefits of moving the instances to their own module
do not outweigh the downsides.

> Scenario 2:
> 
>    Any attempt to use Data (a->b) or Data (IO a) indicates an
>    error. If we want to derive Data for complex structures containing 
>    those types, we need to define Data instances for the immediately
>    enclosing structures, or wrap those types in newtypes and define
>    Data instances for those. This is the case for which the incomplete
>    instances get in the way.

How do they "get in the way"? Do you mean the typechecker doesn't tell
you which instances you need to define by hand, because deriving worked?
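
To make the question concrete, here is a minimal sketch of Scenario 2
using only Data.Data from base. The names Callback, Config and countInts
are made up for illustration, and the hand-written instance merely
imitates the style of the incomplete instances; it is not copied from
Data.Generics.Instances.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE RankNTypes #-}
module Main (main) where

import Data.Data (Data (..), cast, mkNoRepType)

-- Hypothetical newtype wrapper around a function field (an assumption
-- for illustration, not a type from the thread).
newtype Callback = Callback (Int -> Int)

-- Hand-written "opaque" instance: no constructors and no children.
-- gfoldl keeps its default (gfoldl _ z = z), so generic traversals
-- never look inside the wrapped function.
instance Data Callback where
  toConstr _   = error "toConstr: Callback is opaque"
  gunfold _ _  = error "gunfold: Callback is opaque"
  dataTypeOf _ = mkNoRepType "Callback"

-- Deriving succeeds for the enclosing type only because every field
-- type, including Callback, has a Data instance in scope.
data Config = Config
  { retries :: Int
  , backoff :: Int
  , onRetry :: Callback
  } deriving (Data)

-- Count every Int that is reachable through the Data instances.
countInts :: Data a => a -> Int
countInts x = here + sum (gmapQ countInts x)
  where
    here = case (cast x :: Maybe Int) of
      Just _  -> 1
      Nothing -> 0

sample :: Config
sample = Config { retries = 3, backoff = 10, onRetry = Callback (+ 1) }

main :: IO ()
main = print (countInts sample)  -- prints 2
```

The traversal reports 2 and says nothing about the function field,
since the opaque instance exposes no children. If the Data Callback
instance is deleted, the deriving clause on Config is rejected for lack
of that instance, which is the compile-time signal that Scenario 2 asks
for.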

> PS. The situation is not improved by the current reexports of 
>    Data.Generics.Instances from unexpected places.

Instances are not reexported; they are global, and not just within a
project but across every library on Hackage: pick any two libraries,
and eventually someone is going to want them in the same program.

(That module also contains things like "instance Data Int", so it's not
surprising that lots of things need it)


Thanks
Ian


