[Hs-Generics] Syb Renovations? Issues with Data.Generics
Claus Reinke
claus.reinke at talk21.com
Mon Jul 28 14:13:08 EDT 2008
Calling all Syb/Data.Generics users!-)
I keep running into problems with Data.Generics, mostly because I
actually want to use it (no claims that it is the best or final solution, or
that other approaches aren't equally in need of support, just that it is
the best-supported working approach right now).
Some tricky issues are (sometimes against published expectations)
solvable, suggesting useful additions to the library, but some seemingly
trivial things have me stumped, suggesting (to me, at least;-) a need for
improvements either in the library or in its documentation.
Part of the reason I'm interested in this now is that Data/Typeable
instances seem likely (I hope:-) to be added to the GHC Api, where
Thomas Schilling is working on improvements
http://hackage.haskell.org/trac/ghc/wiki/GhcApiStatus
http://hackage.haskell.org/trac/ghc/wiki/GhcApiAstTraversals
Also, the old question of porting HaRe to the GHC Api is currently
being looked into again, by Chaddaï Fouché, and crucially depends
on Syb's generic traversals.
As it is still holiday season, it is a bit early for proposal deadlines,
but I'd like to start a discussion of Syb/Data.Generics and collect the
issues and solutions arising, in the hope of following up with concrete
proposals for improvements. To start the discussion, a simple item:
1. inconvenient convenience instances of Data for non-"data" types
Data.Generics.Instances defines instances of Data for many
types, including some abstract types that don't really fit into
the concrete value based model of Data, like 'IO a' and 'a->b'.
Those instances give runtime errors for some class methods,
and mainly offer faked (no-op) gmap traversals, serving as a
convenience/enabler for 'deriving instance Data':
http://www.haskell.org/pipermail/generics/2008-June/000346.html
A list of the odd instances in Data.Generics.Instances, with
examples of their oddities, can be found here:
http://www.haskell.org/pipermail/generics/2008-June/000347.html
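To make the problem concrete, here is a minimal sketch of such a no-op instance. It does not reproduce the library's actual code: Callback, Config, example and countInts are hypothetical names invented for the illustration, and only Data.Data from base is used (assuming GHC 7.10 or later, where Typeable instances are automatic).

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data

-- Hypothetical stand-in for an abstract type such as 'a -> b'.
newtype Callback = Callback (Int -> Int)

-- A hand-written "convenience" instance: traversals do nothing,
-- and the reflective methods fail at runtime.
instance Data Callback where
  gfoldl _ z c  = z c                       -- no children are exposed
  gunfold _ _ _ = error "gunfold: Callback"
  toConstr _    = error "toConstr: Callback"
  dataTypeOf _  = mkNoRepType "Callback"

-- Deriving Data for Config only works because Callback has an instance.
data Config = Config Int Callback
  deriving Data

example :: Config
example = Config 1 (Callback (+ 41))

-- Count every Int node reachable through the Data instances
-- (a local equivalent of a bottom-up counting query).
countInts :: Data a => a -> Int
countInts x = here + sum (gmapQ countInts x)
  where
    here = maybe 0 (const 1) (cast x :: Maybe Int)
```

The derived instance for Config type-checks, and queries simply skip the Callback field, but any generic function that calls toConstr on every node will hit the runtime error once it reaches that field.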
My suggestion is to split this module into two, and stop the implicit
import/export of the incomplete instances from Data.Generics.
Reactions to this suggestion have been muted so far (Simon PJ was
as surprised as I was about the existence of these instances, but has
no strong opinion about the issue, Alexey Rodriguez supports the
suggestion, Ian Lynagh points out the difficulty of transition), which
is one reason why I'll try to move the discussion to libraries at .
Pro: - the instances are still available, and only one explicit import
away, so 'deriving instance Data' for types containing
uninteresting functions is still convenient
- the problematic instances are no longer implicitly imported,
so applications that don't want these instances can now
avoid them completely, or define their own instances
- these convenience instances are not just inconvenient for
some applications, due to the way instances are handled
in Haskell; they actually violate some "natural" invariants
like "everything queries every substructure of the specified
type", "everywhere applies a transformation at every
substructure of matching type"
- the situation is similar to Text.Show.Functions, as the
convenience instances don't provide the full expected
functionality, just barely enough for deriving to get by
Cons: - due to the implicit import and use of these instances,
there is no obvious transition scheme; it seems that
the least painful process would be to make the change
without transition/deprecation period and to document
the explicit import option
[it would be useful to have a way of deprecating instance
imports, so that any deriving scheme depending on imports
from a deprecated location would trigger a warning, in this
case suggesting the new import location]
As I said, I'd like to wait until at least the Syb authors are back from
holidays before setting any proposal deadlines, but I'd like to invite
feedback from Syb users on this and other Syb issues. Here is a
preview on other items I'd like to raise later on, please add your own:
2. Data.Generics.Utils
Since Data/Typeable are compiler-derivable (in GHC) while other
classes like Functor/Traversable/etc are not, it would be useful if
generic instances for those other classes could be defined in terms
of Data/Typeable.
The Uniplate library already does this for its own classes via
Data.Generics.PlateData, and it appears that at least Functor is
definable as well (code exists, proof is only informal at this stage,
and those invariant violations and runtime errors in the implicitly
imported dummy instances from (1) really get in the way):
http://www.haskell.org/pipermail/generics/2008-June/000343.html
http://www.haskell.org/pipermail/generics/2008-July/000349.html
http://www.haskell.org/pipermail/generics/2008-July/000351.html
What other classes can be defined in this way? Traversable
(traverse) seems very nearly possible, what else?
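As a rough illustration of the idea (not the code from the linked messages), a type-preserving "map over all values of type a" can be written against Data/Typeable alone. It is weaker than fmap: it cannot change the element type, and it touches every value of the matching type, not only those in the parameter position. The names liftT, bottomUp, mapAll and Tagged are made up for this sketch; liftT and bottomUp are local counterparts of what Syb calls mkT and everywhere.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE RankNTypes #-}

import Data.Data
import Data.Maybe (fromMaybe)

-- Turn a type-specific function into a generic one that acts as
-- 'id' on every other type.
liftT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
liftT f = fromMaybe id (cast f)

-- Apply a generic transformation at every node, children first.
bottomUp :: Data a => (forall b. Data b => b -> b) -> a -> a
bottomUp f x = f (gmapT (bottomUp f) x)

-- Map over every value of type 'a' inside a structure of type 'b'.
mapAll :: (Typeable a, Data b) => (a -> a) -> b -> b
mapAll f = bottomUp (liftT f)

-- Hypothetical container with an Int field next to its parameter.
data Tagged a = Tagged Int a
  deriving (Data, Eq, Show)
```

With these definitions, mapAll ((+ 1) :: Int -> Int) [1, 2, 3 :: Int] gives [2, 3, 4], as fmap would. On Tagged 0 (5 :: Int), however, it gives Tagged 1 6, because the tag has the same type as the element. That difference is one reason a real Functor definition needs more care than this sketch takes.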
3. Performance
Naive use of Syb traversal schemes can lead to huge performance
losses. Experienced users tend to write their own traversal schemes,
using Syb's low-level Api directly, but we can take inspiration from
some Uniplate/PlateData optimization techniques and generalise
them for use with Syb's high-level traversal scheme Api, yielding
similar performance gains for everywhere/everything:
http://www.haskell.org/pipermail/generics/2008-July/000353.html
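One simple hand-written variant, shown here only as a sketch, is a traversal that refuses to descend into subterms known to be uninteresting, for instance Strings when the transformation only concerns Ints. It is similar in spirit to Syb's everywhereBut, but it is not the optimisation described in the message linked above. The names bottomUpBut, isString and incInts are assumptions of this example, and only Data.Data from base is used.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Data
import Data.Maybe (fromMaybe, isJust)

-- Bottom-up transformation that leaves a subterm untouched, and does
-- not descend into it, whenever the 'stop' predicate holds.
bottomUpBut
  :: Data a
  => (forall b. Data b => b -> Bool)  -- where to stop
  -> (forall b. Data b => b -> b)     -- generic transformation
  -> a -> a
bottomUpBut stop f x
  | stop x    = x
  | otherwise = f (gmapT (bottomUpBut stop f) x)

-- True exactly for nodes of type String.
isString :: Data a => a -> Bool
isString x = isJust (cast x :: Maybe String)

-- Increment Ints, leave every other type alone.
incInts :: Data a => a -> a
incInts = fromMaybe id (cast ((+ 1) :: Int -> Int))
```

For a value such as ("label", [1, 2, 3 :: Int]), bottomUpBut isString incInts returns ("label", [2, 3, 4]) without visiting the individual characters of the string. The price is one extra cast per visited node, so whether this pays off depends on how much of the structure can be skipped.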
Another direction that might be worth exploring is to use Maps
instead of nested generic extensions to define ad-hoc overloaded
transformations and queries (I've actually started playing with that,
but am currently stuck on GHC ticket #2463).
4. Usability
There is probably nothing one can do to make the types of
Syb's low-level Api less of a brain hazard, but not all of the
stumbling blocks seem to be necessary consequences of the
carefully crafted edifice of interactions between nearly polymorphic
types, runtime type checks and type reflection. Examples:
- there doesn't seem to be a way to get hold of a type's
constructors, only of constructor representations, structure
scaffolds, and structure generators
- the actual domain on which a transformation/query acts
is hidden behind the near-polymorphic default type of
generic extensions
- I can't seem to figure out how to use typeOf1, when the
other Syb operations only give me 'forall a . Data a => a';
instead, I seem to be forced to use something like:
    [ mkTyConApp tyCon (init tyArgs) | not (null tyArgs) ]
      where (tyCon,tyArgs) = splitTyConApp typeRep
- others?
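Two of these points can be illustrated with a small sketch that uses only Data.Data from base. The helper names constructorNames and isMaybeNode are invented for the example. Rather than rebuilding a TypeRep with mkTyConApp as in the snippet above, the second helper sidesteps the typeOf1 problem by comparing only the head type constructor, which may or may not be enough for a given use.

```haskell
import Data.Data

-- What the library hands out are constructor *representations*;
-- here they are rendered as names. The argument is never evaluated.
-- Only meaningful for algebraic data types.
constructorNames :: Data a => a -> [String]
constructorNames = map showConstr . dataTypeConstrs . dataTypeOf

-- Given only 'Data a => a', decide whether the node's type is an
-- application of Maybe, by comparing head type constructors.
isMaybeNode :: Data a => a -> Bool
isMaybeNode x =
  typeRepTyCon (typeOf x) == typeRepTyCon (typeRep (Proxy :: Proxy Maybe))
```

For example, constructorNames (undefined :: Maybe Int) yields ["Nothing","Just"], and isMaybeNode (Just 'x') is True while isMaybeNode "x" is False. Neither helper yields the constructor functions themselves, which is the gap the first bullet complains about.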
What are your personal gripes with Syb/Data/Typeable,
and for which of them do you see a chance of addressing
them by changing/adding code?
Claus