[Hs-Generics] A Lightweight Implementation of Generics and Dynamics (LIGD)

Thu Oct 19 15:19:11 EDT 2006

Hi Bruno (and all),

Sorry for being so sluggish---but I just wanted to chime in with my  
comments on Bruno's summary.

On Oct 3, 2006, at 7:47 AM, Bruno Oliveira wrote:

> Hello all,
>
> ====================================================================== 
> =
> Approach: A Lightweight Implementation of Generics and Dynamics
>
> Required features:
>    - Haskell 98 + existential types
>
> Usage:
>    - Library Writer: Responsible for the sums-of-products machinery,
>      the representation(s) data type(s) to be defined in the  
> library and
>      defining the class and standard instances for "Representable".
>      The library writer needs to be have a deep knowledge about  
> generic
>      programming.
>    - Power User: Defines the boilerplate for new data types. Needs  
> to be fairly
>      familiar with the implementation.
>    - User: defines generic functions by pattern matching on the  
> structure of "Rep".
>      This requires that he is comfortable with sums of products and  
> isomorphisms.
>      For convenience, the user should also provide a definition  
> that uses the
>      type class "Representable", which avoids the representations  
> to be passed
>      explicitly.
>    - End User: If "Representable" is used, then the end can just  
> use generic functions
>      in the same way as any other ad-hoc (that is, type class  
> overloaded) functions.
>

I'm trying to get these roles straight in my head. I think we're  
going to have a similar division for many of the library proposals.

A Generic library has several important components: the "framework"  
or way of representing a particular type, instances of this framework  
for the primitive types, helper functions for creating instances for  
new types, and a base set of generic operations/support code to get  
things started. The "Library writer" is responsible for all of these.

A user of the generic library may want to do two things:
   (a) define new type-indexed functions
   (b) ensure that their datatypes can be used with type-indexed  
functions
In Bruno's terminology,
   - Power User: does both of these
   - User: does (b) but not (a)
   - End User: does neither, just calls existing type-indexed  
functions on existing datatypes.

Note that we could imagine eliminating/automating (b) for some  
platforms with Template Haskell.

> Extensibility:
>    - Data types are non-extensible, however we can try to combine the
>      approach with type classes to get some form of extensibility  
> (RepLib
>      proposes one solution like this).
>

I think I prefer the term "Specializability" (even though it is more  
of a mouthful.) Generic
functions can be specialized at particular datatypes to specific  
operations.
(LIGD does not support this behavior, but could be extended to do so  
by pushing RepLib through the GADT translation.)

> Performance considerations:
>    - Sums of products based and therefore involving considerable
>      performance penalties.
>

Besides the runtime types and sums-of-products explosion, there is  
also the issue of applying the identity function type coercions (if  
they are not optimized away). Those at least would be eliminated by  
using a GADT.

Aside: should we be developing a benchmark suite if we are actually  
going to be making decisions about performance? I don't want to guess  
here. Given that we will be collecting versions of several functions  
for each library, we should be able to run sort of "Library shootout".

I also would like to know if there are some specific Haskell  
extensions that could dramatically improve performance. These  
extensions would not be portable, of course, but perhaps for specific  
platforms we could support optimized execution.

> Expressibility:
>    - Can do both producer and consumer functions.
>    - Multiple representations needed for generic functions of  
> different
>      arities;
>    - No local redefinitions.

I think we should separate expressibility of the generic functions  
and expressibility wrt to what type information can index such  
functions.

For the former we talk about producer/consumer functions (read/show),  
functions of different arities (map,zip), functions defined by higher- 
order types (map, count, reduce), functions requiring type-indexed  
types (the zipper, bit-packed arrays, generalized tries).

For the latter, we talk about the ability to use the library with  
types that contain
   records
   nested datatypes
   higher-kinded types (what kinds?)
   datatypes with existential components (both of kind type and  
higher kinds)
   datatypes with first-class polymorphic components (both of kind  
type and higher kinds)
   above with class constraints
   GADTs (simple, and those that subsume existentials...)
   datatypes with higher-kinded type arguments

(I see that Johan also mentions this division in his email.)

What do you mean by "local redefinitions"?

>
> Boilerplate:
>    - Isomorphisms between sums of products and data types
>    - Instances of Representable
>    - Smart constructors for representation of data types
>
> Helpful extra features:
>    - GADTs (already implemented in GHC) allow more direct definitions
>      of generic functions, since there is no need to apply the type
>      coercions explicitly;

>    - A boilerplate generation mechanism. This would effectively mean
>      that power users would not be necessary.

>    - Open data types and functions in order for extensibility.

>    - Kind polymorphism, for eliminating the need for multiple
>      representations;
>

I don't think that kind polymorphism would solve this problem---the  
multiple representations are needed for functions (like map) that  
have different arities of kind-indexed types.

Kind polymorphism would help a little, but it wouldn't be completely  
useful without some form of kind-indexed types. (The representations  
of types with higher kinds is not parametric in those kinds, nor is  
the behavior of type-indexed operations on those representations.)

> Discussion:
>     An interesting approach which is quite lightweight and that is  
> fairly easy to
> use (specially if we would not need to  handle any of the  
> boilerplate). It is a bit
> outdated because with GADTs, the use can be slightly simplified.  
> However, this also
> takes us further away from Haskell 98 or even Haskell' (since GADTs  
> will not be there).

I think the case for GADTs in Haskell' would be strengthened if there  
were other Haskell compilers than GHC that implement GADTs. Does  
anyone know of any?

> I think that LIGD is still a reference for a generic programming  
> library that uses
> a data type for type representations and sums of products as a  
> family of data types.
>    A drawback may be performance, since we need to convert a data  
> type into its
> isomorphic sum of products representation.
> =============================================================
>
> Cheers,
>
> Bruno

Thanks,
Stephanie

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2427 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/generics/attachments/20061019/39f5b7b9/smime-0001.bin