Changes to Data.Typeable

Simon Marlow marlowsd at
Mon Jul 11 10:17:53 CEST 2011

On 08/07/2011 17:36, Gábor Lehel wrote:
> 2011/7/7 Simon Marlow<marlowsd at>:
>> On 07/07/11 17:14, Gábor Lehel wrote:
>>> On Thu, Jul 7, 2011 at 5:44 PM, Simon Marlow<marlowsd at>    wrote:
>>>> Hi folks,
>>>> In response to this ticket:
>>>> I'm making some changes to Data.Typeable, some of which affect the API,
>>>> so as per the new library guidelines I'm informing the list.
>>>> The current implementation of Typeable is based on
>>>>   mkTyCon :: String -> TyCon
>>>> which internally keeps a table mapping Strings to Ints, so that each
>>>> TyCon can be given a unique Int for fast comparison.  This means the
>>>> String has to be unique across all types in the program.  Currently
>>>> derived instances of Typeable use the qualified original name (e.g.
>>>> "GHC.Types.Int") which is not necessarily unique, is non-portable, and
>>>> exposes implementation details.
>>>> The String passed to mkTyCon is returned by
>>>>   tyConString :: TyCon -> String
>>>> which lets the user get at this non-portable representation (also the
>>>> Show instance returns this String).
>>>> So the new proposal is to store three Strings in TyCon.  The internal
>>>> representation is this:
>>>> data TyCon = TyCon {
>>>>    tyConHash    :: {-# UNPACK #-} !Fingerprint,
>>>>    tyConPackage :: String,
>>>>    tyConModule  :: String,
>>>>    tyConName    :: String
>>>>   }
>>>> the fields of this type are not exposed externally.  Together the three
>>>> fields tyConPackage, tyConModule and tyConName uniquely identify a TyCon,
>>>> and the Fingerprint is a hash of the concatenation of these three Strings
>>>> (so no more internal cache to map strings to unique Ids). tyConString now
>>>> returns the value of tyConName only.
>>>> I've measured the performance impact of this change, and as far as I can
>>>> tell performance is uniformly better.  This should improve things for SYB
>>>> in particular.  Also, the size of the code generated for deriving Typeable
>>>> is less than half as much as before.
>>>> === Proposed API changes ===
>>>> 1. DEPRECATE mkTyCon
>>>>    mkTyCon is used by some hand-written instances of Typeable.  It
>>>>    will work as before, but is deprecated in favour of...
>>>> 2. Add
>>>>    mkTyCon3 :: String -> String -> String -> TyCon
>>>>    which takes the package, module, and name of the TyCon respectively.
>>>>    Most users can just derive Typeable, there's no need to use mkTyCon3.
>>>> In due course we can rename mkTyCon3 back to mkTyCon.
>>>> Any comments?
>>>> Cheers,
>>>>         Simon
>>> Would this also mean typeRepKey could be taken out of the IO monad?
>>> That would be nice.
>> Ah yes, I forgot to mention the changes to typeRepKey.  So currently we have
>>   typeRepKey :: TypeRep -> IO Int
>> This API is difficult to support in the new library: I'd have to reintroduce
>> the cache, and it wouldn't be very efficient.  I plan to change it to this:
>>   data TypeRepKey -- abstract, instance of Eq, Ord
>>   typeRepKey :: TypeRep -> IO TypeRepKey
>> where TypeRepKey is a newtype of the internal Fingerprint.  Now, we could
>> take typeRepKey out of IO, but the Ord instance of TypeRepKey is
>> implementation-defined (it provides some total order, but we don't tell you
>> what it is).  So arguably we should keep the IO.  What do people think?
> Would the order be allowed to vary from run to run of the program
> (which is why it's in IO now)? Could it be specified as
> implementation-defined but non-varying? If so, I would favor that
> option along with taking it out of IO. (Plenty of things are
> implementation-defined, like the size of an Int.)

Yes, it's implementation-defined but non-varying.  I know some people 
have objected to these things being outside the IO monad before, but 
there is already plenty of precedent (System.Info.os, size of Int, 

However, if we take it out of IO then it may limit the possible 
implementations.  Would the previous implementation, in which keys were 
assigned at runtime, still be valid?  It is still implementation-defined 
and non-varying, but only over a single run.
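
To make the difference concrete, here is a minimal sketch of a hash-based
key, loosely modelled on the TyCon record quoted above.  The names
TyConSketch and mkTyConSketch are hypothetical, and joining the three
Strings with a space before hashing is an assumption made for illustration,
not necessarily what the library does.  The point is only that the key is a
pure function of the three names, so it cannot vary from run to run, unlike
a key handed out by a runtime table.

```haskell
import GHC.Fingerprint (Fingerprint, fingerprintString)

-- Hypothetical stand-in for the proposed TyCon record.
data TyConSketch = TyConSketch
  { sketchHash    :: !Fingerprint  -- hash of package, module and name
  , sketchPackage :: String
  , sketchModule  :: String
  , sketchName    :: String
  }

-- Analogue of mkTyCon3: the fingerprint depends only on the three Strings.
-- The space separator is an assumption of this sketch.
mkTyConSketch :: String -> String -> String -> TyConSketch
mkTyConSketch pkg modl name =
  TyConSketch (fingerprintString (pkg ++ " " ++ modl ++ " " ++ name))
              pkg modl name

-- Equality and ordering look only at the fingerprint, so no global
-- String-to-Int table is needed.
instance Eq TyConSketch where
  a == b = sketchHash a == sketchHash b

-- Some total order, but an implementation-defined one.
instance Ord TyConSketch where
  compare a b = compare (sketchHash a) (sketchHash b)
```

Two values built from the same three Strings compare equal in any run of
any program, which is the "non-varying" property under discussion; the
order between different TyCons is whatever the hash happens to give.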

> Albeit, the use case I had in mind was using Template Haskell to
> construct a case statement over the literal Int values of the keys as
> determined at compile time (hopefully compiling down to something like
> a C switch statement), and I'm not sure if that's going to work if the
> keys are no longer Ints. (That it wouldn't compile down to a switch
> statement is one thing, but I'm not sure if the code would literally
> be possible to write. Maybe it'd need a Lift instance?) Anyway, I
> don't think it would hurt to take it out of IO if given the
> opportunity, either way.

The keys are 128-bit hashes, so it might still be possible to do 
something like this, but you would need access to the internal 
representations.  I'm planning to expose these via 
Data.Typeable.Internal (no guarantees about stability of this API, however).
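
As a rough illustration of dispatching on such keys, the sketch below uses
typeRepFingerprint, which later versions of base export from Data.Typeable,
together with the Fingerprint constructor from GHC.Fingerprint.  This is
not the Data.Typeable.Internal interface planned above, and the helper
names typeKey, keyWords and describe are hypothetical.  It uses a run-time
lookup table rather than a Template Haskell generated case expression.

```haskell
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, typeOf, typeRepFingerprint)
import Data.Word (Word64)
import GHC.Fingerprint (Fingerprint (..))

-- A pure key for the type of a value: no IO involved.
typeKey :: Typeable a => a -> Fingerprint
typeKey = typeRepFingerprint . typeOf

-- The 128-bit key is stored as two 64-bit words.
keyWords :: Fingerprint -> (Word64, Word64)
keyWords (Fingerprint w1 w2) = (w1, w2)

-- Dispatch on the key with an association list.
describe :: Typeable a => a -> String
describe x = fromMaybe "unknown" (lookup (typeKey x) table)
  where
    table =
      [ (typeKey (0 :: Int), "Int")
      , (typeKey True,       "Bool")
      , (typeKey "",         "String")
      ]
```

Here describe (3 :: Int) yields "Int" and describe 'c' yields "unknown".
Matching on literal key values at compile time would additionally need the
two words from keyWords to be spliced into the generated code.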

