default roles

Simon Peyton-Jones simonpj at microsoft.com
Thu Oct 10 09:49:38 UTC 2013


The way I've been describing GND all along has been an abbreviation. GHC does not coerce a dictionary from, say, Ord Int to Ord Age. Instead, GHC mints a fresh dictionary for Ord Age where all the methods are implemented as coerced versions of the methods for Ord Int. (I'm not sure why it's implemented this way, which is why I've elided this detail in just about every conversation on the topic.) With this in mind, I have a proposal:

1) All parameters of all classes have nominal role.
2) Classes also store one extra bit per parameter, saying whether all uses of that parameter are representational. Essentially, this bit says whether that parameter is suitable for GND. (Currently, we could just store for the last parameter, but we can imagine extensions to the GND mechanism for other parameters.)

Yes, this seems right.  And NOW I finally realise why GHC implements GND like this.  Consider

  class Show a => C a where { op :: a -> a }
  instance C Int where ...
  newtype Age = Age Int deriving( Show, C )

Here we want to make a (C Age) dictionary that use the (C Int) version of 'op'.  But the superclass for (Show Age) must not use the (Show Int) version of 'show'!  It must use Age's own version of Show.

So we I think Richard's proposal is spot on.  Go for it.   Can you work on that, Richard?

Simon

From: Glasgow-haskell-users [mailto:glasgow-haskell-users-bounces at haskell.org] On Behalf Of Richard Eisenberg
Sent: 09 October 2013 20:21
To: Edward Kmett
Cc: Simon Peyton-Jones; glasgow-haskell-users at haskell.org Mailing List
Subject: Re: default roles

Now I think we're on the same page, and I *am* a little worried about the sky falling because of this. (That's not a euphemism -- I'm only a little worried.)

Well, maybe I should be more worried.

The whole idea of roles is to protect against type-unsoundness. They are doing a great job of that here -- no problem that we've discussed in this thread is a threat against type safety.

The issue immediately at hand is about coherence (or perhaps you call it confluence) of instances. Roles do not address the issue of coherence at all, and thus they fail to protect against coherence attacks. It would take More Thought to reformulate roles (or devise something else) to handle coherence.

It's worth pointing out that this isn't a new problem, exactly. Bug #8338 shows a way to produce incoherence using only the GADTs extension. (It does need 4 modules, though.) I conjecture that incoherence is also possible through GeneralizedNewtypeDeriving, both as it existed in GHC 7.6.3 and in 7.8, so it's not an issue with Coercible, exactly. It's just that Coercible allows you to get incoherence with so much less fuss than before!

Wait! I have an idea!
The way I've been describing GND all along has been an abbreviation. GHC does not coerce a dictionary from, say, Ord Int to Ord Age. Instead, GHC mints a fresh dictionary for Ord Age where all the methods are implemented as coerced versions of the methods for Ord Int. (I'm not sure why it's implemented this way, which is why I've elided this detail in just about every conversation on the topic.) With this in mind, I have a proposal:

1) All parameters of all classes have nominal role.
2) Classes also store one extra bit per parameter, saying whether all uses of that parameter are representational. Essentially, this bit says whether that parameter is suitable for GND. (Currently, we could just store for the last parameter, but we can imagine extensions to the GND mechanism for other parameters.)

Because GND is implemented using coercions on each piece instead of wholesale, the nominal roles on classes won't get in the way of proper use of GND. An experiment (see below for details) also confirms that even superclasses work well with this idea -- the superclasses aren't coerced.

Under this proposal, dictionaries can never be coerced, but GND would still seem to work.

Thoughts?

Richard

Experiment:

newtype Age = MkAge Int

instance Eq Age where
  _ == _ = False

deriving instance Ord Age

useOrdInstance :: Ord a => a -> Bool
useOrdInstance x = (x == x)

What does `useOrdInstance (MkAge 5)` yield? It yields `False` (in HEAD). This means that the existing GND mechanism (I didn't change anything around this part of the code) uses superclass instances for the *newtype*, not for the *base type*. So, even with superclasses, class dictionaries don't need to be coerced.

On Oct 9, 2013, at 2:52 PM, Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>> wrote:


I'd be happy to be wrong. =)

We do seem to have stumbled into a design paradox though.

To make it so you can use roles in GeneralizedNewtypeDeriving hinges on the parameter's role being representational, but making it representational means users can also use coerce to turn dictionaries into other dictionaries outside of GND.

This is quite insidious, as another dictionary for Eq or Ord may exist for that type, where it becomes unsound as the generated dictionary may be used to destroy confluence.

This means that even if something like Set has a nominal argument it isn't safe, because you can attack the invariants of the structure via Ord.

newtype Bad = Bad Int deriving Eq
instance Ord Bad where
   compare (Bad a) (Bad b) = compare b a

If Ord has a representational role then I can use coerce to convert a dictonary Ord Bad to Ord Int, then work locally in a context where that is the dictionary for Ord Int that I get when I go to do an insert or lookup.

I don't mean to sound like the sky is falling, but I do worry that the 'use of a constraint in a data type' may not be necessary or sufficient. That is a lot of surface area to defend against attack.

I am not sure that I actually need a data type to coerce a dictionary. It seems likely that I could do it with just a well crafted function argument and ScopedTypeVariables, but my version of HEAD is a bit too mangled at the moment to give it a try.

-Edward

On Wed, Oct 9, 2013 at 2:09 PM, Richard Eisenberg <eir at cis.upenn.edu<mailto:eir at cis.upenn.edu>> wrote:
I don't quite agree with your analysis, Edward.

Eq can be auto-derived, so it makes for a confusing example. Let's replace Eq in your example with this class:

> class C a where
>  c_meth :: a -> a -> Bool

Then, your example leads to the same embarrassing state of affairs: coercing a dictionary for (C Int) to one for (C Bar).

But, I would argue that we still want C's parameter to have a representational role. Why? Consider this:

> data Blargh = ...
> instance C Blargh where ...
>
> newtype Baz = MkBaz Blargh deriving C

We want that last line to work, using GeneralizedNewtypeDeriving. This hinges on C's parameter's role being representational.

I think that what you've witnessed is a case of bug #8338 (http://ghc.haskell.org/trac/ghc/ticket/8338). This is a problem, in my view, and it seems to touch on roles, but I'm not completely sure of their relationship.

So, I think that classes should keep their representational roles (regardless of the decision on datatypes -- Haskell doesn't really support "abstract" classes), but perhaps we have to find a way to stop these incoherent instances from forming. Maybe the use of a constraint makes a datatype's role be nominal?

Richard

On Oct 9, 2013, at 1:55 PM, Edward Kmett <ekmett at gmail.com<mailto:ekmett at gmail.com>> wrote:


I just noticed there is a pretty big issue with the current default role where typeclasses are concerned!

When implementing Data.Type.Coercion I had to use the fact that I could apply coerce to the arguments of

data Coercion a b where
  Coercion :: Coercible a b => Coercion a b

This makes sense as Coercion itself has two representational arguments.

This struck me as quite clever, so I went to test it further.

data Foo a where
   Foo :: Eq a => Foo a

newtype Bar = Bar Int
instance Eq Bar where
  _ == _ = False

I fully expected the following to fail:

coerce (Foo :: Foo Int) :: Foo Bar

but instead it succeeded.

This means I was able to convert a dictionary Eq Int into a dictionary for Eq Bar!

This indicates that Eq (actually all) of the typeclasses are currently marked as having representational, when actually it strikes me that (almost?) none of them should be.

Coercible is the only case I can think of in base of a class with two representational arguments, but this is only valid because we prevent users from defining Coercible instances manually.

If I try again with a new typeclass that has an explicit nominal role

type role Eq nominal
class Eq a
instance Eq Int
instance Eq Bar

then I get the failure to derive Coercible (Foo Int) (Foo Bar) that I'd expect.

This indicates two big issues to me:

1.) At the very least the default role for type classes should be nominal for each argument. The very point of an instance is to make a nominal distinction after all. =)

2.) It also indicates that making any typeclass with a representational (/ phantom?) argument shouldn't be possible in valid SafeHaskell, as you can use it to subvert the current restrictions on OverlappingInstances.

-Edward

On Wed, Oct 9, 2013 at 12:07 PM, Iavor Diatchki <iavor.diatchki at gmail.com<mailto:iavor.diatchki at gmail.com>> wrote:
Hello,

My preference would be for the following design:

1. The default datatypes for roles are Nominal, but programmers can add annotations to relax this.
2. Generlized newtype deriving works as follows:  we can coerce a dictionary for `C R` into `C T`, as long as we can coerce the types of all methods instantiated with `R`, into the corresponding types instantiated with `T`.  In other words, we are pretending that we are implementing all methods by using `coerce`.

As far as I can see this safe, and matches what I'd expect as a programmer.  It also solves the problem with the `Set` example: because `Set` has a nominal parameter, we cannot coerce `Set Int` into `Set MyAge` and, hence, we cannot derive an instance of `MyAge` for `HasSet`.  An added benefit of this approach is that when newtype deriving fails, we can give a nicer error saying exactly which method causes the problem.

-Iavor





On Mon, Oct 7, 2013 at 6:26 AM, Richard Eisenberg <eir at cis.upenn.edu<mailto:eir at cis.upenn.edu>> wrote:
As you may have heard, /roles/ will be introduced with GHC 7.8. Roles are a mechanism to allow for safe 0-cost conversions between newtypes and their base types. GeneralizedNewtypeDeriving (GND) already did this for class instances, but in an unsafe way -- the feature has essentially been retrofitted to work with roles. This means that some uses of GND that appear to be unsafe will no longer work. See the wiki page [1] or slides from a recent presentation [2] for more info.

[1] : http://ghc.haskell.org/trac/ghc/wiki/Roles
[2] : http://www.cis.upenn.edu/~eir/papers/2013/roles/roles-slides.pdf

I am writing because it's unclear what the *default* role should be -- that is, should GND be allowed by default? Examples follow, but the critical issue is this:

* If we allow GND by default anywhere it is type-safe, datatypes (even those that don't export constructors) will not be abstract by default. Library writers would have to use a role annotation everywhere they wish to declare a datatype they do not want users to be able to inspect. (Roles still keep type-*un*safe GND from happening.)

* If we disallow GND by default, then perhaps lots of current uses of GND will break. Library writers will have to explicitly declare when they wish to permit GND involving a datatype.

Which do we think is better?

Examples: The chief example demonstrating the problem is (a hypothetical implementation of) Set:

> module Set (Set) where   -- note: no constructors exported!
>
> data Set a = MkSet [a]
> insert :: Ord a => a -> Set a -> Set a
> ...

> {-# LANGUAGE GeneralizedNewtypeDeriving, StandaloneDeriving #-}
> module Client where
>
> import Set
>
> newtype Age = MkAge Int deriving Eq
>
> instance Ord Age where
>   (MkAge a) `compare` (MkAge b) = b `compare` a   -- flip operands, reversing the order
>
> class HasSet a where
>   getSet :: Set a
>
> instance HasSet Int where
>   getSet = insert 2 (insert 5 empty)
>
> deriving instance HasSet Age
>
> good :: Set Int
> good = getSet
>
> bad :: Set Age
> bad = getSet

According to the way GND works, `good` and `bad` will have the same runtime representation. But, using Set operations on `bad` would indeed be bad -- because the Ord instance for Age is different than that for Int, Set operations will fail unexpectedly on `bad`. The problem is that Set should really be abstract, but we've been able to break this abstraction with GND. Note that there is no type error in these operations, just wrong behavior.

So, if we default to *no* GND, then the "deriving" line above would have an error and this problem wouldn't happen. If we default to *allowing* GND, then the writer of Set would have to include
> type role Set nominal
in the definition of the Set module to prevent the use of GND. (Why that peculiar annotation? See the linked further reading, above.)

Although it doesn't figure in this example, a library writer who wishes to allow GND in the default-no scenario would need a similar annotation
> type role Foo representational
to allow it.

There are clearly reasons for and against either decision, but which is better? Let the users decide!

Discussion time: 2 weeks.

Thanks!
Richard

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users at haskell.org<mailto:Glasgow-haskell-users at haskell.org>
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users at haskell.org<mailto:Glasgow-haskell-users at haskell.org>
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users




-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20131010/ce6d9757/attachment-0001.html>


More information about the Glasgow-haskell-users mailing list