Newtype wrappers

Mon Jan 14 19:38:38 CET 2013

Many of us definitely care. =)

The main concern that I would have is that the existing solutions to this
problem could be implemented while retaining SafeHaskell, and I don't see
how a library that uses this can ever recover its SafeHaskell guarantee.

Here is a straw man example of a solution that permits SafeHaskell in the
resulting code that may be useful in addition to or in lieu of your
proposed approach:

We could extend Data.Functor with an fmap# operation that was only, say,
exposed via Data.Functor.Unsafe:

{-# LANGUAGE Unsafe, MagicHash #-}
module Data.Functor.Unsafe where
class Functor f where
  fmap# :: (a -> b) -> f a -> f b
  fmap :: (a -> b) -> f a -> f b
  (<$) :: b -> f a -> f b
  fmap# = \f -> \fa -> fa `seq` fmap f p

Then we flag Data.Functor as Trustworthy and export just the safe subset:

{-# LANGUAGE Trustworthy #-}
module Data.Functor (Functor(fmap,(<$))) where
import Data.Functor.Unsafe

then fmap# from Data.Functor.Unsafe is allowed to be fmap# _ = unsafeCoerce
for any Functor that doesn't perform GADT-like interrogation of its
argument (this could be assumed automatically in DeriveFunctor, which can't
handle those cases anyways!)

Then any user who wants to enable a more efficient fmap for fmapping over
his data type with a newtype instantiates fmap# for his Functor. They'd
have to claim Trustworthy (or use the enhanced DeriveFunctor), to discharge
the obligation that they aren't introducing an unsafeCoerce that is visible
to the user. (After all the user has to import another Unsafe module to get
access to fmap# to invoke it.)

Finally then code that is willing to trust other trustworthy code can claim
to be Trustworthy in turn, import Data.Functor.Unsafe and use fmap# for
newtypes and impossible arguments:

{-# LANGUAGE Trustworthy #-}
module Data.Void where

import Data.Functor.Unsafe

newtype Void = Void Void deriving Functor

absurd :: Void -> a
absurd (Void a) = absurd a

vacuous :: Functor f => f Void -> f a
vacuous = fmap# absurd

This becomes valuable when data types like Void are used to mark the
absence of variables in a syntax tree, which could be quite large.

Currently we have to fmap absurd over the tree, paying an asymptotic cost
for not using (forall a. Expr a) or some newtype wrapped equivalent as our
empty-expression type.

This would dramatically improve the performance of libraries like bound
which commonly use constructions like Expr Void.

Its safety could be built upon by making another class for tracking
newtypes etc so we can know whats safe to pass to fmap#, and you might be
able to spot opportunities to rewrite an explicit fmap of something that is
a `cast` in the core to a call to fmap#.

-Edward

On Mon, Jan 14, 2013 at 1:09 PM, Simon Peyton-Jones
<simonpj at microsoft.com>wrote:

>  Friends****
>
> ** **
>
> I’d like to propose a way to “promote” newtypes over their enclosing
> type.  Here’s the writeup****
>
>           http://hackage.haskell.org/trac/ghc/wiki/NewtypeWrappers****
>
> ** **
>
> Any comments?  Below is the problem statement, taken from the above page.*
> ***
>
> ** **
>
> I’d appreciate****
>
> **·         **A sense of whether you care. Does this matter?****
>
> **·         **Improvements to the design I propose****
>
> ** **
>
> Simon****
>
> ** **
>
> ** **
>
> ** **
>
> *The problem*
>
> Suppose we have ****
>
> newtype Age = MkAge Int****
>
> Then if n :: Int, we can convert n to an Age thus: MkAge n :: Age.
> Moreover, this conversion is a type conversion only, and involves no
> runtime instructions whatsoever. This cost model -- that newtypes are free
> -- is important to Haskell programmers, and encourages them to use newtypes
> freely to express type distinctions without introducing runtime overhead.
> ****
>
> Alas, the newtype cost model breaks down when we involve other data
> structures. Suppose we have these declarations ****
>
> data T a   = TLeaf a     | TNode (Tree a) (Tree a)****
>
> data S m a = SLeaf (m a) | SNode (S m a) (S m a)****
>
> and we have these variables in scope ****
>
> x1 :: [Int]****
>
> x2 :: Char -> Int****
>
> x3 :: T Int****
>
> x4 :: S IO Int****
>
> Can we convert these into the corresponding forms where the Int is
> replaced by Age? Alas, not easily, and certainly not without overhead. ***
> *
>
>    - For x1 we can write map MkAge x1 :: [Age]. But this does not follow
>    the newtype cost model: there will be runtime overhead from executing the
>    map at runtime, and sharing will be lost too. Could GHC optimise the
>    map somehow? This is hard; apart from anything else, how would GHC
>    know that map was special? And it it gets worse. ****
>
>
>    - For x2 we'd have to eta-expand: (\y -> MkAge (x2 y)) :: Char -> Age.
>    But this isn't good either, because eta exapansion isn't semantically valid
>    (if x2 was bottom, seq could distinguish the two). See #7542<http://hackage.haskell.org/trac/ghc/ticket/7542>for a real life example.
>    ****
>
>
>    - For x3, we'd have to map over T, thus mapT MkAge x3. But what if mapTdidn't exist? We'd have to make it. And not all data types have maps.
>    S is a harder one: you could only map over S-values if m was a
>    functor. There's a lot of discussion abou this on #2110<http://hackage.haskell.org/trac/ghc/ticket/2110>.
>    ****
>
> ** **
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20130114/9854e33d/attachment.htm>