No subject
Sun Oct 23 10:51:38 CEST 2011
a]] -> [a]<br>
diag =3D concat . foldr skew [] . map (map (\x -> [x]))<br><br>skew :: [=
[a]] -> [[a]] -> [[a]]<br>skew []=A0=A0=A0=A0 ys =3D ys<br>skew (x:xs=
) ys =3D x : combine (++) xs ys<br><br>combine :: (a -> a -> a) ->=
[a] -> [a] -> [a]<br>
combine _ xs=A0=A0=A0=A0 []=A0=A0=A0=A0 =3D xs<br>combine _ []=A0=A0=A0=A0 =
ys=A0=A0=A0=A0 =3D ys<br>combine f (x:xs) (y:ys) =3D f x y : combine f xs y=
s<br></blockquote><br>The particular implementation of these functions does=
n't really matter. What's important is that we have a way to interl=
eave lists (|||) and a way to diagonalise a matrix into a list (diag). We m=
ark these functions as NOINLINE because inlining them will only make the co=
re code more complicated (and may prevent rules from firing).<br>
<br>Suppose we have a type of Peano natural numbers:<br><br><blockquote sty=
le=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);paddi=
ng-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote">data N=
at =3D Ze | Su Nat deriving Eq<br>
</blockquote><br>Implementing enumeration on this type is simple:<br><br><b=
lockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,2=
04,204);padding-left:1ex;font-family:courier new,monospace" class=3D"gmail_=
quote">
enumNat :: [Nat]<br>enumNat =3D [Ze] ||| map Su enumNat<br></blockquote><br=
>Now, a generic representation of Nat in terms of sums and products could l=
ook something like this:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8=
ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;font-family:cour=
ier new,monospace" class=3D"gmail_quote">
type RepNat =3D Either () Nat<br></blockquote><br>That is, either a singlet=
on (for the Ze case) or a Nat (for the Su case). Note that I am building a =
shallow representation, since at the leaves we have Nat, and not RepNat. Th=
is mimics the situation with current generic programming libraries (in part=
icular GHC.Generics).<br>
<br>We'll need a way to convert between RepNat and Nat:<br><br><blockqu=
ote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204=
);padding-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote"=
>
toNat :: RepNat -> Nat<br>toNat (Left ()) =3D Ze<br>toNat (Right n) =3D =
Su n<br><br>fromNat :: Nat -> RepNat<br>fromNat Ze =3D Left ()<br>fromNa=
t (Su n) =3D Right n<br></blockquote><br>(In fact, since we're only dea=
ling with a generic producer we won't need the fromNat function.)<br>
<br>To get an enumeration for RepNat, we first need to know how to enumerat=
e units and sums:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;bord=
er-left:1px solid rgb(204,204,204);padding-left:1ex;font-family:courier new=
,monospace" class=3D"gmail_quote">
enumU :: [()]<br>enumU =3D [()]<br><br>enumEither :: [a] -> [b] -> [E=
ither a b]<br>enumEither ea eb =3D map Left ea ||| map Right eb<br></blockq=
uote><br>Now we can define an enumeration for RepNat:<br><br><blockquote st=
yle=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padd=
ing-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote">
enumRepNat :: [RepNat]<br>enumRepNat =3D enumEither enumU enumNatFromRep<br=
></blockquote><br>With the conversion function toNat, we can use enumRepNat=
to get an enumeration for Nat directly:<br><br><blockquote style=3D"margin=
:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;=
font-family:courier new,monospace" class=3D"gmail_quote">
enumNatFromRep :: [Nat]<br>enumNatFromRep =3D map toNat enumRepNat<br></blo=
ckquote><br>First, convince yourself that enumNatFromRep and enumNat are eq=
uivalent functions:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;bo=
rder-left:1px solid rgb(204,204,204);padding-left:1ex;font-family:courier n=
ew,monospace" class=3D"gmail_quote">
take 100 enumNat =3D=3D take 100 enumNatFromRep<br></blockquote><br>Now, wh=
at I want is that enumNatFromRep generates the same core code as enumNat. T=
hat should be possible; here are the necessary steps:<br><br><blockquote st=
yle=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padd=
ing-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote">
=A0 map toNat enumRepNat <br><br>=3D=3D { inline enumRepNat }<br><br>=A0 ma=
p toNat (enumEither enumU enumNatFromRep) <br><br>=3D=3D { inline enumEithe=
r }<br><br>=A0 map toNat (map Left enumU ||| map Right enumNatFromRep)<br><=
br>=3D=3D { inline enumU }<br>
<br>=A0 map toNat (map Left [()] ||| map Right enumNatFromRep)<br><br>=3D=
=3D { inline map }<br><br>=A0 map toNat ([Left ()] ||| map Right enumNatFro=
mRep)<br><br>=3D=3D { free theorem (|||): forall f a b. map f (a ||| b) =3D=
map f a ||| map f b }<br>
<br>=A0 map toNat [Left ()] ||| map toNat (map Right enumNatFromRep)<br><br=
>=3D=3D { inline map }<br><br>=A0 [toNat (Left ())] ||| map toNat (map Righ=
t enumNatFromRep)<br><br>=3D=3D { definition of toNat (or inline toNat + ca=
se of constant) }<br>
<br>=A0 [Ze] ||| map toNat (map Right enumNatFromRep)<br><br>=3D=3D { funct=
or composition law: forall f g l. map f (map g l) =3D map (f . g) l=A0}<br>=
<br>=A0 [Ze] ||| map (toNat . Right) enumNatFromRep<br><br>=3D=3D { definit=
ion of toNat (or inline toNat + case of constant) }<br>
<br>=A0 [Ze] ||| map Su enumNatFromRep<br></blockquote><br>Now let's se=
e what the compiler generates. I'm using GHC-7.4.1. Let's compile w=
ith -O1 and use -ddump-simpl to see the final simplifier output (core code)=
for enumNatFromRep:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">EnumAlone.enumNatFromRep :: [EnumAlone.Nat]<br>[GblId,<br>=A0S=
tr=3DDmdType,<br>
=A0Unf=3DUnf{Src=3D<vanilla>, TopLvl=3DTrue, Arity=3D0, Value=3DFalse=
,<br>=A0=A0=A0=A0=A0=A0=A0=A0 ConLike=3DFalse, Cheap=3DFalse, Expandable=3D=
False,<br>=A0=A0=A0=A0=A0=A0=A0=A0 Guidance=3DIF_ARGS [] 30 0}]<br>EnumAlon=
e.enumNatFromRep =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ EnumAlone.RepNat<br=
>
=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 EnumAlone.toNat<br>=A0=A0=A0 EnumAlo=
ne.enumRepNat<br><br>EnumAlone.enumRepNat [Occ=3DLoopBreaker] :: [EnumAlone=
.RepNat]<br>[GblId, Str=3DDmdType]<br>EnumAlone.enumRepNat =3D<br>=A0 EnumA=
lone.|||<br>=A0=A0=A0 @ (Data.Either.Either () EnumAlone.Nat) lvl4_rvV lvl5=
_rvW<br>
</blockquote><br>Ah, it didn't even inline enumRepNat because it made i=
t a loop breaker. We certainly want to inline it, so let's add a pragma=
:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px soli=
d rgb(204,204,204);padding-left:1ex" class=3D"gmail_quote">
<span style=3D"font-family:courier new,monospace">{-# INLINE enumRepNat #-}=
</span><br></blockquote><br>Recompiling, we get:<br><br><blockquote style=
=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding=
-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote">
EnumAlone.enumRepNat [InlPrag=3DINLINE (sat-args=3D0)]<br>=A0 :: [EnumAlone=
.RepNat]<br>[GblId,<br>=A0Str=3DDmdType,<br>=A0Unf=3DUnf{Src=3DInlineStable=
, TopLvl=3DTrue, Arity=3D0, Value=3DFalse,<br>=A0=A0=A0=A0=A0=A0=A0=A0 ConL=
ike=3DFalse, Cheap=3DFalse, Expandable=3DFalse,<br>
=A0=A0=A0=A0=A0=A0=A0=A0 Guidance=3DALWAYS_IF(unsat_ok=3DFalse,boring_ok=3D=
False)<br>=A0=A0=A0=A0=A0=A0=A0=A0 Tmpl=3D EnumAlone.enumEither<br>=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ () @ EnumAlone.Nat EnumAlone.e=
numU EnumAlone.enumNatFromRep}]<br>EnumAlone.enumRepNat =3D<br>=A0 EnumAlon=
e.|||<br>
=A0=A0=A0 @ (Data.Either.Either () EnumAlone.Nat) lvl4_rvV lvl5_rvW<br><br>=
EnumAlone.enumNatFromRep [Occ=3DLoopBreaker] :: [EnumAlone.Nat]<br>[GblId, =
Str=3DDmdType]<br>EnumAlone.enumNatFromRep =3D<br>=A0 GHC.Base.map<br>=A0=
=A0=A0 @ EnumAlone.RepNat<br>
=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 EnumAlone.toNat<br>=A0=A0=A0 EnumAlo=
ne.enumRepNat<br></blockquote><br>So no real difference in the generated co=
de, other than the reassignment of loop breakers. For some reason enumRepNa=
t still doesn't get inlined.<br>
<br><b>Question: why won't GHC inline enumRepNat, even when I tell it t=
o do so with an INLINE pragma?</b><br><br>Well, let's inline it ourselv=
es, then. We redefine enumNatFromRep to:<br><br><blockquote style=3D"margin=
:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;=
font-family:courier new,monospace" class=3D"gmail_quote">
enumNatFromRep =3D map toNat (enumEither enumU enumNatFromRep)<br></blockqu=
ote><br>This however doesn't help much. We get the following core:<br><=
br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(=
204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"g=
mail_quote">
lvl6_rw5 :: [Data.Either.Either () EnumAlone.Nat]<br>[GblId]<br>lvl6_rw5 =
=3D<br>=A0 EnumAlone.|||<br>=A0=A0=A0 @ (Data.Either.Either () EnumAlone.Na=
t) lvl4_rw3 lvl5_rw4<br><br>EnumAlone.enumNatFromRep [Occ=3DLoopBreaker] ::=
[EnumAlone.Nat]<br>
[GblId, Str=3DDmdType]<br>EnumAlone.enumNatFromRep =3D<br>=A0 GHC.Base.map<=
br>=A0=A0=A0 @ EnumAlone.RepNat @ EnumAlone.Nat EnumAlone.toNat lvl6_rw5<br=
></blockquote><br>GHC is really keen on floating that (|||) out. Let's =
be very explicit about inlining:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">{-# INLINE toNat #-}<br>{-# INLINE enumU #-}<br>{-# INLINE enu=
mEither #-}<br>
</blockquote><br>Also, maybe it's floating it out because it doesn'=
t have anything else to do to it. Let's add the free theorem of (|||) a=
s a rule:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:=
1px solid rgb(204,204,204);padding-left:1ex;font-family:courier new,monospa=
ce" class=3D"gmail_quote">
{-# RULES "ft |||" forall f a b. map f (a ||| b) =3D map f a ||| =
map f b #-}<br></blockquote><br>We needed this in our manual derivation, so=
GHC should need it too. Recompiling, we see we've made some progress:<=
br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">lvl5_ryv :: [Data.Either.Either () EnumAlone.Nat]<br>[GblId]<b=
r>
lvl5_ryv =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 =
@ (Data.Either.Either () EnumAlone.Nat)<br>=A0=A0=A0 (Data.Either.Right @ (=
) @ EnumAlone.Nat)<br>=A0=A0=A0 EnumAlone.enumNatFromRep<br><br>lvl6_ryw ::=
[EnumAlone.Nat]<br>[GblId]<br>
lvl6_ryw =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ (Data.Either.Either () Enum=
Alone.Nat)<br>=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 EnumAlone.toNat<br>=A0=
=A0=A0 lvl5_ryv<br><br>EnumAlone.enumNatFromRep [Occ=3DLoopBreaker] :: [Enu=
mAlone.Nat]<br>[GblId, Str=3DDmdType]<br>
EnumAlone.enumNatFromRep =3D<br>=A0 EnumAlone.||| @ EnumAlone.Nat lvl4_ryu =
lvl6_ryw<br></blockquote><br>enumNatFromRep finally starts with (|||) direc=
tly. But its second argument, lvl6_ryw, is a map of lvl5_ryv, which is itse=
lf a map! At this stage I expected GHC to be aware of the fusion law for ma=
p, but it seems that it isn't.<br>
<br><b>Question: why is map fusion not happening automatically?</b><br><br>=
Let's add it as a rule:<br><br><blockquote style=3D"margin:0pt 0pt 0pt =
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;font-family:c=
ourier new,monospace" class=3D"gmail_quote">
{-# RULES "map/map1" forall f g l. map f (map g l) =3D map (f . g=
) l #-}<br></blockquote><br>And now we're in a much better situation:<b=
r><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid r=
gb(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=
=3D"gmail_quote">
lvl3_ryD :: [Data.Either.Either () EnumAlone.Nat]<br>[GblId, Caf=3DNoCafRef=
s]<br>lvl3_ryD =3D<br>=A0 GHC.Types.:<br>=A0=A0=A0 @ (Data.Either.Either ()=
EnumAlone.Nat)<br>=A0=A0=A0 EnumAlone.fromNat1<br>=A0=A0=A0 (GHC.Types.[] =
@ (Data.Either.Either () EnumAlone.Nat))<br>
<br>lvl4_ryE :: [EnumAlone.Nat]<br>[GblId]<br>lvl4_ryE =3D<br>=A0 GHC.Base.=
map<br>=A0=A0=A0 @ (Data.Either.Either () EnumAlone.Nat)<br>=A0=A0=A0 @ Enu=
mAlone.Nat<br>=A0=A0=A0 EnumAlone.toNat<br>=A0=A0=A0 lvl3_ryD<br><br>lvl5_r=
yF :: [EnumAlone.Nat]<br>
[GblId]<br>lvl5_ryF =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ EnumAlone.Nat<br=
>=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 EnumAlone.Su<br>=A0=A0=A0 EnumAlone=
.enumNatFromRep<br><br>EnumAlone.enumNatFromRep [Occ=3DLoopBreaker] :: [Enu=
mAlone.Nat]<br>[GblId, Str=3DDmdType]<br>
EnumAlone.enumNatFromRep =3D<br>=A0 EnumAlone.||| @ EnumAlone.Nat lvl4_ryE =
lvl5_ryF<br></blockquote><br>Note how toNat is entirely gone from the secon=
d part of the enumeration (lvl5_ryF). Strangely enough, the enumerator for =
Ze (lvl4_ryE) is still very complicated: map toNat ([Left ()]). Why doesn&#=
39;t GHC simplify this to just [Ze]? Apparently because GHC doesn't sim=
plify map over a single element list.<br>
<br><b>Question: why doesn't GHC optimise map f [x] to [f x]?</b><br><b=
r>Let's tell it to do so:<br><br><blockquote style=3D"margin:0pt 0pt 0p=
t 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;font-family=
:courier new,monospace" class=3D"gmail_quote">
{-# RULES "map/map2" forall f x. map f (x:[]) =3D (f x):[] #-}<br=
></blockquote><br>Now we're finally where we wanted:<br><br><blockquote=
style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);p=
adding-left:1ex;font-family:courier new,monospace" class=3D"gmail_quote">
lvl_ryA :: [EnumAlone.Nat]<br>[GblId, Caf=3DNoCafRefs]<br>lvl_ryA =3D<br>=
=A0 GHC.Types.:<br>=A0=A0=A0 @ EnumAlone.Nat EnumAlone.Ze (GHC.Types.[] @ E=
numAlone.Nat)<br><br>lvl3_ryD :: [EnumAlone.Nat]<br>[GblId]<br>lvl3_ryD =3D=
<br>=A0 GHC.Base.map<br>
=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 @ EnumAlone.Nat<br>=A0=A0=A0 EnumAlo=
ne.Su<br>=A0=A0=A0 EnumAlone.enumNatFromRep<br><br>EnumAlone.enumNatFromRep=
[Occ=3DLoopBreaker] :: [EnumAlone.Nat]<br>[GblId, Str=3DDmdType]<br>EnumAl=
one.enumNatFromRep =3D<br>
=A0 EnumAlone.||| @ EnumAlone.Nat lvl_ryA lvl3_ryD<br></blockquote><br>This=
is what I wanted: no more representation types (Either or ()), and the cod=
e looks exactly like what is generated for the handwritten enumNat.<br><br>
<br><font size=3D"4">More realistic generic programming</font><br><br>Now l=
et's see if we can transport this to the setting of a generic programmi=
ng library. I'll use a bare-bones version of GHC.Generics:<br><br><bloc=
kquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,=
204);padding-left:1ex;font-family:courier new,monospace" class=3D"gmail_quo=
te">
infixr 5 :+:<br>infixr 6 :*:<br><br>data U=A0=A0=A0=A0=A0=A0=A0=A0=A0 =3D U=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 deriving (Show, Read)<br>data a :+:=
b=A0=A0=A0 =3D L a | R b=A0=A0=A0=A0=A0 deriving (Show, Read)<br>data a :*=
: b=A0=A0=A0 =3D a :*: b=A0=A0=A0=A0=A0=A0=A0 deriving (Show, Read)<br>newt=
ype Var a=A0=A0 =3D Var a=A0=A0=A0=A0=A0=A0=A0=A0=A0 deriving (Show, Read)<=
br>
newtype Rec a=A0=A0 =3D Rec a=A0=A0=A0=A0=A0=A0=A0=A0=A0 deriving (Show, Re=
ad)<br><br>class Representable a where<br>=A0 type Rep a<br>=A0 to=A0=A0 ::=
Rep a -> a<br>=A0 from :: a -> Rep a<br></blockquote><br>Let's r=
epresent Nat in this library:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">instance Representable Nat where<br>=A0 type Rep Nat =3D U :+:=
(Rec Nat)<br>
=A0 from Ze=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =3D L U<br>=A0 from (Su n)=A0=
=A0=A0=A0=A0=A0=A0 =3D R (Rec n)<br>=A0 to (L U)=A0=A0=A0=A0=A0=A0 =3D Ze<b=
r>=A0 to (R (Rec n)) =3D Su n<br></blockquote><br>(Note, in particular, tha=
t we do not need INLINE pragmas on the from/to methods. This might just be =
because GHC thinks these are small and inlines them anyway, but in general =
we want to make sure they are inlined, so we typically use pragmas there.)<=
br>
<br>Now we need to implement enumeration generically. We do this by giving =
an instance for each representation type:<br><br><blockquote style=3D"margi=
n:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex=
;font-family:courier new,monospace" class=3D"gmail_quote">
class GEnum' a where<br>=A0 genum' :: [a]<br><br>instance GEnum'=
; U where<br>=A0 {-# INLINE genum' #-}<br>=A0 genum' =3D [U]<br><br=
>instance (GEnum a) =3D> GEnum' (Rec a) where<br>=A0 {-# INLINE genu=
m' #-}<br>
=A0 genum' =3D map Rec genum<br><br>instance (GEnum a) =3D> GEnum=
9; (Var a) where<br>=A0 {-# INLINE genum' #-}<br>=A0 genum' =3D map=
Var genum<br><br>instance (GEnum' f, GEnum' g) =3D> GEnum' =
(f :+: g) where<br>
=A0 {-# INLINE genum' #-}<br>=A0 genum' =3D map L genum' ||| ma=
p R genum'<br><br>instance (GEnum' f, GEnum' g) =3D> GEnum&#=
39; (f :*: g) where<br>=A0 {-# INLINE genum' #-}<br>=A0 --genum' =
=3D diag [ [ x :*: y | y <- genum' ] | x <- genum' ]<br>
=A0 genum' =3D diag (map (\x -> map (\y -> x :*: y) genum') g=
enum')<br></blockquote><br>We explicitly tell GHC to inline each case, =
as before. Note that for products I'm not using the more natural list c=
omprehension syntax because I don't quite understand how that gets tran=
slated into core.<br>
<br>In the cases for Var and Rec we use genum from the GEnum class:<br><br>=
<blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204=
,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"gmai=
l_quote">
class GEnum a where<br>=A0 genum :: [a]<br>=A0 {-# INLINE genum #-}<br>=A0 =
default genum :: (Representable a, GEnum' (Rep a)) =3D> [a]<br>=A0 g=
enum =3D map to genum'<br></blockquote><br>GEnum' is the class used=
for instantiating the generic representation types, and GEnum is used for =
user types. We use a default signature to provide a default method that can=
be used when we have a Representable instance for the type in question. Th=
is makes instantiating Nat very easy:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">instance GEnum Nat<br></blockquote><br>Unfortunately, the core=
code generated in this situation (with the same RULES as before) is not ni=
ce at all:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">Main.$fGEnumNat_$cgenum [Occ=3DLoopBreaker] :: [Base.Nat]<br>[=
GblId, Str=3DDmdType]<br>
Main.$fGEnumNat_$cgenum =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ (Base.Rep Ba=
se.Nat)<br>=A0=A0=A0 @ Base.Nat<br>=A0=A0=A0 Base.$fRepresentableNat_$cto<b=
r>=A0=A0=A0 (lvl37_r79y<br>=A0=A0=A0=A0 `cast` (Sym (GEnum.NTCo:GEnum')=
<Base.U Base.:+: (Base.Rec Base.Nat)> ; <br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 (GEnum.GEnum' (Sym (Base.TFCo:R=
:RepNat)) ;<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 GEnum.NTCo:GEn=
um' <Base.Rep Base.Nat>)<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =
:: [Base.C Base.Nat_Ze_ Base.U<br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 Base.:+: Base.C Base.Nat_S=
u_ (Base.Rec Base.Nat)]<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0 ~#<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 [Base.Rep Base.Nat]=
))<br></blockquote><br>We see a map of the `to` function, which is definite=
ly not what we want. Oddly enough, if we give an explicit definition of gen=
um for Nat, with the inlined default...<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">instance GEnum Nat where genum =3D map to genum'<br></bloc=
kquote>
<br>... then we get the optimised code we want:<br><br><blockquote style=3D=
"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-le=
ft:1ex;font-family:courier new,monospace" class=3D"gmail_quote">lvl34_r79p =
:: [Base.Nat]<br>
[GblId, Caf=3DNoCafRefs]<br>lvl34_r79p =3D<br>=A0 GHC.Types.: @ Base.Nat Ba=
se.Ze (GHC.Types.[] @ Base.Nat)<br><br>lvl35_r79q :: [Base.Nat]<br>[GblId]<=
br>lvl35_r79q =3D<br>=A0 GHC.Base.map @ Base.Nat @ Base.Nat Base.Su Main.$f=
GEnumNat_$cgenum<br>
<br>Main.$fGEnumNat_$cgenum [Occ=3DLoopBreaker] :: [Base.Nat]<br>[GblId, St=
r=3DDmdType]<br>Main.$fGEnumNat_$cgenum =3D<br>=A0 GEnum.||| @ Base.Nat lvl=
34_r79p lvl35_r79q<br></blockquote><br>Again, no representation types, no `=
to`, just the same code that is generated for enumNat. Perfect. But we had =
to avoid using the default definition, which is a pity.<br>
<br><b>Question: why won't GHC inline the default method of a class, ev=
en when I have a pragma telling it to do so?</b><br><br>Let's look at o=
ne more datatype, because Nat does not use products. So let's consider =
trees:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">data Tree a =3D Leaf | Bin a (Tree a) (Tree a)<br><br>instance=
Representable (Tree a) where<br>
=A0 type Rep (Tree a) =3D U :+: (Var a :*: Rec (Tree a) :*: Rec (Tree a))<b=
r>=A0 from (Bin x l r) =3D R (Var x :*: Rec l :*: Rec r)<br>=A0 from Leaf=
=A0=A0=A0=A0=A0=A0=A0 =3D L U<br>=A0 to (R (Var x :*: (Rec l) :*: (Rec r)))=
=3D Bin x l r<br>=A0 to (L U)=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 =3D Leaf<br>
</blockquote><br>We give a GEnum instance using the same trick as before:<b=
r><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid r=
gb(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=
=3D"gmail_quote">
instance GEnum (Tree Int) where genum =3D map to genum'<br></blockquote=
><br>(For simplicity only for trees of integers.) The generated code for tr=
ees is unfortunately not as nice:<br><br><blockquote style=3D"margin:0pt 0p=
t 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;font-fa=
mily:courier new,monospace" class=3D"gmail_quote">
a2_r79M<br>=A0 :: [Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC=
.Types.Int</a>)<br>=A0=A0=A0=A0=A0 Base.:*: Base.Rec (Base.Tree <a href=3D"=
http://GHC.Types.Int">GHC.Types.Int</a>)]<br>[GblId, Str=3DDmdType]<br>a2_r=
79M =3D<br>=A0 GEnum.diag<br>
=A0=A0=A0 @ (Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types=
.Int</a>)<br>=A0=A0=A0=A0=A0=A0 Base.:*: Base.Rec (Base.Tree <a href=3D"htt=
p://GHC.Types.Int">GHC.Types.Int</a>))<br>=A0=A0=A0 lvl8_r79L<br><br>lvl9_r=
79N :: [Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>]<br>
[GblId]<br>lvl9_r79N =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ (Base.Rec (Base=
.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>)<br>=A0=A0=A0=A0=
=A0=A0 Base.:*: Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Ty=
pes.Int</a>))<br>=A0=A0=A0 @ (Base.Tree <a href=3D"http://GHC.Types.Int">GH=
C.Types.Int</a>)<br>
=A0=A0=A0 lvl5_r79H<br>=A0=A0=A0 a2_r79M<br><br>Main.$fGEnumTree_$cgenum [O=
cc=3DLoopBreaker]<br>=A0 :: [Base.Tree <a href=3D"http://GHC.Types.Int">GHC=
.Types.Int</a>]<br>[GblId, Str=3DDmdType]<br>Main.$fGEnumTree_$cgenum =3D<b=
r>=A0 GEnum.||| @ (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int=
</a>) lvl4_r79G lvl9_r79N<br>
</blockquote><br>Note how lvl9_r79N is a map over a2_r79M, and a2_r79M is a=
`diag`. Ok, we need the free theorem of `diag` to tell GHC how to commute =
the `diag` with map:<br><br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;b=
order-left:1px solid rgb(204,204,204);padding-left:1ex;font-family:courier =
new,monospace" class=3D"gmail_quote">
{-# RULES "ft/diag" forall f l. map f (diag l) =3D diag (map (map=
f) l) #-}<br></blockquote><br>Unfortunately this doesn't change the ge=
nerated core code. With some more debugging looking at the generated code a=
t each simplifier iteration, I believe that this is because a2_r79M got lif=
ted out too soon, prevent the rule from applying. With some imagination I d=
ecided to try the -fno-full-laziness flag to prevent let-floating. I'm =
not sure this is a good idea in general, but in this particular case it giv=
es much better results:<br>
<br><blockquote style=3D"margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb=
(204,204,204);padding-left:1ex;font-family:courier new,monospace" class=3D"=
gmail_quote">a1_r72i :: [Base.Rec (Base.Tree <a href=3D"http://GHC.Types.In=
t">GHC.Types.Int</a>)]<br>
[GblId, Str=3DDmdType]<br>a1_r72i =3D<br>=A0 GHC.Base.map<br>=A0=A0=A0 @ (B=
ase.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>)<br>=A0=A0=A0 @=
(Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>))<=
br>=A0=A0=A0 ((\ (tpl_B1 :: Base.Tree <a href=3D"http://GHC.Types.Int">GHC.=
Types.Int</a>) -> tpl_B1)<br>
=A0=A0=A0=A0 `cast` (<Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Typ=
es.Int</a>><br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 -> Sym (Base.NTCo=
:Rec <Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>>)<=
br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 :: (Base.Tree <a href=3D"http://GHC=
.Types.Int">GHC.Types.Int</a> -> Base.Tree <a href=3D"http://GHC.Types.I=
nt">GHC.Types.Int</a>)<br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 ~#<br>=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 (Base.Tree <a href=3D"http://GHC.Types.Int">=
GHC.Types.Int</a> -> Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int=
">GHC.Types.Int</a>))))<br>=A0=A0=A0 Main.$fGEnumTree_$cgenum<br><br>Main.$=
fGEnumTree_$cgenum [Occ=3DLoopBreaker]<br>
=A0 :: [Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>]<br>[G=
blId, Str=3DDmdType]<br>Main.$fGEnumTree_$cgenum =3D<br>=A0 GEnum.|||<br>=
=A0=A0=A0 @ (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>)<=
br>=A0=A0=A0 (GHC.Types.:<br>
=A0=A0=A0=A0=A0=A0 @ (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.=
Int</a>)<br>=A0=A0=A0=A0=A0=A0 (Base.Leaf @ <a href=3D"http://GHC.Types.Int=
">GHC.Types.Int</a>)<br>=A0=A0=A0=A0=A0=A0 (GHC.Types.[] @ (Base.Tree <a hr=
ef=3D"http://GHC.Types.Int">GHC.Types.Int</a>)))<br>
=A0=A0=A0 (GEnum.diag<br>=A0=A0=A0=A0=A0=A0 @ (Base.Tree <a href=3D"http://=
GHC.Types.Int">GHC.Types.Int</a>)<br>=A0=A0=A0=A0=A0=A0 (GHC.Base.map<br>=
=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ (Base.Rec (Base.Tree <a href=3D"http://GHC.Ty=
pes.Int">GHC.Types.Int</a>))<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ [Base.Tree <a=
href=3D"http://GHC.Types.Int">GHC.Types.Int</a>]<br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0 (\ (x_a1yQ :: Base.Rec (Base.Tree <a href=3D"ht=
tp://GHC.Types.Int">GHC.Types.Int</a>)) -><br>=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0 GHC.Base.map<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ (=
Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>))<br=
>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ (Base.Tree <a href=3D"http://G=
HC.Types.Int">GHC.Types.Int</a>)<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0 (\ (x1_X1zB :: Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC=
.Types.Int</a>)) -><br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0 Base.Bin<br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 @ <a href=3D"http=
://GHC.Types.Int">GHC.Types.Int</a><br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0 a_r72h<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0 (x_a1yQ<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0 `cast` (Base.NTCo:Rec <Base.Tree <a href=3D"http://GHC.Type=
s.Int">GHC.Types.Int</a>><br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0 :: Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types=
.Int</a>) ~# Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>))=
<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 (x1_X1zB<br>=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 `cast` (Base.N=
TCo:Rec <Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>>=
;<br>
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0 :: Base.Rec (Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types=
.Int</a>) ~# Base.Tree <a href=3D"http://GHC.Types.Int">GHC.Types.Int</a>))=
)<br>=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 a1_r72i)<br>=A0=A0=A0=A0=A0=
=A0=A0=A0=A0 a1_r72i))<br>
</blockquote>
<br>Note how our enum is now of the shape `[Leaf] ||| diag y`, which is goo=
d. The only catch is that there are some `Rec`s still laying around, with t=
heir associated newtype coercions, and a function a1_r72i that basically wr=
aps the recursive enumeration in a Rec, only to be unwrapped in the body of=
`$fGEnumTree_$cgenum`. I don't know how to get GHC to simplify this co=
de any further.<br>
<br><b>Question: why do I need -fno-full-laziness for the ft/diag rule to a=
pply?</b><br><br><b>Question: why is GHC not getting rid of the Rec newtype=
in this case?</b><br><br>I have also played with -O2, in particular becaus=
e of the SpecConstr optimisation, but found that it does not affect these p=
articular examples (perhaps it only becomes important with larger datatypes=
). I have also experimented with phase control in the rewrite rules and the=
inline pragmas, but didn't find it necessary for this example. In gene=
ral, anyway, my experience with the inliner is that it is extremely fragile=
, especially across different GHC versions, and it's hard to get any gu=
arantees of optimisation. I have also played with the -funfolding-* options=
before, with mixed results. [1] It's also a pity that certain flags ar=
e not explained in detail in the user's manual [2,3], like -fliberate-c=
ase, and -fspec-constr-count and threshold, for instance.<br>
<br>Thank you for reading this. Any insights are welcome. In particular, I&=
#39;m wondering if I might be missing some details regarding strictness.<br=
><br><br>Cheers,<br>Pedro<br><br>[1] <a href=3D"http://dreixel.net/research=
/pdf/ogie.pdf">http://dreixel.net/research/pdf/ogie.pdf</a><br>
[2] <a href=3D"http://www.haskell.org/ghc/docs/latest/html/users_guide/flag=
-reference.html">http://www.haskell.org/ghc/docs/latest/html/users_guide/fl=
ag-reference.html</a><br>[3] <a href=3D"http://www.haskell.org/ghc/docs/lat=
est/html/users_guide/options-optimise.html#options-f">http://www.haskell.or=
g/ghc/docs/latest/html/users_guide/options-optimise.html#options-f</a><br>
--e89a8fb2065e6d510904b98e6765--
More information about the Generics
mailing list