[Haskell-cafe] Fixing Performance Leaks at the Type Level

Dimitrios Vytiniotis dimitris at microsoft.com
Tue Jul 12 15:33:07 CEST 2011


Dear Gershom,
Just to say many thanks for the extremely useful test cases! We will investigate further. 
Best, 
Dimitris



> -----Original Message-----
> From: glasgow-haskell-users-bounces at haskell.org [mailto:glasgow-haskell-
> users-bounces at haskell.org] On Behalf Of Gershom Bazerman
> Sent: 12 July 2011 13:18
> To: Haskell Cafe
> Cc: glasgow-haskell-users at haskell.org
> Subject: Fixing Performance Leaks at the Type Level
> 
> This post is in literate Haskell. It describes how certain performance leaks are
> introduced in type level programming. These leaks do not affect program
> runtimes, but can cause compile times to grow drastically. They exist both
> with Functional Dependencies and Type Families, but are currently worse
> with the former, and have grown worse with the new constraint solver in
> GHC 7. It is intended both as a guide to those encountering these issues, and
> as a motivation for the GHC development team to address such issues as the
> constraint solver is developed and improved.
> 
> > {-# OPTIONS_GHC -fcontext-stack=1000 #-} {-# LANGUAGE
> > FlexibleContexts, FlexibleInstances, FunctionalDependencies,
> > MultiParamTypeClasses, OverlappingInstances, TypeSynonymInstances,
> > TypeOperators, UndecidableInstances, TypeFamilies #-} module
> > TypePerformance where
> 
> Our running example, for simplicity's sake, is a type-level map of a single
> function. For reference, here is the code for a simple value-level map of a
> single function.
> 
> > vfoo = id
> > mapfoo (x : xs) = vfoo x : mapfoo xs
> > mapfoo [] = []
> 
> Because Haskell is a lazy language, this runs in O(n) time and constant stack.
> 
> We now lift map to the type level, to operate over HLists.
> 
> First, the basic HList types
> 
> > infixr 3 :*
> > data x :* xs = x :* xs deriving Show
> > data HNil = HNil deriving Show
> 
> Next, a large boring HList
> 
> > -- Adds ten cells
> > addData x = i :* i :* d :* d :* s :*
> >             i :* i :* d :* d :* s :*
> >             x
> >     where i = 1 :: Int
> >           d = 1 :: Double
> >           s = ""
> >
> > -- Has 70 cells.
> > sampleData = addData $ addData $ addData $ addData $ addData $
> >              addData $ addData $
> >              HNil
> 
> Next, a simple polymorphic function to map
> 
> > class Foo x y | x -> y
> >     where foo :: x -> y
> >           foo = undefined
> 
> > instance Foo Int Double
> > instance Foo Double Int
> > instance Foo String String
> 
> Now, our map
> 
> > class HMapFoo1 as bs | as -> bs where
> >     hMapFoo1 :: as -> bs
> >
> > instance (Foo a b, HMapFoo1 as bs) => HMapFoo1 (a :* as) (b :* bs) where
> >     hMapFoo1 (x :* xs) = foo x :* hMapFoo1 xs
> >
> > instance HMapFoo1 HNil HNil where
> >     hMapFoo1 _ = HNil
> 
> If we enable the following line, compilation time is ~ 9 seconds.
> 
>   > testHMapFoo1 = hMapFoo1 sampleData
> 
> Furthermore, playing with the size of sampleData, we see that the time
> spent in compilation is superlinear -- each additional cell costs a greater
> amount of time. This is because while Haskell is lazy at the value level, it is
> strict at the type level. Therefore, just as in a strict language, HMapFoo1's
> cost grows O(n^2) because even as we induct over the as, we build up a
> stack of bs. Just as in a strict language, the solution is to make hMapFoo tail
> recursive through introducing an accumulator. This also reverses the hlist, but
> never mind that.
> 
> > class HMapFoo2 acc as bs | acc as -> bs where
> >     hMapFoo2 :: acc -> as -> bs
> >
> > instance (Foo a b, HMapFoo2 (b :* bs) as res) => HMapFoo2 bs (a :* as) res
> where
> >     hMapFoo2 acc (x :* xs) = hMapFoo2 (foo x :* acc) xs
> >
> > instance HMapFoo2 acc HNil acc where
> >     hMapFoo2 acc _ = acc
> 
> If we enable the following line, compilation time is a much more satisfying
> ~0.5s.
> 
>   > testHMapFoo2 = hMapFoo2 HNil sampleData
> 
> But wait, there's trouble on the horizon! Consider the following version:
> 
> > class HMapFoo3 acc as bs | acc as -> bs where
> >     hMapFoo3 :: acc -> as -> bs
> >
> > instance (HMapFoo3 (b :* bs) as res, Foo a b) => HMapFoo3 bs (a :* as) res
> where
> >     hMapFoo3 acc (x :* xs) = hMapFoo3 (foo x :* acc) xs
> >
> > instance HMapFoo3 acc HNil acc where
> >     hMapFoo3 acc _ = acc
> 
> The only difference between hMapFoo2 and hMapFoo3 is that the order of
> constraints on the inductive case has been reversed, with the recursive
> constraint first and the immediately checkable constraint second. Now, if we
> enable the following line, compilation time rockets to ~6s!
> 
>   > testHMapFoo3 = hMapFoo3 HNil sampleData
> 
> Slowdowns such as those described here are not a purely hypothetical issue,
> but have caused real problems in production code. The example given above
> is fairly simple. The constraints used are minimal and easily checked. In real
> programs, the constraints are more difficult, increasing constant factors
> significantly. These constant factors, combined with unexpected algorithmic
> slowdowns due to the type inference engine, can lead (and have lead) to
> compilation times of HList-style code becoming deeply unwieldy-to-
> unusable, and can lead (and have lead) to this occuring only well after this
> code has been introduced and used on smaller cases without trouble.
> 
> The constraint solver should certainly be smart enough to reduce the compile
> times of HMapFoo3 to those of HMapFoo2. In fact, with type families, the
> there is no difference (see below). Could the compiler be smart enough to
> do the same for HMapFoo1? I'm not sure. Certainly, it could at least knock
> down its own constant factors a bit, even if it can't improve the big-O
> performance there.
> 
> ----
> Appendix: Examples with Type Families
> 
> As the below code demonstrates, the same issues demonstrated with
> Functional Dependencies also appear with Type Families, although less
> horribly, as their code-path seems more optimized in the current constraint
> solver:
> 
> > class TFoo x where
> >      type TFooFun x
> >      tfoo :: x -> TFooFun x
> >      tfoo = undefined
> >
> > instance TFoo Int where
> >     type TFooFun Int = Double
> > instance TFoo Double where
> >     type TFooFun Double = Int
> > instance TFoo String where
> >     type TFooFun String = String
> >
> > class THMapFoo1 as where
> >       type THMapFoo1Res as
> >       thMapFoo1 :: as -> THMapFoo1Res as
> >
> > instance (TFoo a, THMapFoo1 as) => THMapFoo1 (a :* as) where
> >     type THMapFoo1Res (a :* as) = TFooFun a :* THMapFoo1Res as
> >     thMapFoo1 (x :* xs) = tfoo x :* thMapFoo1 xs
> >
> > instance THMapFoo1 HNil where
> >     type THMapFoo1Res HNil = HNil
> >     thMapFoo1 _ = HNil
> 
> The following, when enabled, takes ~3.5s. This demonstrates that slowdown
> occurs with type families as well.
> 
>   > testTHMapFoo1 = thMapFoo1 sampleData
> 
> > class THMapFoo2 acc as where
> >       type THMapFoo2Res acc as
> >       thMapFoo2 :: acc -> as -> THMapFoo2Res acc as
> >
> > instance (TFoo a, THMapFoo2 (TFooFun a :* acc) as) => THMapFoo2 acc (a
> :* as) where
> >     type THMapFoo2Res acc (a :* as) = THMapFoo2Res (TFooFun a :* acc) as
> >     thMapFoo2 acc (x :* xs) = thMapFoo2 (tfoo x :* acc) xs
> >
> > instance THMapFoo2 acc HNil where
> >     type THMapFoo2Res acc HNil = acc
> >     thMapFoo2 acc _ = acc
> 
> The following, when enabled, takes ~0.6s. This demonstrates that the tail
> recursive transform fixes the slowdown with type families just as with
> fundeps.
> 
>   > testTHMapFoo2 = thMapFoo2 HNil sampleData
> 
> > class THMapFoo3 acc as where
> >       type THMapFoo3Res acc as
> >       thMapFoo3 :: acc -> as -> THMapFoo3Res acc as
> >
> > instance (THMapFoo3 (TFooFun a :* acc) as, TFoo a) => THMapFoo3 acc (a
> :* as) where
> >     type THMapFoo3Res acc (a :* as) = THMapFoo3Res (TFooFun a :* acc) as
> >     thMapFoo3 acc (x :* xs) = thMapFoo3 (tfoo x :* acc) xs
> >
> > instance THMapFoo3 acc HNil where
> >     type THMapFoo3Res acc HNil = acc
> >     thMapFoo3 acc _ = acc
> 
> The following, when enabled, also takes ~0.6s. This demonstrates that,
> unlike the fundep case, the order of type class constraints does not, in this
> instance, affect the performance of type families.
> 
>   > testTHMapFoo3 = thMapFoo3 HNil sampleData
> 
> --Gershom
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users




More information about the Haskell-Cafe mailing list