[GHC] #13390: Strict literal float-out during desugaring regresses T1969 at -O0

Tue Mar 7 19:06:56 UTC 2017

#13390: Strict literal float-out during desugaring regresses T1969 at -O0
-------------------------------------+-------------------------------------
        Reporter:  bgamari           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.2.1
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Description changed by bgamari:

Old description:

> Phab:D1259, which teaches the compiler to aggressively float-out string
> literals during desugaring, regresses compiler allocations on `T1969` by
> 15% or so at -O0`.
>
> = The problem =
> In the case of `T1969` (compiled with `-O0`) the difference is quite
> stark: with floating the non-optimizing simplifier pass produces `{terms:
> 16,893, types: 7,552, coercions: 0, joins: 0/0}`, without it produces
> `{terms: 12,693, types: 4,552, coercions: 0, joins: 0/0}`.
>
> The (minimized) test looks like,
> {{{#!hs
> module T1969 where
>
> class C a where
>     c :: a -> String
>     d :: a -> String
>     d x = c x
>     e :: a -> String
>     e x = c x
>
> data A1 = A1
> instance C A1 where
>     c A1 = "A1"
> }}}
>
> == Post-desugar ==
> The reason for the regression is in part due to the fact that we float
> out the `unpackCString# "An"` expression. That is, after desugaring
> without floating we get (looking at just the `A1` bindings),
> {{{#!hs
> T1969.$dme :: forall a. C a => a -> String
> T1969.$dme = \ (@ a_aM1) ($dC_a1h9 :: C a_aM1) (x_aM3 :: a_aM1) ->
>       c @ a_aM1 $dC_a1h9 x_aM3
>
> -- same as $dme
> T1969.$dmd :: forall a. C a => a -> String
> T1969.$dmd = \ (@ a_aM1) ($dC_a1h9 :: C a_aM1) (x_aM2 :: a_aM1) ->
>       c @ a_aM1 $dC_a1h9 x_aM2
>
> $cc_a1i7 :: A1 -> String
> $cc_a1i7= \ (ds_d1jJ :: A1) -> case ds_d1jJ of { A1 ->
> GHC.CString.unpackCString# "A1"# }
>
> Rec {
> T1969.$fCA3 :: C A3
> T1969.$fCA3 = T1969.C:C @ A3 $cc_a1hl $cd_a1hp $ce_a1hy
>
> $ce_a1hy :: A3 -> String
> $ce_a1hy = T1969.$dme @ A3 T1969.$fCA3
>
> $cd_a1hp :: A3 -> String
> $cd_a1hp = T1969.$dmd @ A3 T1969.$fCA3
> end Rec }
> }}}
>
> Whereas with floating we get,
> {{{#!hs
> -- same as above
> T1969.$dme :: forall a. C a => a -> String
> T1969.$dmd :: forall a. C a => a -> String
>
> ds_d1k4 :: [Char]
> ds_d1k4 = GHC.CString.unpackCString# "A1"#
>
> $cc_a1i7 :: A1 -> String
> $cc_a1i7 = \ (ds_d1k3 :: A1) -> case ds_d1k3 of { A1 -> ds_d1k4 }
>
> Rec {
> T1969.$fCA1 :: C A1
> T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cd_a1ib $ce_a1ik
>
> $ce_a1ik :: A1 -> String
> $ce_a1ik = T1969.$dme @ A1 T1969.$fCA1
>
> $cd_a1ib :: A1 -> String
> $cd_a1ib = T1969.$dmd @ A1 T1969.$fCA1
> end Rec }
> }}}
>
> So far things aren't so bad: the only interesting difference is the
> floated `[Char]`, which we would expect. However, let's then see what
> happens during simplification.
>
> == Post-simplification ==
> Without floating we see,
> {{{#!hs
> T1969.$fCA1 :: C A1
> T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cc_a1i7 $cc_a1i7
>
> $cc_a1i7 :: A1 -> String
> $cc_a1i7
>   = \ (ds_d1jJ :: A1) ->
>       case ds_d1jJ of { A1 -> GHC.CString.unpackCString# "A1"# }
> }}}
>
> Whereas with floating we have,
> {{{#!hs
> ds_d1k4 :: [Char]
> ds_d1k4 = GHC.CString.unpackCString# "A1"#
>
> $cc_a1i7 :: A1 -> String
> $cc_a1i7 = \ (ds_d1k3 :: A1) -> case ds_d1k3 of { A1 -> ds_d1k4 }
>
> $cd_a1ib :: A1 -> String
> $cd_a1ib = \ (x_aM2 :: A1) -> case x_aM2 of { A1 -> ds_d1k4 }
>
> $ce_a1ik :: A1 -> String
> $ce_a1ik = \ (x_aM3 :: A1) -> case x_aM3 of { A1 -> ds_d1k4 }
>
> T1969.$fCA1 :: C A1
> T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cd_a1ib $ce_a1ik
> }}}
>
> This is quite interesting: without floating we are somehow able to
> collapse each of the `A1 -> String` bindings into a single binding
> (despite CSE being disabled due to `-O0`!).

New description:

 Phab:D1259, which teaches the compiler to aggressively float-out string
 literals during desugaring (namely `DsMonad.mkStringExprFSAtTopLevel`),
 regresses compiler allocations on `T1969` by 15% or so at `-O0`.

 = The problem =
 In the case of `T1969` (compiled with `-O0`) the difference is quite
 stark: with floating the non-optimizing simplifier pass produces `{terms:
 16,893, types: 7,552, coercions: 0, joins: 0/0}`, without it produces
 `{terms: 12,693, types: 4,552, coercions: 0, joins: 0/0}`.

 The (minimized) test looks like,
 {{{#!hs
 module T1969 where

 class C a where
     c :: a -> String
     d :: a -> String
     d x = c x
     e :: a -> String
     e x = c x

 data A1 = A1
 instance C A1 where
     c A1 = "A1"
 }}}
 This reduced program simplifies to 261 terms and 127 types with float-out
 and 219 terms, 97 types without.

 == Post-desugar ==
 The reason for the regression is in part due to the fact that we float out
 the `unpackCString# "An"` expression. That is, after desugaring without
 floating we get (looking at just the `A1` bindings),
 {{{#!hs
 T1969.$dme :: forall a. C a => a -> String
 T1969.$dme = \ (@ a_aM1) ($dC_a1h9 :: C a_aM1) (x_aM3 :: a_aM1) ->
       c @ a_aM1 $dC_a1h9 x_aM3

 -- same as $dme
 T1969.$dmd :: forall a. C a => a -> String
 T1969.$dmd = \ (@ a_aM1) ($dC_a1h9 :: C a_aM1) (x_aM2 :: a_aM1) ->
       c @ a_aM1 $dC_a1h9 x_aM2

 $cc_a1i7 :: A1 -> String
 $cc_a1i7= \ (ds_d1jJ :: A1) -> case ds_d1jJ of { A1 ->
 GHC.CString.unpackCString# "A1"# }

 Rec {
 T1969.$fCA3 :: C A3
 T1969.$fCA3 = T1969.C:C @ A3 $cc_a1hl $cd_a1hp $ce_a1hy

 $ce_a1hy :: A3 -> String
 $ce_a1hy = T1969.$dme @ A3 T1969.$fCA3

 $cd_a1hp :: A3 -> String
 $cd_a1hp = T1969.$dmd @ A3 T1969.$fCA3
 end Rec }
 }}}

 Whereas with floating we get,
 {{{#!hs
 -- same as above
 T1969.$dme :: forall a. C a => a -> String
 T1969.$dmd :: forall a. C a => a -> String

 ds_d1k4 :: [Char]
 ds_d1k4 = GHC.CString.unpackCString# "A1"#

 $cc_a1i7 :: A1 -> String
 $cc_a1i7 = \ (ds_d1k3 :: A1) -> case ds_d1k3 of { A1 -> ds_d1k4 }

 Rec {
 T1969.$fCA1 :: C A1
 T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cd_a1ib $ce_a1ik

 $ce_a1ik :: A1 -> String
 $ce_a1ik = T1969.$dme @ A1 T1969.$fCA1

 $cd_a1ib :: A1 -> String
 $cd_a1ib = T1969.$dmd @ A1 T1969.$fCA1
 end Rec }
 }}}

 So far things aren't so bad: the only interesting difference is the
 floated `[Char]`, which we would expect. However, let's then see what
 happens during simplification.

 == Post-simplification ==
 Without floating we see,
 {{{#!hs
 T1969.$fCA1 :: C A1
 T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cc_a1i7 $cc_a1i7

 $cc_a1i7 :: A1 -> String
 $cc_a1i7
   = \ (ds_d1jJ :: A1) ->
       case ds_d1jJ of { A1 -> GHC.CString.unpackCString# "A1"# }
 }}}

 Whereas with floating we have,
 {{{#!hs
 ds_d1k4 :: [Char]
 ds_d1k4 = GHC.CString.unpackCString# "A1"#

 $cc_a1i7 :: A1 -> String
 $cc_a1i7 = \ (ds_d1k3 :: A1) -> case ds_d1k3 of { A1 -> ds_d1k4 }

 $cd_a1ib :: A1 -> String
 $cd_a1ib = \ (x_aM2 :: A1) -> case x_aM2 of { A1 -> ds_d1k4 }

 $ce_a1ik :: A1 -> String
 $ce_a1ik = \ (x_aM3 :: A1) -> case x_aM3 of { A1 -> ds_d1k4 }

 T1969.$fCA1 :: C A1
 T1969.$fCA1 = T1969.C:C @ A1 $cc_a1i7 $cd_a1ib $ce_a1ik
 }}}

 This is quite interesting: without floating we are somehow able to
 collapse each of the `A1 -> String` bindings into a single binding
 (despite CSE being disabled due to `-O0`!).

--

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13390#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler