[GHC] #9989: GHCI is slow for precompiled code

GHC ghc-devs at haskell.org
Mon Jan 19 15:55:14 UTC 2015


#9989: GHCI is slow for precompiled code
-------------------------------------+-------------------------------------
        Reporter:  remdezx           |                   Owner:
            Type:  bug               |                  Status:  new
        Priority:  normal            |               Milestone:
       Component:  GHC API           |                 Version:  7.8.3
      Resolution:                    |                Keywords:
Operating System:  Linux             |            Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |               Test Case:
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------
Description changed by simonpj:

Old description:

> Hello! We found that loading compiled and optimised `accelerate` code
> into ghci works much much slower than code run directly.
>
> (I'm marking it as a ghc-api bug, because I noticed this behavior while
> using it)
>
> Here is the example:
> {{{#!hs
> module Main where
>
> import Data.Array.Accelerate as A
> import Data.Array.Accelerate.CUDA as C
> import Data.Time.Clock       (diffUTCTime, getCurrentTime)
>
> main :: IO ()
> main = do
>     start <- getCurrentTime
>     print $ C.run $ A.maximum $ A.map (+1) $ A.use (fromList (Z:.1000000)
> [1..1000000] :: Vector Double)
>     end   <- getCurrentTime
>     print $ diffUTCTime end start
> }}}
>
> When I compile it and run
> {{
>     $ ghc -O2 Main.hs -threaded
>     [1 of 1] Compiling Main             ( Main.hs, Main.o )
>     Linking Main ...
>     $ ./Main
>     Array (Z) [1000001.0]
>     0.162204s
> }}
> It takes `0.16 s` to finish
> When I compile it and load into ghci:
> {{{
>     $ ghc -O2 -dynamic -c Main.hs
>     $ ghci
>     GHCi, version 7.8.3: http://www.haskell.org/ghc/  :? for help
>     Loading package ghc-prim ... linking ... done.
>     Loading package integer-gmp ... linking ... done.
>     Loading package base ... linking ... done.
>     Prelude> :l Main
>     Ok, modules loaded: Main.
>     Prelude Main> main
>     Loading package transformers-0.3.0.0 ... linking ... done.
>     Loading package array-0.5.0.0 ... linking ... done.
>     Loading package deepseq-1.3.0.2 ... linking ... done.
>     Loading package bytestring-0.10.4.0 ... linking ... done.
>     Loading package containers-0.5.5.1 ... linking ... done.
>     Loading package binary-0.7.1.0 ... linking ... done.
>     Loading package pretty-1.1.1.1 ... linking ... done.
>     Loading package filepath-1.3.0.2 ... linking ... done.
>     Loading package old-locale-1.0.0.6 ... linking ... done.
>     Loading package time-1.4.2 ... linking ... done.
>     Loading package unix-2.7.0.1 ... linking ... done.
>     Loading package directory-1.2.1.0 ... linking ... done.
>     Loading package process-1.2.0.0 ... linking ... done.
>     Loading package stm-2.4.4 ... linking ... done.
>     Loading package SafeSemaphore-0.10.1 ... linking ... done.
>     Loading package mtl-2.1.3.1 ... linking ... done.
>     Loading package template-haskell ... linking ... done.
>     Loading package fclabels-2.0.2.2 ... linking ... done.
>     Loading package text-1.1.1.3 ... linking ... done.
>     Loading package hashable-1.2.3.1 ... linking ... done.
>     Loading package primitive-0.5.4.0 ... linking ... done.
>     Loading package vector-0.10.12.2 ... linking ... done.
>     Loading package hashtables-1.2.0.1 ... linking ... done.
>     Loading package unordered-containers-0.2.5.1 ... linking ... done.
>     Loading package accelerate-0.15.0.0 ... linking ... done.
>     Loading package byteable-0.1.1 ... linking ... done.
>     Loading package cryptohash-0.11.6 ... linking ... done.
>     Loading package cuda-0.6.5.1 ... linking ... done.
>     Loading package exception-transformers-0.3.0.4 ... linking ... done.
>     Loading package exception-mtl-0.3.0.5 ... linking ... done.
>     Loading package old-time-1.1.0.2 ... linking ... done.
>     Loading package polyparse-1.11 ... linking ... done.
>     Loading package cpphs-1.18.6 ... linking ... done.
>     Loading package haskell-src-exts-1.16.0.1 ... linking ... done.
>     Loading package syb-0.4.4 ... linking ... done.
>     Loading package th-lift-0.7 ... linking ... done.
>     Loading package safe-0.3.8 ... linking ... done.
>     Loading package th-expand-syns-0.3.0.4 ... linking ... done.
>     Loading package th-reify-many-0.1.2 ... linking ... done.
>     Loading package th-orphans-0.8.3 ... linking ... done.
>     Loading package haskell-src-meta-0.6.0.8 ... linking ... done.
>     Loading package srcloc-0.4.1 ... linking ... done.
>     Loading package mainland-pretty-0.2.7 ... linking ... done.
>     Loading package symbol-0.2.4 ... linking ... done.
>     Loading package language-c-quote-0.8.0 ... linking ... done.
>     Loading package accelerate-cuda-0.15.0.0 ... linking ... done.
>     Array (Z) [1000001.0]
>     0.256128s
> }}}
> It takes `0.26 s` to finish. On other computer, using `criterion`, we
> observed even a `50x` difference.
>
> Why is it working so slow, isn't ghci just to load function and simply
> call it?
>
> Here is how execution time changes for different matrix sizes. For
> comparison we have also measured time of interpreted code (without
> precompiling).
>
> {{{
>     size     | compiled  | precompiled  | interpreted
>     ---------+-----------+--------------+------------
>     100      | 0.054076s | 0.082686s    | 0.151857s
>     1000     | 0.054509s | 0.08305s     | 0.135452s
>     10000    | 0.055405s | 0.08469s     | 0.12632s
>     100000   | 0.057768s | 0.093011s    | 0.155359s
>     1000000  | 0.089811s | 0.251359s    | 0.202022s
>     10000000 | 0.397642s | 1.400603s    | 0.898547s
> }}}
>
> We believe that problem lies on the side of `ghci` rather than
> `accelerate`, how could we confirm this?
>
> Moreover and even more important is there any workaround for it, if we
> need this precompiled code running fast in production environment?
>
> Any guidance will be appreciated.

New description:

 Hello! We found that loading compiled and optimised `accelerate` code into
 ghci works much much slower than code run directly.

 (I'm marking it as a ghc-api bug, because I noticed this behavior while
 using it)

 Here is the example:
 {{{#!hs
 module Main where

 import Data.Array.Accelerate as A
 import Data.Array.Accelerate.CUDA as C
 import Data.Time.Clock       (diffUTCTime, getCurrentTime)

 main :: IO ()
 main = do
     start <- getCurrentTime
     print $ C.run $ A.maximum $ A.map (+1) $ A.use (fromList (Z:.1000000)
 [1..1000000] :: Vector Double)
     end   <- getCurrentTime
     print $ diffUTCTime end start
 }}}

 When I compile it and run
 {{{
     $ ghc -O2 Main.hs -threaded
     [1 of 1] Compiling Main             ( Main.hs, Main.o )
     Linking Main ...
     $ ./Main
     Array (Z) [1000001.0]
     0.162204s
 }}}
 It takes `0.16 s` to finish
 When I compile it and load into ghci:
 {{{
     $ ghc -O2 -dynamic -c Main.hs
     $ ghci
     GHCi, version 7.8.3: http://www.haskell.org/ghc/  :? for help
     Loading package ghc-prim ... linking ... done.
     Loading package integer-gmp ... linking ... done.
     Loading package base ... linking ... done.
     Prelude> :l Main
     Ok, modules loaded: Main.
     Prelude Main> main
     Loading package transformers-0.3.0.0 ... linking ... done.
     Loading package array-0.5.0.0 ... linking ... done.
     Loading package deepseq-1.3.0.2 ... linking ... done.
     Loading package bytestring-0.10.4.0 ... linking ... done.
     Loading package containers-0.5.5.1 ... linking ... done.
     Loading package binary-0.7.1.0 ... linking ... done.
     Loading package pretty-1.1.1.1 ... linking ... done.
     Loading package filepath-1.3.0.2 ... linking ... done.
     Loading package old-locale-1.0.0.6 ... linking ... done.
     Loading package time-1.4.2 ... linking ... done.
     Loading package unix-2.7.0.1 ... linking ... done.
     Loading package directory-1.2.1.0 ... linking ... done.
     Loading package process-1.2.0.0 ... linking ... done.
     Loading package stm-2.4.4 ... linking ... done.
     Loading package SafeSemaphore-0.10.1 ... linking ... done.
     Loading package mtl-2.1.3.1 ... linking ... done.
     Loading package template-haskell ... linking ... done.
     Loading package fclabels-2.0.2.2 ... linking ... done.
     Loading package text-1.1.1.3 ... linking ... done.
     Loading package hashable-1.2.3.1 ... linking ... done.
     Loading package primitive-0.5.4.0 ... linking ... done.
     Loading package vector-0.10.12.2 ... linking ... done.
     Loading package hashtables-1.2.0.1 ... linking ... done.
     Loading package unordered-containers-0.2.5.1 ... linking ... done.
     Loading package accelerate-0.15.0.0 ... linking ... done.
     Loading package byteable-0.1.1 ... linking ... done.
     Loading package cryptohash-0.11.6 ... linking ... done.
     Loading package cuda-0.6.5.1 ... linking ... done.
     Loading package exception-transformers-0.3.0.4 ... linking ... done.
     Loading package exception-mtl-0.3.0.5 ... linking ... done.
     Loading package old-time-1.1.0.2 ... linking ... done.
     Loading package polyparse-1.11 ... linking ... done.
     Loading package cpphs-1.18.6 ... linking ... done.
     Loading package haskell-src-exts-1.16.0.1 ... linking ... done.
     Loading package syb-0.4.4 ... linking ... done.
     Loading package th-lift-0.7 ... linking ... done.
     Loading package safe-0.3.8 ... linking ... done.
     Loading package th-expand-syns-0.3.0.4 ... linking ... done.
     Loading package th-reify-many-0.1.2 ... linking ... done.
     Loading package th-orphans-0.8.3 ... linking ... done.
     Loading package haskell-src-meta-0.6.0.8 ... linking ... done.
     Loading package srcloc-0.4.1 ... linking ... done.
     Loading package mainland-pretty-0.2.7 ... linking ... done.
     Loading package symbol-0.2.4 ... linking ... done.
     Loading package language-c-quote-0.8.0 ... linking ... done.
     Loading package accelerate-cuda-0.15.0.0 ... linking ... done.
     Array (Z) [1000001.0]
     0.256128s
 }}}
 It takes `0.26 s` to finish. On other computer, using `criterion`, we
 observed even a `50x` difference.

 Why is it working so slow, isn't ghci just to load function and simply
 call it?

 Here is how execution time changes for different matrix sizes. For
 comparison we have also measured time of interpreted code (without
 precompiling).

 {{{
     size     | compiled  | precompiled  | interpreted
     ---------+-----------+--------------+------------
     100      | 0.054076s | 0.082686s    | 0.151857s
     1000     | 0.054509s | 0.08305s     | 0.135452s
     10000    | 0.055405s | 0.08469s     | 0.12632s
     100000   | 0.057768s | 0.093011s    | 0.155359s
     1000000  | 0.089811s | 0.251359s    | 0.202022s
     10000000 | 0.397642s | 1.400603s    | 0.898547s
 }}}

 We believe that problem lies on the side of `ghci` rather than
 `accelerate`, how could we confirm this?

 Moreover and even more important is there any workaround for it, if we
 need this precompiled code running fast in production environment?

 Any guidance will be appreciated.

--

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9989#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list