[Haskell-cafe] [Fwd: profiling in haskell]

Vlad Skvortsov vss at 73rus.com
Wed Sep 10 15:31:27 EDT 2008


Tim Chevalier wrote:
> 2008/9/8 Vlad Skvortsov <vss at 73rus.com>:
>   
>> Posting to cafe since I got just one reply on beginner at . It was suggested
>> that I include more SCC annotations, but that didn't help. The 'serialize' function
>> is still reported to consume about 32% of running time, 29% inherited.
>> However, functions called from it only account for about 3% of time.
>>
>>     
>
> If "serialize" calls standard library functions, this is probably
> because the profiling libraries weren't built with -auto-all -- so the
> profiling report won't tell you how much time standard library
> functions consume.
>   

Hmm, that's a good point! I didn't think about it. Though how do I make 
GHC link in profiling versions of standard libraries? My own libraries 
are built with profiling support and I run Setup.hs with 
--enable-library-profiling and --enable-executable-profiling options.

> You can rebuild the libraries with -auto-all, but probably much easier
> would be to add SCC annotations to each call site. For example, you
> could annotate your locally defined dumpWith function like so:
>
> dumpWith f = {-# SCC "foldWithKey" #-} Data.Map.foldWithKey f []
>     docToStr k (Doc { docName=n, docVectorLength=vl}) =
>     (:) ("d " ++ show k ++ " " ++ n ++ " " ++ (show vl))
>   

Here is what my current version of the function looks like:

serialize :: Database -> [[String]]
serialize db =
  {-# SCC "XXXCons" #-}
  [
    [dbFormatTag],
    ({-# SCC "dwDoc" #-} dumpWith docToStr dmap),
    ({-# SCC "dwTerm" #-} dumpWith termToStr tmap)
  ]
  where
    (dmap, tmap) = {-# SCC "XXX" #-} db

    dumpWith f =  {-# SCC "dumpWith" #-} Data.Map.foldWithKey f []

    docToStr :: DocId -> Doc -> [String] -> [String]
    docToStr k (Doc { docName=n, docVectorLength=vl}) =
       {-# SCC "docToStr" #-} ((:) ("d " ++ show k ++ " " ++ n ++ " " ++ (show vl)))

    termToStr t il =
      {-# SCC "termToStr" #-} ((:) ("t " ++ t ++ " " ++ (foldl ilItemToStr "" il)))

    ilItemToStr acc (docid, weight) =
       {-# SCC "ilItemToStr" #-} (show docid ++ ":" ++ show weight ++ " " ++ acc)


...and still I don't see these cost centers taking much time (they 
add up to about 3%, as I said before).
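
For anyone who wants to compile the function and look at the profile, here is a
self-contained sketch of it. The DocId, Doc and Database definitions and the
value of dbFormatTag are placeholders assumed for illustration, not the real
ones. It also uses foldrWithKey in place of foldWithKey, on the assumption that
the containers version at hand no longer provides the older name.

```haskell
import qualified Data.Map as Map

-- Placeholder types: assumptions for this sketch only.
type DocId = Int

data Doc = Doc { docName :: String, docVectorLength :: Double }

type Database = (Map.Map DocId Doc, Map.Map String [(DocId, Double)])

-- Placeholder tag value (assumption).
dbFormatTag :: String
dbFormatTag = "db-v1"

serialize :: Database -> [[String]]
serialize db =
  [ [dbFormatTag]
  , ({-# SCC "dwDoc" #-} dumpWith docToStr dmap)
  , ({-# SCC "dwTerm" #-} dumpWith termToStr tmap)
  ]
  where
    (dmap, tmap) = db

    -- Right fold over the map, consing one output line per entry.
    dumpWith f = {-# SCC "dumpWith" #-} Map.foldrWithKey f []

    docToStr :: DocId -> Doc -> [String] -> [String]
    docToStr k (Doc { docName = n, docVectorLength = vl }) =
      {-# SCC "docToStr" #-} (("d " ++ show k ++ " " ++ n ++ " " ++ show vl) :)

    termToStr :: String -> [(DocId, Double)] -> [String] -> [String]
    termToStr t il =
      {-# SCC "termToStr" #-} (("t " ++ t ++ " " ++ foldl ilItemToStr "" il) :)

    -- Prepends each item to the accumulator, as in the version above.
    ilItemToStr :: String -> (DocId, Double) -> String
    ilItemToStr acc (docid, weight) =
      {-# SCC "ilItemToStr" #-} (show docid ++ ":" ++ show weight ++ " " ++ acc)
```

The SCC pragmas are ignored unless the module is compiled with profiling
enabled, so the sketch also builds as ordinary code.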

> Then your profiling report will tell you how much time/memory that
> particular call to foldWithKey uses.
>
> By the way, using foldl rather than foldl' or foldr is almost always a
> performance bug

Data.Map.foldWithKey is implemented with foldr[1]; however, I'm not sure 
I understand how foldr is superior to foldl here (foldl' I understand). 
Could you shed some light on that for me, please?
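
To make the question concrete, here is a rough sketch of the three folds being
compared. The names firstThree, sumLazy and sumStrict are made up for
illustration and are not part of the code under discussion.

```haskell
import Data.List (foldl')

-- foldr with a lazy combining function (here a cons) produces output
-- incrementally, so a consumer can stop early, even on an infinite list.
firstThree :: [Int]
firstThree = take 3 (foldr (\x acc -> x * 2 : acc) [] [1 ..])

-- foldl has to walk the whole list before it returns anything and,
-- without optimisation, builds a chain of unevaluated (+) thunks.
sumLazy :: Int
sumLazy = foldl (+) 0 [1 .. 100000]

-- foldl' forces the accumulator at every step, so no thunk chain builds up.
sumStrict :: Int
sumStrict = foldl' (+) 0 [1 .. 100000]
```

Both sums give the same result; the difference is in how much memory is held
while the list is traversed. The foldr version is the only one of the three
that can return part of its result before reaching the end of the input.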

Thanks!

[1]: http://www.haskell.org/ghc/docs/latest/html/libraries/containers/src/Data-Map.html

-- 
Vlad Skvortsov, vss at 73rus.com, http://vss.73rus.com


