[Haskell-cafe] Code runs 7x FASTER with profiling

Thu Dec 7 06:09:30 UTC 2017

I was writing a simple utility and I decided to use regexps to parse 
filenames. (I know, now I have two problems :-) )

I was surprised at how slowly it ran, so I did a profiling build. The 
profiled build runs reasonably quickly, about 7x faster than the normal 
build, which makes it hard to figure out where the slowdown is happening 
in the non-profiled code. I’m wondering if I’m doing something wrong, or 
if there’s a bug in |regex-tdfa| or in ghc.

I’ve pared my code down to just the following:

    import Text.Regex.TDFA ((=~))

    main :: IO ()
    main = do
        entries <- map parseFilename . lines <$> getContents
        let check (Right (_, t)) = last t == 'Z'
            check _ = False
        print $ all check entries

    parseFilename :: String -> Either String (String, String)
    parseFilename fn = case (fn =~ pattern :: [[String]]) of
        [[_, full, _, time]] -> Right $ (full, time)
        _ -> Left fn
      where pattern = "^\\./duplicity-(full|inc|new)(-signatures)?\\.\
                      \([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]Z)\\."

The relevant part of my |.cabal| file looks like this:

    executable DuplicityAnalyzer
      main-is:          DuplicityAnalyzer.hs
      build-depends:    base >=4.6 && <4.11, regex-tdfa >= 1.0 && <1.3
      default-language: Haskell2010
      ghc-options:      -Wall -rtsopts

To run the profiling, I do:

    cabal clean
    cabal configure --enable-profiling
    cabal build
    dist/build/DuplicityAnalyzer/DuplicityAnalyzer <names.in +RTS -sprofiling-summary.out -p

The |MUT| time in the non-profiling build is 7x bigger, and the |%GC| 
time goes from 8% to 21%. I’ve put the actual output in a gist 
<https://gist.github.com/neilmayhew/247a30738c0e294902e7c2830ca2c6f5>. 
I’ve also put my test input file there, in case anyone wants to try this 
themselves.

I’ve done my testing with NixOS (ghc 8.0.2) and Debian with the Haskell 
Platform (ghc 8.2.1) and the results are basically the same. I even 
tried using Docker containers with Debian Jessie and Debian Stretch, 
just to eliminate any OS influence, and the results are still the same. 
I’ve tried it on an i5-2500K, i5-3317U and Xeon E5-1620.

I also wrote a dummy implementation of |=~| that ignores the regex 
pattern and does a hard-coded manual parse, and that produces times just 
slightly better than the profiled ones. So I don’t think there’s a 
problem in my outer code that uses |=~|.
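
For anyone who wants to repeat that comparison, here is a rough sketch of what a hard-coded parse could look like. It is not the dummy |=~| described above; it is a hypothetical drop-in for |parseFilename| that accepts the same filename shape as the regex, and the names |parseFilenameManual|, |firstPrefix| and |isTimestamp| are made up for this illustration.

```haskell
import Control.Monad (guard)
import Data.Char (isDigit)
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe, listToMaybe)

-- Hypothetical regex-free stand-in for parseFilename.
-- Accepts "./duplicity-(full|inc|new)(-signatures)?.<timestamp>." followed
-- by anything, and returns the kind and the timestamp.
parseFilenameManual :: String -> Either String (String, String)
parseFilenameManual fn = maybe (Left fn) Right $ do
    rest0 <- stripPrefix "./duplicity-" fn
    (kind, rest1) <- firstPrefix ["full", "inc", "new"] rest0
    -- The "-signatures" part is optional, so fall back to rest1 if absent.
    let rest2 = fromMaybe rest1 (stripPrefix "-signatures" rest1)
    rest3 <- stripPrefix "." rest2
    let (stamp, rest4) = splitAt 16 rest3
    -- The timestamp must be well formed and followed by a literal dot.
    guard (isTimestamp stamp && take 1 rest4 == ".")
    Just (kind, stamp)

-- Return the first alternative that is a prefix of the input, together
-- with the remainder of the input after that prefix.
firstPrefix :: [String] -> String -> Maybe (String, String)
firstPrefix alts s =
    listToMaybe [ (a, rest) | a <- alts, Just rest <- [stripPrefix a s] ]

-- Eight digits, 'T', six digits, 'Z' (16 characters in total).
isTimestamp :: String -> Bool
isTimestamp s =
    length s == 16
        && all isDigit date && sep == "T"
        && all isDigit time && end == "Z"
  where
    (date, r1)  = splitAt 8 s
    (sep, r2)   = splitAt 1 r1
    (time, end) = splitAt 6 r2
```

Swapping this in for |parseFilename| in |main| removes |regex-tdfa| from the picture entirely, so any remaining gap between the profiled and non-profiled builds would have to come from the outer code.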
