[Haskell-beginners] Performance problem

Lorenzo Bolla lbolla at gmail.com
Wed May 30 12:19:56 CEST 2012


I got a 40% speed improvement by simply changing "foldl" with "foldr"...

        $ diff p.hs p2.hs
        54c54
        <   let test = (\x -> foldl (&&) True $ map ($x) $ makePredicate qs)
        ---
        >   let test = (\x -> List.foldr (&&) True $ map ($x) $
makePredicate qs)
        66a67
        >

Profiling shows that most of the time is spent in the "query" cmd, in
particular, there is a lot of memory allocation going on: I would
concentrate on that.
(My test set is 10000 inserts and 4000 queries).

Wed May 30 11:17 2012 Time and Allocation Profiling Report  (Final)

   p2 +RTS -hc -p -RTS

total time  =        1.88 secs   (1883 ticks @ 1000 us, 1 processor)
total alloc = 170,594,304 bytes  (excludes profiling overheads)

COST CENTRE     MODULE    %time %alloc

doQuery.result  Main       45.7   61.0
makePredicate.\ Main       18.2    0.0
makePredicate   Main       11.6    1.2
doQuery.test.\  Main       10.6    0.0
doQuery.test    Main        8.3    0.0
main            Main        2.7   23.4
doAction        Main        1.3    6.5
doQuery         Main        0.8    4.5
doInsert.id     Main        0.7    2.5



 individual     inherited
COST CENTRE              MODULE                  no.     entries  %time
%alloc   %time %alloc

MAIN                     MAIN                     61           0    0.0
 0.0   100.0  100.0
 main                    Main                    123           0    2.7
23.4    99.9  100.0
  seqAction              Main                    124       14000    0.1
 0.8    97.3   76.5
   doAction              Main                    125       13999    1.3
 6.5    97.2   75.8
    doQuery              Main                    128       13000    0.8
 4.5    95.2   66.7
     doQuery.test        Main                    130       13000    0.0
 0.0     0.5    1.2
      doQuery.test.\     Main                    133           0    0.0
 0.0     0.5    1.2
       makePredicate     Main                    134       13000    0.5
 1.2     0.5    1.2
     doQuery.result      Main                    129       13000   45.7
61.0    93.9   61.0
      doQuery.test       Main                    131           0    8.3
 0.0    48.2    0.0
       doQuery.test.\    Main                    132    12987000   10.6
 0.0    39.9    0.0
        makePredicate    Main                    137           0   11.1
 0.0    29.3    0.0
         makePredicate.\ Main                    138    12987000   18.2
 0.0    18.2    0.0
    doInsert             Main                    126         999    0.0
 0.1     0.7    2.6
     doInsert.id         Main                    127         999    0.7
 2.5     0.7    2.5
 CAF:main1               Main                    120           0    0.0
 0.0     0.0    0.0
  main                   Main                    122           1    0.0
 0.0     0.0    0.0
 CAF:lvl1_r2dS           Main                    108           0    0.0
 0.0     0.0    0.0
  makePredicate          Main                    136           0    0.0
 0.0     0.0    0.0
 CAF:lvl_r2dR            Main                    107           0    0.0
 0.0     0.0    0.0
  makePredicate          Main                    135           0    0.0
 0.0     0.0    0.0
 CAF                     GHC.IO.Handle.FD        105           0    0.1
 0.0     0.1    0.0
 CAF                     GHC.Conc.Signal          99           0    0.0
 0.0     0.0    0.0
 CAF                     GHC.IO.Encoding          95           0    0.0
 0.0     0.0    0.0
 CAF                     Text.Read.Lex            91           0    0.0
 0.0     0.0    0.0
 CAF                     GHC.IO.Encoding.Iconv    89           0    0.0
 0.0     0.0    0.0


L.


On Wed, May 30, 2012 at 10:07 AM, Radosław Szymczyszyn <lavrin at gmail.com>wrote:

> Hello!
>
> I've had a similar problem with text processing discussed on the list
> some time ago (a topic about implementing a spellchecker). To keep
> things short'n'simple: built-in Haskell Strings are inefficient as
> they're simply lists of Chars, i.e. a String is in fact just a [Char].
>
> The usually suggested solution to this problem is using the ByteString
> type which comes from bytestring package. It's probably all nice when
> you only need ASCII/Latin encodings, but it bite me when processing
> Unicode (e.g. Data.ByteString.UTF8 doesn't have a words function,
> though Data.ByteString has one). However, the performance is good.
>
> The best solution as far as I have researched is the text package and
> type Text. It ought to support Unicode as far as I remember and has
> got all the useful list-like functions. As I hadn't yet had an
> occasion to play with it before, I took your code and adapted it to
> use Text and Text.IO. Let me know what are the results, as I haven't
> got any test set to compare the speed before and after the
> modifications.
>
> === CODE
>
> -- Problem id: HASHADQI
>
> import qualified Data.List as List
> import qualified Data.IntMap as Map
> import Data.Maybe
>
> import Data.Text (Text)
> import qualified Data.Text    as T
> import qualified Data.Text.IO as T
>
> type Person = (Text,Text,Text,Text)
> type IntPersonMap = Map.IntMap Person
>
> main = do
>  input <- T.getContents
>  seqAction Map.empty $ T.lines input
>
> seqAction :: IntPersonMap -> [Text] -> IO IntPersonMap
> seqAction m [] = return m
> seqAction m (l:ls) = do
>  m' <- doAction m l
>  seqAction m' ls
>
> doAction :: IntPersonMap -> Text -> IO IntPersonMap
> doAction m cmd = do
>  case T.unpack (T.take 1 cmd) of
>    "a" -> doInsert m $ T.words cmd
>    "d" -> doDelete m $ T.words cmd
>    "i" -> doInfo m $ T.words cmd
>    "q" -> doQuery m $ T.words cmd
>    [] -> return m
>
> doInsert :: IntPersonMap -> [Text] -> IO IntPersonMap
> doInsert m [_, idText, fn, ln, bd, pn] = do
>  let id = read (T.unpack idText) :: Int
>  if Map.member id m
>    then do putStrLn $ "ID " ++ show id ++ " ja cadastrado."
>            return m
>    else return (Map.insert id (fn, ln, bd, pn) m)
>
> doDelete :: IntPersonMap -> [Text] -> IO IntPersonMap
> doDelete m [_, idText] = do
>  let id = read (T.unpack idText) :: Int
>  if Map.member id m
>    then return (Map.delete id m)
>    else do putStrLn $ "ID " ++ show id ++ " nao existente."
>            return m
>
> doInfo :: IntPersonMap -> [Text] -> IO IntPersonMap
> doInfo m [_, idText] = do
>  let id = read (T.unpack idText) :: Int
>  case Map.lookup id m of
>    Just (fn, ln, bd, pn) -> do putStrLn . show $ T.unwords [fn, ln, bd, pn]
>                                return m
>    Nothing -> do putStrLn $ "ID " ++ show id ++ " nao existente."
>                  return m
>
> doQuery :: IntPersonMap -> [Text] -> IO IntPersonMap
> doQuery m (_:qs) = do
>  let test = (\x -> foldl (&&) True $ map ($x) $ makePredicate qs)
>      result = Map.filter test m
>  putStrLn $ unwords . map show $ Map.keys result
>  return m
>
> makePredicate :: [Text] -> [(Person -> Bool)]
> makePredicate [] = []
> makePredicate (q:qs) =
>  case (\(a,b) -> (T.unpack a, b)) (T.break (==':') q) of
>    ("fn", x) -> (\(fn,_,_,_) -> fn == (T.drop 1 x)) : (makePredicate qs)
>    ("ln", x) -> (\(_,ln,_,_) -> ln == (T.drop 1 x)) : (makePredicate qs)
>    ("bd", x) -> (\(_,_,bd,_) -> bd == (T.drop 1 x)) : (makePredicate qs)
>    ("pn", x) -> (\(_,_,_,pn) -> pn == (T.drop 1 x)) : (makePredicate qs)
>
> === END CODE
>
> Regards,
> Radek Szymczyszyn
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20120530/21d4bb68/attachment.htm>


More information about the Beginners mailing list