[Haskell-cafe] Trace parser

Andy Georges andy.georges at elis.ugent.be
Thu Jul 6 07:06:41 EDT 2006


Hello,

I'm looking for a bit of help (ok, a lot) the speed up my program  
which I use to build a calltree out of an annotated program execution  
trace. To give you an idea about the sluggishness at the moment, for  
a trace containing 70MB, it has been running for about 10 hours  
straight (AMD Athlon XP (Barton) 2600+).

The trace contains lines made up of a number of fields:

C 4 1000 1000000
C 4 1001 1000200
R 4 1001 1003045
R 4 1000 1003060

C indicates a function entrypoint (call), R indicates a function  
exitpoint (return). The second field indicates which thread is  
executing the function, the third field denotes the function id, the  
last field contains a performance counter value. As you can see,  
numbering each line with a pre-order and a post-order number yields a  
list that can be transformed easily into a tree, which can then be  
manipulated. The first goal is to build the tree. This is done in the  
following code:


data ParserState = ParserState { methodStack :: !ThreadMap
                                , methodQueue :: !ThreadMap
                                , pre         :: !Integer
                                , post        :: !Integer
                                , methodMap   :: !MethodMap
                                , currentThread :: !Integer
                                } deriving (Show)

initialParserState :: ParserState
initialParserState = ParserState e e 0 0 e 0
   where e = M.empty :: Map Integer a

readInteger :: B.ByteString -> Integer
readInteger = fromIntegral . fst . fromJust . B.readInt


parseTraceMonadic :: [B.ByteString] -> ParserState
parseTraceMonadic ss = state { methodQueue = M.map reverse  
(methodQueue state) }
   where state = execState (mapM_ (\x -> modify (updateState x) >>  
get >>= (`seq` return ())) ss) initialParserState


updateState :: B.ByteString -> ParserState -> ParserState
updateState s state = case (B.unpack $ head fields) of
   "M" -> updateStateMethod     fields state
   "E" -> updateStateException  fields state
   "C" -> updateStateEntry      fields state
   "R" -> updateStateExit       fields state
   where fields = B.splitWith (== ' ') s


updateStateMethod :: [B.ByteString] -> ParserState -> ParserState
updateStateMethod (_:methodId:methodName:_) state = state { methodMap  
= M.insert (readInteger methodId) methodName (methodMap state) }

updateStateException :: [B.ByteString] -> ParserState -> ParserState
updateStateException _ state = state

updateStateEntry :: [B.ByteString] -> ParserState -> ParserState
updateStateEntry (_:ss) state = {-Debug.Trace.trace ("before: " ++  
(show state) ++ "\nafter: " ++ (show newstate)) $-} newstate
   where newstate = state { methodStack = updateMap thread  
(methodStack state) (\x y -> Just (x:y)) (pre state, 0, method)
                                       , pre = ((+1) $! (pre state))
                                       }
         method = mkMethod (Prelude.map B.unpack ss)
         thread = Method.thread method

updateStateExit :: [B.ByteString] -> ParserState -> ParserState
updateStateExit (_:ss) state = {-Debug.Trace.trace ("before: " ++  
(show state)) $-} case updateMethod m (Prelude.map B.unpack ss) of
                                    Just um -> state { methodStack =  
M.update (\x -> Just (tail x)) thread (methodStack state)
                                                     , methodQueue =  
updateMap thread (methodQueue state) (\x y -> Just (x:y)) (pre_, post  
state, um)
                                                     , post = ((+1)  
$! (post state))
                                                     }
                                    Nothing -> error $ "Top of the  
stack is mismatching! Expected " ++ (show m) ++ " yet got " ++ (show  
ss) ++ "\n" ++ (show state)
   where method = mkMethod (Prelude.map B.unpack ss)
         thread = Method.thread method
         (pre_, _, m) = case M.lookup thread (methodStack state) of
                           Just stack -> head stack
                           Nothing    -> error $ "Method stack has  
not been found for thread " ++ (show thread) ++ " -> fields: " ++  
(show ss)


updateMap key map f value = case M.member key map of
                               True  -> M.update (f value) key map
                               False -> M.insert key [value] map

As you can see, the state is updated for each entry, a stack being  
maintained with methods we've seen up to now, and a list with methods  
that have received both pre and post order numbers, and of which both  
the entry and exit point have been parsed. I am using a ByteString,  
because using a plain String is causing the program to grab far too  
much heap.

The mkMethod yields a Method like this:


data Method = Method { mid :: Integer
                      , thread :: Integer
                      , instruction_entry :: Integer
                      , instruction_exit :: Integer
                      } deriving (Eq, Show)

eM = Method 0 0 0 0

mkMethod :: [String] -> Method
mkMethod s = let [_thread, _id, _entry] = take 3 $ map (read ::  
String -> Integer) s
              in [_thread, _id, _entry] `seq` Method { mid = _id
                                                     , thread = _thread
                                                     ,  
instruction_entry = _entry
                                                     ,  
instruction_exit = 0
                                                     }

updateMethod :: Method -> [String] -> Maybe Method
updateMethod (Method mid thread instruction_entry instruction_exit ) s
   | thread == _thread && mid == _id = _exit `seq` Just Method { mid  
= mid
                                                               ,  
thread = thread
                                                               ,  
instruction_entry = instruction_entry
                                                               ,  
instruction_exit = _exit
                                                               }
   | otherwise = Nothing
   where [_thread, _id, _exit] = take 3 $ map (read :: String ->  
Integer) s


Any suggestions for improving this code?

Thanks,

Andy




More information about the Haskell-Cafe mailing list