[Haskell-cafe] Trace parser
Andy Georges
andy.georges at elis.ugent.be
Thu Jul 6 07:06:41 EDT 2006
Hello,
I'm looking for a bit of help (ok, a lot) the speed up my program
which I use to build a calltree out of an annotated program execution
trace. To give you an idea about the sluggishness at the moment, for
a trace containing 70MB, it has been running for about 10 hours
straight (AMD Athlon XP (Barton) 2600+).
The trace contains lines made up of a number of fields:
C 4 1000 1000000
C 4 1001 1000200
R 4 1001 1003045
R 4 1000 1003060
C indicates a function entrypoint (call), R indicates a function
exitpoint (return). The second field indicates which thread is
executing the function, the third field denotes the function id, the
last field contains a performance counter value. As you can see,
numbering each line with a pre-order and a post-order number yields a
list that can be transformed easily into a tree, which can then be
manipulated. The first goal is to build the tree. This is done in the
following code:
data ParserState = ParserState { methodStack :: !ThreadMap
, methodQueue :: !ThreadMap
, pre :: !Integer
, post :: !Integer
, methodMap :: !MethodMap
, currentThread :: !Integer
} deriving (Show)
initialParserState :: ParserState
initialParserState = ParserState e e 0 0 e 0
where e = M.empty :: Map Integer a
readInteger :: B.ByteString -> Integer
readInteger = fromIntegral . fst . fromJust . B.readInt
parseTraceMonadic :: [B.ByteString] -> ParserState
parseTraceMonadic ss = state { methodQueue = M.map reverse
(methodQueue state) }
where state = execState (mapM_ (\x -> modify (updateState x) >>
get >>= (`seq` return ())) ss) initialParserState
updateState :: B.ByteString -> ParserState -> ParserState
updateState s state = case (B.unpack $ head fields) of
"M" -> updateStateMethod fields state
"E" -> updateStateException fields state
"C" -> updateStateEntry fields state
"R" -> updateStateExit fields state
where fields = B.splitWith (== ' ') s
updateStateMethod :: [B.ByteString] -> ParserState -> ParserState
updateStateMethod (_:methodId:methodName:_) state = state { methodMap
= M.insert (readInteger methodId) methodName (methodMap state) }
updateStateException :: [B.ByteString] -> ParserState -> ParserState
updateStateException _ state = state
updateStateEntry :: [B.ByteString] -> ParserState -> ParserState
updateStateEntry (_:ss) state = {-Debug.Trace.trace ("before: " ++
(show state) ++ "\nafter: " ++ (show newstate)) $-} newstate
where newstate = state { methodStack = updateMap thread
(methodStack state) (\x y -> Just (x:y)) (pre state, 0, method)
, pre = ((+1) $! (pre state))
}
method = mkMethod (Prelude.map B.unpack ss)
thread = Method.thread method
updateStateExit :: [B.ByteString] -> ParserState -> ParserState
updateStateExit (_:ss) state = {-Debug.Trace.trace ("before: " ++
(show state)) $-} case updateMethod m (Prelude.map B.unpack ss) of
Just um -> state { methodStack =
M.update (\x -> Just (tail x)) thread (methodStack state)
, methodQueue =
updateMap thread (methodQueue state) (\x y -> Just (x:y)) (pre_, post
state, um)
, post = ((+1)
$! (post state))
}
Nothing -> error $ "Top of the
stack is mismatching! Expected " ++ (show m) ++ " yet got " ++ (show
ss) ++ "\n" ++ (show state)
where method = mkMethod (Prelude.map B.unpack ss)
thread = Method.thread method
(pre_, _, m) = case M.lookup thread (methodStack state) of
Just stack -> head stack
Nothing -> error $ "Method stack has
not been found for thread " ++ (show thread) ++ " -> fields: " ++
(show ss)
updateMap key map f value = case M.member key map of
True -> M.update (f value) key map
False -> M.insert key [value] map
As you can see, the state is updated for each entry, a stack being
maintained with methods we've seen up to now, and a list with methods
that have received both pre and post order numbers, and of which both
the entry and exit point have been parsed. I am using a ByteString,
because using a plain String is causing the program to grab far too
much heap.
The mkMethod yields a Method like this:
data Method = Method { mid :: Integer
, thread :: Integer
, instruction_entry :: Integer
, instruction_exit :: Integer
} deriving (Eq, Show)
eM = Method 0 0 0 0
mkMethod :: [String] -> Method
mkMethod s = let [_thread, _id, _entry] = take 3 $ map (read ::
String -> Integer) s
in [_thread, _id, _entry] `seq` Method { mid = _id
, thread = _thread
,
instruction_entry = _entry
,
instruction_exit = 0
}
updateMethod :: Method -> [String] -> Maybe Method
updateMethod (Method mid thread instruction_entry instruction_exit ) s
| thread == _thread && mid == _id = _exit `seq` Just Method { mid
= mid
,
thread = thread
,
instruction_entry = instruction_entry
,
instruction_exit = _exit
}
| otherwise = Nothing
where [_thread, _id, _exit] = take 3 $ map (read :: String ->
Integer) s
Any suggestions for improving this code?
Thanks,
Andy
More information about the Haskell-Cafe
mailing list