[Haskell-beginners] Performance problem
Radosław Szymczyszyn
lavrin at gmail.com
Wed May 30 11:07:22 CEST 2012
Hello!
I've had a similar problem with text processing discussed on the list
some time ago (a topic about implementing a spellchecker). To keep
things short'n'simple: built-in Haskell Strings are inefficient as
they're simply lists of Chars, i.e. a String is in fact just a [Char].
The usually suggested solution to this problem is using the ByteString
type which comes from bytestring package. It's probably all nice when
you only need ASCII/Latin encodings, but it bite me when processing
Unicode (e.g. Data.ByteString.UTF8 doesn't have a words function,
though Data.ByteString has one). However, the performance is good.
The best solution as far as I have researched is the text package and
type Text. It ought to support Unicode as far as I remember and has
got all the useful list-like functions. As I hadn't yet had an
occasion to play with it before, I took your code and adapted it to
use Text and Text.IO. Let me know what are the results, as I haven't
got any test set to compare the speed before and after the
modifications.
=== CODE
-- Problem id: HASHADQI
import qualified Data.List as List
import qualified Data.IntMap as Map
import Data.Maybe
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T
type Person = (Text,Text,Text,Text)
type IntPersonMap = Map.IntMap Person
main = do
input <- T.getContents
seqAction Map.empty $ T.lines input
seqAction :: IntPersonMap -> [Text] -> IO IntPersonMap
seqAction m [] = return m
seqAction m (l:ls) = do
m' <- doAction m l
seqAction m' ls
doAction :: IntPersonMap -> Text -> IO IntPersonMap
doAction m cmd = do
case T.unpack (T.take 1 cmd) of
"a" -> doInsert m $ T.words cmd
"d" -> doDelete m $ T.words cmd
"i" -> doInfo m $ T.words cmd
"q" -> doQuery m $ T.words cmd
[] -> return m
doInsert :: IntPersonMap -> [Text] -> IO IntPersonMap
doInsert m [_, idText, fn, ln, bd, pn] = do
let id = read (T.unpack idText) :: Int
if Map.member id m
then do putStrLn $ "ID " ++ show id ++ " ja cadastrado."
return m
else return (Map.insert id (fn, ln, bd, pn) m)
doDelete :: IntPersonMap -> [Text] -> IO IntPersonMap
doDelete m [_, idText] = do
let id = read (T.unpack idText) :: Int
if Map.member id m
then return (Map.delete id m)
else do putStrLn $ "ID " ++ show id ++ " nao existente."
return m
doInfo :: IntPersonMap -> [Text] -> IO IntPersonMap
doInfo m [_, idText] = do
let id = read (T.unpack idText) :: Int
case Map.lookup id m of
Just (fn, ln, bd, pn) -> do putStrLn . show $ T.unwords [fn, ln, bd, pn]
return m
Nothing -> do putStrLn $ "ID " ++ show id ++ " nao existente."
return m
doQuery :: IntPersonMap -> [Text] -> IO IntPersonMap
doQuery m (_:qs) = do
let test = (\x -> foldl (&&) True $ map ($x) $ makePredicate qs)
result = Map.filter test m
putStrLn $ unwords . map show $ Map.keys result
return m
makePredicate :: [Text] -> [(Person -> Bool)]
makePredicate [] = []
makePredicate (q:qs) =
case (\(a,b) -> (T.unpack a, b)) (T.break (==':') q) of
("fn", x) -> (\(fn,_,_,_) -> fn == (T.drop 1 x)) : (makePredicate qs)
("ln", x) -> (\(_,ln,_,_) -> ln == (T.drop 1 x)) : (makePredicate qs)
("bd", x) -> (\(_,_,bd,_) -> bd == (T.drop 1 x)) : (makePredicate qs)
("pn", x) -> (\(_,_,_,pn) -> pn == (T.drop 1 x)) : (makePredicate qs)
=== END CODE
Regards,
Radek Szymczyszyn
More information about the Beginners
mailing list