[Haskell-cafe] Fast number parsing with strict bytestrings [Was:
Re: Seemingly subtle change causes large performance variation]
Donald Bruce Stewart
dons at cse.unsw.edu.au
Thu Jun 7 21:59:14 EDT 2007
dons:
> mdanish:
> > Hello,
> >
> > I've been playing with the INTEST problem on SPOJ which demonstrates
> > the ability to write a program which processes large quantities of
> > input data. http://www.spoj.pl/problems/INTEST/
>
> > But when I make a slight modification, the program chews up a ton more memory
> > and takes more time:
> >
> > import Control.Monad
> > import Data.Maybe
> > import qualified Data.ByteString.Char8 as B
> >
> > divisibleBy :: Int -> Int -> Bool
> > a `divisibleBy` n = a `rem` n == 0
> >
> > main :: IO ()
> > main = do
> > [n,k] <- (map int . B.split ' ') `fmap` B.getLine :: IO [Int]
> >
> > let
> > doLine :: Int -> Int -> IO Int
> > doLine r _ = B.getLine >>= return . testDiv r
> > -- 'return' moved here ^^
>
And just following up with some GC statistics:
Original,
95% cpu 1.668 total
<<ghc: 258766440 bytes,
452 GCs,
3036/3036 avg/max bytes residency (1 samples),
3M in use, 0.00 INIT (0.00 elapsed),
1.51 MUT (1.63 elapsed),
0.01 GC (0.03 elapsed) :ghc>>
Too lazy:
96% cpu 4.219 total
<<ghc: 278683532 bytes,
495 GCs,
--> 14729345/52642396 avg/max bytes residency (7 samples),
--> 85M in use,
0.00 INIT (0.00 elapsed),
1.68 MUT (1.81 elapsed),
--> 2.07 GC (2.36 elapsed) :ghc>>
(clear space leak)
Fixing above program with $!:
94% cpu 1.656 total
<<ghc: 257394052 bytes
451 GCs,
--> 2288/2288 avg/max bytes residency (1 samples),
--> 1M in use,
0.00 INIT (0.00 elapsed),
1.49 MUT (1.64 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
Using lazy bytestrings for pure processing:
90% cpu 1.424 total
<<ghc: 219403252 bytes,
410 GCs,
70527/74236 avg/max bytes residency (10 samples),
2M in use,
0.00 INIT (0.00 elapsed),
1.25 MUT (1.40 elapsed),
0.01 GC (0.01 elapsed) :ghc>>
And the killer strict chunk parser:
78% cpu 0.327 total
<<ghc: 20685092 bytes,
--> 38 GCs,
--> 81348/81348 avg/max bytes residency (1 samples),
2M in use,
0.00 INIT (0.00 elapsed),
--> 0.21 MUT (0.32 elapsed),
0.00 GC (0.00 elapsed) :ghc>>
Very little data shuffled around in the last one.
-- Don
More information about the Haskell-Cafe
mailing list