[Haskell-cafe] Fast number parsing with strict bytestrings [Was: Re: Seemingly subtle change causes large performance variation]

Donald Bruce Stewart dons at cse.unsw.edu.au
Sat Jun 9 01:10:19 EDT 2007


conal:
> 
>    Lovely!
>
>    Perhaps a stylistic shift would encourage writing this sort
>    of elegant, fusion-friendly code.
>        -- Generalized version of "interact".  Encapsulates data
>    getter & putter.
>        genInteract :: IO i -> (o -> IO ()) -> ((i -> o) -> IO
>    ())
>        genInteract get put = \ f -> get >>= put . f

Yes, I think this is a good idea. Perhaps even a small module 
for doing fast folds over lists of lines (abstracting out the
cache/demand/traversal code from solution #3.)

Also, I realise I'd missed another optimisatoin for #3, minimal copying
to do the alignment. Gains another 10%:

    process k !i (s:t:ts) | S.last s /= '\n' = process k (add k i s') ts'
      where
        (s',r)  = S.breakEnd (=='\n') s
        (r',rs) = S.break    (=='\n') t
        ts'     = S.concat [r,r',S.singleton '\n'] : unsafeTail rs : ts

Here, we copy only the smallest amount from the end of the current
chunk, and the start of the next chunk, to do \n aligning. Gets us down
to:

    $ time ./G < in
    29359
    ./G < in  0.19s user 0.02s system 96% cpu 0.210 total

And the code:

    {-# OPTIONS -fbang-patterns #-}

    import Data.Char
    import Data.Maybe
    import Data.ByteString.Base
    import qualified Data.ByteString.Char8      as S
    import qualified Data.ByteString.Lazy.Char8 as L

    main = do
        ss <- L.getContents -- done with IO now.

        let (l,ls) = L.break (=='\n') ss

            -- don't need count, we're allocating lazily
            k      = fst . fromJust . L.readInt . last . L.split ' ' $ l

            -- a lazy list of strict chunks
            file   = L.toChunks (L.tail ls)

        print $ process k 0 file

    divisibleBy :: Int -> Int -> Bool
    a `divisibleBy` n = a `rem` n == 0

    -- ---------------------------------------------------------------------
    --
    -- Optimised parsing of strict bytestrings representing \n separated numbers
    --

    --
    -- we have the file as a list of cache chunks
    -- align them on \n boundaries, and process each chunk separately
    -- when the next chunk is demanded, it will be read in.
    --
    process :: Int -> Int -> [S.ByteString] -> Int
    process k i []      = i

    process k !i (s:t:ts) | S.last s /= '\n' = process k (add k i s') ts'
      where
        (s',r)  = S.breakEnd (=='\n') s
        (r',rs) = S.break    (=='\n') t
        ts'     = S.concat [r,r',S.singleton '\n'] : unsafeTail rs : ts

    process k i (s: ss) = process k (add k i s) ss

    --
    -- process a single cache-sized chunk of numbers, \n aligned
    --
    add :: Int -> Int -> S.ByteString -> Int
    add k i s = if S.null s then i else test k i (parse x) xs
      where (x,xs) = uncons s
    {-# INLINE add #-}

    --
    -- process a single line, until \n
    --
    test :: Int -> Int -> Int -> ByteString -> Int
    test k i !n t
        | y == '\n' = -- done reading the line, process it:
            if n `divisibleBy` k then add k (i+1) ys
                                 else add k i     ys
        | otherwise = test k i n' ys
      where (y,ys) = uncons t
            n'     = parse y + 10 * n

    parse c  = ord c - ord '0'

    -- fastest way to take the head of a strict bytestring
    uncons s = (w2c (unsafeHead s), unsafeTail s)



More information about the Haskell-Cafe mailing list