[Haskell-cafe] Streaming bytes and performance

Nicolas Trangez nicolas at incubaid.com
Tue Mar 19 21:53:23 CET 2013


On Tue, 2013-03-19 at 20:32 +0000, Don Stewart wrote:
> Oh, I forgot the technique of inlining the lazy bytestring chunks, and
> processing each chunk seperately.
> 
> $ time ./fast
> 4166680
> ./fast  1.25s user 0.07s system 99% cpu 1.325 total
> 
> Essentially inline Lazy.foldlChunks and specializes is (the inliner should
> really get that).
> And now we have a nice unboxed inner loop, which llvm might spot:
> 
> $ ghc -O2 -funbox-strict-fields fast.hs  --make -fllvm
> $ time ./fast
> 4166680
> ./fast  1.07s user 0.06s system 98% cpu *1.146 total*
> 
> So about 8x faster. Waiting for some non-lazy bytestring benchmarks... :)

You could try something like this using Conduit:

{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Data.Conduit
import qualified Data.Conduit.List as L
import qualified Data.Conduit.Binary as B
import qualified Data.ByteString.Char8 as BS8

main :: IO ()
main = print =<< runResourceT (
    B.sourceFile filename $$ L.fold (\(!a) (!b) -> a + BS8.count ' ' b)
(0 :: Int))
  where
    filename = ...

Nicolas

> 
> {-# LANGUAGE BangPatterns #-}
> 
> import Data.ByteString.Internal
> import Data.ByteString.Unsafe
> import qualified Data.ByteString.Char8      as S
> import qualified Data.ByteString.Lazy.Char8 as L
> import qualified Data.ByteString.Lazy.Internal as L
> import System.IO.Posix.MMap.Lazy
> 
> main = do
>     f <- unsafeMMapFile "test.txt"
>     print . new 0 $ L.toChunks f
> 
> new :: Int -> [ByteString] -> Int
> new i [] = i
> new i (x:xs) = new (add i x) xs
> 
> -- jump into the fast path
> {-# INLINE add #-}
> add :: Int -> ByteString -> Int
> add !i !s | S.null s   = i
>           | isSpace' x = add (i+1) xs
>           | otherwise  = add i     xs
>   where T x xs = uncons s
> 
> data T = T !Char ByteString
> 
> uncons s = T (w2c (unsafeHead s)) (unsafeTail s)
> 
> isSpace' c = c == '\n'    || c == ' '
> {-# INLINE isSpace' #-}
> 
> 
> 
> 
> On Tue, Mar 19, 2013 at 7:36 PM, Don Stewart <dons00 at gmail.com> wrote:
> 
> > Just for fun. Here's some improvements. about 6x faster.
> > I'd be interested to see what io-streams could do on this.
> >
> > Using a 250M test file.
> >
> > -- strict state monad and bang patterns on the uncons and accumulator
> > argument:
> >
> > $ time ./A
> > 4166680
> > ./A  8.42s user 0.57s system 99% cpu 9.037 total
> >
> > -- just write a loop
> >
> > $ time ./A
> > 4166680
> > ./A  3.84s user 0.26s system 99% cpu 4.121 total
> >
> > -- switch to Int
> >
> > $ time ./A
> > 4166680
> > ./A  1.89s user 0.23s system 99% cpu 2.134 total
> >
> > -- custom isSpace function
> >
> > $ time ./A
> > 4166680
> > ./A  1.56s user 0.24s system 99% cpu 1.808 total
> >
> > -- mmap IO
> >
> > $ time ./A
> > 4166680
> > ./A  1.54s user 0.09s system 99% cpu 1.636 total
> >
> > Here's the final program:
> >
> >
> > {-# LANGUAGE BangPatterns #-}
> >
> > import qualified Data.ByteString        as S
> > import qualified Data.ByteString.Lazy.Char8 as L
> > import System.IO.Posix.MMap.Lazy
> >
> > main = do
> >     f <- unsafeMMapFile "test.txt"
> >     print $ go 0 f
> >   where
> >     go :: Int -> L.ByteString -> Int
> >     go !a !s = case L.uncons s of
> >         Nothing     -> a
> >         Just (x,xs) | isSpaceChar8 x -> go (a+1) xs
> >                     | otherwise      -> go a     xs
> >
> > isSpaceChar8 c = c == '\n'    || c == ' '
> > {-# INLINE isSpaceChar8 #-}
> >
> >
> > On Mon, Mar 18, 2013 at 8:53 AM, Konstantin Litvinenko <
> > to.darkangel at gmail.com> wrote:
> >
> >> Hi All!
> >>
> >> I tune my toy project for performance and hit the wall on simple, in
> >> imperative world, task. Here is the code that model what I'm trying to
> >> achieve
> >>
> >> import qualified Data.ByteString.Lazy as L
> >> import Data.Word8(isSpace)
> >> import Data.Word
> >> import Control.Monad.State
> >>
> >> type Stream = State L.ByteString
> >>
> >> get_byte :: Stream (Maybe Word8)
> >> get_byte = do
> >>     s <- get
> >>     case L.uncons s of
> >>         Nothing -> return Nothing
> >>         Just (x, xs) -> put xs >> return (Just x)
> >>
> >> main = do
> >>     f <- L.readFile "test.txt"
> >>     let r = evalState count_spaces f
> >>     print r
> >>   where
> >>     count_spaces = go 0
> >>       where
> >>         go a = do
> >>             x <- get_byte
> >>             case x of
> >>                 Just x' ->  if isSpace x' then go (a + 1) else go a
> >>                 Nothing -> return a
> >>
> >> It takes the file and count spaces, in imperative way, consuming bytes
> >> one by one. The problem is: How to rewrite this to get rid of constant
> >> allocation of state but still working with stream of bytes? I can rewrite
> >> this as one-liner L.foldl, but that doesn't help me in any way to optimize
> >> my toy project where all algorithms build upon consuming stream of bytes.
> >>
> >> PS. My main lang is C++ over 10 years and I only learn Haskell :)
> >>
> >>
> >> ______________________________**_________________
> >> Haskell-Cafe mailing list
> >> Haskell-Cafe at haskell.org
> >> http://www.haskell.org/**mailman/listinfo/haskell-cafe<http://www.haskell.org/mailman/listinfo/haskell-cafe>
> >>
> >
> >
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe





More information about the Haskell-Cafe mailing list