[Haskell-cafe] Streaming bytes and performance
Gregory Collins
greg at gregorycollins.net
Mon Mar 18 13:14:36 CET 2013
Put a bang pattern on your accumulator in "go". Since the value is not
demanded until the end of the program, you're actually just building up a
huge space leak there.
Secondly, unconsing from the lazy bytestring will cause a lot of allocation
churn in the garbage collector -- each byte read in the input forces the
creation of a new "L.ByteString", which is many times larger.
Also please consider trying the "io-streams" library that I wrote (
http://hackage.haskell.org/package/io-streams). It provides primitives for
streaming IO in "basic Haskell" style. To provide a Word8 stream (which is
probably a bad idea performance-wise) it would be most efficient
allocation-wise to implement a mutable index cursor (i.e. IORef Int) that
pointed to your current position within the ByteString chunk, other
strategies will probably allocate too much.
G
On Mon, Mar 18, 2013 at 9:53 AM, Konstantin Litvinenko <
to.darkangel at gmail.com> wrote:
> Hi All!
>
> I tune my toy project for performance and hit the wall on simple, in
> imperative world, task. Here is the code that model what I'm trying to
> achieve
>
> import qualified Data.ByteString.Lazy as L
> import Data.Word8(isSpace)
> import Data.Word
> import Control.Monad.State
>
> type Stream = State L.ByteString
>
> get_byte :: Stream (Maybe Word8)
> get_byte = do
> s <- get
> case L.uncons s of
> Nothing -> return Nothing
> Just (x, xs) -> put xs >> return (Just x)
>
> main = do
> f <- L.readFile "test.txt"
> let r = evalState count_spaces f
> print r
> where
> count_spaces = go 0
> where
> go a = do
> x <- get_byte
> case x of
> Just x' -> if isSpace x' then go (a + 1) else go a
> Nothing -> return a
>
> It takes the file and count spaces, in imperative way, consuming bytes one
> by one. The problem is: How to rewrite this to get rid of constant
> allocation of state but still working with stream of bytes? I can rewrite
> this as one-liner L.foldl, but that doesn't help me in any way to optimize
> my toy project where all algorithms build upon consuming stream of bytes.
>
> PS. My main lang is C++ over 10 years and I only learn Haskell :)
>
>
> ______________________________**_________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/**mailman/listinfo/haskell-cafe<http://www.haskell.org/mailman/listinfo/haskell-cafe>
>
--
Gregory Collins <greg at gregorycollins.net>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20130318/34703eea/attachment.htm>
More information about the Haskell-Cafe
mailing list