[Haskell-cafe] stream/bytestring questions

Chad Scherrer chad.scherrer at gmail.com
Sun Feb 17 21:02:47 EST 2008


On Feb 17, 2008 5:01 PM, Don Stewart <dons at galois.com> wrote:
> yeah, with lists, as compared to bytestrings, there are:
>
>     * more complex operations to fuse
>     * allocation is much cheaper (lazy list cons nodes)
>     * built in desugaring for build/foldr fusion interferes (enumerations, comprehensions)
>
> so the benefits of fusing lists are less obvious than bytestrings, where
> every fusion point knocks out a big array allocation, and they're a bit
> more complex to get the full api going.

Ok, that makes sense.

> no, using the rules should be fine. you're not supposed to program in
> the stream abstraction.

I was working on some run-length encoding stuff and found it most
natural to do a lot of it using unfoldr, and I had a state in the
unfold that was naturally represented as a tuple. I got a little
worried about all the packing and unpacking tuples not being optimized
out, and this got me wondering about just using Streams directly. At
the time, the Skip constructor for a Step felt natural to use in some
places, which I thought could be really convenient. I dunno, maybe
it's not an issue. Hmm, if it would help I can try to post some of the
code...

> > Wouldn't
> > it make sense to do something like
> >
> > data ArrayList a i e = Empty | Chunk !(a i e) (ArrayList a i e)
> > ?
>
> someone could do that. we chose to  go with the monomorphic case, since
> its easier to get the api right, and the performance right.

Unless I'm overlooking something, this would involve something like...
(1) make the fusion stuff apply to an IArray (pretty handy anyway)
(2) make (strict) ByteString an instance of IArray (or maybe via StorableArray)
(3) write an ArrayList module, similar to Data.ByteString.Lazy

>From this, it would be pretty short again, in theory, to write an
alternate ByteString library using the abstractions. The point of this
would be to leverage future code improvements and reduce code
duplication.

Does this seem reasonable? Or is it too soon to consider this kind of extension?

Chad


More information about the Haskell-Cafe mailing list