[Haskell-cafe] Stream fusion for Hackage

Don Stewart dons at galois.com
Sat Nov 17 21:31:08 EST 2007


Just a quick announce: the stream fusion library for lists, 
that Duncan Coutts, Roman Leshchinskiy and I worked on earlier this year
is now available on Hackage as a standalone package:

    http://hackage.haskell.org/cgi-bin/hackage-scripts/package/stream-fusion-0.1.1

As described in the recent paper:

    "Stream Fusion: From Lists to Streams to Nothing at All"
    Duncan Coutts, Roman Leshchinskiy and Don Stewart. ICFP 2007

This is a drop-in replacement for Data.List.

Haddocks here,

    http://code.haskell.org/~dons/doc/stream-fusion/

(but note the interface is exactly the same as the Data.List library)

You might expect some small percent performance improvement for list
heavy programs, using ghc 6.8 and this list library with -O2 (use
-ddump-simpl-stats and look for:

    STREAM stream/unstream fusion

messages, indicating your intermediate lists are getting removed.

To get an idea of what is happening, consider this list program:

    foo :: Int -> Int
    foo n = sum (replicate n 1)

Compiled with ghc-6.8.1 -O2 -ddump-simpl

Normally, as sum is a left fold, an intermediate lazy list is allocated
between the call to sum and replicate, as GHC currently does:

    foo :: Int# -> Int#
    foo n = Data.List.sum (case <=# n 0 of
                                False -> go n
                                True  -> [])
        where
            go :: Int# -> [Int]   -- intermediate list!
            go n = case <=# n 1 of
                        False -> 1 : (go (n -# 1))
                        True  -> [1]

By using Data.List.Stream instead, you get a strict fused loop instead,
with no intermediate structure allocated:

    loop_sum :: Int# -> Int# -> Int#
    loop_sum k n  = case <=# n 0 of
                False -> loop_sum (k +# 1) (n -# 1)
                True  -> k

    foo :: Int# -> Int#
    foo n = loop_sum 0 n

This is a the halfway mark before porting other sequence types --
especially Data.ByteString -- over to the full stream fusion model (in
particular, strict bytestrings will benefit a lot, due to the O(n) cost
of intermediate bytestrings being removed).

The stream fusion types and combinators are also available in stripped
down form in the mlton sml compiler's extended prelude.

Enjoy.

-- Don


More information about the Haskell-Cafe mailing list