[Haskell-cafe] Bit streams programs in Haskell

Mon Mar 27 03:17:58 EST 2006

I have rewritten the Huffman benchmark (see
http://cse.unl.edu/~sjanssen/huffman.hs), resulting in a significant
performance increase.  On my computer it completes one iteration in
about 0.9 seconds.  In comparison, the O'Caml version takes around 3.1
seconds.  These samples seem to indicate that this Haskell program
will compare favorably with Erlang.

I'd say the program is written in fairly idiomatic Haskell.  Turning
the bytes in the buffer into  a lazy stream of Bool's (see the
bitstream function) is both elegant and surprisingly efficient.

There is one infidelity in my implementation: the program writes the
output once per iteration, and this IO is included in the measured
time.  This results in a small time penalty in comparison to the other
versions.  I am still searching for a solution to this, but for now it
seems that printing the output is the cheapest way to force
evaluation.

Improvements and discussion are solicited.

Feel free to include this code in your website, if you'd like.

Spencer Janssen

On 3/22/06, Per Gustafsson <per.gustafsson at it.uu.se> wrote:
>
> Haskell gurus,
>
> We have made a proposal to extend the Erlang `binary' data type from
> being a sequence of bytes (a byte stream) to being a sequence of bits (a
> bitstream) with the ability to do pattern matching at the bit level.
>
> This proposal has now been fully implemented all
> these at the level of the BEAM virtual machine and in the HiPE
> compiler. Most probably, they will be part of the next major
> Erlang/OTP open source release. (We write "most probably" because we
> do not really control what Ericsson desides to release as part of its
> open source system.)
>
> We wanted to evaluate the performance of our implementation and the
> succintness of our syntax, particularly against other `similar'
> languages. For this reason, we wrote five reasonably representative
> benchmarks, showing the things that one could employ bit stream pattern
> matching and bit-level comprehensions for.
>
> The benchmarks (drop3, five11, huffman, uudecode, and uuendode) have
> all been written in Erlang, O'Caml and Haskell. For some of them, C
> and Java versions exist. They can be found in the following homepage:
>
>        http://bitbenches.infogami.com/
>
> As you will see there, the Haskell numbers are significantly slower
> than those of Erlang and O'Caml. We are wondering why this is so.
>
> For each language, we have spent a considerable effort in writing the
> benchmarks in -- at least what we feel -- is the most natural and
> efficient way one can write them.
>
> The only constraint we impose is that for functional languages, data
> structures without any explicit mutation have to be used in the part
> of the program for which measurements are taken.
>
> Our experience in writing efficient (and beautiful) Haskell programs is
> close to (if not below) zero. Also, perhaps our mind might be suffering
> from severe case of strictness and might be completely unable to `think
> lazily'. So, we request your help in noticing obvious NO-NOs and stupid
> mistakes that we might have made. We even welcome completely different
> Haskell programs provided they adhere to the constraint mentioned
> above -- no mutation.
>
> Best regards,
>
> Kostis Sagonas and Per Gustafsson
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>