[Haskell-beginners] Re: Is Haskell for me?

Sat Nov 21 15:52:30 EST 2009

On Saturday 21 November 2009 18:57:21 you wrote:
> I'm not a Haskell expert-- in fact, I'm a beginner, and that's why I'm
> here, but I read your blog post on the subject of this hash table program,
> and it seems to me from reading the comments in reply that you're just here
> trolling, because you're using an algorithm that is fundamentally not
> purely functional, and of course that's going to be slower, just like
> asking Joel Zumaya to pitch left handed.
>
> If you were being honest about your complaint, you'd make an
> apples-to-apples comparison, and a number of commenters on your blog have
> proposed implementations that perform much better than your example.
>
> Sorry if I'm just being a jerk,

Not at all, that is a perfectly reasonable concern but I did already try to 
address it in my previous post:

> > In practice, you would turn to a purely functional dictionary in Haskell
> > based upon balanced binary trees in order to work around this
> > long-standing bug in the GC but those trees incur O(log n) indirections
> > and typically run orders of magnitude slower than a decent hash table.

For example, the following Haskell program builds a purely functional 
Data.Map:

module Main where

import Prelude hiding (lookup)
import Data.Map (empty, insert, lookup, size)

n = 1000000

build m 0 = m
build m n = build (insert x (1.0 / x) m) (n-1)
  where x = fromIntegral n :: Double

main = do
  let 
      m = build empty n
      (Just v) = lookup 100 m

  print $ size m
  print v

Running this program with different "n" gives:

        Data.Map
 1M:  2.797s  1.0x
 2M:  6.090s  2.2x
 4M: 14.226s  5.1x
 8M: 28.449s 10.2x
16M: 83.171s 29.7x

This is several times faster than Haskell's Data.Hashtable (because of the 
long-standing bug in their GC that I described) and is scaling better. 
However, if you compare with the timings I gave before:

      Data.Hashtable       OCaml          F#
 1M:   3.198s   1.0x   1.129s  1.0x  0.080s  1.0x
 2M:   8.498s   2.7x   2.313s  2.0x  0.138s  1.7x
 4M:  25.697s   8.0x   4.567s  4.0x  0.281s  3.5x
 8M:  97.994s  30.6x  10.450s  9.3x  0.637s  8.0x
16M: 388.080s 121.4x  23.261s 20.6x  1.338s 16.7x

you'll see that the absolute performance of this idiomatic Haskell solution is 
still absolutely awful: consistently about 50x slower than the F#.

This is also true in the context of sorting: Haskell's standard library 
routines for sorting are orders of magnitude slower than those found in most 
other compiled languages.

Suffice to say, idiomatic Haskell is also nowhere near being in the same 
ballpark as C++ with respect to performance. Realistically, with enough 
expertise you should be able to optimize most of your Haskell programs to 
beat Python's performance but there are some important cases where you will 
not even be able to do that.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e