[Haskell-cafe] Re: Mining Twitter data in Haskell and Clojure

Tue Jun 15 15:05:07 EDT 2010

> If you just want to optimize it and not compare exactly equal idiomatic code,
> you should stop using functional data structures and use a structure that fits
> your problem (the ST monad has been designed for that in Haskell), because
> compilers do not detect single-threaded usage and rewrite all your code to
> something mutable and thereby avoid O(log n) costs.
>
> In practice it is probably easier to write the whole thing against the parallel
> Boost Graph Library (a C++ library), since that library provides the
> abstractions that you would want. If you go this path, it will probably end up
> being 10-100 times faster.

Surely I can rewrite it in C++ or just C and gain speedup, but not
maintainability and ease of prototyping algorithms -- the main reasons
I switched to FP, with added reliability, conciseness, reasonable
syntax bonuses.  It turned out Clojure is quite capable of doing it
well, but has Null Pointer Exceptions and "debug data shape by
running" annoying to an ML programmer; trying Haskell for speedup is
worth it!  It proves less predictable than expected -- e.g. with OCaml
if it typechecks and runs, it usually runs fast.  The point of this
exercise is to identify typical performance bottlenecks in the domain
and see how they can be avoided.  The jury's still out on this one.

-- Alexy