[Haskell-cafe] Can Haskell outperform C++?

Wed May 16 12:54:08 CEST 2012

On 5/10/12 8:44 PM, Ryan Newton wrote:
>> through the trouble of writing my algorithms in C/C++, but simple-minded
>> people often have a desire to get the best performance possible, in
>> which case you really want to use C, C++, Fortran or whatever high level
>> assembler language you like.
>
> I think this is a bit of an unfair accusation ("simple-minded").

Well, yes and no.

On the one hand, characterizing those who desire the best performance 
possible as "simple-minded" is, at best, a gross over-generalization. 
Like you, I work in a field where optimization is king (e.g., in machine 
translation, program runtimes are measured in days).

But on the other hand, there are quite a lot of people who focus 
excessively on "optimization" when the actual differences for the code 
they're writing are either negligible (e.g., because of I/O bottlenecks) 
or uninteresting (e.g., a 2x slowdown is a nuisance rather than a 
crisis). For those who focus on optimization when the benefits are 
negligible or uninteresting, I think it's fair to characterize that 
focus as "simple-minded". This focus seems especially common when people 
start talking about "which language is faster"--- as opposed to, say, 
which library is faster, or which implementation of a given algorithm is 
faster. In some cases the question of language speed is legitimate, but 
IME it's far more often used to prop up prejudices about which language 
is "better"--- i.e., which group of human beings (defined by their 
choice of programming language) is "better".

I agree that, as a community, we need to come to a realistic 
understanding of the performance characteristics of Haskell compared to 
C etc. I don't like prejudice and propaganda for Haskell any more than I 
like prejudice and propaganda for C etc. In some cases it's true that 
Haskell out-performs C (or C++, or Java); but in many cases it does not, 
and we need to acknowledge that. The real problem, I feel, is that it's 
not always clear which category any given program falls into. In some 
cases the "idiomatic" Haskell is the fast one, in some cases its the 
slow one; but in all cases, what is meant by "idiomatic Haskell" is a 
moving target with poorly specified meaning.

The advent of ByteStrings was a major coup in figuring out exactly where 
and why Haskell is slow and how to fix it. Iteratees are another one, 
though the details are still being worked out. However, things like 
strictness annotations on (non-accumulator) function arguments, the 
choice whether to use ST or not, the choice whether to CPS-transform or 
not, etc, are still very much a black art. Even with a thorough 
understanding of the STG-machine and the GHC compilation pipeline, it's 
not always clear whether they will help or hurt. ST in particular is 
especially finicky, to the point where I wonder if it's ever a win 
(outside of in-place updates on arrays).

So while I think most language performance comparisons are dubious to 
begin with, I don't think the comparison of Haskell to C++ (or C, or 
Java,...) can even be construed as something with a meaningful answer. 
Haskell is just too different, and its in-the-large cost model is too 
poorly understood. I'm not even aware of any helpful generalizations 
over comparing two specific programs. Even when restricting to things 
like parsing or compute-intensive numeric code, the comparison comes 
back with mixed answers.

-- 
Live well,
~wren