[Haskell-cafe] Why Haskell is beautiful to the novice
targen at gmail.com
Sat Aug 29 18:30:08 UTC 2015
On Sat, Aug 29, 2015 at 10:49 AM, Silvio Frischknecht
<silvio.frischi at gmail.com> wrote:
> Also just wanted to add an example of how Haskell can be really tricky
> for very simple things.
> Let's consider the simplest loop I can think of. The function to sum up
> all the numbers up to N. Anyone new to Haskell will implement this in
> the straight forward way.
> sumToN 0 = 0
> sumToN n = sumToN (n-1) + n
> It's very elegant. It's more of a definition than an algorithm, really.
> The equivalent python code would probably do a loop (4-5 lines).
> Now, if you run your python code for a billion, it will eventually
> return the correct result because you specified the algorithm more
> clearly. If you run the Haskell function for a billion however, it will
> crash your computer and set your house on fire.
> Now, you will have to explain tail recursion and why you won't get a
> StackOverflow despite the fact that you build up a huge stack. I have
> done Haskell for quite a while, and I'm still unsure sometimes if
> something will work in constant space or not.
I understand what you’re getting at, and why this simplified example
is meant to stand in for a more general, elaborate situation.
However, I believe a more thorough look at this sort of situation
reveals that such a simplified example is not adequately
representative of the general problem.
The function to sum up all the naturals up to N is most commonly
written thus, in Haskell and Python:
sumToN n = n * (n + 1) `div` 2
def sum_to_n(n): return n * (n + 1) // 2
No loops, no worries about stacks, no recursion or induction or
intuition about computational processes supporting complexity.
Depending on the student’s background, this may be the first solution
that is attempted for the given problem. The general point of this
observation is that it often happens in computer programming that a
programming-oriented solution is accidentally complex because it goes
for a brute-force approach that ignores important qualities of the
underlying problem space that enable solutions that avoid complexity.
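As a quick sanity check, here is a Python sketch (the function names are my own) showing that the closed form agrees with the brute-force aggregation it replaces:

```python
def sum_closed_form(n):
    # Gauss's formula: no loop, no recursion, O(1) time and space.
    return n * (n + 1) // 2

def sum_brute_force(n):
    # Explicit aggregation over the range, O(n) time.
    total = 0
    for i in range(n + 1):
        total += i
    return total

# The two agree on a handful of inputs, including the empty case.
for n in (0, 1, 10, 1000):
    assert sum_closed_form(n) == sum_brute_force(n)
```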
Granted, alternate strategies with lower complexity are not always
available, and sometimes you do need some kind of aggregation. These
are likewise easy in Haskell and Python:
sumToN n = sum [1..n]
def sum_to_n(n): return reduce(add, range(n + 1))
It’s noteworthy that both of these solutions have terrible performance
for somewhat similar reasons unless you specifically go out of your
way to avoid the issue. In GHCi, asking for the value of this sum at
one billion seems to produce a very large list, and it depletes my
machine’s RAM quite quickly. Doing the same in Python also creates a
very large list and likewise starves my machine for memory.
These issues can be addressed in both programming environments.
If you test the Haskell bit in a simple complete program compiled with
`-O2`, it takes a while to run with a billion as `n`, but it does
finish in a few seconds without consuming memory in proportion to `n`.
GHC can optimize that list away, but you need to be aware of the issue
to understand how to solve it: some things fuse away in compilation
and stay around in interpreted code.
The Python bit isn’t magically optimized by CPython 2.7. However, the
recommended approach to these issues is to use `xrange` instead, which
returns a generator object that produces results lazily, one by one,
much like the list in Haskell, and `reduce` doesn’t need to hold onto
the whole list of results since it’s actually working like a strict
left fold. In fact, new versions of Python altogether skip the old
list-building `range` function, and their `range` is Python 2.7’s
`xrange`, so this issue goes away if you can use a new version of the
language. Perhaps a future version of GHC may fuse lists away in
GHCi as well.
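A small Python 3 sketch of the point about lazy ranges: `range` hands out values one at a time, so a `reduce` over it behaves like a strict left fold and never materializes the whole list.

```python
from functools import reduce  # reduce moved to functools in Python 3
from operator import add
import sys

n = 10**6

# range(n + 1) is a lazy sequence object: its in-memory size does not
# grow with n, unlike the list it would have been in Python 2.
assert sys.getsizeof(range(n + 1)) < sys.getsizeof(list(range(100)))

# reduce consumes it one element at a time, holding only the running
# accumulator, much like a strict left fold over a lazy list.
total = reduce(add, range(n + 1))
assert total == n * (n + 1) // 2
```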
You may now argue that summing is only meant to stand in for some more
elaborate computation, so this whole deal misses the point. I agree.
In fact, this is a much more fair comparison:
sumToN n = foldl1 (+) [1..n]
def sum_to_n(n): return reduce(add, xrange(n + 1))
Indeed, both of these fail on empty ranges, and they have similar
performance if the Haskell version is compiled with optimisation.
There are many variations of these solutions, such as enabling them to
use 0 as the result for empty summations or expressing them in terms
of different primitives. But indeed these are two somewhat
representative ways of expressing processes that need to perform
aggregation of results over a range, both expressed in ways that are a
natural fit **for the problem** as specified, ignoring issues related
to language communities, preferences and idioms.
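For completeness, one such variation sketched in Python: passing `reduce` an explicit initial accumulator of 0 makes the empty summation well defined, much as swapping `foldl1` for `foldl' (+) 0` would on the Haskell side.

```python
from functools import reduce
from operator import add

def sum_to_n(n):
    # The third argument is the fold's initial accumulator, so an empty
    # range no longer raises TypeError; it simply returns 0.
    return reduce(add, range(1, n + 1), 0)

assert sum_to_n(0) == 0    # empty range: reduce(add, [], 0) == 0
assert sum_to_n(10) == 55
```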
You may now argue that the point of your original statements is
precisely about idioms. I don’t know what the recommended, dominant
idiom may be for any given portion of the Python community. I suppose
it largely depends on the background of the sample you take from the
Python community. This StackOverflow answer seems to suggest some
people recommend `reduce` with lambdas for this sort of construct, yet
others recommend `for` loops.
Perhaps this solution is even more representative of common idioms in
a way that is somewhat less specific to the toy problem of performing
a summation:
sumToN n = foldl1 (\ x y -> x + y) [1..n]
def sum_to_n(n): return reduce(lambda x, y: x + y, xrange(n + 1))
Do these perform the same as before? I don’t know about the one in
Haskell; I didn’t check. I suppose it may possibly depend on whether
tests are performed in GHCi or compiled code with various optimization
options. It should be absolutely trivial for the GHC simplifier to
reduce both into the same code, but does it work quite the same in
GHCi? I don’t know. The Python versions do appear to differ a bit in
performance according to comments in that StackOverflow post.
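At least the two Python spellings compute the same thing; any gap comes from the per-call overhead of a Python-level lambda versus the C-level `operator.add`, which is something `timeit` can measure rather than guess at (a sketch, not a rigorous benchmark):

```python
from functools import reduce
from operator import add
import timeit

n = 10**4
r = range(n + 1)  # range objects can be iterated repeatedly

# Both spellings denote the same strict left fold.
assert reduce(add, r) == reduce(lambda x, y: x + y, r)

# Measure the per-call overhead difference; absolute numbers vary by
# machine, so draw conclusions only from the ratio.
t_add = timeit.timeit(lambda: reduce(add, r), number=100)
t_lam = timeit.timeit(lambda: reduce(lambda x, y: x + y, r), number=100)
```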
In any case, loops are also a common, recommended idiom in Python:
sumToN n = runST $ do
  s <- newSTRef 0
  for_ [1..n] $ \ i ->
    modifySTRef' s (+ i)
  readSTRef s
def sum_to_n(n):
    s = 0
    for i in xrange(n + 1):
        s += i
    return s
I’m pretty sure most Haskell novices wouldn’t write this one, just as
I’m pretty sure most Python novices wouldn’t use `reduce`. These two
behave more or less the same: they don’t leak space. They would leak
space if you didn’t know you had to use `xrange` instead of `range`,
or if you didn’t know you had to use `modifySTRef'` instead of
`modifySTRef` (although the specific mechanism for the space leak is
not precisely the same, but the causes are quite similar). Python
solved this issue in recent versions by getting rid of the one that’s
arguably less useful; in Haskell, there are warnings in the
documentation.
I must confess I fell into the trap when I wrote the Haskell version:
I forgot to use `modifySTRef'`. I also fell into the `range` trap a
few days ago writing Python 2.7 at work; the only reason I avoided it
today is that it bit me very recently. I’ve been programming for 16
years and writing Haskell and Python for work and leisure for about 6
years.
My point is that understanding these subtleties is important and you
cannot be an effective software engineer if you don’t know how to
avoid these sorts of pitfalls in whatever technology and style of
expression you pick. Moreover, a modern, “multi-paradigm” language
will support and provide incentives for various styles of expression
to varying degrees, and they come with code quality and performance
tradeoffs that are not necessarily clear-cut.
Programming is tricky and abstractions leak. Python uses fewer
abstractions than Haskell, and Haskell’s abstractions (of comparable
abstractness) leak less, and that’s how they both manage to be used
successfully by practitioners. I prefer Haskell, because in the long
run, having abstractions leak less allows me to compose taller towers
of abstraction, which gives me more power as a software engineer.
Whether this is relevant for a novice depends, I suppose, on where the
novice wishes to go.
Note: Some imports and pragmas may be necessary to make these examples
work in their respective languages. I believe it’s fair to leave them
out in all cases; providing a batteries-included default namespace is
a significant issue, but it’s somewhat independent of these musings.