[Haskell-cafe] Re: ANNOUNCE: Utrecht Haskell Compiler (UHC) -- first release
Richard O'Keefe
ok at cs.otago.ac.nz
Tue Apr 21 23:34:58 EDT 2009
On 21 Apr 2009, at 7:39 pm, Jason Dagit wrote:
> Not really. Obviously some programs use the feature, but let us
> restrict to interesting programs that have been shared with the world
> and have some potential to receive maintenance.
Why?
You are, in effect, saying that my code has no value at all.
You are saying that code written by students has no value at all.
Why do you think that only code that is "shared with the world"
has "some potential to receive maintenance"?
By the way, not all publicly available code is in Hackage.
The hbc release that's on my SPARC -- and thankful I've been for
it, what with the grief GHC has given me there -- has at least one
use of an n+k pattern that I know of, and more in a specification comment.
> From these programs
> we can do a sampling. While I'm not a statistics expert, my
> understanding is the main problem with using hackage packages is a bit
> of selection bias.
I can see no reason to assume that it's only "a bit".
Maybe it's a bit. Maybe it's a very great deal.
It would be interesting to investigate this, but the only way
you can investigate it is to examine a lot of code that
_isn't_ there.
> I bet the selection bias isn't even that bad for
> this statistical test due to the nature of programming style
> diversity. Maybe someone with a stronger stats background could
> comment.
I have a statistics degree. I don't know if that's strong enough
for you. It's strong enough that I assume selection bias until I
see evidence otherwise.
> I think that would give us an exhaustive collection of haskell code,
> but I assert we don't need that. Biologists don't need a DNA sample
> from every organism to draw conclusions about the genetics of a
> species.
It depends on what _kind_ of conclusion they want to draw.
If they want to show that some feature _is_ present, a sample
will do. If they want to show that it's absent or rare, then
they need a much bigger sample.
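To put a number on it: if a feature occurs in a fraction p of all
programs and you sample n of them independently (an assumption that
selection bias already spoils), the chance of never seeing the
feature at all is (1-p)^n. A throwaway sketch of my own (the name
missProb is mine, not anybody's measurement):

    -- Chance that a sample of n programs shows no use at all of a
    -- feature whose true prevalence is p, assuming independent draws.
    missProb :: Double -> Int -> Double
    missProb p n = (1 - p) ^ n

    main :: IO ()
    main = print (missProb 0.01 100)   -- roughly 0.366

So a 100-package trawl misses a feature used in 1% of programs more
than a third of the time.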
> Scientists work with incomplete data and draw sound
> conclusions in spite of that. The tools they use to do so are known
> as statistics.
Yes, I know. That's why I get cross when people suggest silly
things like trawling through Hackage to demonstrate that nobody
is using n+k patterns. Where's the statistics in that? Where are
the estimates of natural variation?
Note: I do not assert that the use of n+k patterns is common.
Here's _all_ that I assert about n+k patterns:
(1) they are part of Haskell 98
(2) I believe they make programs more readable
(3) I use them
(4) they are no worse than certain features proposed for
addition to Haskell'.
> Okay, then prove n+k patterns are not rare in the publicly available
> sources.
Why the X should I? I do not claim that they are common
IN THE PUBLICLY AVAILABLE SOURCES, I have NEVER claimed that,
and I don't even CARE whether they are rare in the publicly
available sources or not.
Because they make programs more readable, n+k patterns
probably *should* be moderately common in the publicly available
sources, but I have no idea whether they are or not.
It *is* true that things that *are* used in the commonly
available sources should continue to be supported in order
to preserve the value of those commonly available sources.
It is *not* true that things that are *not* used in the
commonly available sources are therefore of no value and
safely to be discarded.
> That's the challenge I was trying to make in my first email.
It's a challenge to prove the irrelevant.
Let's consider three versions of naive Fibonacci.
    fibA :: (Integral a, Integral b) => a -> b
    fibA 0     = 1
    fibA 1     = 1
    fibA (n+2) = fibA (n+1) + fibA n
Simple, readable, Haskell 98.
    pred2 :: (Integral a) => a -> Maybe a
    pred2 n = if n >= 2 then Just (n-2) else Nothing

    fibB :: (Integral a, Integral b) => a -> b
    fibB 0 = 1
    fibB 1 = 1
    fibB x | Just n <- pred2 x = fibB (n+1) + fibB n
Uses a pattern guard, implemented in GHC.
Pattern guards are neat. I like them a lot. They make sense.
But it's impossible for me to see fibB as more readable than
fibA.
While pattern guards come _close_ to offering a replacement
for n+k patterns, they don't quite. If I had
    f x (n+1)
      | p x = ...
      | q x = ...
I would have to write the pattern guard twice as
    f x n'
      | Just n <- pred1 n', p x = ...
      | Just n <- pred1 n', q x = ...
That doesn't seem like an advantage, somehow.
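(Here pred1 would be the one-step analogue of pred2; a sketch of
mine, since it isn't defined above:

    pred1 :: Integral a => a -> Maybe a
    pred1 n = if n >= 1 then Just (n-1) else Nothing
)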
Now for the third alternative.
The view proposal in the Haskell' wiki and the views implemented
in GHC are different. In fact the view proposal document goes to
some trouble to show how views can replace n+k patterns, so I
suppose I don't need to review that. Here's what it looks like
using GHC syntax. (I can't make ghc 6.8.3 accept this;
ghc --supported-languages does not list ViewPatterns. So this is
UNTESTED CODE!)
    data Integral a => Nat2 a = Succ2 a | One2 | Zero2

    nat2 :: Integral a => a -> Nat2 a
    nat2 n | n >= 2 = Succ2 (n-2)
    nat2 1 = One2
    nat2 0 = Zero2
    nat2 _ = error "nat2: not a natural number"

    fibC (nat2 -> Zero2)   = 1
    fibC (nat2 -> One2)    = 1
    fibC (nat2 -> Succ2 n) = fibC (n+1) + fibC n
I like views a lot. The GHC version of views seems particularly
tidy. But again, does anyone really think this makes the code
*more* readable?
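For what it's worth, a GHC that does list ViewPatterns (6.10, I
believe) should take the code above once the module starts with the
pragma; still untested by me:

    {-# LANGUAGE ViewPatterns #-}
    -- Nat2, nat2 and fibC then compile as written above.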
I suppose I should include the 4th version:
    fibD :: (Integral a, Integral b) => a -> b
    fibD 0 = 1
    fibD 1 = 1
    fibD n = if n >= 2 then fibD (n-1) + fibD (n-2)
                       else error "fibD: not a natural number"
That doesn't look like an improvement in readability or
maintainability or any other -ility to me.
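If anyone wants to check that the variants agree rather than argue
about it, here is a minimal self-contained test of mine, using fibB
and fibD only since they need nothing beyond pattern guards:

    pred2 :: Integral a => a -> Maybe a
    pred2 n = if n >= 2 then Just (n-2) else Nothing

    fibB :: (Integral a, Integral b) => a -> b
    fibB 0 = 1
    fibB 1 = 1
    fibB x | Just n <- pred2 x = fibB (n+1) + fibB n

    fibD :: (Integral a, Integral b) => a -> b
    fibD 0 = 1
    fibD 1 = 1
    fibD n = if n >= 2 then fibD (n-1) + fibD (n-2)
                       else error "fibD: not a natural number"

    -- Compare the two on small arguments; prints True.
    main :: IO ()
    main = print (and [ (fibB n :: Integer) == fibD n
                      | n <- [0 .. 15 :: Integer] ])

It prints True, which says nothing about readability, of course.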