[Haskell-cafe] about Haskell code written to be "too smart"

Claus Reinke claus.reinke at talk21.com
Sat Apr 4 06:17:50 EDT 2009


>   takeListSt' = evalState . foldr k (return []) . map (State . splitAt)
>     where k m m'    = cutNull $ do x<-m; xs<-m'; return (x:xs)
>           cutNull m = do s<-get; if null s then return [] else m

|Not only is ths not that elegant anymore, 

As I was saying, sequence/mapM with early cutout is common
enough that one might want it in the libraries, which would return
this variant into readability again.

|I think it *still* has a bug, stack overflow against

That would really surprise me. Not that it is impossible - as I was
also saying, I haven't replayed the derivation for the modified code.
However, the modification was arrived at by taking the original
derivation, seeing where its result deviated from the explicitly
recursive specification, and spotting the aspect of the implicitly
recursive version that was responsible for the deviation. 

Of course, the derivation itself could have an error, but equating 
the functions themselves gives me rather more confidence/coverage 
than any finite number of tests could. If I were to enter the derivation
into a proof checking tool and be successful, that would further raise
the level of confidence/coverage (leaving bugs in the proof checker).

Note that I'm not asking whether the original spec did the "right"
thing, only whether or not the variations "correctly" do the same
thing as the original spec.

|testP pf = mapM_ putStrLn  [
|          show $ take 5 $ pf (repeat 0) [1,2,3]
|          , show $ pf ( take 1000 [3,7..] ) [1..100]
|          , show . pf [3,7,11,15] $ ( take (10^6) [1..])
|          , show . head . last $ pf (take 1000 $ [3,3..]) [1..10^6]
|        ]
|
|where the first test (with take 5) is new.
|whereas the version with explicit recursion and pattern matching
|doesn't suffer from this problem

I get identical results for 'takeListSt'' and the original 'takeList1'
(code repeated below). 

It took me a couple of moments to remember that you had been 
playing with Control.Monad.State.Strict instead of the default 
Control.Monad.State(.Lazy). That would invalidate the original 
derivation (different definition of '>>=', therefore a different end 
result after unfolding '>>='), let alone the modified code based on it. 

If you replay the derivation, taking the strictness variations into 
account, you should arrive at an explicitly recursive version that
differs from the spec. Which might make it easier to see what
the difference is.

|partitions [] xs = []
|partitions (n:parts) xs =
|  let (beg,end) = splitAt n xs
|  in  beg : ( case end of
|               [] -> []
|               xs -> partitions parts xs)

That version cannot be transformed into the original spec, because
it doesn't define the same function. As I mentioned:

> (btw, both 'takeListSt'' and 'takeListSc'' pass Thomas' 'testP', as does
> his 'partitions', but 'partitions' is not the same function as 'takeList':
> consider 'null $ takeList [1] undefined' and 'takeList [0] []' ;-)

With the original spec

takeList1 [] _         =  []
takeList1 _ []         =  []
takeList1 (n : ns) xs  =  h : takeList1 ns t
    where (h, t) = splitAt n xs

and 'takeList4' being your 'partitions', we get:

*Main> null $ takeList1 [1] undefined
*** Exception: Prelude.undefined
*Main> null $ takeList4 [1] undefined
False
*Main> takeList1 [0] []
[]
*Main> takeList4 [0] []
[[]]

>I am starting to think that the tricky part in all these functions is
>that by using higher order functions from the prelude, you sweep the
>failure case under the rug. 

Yes, the reason that more abstract functions are useful is that they
hide irrelevant details, allowing us to spend our limited capacity on
relevant details. If all abstract functions happen to hide details that 
matter, more concrete functions that expose those details can be 
more helpful. 

But even that has to be qualified: for instance, at first I found it easier 
to see the issues with the original 'State' variant in its transformed, 
explicitly recursive version, but after the derivation had convinced 
me that there was no magic going on, I realized that it was just the 
old 'mapM' doesn't stop early issue. So I could have seen the issue 
in the abstract form, but my mind (and apparently other minds, too;-) 
refused to think about the cornercases there until prompted. 

If not for this tendency to ignore details that might be relevant, the 
abstract code would provide an abstract treatment of the failure 
case as well: instead of working out the details by trying to find 
useful tests for the explicit pattern matches, we can just look at
wether the definition uses 'mapM' or 'mapMWithCut', or whether
it uses 'Control.Monad.State' or 'Control.Monad.State.Strict'.

Just exposing all the details all the time isn't going to help, as the
'partition' example demonstrates: we might still ignore the relevant
details, this time not because they are hidden in abstractions, but
because they are hidden in other irrelevant details. There really
isn't a single view of software that will serve all purposes, one
has to find appropriate views for every task, including using
multiple views of the same piece of software. Which is where
program transformation comes in handy!-)

>Specifically, what happens when splitAt n doesn't have a list 
>of length n? The answer isn't in fact obvious at all. I can think 
>of three things that could hapen.

I agree that the "right" thing to do can be a matter of dispute, and
so I based my definition of "correct" on equivalence to a specific 
version of the code, the first explicitly recursive version I could 
find ('takeList1' above).

Claus



More information about the Haskell-Cafe mailing list