[Haskell-cafe] about Haskell code written to be "too smart"

Sat Apr 4 04:33:16 EDT 2009

>   takeListSt' = evalState . foldr k (return []) . map (State . splitAt)
>     where k m m'    = cutNull $ do x<-m; xs<-m'; return (x:xs)
>           cutNull m = do s<-get; if null s then return [] else m

Not only is ths not that elegant anymore, I think it *still* has a
bug, stack overflow against

testP pf = mapM_ putStrLn  [
          show $ take 5 $ pf (repeat 0) [1,2,3]
          , show $ pf ( take 1000 [3,7..] ) [1..100]
          , show . pf [3,7,11,15] $ ( take (10^6) [1..])
          , show . head . last $ pf (take 1000 $ [3,3..]) [1..10^6]
        ]

where the first test (with take 5) is new.

whereas the version with explicit recursion and pattern matching
doesn't suffer from this problem

partitions [] xs = []
partitions (n:parts) xs =
  let (beg,end) = splitAt n xs
  in  beg : ( case end of
               [] -> []
               xs -> partitions parts xs)

I am starting to think that the tricky part in all these functions is
that by using higher order functions from the prelude, you sweep the
failure case under the rug. Specifically, what happens when splitAt n
doesn't have a list of length n? The answer isn't in fact obvious at
all. I can think of three things that could hapen.

You coud return (list,[]) where list is however many elements there
are left. (Which is what all the partitions functions do so far, and
the default behavior of splitAt.

Or, you could print an error message.

Or, you could return ([],[])

My tentative conclusion is that good haskell style makes error
modalities explicit when error behavior isn't obvious, or when there
is arguably more than one right way to fail. So:

partitionsE = partitionsE' error
partitionsE2 = partitionsE' ( \e n xs -> [])
partitionsE3 = partitionsE' (\e n xs -> [take n xs]) -- corresponds to
the behavior of partitions

partitionsE' err [] xs = []
partitionsE' err (n:parts) xs =
  case splitAtE n xs of
    Left e -> err e n xs
    Right (beg,end) ->
      beg : ( case end of
        [] -> []
        xs -> partitionsE' err parts xs )
  where splitAtE n as@(x:xs) | n <= length as = Right $ splitAt n as
        splitAtE n ys = Left $ "can't split at " ++ (show n) ++ ": "
++ (show ys)

2009/3/26 Claus Reinke <claus.reinke at talk21.com>:
> Continuing our adventures into stylistic and semantic differences:-)
>
> Comparing the 'State' and explicit recursion versions
>
>   takeListSt = evalState . mapM (State . splitAt)
>
>   -- ..with a derivation leading to..
>
>   takeListSt []    s = []
>   takeListSt (h:t) s = x : takeListSt t s'
>     where (x,s') = splitAt h s
>
> instead of
>
>   takeList [] _         =  []
>   takeList _ []         =  []
>   takeList (n : ns) xs  =  head : takeList ns tail
>       where (head, tail) = splitAt n xs
>
> we can see some differences, leading to different functions:
>
>   *Main> null $ takeListSt [1] undefined
>   False
>   *Main> null $ takeList [1] undefined
>   *** Exception: Prelude.undefined
>   *Main> takeList [0] []
>   []
>   *Main> takeListSt [0] []
>   [[]]
>
> and similarly for the 'scanl' version
>
>   takeListSc ns xs = zipWith take ns $ init $ scanl (flip drop) xs ns
>
> Depending on usage, these differences might not matter, but what if
> we want these different styles to lead to the same function, with only
> stylistic and no semantic differences, taking the explicit recursion as
> our spec?
>
> In the 'State' version, the issue is that 'mapM' does not terminate
> early, while the specification requires an empty list whenever 'xs'
> (the state) is empty. Following the derivation at
>
> http://www.haskell.org/pipermail/haskell-cafe/2009-March/058603.html
>
> the first step where we have a handle on that is after unfolding
> 'sequence':
>
>   takeListSt = evalState . foldr k (return []) . map (State . splitAt)
>     where k m m' = do x<-m; xs<-m'; return (x:xs)
>
> If we change that to
>
>   takeListSt' = evalState . foldr k (return []) . map (State . splitAt)
>     where k m m'    = cutNull $ do x<-m; xs<-m'; return (x:xs)
>           cutNull m = do s<-get; if null s then return [] else m
>
> and continue with the modified derivation, we should end up with
> the right spec (I haven't done this, so you should check!-). This
> isn't all that elegant any more, but support for 'mapM' with early
> exit isn't all that uncommon a need, either, so one might expect
> a 'mapM' variant that takes a 'cut' parameter to make it into the
> libraries.
>
> For the 'scanl' version, we have a more direct handle on the issue:
> we can simply drop the offending extras from the 'scanl' result,
> replacing 'init' with 'takeWhile (not.null)':
>
>   takeListSc' ns xs = zipWith take ns $ takeWhile (not.null) $ scanl (flip
> drop) xs ns
>
> A somewhat abbreviated derivation at the end of this message
> seems to confirm that this matches the spec (as usual with proofs,
> writing them down doesn't mean that they are correct, but that
> readers can check whether they are).
>
> (btw, both 'takeListSt'' and 'takeListSc'' pass Thomas' 'testP', as does
> his 'partitions', but 'partitions' is not the same function as 'takeList':
> consider 'null $ takeList [1] undefined' and 'takeList [0] []' ;-)
>
> Someone suggested using 'mapAccumL' instead of 'State', and
> that does indeed work, only that everything is the wrong way round:
>
>   takeListMAL = (snd.) . flip (mapAccumL (((snd&&&fst).).(flip splitAt)))
>
> This is an example where all the "cleverness" is spent on the
> irrelevant details, giving them way too much importance. So one
> might prefer a version that more clearly says that this is mostly
> 'mapAccumL splitAt', with some administratory complications
> that might be ignored on cursory inspection:
>
>   takeListMAL' = mapAccumL' splitAt'
>     where splitAt' l n       = swap $ splitAt n l
>           mapAccumL' f l acc = snd $ mapAccumL f acc l
>           swap (x,y)         = (y,x)
>
> Of course, this suffers from the "does not terminate early" issue,
> but as this thread encourages us to look at functions we might
> not otherwise consider, I thought I'd follow the suggestion, and
> perhaps someone might want to modify it with a 'mapAccumL'
> with cutoff, and demonstrate whether it matches the spec;-)
>
> Claus
>
> -- view transformation: reducing the level of abstraction
>
> takeList ns xs = zipWith take ns $ takeWhile (not.null) $ scanl (flip drop)
> xs ns
>
> -- fetch definitions of 'zipWith', 'takeWhile', and 'scanl'
>
> takeList ns xs = zipWith take ns $ takeWhile (not.null) $ scanl (flip drop)
> xs ns
>  where scanl f q ls = q : case  ls of
>                            [] -> []
>                            x:xs -> scanl f (f q x) xs
>       takeWhile _ []                 = []
>       takeWhile p (x:xs) | p x       = x : takeWhile p xs
>                          | otherwise = []
>       zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
>       zipWith _ _      _      = []
>
> -- specialize for 'take', 'not.null', and 'flip drop'
>
> takeList ns xs = zipWith ns $ takeWhile $ scanl xs ns
>  where scanl q ls = q : case  ls of
>                            [] -> []
>                            x:xs -> scanl (drop x q) xs
>       takeWhile []                    = []
>       takeWhile (x:xs) | not (null x) = x : takeWhile xs
>                        | otherwise    = []
>       zipWith (a:as) (b:bs) = take a b : zipWith as bs
>       zipWith _      _      = []
>
> -- fuse 'takeWhile' and 'scanl' into 'tws'
>
> takeList ns xs = zipWith ns $ tws xs ns
>  where tws q ls | not (null q) = q : case  ls of
>                                      [] -> []
>                                      x:xs -> tws (drop x q) xs
>                | otherwise    = []
>       zipWith (a:as) (b:bs) = take a b : zipWith as bs
>       zipWith _      _      = []
>
> -- fuse 'zipWith' and 'tws' into 'ztws'
>
> takeList ns xs = ztws ns xs ns
>  where ztws (a:as) q ls | not (null q) = take a q : case  ls of
>                                                      [] -> []
>                                                      x:xs -> ztws as (drop x
> q) xs
>                        | otherwise    = []
>       ztws _      _ _                 = []
>
> -- 'ls' is 'as'
>
> takeList ns xs = ztws ns xs
>  where ztws (a:as) q | not (null q) = take a q : ztws as (drop a q)
>                     | otherwise    = []
>       ztws _      _                = []
>
> -- remove indirection
>
> takeList (a:as) q | not (null q) = take a q : takeList as (drop a q)
>                 | otherwise    = []
> takeList _      _                = []
>
> -- replace guard by clause
>
> takeList (a:as) [] = []
> takeList (a:as) q  = take a q : takeList as (drop a q)
> takeList _      _  = []
>
> -- '_' in last clause has to be '[]'
>
> takeList (a:as) [] = []
> takeList (a:as) q  = take a q : takeList as (drop a q)
> takeList []     _  = []
>
> -- switch non-overlapping clauses
>
> takeList []     _  = []
> takeList (a:as) [] = []
> takeList (a:as) q  = take a q : takeList as (drop a q)
>
> -- for second parameter '[]', both ':' and '[]' in first parameter result in
> '[]'
>
> takeList []     _  = []
> takeList _      [] = []
> takeList (a:as) q  = take a q : takeList as (drop a q)
>
> -- (take a q,drop a q) = splitAt a q
>
> takeList []     _  = []
> takeList _      [] = []
> takeList (a:as) q  = t : takeList as d
>  where (t,d) = splitAt a q
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>