[Haskell-cafe] Re: Knot tying vs monads

Mon Nov 19 11:42:41 EST 2007

John D. Ramsdell wrote:
> On Nov 17, 2007 3:04 PM, apfelmus <apfelmus at quantentunnel.de> wrote:
> 
>> Unfortunately, I don't have Paulson's book (or any other ML book :) at
>> home. I'm too lazy to figure out the specification from the source code,
> 
> I guess the code is too opaque, as my colleague claimed.
> 
> The layout the algorithm generates condensed indented blocks.  Within a
> block, it inserts a newline when the distance to the next break point plus
> the current position is greater than the space remaining on the current
> line.   Thus if S-Expression lists are rendered as blocks with indent two,
> and every element in a list is separated by a break point of length one,
> with the proper margin, you would see:
> 
> (defthingy name-of-thingy
>   (one thing) (two thing)
>   (a-big-thing-made-bigger)
>   (three thing) (four thing))
> 
> As an exercise, the book asks you to implement group indent, where if any
> break point in a group inserts a newline, they all do.  So with that layout,
> one would get:
> 
> (defthingy
>   name-of-thingy
>   (one thing)
>   (two thing)
>   (a-big-thing-made-bigger)
>   (there thing)
>   (four thing))
> 
> The C version I wrote supports this layout, but I didn't bother with that
> extension for the Haskell version.

Thanks. The interesting case of nested blocks still needs to be 
specified, but with this description in mind and judging from the code, 
I guess it behaves as follows: either a block fits entirely on the 
remaining line (no line breaks inside), or it begins a new line.

Now, the quest of programming is to make this description executable by 
computers while keeping it understandable by humans.

This is straightforward to do with Wadler's pretty printer combinators 
(package "wl-pprint" on http://hackage.haskell.org )

   data S = Atom String | List [S]  -- S-expressions

   layout :: Int -> [S] -> Doc
   layout indent []     = empty
   layout indent (x:xs) = foldr1 (<>) (render x : map f xs)
     where
     f x@(Atom _) = group line  <> render x
     f x@(List _) = group (line <> render x)

     render (Atom s ) = text s
     render (List xs) = nest indent $ parens $ layout xs

The semantics of  Doc  are (for more, see Wadler's paper):  Doc  is a 
document with a set of different layouts, where the only difference 
between them is that some  line  primitives are rendered as ("\n" ++ 
replicate currentIndentation ' ') and some are rendered as a single 
space. Now,  group x  adds a new layout to the set  x , namely the 
layout where all  line  in  x  have been flattened to a single space. 
This way, the definition of  f  directly reflects the alternative 
"either a block fits entirely on the remaining line or it begins a new 
line".

Your group indent extension is even easier, namely just

   layout2 :: Int -> [S] -> Doc
   layout2 indent = sep . map render
     where
     render (Atom s ) = text s
     render (List xs) = nest indent $ parens $ layout2 xs

with the functions

   sep     = group . foldr (<$>) empty
   x <$> y = x <> line <> y

from the library.

> On the strictness annotations, my reasons for them are the usual ones,
> primarily to prevent memory leaks due to dragging, but a performance boost
> is always welcome.  At some point, I plan to profile the code with and
> without the annotations, and find out where they are needed.

That seems excessive. Can you really prove that this will prevent space 
leaks? I doubt that.

Laziness is still "useful" (example: early bail-out in your  breakdist ) 
if only the data structures are fully evaluated as opposed to every 
function being strict, so it's an interesting idea. But that doesn't 
guarantee zero space leaks, since

   sumFromTo :: Int -> Int -> Int
   sumFromTo a b = f a b 0
     where f a b k = if a == b then k else f (a+1) b (k+a)

is one. Also, you won't be able to conveniently use lists as "suspended 
loops" which would be a pity.

Regards,
apfelmus