[Haskell-cafe] Re: Knot tying vs monads

Fri Nov 16 05:37:14 EST 2007

John D. Ramsdell wrote:
> This is another Haskell style question.
> 
> I had some trouble with the pretty printer that comes with GHC, so I
> translated one written in Standard ML.  I have already translated the
> program into C, so rewriting it in Haskell was quick and easy for me.

Concerning the choice of a pretty printer, the one bundled in GHC is 
close to

   John Hughes. The Design of a Pretty-printing Library.
   http://citeseer.ist.psu.edu/hughes95design.html

but there's also

   Philip Wadler. A prettier printer.
   http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf

(probably available as a library on hackage). Btw, both papers are 
marvelous introductions to the derivation of programs from their 
specification.

Compared to that, I'm missing the specification part for your pretty 
printer. How's it supposed to lay out?

> The Standard ML version uses a reference cell to keep track of the
> space available on a line.  I threaded the value of the reference cell
> through the computation using a where clause to define two mutually
> recursive equations.  The fixed point implicit in the where clause
> ties the knot in the circular definitions in a way that allows the
> output string to be efficiently computed front to back.
> 
> I showed the code to a colleague, who found the circular definitions
> opaque.  He suggested a better style is to use monads, and describe
> the computation in a mode that is closer to its origin form in
> Standard ML.
>
> What style do to you prefer, a knot-tying or a monad-based style?  I
> have enclosed the pretty printer.  The printing function is the
> subject of the controversy.

Neither, I think that the code mixes too many concerns. You need neither 
knot tying nor monads for efficient string concatenation, a simple 
difference list

   type DString = Data.DList String = String -> String

will do. (There's a small difference list library Data.DList available 
on hackage). If ++ is too inefficient, then simply switch to a different 
String implementation with a faster ++.

Introducing a difference list means to replace the output type

   (Int, String) -> (Int, String)

of  printing  not by

   Int -> (String -> (Int, String)) -- state monad with state String

but by

   Int -> (Int, String -> String)   -- difference list

Furthermore, I guess that this can probably be replaced by

   Int -> (String -> String)
   (Int -> Int, String -> String)

or made entirely abstract

   type X = (Int, String) -> (Int, String)

   blanks :: Int -> X
> blanks n (space, s)
>      | n <= 0 = (space, s)
>      | otherwise = blanks (n - 1) (space - 1, showChar ' ' s)

   string :: String -> X
   string s (space,t) = (space - length s, s ++ t)

or something like that. I don't know what your printer is supposed to 
do, so I can't say for sure.

>> module Pretty(Pretty, pr, blo, str, brk) where
> 
>> data Pretty
>>     = Str !String
>>     | Brk !Int              -- Int is the number of breakable spaces
>>     | Blo ![Pretty] !Int !Int -- First int is the indent, second int
>>     --  is the number of chars and spaces for strings and breaks in block

Drop those strictness annotations from !String and ![Pretty], they won't 
do any good. The !Int are only useful if they will be unboxed, but I 
wouldn't bother right now.

> Indentation blocks
> 
>> blo :: Int -> [Pretty] -> Pretty
>> blo indent es =
>>     Blo es indent (sum es 0)
>>     where
>>       sum [] k = k
>>       sum (e:es) k = sum es (size e + k)
>>       size (Str s) = length s
>>       size (Brk n) = n
>>       size (Blo _ _ n) = n

  size  is of independent value, I'd make it a top-level function. Oh, 
and the  sum  won't be tail-recursive (until ghc's strictness analyzer 
figures it out). I'd like to point you to

   http://haskell.org/haskellwiki/Performance/Accumulating_parameter

for an explanation of why, but the information there is rather 
inaccurate. For the moment, I could only find

   http://monad.nfshost.com/wordpress/?p=19

   last section of
   http://blog.interlinked.org/tutorials/haskell_laziness.html

but isn't there a short text that describes in detail why foldl' is 
different from foldl and why foldr is "better" in many cases? I thought 
this faq would have been cached already :)

In any case, I'd simply write

   blo indent es = Blo es indent . sum . map size $ es

( sum  is a function from the Prelude.)

Regards,
apfelmus