[Haskell-cafe] HughesPJ vs. Wadler-Leijen

Tue Mar 20 07:50:46 CET 2012

I've been trying to get my head around the Wadler-Leijen pretty
printing combinators for a while now and am having some trouble.

Specifically, I have trouble getting them to pick optimal line breaks.
The existing combinators like 'sep' (and everything built from it)
merge all elements with <$> and then 'group' the whole thing, with the
result that the elements either all go on one line, or get one line
each.  This is quite ugly for large lists of small elements.  The
alternative is 'fillSep', which does a separate 'group' on each line
break between elements.  Unfortunately, it then tends to make very bad
line wrapping decisions, e.g. you get:

Rec { hi = "there" }
Rec
  { hi = "there", hi = "there"
  , hi = "there"
  }
Rec
  { lab = "short", label =
                     [ 0, 1, 2
                     , 3, 4, 5
                     , 6, 7, 8
                     , 9, 10
                     , 11, 12
                     ]
  }
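
To make the sep/fillSep difference concrete, here is a minimal sketch,
assuming the wl-pprint package's Text.PrettyPrint.Leijen module.  The
names items, allOrNothing, filled and renderAt are made up for this
example and are not part of my real code.

```haskell
import Text.PrettyPrint.Leijen
    ( Doc, brackets, comma, displayS, fillSep, int, punctuate
    , renderPretty, sep )

-- Thirteen small elements, each followed by a comma except the last.
items :: [Doc]
items = punctuate comma (map int [0 .. 12])

-- 'sep' groups the whole list at once: everything goes on one line,
-- or every element gets its own line.
allOrNothing :: Doc
allOrNothing = brackets (sep items)

-- 'fillSep' decides each line break separately, so it packs as many
-- elements per line as will fit.
filled :: Doc
filled = brackets (fillSep items)

-- Render at a given page width, with the ribbon as wide as the page
-- (an assumption made to keep the example simple).
renderAt :: Int -> Doc -> String
renderAt w d = displayS (renderPretty 1.0 w d) ""
```

In GHCi, putStrLn (renderAt 20 allOrNothing) should put every element
on its own line, while putStrLn (renderAt 20 filled) should pack
several elements per line.  At width 80 both should fit on one line.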

No matter how many fancy 'group's and 'nest's and whatnot I threw in,
it was always a choice between forcing wrapping on every element and
looking ugly for many small elements, or trying to fit more into one
line and having it wrap in the wrong place and wind up scrunched up on
the right margin.  Here's my latest attempt:

list = commas PP.lbracket PP.rbracket . map format

commas :: Doc -> Doc -> [Doc] -> Doc
commas left right xs = PP.group $
    left <+> punctuate (\x -> PP.group (x <$$> PP.comma <> PP.space)) xs
        <$> right

-- Apply f to every element except the last, then concatenate.
punctuate :: (Doc -> Doc) -> [Doc] -> Doc
punctuate _ [] = mempty
punctuate _ [x] = x
punctuate f (x:xs) = f x <> punctuate f xs

record :: Doc -> [(String, Doc)] -> Doc
record title fields = PP.group $
    PP.hang 2 $ title <$> (commas PP.lbrace PP.rbrace (map f fields))
    where
    f (label, field) = PP.hang 2 $ PP.group $
        PP.text label <+> PP.equals <$> field

But the thing is, the HughesPJ-using Language.Haskell.Pretty in
haskell-src gets the line wrapping just right.  So I investigated how
it works, and it's very simple.  Here's the reduced version:

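-- On GHCs whose Prelude lacks (<>), also import (<>) from this module.
import qualified Text.PrettyPrint.HughesPJ as PP
import Text.PrettyPrint.HughesPJ (Doc)

class Pretty a where format :: a -> Doc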

list :: (Pretty a) => [a] -> Doc
list = bracket_list . PP.punctuate PP.comma . map format

fsep' :: [Doc] -> Doc
fsep' [] = PP.empty
fsep' (d:ds) = PP.nest 2 (PP.fsep (PP.nest (-2) d:ds))

bracket_list :: [Doc] -> Doc
bracket_list = PP.brackets . PP.fsep

brace_list :: [Doc] -> Doc
brace_list = PP.braces . PP.fsep

record :: Doc -> [(String, Doc)] -> Doc
record title fields = title <> (brace_list (map field fields))
    where
    field (name, val) = fsep' [PP.text name, PP.equals, val]

----

This formats records like so:

Rec{hi = "there"}
Rec{hi = "there" hi = "there"
    hi = "there"}
Rec{label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]
    label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]}
Rec{lab = "short"
    label =
      [0, 1, 2, 3, 4, 5, 6, 7,
       8, 9, 10, 11, 12, 13,
       14, 15, 16, 17, 18, 19,
       20, 21, 22, 23, 24, 25,
       26, 27, 28, 29, 30]}
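
One way to poke at what the negative nest in fsep' changes is to
render the same field with plain fsep and with fsep' at a narrow
width.  This is a minimal sketch using the pretty package's
Text.PrettyPrint.HughesPJ; labelled and renderAt are names invented
for the experiment, and the width of 30 is an arbitrary choice.

```haskell
import qualified Text.PrettyPrint.HughesPJ as PP
import Text.PrettyPrint.HughesPJ (Doc)

-- fsep' exactly as in the reduced version above.
fsep' :: [Doc] -> Doc
fsep' [] = PP.empty
fsep' (d:ds) = PP.nest 2 (PP.fsep (PP.nest (-2) d : ds))

-- A "label = [0, 1, ..., 30]" field, built with whichever list
-- combinator is passed in (PP.fsep or fsep').
labelled :: ([Doc] -> Doc) -> Doc
labelled combine = combine [PP.text "label", PP.equals, numbers]
  where
    numbers =
        PP.brackets (PP.fsep (PP.punctuate PP.comma (map PP.int [0 .. 30])))

-- Render at a given line length, with the ribbon as wide as the line.
renderAt :: Int -> Doc -> String
renderAt w =
    PP.renderStyle (PP.style { PP.lineLength = w, PP.ribbonsPerLine = 1 })
```

Comparing putStrLn (renderAt 30 (labelled PP.fsep)) with
putStrLn (renderAt 30 (labelled fsep')) in GHCi, the bracketed list
should wrap onto its own line in both cases, but only fsep' should
indent that line by two columns.  That looks like the hanging indent
under "label =" in the output above.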

Much better, even if it's not my preferred style!  Of course WL
doesn't have fsep or the negative nest craziness (I don't even know
what it's doing there), but it has the more general 'group'.  However,
no matter how complicated I got with WL, it just never came out right.
In contrast, a very simple HughesPJ implementation gets it right.  The
thing is, when trying to figure out which pretty-printing library to
use, the consensus is that WL is just all-around better, even though
HughesPJ is somewhat standard (but there are 6 WL variants on Hackage,
and no (?) HughesPJ ones).  So am I just using it wrong?  If I
translate the HughesPJ one over directly into WL, here's what I get:

Rec{hi = "there"}
Rec{hi = "there", hi =
  "there", hi = "there"}
Rec{label = [0, 1, 2, 3, 4, 5,
  6, 7, 8, 9, 10, 11, 12, 13,
  14, 15, 16, 17, 18, 19, 20,
  21, 22, 23, 24, 25, 26, 27,
  28, 29, 30], label = [0, 1,
  2, 3, 4, 5, 6, 7, 8, 9, 10,
  11, 12, 13, 14, 15, 16, 17,
  18, 19, 20, 21, 22, 23, 24,
  25, 26, 27, 28, 29, 30]}
Rec{lab = "short", label = [0,
  1, 2, 3, 4, 5, 6, 7, 8, 9,
  10, 11, 12, 13, 14, 15, 16,
  17, 18, 19, 20, 21, 22, 23,
  24, 25, 26, 27, 28, 29, 30]}
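
One possible factor in the output above: as far as I understand it,
HughesPJ lays out a document placed with <> relative to the column
where it starts, whereas a WL line break returns to the current
nesting level unless the document is wrapped in 'align'.  A minimal
sketch of that difference, assuming wl-pprint's
Text.PrettyPrint.Leijen; nums, plain, aligned and renderAt are
illustrative names, and this is not a full translation of the record
printer.

```haskell
import Text.PrettyPrint.Leijen
    ( Doc, align, brackets, comma, displayS, fillSep, int, punctuate
    , renderPretty, text, (<+>) )

nums :: [Doc]
nums = punctuate comma (map int [0 .. 30])

-- Continuation lines fall back to the enclosing nesting level (0 here).
plain :: Doc
plain = text "Rec{label =" <+> brackets (fillSep nums)

-- 'align' sets the nesting level to the column where the list starts,
-- so continuation lines line up under the first element.
aligned :: Doc
aligned = text "Rec{label =" <+> brackets (align (fillSep nums))

-- Render at a given page width, with the ribbon as wide as the page.
renderAt :: Int -> Doc -> String
renderAt w d = displayS (renderPretty 1.0 w d) ""
```

At width 30, putStrLn (renderAt 30 plain) should wrap back to the left
margin, much like the output above, while putStrLn (renderAt 30 aligned)
should keep the wrapped numbers in a column just after the opening
bracket.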

So is WL really all it's cracked up to be?  Am I using it wrong?  I
was going to suggest some consolidation in the pretty printing library
packages, but now I'm not even sure which style should "win"...