[Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance
improvement
Benedikt Huber
benjovi at gmx.net
Wed Jun 25 08:41:20 EDT 2008
Simon PeytonJones schrieb:
> Thank you for doing this Benedict. I've added your more detailed comments to ticket #2393 so that they are preserved.
>
> Ian: would you like to apply? I'm not sure how to integrate the QuickCheck tests, but I bet you know.
I'm glad this is useful  and I'm impressed by the complexity of the
pretty printer library. I'm tackling the other tickets now and adding
more QuickCheck test.
> Benedict: while you are in the area, would you like to take a swing at http://hackage.haskell.org/trac/ghc/ticket/1337, and 1176?
In my opinion, ticket #667 can be marked invalid (see comments).
Concerning ticket #1337, we have to change the formal specification of
fill (it doesn't match the implementation):
 Current Specification:
 fill [] = empty
 fill [p] = p
 fill (p1:p2:ps) = oneLiner p1 <#> nest (length p1)
 (fill (oneLiner p2 : ps))
 `union`
 p1 $$ fill ps
Problem 1: We want to `unnest' the second argument of (p1 $$ fill ps),
but not the first one
In the definition above we have e.g.
> getSecondLayout $
> fillDef False [text "a", text "b", text "a"]
>> text "ab"; nilabove; nest 1; text "a"; empty
>> ab
>> .a
Problem 2: The overlapping $$ should only be used for those layouts of
p1 which aren't one liners (otherwise violating the invariant "Left
union arg has shorter first line").
I suggest the following specification (i believe it almost matches the
current implementation, modulo [fillNB: fix bug #1337] (see below):
 Revised Specification:
 fill g docs = fill' 0 docs
 gap g = if g then 1 else 0
 fill' n [] = []
 fill' n [p] = [p]
 fill' n (p1:p2:ps) =
 oneLiner p1 <g> (fill' (n+length p1+gap g) (oneLiner p2 : ps))
 `union`
 (p1 $*$ nest (n) (fill' g ps))

 $*$ is defined for layouts (OneLayout Documents) as

 layout1 $*$ layout2  isOneLiner layout1 = layout1 $+$ layout2
  otherwise = layout1 $$ layout2
I've also implemented the specification in HughesPJQuickCheck.hs,
and checked them against the patched pretty printer.
Concerning Bug #1337:
~~~~~~~~~~~~~~~~~~~~~
If the above formal specification is fine, it is easy to fix:
elide the nests of (oneLiner p2) [see attached patch, record bug #1337].
> PrettyPrint(0) $ ./Bug1337
> ....ab
> ...c
The (long) explanation follows below.
I'll look into the other tickets too.
thorkil: Can you help me with a simplified test case (pretty printer
only) of Bug #1176 ?
Also, I didn't find any regressions tests for the pretty printer  I'll
create some if someone points me in the right direction.
best regards,
benedikt
===========================================================
Explanation of Bug #1337:
Consider
> fcat [ nest 1 $ text "a", nest 2 $ text "b", text "c"]
> expected: (nest 1; text "a"; text "b"; nest 3; "c")
> actual : (nest 1; text "a"; text "b"; nest 5; "c")
Reduction:
=== (nest 1; text a) <> (fill (2) (p2:ps))
==> (nest 2 (text "b") $+$ text "c")
==> (nest 2 (text "b")) `nilabove`
(nest (3) (text "c"))
==> (nest 1; text a; text b; nest 5 c)
The problem is that if we decide to layout (p1:p2:ps) as
 p1 p2
 ps
(call it layout A), then we want to have
> (p1 <> p2) $+$ ps.
But following law <n6> this means that
> fcat_A [p1:nest k p2:ps]
is equivalent to
> fcat_A [p1,p2,ps]
so the nest of p2 has to be removed.
This is somewhat similar to bug #667, but easier to fix
from a semantic point of view:
p1,p2 and ps are distinct layouts  we only have to preserve the
individual layouts, and no combinations of them.
>
> Simon
>
>
>  Original Message
>  From: librariesbounces at haskell.org [mailto:librariesbounces at haskell.org] On Behalf Of Benedikt Huber
>  Sent: 24 June 2008 13:12
>  To: libraries at haskell.org
>  Subject: [Ticket #2393] Text.PrettyPrint.HughesPJ: Bug fixes, performance improvement
> 
>  Hello,
> 
>  I'd like to propose bugfixes, documentation fixes and a performance
>  improvement for Text.PrettyPrint.HughesPJ. The changes shouldn't
>  effect the expected behaviour of the PP library.
> 
>  I've written a QuickCheck test suite for the pretty printer (to test
>  the improvement), and found two bugs and some misconceptions/
>  ambiguities in the documentation. Additionally, there is a
>  microbenchmark for the suggested improvement.
>  Both are available at http://code.haskell.org/~bhuber/Text/
>  PrettyPrint/. Note that the QuickCheck tests need access to all top
>  level names in HughesPJ (i.e. ignore the export list).
> 
>  In summary, I propose to
>  * fix a bug in fillNB and one in fillNB/sepNB
>  * correct documentation on laws and invariants.
>  * add more efficient implementations of vcat,hsep,hcat
> 
>  More specifically:
> 
>  (1) Bugfix fillNB: Additional case for fillNB Empty (Empty : ys)
> 
>  (2) Bugfix [f](catsep): do not allow overlapping ($$) in vertical
>  composition
> 
>  (3) Lazy implementations of vcat,hcat and hsep
> 
>  (4) Law <t2> isn't always true
> 
>  (5) Invariant 5 should be made stronger
> 
>  (6) Change the comment about negative indentation
> 
 next part 
New patches:
[fillNB bug, lazy vcat
benedikt.huber at gmail.com**20080624113715] {
hunk ./Text/PrettyPrint/HughesPJ.hs 410
+
+** because of law n6, t2 only holds if x doesn't
+** start with `nest'.
+
hunk ./Text/PrettyPrint/HughesPJ.hs 429
<m1> (text s <> x) $$ y = text s <> ((text "" <> x)) $$
 nest (length s) y)
+<m1> (text s <> x) $$ y = text s <> ((text "" <> x) $$
+ nest (length s) y)
hunk ./Text/PrettyPrint/HughesPJ.hs 490
+ lazy list versions
+hcat = reduceAB . foldr (beside_' False) empty
+hsep = reduceAB . foldr (beside_' True) empty
+vcat = reduceAB . foldr (above_' True) empty
hunk ./Text/PrettyPrint/HughesPJ.hs 495
hcat = foldr (<>) empty
hsep = foldr (<+>) empty
vcat = foldr ($$) empty
+beside_' :: Bool > Doc > Doc > Doc
+beside_' _ p Empty = p
+beside_' g p q = Beside p g q
+
+above_' :: Bool > Doc > Doc > Doc
+above_' _ p Empty = p
+above_' g p q = Above p g q
+
+reduceAB :: Doc > Doc
+reduceAB (Above Empty _ q) = q
+reduceAB (Beside Empty _ q) = q
+reduceAB doc = doc
hunk ./Text/PrettyPrint/HughesPJ.hs 556
 * The arugment of @TextBeside@ is never @Nest at .
+ * The argument of @TextBeside@ is never @Nest at .
hunk ./Text/PrettyPrint/HughesPJ.hs 564
 * The right argument of a union cannot be equivalent to the empty set
 (@NoDoc@). If the left argument of a union is equivalent to the
 empty set (@NoDoc@), then the @NoDoc@ appears in the first line.
+ * A @NoDoc@ may only appear on the first line of the left argument of an
+ union. Therefore, the right argument of an union can never be equivalent
+ to the empty set (@NoDoc@).
hunk ./Text/PrettyPrint/HughesPJ.hs 578
  Arg of a NilAbove is always an RDoc
nilAbove_ :: Doc > Doc
+ Invariant: Args to the 4 functions below are always RDocs
+nilAbove_ :: RDoc > RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 583
textBeside_ :: TextDetails > Int > Doc > Doc
+textBeside_ :: TextDetails > Int > RDoc > RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 586
  Arg of Nest is always an RDoc
nest_ :: Int > Doc > Doc
+nest_ :: Int > RDoc > RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 589
  Args of union are always RDocs
union_ :: Doc > Doc > Doc
+union_ :: RDoc > RDoc > RDoc
hunk ./Text/PrettyPrint/HughesPJ.hs 765
 nilAboveNest False k (reduceDoc (vcat ys))
+ nilAboveNest True k (reduceDoc (vcat ys))
hunk ./Text/PrettyPrint/HughesPJ.hs 810
fillNB g Empty k (y:ys) = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
+fillNB g Empty k (Empty:ys) = fillNB g Empty k ys
+fillNB g Empty k (y:ys) = fillNBE g k y ys
+fillNB g p k ys = fill1 g p k ys
+
+fillNBE g k y ys = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
hunk ./Text/PrettyPrint/HughesPJ.hs 816
 nilAboveNest False k (fill g (y:ys))
+ nilAboveNest True k (fill g (y:ys))
hunk ./Text/PrettyPrint/HughesPJ.hs 821
fillNB g p k ys = fill1 g p k ys


hunk ./Text/PrettyPrint/HughesPJ.hs 1039
 It should never be called with 'n' < 0, but that can happen for reasons I don't understand
 Here's a test case:
 ncat x y = nest 4 $ cat [ x, y ]
 d1 = foldl1 ncat $ take 50 $ repeat $ char 'a'
 d2 = parens $ sep [ d1, text "+" , d1 ]
 main = print d2
 I don't feel motivated enough to find the Real Bug, so meanwhile we just test for n<=0
+ returns the empty string on negative argument.
+
hunk ./Text/PrettyPrint/HughesPJ.hs 1045
{ Comments from Johannes Waldmann about what the problem might be:
+{
+Concerning negative indentation:
+If we compose a <> b, and the first line of b is deeply nested, but other lines of b are not,
+then, because <> eats the nest, the pretty printer will try to layout some of b's lines with
+negative indentation:
hunk ./Text/PrettyPrint/HughesPJ.hs 1051
 In the example above, d2 and d1 are deeply nested, but `text "+"' is not,
 so the layout function tries to "outdent" it.

 when I look at the Doc values that are generated, there are lots of
 Nest constructors with negative arguments. see this sample output of
 d1 (obtained with hugs, :s u)

 tBeside (TextDetails_Chr 'a') 1 Doc_Empty) (Doc_NilAbove (Doc_Nest
 (241) (Doc_TextBeside (TextDetails_Chr 'a') 1 Doc_Empty)))))
 (Doc_NilAbove (Doc_Nest (236) (Doc_TextBeside (TextDetails_Chr 'a') 1
 (Doc_NilAbove (Doc_Nest (5) (Doc_TextBeside (TextDetails_Chr 'a') 1
 Doc_Empty)))))))) (Doc_NilAbove (Doc_Nest (231) (Doc_TextBeside
 (TextDetails_Chr 'a') 1 (Doc_NilAbove (Doc_Nest (5) (Doc_TextBeside
 (TextDetails_Chr 'a') 1 (Doc_NilAbove (Doc_Nest (5) (Doc_TextBeside
 (TextDetails_Chr 'a') 1 Doc_Empty))))))))))) (Doc_NilAbove (Doc_Nest
+doc 0123345
+
+d1 a
+d2  b
+ c
+d1<>d2 ab
+ c
}
[fillNB: fix bug #1337
benedikt.huber at gmail.com**20080625114659] {
hunk ./Text/PrettyPrint/HughesPJ.hs 814
fillNBE g k y ys = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys)
+fillNBE g k y ys = nilBeside g (fill1 g (elideNest . oneLiner . reduceDoc $ y) k1 ys)
hunk ./Text/PrettyPrint/HughesPJ.hs 820

+elideNest (Nest k d) = d
+elideNest d = d
}
Context:
[Fix warnings
Ian Lynagh <igloo at earth.li>**20080620135156]
[TAG 20080528
Ian Lynagh <igloo at earth.li>**20080528004408]
Patch bundle hash:
1c1d36f58462f5c86700d37fb4d7729461aea35c
More information about the Libraries
mailing list