patch: pretty printing for Cabal

Mon Oct 11 18:30:23 EDT 2010

I did send this mail a week ago or so to the haskell-libraries mailing
list, but this was probably the wrong list.

I have in the meantime tested with all cabbal files on haddock, and have
fixed some issues.

This patch adds a pretty printer for GenericPackageDescription. It fixes
ticket http://hackage.haskell.org/trac/hackage/ticket/647.

I added a new module Distribution.PackageDescription.PrettyPrint which
exports writeGenericPackageDescription and
showGenericPackageDescription.
I use the field printers from Parse.hs, so that I had to add export
statements for them in Distribution.PackageDescription.Parse.
I decided to not use a class based approach because: 1. their is already
the Text class, which has a parse I didn't want to write, 
and 2. and more important, I didn't want to use -XFlexibleInstances
because I guess cabal should be portable.

I have not removed the old printer code, although this may be a good
idea (maybe mark it as deprecated first). Then
it would make sense as well to rename writeGenericPackageDescription to
writePackageDescription and 
showGenericPackageDescription to showPackageDescription, to be
symmetric.

I tested with the  code from Appendix 1 and all the cabal files on
hackage. The test succeds when writing and
reading a package description results in an equal description. This is
more rigorous then required, cause 
different GenericPackageDescr.. can have the same semantic, so for
example the order of exe definitions is not relevant.
But for our needs (GUI editor of cabal files for leksah), e.g.
preserving of order is important and it will  help to test the 
parser and printer to make this approach work.
Furthermore I want to print out the cabal file as short as possible,
e.g. not printing all empty fields like the old printer did.

Testing the code I found the following problems:

1. Found up an obvious "age old" bug:
hunk ./Distribution/PackageDescription/Parse.hs 359
-           ghcProfOptions        (\val binfo ->
binfo{ghcSharedOptions=val})
+           ghcSharedOptions        (\val binfo ->
binfo{ghcSharedOptions=val})

2. The order of elements are reversed by the parser. E.g executables: 
            (repos, flags, lib, exes, tests) <- getBody
            return (repos, flags, lib, exes ++ [(exename, flds)], tests)
since the rest is parsed in the line before, the element needs to be
prepended. so I changed to:
            (repos, flags, lib, exes, tests) <- getBody
            return (repos, flags, lib, (exename, flds): exes, tests)
The same applies to test fields.

The opposite order problem applies to storeXFields.., which is called in
the order of processing, 
so I changed it to 
storeXFieldsPD (f@('x':'-':_),val) pkg = Just pkg{ customFieldsPD = 
                                                        (customFieldsPD
pkg) ++ [(f,val)]}

3. The next problem is that we print nothing for empty fields , which
works ok, except for 
compiler opts, where empty opts differ from not given opts, so I added
this line to optsField
in ParseUtils, which causes empty opts to be parsed, as if no opts are
given:
        update f opts [] = [(f,opts)]

4. The next problem is that we loose final newlines with showFreeText.
Astonishing enough the culpid is lines 
from the List module , which evals lines "a\n" to ["a"].  By the way
even the claim that "'unlines' is an inverse 
operation to 'lines'" from the List module  is not true, as unlines
(lines "a") -> "a\n" (see Appendix 2). So I
use my own unlines' now!.

5. This is connected to the CondTree structure. When I write a Cabl file
like:
...
library
    extensions:         ForeignFunctionInterface
    if flag(use_xft)
        build-depends: X11-xft >= 0.2, utf8-string
        extensions: ForeignFunctionInterface
...        
the extension ForeignFunctionInterface in the conditional branch is
superflous, as it is specified in any case,
so the printer can simplify it to:      
...
library
    extensions:         ForeignFunctionInterface
    if flag(use_xft)
        build-depends: X11-xft >= 0.2, utf8-string
...        
However the parse result for the two cabal files will differ. Now I gave
up here, and  added a variable 
simplifiedPrinting which disables the printer smartness for now.
However, I hope someone can make 
the parser smarter in this respect. 

6. If free text starts with a line with a dot, the dot must be in a new
line.

7. Version tags have to be printed, which was not the case before
(why?).

8. VersionRangeParens needs to be printed.

9. Compiler options have to be put in a canonical order after parsing,
which is easy as CompilerFlavors do derive Ord. 

Kind regards
Jürgen Nicklisch-Franken

PS. I have added a remark about the new test suite syntax below. 

-------------------
Appendix1: Test code

main = do
    allCabalFiles <- ...
    res <- mapM test allCabalFiles
    if and res
        then putStrLn "all test succeeded"
        else putStrLn $ "tests failed: " ++ show (length (filter (==
False) res)) ++
                        " of: " ++ show (length res)

test :: FilePath -> IO Bool
test cabalFilePath = catch (do
    gpd <- readPackageDescription silent cabalFilePath
    writeGenericPackageDescription tempPath  gpd
    gpd2 <- readPackageDescription silent tempPath
    if gpd == gpd2
        then do
            putStr "."
            return True
        else do
            putStrLn $ "failed for " ++ cabalFilePath
            return False)
            (\e -> do
                putStrLn $ "runtime error " ++ show e ++ " for " ++
cabalFilePath
                return False)

------------------
Appendix2: lines and unlines revised:

-- | 'unlines' is an inverse operation to 'lines'.
-- It joins lines, after appending a terminating newline to each.
unlines'             :: [String] -> String
unlines' []          = []
unlines' [x]         = x
unlines' (x:xs)      = x ++ ('\n' : unlines' xs)

-- | 'lines' breaks a string up into a list of strings at newline
-- characters.  The resulting strings do not contain newlines.
lines'                   :: String -> [String]
lines' []                =  [""]
lines' s                 =  let (l, s') = break (== '\n') s
                            in  l : case s' of
                                        []    -> []
                                        (_:s'') -> lines' s''

----------------
Appendix 3:
 I find the test suite syntax problematic, for an exe I have to remeber
the combination:
test-suite foo
    type: exitcode-stdio-1.0
    main-is: main.hs
, and for a module:
test-suite bar
    type: library-1.0
    test-module: Bar
So its the combination of exitcode-stdio- and main-is or library- and
test-module which does the trick.
Wouldn't it be easier to write something like:    
test-suite foo
    test-version: 1.0
    main-is: main.hs
, and for a module:
test-suite bar
    test-version: 1.0
    test-module: Bar

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pp2.patch
Type: text/x-patch
Size: 109611 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/cabal-devel/attachments/20101011/d6444a64/pp2-0001.bin