[Haskell-cafe] Using ShowS or Difference Lists

Holger Siegel holgersiegel74 at yahoo.de
Sun Feb 7 11:06:37 EST 2010


Am Samstag, den 06.02.2010, 10:28 -0800 schrieb Ryan Ingram:
> As other people have mentioned, you are duplicating library
> functionality.  But nobody has actually talked about the performance
> characteristics of your code.
> 
> Fortunately for you, the calls to (++) in your recursion are
> right-associative, so you don't have an asymptotic problem where your
> program gets slower and slower for large inputs; it should stay
> linear.
> 
> But you are wasting some work.  In particular, (concat (intersperse "
> " l)) produces a String, and then (++) duplicates all of the cons
> cells in that string as it rebuilds it so that the tail connects with
> the next string.

By applying rewrite rules from the standard libraries, GHC figures out
that (concat xs ++ ys) == (foldr (++) ys xs) for every xs, ys. So, the
rule

joinLines (l:ls) = (concat (intersperse " " l)) ++ ('\n':joinLines ls)

becomes

joinLines (l:ls) = foldr (++) ('\n':joinLines ls) (intersperse " " l)

However, the problem remains the same: For every line you have to
construct a list (intersperse " " l) that is discarded immediately. But
if you implement function intersperse as

intersperse _   []      = []
intersperse sep (h:xs)  = h : concatMap (\x -> [sep,x]) xs

then you can derive the following implementation by applying the rules
for list fusion:

joinLines :: [[String]] -> String 
joinLines [[]]        = ""
joinLines [h:xs]      = h ++ foldr (\x y -> ' ': x ++ y) "" xs
joinLines ([]:ls)     = '\n':jl ls
joinLines ((h:xs):ls)
  = h ++ foldr (\x y -> ' ': x ++ y)  ('\n' : joinLines ls) xs

This version takes 10 seconds where the original version as well as the
idiomatic (unlines . map unwords) takes 15 seconds. Unfortunately, you
have to do that optimisation by hand; GHC doesn't figure it out by
itself.

> 
> So there is a potential benefit to using a difference list, albeit
> only by around a 2x factor.
> 
>   -- ryan
> 
> On Sat, Feb 6, 2010 at 4:42 AM, Mark Spezzano
> <mark.spezzano at chariot.net.au> wrote:
> > Hi,
> >
> > Just wondering whether I can use ShowS or tupling or Difference Lists to speed up the following code?
> >
> > It's basic text processing. It takes in a list of Lines where each Line is a list of Words and intersperses " " between them then concatenates them into a longer String. Note that there is a recursive call and the ++ operator.
> >
> > Thanks
> >
> > Mark
> >
> >
> > -- Function: joinLines
> > -- Joins the Words within Lines together with whitespace and newline characters
> > -- Argument: Lines to pad with whitespace and newlines
> > -- Evaluate: The processed and concatenated String
> > joinLines :: [Line] -> String
> > joinLines (l:[]) = concat (intersperse " " l)
> > joinLines (l:ls) = (concat (intersperse " " l)) ++ ('\n':joinLines ls)




More information about the Haskell-Cafe mailing list