[Haskell-cafe] Using ShowS or Difference Lists
Holger Siegel
holgersiegel74 at yahoo.de
Sun Feb 7 11:06:37 EST 2010
Am Samstag, den 06.02.2010, 10:28 -0800 schrieb Ryan Ingram:
> As other people have mentioned, you are duplicating library
> functionality. But nobody has actually talked about the performance
> characteristics of your code.
>
> Fortunately for you, the calls to (++) in your recursion are
> right-associative, so you don't have an asymptotic problem where your
> program gets slower and slower for large inputs; it should stay
> linear.
>
> But you are wasting some work. In particular, (concat (intersperse "
> " l)) produces a String, and then (++) duplicates all of the cons
> cells in that string as it rebuilds it so that the tail connects with
> the next string.
By applying rewrite rules from the standard libraries, GHC figures out
that (concat xs ++ ys) == (foldr (++) ys xs) for every xs, ys. So, the
rule
joinLines (l:ls) = (concat (intersperse " " l)) ++ ('\n':joinLines ls)
becomes
joinLines (l:ls) = foldr (++) ('\n':joinLines ls) (intersperse " " l)
However, the problem remains the same: For every line you have to
construct a list (intersperse " " l) that is discarded immediately. But
if you implement function intersperse as
intersperse _ [] = []
intersperse sep (h:xs) = h : concatMap (\x -> [sep,x]) xs
then you can derive the following implementation by applying the rules
for list fusion:
joinLines :: [[String]] -> String
joinLines [[]] = ""
joinLines [h:xs] = h ++ foldr (\x y -> ' ': x ++ y) "" xs
joinLines ([]:ls) = '\n':jl ls
joinLines ((h:xs):ls)
= h ++ foldr (\x y -> ' ': x ++ y) ('\n' : joinLines ls) xs
This version takes 10 seconds where the original version as well as the
idiomatic (unlines . map unwords) takes 15 seconds. Unfortunately, you
have to do that optimisation by hand; GHC doesn't figure it out by
itself.
>
> So there is a potential benefit to using a difference list, albeit
> only by around a 2x factor.
>
> -- ryan
>
> On Sat, Feb 6, 2010 at 4:42 AM, Mark Spezzano
> <mark.spezzano at chariot.net.au> wrote:
> > Hi,
> >
> > Just wondering whether I can use ShowS or tupling or Difference Lists to speed up the following code?
> >
> > It's basic text processing. It takes in a list of Lines where each Line is a list of Words and intersperses " " between them then concatenates them into a longer String. Note that there is a recursive call and the ++ operator.
> >
> > Thanks
> >
> > Mark
> >
> >
> > -- Function: joinLines
> > -- Joins the Words within Lines together with whitespace and newline characters
> > -- Argument: Lines to pad with whitespace and newlines
> > -- Evaluate: The processed and concatenated String
> > joinLines :: [Line] -> String
> > joinLines (l:[]) = concat (intersperse " " l)
> > joinLines (l:ls) = (concat (intersperse " " l)) ++ ('\n':joinLines ls)
More information about the Haskell-Cafe
mailing list