Proposal: adding 'tailDropWhile' to Data.List

wren ng thornton wren at freegeek.org
Fri Sep 30 00:10:17 CEST 2011


On 9/28/11 6:58 AM, Joachim Breitner wrote:
> also, it’s performance is not too bad, and definitely not as bad as
> reverse: If tailDropWhile is implemented in a way that allows list
> fusion (should be possible, I think), and I know that the suffix is not
> large (e.g. stripping at most trailing "\n"), then tailDropWhile should
> be ok to use.

Aoe's version uses foldr, so that'll give you a good consumer. The 
following version gives a good producer. The definition of rev is taken 
from GHC.List.reverse, but abstracting over cons and nil. Also we 
manually fuse away the (++) at the call site for rev, because it's 
trivial to do so.

     import GHC.Exts (build)

     tailDropWhile :: (a -> Bool) -> [a] -> [a]
     tailDropWhile p = \xs0 -> build (builder xs0)
         where
         builder xs0 cons nil = out xs0
             where
             out []        = nil
             out (x:xs)
               | p x       = inP [x] xs
               | otherwise = x `cons` out xs

             inP _ []      = nil
             inP ys (x:xs)
               | p x       = inP (x:ys) xs
               | otherwise = rev ys (x `cons` out xs)

             rev []     zs = zs
             rev (y:ys) zs = rev ys (y `cons` zs)

Providing both a good consumer and a good producer is trickier. If we 
could abstract over the `null` predicate alongside the cons and nil, 
then Aoe's version could be defined as a build over foldr:

     tailDropWhile p xs0 = build $ \cons nil ->
         let go x xs
                 | p x && null xs = nil
                 | otherwise      = x `cons` xs
         in foldr go nil xs0

Though I'm not sure whether that'd still be a good consumer, as we 
desire. It ought to be, but we'd want to verify that GHC's rewrite rules 
still fire on it appropriately.

A more transparent approach (which preserves laziness) would be to 
follow the route of stream-fusion: passing elements through one at a 
time, pausing and building up the buffer whenever the predicate holds, 
and flushing the buffer if the predicate fails before the end of list is 
reached.


-- 
Live well,
~wren



More information about the Libraries mailing list