Speeding up Data.List.inits

David Feuer david.feuer at gmail.com
Sat Jul 19 19:12:40 UTC 2014


In fact, since the queues are only used in a very restricted way, we can
make them even smaller. Something like this (completely untested) code:

data Queue a = Queue {-# UNPACK #-}!Int [a] [a]

empty = Queue 1 [] []

queue sp1 f r
  | popCount sp1 /= 1 = Queue sp1 f r
  | otherwise.               = Queue sp1 (f ++ reverse r) []
(Queue sp1 f r) |> x = queue (sp1 + 1) f (x:r)
toListQ (Queue _ f r) = f ++ reverse r
On Jul 19, 2014 1:33 PM, "David Feuer" <david.feuer at gmail.com> wrote:

> I did a little more benchmarking, and it turns out that the overhead
> to use Okasaki-style persistently efficient banker's queues is not
> terribly severe because these queues are simple and therefore fast
> (it's around 1.5 times as slow at its very worst, but that may have
> been a fluke--the performance generally seems comparable). It does
> indeed give far, far better performance when looking at the heads of
> many of the lists at the end of the result, as Edward Kmett predicted.
> I don't know if these queues are available in a standard library, but
> the following is a full implementation of initsQ, including what it
> needs of the queue implementation:
>
> data Queue a = Queue
>   {front :: [a],
>    lenF :: Int,
>    rear :: [a],
>    lenR ::Int} deriving (Show)
>
> empty = Queue {front=[],lenF=0,rear=[],lenR=0}
>
> queue f lF r lR
>   | lR <= lF = Queue f lF r lR
>   | otherwise    = Queue {front = f ++ reverse r,
>                           lenF = lF + lR,
>                           rear = [],
>                           lenR = 0}
>
> (Queue f lF r lR) |> x = queue f lF (x:r) (lR+1)
>
> toListQ (Queue f _ r _) = f ++ reverse r
>
> initsQ xs = map toListQ (scanl (|>) empty xs)
>
>
>
> On Sat, Jul 19, 2014 at 5:25 AM, Edward Kmett <ekmett at gmail.com> wrote:
> >
> > The half-baked thought was that reaching the kth list is O(k), and
> reversing _each list_ is O(k), but with a smarter structure you might share
> some of the work of construction across multiple lists. e.g. if you had an
> inits like function that returned a list of queues rather than a list of
> lists you could actually do less work overall setting up the inits
> asymptotically, sharing the work. If you only take a constant prefix off
> each queue then this can change the overhead from O(n^2) to = O(kn).
> >
> > A reversal or difference list requires you to pay separately for each
> list.
> >
> > I we are in "violent agreement", however, as I indicated in my comment
> that the constant factors almost definitely make it a non-starter for
> something like Data.List. ;)
> >
> > -Edward
> >
> >
> > On Sat, Jul 19, 2014 at 4:30 AM, David Feuer <david.feuer at gmail.com>
> wrote:
> >>
> >> Edward Kmett, reaching the kth list is O(k). Reversing it is also O(k).
> I don't know that anything as fancy as something by Okasaki could improve
> matters—such data structures tend to have large constant factors. Something
> much worse than that seems to be wrong with the Data.List version at the
> moment.
> >>
> >>
> >> On Sat, Jul 19, 2014 at 4:25 AM, Edward Kmett <ekmett at gmail.com> wrote:
> >>>
> >>> Good catch, the common inits and tails version of subsequences makes
> pretty heavy use of that, as do many of the uses of inits where you use it
> to get k-element substrings.
> >>>
> >>> I'd be curious if an okasaki-style catenable output restricted deque
> could recover both efficient head access and reasonably efficient appends,
> but the overhead of that is probably way out of line with the performance
> at stake!
> >>>
> >>> -Edward
> >>>
> >>>
> >>> On Sat, Jul 19, 2014 at 4:14 AM, Henning Thielemann <
> schlepptop at henning-thielemann.de> wrote:
> >>>>
> >>>> Am 19.07.2014 09:51, schrieb David Feuer:
> >>>>
> >>>>
> >>>>> Background: I was looking at the code for Data.List.inits in
> >>>>> base-4.7.0.0 (function renamed for clarity):
> >>>>>
> >>>>> initsDL                   :: [a] -> [[a]]
> >>>>> initsDL xs                =  [] : case xs of
> >>>>>                                    []      -> []
> >>>>>                                    x : xs' -> map (x :) (initsDL xs')
> >>>>>
> >>>>> The recursive maps made me suspicious. I decided to try writing a
> >>>>> different version:
> >>>>>
> >>>>> initsHO xs = map ($ []) (scanl q id xs)
> >>>>>    where
> >>>>>      q f x = f . (x:)
> >>>>>
> >>>>> I mentioned this on #haskell and noted that it would be a nice
> exercise
> >>>>> to write it with less fancy function footwork using map reverse.
> rasfar
> >>>>> responded (modulo naming) with
> >>>>>
> >>>>> initsR xs = map reverse (scanl q [] xs)
> >>>>>    where
> >>>>>      q acc x = x : acc
> >>>>
> >>>>
> >>>>
> >>>> The 'map' gives efficient access to the first elements of the
> sub-lists, thus it seems that initsDL is faster than initsR when you access
> only prefixes of the sub-lists. E.g.
> >>>>
> >>>> Prelude> head $ last $ inits [0..10000000::Int]
> >>>>
> >>>> _______________________________________________
> >>>> Libraries mailing list
> >>>> Libraries at haskell.org
> >>>> http://www.haskell.org/mailman/listinfo/libraries
> >>>
> >>>
> >>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140719/d09d3d92/attachment.html>


More information about the Libraries mailing list