Proposal #3271: new methods for Data.Sequence
Louis Wasserman
wasserman.louis at gmail.com
Thu Jul 16 19:48:33 EDT 2009
In response to Ross's useful suggestions (which I had not actually noticed
until recently), I have made considerable changes to my Data.Sequence patch
here<http://hackage.haskell.org/trac/ghc/attachment/ticket/3271/new-methods-for-data_sequence.3.dpatch>
.
Here are the relevant points:
- I completely rewrote the sorting method. The sort is unstable, but it
is 40 lines (much more maintainable than the several-hundred line
implementation from earlier) and runs *twice as fast as* (fromList .
Data.List.sort . toList) for n=50000. (For sorted lists, it clocks in at
about 4x faster.)
- tails and inits are considerably more sophisticated, running to about
50 lines apiece (although some of that code is shared between them), but
- This implementation is genuinely asymptotically faster than the
previous implementation: evaluating any tail from the sequence
returned by
tails takes O(log (min (i, n-i))), as opposed to O(n) in scanr (<|) empty
xs, but evaluating every tail from the sequence takes O(n) overall, as
opposed to O(n log n) in fromList [drop n xs | n <- [0..length xs]].
- Without direct access to the internals of the sequence it is
impossible to match the asymptotic performance of this implementation.
- This implementation is also a hair faster in practice.
- partition is now in fact implemented via a simple foldl', which is
actually faster than my old, several-dozen-line implementation as well as
obviously simpler.
- filter has been added to the list of methods available in
Data.Sequence.
- iterate has been renamed to iterateN to reinforce the different type,
which requires a finite bound on the sequence length.
- On the back end, replicate, iterateN, and sortBy do not use fromList,
but instead use a common framework that wraps construction of a sequence in
an Applicative functor. This automatically induces the node sharing that
gives replicate performance O(log n), but behaves exactly correctly for all
its other requirements.
- As a result, there is a faster alternative to fromList if the length of
the list is known. The name and type of this method seems like it might
become genuinely contentious, so I'm not inclined to expose that faster
method at the moment.
Louis Wasserman
wasserman.louis at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20090716/4c59c67a/attachment.html
More information about the Libraries
mailing list