[Haskell-cafe] Request for code review: slice-1.0.0, TestSupport
wren ng thornton
wren at freegeek.org
Tue Nov 25 00:02:55 EST 2008
Owen Smith wrote:
> Hi all,
>
> Thanks for your welcome and helpful comments. I've banged out a first
> attempt at a Haskell library and was curious if anybody would have
> time or interest in looking it over it for style, design, stuff that's
> just wrong, or (most likely) stuff that's been done better elsewhere.
> I'm willing to trade some labor in kind, if there's any odd jobs that
> can be done by a relative newbie (e.g. maybe setting up a Windows box
> for OpenGL build testing with GHC 6.10??).
>
> The library, slice, is an emulation of Python's list slicing mechanism:
>
> http://www.owendsmith.com/slice-1.0.0.tar.gz
>
> It uses a triple of Maybes to specify the range, so it's unwieldly in
> terms of syntax, but the functionality should be there.
This begs the question, why not use Haskell's enumeration syntax
directly? That is, is there a specific reason you're using SliceParams
instead of taking a list of indices as input? If you do the latter then
Haskell already provides the [1,3..42] syntactic sugar (albeit, the
second argument is the second element of the list with the step size
inferred, rather than giving the step size explicitly).
Upsides of lists of indices:
* You can do Perl-style slicing where you can take the elements in
whatever order you like, not just regular stepping orders. This is a
two-edged sword, but it does make things much more concise than needing
to manually stitch together a bunch of regular-step slices.
* You can allow a step size of 0 in order to get an infinite list of the
element at that index. (Of course, you can do this with the current
implementation by just allowing it.)
* You can piggyback on Haskell's syntactic sugar for regular step sequences.
* Negative indices are easy: let l = length xs in xs `slice` map (\i ->
if i < 0 then l+i else i) indices -- with some extra polish for
boundary conditions.
* It can potentially be generalized to operate over any indexable
collection. (Maybe more tricky than it's worth, if you want to guarantee
efficiency.)
Downsides:
* Handling Perlish slicing is trickier since you may need to jump
arbitrarily far back into the part of the list you've already seen. It's
especially tricky if you don't want to hold onto the whole list when you
don't need it.
* Constructing a list of indices from the [x,y..z] expression adds
memory overhead if you don't do list fusion. (FWIW, using list fusion
isn't too hard, heck you've basically rewritten a specialized case of it.)
Implementation-wise, you should use Data.List.Extras.LazyLength[1] for
dealing with those guards. When you just want to guarantee a list's
length is above or below some bound, getting the actual full length
always wastes time and generally wastes space as well (by not being
least-strict).
Similarly, if you do in fact need to get the full length of a list, you
should bind it to a local variable so you needn't calculate it over and
over. Also you can use case expressions (with a function like `compare`)
to replace many of the guards more efficiently.
In the same vein, the `slices` function can be made much more efficient
if you actually do a round-robin, instead of traversing the list
repeatedly for each slice. Round-robining is much easier than doing
slicing right anyways.
Finally, you may want to use Data.DList[2] in sliceRec, instead of
constructing the list backwards and then reversing it. Difference lists
have a constant-time append function, so you can eliminate the reversal
thus saving O(m) time where m is the length of the list returned.
[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/list-extras
[2] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/dlist
> Mostly,
> besides gaining experience with writing a library, I wanted to have
> functionality that skips through a list the way Python slices can, and
> didn't see a particularly clean way of doing that particular kind of
> list dicing in Data.List or elsewhere.
A naive implementation of the arbitrary-indexing Perlish slicer:
> xs `slice` indices = map (xs !!) indices
Of course, this implementation is quite slow since it traverses the list
from the beginning for each index. A smarter version would use something
like a list zipper[3] so that the traversal time is bounded by the step
size. In fact, difference lists can be thought of as a special case of
zippers for lists.[4]
[3] http://en.wikibooks.org/wiki/Haskell/Zippers
[4] Where "special" means you only ever expand it down, and never move
backwards. Difference lists can be generalized to other datastructures
in the same way zippers can, though it seems much less common to do so.
--
Live well,
~wren
More information about the Haskell-Cafe
mailing list