[Haskell-cafe] Request for code review: slice-1.0.0, TestSupport

Tue Nov 25 00:02:55 EST 2008

Owen Smith wrote:
> Hi all,
> 
> Thanks for your welcome and helpful comments. I've banged out a first
> attempt at a Haskell library and was curious if anybody would have
> time or interest in looking it over it for style, design, stuff that's
> just wrong, or (most likely) stuff that's been done better elsewhere.
> I'm willing to trade some labor in kind, if there's any odd jobs that
> can be done by a relative newbie (e.g. maybe setting up a Windows box
> for OpenGL build testing with GHC 6.10??).
> 
> The library, slice, is an emulation of Python's list slicing mechanism:
> 
> http://www.owendsmith.com/slice-1.0.0.tar.gz
> 
> It uses a triple of Maybes to specify the range, so it's unwieldly in
> terms of syntax, but the functionality should be there.

This begs the question, why not use Haskell's enumeration syntax 
directly? That is, is there a specific reason you're using SliceParams 
instead of taking a list of indices as input? If you do the latter then 
Haskell already provides the [1,3..42] syntactic sugar (albeit, the 
second argument is the second element of the list with the step size 
inferred, rather than giving the step size explicitly).

Upsides of lists of indices:
* You can do Perl-style slicing where you can take the elements in 
whatever order you like, not just regular stepping orders. This is a 
two-edged sword, but it does make things much more concise than needing 
to manually stitch together a bunch of regular-step slices.

* You can allow a step size of 0 in order to get an infinite list of the 
element at that index. (Of course, you can do this with the current 
implementation by just allowing it.)

* You can piggyback on Haskell's syntactic sugar for regular step sequences.

* Negative indices are easy: let l = length xs in xs `slice` map (\i -> 
if i < 0 then l+i else i) indices  -- with some extra polish for 
boundary conditions.

* It can potentially be generalized to operate over any indexable 
collection. (Maybe more tricky than it's worth, if you want to guarantee 
efficiency.)

Downsides:
* Handling Perlish slicing is trickier since you may need to jump 
arbitrarily far back into the part of the list you've already seen. It's 
especially tricky if you don't want to hold onto the whole list when you 
don't need it.

* Constructing a list of indices from the [x,y..z] expression adds 
memory overhead if you don't do list fusion. (FWIW, using list fusion 
isn't too hard, heck you've basically rewritten a specialized case of it.)

Implementation-wise, you should use Data.List.Extras.LazyLength[1] for 
dealing with those guards. When you just want to guarantee a list's 
length is above or below some bound, getting the actual full length 
always wastes time and generally wastes space as well (by not being 
least-strict).

Similarly, if you do in fact need to get the full length of a list, you 
should bind it to a local variable so you needn't calculate it over and 
over. Also you can use case expressions (with a function like `compare`) 
to replace many of the guards more efficiently.

In the same vein, the `slices` function can be made much more efficient 
if you actually do a round-robin, instead of traversing the list 
repeatedly for each slice. Round-robining is much easier than doing 
slicing right anyways.

Finally, you may want to use Data.DList[2] in sliceRec, instead of 
constructing the list backwards and then reversing it. Difference lists 
have a constant-time append function, so you can eliminate the reversal 
thus saving O(m) time where m is the length of the list returned.

[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/list-extras

[2] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/dlist

> Mostly,
> besides gaining experience with writing a library, I wanted to have
> functionality that skips through a list the way Python slices can, and
> didn't see a particularly clean way of doing that particular kind of
> list dicing in Data.List or elsewhere.

A naive implementation of the arbitrary-indexing Perlish slicer:

 > xs `slice` indices = map (xs !!) indices

Of course, this implementation is quite slow since it traverses the list 
from the beginning for each index. A smarter version would use something 
like a list zipper[3] so that the traversal time is bounded by the step 
size. In fact, difference lists can be thought of as a special case of 
zippers for lists.[4]

[3] http://en.wikibooks.org/wiki/Haskell/Zippers

[4] Where "special" means you only ever expand it down, and never move 
backwards. Difference lists can be generalized to other datastructures 
in the same way zippers can, though it seems much less common to do so.

-- 
Live well,
~wren