# [Haskell-cafe] Why isn't there a cheaper "split-in-two" operation for Data.Set?

Ryan Newton newton at mit.edu
Wed Sep 22 13:35:11 EDT 2010

```Would there be anything wrong with a Data.Set simply chopping off half its
(balanced) tree and returning two approximately balanced partitions -- i.e.
Data.Set.split without the pivot argument?  Even though it can't quite be
constant-time (still need rebalancing) it could still be cheaper
than Data.Set.split, right?  In the context of Set.hs, I'm thinking along
the lines of something like the following:

cleave :: Set a -> (Set a, Set a)
cleave Tip = (Tip, Tip)
cleave (Bin _ x l r)
| size l > size r = (l, insertMin x r)
| otherwise       = (insertMax x l, r)

For one thing, this seems like the proposed function would be useful for
consuming a set in parallel, similar to Guy Steele's arguments in recent
talks (see Organizing Functional Code for Parallel Execution; or, foldl and
foldr Considered Slightly
Harmful<http://research.sun.com/projects/plrg/Publications/ICFPAugust2009Steele.pdf>).
That is, one splits a tree and uses "par" to process the left and right
half in parallel, enabling

Alternatively, it would be simple to provide a parallel version of
mapMonotonic that does this internally, i.e.:

mapMonotonic _ Tip = Tip
mapMonotonic f (Bin sz x l r) =
Bin sz (f x) (mapMonotonic f l) (mapMonotonic f r)

Becomes something like:

parMapMonotonic f (Bin sz x l r) =
par left \$ seq right \$
Bin sz (f x) left right
where
left  = mapMonotonic f l
right = mapMonotonic f r

Cheers,
-Ryan
-------------- next part --------------
An HTML attachment was scrubbed...