[Haskell-cafe] Proposal: Partitionable goes somewhere + containers instances

Nicolas Trangez nicolas at incubaid.com
Sun Sep 29 09:07:33 UTC 2013


I'd think

partition :: t -> Either t (t, t)

might be more suited then...

Nicolas
On Sep 29, 2013 1:21 AM, "Ryan Newton" <rrnewton at gmail.com> wrote:

> <subject change>
>
> On Sun, Sep 29, 2013 at 3:31 AM, Mike Izbicki <mike at izbicki.me> wrote:
>
>> I've got a Partitionable class that I've been using for this purpose:
>>
>> https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/ConstraintKinds/Partitionable.hs
>>
>
> Mike -- Neat, that's a cool library!
>
> Edward --  ideally, where in the standard libraries should the
> Partitionable comonoid go?
>
> Btw, I'm not sure what the ideal return type for comappend is, given that
> it needs to be able to "bottom out".  Mike, our partition function's list
> return type seems more reasonable.  Or maybe something simple would be this:
>
> *class Partitionable t where*
> *  partition :: t -> Maybe (t,t)*
>
> That is, at some point its not worth splitting and returns Nothing, and
> you'd better be able to deal with the 't' directly.
>
> So what I really want is for the *containers package to please get some
> kind of Partitionable instances! * Johan & others, I would be happy to
> provide a patch if the class can be agreed on. This is important because
> currently the balanced tree structure of Data.Set/Map is an *amazing and
> beneficial property* that is *not* exposed at all through the API.
>    For example, it would be great to have a parallel traverse_ for Maps
> and Sets in the Par monad.  The particular impetus is that our new and
> enhanced Par monad makes extensive use of Maps and Sets, both the pure,
> balanced ones, and lockfree/inplace ones based on concurrent skip lists:
>
>     http://www.cs.indiana.edu/~rrnewton/haddock/lvish/
>
> Alternatively, it would be ok if there were a "Data.Map.Internal" module
> that exposed the Bin/Tip, but I assume people would rather have a clean
> Partitionable instance...
>
> Best,
>   -Ryan
>
>
> On Sun, Sep 29, 2013 at 3:31 AM, Mike Izbicki <mike at izbicki.me> wrote:
>
>> I've got a Partitionable class that I've been using for this purpose:
>>
>>
>> https://github.com/mikeizbicki/ConstraintKinds/blob/master/src/Control/ConstraintKinds/Partitionable.hs
>>
>> The function called "parallel" in the HLearn library will automatically
>> parallelize any homomorphism from a Partionable to a Monoid.  I
>> specifically use that to parallelize machine learning algorithms.
>>
>> I have two thoughts for better abstractions:
>>
>> 1)  This Partitionable class is essentially a comonoid.  By reversing the
>> arrows of mappend, we get:
>>
>> comappend :: a -> (a,a)
>>
>> By itself, this works well if the number of processors you have is a
>> power of two, but it needs some more fanciness to get things balanced
>> properly for other numbers of processors.  I bet there's another algebraic
>> structure that would capture these other cases, but I'm not sure what it is.
>>
>> 2) I'm working with parallelizing tree structures right now (kd-trees,
>> cover trees, oct-trees, etc.).  The real problem is not splitting the
>> number of data points equally (this is easy), but splitting the amount of
>> work equally.  Some points take longer to process than others, and this
>> cannot be determined in advance.  Therefore, an equal split of the data
>> points can result in one processor getting 25% of the work load, and the
>> second processor getting 75%.  Some sort of lazy Partitionable class that
>> was aware of processor loads and didn't split data points until they were
>> needed would be ideal for this scenario.
>>
>> On Sat, Sep 28, 2013 at 6:46 PM, adam vogt <vogt.adam at gmail.com> wrote:
>>
>>> On Sat, Sep 28, 2013 at 1:09 PM, Ryan Newton <rrnewton at gmail.com> wrote:
>>> > Hi all,
>>> >
>>> > We all know and love Data.Foldable and are familiar with left folds and
>>> > right folds.  But what you want in a parallel program is a balanced
>>> fold
>>> > over a tree.  Fortunately, many of our datatypes (Sets, Maps) actually
>>> ARE
>>> > balanced trees.  Hmm, but how do we expose that?
>>>
>>> Hi Ryan,
>>>
>>> At least for Data.Map, the Foldable instance seems to have a
>>> reasonably balanced fold called fold (or foldMap):
>>>
>>> >  fold t = go t
>>> >    where   go (Bin _ _ v l r) = go l `mappend` (v `mappend` go r)
>>>
>>> This doesn't seem to be guaranteed though. For example ghc's derived
>>> instance writes the foldr only, so fold would be right-associated for
>>> a:
>>>
>>> > data T a = B (T a) (T a) | L a deriving (Foldable)
>>>
>>> Regards,
>>> Adam
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe at haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>>
>>
>>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20130929/67d58475/attachment.html>


More information about the Libraries mailing list