Edison 1.2rc2

Fri Mar 3 22:38:26 EST 2006

I just noticed this email thread.  I've wanted to use some things in Edison for a while so I'm glad to see that it is still alive.

As far as the fold variations go, I can't see that it can possibly hurt anything to include more variations, especially if the work is already done.  I would like to see the documentation constructed so that the ones that are most likely to be "correct" (or desirable) are listed first.  That way I don't have to figure out exactly what the issues are with them in order to grab the "right" one.  Especially since I usually look at the more subtle performance issues after I've built a working program.

Seth

On Fri, 3 Mar 2006 21:34:39 -0500
Robert Dockins <robdockins at fastmail.fm> wrote:

> On Friday 03 March 2006 08:28 pm, you wrote:
> > Robert Dockins writes:
> > > The second release candidate for Edison 1.2 is now ready for
> > > comments.  The notable changes from RC1 are:
> > >
> > >    * add strict variants of all folds and reduces
> >
> > You know, I was looking through the code last night and thought, "what
> > about foldl'?" Funny.
> >
> > That being said, eight variants of fold* (plus six variants of reduce*)
> > feels excessive to me. I can understand the desire for generality, but
> > I'm not sure it's possible to decide between, say, foldr and foldr'
> > without knowing the sequence type.
> 
> Well, that's certainly an issue I considered when adding them to the API.
> 
> > As I understand it, with regular lists you almost always[*] want either
> > foldr or foldl', because foldl builds up a chain of unevaluated thunks
> > and foldr' has linear call stack growth. (When I was learning ML, we
> > were told to always prefer foldl over foldr for that reason.)
> >
> > With reverse lists (i.e., Rev []), the situation is reversed: foldr' is
> > tail-recursive, foldl is fully lazy, and foldr and foldl' have linear
> > growth.
> 
> I toyed for a good while with only defining some of the strict folds, or doing 
> some other halfway measure.  The thing that eventually made be decide to 
> throw in the whole kitchen sink (all the strict folds) was the realization 
> that the API is specified at a purely semantic level; at that level, the only 
> thing that matters about strict folds is that they have exactly the same 
> semantics as the non-strict fold given a combining function which is strict 
> in the correct argument(s) (otherwise they are guaranteed to be the same 
> except that the strict fold may diverge in cases where the non-strict fold 
> does not).  When it comes time to implement, the strict folds give you more 
> leway to force closures early BUT YOU DON'T HAVE TO!  For the regular list 
> example, I could define foldr' = foldr and it would be correct according to 
> the specification.  For now I have defined all the implementations to be as 
> strict as possible in all cases, but I can make them lazier later if it turns 
> out to have better space behavior; as you mention, foldr' on regular lists is 
> problematic and it will probably be the first thing to be reverted to a lazy 
> implementation.
> 
> > For the other sequences, there may not be clear winners. With
> > SimpleQueue, for example, the choice between foldr and foldr' might
> > depend on how the queue is arranged.
> >
> > Maybe it would be better to have each implementation default to the
> > strictness that's most likely best, at least in the generic Seq
> > interface. That way, [] would get strict foldl and non-strict foldr, but
> > Rev [] would get strict foldr and non-strict foldl. Individual modules
> > could provide extra variants as desired outside the class interface.
> 
> Both the collection and the associated collection have fold (and fold') which 
> folds over elements in an unspecified order.  Perhaps I could add that to the 
> sequence API to give the "best behavior" choice to the sequence implementer.  
> Then for lists you would have (lazy) fold = foldr and (strict) fold' = 
> foldl'.
> 
> > [*] I guess you might want a non-strict foldl with lists if the foldee
> > is expensive enough that building up a pile of thunks is worth it if
> > they can potentially be skipped.
> 
> > > -- Edison defines 'subset' and 'subsetEq' in the set API.  Data.Set
> > > has operations with the
> > >     same meanings named 'isProperSubsetOf' and 'isSubsetOf'.  I am
> > > considering adoping the
> > >     Data.Set names, but they are quite a bit longer.  Also what about
> > > the meaning of "subset"
> > >     vs "proper subset"?  What do you find most intuitive?
> >
> > I would go with subset and properSubset. The "isSubsetOf" form only
> > makes sense if used in-line, but we have (|=) for that.
> 
> I'm leaning in this direction as well; I think I've pretty well decided the 
> names as they stand now have to go (they contradict standard usage), and I 
> don't really like the Data.Map names.
> 
> > > In addition, I'm interested in any API related feedback you might
> > > have.
> >
> > Is there a reason why the TernaryTrie and the PatriciaLoMap don't
> > implement OrdAssoc?
> 
> Not really.  When I inherited Edison, none of the associated collections 
> implemented the Ord* classes (I don't know why).  I added AssocList and 
> StandardMap because they were easy.  The others are significantly more 
> complicated and I thought it was more important to work out the API issues 
> and kick 1.2 out the door before tacking this (same reason the 'strict' folds 
> for TernaryTrie are really just the non-strict folds).
> 
> > For the Seq class, I'd rather have the O(n) stuff listed with the
> > implementations, rather than having a default in the interface.
> 
> Noted.  Its a dreadfully tedious cut-n-paste job which is why I haven't done 
> it; patches and clever scripts to do this are welcome ;-)  Of course, it 
> would be nice to have time complexity for the Collection and Associated 
> Collections as well...
> 
> > Finally, thanks for reviving Edison. I had always been disappointed by
> > its obscurity.
> 
> Indeed; it has always seemed to me like a great project that never got the 
> attention it deserved.
> 
> 
> Thanks for your comments!
> Rob Dockins
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>