robdockins at fastmail.fm
Mon Mar 6 12:26:56 EST 2006
On Mar 6, 2006, at 10:56 AM, Christian Maeder wrote:
> Robert Dockins wrote:
>> I'm not sure I understand; why do you think that revealing
>> structure is unsafe?
> I don't want that i.e. equal maps yield different results, just
> because they are internally differently balanced.
Well, this obviously depends on what you mean by "equal". That can
get slippery pretty fast. I assume from your comments that you mean
extensional equality (ie, viewing the map as a partial function with
a known finite domain).
>> I'm hesitant to give "fold" a longer name because I think it
>> should be the first fold a programmer reaches for.
> I don't aggree. I immediately made it wrong by my own "toList =
> fold (:) " and did not even notice foldr or foldl. The fold for
> efficiency should be mainly used by library writers. But users I
> tell: correctness first, efficiency (when needed) later!
I tend to agree with that philosophy. On the other hand, if you are
bothering to use a dedicated data structure library, its probably
because you are at least somewhat cognizant of performance issues;
otherwise you'd just use lists and tuples for everything (a
phenomenon one one can actually observe in a lot of projects).
>> In the fairly common case where you have a commutative,
>> associative function (eg (+)), you don't care in what order the
>> function is applied to the elements
> These properties are not garuanteed and difficult to track down in
> the few (but severe) cases it's made wrong.
These properties _can't_ be guaranteed (at least not by Haskell
compilers). It is true that you can shoot yourself in the foot with
the Edison API (this is one of the stronger criticisms one can level
at it, IMO). Some operations even allow you to violate internal data
structure consistency if used incorrectly. Edison has already made
the design choice to present the programmer with abuse-able
operations. I think an arbitrary-order fold is pretty tame (on the
other hand, it is a significantly more "visible" operation than the
actually unsafe ones).
Humm... I wonder if there is a case for extracting a subset of the
API which only presents "safe" (ie "difficult/impossible to abuse")
operations? Perhaps with a simplified class hierarchy as well....
What do you think? That way you could start by using the simplified
API and move to the full API when/if you discover opportunities for
performance enhancements from the less-safe or more esoteric operations.
>> , and you have fold f z === foldl f z === foldr f z.
> I see, foldr and foldl are your safe versions! (I think Jean-
> Philippe only has a safe "fold" and misses an unsafe one.)
Well, its actually more complicated than safe vs unsafe. The base
"AssocX" class (wherein fold appears) doesn't assume an ordering
relation on keys (no "Ord k" context). What we do in the absence of a
total ordering on keys is to present the elements in an "arbitrary"
order. Because the order is "arbitrary" we can chose it to be
advantageous (ie, follow the internal structure), but that's really
just a side benefit of the relaxed semantics. The subclass
"OrdAssocX" further assumes an ordering on keys, so it can provide
foldr and friends (because now key ordering is defined). However,
even with foldr, foldl, etc, there is a degree undefinedness if the
data structure is a finite relation rather than a finite map (in what
order should you present elements bound to equal keys?). The folds
for bags have similar properties.
>> In light of this and previous discussions about folds, I think I
>> will be adding a section to the docs about how to chose an
>> appropriate fold for your application.
> At least that plus a warning!
Indeed. Its seems folds are sufficiently complicated to warrant some
special attention in the docs. In addition, I suppose I should track
down each and every operations which can be observed to behave
differently based on hidden state and mark them as "non-deteministic"
or some such. What is a good term for an operation whose result is
only partially defined? I really want to reserve the term "unsafe"
for operations that can violate internal consistency.
Thanks again! This is exactly the kind of discussion that helps
produce good APIs.
Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
More information about the Libraries