Standard Library report: List union
Jay Cox
sqrtofone@yahoo.com
Mon, 4 Mar 2002 19:51:13 -0600 (CST)
On Mon, 4 Mar 2002, Jon Fairbairn wrote:
> The current library report defines unionBy like this:
>
> unionBy eq xs ys = xs ++ deleteFirstsBy eq (nubBy eq ys) xs
>
> why does it take the nub of ys, but not xs? I'd have expected
>
> unionBy eq xs ys = (nubBy eq xs) ++ deleteFirstsBy eq (nubBy eq ys) xs
>
> Jón
Pure guess, but... (no really!)
For the sake of argument, lets define a ulist as a list where
for all elements x in list l, there is no element n with index not equal
to that of x (index being position of the element in the list) such that
eq n x == True.
In other words every element in a ulist appears only once.
Suppose you (unionBy eq x y) to get a result.
Suppose also that x is a ulist
A. x is a ulist by argument.
B. the result of (nubBy eq ys), lets call it z, is a ulist.
C. the result of (deleteFirstsBye eq z xs) is a list which has no elements
in common with xs). because (deleteFirstsBy eq) "deletes" elements and
doesnt add,the result is a ulist.
D. Adding new, unique, elements (elements not equal to a element in the
ulist in question) to a ulist results in a ulist.
E. Therefore (unionBy eq x y) is a ulist.
Why should this be important?
what if you wanted to fold the function (unionBy eq) over a list of lists
to get a ulist? Assuming you start with an initial ulist, by your
suggestion you'd be applying (nubBy eq) to a ulist (generated by the the repeated
application (unionByeq), which would be the same as applying the identity
function to a ulist. (meaning you have essentially one big nasty
no-op)!
However, in taking a look at unionBy, one might wonder why it isnt defined
like so (assuming xs would be the accumulated ulist in the fold.
unionBy eq xs ys = (deleteFirstsBy eq (nubBy eq ys) xs) ++ xs
or maybe better (to mirror (++))
unionBy' eq ys xs = (deleteFirstsBy eq (nubBy eq ys) xs) ++ xs
in using this definition, the number of conses with (:) should be
linear (or less) with with the number of elements to be added to the
first_ulist in the following folds.
foldl (unionBy eq) first_ulist list_of_lists
foldr (unionBy' eq) first_ulist list_of_lists
So, is there aother reason I cannot think of? I'm sure I haven't covered
all bases here.
Thanks,
Jay Cox