Standard Library report: List union

Jay Cox sqrtofone@yahoo.com
Mon, 4 Mar 2002 19:51:13 -0600 (CST)


On Mon, 4 Mar 2002, Jon Fairbairn wrote:

> The current library report defines unionBy like this:
>
>   unionBy eq xs ys =  xs ++ deleteFirstsBy eq (nubBy eq ys) xs
>
> why does it take the nub of ys, but not xs?  I'd have expected
>
>   unionBy eq xs ys =  (nubBy eq xs) ++ deleteFirstsBy eq (nubBy eq ys) xs
>
>   Jón

Pure guess, but... (no really!)


For the sake of argument, lets define a ulist as a list where
for all elements x in list l, there is no element n with index not equal
to that of x (index being position of the element in the list) such that
eq n x == True.

In other words every element in a ulist appears only once.

Suppose you (unionBy eq x y) to get a result.
Suppose also that x is a ulist
A. x is a ulist by argument.
B. the result of (nubBy eq ys), lets call it z, is a ulist.
C. the result of (deleteFirstsBye eq z xs) is a list which has no elements
in common with xs).  because (deleteFirstsBy eq) "deletes" elements and
doesnt add,the result is a ulist.
D. Adding new, unique, elements (elements not equal to a element in the
ulist in question) to a ulist results in a ulist.
E. Therefore (unionBy eq x y) is a ulist.


Why should this be important?

what if you wanted to fold the function (unionBy eq) over a list of lists
to get a ulist?  Assuming you start with an initial ulist, by your
suggestion you'd be applying (nubBy eq) to a ulist (generated by the the repeated
application (unionByeq), which would be the same as applying the identity
function to a ulist. (meaning you have essentially one big nasty
no-op)!


However, in taking a look at unionBy, one might wonder why it isnt defined
like so (assuming xs would be the accumulated ulist in the fold.


unionBy eq xs ys = (deleteFirstsBy eq (nubBy eq ys) xs) ++ xs

or maybe better (to mirror (++))

unionBy' eq ys xs = (deleteFirstsBy eq (nubBy eq ys) xs) ++ xs

in using this definition, the number of conses with (:) should be
linear (or less) with with the number of elements to be added to the
first_ulist in the following folds.


foldl (unionBy eq)  first_ulist list_of_lists
foldr (unionBy' eq) first_ulist list_of_lists

So, is there aother reason I cannot think of?  I'm sure I haven't covered
all bases here.

Thanks,

Jay Cox