[Haskell-cafe] ordNub

Conrad Parker conrad at metadecks.org
Mon Jul 15 04:24:40 CEST 2013


On 15 July 2013 09:54, Joey Adams <joeyadams3.14159 at gmail.com> wrote:
> On Sun, Jul 14, 2013 at 7:31 AM, Clark Gaebel <cgaebel at uwaterloo.ca> wrote:
>>
>> Similarly, I've always used:
>>
>> import qualified Data.HashSet as S
>>
>> nub :: Hashable a => [a] -> [a]
>> nub = S.toList . S.fromList
>>
>> And i can't think of any type which i can't write a Hashable instance, so
>> this is extremely practical.
>
> This won't yield results lazily (e.g. nub (repeat 'x') = _|_ instead of 'x'
> : _|_), but Niklas' ordNub will.  His ordNub can be translated directly to
> HashSet and still have the stability and laziness properties.
>
> A difficulty with putting ordNub in Data.List is that it depends on
> containers, which is outside of the base package.  Some options:
>
>  * Move the implementation of Set to base.
>
>  * Implement a lean version of Set in base that only provides 'insert' and
> 'member'.
>
>  * Define ordNub in Data.Set instead.
>
> Adding a Hashable-based nub to base would be even more problematic, since
> you'd need Hashable in base.

Right, I suggest the following community course of action:

1a) add ordNub to Data.Set
1b) add ordNub to Data.Hashable
(1 day)

2) make a libraries@ proposal to include a stripped-down Data.Set-like
balanced binary tree implementation to base.
(2 weeks)

3) bikeshed about the name, eg.:
  * is "nub" really intuitive? how about "uniq", like in
perl/ruby/underscore.js?
  * but "uniq" in unix only removes _adjacent_ duplicates, confusing!
  * how about "distinct"? "sole"? "unique"? "azygous"?
(7 weeks)

4) Failing consensus on technical grounds (that the stripped-down
Data.Set implementation is overkill for one library function), agree
that anyone who really cares should just use the version from
containers or hashable. Only newbs and textbook authors actually use
base anyway, and it's impossible to change the language definition.
Prelude will continue to fulfil its role of avoiding success at all
costs, quadratic or otherwise.

(Please, let's have both 1a and 1b :)

Conrad.




More information about the Haskell-Cafe mailing list