Proposal for containers: Add 'lookup' function to Data.Set
Ben Millwood
haskell at benmachine.co.uk
Thu Jul 14 15:33:35 UTC 2016
I agree with Henning, and moreover with wren's point that Data.Set is an
often-unsatisfactory data structure for interning in general. Map can do
the job more obviously and more comprehensively, at the cost of a few
more pointers -- and when has Haskell ever been about trading clarity
for fewer pointers?
I don't think I'll formally vote, since I haven't written any Haskell in
months, but I think the addition of this at-first-glance
seemingly-useless function adds surprise and a little bit of
headscratching to the user interface as a whole, to the extent that I
feel like it has a "weirdness cost" even if most users can just ignore
it. It also complicates (in my mind) the semantic story of Data.Set,
from something purely about a collection of elements and a membership
test to something which has specific opinions on which of two notionally
"equal" things are most appropriate in a given situation. I think the
cost of turning Data.Set from a simple thing into a complicated thing is
a subtle one that should not totally be overlooked -- API simplicity is
a virtue in itself.
Add to that Henning and wren's evidence, which I interpret as saying
that this function isn't quite the *perfect* tool for seemingly *any*
job, and it doesn't seem worth the cognitive / maintenance cost to me.
(I perhaps do not expect these considerations to be very strong or very
important, but I think it would be unwise to ignore them altogether.)
On Tue, Jun 28, 2016 at 11:19:16AM +0200, Henning Thielemann wrote:
>
>On Mon, 27 Jun 2016, Nicolas Godbout wrote:
>
>>Example use case: In a parser, the memory footprint can be reduced by collapsing
>>all equal strings to a single instance of each string. To achieve this, one needs
>>a way to get a previously seen string (internally, a pointer) equal to a newly
>>parsed string. Amazingly, this is very difficult with the current "containers" library interface.
>>One current option is to use a Map instead, e.g., 'Map String String'
>>which stores twice as many pointers as necessary.
>
>I think I'd prefer the (Map String String) variant to an obscure Set
>lookup function. I wonder whether you will later use the Map anyway,
>as the compiler grows and you need to attach more data to tokens. In
>order to make your intent clearer you might define
>
> newtype SharedToken = SharedToken String
>
>and use (Map String SharedToken).
>_______________________________________________
>Libraries mailing list
>Libraries at haskell.org
>http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
More information about the Libraries
mailing list