[Haskell-cafe] Faster set intersections?

Jan-Willem Maessen jmaessen at alum.mit.edu
Sun Dec 9 02:35:27 UTC 2018


On Sat, Dec 8, 2018 at 7:46 PM Jeffrey Brown <jeffbrown.the at gmail.com>
wrote:

> The following expressions both cause GHCI to hang:
>
> > S.intersection (S.fromList [1,2]) (S.fromList [1..])
> fromList ^CInterrupted.
> > S.intersection (S.fromList [1..]) (S.fromList [1,2])
> fromList ^CInterrupted.
> >
>
> Is there a smarter way to take the intersection of sets when at least one
> of them is small (but I don't know which one that is)?
>

Taking the intersection of already-constructed, finite sets is very fast if
one of them is small (should be O(n lg m) where n is the size of the
smaller set and m the size of the larger one).  However, Data.Set doesn't
handle infinite sets as it tries to construct balanced trees and is strict
in the keys.  Even fromAscList won't help you here.  To represent infinite
(or bigger than you actually want to construct) sets you'll likely need to
roll your own custom representation, and I suspect it will in general need
to either have limitations on the key space or worse asymptotic performance.

That said, you can often get the behavior you're looking for by
representing big sets by a membership predicate and using filter.  Here's
an (untested) example of the right sort of thing:

data MySet k = Pred (k -> Bool) | Concrete (S.Set k)

intersection (Pred a) (Pred b) = Pred (\k -> a k && b k)
intersection (Concrete a) (Pred b) = Concrete (S.filter b a)
intersection (Pred a) (Concrete b) = Concrete (S.filter a b)
intersection (Concrete a) (Concrete b) = Concrete (S.intersection a b)

Note that a concrete set "concretizes" anything it touches.  Don't take
unions of these sets, though, it'll just be a mess.

-Jan-Willem Maessen


>
> --
> Jeff Brown | Jeffrey Benjamin Brown
> Website <https://msu.edu/~brown202/>   |   Facebook
> <https://www.facebook.com/mejeff.younotjeff>   |   LinkedIn
> <https://www.linkedin.com/in/jeffreybenjaminbrown>(spammy, so I often
> miss messages here)   |   Github <https://github.com/jeffreybenjaminbrown>
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20181208/f12f8691/attachment.html>


More information about the Haskell-Cafe mailing list