Are constructors matched against using "switch" or chained if-else

David Feuer david.feuer at gmail.com
Wed Feb 23 05:13:37 UTC 2022


That answer (which I did indeed write once upon a time) is somewhat
incomplete. Case matches on Ints actually work in about three different
ways:

1. Linear search
2. Binary search
3. Jump table

There's some amount of mix and match available for gappy things. I don't
know all the details. As for tag matching, I know none of the mechanics. I
just know that I've always been told to make the first constructor hot.

On Wed, Feb 23, 2022, 12:08 AM Clinton Mead <clintonmead at gmail.com> wrote:

> David,
>
> A random google search has revealed this StackOverflow answer
> <https://stackoverflow.com/a/27324088/525980>, presumably by yourself or
> your evil twin, which mentions a binary search being performed. However the
> particular case you mention is a case match on "Ints", which are far from a
> dense set, whereas presumably you could do something nicer on the tag bits
> of constructors, which are a dense set?
>
>
> On Wed, Feb 23, 2022 at 1:59 PM David Feuer <david.feuer at gmail.com> wrote:
>
>> You can ask, but someone else will have to answer. Sorry.
>>
>> On Tue, Feb 22, 2022 at 9:52 PM Clinton Mead <clintonmead at gmail.com>
>> wrote:
>> >
>> > Thanks David. Can I ask why? Is it because the first constructor is
>> treated specially? (perhaps because it has zeroed tag bits)? Or is it just
>> because there's always an if/else chain in order of the constructor
>> definition regardless of the order of the case statement so the higher up
>> the list the better?
>> >
>> > On Wed, Feb 23, 2022 at 1:34 PM David Feuer <david.feuer at gmail.com>
>> wrote:
>> >>
>> >> I can answer one of your questions for sure: the order of your case
>> branches doesn't matter at all. However, the order of the data constructors
>> in the type declaration does matter. Put your most likely one first.
>> >>
>> >> On Tue, Feb 22, 2022, 9:09 PM Clinton Mead <clintonmead at gmail.com>
>> wrote:
>> >>>
>> >>> Hi All
>> >>>
>> >>> I'm developing an unbounded integer type, which I won't go into the
>> details here but in some circumstances has better performance than the
>> standard "Integer".
>> >>>
>> >>> Anyway, whilst there are complex cases, the most common case is a
>> standard machine int multiplication.
>> >>>
>> >>> Hence I want the type to be optimised for that case.
>> >>>
>> >>> I'm going to have a few constructors, anyway, so I first considered
>> something like this:
>> >>>
>> >>> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA
>> ... | ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
>> >>>
>> >>> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg"
>> strict/unpack, hopefully eliminating the indirection, or perhaps just
>> making them primitive directly.
>> >>>
>> >>> This has 7 constructors, which quite nicely I believe fits into the
>> three spare bits in a 64 bit pointer which GHC optimises I believe.
>> >>>
>> >>> However, this approach I thought of because I assumed that GHC would
>> do a switch style statement on the constructors, so once I have more than
>> one I might as well have as many as I want (well, up to 7, until I lose the
>> optimisation).
>> >>>
>> >>> But if GHC compiles this to a "if ... else if..." chain, a better
>> representation is the following:
>> >>>
>> >>> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... |
>> ComplexPosB ... | ComplexNegB ...`
>> >>>
>> >>> That way I can match against BasicInt first, and knock that out of
>> the way as my "hot" case. However, using "Int" instead of "Word" does make
>> the logic a bit more complex, but it may be worth it if I'm reducing the
>> number of patterns I have to check against for all arguments.
>> >>>
>> >>> I was just wondering if anyone could share some insight on what GHC
>> does in these circumstances? For example, does the order I list my cases in
>> a case statement matter if they're non-overlapping? Will GHC match them in
>> the order I list, or will it just make them into a switch statement so it
>> doesn't matter if I reorder them?
>> >>>
>> >>> I guess I could benchmark all this (and probably will) but some
>> insights and general guidance would be good.
>> >>>
>> >>> Thanks,
>> >>> Clinton
>> >>> _______________________________________________
>> >>> Glasgow-haskell-users mailing list
>> >>> Glasgow-haskell-users at haskell.org
>> >>>
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/glasgow-haskell-users/attachments/20220223/33c85560/attachment.html>


More information about the Glasgow-haskell-users mailing list