[GHC] #11822: Pattern match checker exceeded (2000000) iterations
GHC
ghc-devs at haskell.org
Fri May 6 09:28:46 UTC 2016
#11822: Pattern match checker exceeded (2000000) iterations
-------------------------------------+-------------------------------------
Reporter: j.waldmann | Owner: gkaracha
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.1-rc3
Resolution: | Keywords:
| PatternMatchWarnings
Operating System: Unknown/Multiple | Architecture: x86_64
Type of failure: Compile-time | (amd64)
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by gkaracha):
Apologies for the late (and long) response. You can find below some
explanation about the behavior of the checker, which I hope to clear
things out a bit.
== Why does this example exceed the limit ==
In principle, the match of `sameAttribute` gives rise to 161*161 = 25,921
cases. Also, the counter takes into account the arguments too and most
constructors have at least one.
Yet, the most important reason that the match iterates >2000000 times is
that **all** unmatched cases from the first clause are checked with the
second, and the resulting ones are checked with the third, etc.
{{{#!hs
data T = A | B | C
f :: T -> ()
f A = ()
f B = () -- (*)
f C = ()
}}}
That being said, for `f` above, the second clause will be checked against
missing cases `B` and `C`. This is a pretty small example, but in general
this gives rise to (something like) a quadratic number of checks. That's
why the iteration counter goes over the roof.
== General Comments on Performance ==
About the general performance of the algorithm, there are some things
worth mentioning:
1. In principle, the new checker is slower than the previous one on
average. This makes sense because we check term and type constraints, that
the previous checker didn't, so we have to pay something for that.
Nevertheless, the main reason that it's slower on average is that the new
checker checks uncovered matches eagerly, while the past checker did it
lazily (whether the specific match has GADTs or not, the new checking
function is in `TcM` also, to be able to call the type-checker).
2. Neither I or Simon were happy about the above, but (that's why we
opened #11528 but it still needs a lot of work to figure out how to
change) the constraint-based approach grew really fast, unless we
performed constraint solving as soon as possible, to remove useless cases
(as I said above, what is generated from one clause is passed to the next
one so this was devastating).
3. The takeaway is that, in order to avoid extreme cases (like #11195
which is really difficult for the checker), we chose to have predictable
behavior for all (with/without guards or GADTs) but a bit slower. The
other result is that since the checker is now strict, it had to be
bounded, to avoid memory-issues.
I dislike the exceeded-iterations-warning too, but I know no better
solution at the moment. My original limit was 10000000, which even worked
on #11195 and gave no such warning. However, as Herbert said, memory
consumption can be too high, if we let it. The best solution, the way I
see it would be to collect statistics (like the examples from Neil or this
whole bug report) and fine-tune it. Maybe 10000000 was too high but
2000000 seems to be too low. I do not have anything better to offer at the
moment, but I am open to suggestions.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11822#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list