[GHC] #14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA

GHC ghc-devs at haskell.org
Mon Nov 27 18:07:52 UTC 2017


#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy +
Text.RE.TDFA
-------------------------------------+-------------------------------------
        Reporter:  ntc2              |                Owner:  tdammers
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.4.1
       Component:  Compiler          |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |  https://github.com/ntc2/ghc-8.2.1
                                     |  -regex-lazy-text-
                                     |  bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53
      Blocked By:                    |             Blocking:
 Related Tickets:  #13745            |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by ntc2):

 * testcase:  https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug =>
     https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-
     bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53


Old description:

> When using `Text.RE.TDFA` from the regex-tdfa package on GHC 8.2 to match
> trivial regexes against `Data.Text.Lazy` strings, I see **exponential
> runtime**. On GHC 8.0, or using `Data.Text` strict strings or `String`
> strings, the runtime is as expected.
>
> Here's my repo with simple test code illustrating the regression and
> timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-bug.
>
> In summary, for the problematic combination of GHC version and libraries,
> the run times are 3s, 10s, 22s, and 40s
> for counting regex matches in files with 10000,
> 20000, 30000, and 40000 lines, respectively. For all of the
> unproblematic combinations -- i.e. GHC 8.0.2, `String`, strict
> `Data.Text`, or building with profiling -- the run time is always about
> 1s!
>
> And **this is a Heisenbug**, in that the performance problem goes away if
> I build with profiling support!
>
> There is the separate Issue #13745 about **compile-time regressions** in
> GHC 8.2 + the regex-tdfa package, that
> [https://ghc.haskell.org/trac/ghc/ticket/13745#comment:18 I commented
> on].

New description:

 When using `Text.RE.TDFA` from the regex-tdfa package on GHC 8.2 to match
 trivial regexes against `Data.Text.Lazy` strings, I see **exponential
 runtime**. On GHC 8.0, or using `Data.Text` strict strings or `String`
 strings, the runtime is as expected.

 Here's my repo with simple test code illustrating the regression and
 timing stats: https://github.com/ntc2/ghc-8.2.1-regex-lazy-text-
 bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53 .

 In summary, for the problematic combination of GHC version and libraries,
 the run times are 3s, 10s, 22s, and 40s
 for counting regex matches in files with 10000,
 20000, 30000, and 40000 lines, respectively. For all of the
 unproblematic combinations -- i.e. GHC 8.0.2, `String`, strict
 `Data.Text`, or building with profiling -- the run time is always about
 1s!

 And **this is a Heisenbug**, in that the performance problem goes away if
 I build with profiling support!

 There is the separate Issue #13745 about **compile-time regressions** in
 GHC 8.2 + the regex-tdfa package, that
 [https://ghc.haskell.org/trac/ghc/ticket/13745#comment:18 I commented on].

--

Comment:

 I'm having trouble with the bisection search because `hashable` won't
 build on some intermediate dev GHCs I've tried. I tried changing the test
 case to use `regex-tdfa-text`+`Data.Text.Lazy` directly, instead of via
 the common interface provided by `regex`, but then the problem goes away.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14519#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list