[GHC] #14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy + Text.RE.TDFA
GHC
ghc-devs at haskell.org
Tue Jan 16 09:59:38 UTC 2018
#14519: Exponential runtime performance regression in GHC 8.2 + Data.Text.Lazy +
Text.RE.TDFA
-------------------------------------+-------------------------------------
Reporter: ntc2 | Owner: tdammers
Type: bug | Status: new
Priority: normal | Milestone: 8.4.1
Component: Compiler | Version: 8.2.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime | Test Case:
performance bug | https://github.com/ntc2/ghc-8.2.1
| -regex-lazy-text-
| bug/tree/07b7bb32c6e90e8f2d2eada4b59943f37e632d53
Blocked By: | Blocking:
Related Tickets: #13745, #14564 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by tdammers):
> So the string "$wnext2{v reC2}" must appear in some .o file. I'd do
strings *.o | grep to find it.
Using `strings -f **/*.o | grep wnext2`, I get:
{{{
regex-tdfa-text-1.0.0.3/dist/dist-sandbox-
3f35acf6/build/Text/Regex/TDFA/Text/Lazy.o: $wnext2{v reyv} (regex-tdfa-
text-1.0.0.3-CIfFZ6rjdCoJI5EFpqTwBO:Text.Regex.TDFA.Text.Lazy) (fun)
}}}
There is no identifier `wnext` or `next` in that module, but it does have
this:
{{{
{-# SPECIALIZE execMatch :: Regex -> Position -> Char -> L.Text ->
[MatchArray] #-}
execMatch :: Uncons text => Regex -> Position -> Char -> text ->
[MatchArray]
execMatch = Engine.execMatch
}}}
`execMatch` imported from `Text.Regex.TDFA.NewDFA.Engine`, in turn,
defines local functions `next` and `next'`; the alias defined here seems
to mainly exist in order to add the `SPECIALIZE` pragma for lazy `Text`.
`execMatch` is monstrously large and quite complex, so I haven't managed
to figure out entirely how it works, however there is a promising starting
point for further digging:
Whether or not `SPECIALIZE` triggers might depend on debugging and/or
profiling build flags, which would explain why the problem disappears in a
profiling build. The hypothesis would be that the specialized code does
something that the non-specialized code doesn't do, and that something
ends up being *less* efficient rather than *more*. The `execMatch`
function is polymorphic over, among other things, the `text` type, with a
typeclass constraint `Uncons text` (documentation
[http://hackage.haskell.org/package/regex-tdfa-1.2.2/docs/Text-Regex-TDFA-
NewDFA-Uncons.html here]), so we should expect `uncons` to inline for the
specialized version, but not the un-specialized one (because in the
specialized case we can statically resolve it to the non-polymorphic
`uncons` from `Data.Text.Lazy`), and then, possibly maybe, the inlined
uncons leads to a space leak.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14519#comment:21>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list