[GHC] #15113: Do not make CAFs from literal strings
GHC
ghc-devs at haskell.org
Fri Dec 21 11:55:27 UTC 2018
#15113: Do not make CAFs from literal strings
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone: 8.10.1
Component: Compiler | Version: 8.2.2
Resolution: | Keywords: CAFs
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #16014 | Differential Rev(s): Phab:D4717
Wiki Page: |
-------------------------------------+-------------------------------------
New description:
Currently (as I discovered in #15038), we get the following code for
`Control.Exception.Base.patError`:
{{{
lvl2_r3y3 :: [Char]
[GblId]
lvl2_r3y3 = unpackCString# lvl1_r3y2

-- RHS size: {terms: 7, types: 6, coercions: 2, joins: 0/0}
patError :: forall a. Addr# -> a
[GblId, Arity=1, Str=<B,U>x, Unf=OtherCon []]
patError
  = \ (@ a_a2kh) (s_a1Pi :: Addr#) ->
      raise#
        @ SomeException
        @ 'LiftedRep
        @ a_a2kh
        (Control.Exception.Base.$fExceptionPatternMatchFail_$ctoException
           ((untangle s_a1Pi lvl2_r3y3)
            `cast` (Sym (Control.Exception.Base.N:PatternMatchFail[0])
                    :: (String :: *) ~R# (PatternMatchFail :: *))))
}}}
That stupid `lvl2_r3y3 :: String` is a CAF, and hence `patError` has
CAF-refs, and hence so does any function that calls `patError`, and any
function that calls them.
That's bad! Lots more CAF entries in SRTs, lots more work traversing those
SRTs in the garbage collector. And for what? To share the work of
unpacking a C string! This is nuts.
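To make the knock-on effect concrete, here is a toy model of how "has CAF-refs" spreads through callers. It is only a sketch and not GHC's actual SRT analysis: the `Binding` type, `hasCafRefs`, and the callers `f`, `g` and `h` are all made up for illustration.

```haskell
module Main where

import Data.List (nub, sort)

-- | A toy top-level binding: its name, whether it is itself a CAF
-- (an updatable top-level thunk), and the top-level names it mentions.
data Binding = Binding
  { bName  :: String
  , bIsCaf :: Bool
  , bRefs  :: [String]
  }

-- | Names that are CAFs or can reach a CAF through references.
-- Iterates to a fixed point, so recursive bindings also terminate.
hasCafRefs :: [Binding] -> [String]
hasCafRefs binds = go (sort (nub [bName b | b <- binds, bIsCaf b]))
  where
    go known
      | known' == known = known
      | otherwise       = go known'
      where
        known' = sort . nub $ known ++
          [ bName b | b <- binds, any (`elem` known) (bRefs b) ]

example :: [Binding]
example =
  [ Binding "lvl2"     True  []            -- lvl2 = unpackCString# lvl1
  , Binding "patError" False ["lvl2"]
  , Binding "f"        False ["patError"]  -- hypothetical caller
  , Binding "g"        False ["f"]         -- hypothetical caller of f
  , Binding "h"        False []            -- never reaches a CAF
  ]

main :: IO ()
main = mapM_ putStrLn (hasCafRefs example)
```

In this model a single CAF taints `patError`, `f` and `g`, while `h` stays clean. If `lvl2` were not a CAF, none of them would need an SRT entry on its account.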
What to do?
1. Somehow refrain from floating `unpackCString# lit` to top level, even
if you could otherwise do so. But that seems very ad hoc, and it makes the
function bigger and less inlinable.
2. Treat a top level definition
{{{
x :: [Char]
x = unpackCString# y
}}}
as NOT a CAF, and make it single-entry so that the thunk is not updated.
Then every use of `x` will unpack the string afresh, which is probably a
good idea anyhow.
I like this more. It would be implemented somewhere in the code
generator.
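A rough source-level picture of what option 2 is after, as a sketch only. The names `shared` and `fresh` are invented here, and `GHC.CString` comes from the `ghc-prim` package. Note that with optimisation on, full laziness may float the call out of `fresh` again, so this models the intended behaviour rather than guaranteeing it.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.CString (unpackCString#)

-- Today's shape: a top-level thunk, i.e. a CAF. The first use unpacks
-- the literal and updates the thunk; later uses share the result.
shared :: [Char]
shared = unpackCString# "blah"#

-- Analogue of option 2: unpack again on every call, so there is no
-- updatable top-level thunk to track.
fresh :: () -> [Char]
fresh () = unpackCString# "blah"#
{-# NOINLINE fresh #-}

main :: IO ()
main = do
  putStrLn shared
  putStrLn (fresh ())
  print (shared == fresh ())
```

Both give the same string; the difference is only in what is kept alive and what the garbage collector has to traverse.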
--
Comment (by simonpj):
Looking at #16014, I like alternative (2) from the Description better and
better. If we spot
{{{
x = unpackCString# "blah"#
}}}
in the code generator, we could allocate a top-level closure with
* Info-pointer: `rtsUnpackString_info`
* One word of payload, a pointer to the literal string `"blah"#`.
Now we can hand-write the single blob of code (plus info table)
`rtsUnpackString_info` to unpack the string. Easy! And the overhead per
string is only two words (for the closure) rather than all the stuff
described in #16014.
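The closure layout above could be modelled in Haskell roughly as follows. This is an illustration, not RTS code: `StringClosure` and `enterUnpackString` are hypothetical stand-ins for the two-word closure and for the shared `rtsUnpackString_info` entry code, and the info pointer is left implicit.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.CString (unpackCString#)
import GHC.Exts (Addr#)

-- | Stand-in for the per-string closure: one word of payload pointing
-- at the literal. The info pointer is implicit, because every
-- StringClosure is entered through 'enterUnpackString'.
data StringClosure = StringClosure Addr#

-- | Stand-in for the single shared blob of entry code. It unpacks the
-- literal afresh each time and never updates the closure.
enterUnpackString :: StringClosure -> String
enterUnpackString (StringClosure addr) = unpackCString# addr

-- Two strings, two small closures, one piece of code.
blah, oops :: StringClosure
blah = StringClosure "blah"#
oops = StringClosure "oops"#

main :: IO ()
main = do
  putStrLn (enterUnpackString blah)
  putStrLn (enterUnpackString oops)
```

The point of the design is that the unpacking code exists once, so each additional string costs only its closure.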
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15113#comment:11>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler