[GHC] #10613: Mechanism for checking that we only enter single-entry thunks once
GHC
ghc-devs at haskell.org
Tue Dec 22 10:17:08 UTC 2015
#10613: Mechanism for checking that we only enter single-entry thunks once
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Other | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by nomeata):
This is me thinking out aloud about this feature. Ignore me if you want,
but if you are curious, feel free to jump in.
Widening the scope of this ticket a bit (or shifting it?): Wouldn’t it be
nice if we not only link the single-entry thunks, but also find out about
missed opportunities for single-entry thunks? Such a counting would
subsume the linting above.
I’m considering to build on [Debugging/TickyTicky ticky-ticky debugging],
as it already provides a lot of the necessary infrastructure. In
particular, it counts entries to thunks, when `-ticky-dyn-thunk` is
enabled. The number is not quite what we want, though: It seems that if a
certain static thunk is created ''n'' times, then it will report ''n''
entries; whereas in the scope of this ticket we want to know that each
allocated thunk was entered once.
Also, currently a dynamic (i.e. an instance of a static) thunk that is not
marked as single-entry will always be reported to be entered 0 or 1 times,
as the result is shared. How do we get better numbers here? One way might
be a special heap object, let’s call it a “counting indirection”. It
behaves mostly like a regular indirection, i.e. it’s entry code would
count a tick and then jump to the entry code of the indirectee, but the
crucial difference would be that the garbage collector will 'not' erase it
when it evacuates memory. It will, however, evacuate the indirectee as
usual. This way, the sharing semantics are preserved, but every entry via
the thunk will be counted, not just the first.
These counting indirections would be created by the code that creates a
thunk. This way, this counting can be done independent of `-ticky-dyn-
thunk`. These heap objects can also carry additional data used in
counting, e.g. the name of the thunk (in `CorePrep` names) and whether GHC
created this thunk as single-entry or not (which would not affect the
behaviour of the counting indirection, but be useful for the reporting).
I’m not sure yet what’s the best data structure to do the counting, if we
indeed want to count uses of each dynamic instance of the thunk (or
rather, of the counting indirection) separately. Maybe statically allocate
'm' (maybe 1024) counters per thunk, and an index, and the first 'm'
instances of the thunk use the these counter (each, upon first entry,
incrementing the index to allocate its slot), and if there are more than
1024 entries, then simply stop counting.
I kind-of like this design, but I’m sure I’m missing some rather ugly
details.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10613#comment:4>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list