[GHC] #9142: LLVM HEAD rejects aliases used by LLVM codegen

Wed May 28 20:44:54 UTC 2014

#9142: LLVM HEAD rejects aliases used by LLVM codegen
-------------------------------------+------------------------------------
        Reporter:  bgamari           |            Owner:
            Type:  bug               |           Status:  new
        Priority:  high              |        Milestone:
       Component:  Compiler          |          Version:  7.8.2
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:  4213              |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by altaic):

 As I understand it from the thread on LLVMDev, the stumbling block is that
 we don't have a function definition's type until we see a call to it. LLVM
 doesn't care about the ordering of definitions and calls, so at least
 that's one thing we don't have to worry about. Several options came to
 mind, in no particular order:

 1. Store untyped definitions and calls in some sort of data structure
 until we've got a matching pair, at which point we know the type of the
 definition and we can queue it to be emitted as soon as the current object
 is finished being emitted. Unfortunately, I think this would mean that
 non-function data would be stored until the entire stream has been read.
 Additionally, the number of functions held in memory depends on the
 distance between a definition and its call: if they are maximally
 unsorted, we could be storing the whole stream in memory.

 2. Iterate through the Cmm twice, collecting type information for
 definitions on the first pass and emitting code on the second pass. The
 drawbacks are the extra time from processing the stream twice and a bit
 of extra memory usage, dependent on the number of definitions.

 3. Generate invalid LLVM IR and mangle definitions with the type info
 before handing it off to LLVM. This has all of the drawbacks of 2, with
 the added drawback of more cringeworthy text mangling.

 4. Generate an auxiliary data structure while we're processing the STG,
 containing the type information to pass to the LLVM backend. I don't
 know how complicated this would be to implement, or whether the extra
 cruft in other stages of the compilation pipeline would be acceptable.

 5. Generate LLVM from STG rather than Cmm. As I understand it, David
 considered this when designing the LLVM backend, but decided against it
 due to code duplication. I imagine this would be a lot more work than the
 other options, though it may not have the drawbacks the other options
 have.

 6. Convince the LLVM folks to reverse the change they made with aliases,
 or otherwise add a new feature to LLVM. Not sure if we were abusing
 aliases, or if our use case fits into some sort of feature they'd like to
 offer, but it might take a while for an upstream fix.
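
 To make option 2 a bit more concrete, here is a rough sketch of the
 two-pass idea. Everything in it is made up for illustration: Decl, Label,
 LlvmType, collectTypes, emitDefs and typedDefinitions are hypothetical
 stand-ins, not the real Cmm or LLVM backend types, and types are just
 strings. It only shows the shape of the approach, namely that the extra
 memory is one map entry per called label.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins; these are not GHC's Cmm or LLVM backend types.
type Label    = String
type LlvmType = String

data Decl
  = Define Label          -- a definition whose type is not yet known
  | Call Label LlvmType   -- a call site, which reveals the callee's type
  deriving (Eq, Show)

-- Pass 1: record the type seen at each call site.
-- If a label is called at several types, the last one wins here; a real
-- implementation would have to reconcile them.
collectTypes :: [Decl] -> Map.Map Label LlvmType
collectTypes ds = Map.fromList [ (l, t) | Call l t <- ds ]

-- Pass 2: pair every definition with its collected type, in stream order.
-- Definitions that are never called get Nothing.
emitDefs :: Map.Map Label LlvmType -> [Decl] -> [(Label, Maybe LlvmType)]
emitDefs tys ds = [ (l, Map.lookup l tys) | Define l <- ds ]

-- Both passes together; the input is traversed twice.
typedDefinitions :: [Decl] -> [(Label, Maybe LlvmType)]
typedDefinitions ds = emitDefs (collectTypes ds) ds
```

 For example, typedDefinitions [Define "f", Call "f" "void ()", Define "h"]
 gives [("f", Just "void ()"), ("h", Nothing)]. The Nothing case is where
 a real implementation would still need some fallback for definitions that
 are never called in the module.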

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9142#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler