What changed between GHC 8.8 and 8.10 that could cause this?
Phyx
lonetiger at gmail.com
Tue Mar 9 12:43:11 UTC 2021
Hi,
> But we don't _want_ the shared state, it's simply there.
> This whole issue arises from the fact that we were oblivious to the
shared RTS state, resulting in Clash doing GHC API calls where the RTS
loads/links an object file twice.
The RTS should under no circumstances actually load an object file
twice: there is only one linker map, so a second load should result in a
symbol collision.
The error you posted at
https://github.com/clash-lang/clash-compiler/issues/1686 is actually the
linker doing the right thing.
GHC runtime linker: fatal error: I found a duplicate definition for symbol
Lib2_plots2_closure
whilst processing object file
.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.1.0/build/exe2/_clashilator/clash-syn/Lib2.o
The symbol was previously defined in
.stack-work/dist/x86_64-linux-tinfo6/Cabal-3.2.1.0/build/exe1/_clashilator/clash-syn/Lib2.o
You're loading the same object file twice from different build folders and
the linker has no guarantee that these two are the same symbol at all.
This, however, is indeed a shortcoming of MR388: we can't easily split
the C linker map.
> And we're not even explicitly linking/loading object files twice,
something to do with the GHC type-checker seems to do that.
Yes, but you have a new object file in a different path, and this can't
be resolved by the linker cache. It looks like this accidentally worked
before because the shared Haskell linker state resolves based on the
closure name itself, so it never asked the C linker. I say accidental
because there's no guarantee that the closures in exe1 and exe2 are the
same, despite them having the same name.
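To make the difference concrete, here is a toy model of the two layers. It is only a sketch and not GHC or RTS code: `Session`, `LinkerMap` and `loadObject` are hypothetical names, each object file is assumed to define a single symbol, and the paths are shortened stand-ins for the ones in the error message.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Symbol = String

-- | Hypothetical stand-in for the process-wide C linker map:
-- each symbol maps to the object file that defined it.
type LinkerMap = Map.Map Symbol FilePath

-- | Hypothetical stand-in for the Haskell-side linker state:
-- the closure names this state believes are already loaded.
type Session = Set.Set Symbol

-- | Load an object file that defines exactly one symbol.
loadObject
  :: Symbol -> FilePath -> (Session, LinkerMap) -> Either String (Session, LinkerMap)
loadObject sym obj (seen, linkerMap)
  -- Resolved by name on the Haskell side; the C linker is never asked.
  | sym `Set.member` seen = Right (seen, linkerMap)
  | otherwise = case Map.lookup sym linkerMap of
      -- Same symbol name coming from a different object file: collision.
      Just prev | prev /= obj ->
        Left ("duplicate definition for symbol " ++ sym
              ++ " in " ++ obj ++ ", previously defined in " ++ prev)
      _ -> Right (Set.insert sym seen, Map.insert sym obj linkerMap)

-- | One Haskell-side state reused for both loads (the 8.8 situation).
sharedState :: Either String (Session, LinkerMap)
sharedState =
  loadObject "Lib2_plots2_closure" "exe1/Lib2.o" (Set.empty, Map.empty)
    >>= loadObject "Lib2_plots2_closure" "exe2/Lib2.o"

-- | A fresh Haskell-side state for the second load, while the linker
-- map is still the single process-wide one (the 8.10 situation).
freshState :: Either String (Session, LinkerMap)
freshState = do
  (_, linkerMap) <- loadObject "Lib2_plots2_closure" "exe1/Lib2.o" (Set.empty, Map.empty)
  loadObject "Lib2_plots2_closure" "exe2/Lib2.o" (Set.empty, linkerMap)
```

In this model `sharedState` succeeds only because the second load is silently skipped and the symbol keeps pointing at the exe1 object, while `freshState` reaches the linker map and is rejected as a duplicate.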
> I don't see how I can avoid this issue without being forced to run within
a single `runGhc` session.
As I mentioned below, you can override the hsc_dynLinker in a wrapper
around runGhc.
i.e.
  runClashGhc :: <action> -> ..
  runClashGhc <action> = do
    shared_linker <- ...
    runGhc .. $ do
      hsc_env <- getSession
      setSession $ hsc_env { hsc_dynLinker = shared_linker }
      <action>
Should restore the behavior. You don't need to run inside a single runGhc,
you just need to provide a single hsc_dynLinker.
That should work.
Kind Regards,
Tamar
On Tue, Mar 9, 2021, 11:46 Christiaan Baaij <christiaan.baaij at gmail.com>
wrote:
> But we don't _want_ the shared state, it's simply there.
> This whole issue arises from the fact that we were oblivious to the shared
> RTS state, resulting in Clash doing GHC API calls where the RTS loads/links
> an object file twice.
> And we're not even explicitly linking/loading object files twice,
> something to do with the GHC type-checker seems to do that.
> I don't see how I can avoid this issue without being forced to run within
> a single `runGhc` session.
>
> On Tue, 9 Mar 2021 at 12:22, Phyx <lonetiger at gmail.com> wrote:
>
>> Hi,
>>
>> Hmm... I don't agree..
>>
>> This isn't about grounds of truth or anything like that.. and in fact, an
>> object being in the linker map, doesn't mean its usable at all or meant to
>> be used at all.
>> It can be temporary state (symbol redirection or supporting of deprecated
>> symbols are two that come to mind). So this is also a case of.. be careful.
>>
>> The change introduced in the MR simply decoupled the top level user
>> interface and the C linker.
>> The reason for this is simply that the majority of projects do not
>> require shared state here, but in fact benefit from unshared state.
>>
>> i.e. interpreters, IDEs etc. Where you want to be able to process
>> multiple separate files at the same time without needing to create new
>> processes for each.
>>
>> Now back to your point about runGhc needing to use a shared state.. In my
>> opinion that would be wrong.
>>
>> Here's the documentation for GHC 8.6.5
>> https://hackage.haskell.org/package/ghc-8.6.5/docs/GHC.html
>>
>> specifically:
>>
>> ----
>>
>> runGhc
>>   :: Maybe FilePath  -- See argument to initGhcMonad.
>>   -> Ghc a           -- The action to perform.
>>   -> IO a
>>
>> Run function for the Ghc monad.
>>
>> It initialises the GHC session and warnings via initGhcMonad.
>> Each call to this function will create a new session which should not be
>> shared among several threads.
>>
>> Any errors not handled inside the Ghc action are propagated as IO
>> exceptions.
>>
>> ---
>>
>> And if the session isn't guaranteed there's no guarantee about the
>> underlying state.
>> This explicit declaration that runGhc will not share state has been in
>> the API for decades (at least as far back as 7.2, where I stopped looking).
>>
>> That Clash is relying on behavior we explicitly stated is not the case is
>> a bug in Clash.
>>
>> If you require shared state you should not be using the top level runGhc
>> wrapper but instead call unGhc yourself (or call setSession yourself).
>>
>> There is perhaps a case to be made for a runGhcShared which does this,
>> but runGhc itself never guaranteed one session or one state.
>>
>> Kind regards,
>> Tamar
>>
>> On Tue, Mar 9, 2021, 10:27 Christiaan Baaij <christiaan.baaij at gmail.com>
>> wrote:
>>
>>> Even if MR388 ( https://gitlab.haskell.org/ghc/ghc/-/merge_requests/388
>>> ) is the cause of the issue we're seeing with the API exposed by Clash, I
>>> still think MR388 is wrong.
>>> My reasoning is the following:
>>>
>>> In 8.8 and earlier we had:
>>> - RTS C-code contains the ground truth of what is linked. The API it
>>> provides consists of set-membership, insert, lookup, and delete. Notably it does
>>> not allow you to get the set of linked objects.
>>> - There is a globally shared MVar (using NOINLINE, sharedCaf,
>>> unsafePerformIO newIORef "tricks") to what is basically a log/view of the
>>> linked-objects state kept by the RTS C-code.
>>>
>>> With MR388, in 8.10 and later we get:
>>> - RTS C-code contains the ground truth of what is linked. The API it
>>> provides consists of set-membership, insert, lookup, and delete. Notably it does
>>> not allow you to get the set of linked objects.
>>> - A _new_ MVar for every call to `runGhc` which is a log/view of the
>>> linked-object state kept by the RTS C-code. But that means these MVars get
>>> out-of-sync with the ground truth that is the RTS C-code! And since the RTS
>>> C-code does not expose an API to get the set of linked objects, there's no
>>> way to sync these MVars either!
>>>
>>> I'm building a ghc-8.10.2 with MR388 reverted to see whether it is
>>> indeed what is causing the issue we're seeing in Clash.
>>> Given my analysis above of what I think is wrong with MR388, I'm not
>>> saying we should completely revert MR388, but simply ensure that every
>>> HscEnv created through `runGhc` gets the globally shared MVar; as opposed
>>> to the current call to `newMVar`.
>>>
>>> On Sun, 7 Mar 2021 at 04:02, ÉRDI Gergő <gergo at erdi.hu> wrote:
>>>
>>>> Thanks Matthew and Julian! Unfortunately, trying out GHC before/after
>>>> this
>>>> change didn't turn out to be as easy as I hoped: to do my testing, I
>>>> need to build a given GHC commit, and then use that via Stack to
>>>> install
>>>> ~140 dependencies so that I can then test the problem I have initially
>>>> seen. And it turns out doing that with a random GHC commit is quite
>>>> painful because in any given Stackage snapshot there will be packages
>>>> with
>>>> which the GHC-bundled libraries are incompatible... :/
>>>>
>>>>
>>>>
>>>> On Thu, 4 Mar 2021, Julian Leviston wrote:
>>>>
>>>> > Hi, I don’t know enough about what Clash does to comment really, but
>>>> it sounds like
>>>> > it’s to do with my work on enabling multiple linker instances
>>>> > in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/388 — maybe
>>>> reading through
>>>> > that or the plan I outlined at
>>>> https://gitlab.haskell.org/ghc/ghc/-/issues/3372 might
>>>> > help, though I’m not sure.
>>>> >
>>>> > Strange, though, as this work was to isolate state in GHC — to change
>>>> it from using a
>>>> > global IORef to use a per-process MVar. But it definitely did change
>>>> the way state is
>>>> > handled, so it might be related to these issues somehow?
>>>> >
>>>> > I realise this isn’t much help, but maybe it points you in a
>>>> direction where you can
>>>> > begin to understand some more.
>>>> >
>>>> > Julian
>>>> >
>>>> > On 4 Mar 2021, at 10:55 pm, ÉRDI Gergő <gergo at erdi.hu> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > I'm trying to figure out a Clash problem and managed to track it
>>>> down to a GHC
>>>> > upgrade; specifically, a given Clash version, when based on GHC 8.8,
>>>> has no
>>>> > problem synthesizing one module after another from one process; but
>>>> the same
>>>> > Clash version with GHC 8.10 fails with link-time errors on the second
>>>> > compilation.
>>>> >
>>>> > The details are at
>>>> https://github.com/clash-lang/clash-compiler/issues/1686
>>>> > but for now I'm just hoping that some lightbulb will go off for
>>>> someone if some
>>>> > handling of internal state has changed in GHC that could mean that
>>>> the symbol
>>>> > tables of loaded modules could persist between GHC invocations from
>>>> the same
>>>> > process.
>>>> >
>>>> > So, does this ring a bell for anyone?
>>>> >
>>>> > Thanks,
>>>> > Gergo
>>>> > _______________________________________________
>>>> > ghc-devs mailing list
>>>> > ghc-devs at haskell.org
>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>> --
>>>>
>>>> .--= ULLA! =-----------------.
>>>> \ http://gergo.erdi.hu \
>>>> `---= gergo at erdi.hu =-------'
>>>> I tried to commit suicide once by taking over 1,000 aspirin. But after
>>>> I took 2, I felt better!