[GHC] #7367: Optimiser / Linker Problem on amd64
GHC
ghc-devs at haskell.org
Thu Aug 29 04:02:07 UTC 2013
#7367: Optimiser / Linker Problem on amd64
--------------------------------------------+------------------------------
Reporter: wurmli | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 7.8.1
Component: Build System | Version: 7.6.1
Resolution: | Keywords:
Operating System: Linux | Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug |
Test Case: | Difficulty: Unknown
Blocking: | Blocked By:
| Related Tickets:
--------------------------------------------+------------------------------
Comment (by wurmli):
Replying to [comment:12 rwbarton]:
> wurmli, what's the matter with it?
>
> "800,100,272 bytes allocated in the heap" means that the total size of
> all the allocations done over the course of the program is 800,100,272
> bytes. That's the expected size of 20 million (Int, Int) pairs which
> share their second field (`n`), plus a small amount of other stuff. It
> doesn't have anything to do with the size of the heap at any given time.
> The maximum heap size is shown separately: "50,520 bytes maximum
> residency" which is quite reasonable.
>
> Similarly your original program does not ever occupy 10 GB of heap at a
> time. If you look at the process in top you will see a memory usage close
> to "47,184 bytes maximum residency" (well probably more like a couple MB,
> to hold the program image, but not anything near 10 GB).
>
> I have no idea why the original program timed out on the language
> benchmark machines, but it wasn't due to it allocating 10 GB sequentially.
> Allocation of short-lived objects is very cheap. But it is not free, and
> this discussion has been about why current GHC produces a program that
> allocates a lot when GHC 7.4 did not. Eliminating the large amount of
> allocation might reduce the runtime by a few percent or so.

Would you agree that it is reasonable to expect the optimiser to optimise
these allocations away? My simple assumption about the fannkuch program is
that it runs faster if its memory use stays local: the more it works only in
registers and cache, the faster it runs. With the repeated allocation of an
intermediate value, the cache might be exhausted, and the processor might
have to copy data in and out of cache, which could slow down the program.
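
For reference, the "800,100,272 bytes allocated" figure from the quoted
comment can be checked with a little arithmetic. The sketch below is only a
back-of-the-envelope estimate, not output from the program in the ticket. It
assumes the usual 64-bit GHC heap layout (one 8-byte header word per object
plus one word per field) and assumes that each pair allocates a fresh tuple
cell and a fresh boxed Int for its first field, while the second field (`n`)
is shared. The names `pairBytes`, `boxedIntBytes` and `expectedBytes` are
made up for this illustration.

```haskell
-- Rough estimate of total allocation for 20 million (Int, Int) pairs.
-- Assumption: 64-bit GHC, one header word plus one word per field.

wordBytes :: Integer
wordBytes = 8

-- A (,) constructor: header + two pointer fields = 3 words.
pairBytes :: Integer
pairBytes = 3 * wordBytes

-- A boxed Int (I#): header + one payload word = 2 words.
boxedIntBytes :: Integer
boxedIntBytes = 2 * wordBytes

-- Assumption: every pair needs a new tuple cell and a new first field;
-- the shared second field is not counted per pair.
expectedBytes :: Integer -> Integer
expectedBytes pairs = pairs * (pairBytes + boxedIntBytes)

main :: IO ()
main = do
  let estimate = expectedBytes 20000000
      reported = 800100272  -- "bytes allocated in the heap" from the quote
  putStrLn ("estimated for the pairs: " ++ show estimate)
  putStrLn ("reported by the RTS:     " ++ show reported)
  putStrLn ("remaining other stuff:   " ++ show (reported - estimate))
```

Under these assumptions the pairs account for 40 bytes each, so the estimate
covers all but about 100 KB of the reported total. That is a count of bytes
allocated over the whole run, not of memory held at any one time.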
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/7367#comment:13>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler