Build time regressions

John Lato jwlato at gmail.com
Thu Oct 2 01:07:55 UTC 2014


Hi Simon,

Thanks for replying.  Unfortunately the field in question wasn't being
unpacked, so there's something else going on.  But there's a decent chance
Richard has already fixed the issue; I'll check and report back if the
problem persists.  Unfortunately it may take me a couple days before I have
time to investigate fully.

However, I agree with your suggestion that GHC should not unpack wide
strict constructors without an explicit UNPACK pragma.
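
To make the pragmas concrete, here is a minimal sketch of the three ways a strict field can be declared. The types and names (Bar with three fields, FooDefault, FooBoxed, FooFlat, sumBar) are hypothetical stand-ins, not the code from our project, where Bar has around 150 fields. Whether the unannotated strict field ends up unpacked depends on optimisation flags such as -funbox-strict-fields, so treat the comments as a description of intent rather than a guarantee about what GHC generates.

```haskell
-- Hypothetical stand-in for the large record; the real Bar had ~150 fields.
data Bar = Bar
  { barA :: !Int
  , barB :: !Int
  , barC :: !Int
  } deriving (Show, Eq)

-- Strict field, no pragma: GHC decides whether to unpack it, depending
-- on optimisation flags such as -funbox-strict-fields.
data FooDefault = FooDefault
  { fdBar :: !Bar
  , fdTag :: !Int
  } deriving (Show, Eq)

-- Strict field that is explicitly kept boxed: still evaluated when the
-- constructor is applied, but stored as a pointer to a Bar.
data FooBoxed = FooBoxed
  { fbBar :: {-# NOUNPACK #-} !Bar
  , fbTag :: !Int
  } deriving (Show, Eq)

-- Strict field with an explicit request to unpack: with optimisation on,
-- Bar's fields are stored directly inside the FooFlat constructor.
data FooFlat = FooFlat
  { ffBar :: {-# UNPACK #-} !Bar
  , ffTag :: !Int
  } deriving (Show, Eq)

-- A small consumer; its result is the same for every representation.
sumBar :: Bar -> Int
sumBar (Bar a b c) = a + b + c
```

The three declarations behave identically at the source level; only the runtime representation (and the code GHC generates around it) differs, which is why the choice can be made per field.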

John

On Wed, Oct 1, 2014 at 4:57 PM, Simon Peyton Jones <simonpj at microsoft.com>
wrote:

>  It sounds as if there are two issues here:
>
>
>
> · *Should GHC unpack a !’d constructor argument if the
> constructor’s argument has a lot of fields?*  It probably isn’t
> profitable to unbox very large products, because it doesn’t save much
> allocation, and might *cause* extra allocation at pattern-match sites.
> So I think the answer is no.  I’ll open a ticket.
>
>
>
> · *Is some library (binary? blaze?) creating far too much code in
> some circumstances?*  I have no idea about this, but it sounds fishy.
> Simply creating the large worker function should not make things go bad.
>
>
>
> Incidentally, John, using {-# NOUNPACK #-} !Bar would prevent the
> unpacking while still allowing the field to be strict.  It’s manually
> controllable.
>
>
>
> Simon
>
> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of John Lato
> Sent: 01 October 2014 00:45
> To: Edward Z. Yang
> Cc: Joachim Breitner; ghc-devs at haskell.org
> Subject: Re: Build time regressions
>
>
>
> Hi Edward,
>
>
>
> This is possibly unrelated, but the setup seems almost identical to a
> problem we had in some code, i.e. very long compile times (6+ minutes
> for 1 module) and excessive memory usage when compiling generic
> serialization instances for some data structures.
>
>
>
> In our case, I also thought that INLINE functions were the cause of the
> problem, but it turns out they were not.  We had a nested data structure,
> e.g.
>
>
>
> > data Foo = Foo { fooBar :: !Bar, ... }
>
>
>
> with Bar very large (~150 fields).
>
>
>
> Even when we explicitly NOINLINE'd the function that serialized Bar, GHC
> still created a very large helper function of the form:
>
>
>
> > serialize_foo :: Int# -> Int#  -> ...
>
>
>
> where the arguments were the unboxed fields of the Bar structure, along
> with the other fields within Foo.  It appears that even though the
> serialization function was NOINLINE'd, it simply created a Builder, and
> while combining the Builders, GHC saw the full structure.  Our serializer
> uses blaze, but perhaps Binary's builder is similar enough that the same
> thing could happen.
>
>
>
> Anyway, in our case the fix was simply to remove the bang (the strictness
> annotation) from the 'fooBar' record field.  Then the serialize_foo
> function takes a Bar as an argument and serializes that.  I'm not entirely
> sure why compilation takes so much longer otherwise.  I've tried dumping
> the output of each simplifier phase and it clearly gets stuck at a certain
> point, but I didn't debug it much further, so I don't recall the details.
>
>
>
> If you think this is related, I can investigate more thoroughly.
>
>
>
> Cheers,
>
> John L.
>
>
>
> On Wed, Oct 1, 2014 at 4:54 AM, Edward Z. Yang <ezyang at mit.edu> wrote:
>
> Hello Joachim,
>
> This was halfway known, but it sounds like we haven't solved
> it completely.
>
> The beginning of the sordid tale was when Cabal HEAD switched
> to using derived binary instances:
> https://ghc.haskell.org/trac/ghc/ticket/9583
>
> SPJ fixed the infinite loop bug in the simplifier, but apparently
> deriving Binary instances generates a lot of code, which means a lot
> of memory. https://ghc.haskell.org/trac/ghc/ticket/9630
> hvr's fix was specifically meant to solve this problem.
>
> But it sounds like it didn't eliminate the regression entirely?
> If there's an unrelated regression, we should suss it out.  It would
> be helpful if someone could revert just the deriving changes,
> and see if this reverts the compilation time.
>
> Edward
>
> Excerpts from Joachim Breitner's message of 2014-09-30 13:36:27 -0700:
> > Hi,
> >
> > the attached graph shows a noticeable increase in build time caused by
> >
> > Update Cabal submodule & ghc-pkg to use new module re-export types
> > author    Edward Z. Yang <ezyang at cs.stanford.edu>
> >
> https://git.haskell.org/ghc.git/commit/4b648be19c75e6c6a8e6f9f93fa12c7a4176f0ae
> >
> > and only halfway mitigated by
> >
> > Update `binary` submodule in an attempt to address #9630
> > author    Herbert Valerio Riedel <hvr at gnu.org>
> >
> https://git.haskell.org/ghc.git/commit/3ecca02516af5de803e4ff667c8c969c5bffb35f
> >
> >
> > I am not sure if the improvement is related to the regression, but in
> > any case: Edward, was such an increase expected by you? If not, can you
> > explain it? Can it be avoided?
> >
> > Or maybe Cabal just became much larger... +38% in allocations when
> > running haddock on it seems to confirm this.
> >
> > Greetings,
> > Joachim
> >