[Haskell-cafe] Minimizing cascading rebuilds

Alex Rozenshteyn rpglover64 at gmail.com
Fri Mar 30 15:17:32 UTC 2018


Might this be something backpack can help with? Write an interface file for
your data library and actually link it in at your top level application?
This doesn't help the total build time, but it helps tighten the feedback
loop for the intermediate packages.

This is tangentially related to Theodre's suggestion in that your data
doesn't get combined in until late in the build.

On Thu, Mar 29, 2018 at 10:59 AM Gleb Popov <6yearold at gmail.com> wrote:

> On Thu, Mar 29, 2018 at 7:31 PM, ☂Josh Chia (謝任中) <joshchia at gmail.com>
> wrote:
>
>> Rahul's idea works within a package to prevent cascading builds of
>> modules, but when the base package needs to be rebuilt, it is unregistered
>> first as are its direct and indirect dependents, so unfortunately the idea
>> works on an intra-package level but not an inter-package level.
>>
>> Regarding Gleb's comment, isn't this CPP distinction between embedded and
>> external file going to affect a lot of code including their function
>> signatures? (One version takes a filename for the data file and does IO and
>> another version doesn't) Sounds hard to maintain.
>>
>
> Well, you can wrap your data in IO:
>
> #ifdef DEVEL_BUILD
> constantData'  :: ByteString
> constantData' = $(embedFile "blabla")
>
> constantData :: IO ByteString
> constantData = return constantData '
> #else
> constantData :: IO ByteString
> constantData = readFile "blabla"
> #endif
>
>
>> On Thu, Mar 29, 2018 at 2:14 PM, Rahul Muttineni <rahulmutt at gmail.com>
>> wrote:
>>
>>> Hi Josh,
>>>
>>> I just tried a quick experiment with stack resolver lts-11.2 and I'd
>>> like to share the results as there are interesting:
>>>
>>> 1. Consider the following module setup that's a simplified version of
>>> your situation
>>> Dependencies:
>>> - Main depends on Hi
>>> - Hi depends on Hum
>>> - Hee depends on Hum
>>>
>>> Main.hs:
>>> ```
>>> module Main where
>>>
>>> import Hi
>>> import Hee
>>>
>>> main :: IO ()
>>> main = print $ hi ++ hee ++ "!"
>>> ```
>>>
>>> Hee.hs:
>>> ```
>>> module Hee (hee) where
>>>
>>> import Hum (hum)
>>>
>>> hee :: String
>>> hee = "hee1" ++ hum
>>> ```
>>>
>>> Hi.hs
>>> ```
>>> module Hi (hi) where
>>>
>>> import Hum (hum)
>>>
>>> hi :: String
>>> hi = "hi1" ++ hum
>>> ```
>>>
>>> Hum.hs
>>> ```
>>> module Hum (hum) where
>>>
>>> hum :: String
>>> hum = "hum"
>>> ```
>>>
>>> 2. Now build it once with `stack build`.
>>> 3. Now change "hum" to "hum1" and run `stack build` notice that all 4
>>> modules will recompile.
>>> 4. Now add {-# NOINLINE hum #-} just above hum :: String and run `stack
>>> build`
>>> 5. Change hum again and run `stack build`.
>>> 6. Only Hum will recompile!
>>>
>>> Lesson: Add NOINLINE to any function/value that you change frequently
>>> and don't want to trigger massive recompilations. This does come at a
>>> performace tradeoff since GHC will not be able to inline whatever you added
>>> that pragma to, but your compile-time will be saved. In your case of
>>> hard-coded data, I think you won't be able to measure any performance
>>> penalty.
>>>
>>> Hope that helps,
>>> Rahul
>>>
>>>
>>> On Thu, Mar 29, 2018 at 10:39 AM, ☂Josh Chia (謝任中) <joshchia at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> In my project, I have multiple packages. One of the packages, packageA,
>>>> is very fundamental and depended on directly and indirectly by almost all
>>>> the other packages. It has functions that use some hard-coded data (a
>>>> ByteString top-level variable) also defined within packageA.
>>>>
>>>> This hard-coded data is appended regularly, causing packageA to be
>>>> rebuilt and thus almost all the other packages to be rebuilt, and building
>>>> takes a painfully long time. I know I can move this hard-coded data to a
>>>> file that's read at run-time, but that means one more item to plumb in at
>>>> run-time (where to find the file), and IO (preventing the functions from
>>>> being pure), so I would like to keep it hard-coded.
>>>>
>>>> Is there an elegant way to prevent or minimize the cascading rebuild of
>>>> the dependent packages just because the hard-coded data in packageA changed?
>>>>
>>>> For analogy, in C or C++, source code gets compiled to .o files, one
>>>> for each .cpp source file. Multiple .o files get linked into executables.
>>>> So, unless the interface (.hpp files) also change, an implementation (.cpp
>>>> file) change does not cause dependents to be recompiled to get new .o
>>>> files, although dependent executables get relinked. I'm not familiar with
>>>> the compilation and linking logic in GHC so maybe it has additional
>>>> complications.
>>>>
>>>> BTW, I'm using stack, in case it makes any difference to the nature of
>>>> the problem.
>>>>
>>>> Josh
>>>>
>>>> _______________________________________________
>>>> Haskell-Cafe mailing list
>>>> To (un)subscribe, modify options or view archives go to:
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>>> Only members subscribed via the mailman list are allowed to post.
>>>>
>>>
>>>
>>>
>>> --
>>> Rahul Muttineni
>>>
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> To (un)subscribe, modify options or view archives go to:
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>> Only members subscribed via the mailman list are allowed to post.
>>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20180330/ceb83af8/attachment.html>


More information about the Haskell-Cafe mailing list