a better workflow?

Wed Jul 24 01:42:37 UTC 2019

This is very helpful information. I've long thought about doing something like this, but never quite had the crying need until now. And given my short-term peripateticism (summer at my in-laws' in Massachusetts, followed by a year's stint in Cambridge, UK, followed by another month's visit to my in-laws', all while my main home is rented out), this is not viable for now. But it does drive home the advantages quite well. And it describes exactly the trouble I thought I might get into with AWS, once I realized how big a machine I would need to make it worthwhile -- and how manual my interactions with it would have to be.

Thanks for writing this up. It convinces me to give up on AWS and either find another solution or live with what I have now.

Richard

> On Jul 23, 2019, at 9:06 PM, Ben Gamari <ben at smart-cactus.org> wrote:
> 
> Richard Eisenberg <rae at richarde.dev> writes:
> 
>> Hi devs,
>> 
>> Having gotten back to spending more time on GHC, I've found myself
>> frequently hitting capacity limits on my machine. At one point, I
>> could use a server at work that was a workhorse, but that's not
>> possible any more (for boring reasons). It was great, and I miss it.
>> So I started wondering about renting an AWS instance to help, but I
>> quickly got overwhelmed by choice in setting that up. It's now pretty
>> clear that their free services won't serve me, even as a trial
>> prototype. So before diving deeper, I thought I'd ask: has anyone
>> tried this? Or does anyone have a workflow that they like?
>> 
>> Problems I have in want of a solution:
>> - Someone submits an MR and I'm reviewing it. I want to interact with
>> it. This invariably means building from scratch and waiting 45
>> minutes.
>> - I work on a patch for a few weeks, on and off. It's ready, but I
>> want to rebase. So I build from scratch and wait 45 minutes.
>> - I make a controversial change and want to smoke out any programs
>> that fail. So I run the testsuite and wait over an hour.
>> 
>> This gets tiresome quickly. Most days of GHC hacking require at least
>> one forced task-switch due to these wait times. If I had a snappy
>> server, perhaps these times would be lessened.
>> 
> Indeed. I can't imagine working on GHC without my build server. As you
> likely know, having a fast machine with plenty of storage always
> available has a few nice consequences:
> 
> * I can keep around as many GHC trees (often already built) as I have
>   concurrent projects
> 
> * I can leave a tmux session running for each of those projects with
>   a build environment, an editor session, and whatever else might be
>   relevant
> 
> * working from my laptop is no problem, even when running on
>   battery: just SSH home and pick up where I left off
> 
> Compared to human-hours, even a snappy computer is cheap.
> 
> A few years ago I tried using an AWS instance for my development
> environment instead of self-hosting. In the end this experiment didn't
> last long for a few reasons:
> 
> * reasonably fast cloud instances are expensive so keeping the machine
>   up all the time simply wasn't economical (compared to the cost of
>   running the machine myself). The performance of one AWS "vCPU" tends
>   to be pretty anemic relative to a single modern core.
> 
>   Anyone who uses cloud services for long enough will eventually make a
>   mistake which puts this cost into perspective. In my case this
>   mistake was inadvertently leaving a moderate-size instance running
>   for ten days a few years ago. At that point I realized that with the
>   cost incurred by this one mistake I could have purchased around a
>   quarter of a far more capable computer.
> 
> * having to rebuild your development environment every time you need to
>   do a build is expensive in time, even when automated. Indeed some of
>   the steps necessary to build a branch aren't even readily automated
>   (e.g. ensuring that you remember to set your build flavour
>   correctly). This inevitably leads to mistakes, and thus to yet
>   more rebuilds.
> 
> Admittedly self-hosting does have its costs:
> 
> * You need a reasonably reliable internet connection and power
> 
> * You must configure your local router to allow traffic into the box
> 
> * You must configure a dynamic DNS service so you can reliably reach
>   your box
> 
> * You must live with the knowledge that you are turning >10W of
>   perfectly good electricity into heat and carbon dioxide 24 hours per
>   day, seven days per week.
> 
>   (Of course, considering how many dead dinosaurs I will vaporize
>   getting to Berlin in a few weeks, I suspect I have bigger fish to
>   fry [1])
> 
> 
>> By the way, I'm aware of ghc-artefact-nix, but I don't know how to use
>> it. I tried it twice. The first time, I think it worked. But by the
>> second time, it had been revamped (ghc-head-from), and I think I
>> needed to go into two subshells to get it working... and then the ghc
>> I had didn't include the MR code. I think. It's hard to be sure when
>> you're not sure whether or not the patch itself is working. Part of
>> the problem is that I don't use Nix and mostly don't know what I'm
>> doing when I follow the ghc-artefact-nix instructions, which seem to
>> target Nix users.
>> 
> We should try to improve this. I think ghc-artefact-nix could be a
> great tool to enable the consumption of CI-prepared bindists. I'll try
> to have a look and document this when I finish my second head.hackage
> blog post.
> 
> I personally use NixOS both on my laptop and my build server. This is
> quite nice since the environments are guaranteed to be reasonably
> consistent. Furthermore, bringing up a development environment on
> another machine is straightforward:
> 
>    $ git clone git://github.com/alpmestan/ghc.nix
>    $ nix-shell ghc.nix
>    $ git clone --recursive https://gitlab.haskell.org/ghc/ghc
>    $ cd ghc
>    $ ./validate
> 
> Of course, Nix is far from perfect and it doesn't always realize its
> goal of guaranteed reproducibility. However, it is in my opinion a step
> up from the ad-hoc Debian configuration that I used up until a couple of
> years ago.
> 
> Naturally, your mileage may vary.
> 
> Cheers,
> 
> - Ben
> 
> 
> [1] I was curious about the numbers here:
> 
>    The distance from New Hampshire to Berlin is around 3000 nautical
>    miles. A typical commercial flight of this distance has a burn rate
>    per seat [2] of around 3L/100km.
> 
>    Burning one liter of jet fuel will evolve [3] roughly 2.5 kg of
>    CO_2. Consequently, this single trip (both ways) will cost roughly
>    800 kg CO_2 eq.
> 
>    By contrast, the carbon intensity of electricity production in my
>    region [4] is 280 gCO_2 eq/kWh. Consequently, assuming an average
>    power of 50W, running my server for one year would cost around
>    100 kg CO_2 eq.
> 
>    Indeed it's not as negligible as I thought, but still not awful.
> 
> [2] https://en.wikipedia.org/wiki/Fuel_economy_in_aircraft#Long-haul_flights
> [3] https://www.eia.gov/environment/emissions/co2_vol_mass.php
> [4] https://www.electricitymap.org/?page=country&solar=false&remote=true&wind=false&countryCode=US-NEISO
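
The back-of-the-envelope numbers in footnote [1] can be reproduced with a short script. This is only a sketch: every input is a figure quoted in the footnote, the one value added here is the standard conversion of 1.852 km per nautical mile, and the function names are made up for illustration.

```python
# Reproduces the rough CO2 estimates from footnote [1].
# All inputs are the figures quoted in the email; the function and
# constant names are illustrative only.

KM_PER_NAUTICAL_MILE = 1.852  # exact by definition


def flight_co2_kg(distance_nmi, litres_per_100km_per_seat, kg_co2_per_litre, legs=2):
    """CO2 (kg) attributed to one seat over `legs` flights of the given distance."""
    distance_km = distance_nmi * KM_PER_NAUTICAL_MILE
    fuel_litres = distance_km / 100.0 * litres_per_100km_per_seat
    return fuel_litres * kg_co2_per_litre * legs


def server_co2_kg_per_year(average_watts, grid_g_co2_per_kwh):
    """CO2 (kg) from running a machine continuously for one 365-day year."""
    kwh_per_year = average_watts / 1000.0 * 24 * 365
    return kwh_per_year * grid_g_co2_per_kwh / 1000.0


flight = flight_co2_kg(3000, 3.0, 2.5)    # 3000 nmi, 3 L/100km per seat, 2.5 kg/L
server = server_co2_kg_per_year(50, 280)  # 50 W average, 280 gCO2eq/kWh
print(f"round trip: {flight:.0f} kg CO2, server: {server:.0f} kg CO2 per year")
```

Unrounded, the script gives about 833 kg for the round trip and about 123 kg per year for the server, in line with the "roughly 800 kg" and "around 100 kg" quoted in the footnote.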