[Haskell-cafe] Two GHC-related GSoC Proposals

Thomas Schilling nominolo at googlemail.com
Fri May 31 15:10:05 CEST 2013


[I'll be the mentor for this GSoC project.]

I used the MVar approach a while ago, as did Simon Marlow's original
solution.  Using MVars and threads for this should scale well enough
(1000s of modules) and be relatively straightforward.
Error/exception handling could be a bit tricky, but you could use (or
copy ideas from) the 'async' package to deal with that.
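
For concreteness, a minimal sketch of that MVar-per-module idea; this is
not GHC's actual driver code, and parallelUpsweep, the Result type and the
compile callback are hypothetical stand-ins:

import Control.Concurrent.Async (mapConcurrently)
import Control.Concurrent.MVar
import qualified Data.Map as Map

type ModuleName = String
data Result = Result

-- One MVar per module: a module's thread blocks on its dependencies'
-- MVars before compiling, and fills its own MVar when done.
parallelUpsweep
  :: [(ModuleName, [ModuleName])]          -- each module and its dependencies
  -> (ModuleName -> [Result] -> IO Result) -- hypothetical compile action
  -> IO [(ModuleName, Result)]
parallelUpsweep graph compile = do
  vars <- Map.fromList <$> mapM (\(m, _) -> (,) m <$> newEmptyMVar) graph
  let worker (m, deps) = do
        depResults <- mapM (readMVar . (vars Map.!)) deps  -- wait for deps
        r <- compile m depResults
        putMVar (vars Map.! m) r                           -- release dependents
        return (m, r)
  -- mapConcurrently (from 'async') propagates exceptions and cancels the
  -- remaining workers if any compilation fails.
  mapConcurrently worker graph

The dependency graph itself does the scheduling here: each thread simply
blocks until its inputs exist, and 'async' takes care of tearing everything
down if one compilation throws.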

 / Thomas

On 30 May 2013 18:51, Ryan Newton <rrnewton at gmail.com> wrote:
> What control / synchronization structures do you plan to use in part 2 of
> the plan (the parallel compilation driver)?
>
> Is the idea just to use one IO thread per compile, blocking each thread on
> MVars when it encounters unmet dependencies?  Or you could use a pool of
> worker threads and a work queue, adding modules to the queue only when all
> their dependencies are met (which limits memory use)... there are many
> options for executing a task DAG.  Fortunately the granularity is coarse.
>
>   -Ryan
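
As a rough sketch of the second option above (a fixed pool of workers plus
a queue that only ever holds modules whose dependencies are already built),
something like the following could work; poolUpsweep and the compile
callback are made-up names, and a real driver would also need error
handling:

import Control.Concurrent.Async (replicateConcurrently_)
import Control.Concurrent.STM
import qualified Data.Map as Map

type ModuleName = String

-- A fixed pool of workers pulling from a queue; a module is enqueued only
-- once all of its dependencies have finished, which bounds how many
-- compilations (and their intermediate results) are live at once.
poolUpsweep :: Int                            -- number of workers
            -> [(ModuleName, [ModuleName])]   -- each module and its dependencies
            -> (ModuleName -> IO ())          -- hypothetical compile action
            -> IO ()
poolUpsweep nWorkers graph compile = do
  queue     <- newTQueueIO
  remaining <- newTVarIO (Map.fromList [ (m, length ds) | (m, ds) <- graph ])
  pending   <- newTVarIO (length graph)       -- modules not yet handed out
  let dependents d = [ m' | (m', ds) <- graph, d `elem` ds ]
      -- Move every module whose dependency count has reached zero from
      -- 'remaining' onto the queue.
      enqueueReady = do
        ready <- Map.keys . Map.filter (== 0) <$> readTVar remaining
        mapM_ (writeTQueue queue) ready
        modifyTVar' remaining (\rm -> foldr Map.delete rm ready)
      grabWork = atomically $ do
        p <- readTVar pending
        if p == 0
          then return Nothing                 -- no more work will ever appear
          else do
            m <- readTQueue queue             -- retries until a module is ready
            modifyTVar' pending (subtract 1)
            return (Just m)
      worker = grabWork >>= \mm -> case mm of
        Nothing -> return ()
        Just m  -> do
          compile m
          atomically $ do
            mapM_ (\d -> modifyTVar' remaining (Map.adjust (subtract 1) d))
                  (dependents m)
            enqueueReady
          worker
  atomically enqueueReady
  replicateConcurrently_ nWorkers worker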
>
>
>
> On Sun, Apr 21, 2013 at 10:34 PM, Patrick Palka <patrick at parcs.ath.cx>
> wrote:
>>
>> Good points. I did not take into account whether proposal #2 may be worth
>> it in light of -fllvm. I suppose that even if the LLVM codegen is able to
>> perform similar optimizations, it would still be beneficial to implement
>> proposal #2 as a core-to-core pass, because the transformations it performs
>> would expose new information to subsequent core-to-core passes. Also,
>> Haskell has different overflow rules than C (on whose rules I assume LLVM's
>> optimizations are modeled): in Haskell, overflow is undefined for all
>> fixed-precision integral types, whereas in C it is undefined only for
>> signed integral types. This limits the optimizations a C-oriented optimizer
>> can perform on unsigned arithmetic.
>>
>> I'm not sure how I would break up the parallel compilation proposal into
>> multiple self-contained units of work. I can only think of two units:
>> making GHC thread-safe, and writing the new parallel compilation driver.
>> Other incidental units may come up during development (e.g. parallel
>> compilation might exacerbate #4012), but I still feel that three months of
>> full-time work is ample time to complete the project, especially with my
>> existing familiarity with the code base.
>>
>> Thanks for the feedback.
>>
>>
>> On Sun, Apr 21, 2013 at 5:55 PM, Carter Schonwald
>> <carter.schonwald at gmail.com> wrote:
>>>
>>> Hey Patrick,
>>> both are excellent ideas for work that would be really valuable for the
>>> community!
>>> (independent of whether or not they can be made into GSoC-sized chunks)
>>>
>>> -------
>>> I'm actually hoping to invest some time this summer investigating
>>> improving the numerics optimization story in GHC. This is in large part
>>> because I'm presently writing lots of numeric code in Haskell (hopefully
>>> on track to make some of it available to the community).
>>>
>>> That said, it's not entirely obvious (at least to me) what a tractable,
>>> focused, GSoC-sized subset of the numerics optimization work would be.
>>> It would also have to be a subset with real performance impact, one whose
>>> benefit isn't already obtained by simply using -fllvm rather than -fasm.
>>> ---------
>>>
>>> For helping pave the way to better parallel builds, what are some
>>> self-contained units of work on GHC (or related work on Cabal) that might
>>> be tractable over a summer? If you can put together a clear roadmap of
>>> "work chunks" that fit within the summer, I'd favor choosing that work,
>>> especially if you can give a clear outline of the plan per chunk and how
>>> to evaluate the success of each unit of work.
>>>
>>> basically: while both are high-value projects, helping improve the
>>> parallel build tooling (whether in performance or correctness or both!)
>>> has a more obvious scope of community impact, and if you can lay out a
>>> clear plan of work that GHC folks agree with and that seems doable, I'd
>>> favor that project :)
>>>
>>> hope this feedback helps you sort out project ideas
>>>
>>> cheers
>>> -Carter
>>>
>>>
>>>
>>>
>>> On Sun, Apr 21, 2013 at 12:20 PM, Patrick Palka <patrick at parcs.ath.cx>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm interested in participating in the GSoC by improving GHC with one of
>>>> these two features:
>>>>
>>>> 1) Implement native support for compiling modules in parallel (see
>>>> #910). This will involve making the compilation pipeline thread-safe,
>>>> implementing the logic for building modules in parallel (with an emphasis on
>>>> keeping compiler output deterministic), and lots of testing and
>>>> benchmarking. Being able to seamlessly build modules in parallel will
>>>> shorten the time it takes to recompile a project and will therefore improve
>>>> the life of every GHC user.
>>>>
>>>> 2) Improve existing constant folding, strength reduction and peephole
>>>> optimizations on arithmetic and logical expressions, and optionally
>>>> implement a core-to-core pass for optimizing nested comparisons (relevant
>>>> tickets include #2132,#5615,#4101). GHC currently performs some of these
>>>> simplifications (via its BuiltinRule framework), but there is a lot of room
>>>> for improvement. For instance, the Core for this snippet is essentially
>>>> identical to the Haskell source:
>>>>
>>>> foo :: Int -> Int -> Int -> Int
>>>> foo a b c = 10*((b+7+a+12+b+9)+4) + 5*(a+7+b+12+a+9) + 7 + b + c
>>>>
>>>> Yet the RHS is actually equivalent to
>>>>
>>>> 20*a + 26*b + c + 467
>>>>
>>>> And:
>>>>
>>>> bar :: Int -> Int -> Int
>>>> bar a b = a + b - a - b -- the RHS should be optimized away to 0
>>>>
>>>> Other optimizations include: multiplication and division by powers of
>>>> two should be converted to shifts; multiple plusAddr calls with constant
>>>> operands should be coalesced into a single plusAddr call; floating point
>>>> functions should be constant folded, etc.
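
A toy sketch of the kind of rewrites meant here, over a made-up expression
type rather than GHC Core or its BuiltinRule machinery; it shows constant
folding plus power-of-two strength reduction:

import Data.Bits ((.&.))

data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  | Shl Expr Int      -- left shift by a constant
  deriving Show

simplify :: Expr -> Expr
simplify (Add a b) = case (simplify a, simplify b) of
  (Lit x, Lit y) -> Lit (x + y)                  -- constant folding
  (Lit 0, e)     -> e                            -- additive identity
  (e, Lit 0)     -> e
  (a', b')       -> Add a' b'
simplify (Mul a b) = case (simplify a, simplify b) of
  (Lit x, Lit y)        -> Lit (x * y)           -- constant folding
  (e, Lit n) | isPow2 n -> Shl e (log2 n)        -- x * 2^k  ==>  x << k
  (Lit n, e) | isPow2 n -> Shl e (log2 n)
  (a', b')              -> Mul a' b'
simplify e = e

isPow2 :: Int -> Bool
isPow2 n = n > 0 && n .&. (n - 1) == 0

log2 :: Int -> Int
log2 n = length (takeWhile (< n) (iterate (* 2) 1))

-- e.g.  simplify (Mul (Var "x") (Lit 8))  ==>  Shl (Var "x") 3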
>>>>
>>>> GHC should be able to perform all these algebraic simplifications. Of
>>>> course, emphasis should be placed on the correctness of such
>>>> transformations. A flag should be added for enabling unsafe optimizations,
>>>> such as treating floating-point arithmetic as associative and distributive.
>>>> This proposal will benefit anybody writing or using numerically intensive
>>>> code.
>>>>
>>>>
>>>> I'm wondering what the community thinks of these projects. Which project
>>>> is a better fit for GSoC, or are both a good fit? Is a mentor willing to
>>>> supervise one of these projects?
>>>>
>>>> Thanks for your time.
>>>> Patrick
>>>>
>>>> (A little about myself: I'm a Mathematics student in the US, and I've
>>>> been programming in Haskell for about 3.5 years. Having a keen interest in
>>>> Haskell and compilers, I began studying the GHC source about 1 year ago and
>>>> I've since gotten a good understanding of its internals, contributing a few
>>>> patches along the way.)
>>>>
>>>>
>>>
>>
>>
>


