[Haskell-cafe] Distributing Haskell on a cluster

felipe zapata tifonzafel at gmail.com
Mon Mar 16 00:54:54 UTC 2015


I haven't considered that idea, but it seems the natural solution.

Many thanks

On 15 March 2015 at 20:31, Ozgun Ataman <ozgun.ataman at soostone.com> wrote:

> Anecdotal support for this idea: This is exactly how we distribute
> hadron[1]-based Hadoop MapReduce programs to cluster nodes at work. The
> compiled executable essentially ships itself to the nodes and recognizes
> the different environment when executed in that context.
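>
> A minimal sketch of the pattern (hypothetical names, not hadron's actual
> API): the binary checks an environment variable to decide whether it is
> the driver on your machine or a mapper/reducer on a node.
>
> module Main where
>
> import System.Environment (getExecutablePath, lookupEnv)
>
> main :: IO ()
> main = do
>   role <- lookupEnv "WORKER_ROLE"        -- hypothetical variable
>   case role of
>     Just "map"    -> runMapper           -- running on a cluster node
>     Just "reduce" -> runReducer
>     _             -> do                  -- running as the driver
>       self <- getExecutablePath
>       shipAndLaunch self                 -- copy ourselves out to the nodes
>
> runMapper, runReducer :: IO ()
> runMapper  = interact id                 -- placeholder streaming mapper
> runReducer = interact id                 -- placeholder streaming reducer
>
> shipAndLaunch :: FilePath -> IO ()
> shipAndLaunch exe = putStrLn ("would submit " ++ exe ++ " as a streaming job")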
>
> [1] hadron is a Haskell Hadoop streaming framework that came out of our
> work. It's on GitHub and close to being released on Hackage once the
> current dev branch is finalized/merged. In case it's helpful:
> https://github.com/soostone/hadron
>
> Oz
>
> On Mar 15, 2015, at 8:06 PM, Andrew Cowie <andrew at operationaldynamics.com>
> wrote:
>
> Bit of a left-field suggestion, but rather than deploying a Main script
> and then running it with GHCi, have you considered compiling the program
> and shipping that?
>
> Before you veto the idea out of hand: statically compiled binaries are
> good for being almost self-contained, and (depending on what you changed)
> they rsync well. And if that doesn't appeal, then consider building the
> Haskell program dynamically instead; a dynamically linked Hello World is
> only a couple of kB, and a serious program only a hundred or so.
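>
> A rough sketch of that style of deployment, driven from Haskell itself
> (host name, remote path and rsync/ssh flags are only illustrative):
>
> import System.Process (callProcess)
>
> -- Ship a locally compiled binary to a node and run it there.
> deployAndRun :: String -> FilePath -> [String] -> IO ()
> deployAndRun host binary args = do
>   callProcess "rsync" ["-az", binary, host ++ ":/tmp/worker"]
>   callProcess "ssh"   (host : "/tmp/worker" : args)
>
> -- e.g. deployAndRun "node01" "dist/build/worker/worker" ["--input", "data.txt"]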
>
> Anyway, I know you're just looking to send a code fragment closure, but if
> you're dealing with the input and output of the program through a stable
> interface, then the program is the closure.
>
> Just a thought.
>
> AfC
>
> On Mon, Mar 16, 2015 at 9:53 AM felipe zapata <tifonzafel at gmail.com>
> wrote:
>
>> Hi all,
>> I have posted the following question on stackoverflow, but so far I have
>> not received an answer.
>>
>> http://stackoverflow.com/questions/29039815/distributing-haskell-on-a-cluster
>>
>>
>> I have a piece of code that processes files:
>>
>> processFiles :: [FilePath] -> (FilePath -> IO ()) -> IO ()
>>
>> This function spawns an async process that executes an IO action. This IO
>> action must be submitted to a cluster through a job scheduling system (e.g.
>> Slurm).
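>>
>> Roughly what I have in mind (a simplified sketch, not the real code; the
>> sbatch call is just an example of handing work to Slurm):
>>
>> import Control.Concurrent.Async (mapConcurrently)
>> import Control.Monad (void)
>> import System.Process (callProcess)
>>
>> -- For each file, run the submission action concurrently.
>> processFiles :: [FilePath] -> (FilePath -> IO ()) -> IO ()
>> processFiles files submit = void (mapConcurrently submit files)
>>
>> -- One possible action: hand a generated job script to sbatch.
>> submitToSlurm :: FilePath -> IO ()
>> submitToSlurm file = callProcess "sbatch" ["run_job.sh", file]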
>>
>> Because I must use the job scheduling system, it's not possible to use
>> Cloud Haskell to distribute the closure. Instead, the program writes a new
>> *Main.hs* containing the desired computations, which is copied to the
>> cluster node together with all the modules that Main depends on, and then
>> it is executed remotely with "runhaskell Main.hs [opts]". The async
>> process should then periodically ask the job scheduling system (using
>> *threadDelay*) whether the job is done.
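>>
>> The polling part would look something like this (the squeue check is only
>> an example of asking Slurm whether the job is still in the queue):
>>
>> import Control.Concurrent (threadDelay)
>> import System.Process (readProcess)
>>
>> -- Poll Slurm every 30 seconds until the job id no longer appears in the
>> -- queue. Purely illustrative; error handling omitted.
>> waitForJob :: String -> IO ()
>> waitForJob jobId = do
>>   out <- readProcess "squeue" ["--noheader", "--job", jobId] ""
>>   if null out
>>     then return ()                               -- job has left the queue
>>     else threadDelay (30 * 1000000) >> waitForJob jobId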
>>
>> Is there a way to avoid creating a new Main? Can I serialize the IO
>> action and execute it somehow on the node?
>>
>> Best,
>>
>> Felipe
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>
>

